It’s hard to like the PHP ‘Elite’ *UPDATE from Terry

8 Jan

The latest spat between the Rails elite and the PHP elite is over a recent blog post by DHH, “Mr. Moore gets to punt on sharding.”

His argument can be summed up in a quote found in the comments:

“We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil”

- Donald Knuth

Basically, don’t jump into complex sharding/horizontal scaling techniques if you really don’t have to (yet).

Of course, Terry Chay has to take this opportunity to insert the fact that he works at a big company with tens of millions of daily page views and that vertical scaling is idiotic, blah blah blah. (It looks like he took down his blog post, or I would link to it.)

“For every speaking engagement in which I’ve saved someone from a huge architectural misconception, DHH is probably creating 10 more future programmers who will make the mistake.”

Damn, as a PHP developer I really want to like the PHP ‘Elite’ but this guy just doesn’t get it.  He works at his big desk at a big company and speaks at big PHP conventions and doesn’t have a clue what it takes to bootstrap a business from the ground up.

*UPDATE*: It appears Terry’s post got published before he could finish his thoughts. He’s responded with the rest of it in the comments. It’s a lot better thought out than what got published, so give it a read!

DHH’s point was that when your development team can fit in a VW Beetle, you are going to cause a lot more trouble by pre-optimizing with complex setups than by simply relying on vertical scaling for as long as you can.

Forget who he is or what programming language he uses; that’s just damn good advice for anyone trying to bootstrap a startup.

  • Oh yeah, the company I work at has around 25 developers, Plaxo had around 30, MyCasa had 4, Qixo had about 6, ZipAsia had about 20.

    Data was put into shards in four of those five. Sharding was a post hoc decision (the site had already failed) in three of those.

    As for the “big desk”, well, I spec’d the desk used by every engineer on the floor, so it’s no bigger than theirs. In fact it’s one of the smallest desks you can buy from Ikea, and the only reason it’s 4 inches wider than the smallest is because when you mash these desks together, you’d lose those four inches anyway.

    If there is one takeaway from the post-crash “Web 2.0” world it’s that sites can get very big before you have to buy that second VW Beetle. ;-)
  • Ouch, that's unfortunate that it got posted before you could finish your train of thought. I've redacted my "this guy doesn't get it" statement - the rest of your argument is a lot more thought out. Are you going to pub it on your blog? Won't get heard here.
  • No I totally agree that it would seem that I didn't get it from the draft you read. Over half the things written were snippets to random comments I never posted.

    Yes, I'll finish the article and publish after I chill out and get some perspective. If you write something mean and mean it, it's hurtful. If you write something mean and don't mean it, it's playful. :-D
  • That shouldn't have been posted. I apologize.

    Normally I start writing all the things that get me angry, and then tame it down. I had the same draft open in two windows, so when I accidentally posted it in one and corrected it in the other immediately, the autosave in the first (I think) reposted it. :-(

    I'm really sorry for that because what I wrote and what I meant at that stage differ in both coherence and tact.

    But, I *do* get it. Here's the part that didn't get posted at the time of this writing:

    If the site is a vertical or an enterprise, You Ain’t Gonna Need It and therefore horizontal partitioning is premature optimization.

    If the site is a consumer-facing, viral-growth-based company, you are definitely going to need it, so it’s not premature to buffer against explosive growth; it’s common sense.

    If a site is based on the viral Web 2.0 model, then it inherently runs on a hockey-stick growth curve. This means that you need to be able to double in size in less than a month: that rate has happened at least four different times in Tagged's history, and EVERY account of EVERY successful Web 2.0 company has had similar accounts of growth doublings occurring faster than the 18-month timescale provided by even the most generous definition of Moore’s law. Running your servers hot in the face of this model is irresponsible.

    The reason that 37signals doesn't face the problem is that 37signals has never once experienced viral growth. My guess is it is because their product is a vertical in which social networking e-mail spam dynamics aren't effective. Another possibility is that their pay-for model limits their growth either intentionally or unintentionally. The last possibility is that it’s post hoc rationalization for the many missed opportunities from viral growth that was never sustained because the site fell to its knees (aka was “slashdotted”).

    But the pay-for niche is the rarity in the Web 2.0 world, NOT a Tagged or a Digg. Even Web 2.0 sites of modest success have seen this dynamic; that’s why there are whole lectures on “surviving the hockey stick”, and not a single one fails to mention sharding and shared-nothing.

    As for my defense of working for a site with hundreds of millions of dynamic pages a day, well, that means that I know what it’s like to be able to scale, NOT DHH. So if I (and every other programmer who works on prominent Web 2.0 websites, like Don MacAskill at SmugMug or Jeremy Zawodny at Yahoo/Craigslist) were to say that scaling isn’t hard or that a Web 2.0 company should do logical horizontal partitioning sooner rather than later, then maybe the onus is on DHH to prove otherwise.

    Because I bet that DHH hasn’t “survived the hockey stick” and that Donald Knuth, if educated on the realities of viral growth, would use that quote to defend his actions.
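
To put rough numbers on the growth argument in that last comment, here is a minimal Python sketch. It assumes load doubles every month (the rate Terry cites for Tagged) and that hardware capacity doubles every 18 months (the Moore's law timescale he mentions). The function name `load_vs_hardware` and its parameters are made up for illustration, and real traffic will not follow a clean exponential.

```python
def load_vs_hardware(months, load_doubling_months=1.0, hardware_doubling_months=18.0):
    """Return how many times faster load has grown than single-box capacity.

    Both curves are idealized exponentials starting from the same baseline;
    the doubling periods are assumptions, not measurements.
    """
    load = 2 ** (months / load_doubling_months)
    hardware = 2 ** (months / hardware_doubling_months)
    return load / hardware


if __name__ == "__main__":
    # Print the gap between load and vertical-scaling headroom over time.
    for months in (1, 3, 6, 12):
        print(f"after {months:2d} month(s): load is "
              f"{load_vs_hardware(months):.1f}x what one upgraded box can absorb")
```

Under these assumptions the ratio blows up within a few months, while a site whose load doubles no faster than hardware does never leaves a ratio of 1. Which curve your site is actually on is the whole disagreement.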