All entries for Friday 30 July 2004
July 30, 2004
With our new server up and running, it was time this week to start thinking a bit more about the performance of blogbuilder.
With only 50 or 60 blogs and 1300 entries, performance isn't really a big problem at the moment. Some pages are slow occasionally, but nothing too bad. However, we have to plan for the big start-of-year influx of students (hopefully all wanting a blog).
In my little test scenario, I am imagining 1000 blogs with between 0 and 30 entries on each in the first term. There really is no way of knowing if these figures are realistic or not, but it will have to do as a starting point. Maybe we'll only have 100 blogs, maybe we'll have 10000.
So, having loaded up the new database with 1000 blogs and 15000 entries, I set about trying to render a page on the new server. 30 seconds and an OutOfMemoryError later I knew I had some work to do :)
I use Hibernate for the persistence in blogbuilder, and it does a really good job of hiding away the tediousness of writing SQL. However, if you don't get your relationships right and let it cascade too far through the object graph, things can get out of hand. On the very first request of the home page, it tried to load more or less the entire database…oops. This was partly because the test data was linked to just 10 users with 100 blogs each rather than 1000 users with a blog each…but still…not good.
Until now I'd avoided lazy-loading collections in Hibernate because I didn't need them; the dataset was just not big enough to pose any problems…not any more. Adding lazy-loading to every collection in the system improved performance hugely, but only after switching to Spring's OpenSessionInViewFilter, which keeps the Hibernate session open so that objects can keep loading even in JSPs. It's not the most elegant approach, but it is very efficient.
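To make the lazy-loading idea concrete, here is a minimal sketch in plain Java. It is not Hibernate's own proxy mechanism and not blogbuilder code: `LazyList`, `Blog`, `fetchEntries` and `loadBlogs` are all hypothetical names invented for the illustration, and the "database" is just a counter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Illustration only: a hand-rolled lazy collection, not Hibernate's proxies. */
final class LazyLoadingSketch {

    /** Defers the (simulated) fetch until the contents are first read. */
    static final class LazyList<T> {
        private final Supplier<List<T>> loader;
        private List<T> loaded; // stays null until the first call to get()

        LazyList(Supplier<List<T>> loader) {
            this.loader = loader;
        }

        List<T> get() {
            if (loaded == null) {
                loaded = loader.get(); // fetch once, then reuse
            }
            return loaded;
        }

        boolean isLoaded() {
            return loaded != null;
        }
    }

    /** Hypothetical blog: the title is cheap to hold, the entries are not. */
    static final class Blog {
        final String title;
        final LazyList<String> entries;

        Blog(String title, LazyList<String> entries) {
            this.title = title;
            this.entries = entries;
        }
    }

    /** Counts simulated queries so the saving is visible. */
    static int queryCount = 0;

    /** Stand-in for a database query that loads one blog's entries. */
    static List<String> fetchEntries(String blogTitle) {
        queryCount++;
        List<String> result = new ArrayList<>();
        result.add(blogTitle + ": first entry");
        result.add(blogTitle + ": second entry");
        return result;
    }

    /** Builds n blogs without touching any of their entries. */
    static List<Blog> loadBlogs(int n) {
        List<Blog> blogs = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            String title = "blog-" + i;
            blogs.add(new Blog(title, new LazyList<>(() -> fetchEntries(title))));
        }
        return blogs;
    }

    public static void main(String[] args) {
        List<Blog> blogs = loadBlogs(1000);
        System.out.println("queries after listing blogs: " + queryCount);
        blogs.get(0).entries.get();
        System.out.println("queries after opening one blog: " + queryCount);
    }
}
```

The catch with the real thing is that the deferred fetch needs a live Hibernate session at the moment the collection is finally touched, which is why the view layer needs the session kept open for the whole request.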
Having fixed that nasty problem, at least pages were loading now, but not particularly fast. Time for the profiler.
With JProfiler you can really easily profile JBoss or any Java application. My main concern now was processor time rather than memory.
It turned out that just three or four areas needed tweaking:
- The stats box on the front page counting the number of blogs and entries was doing a "select *" rather than a "select count" type of query, so although Hibernate had everything cached, it still had to go through 1000 blogs. Not any more.
- The comment counts in the hover-tips on the front page were also very slow. I've not completely got to the bottom of this, but as it is not the most important feature in the world, it is just gone for now.
- The calendar widget on aggregate pages was very inefficient for large numbers of entries. On an individual blog where there are likely to be more days than entries, say 31 days and just 10 entries, it is more efficient to fetch all the entries in that range and mark the days that have entries. However, on an aggregate of potentially thousands of entries in a month, this just doesn't work! So, on an aggregate, each day is checked and counted instead of the entries.
- The last bottleneck, which was still keeping big pages at around 4 or 5 seconds of rendering time, was the Textile-to-HTML markup rendering. On an individual entry, the conversion might only take 0.1s, but when you have 20 or 30 entries on an aggregate page, each with a couple of thousand characters, that can really add up. So, time to cache the rendered markup; after all, it doesn't change often once you've created your entry. As a result, the database now contains both a Textile version of your entry and an HTML version, which lets Hibernate deal with the caching for me without my having to implement a caching layer on top of the Textile renderer.
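The render-once idea from that last point can be sketched roughly like this. It is only an illustration under my own assumptions: the `Entry` class and `fakeTextileToHtml` are hypothetical, the stand-in renderer only understands *bold*, and the real code persists both fields through Hibernate rather than holding them in memory.

```java
import java.util.function.Function;

/** Illustration only: render at write time, serve stored HTML at read time. */
final class RenderCacheSketch {

    /** Hypothetical entry holding the Textile source and its rendered HTML. */
    static final class Entry {
        private final Function<String, String> renderer;
        private String textile;
        private String html;

        Entry(String textile, Function<String, String> renderer) {
            this.renderer = renderer;
            setTextile(textile);
        }

        /** Editing the source is the only thing that triggers a render. */
        void setTextile(String textile) {
            this.textile = textile;
            this.html = renderer.apply(textile);
        }

        String getTextile() {
            return textile;
        }

        /** Reading never renders; it just returns the stored HTML. */
        String getHtml() {
            return html;
        }
    }

    /** Counts renders so the saving is visible. */
    static int renderCount = 0;

    /** Stand-in renderer: handles *bold* only, nothing like full Textile. */
    static String fakeTextileToHtml(String textile) {
        renderCount++;
        String body = textile.replaceAll("\\*(.+?)\\*", "<strong>$1</strong>");
        return "<p>" + body + "</p>";
    }

    public static void main(String[] args) {
        Entry entry = new Entry("Hello *world*", RenderCacheSketch::fakeTextileToHtml);
        for (int i = 0; i < 30; i++) {
            entry.getHtml(); // 30 page views, no extra rendering
        }
        System.out.println(entry.getHtml());
        System.out.println("renders: " + renderCount);
    }
}
```

The trade-off is a bit of extra storage per entry and a slightly slower save, in exchange for page views that no longer pay the rendering cost at all.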
I was surprised by some of the profiler's findings; they were certainly not the first places I would have looked. It saved me a hell of a lot of time. It also reinforces something I read a while back:
Do not worry about performance until you have to, because you just never know where your bottlenecks are going to be.
Obviously you don't want to go doing anything stupid, but it's not worth wasting your time tweaking something that is never going to be a problem.
All of that adds up to a blog home page in front of a database with 1000 blogs and 15000 entries rendering in 0.4s.
I'll be doing load tests next week to see how it manages under some heavy user load.
Writing about web page http://developer.apple.com/internet/webcontent/xmlhttpreq.html
John D found a really neat feature on someone's blog yesterday: Andy Budd's blog has a live Textile preview function. As you type your Textile markup into the textarea, it shows below what it will really look like, without you having to submit the form. How clever is that?
The possibilities are endless; the only question is…how does it scale? I can't imagine it can be all that efficient to send a web request for each character typed into a comment field!