Denormalising and rolling up data for performance reasons
As I finally got JProfiler working on a live load of BlogBuilder the other day, I've been finding bottlenecks where you'd not really expect them.
It's one thing to profile a bit of code or a few requests and see where problems might be, but profiling real load for 10 minutes or so gives a much better big picture. Without the efficiency of JProfiler 4 and Java 5, doing this on a live system would not be practical…but thankfully it worked. I didn't leave it running long because it inevitably increased the load on the system considerably, but it was bearable.
If you've got a bad bit of code, but it hardly ever gets called, you can let it slide. If you've got bad code that gets hit a lot…you've got a problem.
Most of the code that we've now optimised wasn't really a problem before, but with the Hibernate bug I mentioned in my last post, we're trying to reduce querying more aggressively than we were.
The database person in me says that I want a nicely normalised database, but sometimes it's just not efficient. We have now rolled up comment, trackback and image counts as these were disproportionately expensive counts.
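To make the idea of a roll-up concrete, here is a minimal sketch in plain Java. It is not the BlogBuilder code: the `Entry` class, its fields and its methods are all hypothetical names, and the in-memory list stands in for what would really be a comments table. The point is only that the count is stored on the parent and kept in step on every write, so reading it never needs a count query.

```java
import java.util.ArrayList;
import java.util.List;

public class RollUpSketch {

    /** Hypothetical entry that carries a rolled-up (denormalised) comment count. */
    public static class Entry {
        // Stands in for the normalised comments table.
        private final List<String> comments = new ArrayList<>();
        // The rolled-up value: redundant, but cheap to read.
        private int commentCount;

        public void addComment(String body) {
            comments.add(body);
            commentCount++; // keep the roll-up in step with the write
        }

        public boolean removeComment(String body) {
            boolean removed = comments.remove(body);
            if (removed) {
                commentCount--;
            }
            return removed;
        }

        /** Cheap read: no counting involved. */
        public int getCommentCount() {
            return commentCount;
        }

        /** The expensive path, standing in for a COUNT(*) query; handy as a consistency check. */
        public int recount() {
            return comments.size();
        }
    }

    public static void main(String[] args) {
        Entry entry = new Entry();
        entry.addComment("First!");
        entry.addComment("Nice post");
        entry.removeComment("First!");
        System.out.println(entry.getCommentCount() + " comment(s), recount says " + entry.recount());
    }
}
```

The trade-off is the usual one: every code path that adds or removes a comment has to remember to update the stored count, which is why a periodic recount is a sensible safety net.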
What makes these roll-ups a pain in a lot of places is the permissions system, as we generally serve a completely personalised page to everyone. We do, however, have a "publicly viewable" flag for many objects, which makes things a bit easier for non-logged-in views.
The other thing that was surprisingly expensive is our textile rendering. We use textile in a lot of places to turn textile markup into HTML. These conversions range from one-line titles to entries with thousands of words. We have always cached the converted textile->HTML for entries as these are large chunks of text. However, until today (thanks to some coding done by Mat) we had not cached the converted text of comments. Even now with those cached, it is tempting to do a textile-lite that doesn't do the full parse of every little string (there are just too many bloody regexes), but just handles things like bold and italics.
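For illustration, a textile-lite plus a cache of rendered comments might look something like the sketch below. Everything here is an assumption rather than what we actually run: the class and method names are made up, the cache is a simple in-memory map keyed by a hypothetical comment id, the two patterns are my own simplification of textile's `*bold*` and `_italic_` rules, and escaping the HTML first is a choice I've made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

public class TextileLiteSketch {

    // Compiled once and reused; only two patterns instead of the full textile set.
    // The lookarounds stop things like some_var_name being treated as italics.
    private static final Pattern BOLD =
            Pattern.compile("(?<!\\w)\\*(\\S(?:[^*]*\\S)?)\\*(?!\\w)");
    private static final Pattern ITALIC =
            Pattern.compile("(?<!\\w)_(\\S(?:[^_]*\\S)?)_(?!\\w)");

    // Hypothetical cache: comment id -> rendered HTML.
    private final Map<Long, String> htmlByCommentId = new ConcurrentHashMap<>();

    /** Escapes HTML, then converts *bold* and _italic_ only. */
    public static String render(String markup) {
        String html = markup.replace("&", "&amp;")
                            .replace("<", "&lt;")
                            .replace(">", "&gt;");
        html = BOLD.matcher(html).replaceAll("<strong>$1</strong>");
        html = ITALIC.matcher(html).replaceAll("<em>$1</em>");
        return html;
    }

    /** Renders on first request and serves the cached HTML after that. */
    public String renderedComment(long commentId, String markup) {
        return htmlByCommentId.computeIfAbsent(commentId, id -> render(markup));
    }

    /** Must be called whenever a comment is edited or deleted. */
    public void invalidate(long commentId) {
        htmlByCommentId.remove(commentId);
    }

    public static void main(String[] args) {
        TextileLiteSketch sketch = new TextileLiteSketch();
        System.out.println(sketch.renderedComment(1L, "*bold* and _italic_"));
    }
}
```

The cache is only correct if every edit path calls `invalidate`, and an unbounded map like this would need some eviction policy before going anywhere near a live system.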