All 5 entries tagged Hibernate
View all 12 entries tagged Hibernate on Warwick Blogs | View entries tagged Hibernate at Technorati | There are no images tagged Hibernate on this blog
February 16, 2006
As I finally got jProfiler working on a live load of BlogBuilder the other day, I've been finding bottlenecks where you'd not really expect them.
It's one thing to profile a bit of code or a few requests and see where problems might be, but profiling real load for 10 minutes or so gives a much better big picture. Without the efficiency of jProfiler 4 and Java 5, doing this on a live system would not be practical…but thankfully it worked. I didn't leave it running long because inevitably it increased the load on the system considerably, but it was bearable.
If you've got a bad bit of code, but it hardly ever gets called, you can let it slide. If you've got bad code that gets hit a lot…you've got a problem.
Most of the code that we've now optimised wasn't a problem as such before, but with the hibernate bug I mentioned in my last post, we're trying to reduce querying more aggresively than we were before.
The database person in me says that I want a nicely normalised database, but sometimes it's just not efficient. We have now rolled up comment, trackback and image counts as these were disproportionately expensive counts.
What makes doing these roll-ups a pain in a lot of places is the permissions system as we generally serve a completely personalised page to everyone. We do however have a "publcly viewable" flag for many objects which makes things a bit easier for non-logged in views.
The other thing that was surprisingly expensive is our textile renditions. We use textile in a lot of places to turn textile markup into html. These conversions range from one line titles to entries with thousands of words. We have always cached the converted textile->html for entries as these are large chunks of text. However, we have not until today (thanks to some coding done by Mat today) cached the converted text of comments. Even now with those cached, it is tempting to do a textile-lite that doesn't do the full parse of every little string (there are just too many bloody regexs), but just things like bold and italics.
February 15, 2006
Looks like my efforts to make BlogBuilder more efficient by caching more queries has caused me some unforseen trouble.
We had what appeared to be deadlocking in the application the other day. Doing a thread dump (thank god for "kill -3"), we saw that all of the threads were blocked in SoftLimitMRUCache.
Turns out it is a bug Concurrent access issues with both SoftLimitMRUCache and SimpleMRUCache
We were hitting the caches so much that a subtle hibernate bug appeared. We'll await a fix, but in the meantime I'm trying to optimise BlogBuilder in other ways.
- Roll up data rather than do live counts, such as comment, trackback and image counts
- Better indexes and improved query efficiency so that I don't have to cache the queries
- Profiling like crazy to find the hotspots. I recently got JProfiler working on a live instance of JBoss running BlogBuilder. It shed a lot of light on where our real bottlenecks are…and as usual, they are not where you expect.
My hopes when I was building BlogBuilder was that I could make everything dynamic and live for every user as this would provide a more personal and dynamic experience. Sadly this is not terribly efficient and I'm having to start to be a bit more pragmatic in where it is really neccessary to do live checks rather than static data.
February 08, 2006
First post of the year (bad boy)...
I've spent a lot of this year so far jumping between lots of different things. I've started dipping into the new SiteBuilder code which is far more familiar as it is now Spring/Hibernate based rather than Struts/EJB.
I've also as usual been working on Single Sign-On and BlogBuilder.
As the complexity of BlogBuilder grows and our page views grows (now averaging more than 50,000 proper real people page views per day), it has become more and more important to optimise BlogBuilder for better performance.
Hongfeng our resident Oracle expert pointed me in the direction of quite a lot of particularly bad and slow pieces of SQL that were being generated out of BlogBuilder. The problem with BlogBuilder is that it is very very dynamic. We do not serve any static pages as every single page is customised to the currently logged in user as every blog/image/entry has its own permissions. There are also just a lot of different views on the blogs data; daily views, monthly views, favourites views, entries by tags, blogs by group, images by day, etc…
When using Hibernate 2 I did most of these queries with HQL, and it worked quite well, but I'm starting to feel the strain as some of the queries got more and more complicated.
With Hibernate 3 I can now take advantage of the Criteria API, which is quite nice for building complicated queries, but it still has some problems so I've now got a mix of HQL, Criteria and plain old SQL when a particularly complicated aggregation is needed.
Don't forget to turn on query caching and specifically tell your criteria and queries to cache as although the documentation says that for most queries caching doesn't make much difference, I've found it can make a huge difference.
Another little trick is to be careful with date range queries. If you want to do something like find items based on the current time, round your time to the nearest hour or minute rather than passing in a date with second or millisecond accuracy as this will prevent those queries being cached for more than a second…not a lot of good.
Another trick when moving from Hibernate 2 to Hibernate 3 is that you used to have to do "query.iterate().next()" to get a result when you knew there was just a single result (such as a count query), but now there is the uniqueResult() method. It is important to switch over because the uniqueResult() calls get cached, but the iternate().next() ones don't.
October 06, 2005
I had to clear up some old bad data that was left over from a bit of bad code. Unfortunately the bad data didn't rear its ugly head until recently and a lot of data had built up. It was also very hard to detect the bad data because of many places it could be referenced from…only if it had no references to it from any of 7 places would it need to be deleted.
This means doing a really horrible query either like this:
select id from atable where id not in (select id from anothertable) and id not in (select id from yetanothertable) and id not in (select id from moretables) ..... .....
This is very, very, very slow.
A more efficient way of doing this is this:
select id from atable a where not exists (select id from anothertable b where a.id = b.id) and not exists (select id from yetanothertable c where a.id = c.id) and not exists (select id from moretables d where a.id = d.id) ..... .....
However, when you are dealing with potentially 100,000's of rows it is still quite slow…but it does get there. The next problem is actually deleting the data once you've managed to select it. As a little test I thought I'd try and delete the whole lot, but that just didn't work…too slow. Even if I did have the patience to leave it running for hours, I couldn't let it lock up the database like that for that long.
So, the only solution was to do it in batches. I wrote a quick java program that would iterate through an do the deletes in small batches of a 100 or so at a time. My first mistake was trying to reuse some Spring/Hibernate code I already had instead of going straight to old school and using JDBC.
Although in theory you can get a Connection object from the hibernate Session, via session.connection(), it really is NOT the same as just getting a good old fashioned JDBC connection. The deletes were taking absolutely ages, so I profiled it and noticed that hibernate was still trying to do some of its funky stuff in the background, really slowing things down.
Plan B (or is it D by now?). Spring comes with a handy little JdbcTemplate which lets you do real JDBC but without a lot of the exception and connection/statement/resultset closing pains. Finally…it worked.
So, lesson of the day:
- not exists type queries are faster than not in queries
- Bulk deletes can be verrrrrrry slow
- Batching deletes is better, but with real JDBC not hibernate SQL calls
October 04, 2005
Chris wrote about this a while back and I dutifully investigated and started using this beautifully simple bit of configuration to get hibernate statistics showing up in the JMX console for a couple of web applications. This all seemed fine and well for a while and I never really looked at it further.
However, we recently had some performance problems with BlogBuilder and I was left scratching my head. After a bit of investigation, it looked like the hibernate statistics were turned on when I thought they were off. This shouldn't really matter, but it turns out it was logging and keeping every single SQL statement that hibernate had run since the last application restart! It shows all this SQL in the JMX console, but my browser blew up trying to render the page because it was trying to dump out 100,000's of lines of SQL to the page :(
If it really was keeping a log like this it could well be the cause of a performance problem, so I turned it off and I think so far the performance has been better…I'll know for sure tonight.
So…word of warning.
Only turn on the hibernate statistics if you really need them