All entries for February 2006
February 28, 2006
I recently did a bit of work to make a nice little AJAX/DHTML user picker for SiteBuilder2. It is basically an in page popup that allows quick searching of Warwick users by first and last names to find their usercode. This is useful for helping people work out usercodes for permissions and properties pages and such.
One problem was that it was a touch slow, especially for very broad searches such as everyone with a first name starting with K and last name starting with S.
In LDAP terms, we were doing the following:
NamingEnumeration searchAnswer = ctx.search("o=Warwick", "(&(givenName=K*)(sn=S*))", sctls);
This works just fine and always used to return around 300 users. However, we always had to check for any expired accounts after the results were returned. Because account expiry was not very well populated in the past, only a few out of those 300 would be filtered out. However, after the recent tidying up of the directory due to password resets, there are now many more disabled accounts in NDS (our directory), which is a good thing. Now we can do this:
NamingEnumeration searchAnswer = ctx.search("o=Warwick", "(&(givenName=K*)(sn=S*)(!(logindisabled=*)))", sctls);
So we only get people matching the first name and last name searches who also do not have a logindisabled attribute. This now returns just 97 results and is around twice as fast, meaning our user picker searches should be much faster from now on.
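For anyone wanting to build these filters from user input, here is a minimal sketch of a helper that assembles the same two filter strings. The class and method names are made up for illustration and are not taken from SiteBuilder2, and the escaping step is my own addition (the post does not say how input is sanitised).

```java
class UserSearchFilter {
    // Escape characters that are special inside an LDAP search filter value,
    // so that user input cannot change the structure of the filter.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\5c"); break;
                case '*':  sb.append("\\2a"); break;
                case '(':  sb.append("\\28"); break;
                case ')':  sb.append("\\29"); break;
                case '\0': sb.append("\\00"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build a "starts with" filter on first and last name. When excludeDisabled
    // is true, entries that have any logindisabled attribute are left out.
    static String build(String firstPrefix, String lastPrefix, boolean excludeDisabled) {
        StringBuilder filter = new StringBuilder("(&");
        filter.append("(givenName=").append(escape(firstPrefix)).append("*)");
        filter.append("(sn=").append(escape(lastPrefix)).append("*)");
        if (excludeDisabled) {
            filter.append("(!(logindisabled=*))");
        }
        return filter.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println(build("K", "S", false));
        System.out.println(build("K", "S", true));
    }
}
```

The resulting string would be passed as the second argument to ctx.search(...) in place of the hard-coded filter.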
February 26, 2006
Writing about web page http://blogs.warwick.ac.uk/kieranshaw/gallery/wedding_stuff/
After a couple of months of trying to make our minds up, we have finally decided and paid the deposit on our wedding venue. We'll be having our wedding on the 25th August 2007 at The Mount in Wolverhampton.
What clinched it for us was the main room where we'll be having the civil ceremony and the wedding breakfast. The Great Hall is a beautiful double height, oak paneled and galleried room that beats most other venues hands down. We visited it again today for one final look and to hand over our deposit cheque and came away very happy that it is the right place for our big day. The photos look busy because there was a bit of a wedding fair on today as well with a few stalls and stuff which meant it had quite a wedding feel about it too…perfect.
It is a bit of a pain that it is about an hour away, but we just couldn't find somewhere close by that quite matched it for the money. This now gives us about a year and a half to keep saving and preparing and getting excited.
P.S. I really need to get a new camera, these pictures are so grainy (and it isn't just BlogBuilder resizing them badly either!)
February 24, 2006
Writing about web page http://www.joelonsoftware.com/articles/Unicode.html
When you're reading data from various sources, doing some processing on it and then displaying it on the web, most of the time you don't worry about character encoding. However, occasionally it comes along and bites you.
I always used to know that there were different character encodings and you could end up not displaying international characters properly if you used the wrong type and so on, but I didn't really know about it in depth. This is where good old Joel comes in. He wrote an article a while back entitled:
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets. He does a pretty good job of explaining things.
My specific problem was that an international student's name was coming out of our directory (NDS) like this: H??hner. It turns out they are actually called Hühner, so that one character was being turned into ??. No good. Usually I just say "oh, some character encoding problem" and give up. But sadly I was determined to get to the bottom of it. Upon closer inspection, the ?? were an artifact of how the name appeared on the web (a different encoding again), but in my Java code, their name was: H├╝hner. Nice.
Doing an ethereal trace on the traffic to my machine when I queried NDS for this person, I saw that:
48 e2 94 9c e2 95 9d 68 6e 65 72
seemed to represent our user's name. This is hex, and having a look at some character encoding charts, it turns out that this is UTF-8. Is there an easy way of fiddling about with different encodings in Java… not that I can find. So, following the instructions on UTF-8 encodings from here, I worked out that in Unicode that UTF-8 sequence is:
0x48, 0x251C, 0x255D, 0x68, 0x6e, 0x65, 0x72
Which does indeed turn into H├╝hner. So, nothing was wrong in my code and it proved that NDS was storing something obscure. Pleasingly, a quick email to our friendly systems team with this evidence and they got it fixed and are now going through the directory trying to fix bad entries and work out where this strange encoding is coming from. Hopefully our international students will soon no longer be seeing their names scrambled :)
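The hand decoding can also be cross-checked in Java by handing the traced bytes to the String constructor with an explicit charset. This is only a sketch: the class and method names are made up, and it uses StandardCharsets and String.codePoints(), which are newer than the Java 5 mentioned elsewhere on this blog.

```java
import java.nio.charset.StandardCharsets;

class DecodeTrace {
    // The bytes seen in the network trace for the user's name.
    static final byte[] TRACE = {
        0x48, (byte) 0xe2, (byte) 0x94, (byte) 0x9c,
        (byte) 0xe2, (byte) 0x95, (byte) 0x9d,
        0x68, 0x6e, 0x65, 0x72
    };

    // Decode the bytes as UTF-8 and return the Unicode code points.
    static int[] codePoints(byte[] utf8) {
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        return decoded.codePoints().toArray();
    }

    // Format code points as a comma separated list of hex values.
    static String describe(int[] codePoints) {
        StringBuilder sb = new StringBuilder();
        for (int cp : codePoints) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append("0x").append(Integer.toHexString(cp).toUpperCase());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe(codePoints(TRACE)));
    }
}
```

The two odd code points, 0x251C and 0x255D, are the box drawing characters that the bytes C3 BC (the UTF-8 encoding of ü) map to in the old DOS code pages 437 and 850. That would be consistent with the name having been decoded with the wrong code page somewhere before it was stored, although that is a guess rather than something confirmed here.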
Geek talk over.
February 21, 2006
Writing about web page http://www.cipr.co.uk/prideawards/midlands/welcome.asp
Seeing as no one else has got around to mentioning it, I may as well…
The Warwick Blogs marketing campaign won two awards at the CIPR Midlands PRide Awards 2005/6 on Friday night. Not having my camera, I can't show you any pictures of the Warwick bunch in all their black tie and gown glory, but here are a few thumbnail photos from the official photographer.
Use of Photography, Design or New Media
- Gold – Warwick University, Warwick Blogs
- Silver – The Bright Consultancy, Cable Guy
- Finalist – Derbyshire County Council, “Face of B-Line” Bus
- Finalist – Neon Communications, Ten4 Magazine
- Finalist – Seal Communications, Well Constructed!
In House Campaign
- Gold – University of Warwick, Launching the University of Warwick in London
- Silver – University of Warwick, Warwick Blogs Silver
The great one to win was the "Use of Photography, Design or New Media" because it was up against quite a few people (rather than just other stuff from Warwick), including some proper PR consultancies. So, congrats especially to Hannah and Karen for all their hard work and great design skills. Hopefully John, Casey or Karen will put up some proper photos soon.
PR and marketing is not something that generally gets done well in a university environment, especially for IT projects. All too often departments will work hard on projects only for them to fall short of their potential through bad PR, marketing and advertising. The challenge for the future is to keep up the profile of the services that ITS offer so that our users know about all the great stuff we can provide them.
February 16, 2006
Writing about web page http://blogs.ittoolbox.com/bi/confessions/archives/007715.asp
If you're in IT, take a read of the whole thing…in brief though:
- Bad Technology is Your Fault
- Users aren't Born Stupid, You Train Them to be That Way
- You want to make your system easier to use than to not use.
- If the Solution Seems Too Simple, Use It
- Eliminate Jobs – Everywhere
- Make People Better
- Keep a Junior Nearby
- Understand the Good of "Good Enough"
- Respect the Database for What it is
- The 3-by Rule
As I finally got JProfiler working on a live load of BlogBuilder the other day, I've been finding bottlenecks where you'd not really expect them.
It's one thing to profile a bit of code or a few requests and see where problems might be, but profiling real load for 10 minutes or so gives a much better big picture. Without the efficiency of JProfiler 4 and Java 5, doing this on a live system would not be practical… but thankfully it worked. I didn't leave it running long because inevitably it increased the load on the system considerably, but it was bearable.
If you've got a bad bit of code, but it hardly ever gets called, you can let it slide. If you've got bad code that gets hit a lot…you've got a problem.
Most of the code that we've now optimised wasn't a problem as such before, but with the Hibernate bug I mentioned in my last post, we're trying to reduce querying more aggressively than we were before.
The database person in me says that I want a nicely normalised database, but sometimes it's just not efficient. We have now rolled up comment, trackback and image counts as these were disproportionately expensive counts.
What makes doing these roll-ups a pain in a lot of places is the permissions system, as we generally serve a completely personalised page to everyone. We do however have a "publicly viewable" flag for many objects, which makes things a bit easier for non-logged in views.
The other thing that was surprisingly expensive is our textile renditions. We use textile in a lot of places to turn textile markup into html. These conversions range from one line titles to entries with thousands of words. We have always cached the converted textile->html for entries as these are large chunks of text. However, until today (thanks to some coding done by Mat) we had not cached the converted text of comments. Even now with those cached, it is tempting to do a textile-lite that doesn't do the full parse of every little string (there are just too many bloody regexes), but just things like bold and italics.
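To illustrate the caching idea (this is not the actual BlogBuilder code), here is a minimal sketch of a small, size-limited cache that sits in front of an expensive markup conversion. The class name, the use of the markup text itself as the key, and the toy renderer in main are all assumptions made for the example; a real system would more likely key on a comment or entry id and plug in the real textile converter.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class RenditionCache {
    private final Map<String, String> cache;
    private final Function<String, String> renderer;

    RenditionCache(final int maxEntries, Function<String, String> renderer) {
        this.renderer = renderer;
        // accessOrder=true keeps the least recently used entry first, and
        // removeEldestEntry drops it once the cache grows past maxEntries.
        // synchronizedMap makes individual get/put calls thread safe.
        this.cache = Collections.synchronizedMap(
            new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > maxEntries;
                }
            });
    }

    // Return the cached html for this markup, converting it only on a miss.
    // Two threads may occasionally convert the same text at the same time;
    // that wastes a little work but is harmless.
    String render(String markup) {
        String html = cache.get(markup);
        if (html == null) {
            html = renderer.apply(markup);
            cache.put(markup, html);
        }
        return html;
    }

    int cachedCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        // Toy stand-in for a real textile converter.
        RenditionCache comments = new RenditionCache(1000, s -> "<p>" + s + "</p>");
        System.out.println(comments.render("Nice post!"));
        System.out.println(comments.render("Nice post!")); // served from the cache
    }
}
```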
February 15, 2006
Looks like my efforts to make BlogBuilder more efficient by caching more queries have caused me some unforeseen trouble.
We had what appeared to be deadlocking in the application the other day. Doing a thread dump (thank god for "kill -3"), we saw that all of the threads were blocked in SoftLimitMRUCache.
Turns out it is a bug: "Concurrent access issues with both SoftLimitMRUCache and SimpleMRUCache".
We were hitting the caches so much that a subtle Hibernate bug appeared. We'll await a fix, but in the meantime I'm trying to optimise BlogBuilder in other ways.
- Roll up data rather than do live counts, such as comment, trackback and image counts
- Better indexes and improved query efficiency so that I don't have to cache the queries
- Profiling like crazy to find the hotspots. I recently got JProfiler working on a live instance of JBoss running BlogBuilder. It shed a lot of light on where our real bottlenecks are…and as usual, they are not where you expect.
My hope when I was building BlogBuilder was that I could make everything dynamic and live for every user, as this would provide a more personal and dynamic experience. Sadly this is not terribly efficient and I'm having to start to be a bit more pragmatic about where it is really necessary to do live checks rather than use static data.
February 08, 2006
First post of the year (bad boy)...
I've spent a lot of this year so far jumping between lots of different things. I've started dipping into the new SiteBuilder code which is far more familiar as it is now Spring/Hibernate based rather than Struts/EJB.
I've also as usual been working on Single Sign-On and BlogBuilder.
As the complexity of BlogBuilder grows and our page views grow (now averaging more than 50,000 proper real people page views per day), it has become more and more important to optimise BlogBuilder for better performance.
Hongfeng, our resident Oracle expert, pointed me in the direction of quite a lot of particularly bad and slow pieces of SQL that were being generated out of BlogBuilder. The problem with BlogBuilder is that it is very, very dynamic. We do not serve any static pages as every single page is customised to the currently logged in user, and every blog/image/entry has its own permissions. There are also just a lot of different views on the blogs data: daily views, monthly views, favourites views, entries by tags, blogs by group, images by day, etc…
When using Hibernate 2 I did most of these queries with HQL, and it worked quite well, but I'm starting to feel the strain as some of the queries got more and more complicated.
With Hibernate 3 I can now take advantage of the Criteria API, which is quite nice for building complicated queries, but it still has some problems so I've now got a mix of HQL, Criteria and plain old SQL when a particularly complicated aggregation is needed.
Don't forget to turn on query caching and specifically tell your criteria and queries to cache. Although the documentation says that for most queries caching doesn't make much difference, I've found it can make a huge difference.
Another little trick is to be careful with date range queries. If you want to do something like find items based on the current time, round your time to the nearest hour or minute rather than passing in a date with second or millisecond accuracy, as a more precise date will prevent those queries being cached for more than a second… not a lot of good.
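A minimal sketch of the rounding trick, assuming the rounded Date is then passed as the query parameter. The class and method names are made up for illustration, and this version rounds down rather than to the nearest value, which keeps the parameter identical for the whole hour or minute.

```java
import java.util.Date;

class CacheFriendlyTime {
    static final long MINUTE_MS = 60_000L;
    static final long HOUR_MS = 60 * MINUTE_MS;

    // Round a timestamp down to a multiple of granularityMs since the epoch.
    // Every call within the same window returns an equal Date, so a query
    // that takes it as a parameter can be served from the query cache.
    // Note: hour windows line up with wall clock hours only in time zones
    // with a whole hour offset from UTC; for caching that does not matter.
    static Date truncate(Date when, long granularityMs) {
        long millis = when.getTime();
        return new Date(millis - Math.floorMod(millis, granularityMs));
    }

    public static void main(String[] args) {
        Date now = new Date();
        System.out.println(now + " -> " + truncate(now, HOUR_MS));
        System.out.println(now + " -> " + truncate(now, MINUTE_MS));
    }
}
```

The trade-off is freshness: with hourly rounding, a query for "items up to now" can be up to an hour behind, so the granularity should match how stale the page is allowed to be.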
Another trick when moving from Hibernate 2 to Hibernate 3 is that you used to have to do "query.iterate().next()" to get a result when you knew there was just a single result (such as a count query), but now there is the uniqueResult() method. It is important to switch over because the uniqueResult() calls get cached, but the iterate().next() ones don't.