All 19 entries tagged Programming


June 01, 2006

Web groups and SSO integration for our web apps

I've recently been working on improving our Web Groups system. This is a central system that allows users to create their own arbitrary groups of Single Sign On users. These groups are then exposed through some web services which allow our other web apps to use them as the basis of permissions or grouping in whatever way they see fit.

Along with SSO, Web Groups is one of the systems that really helps us build very powerful systems with very easy and fine grained permissions…without having to actually do much work in each of those applications.

Web Groups now includes all sorts of groupings, such as:

  • Students in a department
  • Teaching staff in a department
  • Students taking a particular course
  • Full-time or part-time students in a department
  • Students in a particular year of a course
  • Tutor groups

All of this data is automatically pulled in from our Academic Data Store (ADS) project. This means that the data is always up to date. Previously, if someone wanted to protect, say, a SiteBuilder page so that only people doing a particular module could see it, they had to find and keep up to date a list of the ITS usercodes of all students on that module. Now they just need to enter a group name and it'll be kept up to date for them.

If our groups are not good enough, people can make their own groups. So for instance you could create a group that is all the students on a module plus the 3 staff involved with that module. Again, this will all stay up to date as the students on that course change, even at the start of a new year.

We currently have a similar system in BlogBuilder, but we'll be moving over to this new system soon as it is more reliable, powerful and just plain faster.

By using these shared services such as SSO and Web Groups, we can build much more integrated and powerful solutions that we just probably couldn't get from an external vendor.

May 17, 2006

HTTPS Basic Auth RSS feeds for system monitoring

Most applications log messages out to a log file somewhere on a server, but they are a pain to look at. You could set up log4j to append messages via email, but to me that seems unreliable. Also, you have to decide who is going to get these emails, and they can be quite invasive if you don't always need to read them all.

RSS has been a great step forward in opting into information on the web rather than receiving emails. So, why not do the same for system messages?

My particular use case involves some data from our Web Sign On system and these messages are quite sensitive so it is no good just publishing a public RSS feed.

My solution looks something like this:

1) I have a listener class within SSO that monitors activity and logs it in the usual way.
2) I have another listener that receives messages from the logging listener and looks for unusual activity, such as repeated login failures or lots of requests from the same IP address. When it finds something unusual, it puts an entry into the feed that will be displayed to the admin user. This is done with the Rome Atom/RSS Java utilities project, a great open source project that allows you to easily create and read feeds in all sorts of formats.

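// ROME classes from com.sun.syndication.feed.synd
SyndEntry entry = new SyndEntryImpl();
entry.setTitle("Warning for user " + user);
entry.setPublishedDate(new Date());
// Use the publish timestamp as a simple unique id for the entry
entry.setUri("" + entry.getPublishedDate().getTime());

SyndContent description = new SyndContentImpl();
description.setType("text/html");
description.setValue(logMessage + "<br><br>" + authFailures);
// Attach the description, otherwise the entry is published without a body
entry.setDescription(description);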



These SyndEntry's are generic enough to be turned into any kind of feed, be it Atom, RSS 2.0 or RSS 1.0.
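The "unusual activity" check from step 2 could be sketched roughly as follows. This is a minimal sketch and not the actual SSO listener: the class name `LoginFailureMonitor`, the `record` method and the idea of counting consecutive failures per usercode are all assumptions, and the threshold of 3 is borrowed from the sample feed ("The last 3 login attempts").

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: counts consecutive login failures per usercode. */
final class LoginFailureMonitor {
    private final int threshold;
    private final Map<String, Integer> consecutiveFailures = new HashMap<>();

    LoginFailureMonitor(int threshold) {
        this.threshold = threshold;
    }

    /**
     * Records one login attempt. Returns true only on the attempt where the
     * user reaches the failure threshold, so one warning entry is raised per
     * run of failures. A successful login resets the count.
     */
    synchronized boolean record(String usercode, boolean success) {
        if (success) {
            consecutiveFailures.remove(usercode);
            return false;
        }
        int failures = consecutiveFailures.merge(usercode, 1, Integer::sum);
        return failures == threshold;
    }
}
```

When `record` returns true, the listener would build a SyndEntry like the one above and add it to the admin feed.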

3) I then have a controller that pulls all of those messages and puts them in a feed for that admin user to view:

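SyndFeed feed = new SyndFeedImpl();
// Output format; "rss_2.0" matches the feed shown in step 5
feed.setFeedType("rss_2.0");
feed.setTitle("SSO brute force warning log");
feed.setDescription("This feed shows warnings when users repeatedly fail to login");
// An RSS 2.0 channel also needs a link: feed.setLink(...) with the feed's own URL
// entries (name assumed) is the List of SyndEntry objects built in step 2
feed.setEntries(entries);

response.setContentType("application/xml; charset=UTF-8");
SyndFeedOutput output = new SyndFeedOutput();
output.output(feed, response.getWriter());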

4) This page is protected by our SSOClientFilter. This will allow HTTP Basic Auth, but only over SSL. As I don't trust Bloglines or anyone with my username and password, I just need to put the address into Thunderbird or a similar RSS reader like this:
The "forcebasic=true" on the end tells the SSOClientFilter to use Basic Auth rather than redirecting to our SSO login screen as it would usually if it was requested by me in the browser. When Thunderbird tries to read the feed, it is prompted for authentication and so prompts me the user in Thunderbird for my username and password and sends those securely to the feed.
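The rule in step 4, Basic Auth but only over SSL, can be illustrated with a small sketch. This is not the SSOClientFilter code: the class `BasicAuthOverSsl` and its `credentials` method are hypothetical, and in a real servlet filter the two arguments would come from `request.isSecure()` and `request.getHeader("Authorization")`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;

/** Hypothetical sketch of accepting HTTP Basic credentials only over SSL. */
final class BasicAuthOverSsl {

    /**
     * Returns {username, password} if the request was secure and carried a
     * well-formed Basic Authorization header, otherwise an empty Optional.
     */
    static Optional<String[]> credentials(boolean secure, String authorizationHeader) {
        if (!secure || authorizationHeader == null) {
            // Never read credentials that were sent in the clear
            return Optional.empty();
        }
        // The scheme name is case-insensitive and followed by a space
        if (!authorizationHeader.regionMatches(true, 0, "Basic ", 0, 6)) {
            return Optional.empty();
        }
        try {
            byte[] decoded = Base64.getDecoder().decode(authorizationHeader.substring(6).trim());
            String pair = new String(decoded, StandardCharsets.UTF_8);
            // The username ends at the first colon; the password may contain colons
            int colon = pair.indexOf(':');
            if (colon < 0) {
                return Optional.empty();
            }
            return Optional.of(new String[] { pair.substring(0, colon), pair.substring(colon + 1) });
        } catch (IllegalArgumentException badBase64) {
            return Optional.empty();
        }
    }
}
```

An empty result would lead to a 401 response with a `WWW-Authenticate: Basic` header, which is what makes a feed reader such as Thunderbird prompt for a username and password.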

5) Hey presto, we have an authenticated RSS system log:

<?xml version="1.0" encoding="UTF-8" ?>
<rss xmlns:taxo="" xmlns:dc="" version="2.0">
  <channel>
    <title>SSO brute force warning log</title>
    <description>This feed shows warnings when users repeatedly fail to login</description>
    <item>
      <title>Warning for user cusyac</title>
      <description>The last 3 login attempts for user cusyac were failures. Check wsos_auth.log
Tue May 16 17:37:47 BST 2006|Auth failed|cusyac|Username/password not found|137.205.x.x<br>
Tue May 16 17:36:38 BST 2006|Auth failed|cusyac|Username/password not found|137.205.x.x<br>
Tue May 16 17:34:37 BST 2006|Auth failed|cusyac|Username/password not found|137.205.x.x<br></description>
      <pubDate>Tue, 16 May 2006 16:42:58 GMT</pubDate>
      <guid isPermaLink="false">1147797778580</guid>
    </item>
  </channel>
</rss>

May 09, 2006

OSCache and SSO

I am responsible for the Web Sign On system at Warwick. Between development and live services, we have 80 registered services that are allowed to talk to SSO (quite a few are duplicates because Blogs, for instance, is registered on my machine, a test server and the live server).

SSO is responsible for authenticating users to the web apps and finding and sending attributes about users to those applications. The most heavily hit part of the system is the web service that allows an application to get the details for a usercode. It looks something like this:


The response is a simple bit of plain text with name/value pairs such as name, email, department, etc… This system has been kept simple for legacy reasons and does not use the complex transport that our Shibboleth/SAML based new SSO uses.

Last time I checked the stats, we had about 40,000 checks for users a day. Each application then caches these requests internally within our Userlookup API. This system has worked quite well for a long time, but recently we have had some reliability problems with NDS. If an application needs the names and emails of 100 users for some user listings, and those users are not in the application's local cache, it has 100 web requests to make, each of which is suddenly taking a second or two when it usually takes 200ms.

Even if SSO is running at full speed, requesting a lot of users at 200ms a time can still add up. With this in mind, we've now added a server side cache at the SSO end of things. OSCache to the rescue. Having never used this before, I wasn't sure how easy it would be, but it was simplicity itself: (approximate code)

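try {
    // Serve from the cache if the entry exists and is younger than CACHE_TIMEOUT_SECS
    _results = (Map) _userByIdCache.getFromCache(getUser(), CACHE_TIMEOUT_SECS);
} catch (NeedsRefreshException nre) {
    // Missing or stale: do the real lookup and repopulate the cache
    _results = _userByIdProcessor.getResults();
    _userByIdCache.putInCache(getUser(), _results, new String[] { "all" });
}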

I put a wrapper around the existing controller that published the results of the lookup. It simply tries to get the results from the cache and, if they are not there or have expired, populates the cache. Simple. A bit of Spring config wires it all in, including a really simple binding of the Cache into the JMX console so I can easily monitor and clear the cache.

The outcome is that we now have a cache that gets quickly populated for the benefit of all client applications: if application A requests user USERA and it takes 200ms, then when application B also requests user USERA, it now takes 5ms rather than the previous 200ms.
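For readers without OSCache to hand, the same get-or-refresh pattern can be sketched with nothing but the standard library. This is an illustration of the idea only, not the SSO code and not how OSCache is implemented: `ExpiringLookupCache`, its constructor arguments and the injectable clock are all assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the cache wrapper: entries expire after a fixed age. */
final class ExpiringLookupCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final long storedAtMillis;

        Entry(V value, long storedAtMillis) {
            this.value = value;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long maxAgeMillis;
    private final LongSupplier clock;    // injectable so expiry can be tested
    private final Function<K, V> loader; // the slow lookup, e.g. a directory query

    ExpiringLookupCache(long maxAgeMillis, LongSupplier clock, Function<K, V> loader) {
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
        this.loader = loader;
    }

    /** Returns the cached value if it is fresh enough, otherwise reloads and stores it. */
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now - cached.storedAtMillis < maxAgeMillis) {
            return cached.value;
        }
        // Two threads may both reload the same key here; that is wasteful but harmless
        V fresh = loader.apply(key);
        entries.put(key, new Entry<>(fresh, now));
        return fresh;
    }

    /** Empties the cache, like the "clear" operation exposed through JMX. */
    void clear() {
        entries.clear();
    }
}
```

In production the clock would be `System::currentTimeMillis`; the point of the sketch is that the first caller pays for the slow lookup and every later caller within the timeout gets the stored answer.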

March 22, 2006

Client side woes

I spend most of my time programming server side stuff and don't really have to spend long knocking out a bit of HTML/CSS here and there.

However, we have recently got a bit more adventurous in terms of what we do on the client side so I've had to spend a bit more time on that side of things. My most recent and fiddly job was doing a nice DHTML/AJAX user and group finder for SiteBuilder (our CMS).

We have recently had designed a lovely new look and feel for the editing side of SiteBuilder2. One of the first screens for a proper make over is the permissions screen. Part of that is getting this new user and group picker working. Traditionally we did this with a good old fashioned popup window and javascript to populate the fields back again.

SiteBuilder Permissions screen

Now with our emphasis on smoother client side operations, I've done it with AJAX and a floating DHTML window.

SiteBuilder Permissions screen with AJAX

The popup is still a bit ugly as it's awaiting a new skin, but it functions well. However, my journey to this point has been a painful one.

  1. Getting my head around the Prototype library has taken a while, but now I have, I am pretty pleased with it
  2. Getting the CSS working for this popup when it can be positioned either above or below the little icon depending on what screen you're on and positioning arrows is nasty nasty work
  3. My crappy windows Apache 1.3 has been rubbish lately and complaining lots about (Resource deadlock avoided: mod_rewrite: failed to lock file descriptor) so I've finally upgraded to Apache 2 and all is well again
  4. I've moaned for ages about how hard it is to debug CSS problems in IE, but have just discovered that it does in fact have quite a nice DOM inspector, very similar to the Firefox one: the IE Developer Toolbar. You can't actually edit CSS live with it though :(
  5. It is harder than it should be to do AJAX within AJAX. By this I mean that the popup has 4 tabs, each of which is loaded on demand from a different server side controller. Then, on 2 of those tabs, there is a search which shows results in a "find as you type" kind of way in another AJAX results box beneath the search boxes. Phew…nasty. The problem is the way javascript functions are registered with the browser: you have to pre-register all your javascript functions in the main page, and returned AJAX content can call those functions but can't create new ones :(

However, all the above trials were overcome and we now have quite a neat little user and group picker so that our users can really easily assign permissions to individual users or groups of users as defined in our WebGroups system. Yay for us.

February 28, 2006

LDAP filters

I recently did a bit of work to make a nice little AJAX/DHTML user picker for SiteBuilder2. It is basically an in page popup that allows quick searching of Warwick users by first and last names to find their usercode. This is useful for helping people work out usercodes for permissions and properties pages and such.

One problem was that it was a touch slow, especially for very broad searches such as everyone with a first name starting with K and last name starting with S.

In LDAP terms, we were doing the following:

// ctx is the DirContext for the directory (variable name assumed)
NamingEnumeration searchAnswer = ctx.search("o=Warwick", "(&(givenName=K*)(sn=S*))", sctls);
This works just fine and always used to return around 300 users. However, we always had to check for any expired accounts after the results were returned. Because account expiry was not very well populated in the past, only a few out of those 300 would be filtered out. However, after the recent tidying up of the directory due to password resets, there are now many more disabled accounts in NDS (our directory), which is a good thing. Now we can do this:
NamingEnumeration searchAnswer = ctx.search("o=Warwick", "(&(givenName=K*)(sn=S*)(!(logindisabled=*)))", sctls);
This matches only people who fit the first name and last name searches and who also do not have a logindisabled attribute. It now returns just 97 results and is around twice as fast, meaning our user picker searches should be much faster from now on.
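Since the search terms come straight from what a user types into the picker, the filter string is worth building with escaping in place. The helper below is a hedged sketch: `UserSearchFilter` and its method names are hypothetical, and the escaping follows the special characters listed in RFC 4515 rather than anything stated in the original code.

```java
/** Hypothetical helper that builds the name-prefix filter with escaped input. */
final class UserSearchFilter {

    /** Escapes the characters that RFC 4515 treats as special in filter values. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\': out.append("\\5c"); break;
                case '*':  out.append("\\2a"); break;
                case '(':  out.append("\\28"); break;
                case ')':  out.append("\\29"); break;
                case '\0': out.append("\\00"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Prefix match on first and last name, excluding accounts that have logindisabled set. */
    static String activeUsersByNamePrefix(String givenNamePrefix, String surnamePrefix) {
        return "(&(givenName=" + escape(givenNamePrefix) + "*)"
             + "(sn=" + escape(surnamePrefix) + "*)"
             + "(!(logindisabled=*)))";
    }
}
```

Calling `UserSearchFilter.activeUsersByNamePrefix("K", "S")` produces the same filter string as the second search above, and the result can be passed as the filter argument of the search call.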

February 24, 2006

Character encoding, Unicode and UTF–8


When you're dealing with reading data from various sources and then end up doing some processing on it and display it on the web, most of the time you don't worry about character encoding. However, occasionally it comes along and bites you.

I always used to know that there were different character encodings and you could end up not displaying international characters properly if you used the wrong type and so on, but I didn't really know about it in depth. This is where good old Joel comes in. He wrote an article a while back entitled "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets". He does a pretty good job of explaining things.

My specific problem was that an international student's name was coming out of our directory (NDS) like this: H??hner. It turns out they are actually called Hühner. So that one character was being turned into ??. No good. Usually I just say "oh, some character encoding problem" and give up. But sadly I was determined to get to the bottom of it. Upon closer inspection, the ?? were an artifact of displaying the name on the web (a different encoding again), but in my Java code, their name was: H├╝hner. Nice.

Doing an ethereal trace on the traffic to my machine when I queried NDS for this person, I saw that:
48 e2 94 9c e2 95 9d 68 6e 65 72
seemed to represent our user's name. This is hex and, having had a look at some character encoding charts, it turns out that this is UTF-8. Is there an easy way of fiddling about with different encodings in Java? Not that I can find. So, following the instructions on UTF-8 encodings from here, I worked out that in Unicode that UTF-8 sequence is:
0x48, 0x251C, 0x255D, 0x68, 0x6e, 0x65, 0x72
Which does indeed turn into H├╝hner. So, nothing was wrong in my code and it proved that NDS was storing something obscure. Pleasingly, after a quick email to our friendly systems team with this evidence, they got it fixed and are now going through the directory trying to fix bad entries and work out where this strange encoding is coming from. Hopefully our international students will soon no longer be seeing their names scrambled :)
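The manual decoding step can also be reproduced in a few lines of Java using `new String(bytes, charset)`. The sketch below is an illustration, not code from the original investigation: the class `TraceDecoder` and its methods are hypothetical, and it uses `StandardCharsets`, which arrived in Java 7, after this entry was written.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for turning a hex dump from a network trace into code points. */
final class TraceDecoder {

    /** Parses space-separated hex bytes such as "48 e2 94 9c". */
    static byte[] parseHex(String hex) {
        String[] parts = hex.trim().split("\\s+");
        byte[] bytes = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            bytes[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return bytes;
    }

    /** Decodes the bytes as UTF-8 and lists each resulting code point in hex. */
    static String codePoints(byte[] utf8) {
        String text = new String(utf8, StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder();
        for (int cp : text.codePoints().toArray()) {
            if (out.length() > 0) {
                out.append(", ");
            }
            out.append("0x").append(Integer.toHexString(cp).toUpperCase());
        }
        return out.toString();
    }
}
```

Feeding it the traced bytes, `TraceDecoder.codePoints(TraceDecoder.parseHex("48 e2 94 9c e2 95 9d 68 6e 65 72"))`, gives the same code points as the list above. For comparison, a correctly stored "Hühner" is 48 c3 bc 68 6e 65 72 in UTF-8. The bytes c3 and bc are the characters ├ and ╝ in the old DOS code page 437, so one plausible explanation (an assumption, not something confirmed here) is that the name was read through such a code page and then encoded to UTF-8 a second time.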

Geek talk over.

February 16, 2006

Technologist Manifesto…, or Things Everyone in IT Should Know


If you're in IT, take a read of the whole thing…in brief though:

  • Bad Technology is Your Fault
  • Users aren't Born Stupid, You Train Them to be That Way
  • You want to make your system easier to use than to not use.
  • If the Solution Seems Too Simple, Use It
  • Eliminate Jobs – Everywhere
  • Make People Better
  • Keep a Junior Nearby
  • Understand the Good of "Good Enough"
  • Respect the Database for What it is
  • The 3-by Rule

Denormalising and rolling up data for performance reasons

As I finally got jProfiler working on a live load of BlogBuilder the other day, I've been finding bottlenecks where you'd not really expect them.

It's one thing to profile a bit of code or a few requests and see where problems might be, but profiling real load for 10 minutes or so gives a much better big picture. Without the efficiency of jProfiler 4 and Java 5, doing this on a live system would not be practical…but thankfully it worked. I didn't leave it running long because inevitably it increased the load on the system considerably, but it was bearable.

If you've got a bad bit of code, but it hardly ever gets called, you can let it slide. If you've got bad code that gets hit a lot…you've got a problem.

Most of the code that we've now optimised wasn't a problem as such before, but with the hibernate bug I mentioned in my last post, we're trying to reduce querying more aggressively than we were before.

The database person in me says that I want a nicely normalised database, but sometimes it's just not efficient. We have now rolled up comment, trackback and image counts as these were disproportionately expensive counts.

What makes doing these roll-ups a pain in a lot of places is the permissions system, as we generally serve a completely personalised page to everyone. We do however have a "publicly viewable" flag for many objects, which makes things a bit easier for non-logged in views.

The other thing that was surprisingly expensive is our textile renditions. We use textile in a lot of places to turn textile markup into html. These conversions range from one line titles to entries with thousands of words. We have always cached the converted textile->html for entries as these are large chunks of text. However, we have not until today (thanks to some coding done by Mat today) cached the converted text of comments. Even now with those cached, it is tempting to do a textile-lite that doesn't do the full parse of every little string (there are just too many bloody regexs), but just things like bold and italics.

February 15, 2006

Hibernate query caching

Looks like my efforts to make BlogBuilder more efficient by caching more queries have caused me some unforeseen trouble.

We had what appeared to be deadlocking in the application the other day. Doing a thread dump (thank god for "kill -3"), we saw that all of the threads were blocked in SoftLimitMRUCache.

Turns out it is a Hibernate bug: "Concurrent access issues with both SoftLimitMRUCache and SimpleMRUCache".

We were hitting the caches so much that a subtle hibernate bug appeared. We'll await a fix, but in the meantime I'm trying to optimise BlogBuilder in other ways.

  • Roll up data rather than do live counts, such as comment, trackback and image counts
  • Better indexes and improved query efficiency so that I don't have to cache the queries
  • Profiling like crazy to find the hotspots. I recently got JProfiler working on a live instance of JBoss running BlogBuilder. It shed a lot of light on where our real bottlenecks are…and as usual, they are not where you expect.

My hope when I was building BlogBuilder was that I could make everything dynamic and live for every user, as this would provide a more personal and dynamic experience. Sadly this is not terribly efficient and I'm having to start to be a bit more pragmatic about where it is really necessary to do live checks rather than use static data.

February 08, 2006

Hibernate and efficient queries

First post of the year (bad boy)...

I've spent a lot of this year so far jumping between lots of different things. I've started dipping into the new SiteBuilder code which is far more familiar as it is now Spring/Hibernate based rather than Struts/EJB.

I've also as usual been working on Single Sign-On and BlogBuilder.

As the complexity of BlogBuilder grows and our page views grow (now averaging more than 50,000 proper real people page views per day), it has become more and more important to optimise BlogBuilder for better performance.

Hongfeng, our resident Oracle expert, pointed me in the direction of quite a lot of particularly bad and slow pieces of SQL that were being generated out of BlogBuilder. The problem with BlogBuilder is that it is very very dynamic. We do not serve any static pages, as every single page is customised to the currently logged in user and every blog/image/entry has its own permissions. There are also just a lot of different views on the blogs data: daily views, monthly views, favourites views, entries by tags, blogs by group, images by day, etc…

When using Hibernate 2 I did most of these queries with HQL, and it worked quite well, but I'm starting to feel the strain as some of the queries got more and more complicated.

With Hibernate 3 I can now take advantage of the Criteria API, which is quite nice for building complicated queries, but it still has some problems so I've now got a mix of HQL, Criteria and plain old SQL when a particularly complicated aggregation is needed.

Don't forget to turn on query caching and specifically tell your criteria and queries to cache as although the documentation says that for most queries caching doesn't make much difference, I've found it can make a huge difference.

Another little trick is to be careful with date range queries. If you want to do something like find items based on the current time, round your time to the nearest hour or minute rather than passing in a date with second or millisecond accuracy as this will prevent those queries being cached for more than a second…not a lot of good.
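The rounding trick might look something like the sketch below. It is an illustration under assumptions: `CacheFriendlyTime` and `startOfHour` are hypothetical names, it rounds down rather than to the nearest hour, and it uses the `java.time` API from Java 8, which is newer than the Hibernate versions discussed here.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Date;

/** Hypothetical helper for building date parameters that are friendly to query caching. */
final class CacheFriendlyTime {

    /**
     * Rounds down to the start of the hour, so every query issued within the
     * same hour passes an identical parameter value and can share a cache entry.
     */
    static Date startOfHour(Instant now) {
        return Date.from(now.truncatedTo(ChronoUnit.HOURS));
    }
}
```

A query would then bind `CacheFriendlyTime.startOfHour(Instant.now())` instead of `new Date()`. The trade-off is that results can be up to an hour stale, so the rounding unit should match how fresh the view needs to be.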

Another trick when moving from Hibernate 2 to Hibernate 3 is that you used to have to do "query.iterate().next()" to get a result when you knew there was just a single result (such as a count query), but now there is the uniqueResult() method. It is important to switch over because the uniqueResult() calls get cached, but the iterate().next() ones don't.
