All 9 entries tagged Programming


December 09, 2005

Serializing Java objects to Oracle

We recently had a requirement to use our new Shibboleth-based Single Sign On system with a cluster of JBoss servers running an essentially stateless application.

Our new SSO works through the SAML POST profile, meaning that an authentication assertion is posted by the user's browser to the Shire service. The Shire service then makes an attribute request back to SSO, puts the results into an in-memory user cache, and generates a cookie which links to the user in the cache.

The problem is that the next request might go to another member of the cluster, which doesn't share the cache and so won't know about the user represented by the cookie. The obvious solution is some kind of clustered cache.

We've not needed any clustered cache technology before, so we passed on the likes of Coherence (insane pricing) and open source caches such as memcached. It is best not to introduce new technologies that you can't support unless you have to.

I ended up building a simple two-level cache that put the data both in memory and in the database. If, when a request came in, there was nothing in the memory cache, it checked the database and populated the memory cache from there. I wouldn't want to go to the database every time, as this is a very busy application that could do without the additional overhead.
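In outline, the lookup went something like this (a sketch rather than our exact code; UserCacheItem is the cached value type, and getFromDatabase() is a stand-in for the database check shown further down):

private final Map memoryCache = Collections.synchronizedMap(new HashMap());

public UserCacheItem get(String key) {
    // First level: the in-memory cache local to this cluster member
    UserCacheItem item = (UserCacheItem) memoryCache.get(key);
    if (item == null) {
        // Second level: the database cache shared by the whole cluster
        item = getFromDatabase(key);
        if (item != null) {
            memoryCache.put(key, item);
        }
    }
    return item;
}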

Now, the code.

ByteArrayOutputStream baos = new ByteArrayOutputStream();
ObjectOutputStream oos;
try {
    oos = new ObjectOutputStream(baos);
    oos.writeObject(value);
    // close to flush any buffered data into the byte array
    oos.close();
} catch (IOException e) {
    throw new RuntimeException("Could not write object to stream", e);
}

// Insert the serialized object as a BLOB, using Spring's SqlUpdate helper
SqlUpdate su = new SqlUpdate(getDataSource(),
        "INSERT INTO objectcache (key, objectdata, createddate) VALUES (?, ?, ?)");
su.declareParameter(new SqlParameter("key", Types.VARCHAR));
su.declareParameter(new SqlParameter("objectdata", Types.BLOB));
su.declareParameter(new SqlParameter("createddate", Types.DATE));
su.compile();

Object[] parameterValues = new Object[3];
parameterValues[0] = key.toString();

// Wrap the serialized bytes in an SqlLobValue so Spring handles the BLOB binding
LobHandler lobHandler = new DefaultLobHandler();
parameterValues[1] = new SqlLobValue(baos.toByteArray(), lobHandler);

parameterValues[2] = new java.sql.Date(new Date().getTime());

su.update(parameterValues);
Not knowing how big these objects were going to be, I figured it would be best to put this in a blob, but that has its own joys, especially with plain old JDBC. I used Spring's very handy JDBC helpers to make my life easier. If you want to get the object back out:
// Deserialize the BLOB column straight back into the cached object
ObjectInputStream ois = new ObjectInputStream(
        new DefaultLobHandler().getBlobAsBinaryStream(resultSet, 1));
UserCacheItem dbItem = (UserCacheItem) ois.readObject();
return dbItem;
Basically, just select the row back and use an ObjectInputStream to deserialize the object back into existence. Simple.
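Put together, the whole read path with Spring's JdbcTemplate might look something like this (a sketch, assuming the objectcache table from above; getJdbcTemplate() is a hypothetical accessor for a configured JdbcTemplate, and RowMapper is Spring's org.springframework.jdbc.core.RowMapper):

public UserCacheItem getFromDatabase(String key) {
    List results = getJdbcTemplate().query(
            "SELECT objectdata FROM objectcache WHERE key = ?",
            new Object[] { key },
            new RowMapper() {
                public Object mapRow(ResultSet rs, int rowNum) throws SQLException {
                    try {
                        // Stream the BLOB column and deserialize it
                        ObjectInputStream ois = new ObjectInputStream(
                                new DefaultLobHandler().getBlobAsBinaryStream(rs, 1));
                        return ois.readObject();
                    } catch (IOException e) {
                        throw new RuntimeException("Could not read object from stream", e);
                    } catch (ClassNotFoundException e) {
                        throw new RuntimeException("Could not read object from stream", e);
                    }
                }
            });
    return results.isEmpty() ? null : (UserCacheItem) results.get(0);
}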

December 08, 2005

Is 99.999% uptime only for Wal-Mart?

Writing about web page http://37signals.com/svn/archives2/dont_scale_99999_uptime_is_for_walmart.php

I've linked to an article on the 37signals blog that talks about uptime for web applications. They state that you only need to worry about 99.999% uptime once you're doing big business.


Wright correctly states that those final last percent are incredibly expensive. To go from 98% to 99% can cost thousands of dollars. To go from 99% to 99.9% tens of thousands more. Now contrast that with the value. What kind of service are you providing? Does the world end if you’re down for 30 minutes?

If you’re Wal-Mart and your credit card processing pipeline stops for 30 minutes during prime time, yes, the world does end. Someone might very well be fired. The business loses millions of dollars. Wal-Mart gets in the news and loses millions more on the goodwill account.

Now what if Delicious, Feedster, or Technorati goes down for 30 minutes? How big is the inconvenience of not being able to get to your tagged bookmarks or do yet another ego-search with Feedster or Technorati for 30 minutes? Not that high. The world does not come to an end. Nobody gets fired.

Having a quick look at our wonderful IPCheck monitoring software, these are our uptime figures for the last 3 months.

  • BlogBuilder: 99.70% (5h40m downtime)
  • SiteBuilder: 99.93% (24m downtime)
  • Forums: 98.97% (27h downtime)
  • Single Sign On: 99.89% (1h43m downtime)

It doesn't really matter whose fault those 0.30%, 0.07%, 1.03% and 0.11% are. Sometimes things are just slow rather than down, sometimes things just break, sometimes it's the network, and sometimes it's human error during a redeploy. All our users see is that the system is down for some small period of time. In many cases the system is not actually down at all; a single request from the monitoring server simply failed… but to be fair, if that can happen to the monitor, the chances are it occasionally happens to a user without the monitor noticing either.

This is just a small selection (but of the most commonly used systems we monitor), but you can see that we have good uptime. Would it matter if we were a couple of percentage points lower? As always…it depends.

If Single Sign On was down for an hour on a single Monday morning and that was the only downtime that month, it'd look like a fantastic month of 99.9% uptime (an hour out of a 720-hour month is 99.86%, which rounds up to 99.9%). Unfortunately many systems rely on SSO, so that hour would at least degrade, if not bring down completely, all of those other systems too, adding up to a very nasty bit of downtime.

The 37signals article is correct that you do have to spend quite a bit of money to get that extra percentage point, but in the environment we work in, where so many people have come to rely on our services, it is important.

If, however, you need occasional planned downtime and you can let everyone know in advance, that is fine, as people can make other plans. Pure uptime is not always what matters; it is keeping unplanned downtime to a minimum that counts.


December 07, 2005

Google: Ten Golden Rules

Writing about web page http://www.msnbc.msn.com/id/10296177/site/newsweek/

Google's ten golden rules are an interesting read to get a feel for what makes Google tick. My favourite:

Encourage creativity. Google engineers can spend up to 20 percent of their time on a project of their choice. There is, of course, an approval process and some oversight, but basically we want to allow creative people to be creative. One of our not-so-secret weapons is our ideas mailing list: a companywide suggestion box where people can post ideas ranging from parking procedures to the next killer app. The software allows for everyone to comment on and rate ideas, permitting the best ideas to percolate to the top.

November 24, 2005

Remote VisualGC for JBoss

Follow-up to Garbage collection and Hibernate performance tuning from Kieran's blog

We've recently upgraded the BlogBuilder server to Java 5 and JBoss 3.2.7. I was hoping that this would cure some of my garbage collection (GC) problems, but sadly it has not. I am still getting some very long blocking 20–30 second full GCs.

My backup plan was that if the upgrade alone didn't fix things, then at least I could finally use the new remote monitoring tools that come with Java 5.

VisualGC is a tool that lets you visualise what is going on in a Java process's GC. As mentioned in my previous article, you can do some analysis by looking at the GC log files you can get out of the JVM, but they don't really show quite what's going on. VisualGC instead gives you a live graphical view of the generations and their GC activity.

Getting this to work on your local machine is easy: you just follow the instructions.

The remote stuff is slightly trickier, as you'll have to run an RMI registry on the remote machine, along with the jstatd daemon that generates the statistics about the process.

You'll also need to get over a few permissions issues. Create a file called jstatd.all.policy in the directory where you want to run jstatd, containing the following:

grant codebase "file:${java.home}/../lib/tools.jar" {
    permission java.security.AllPermission;
};
Make sure that the rmiregistry and jstatd are running as a user who has permission to see into the processes that you want to monitor.

rmiregistry 2020 &
jstatd -J-Djava.security.policy=jstatd.all.policy -p 2020 &

Importantly, you have to specify a port. I am using 2020 here, but you can use any port that isn't already in use; the default port is 1099, which will usually be taken by your JBoss install. If you are getting access denied errors when trying to run jstatd, there is probably something wrong with your policy file. If you get a ClassNotFoundException for the sun.jvmstat.monitor.remote.RemoteHost class, then you probably have a problem with Java versions or paths.

Once these two processes are running, you should be able to hook into and monitor your remote process like this:

visualgc pid@host:2020
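If you don't know the pid of the remote JVM, jstatd also lets you list the VMs on the remote machine with jps, using the same port:

jps -l remotehost:2020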
Good luck!

November 18, 2005

JBoss tuning and sliming

Writing about web page http://wiki.jboss.org/wiki/Wiki.jsp?page=JBossASTuningSliming

I came across this JBoss tuning and sliming page on the JBoss wiki a while back, but never really tried it out very much. I was forced to upgrade my JBoss from 3.2.6 to 3.2.7 today so that I could use the new Java 5 monitoring console, so I thought I'd give it a go.

By taking out a load of stuff that I just don't need, I managed these results:

Before:

  • Clean startup with no webapps deployed: 15s
  • Startup with BlogBuilder deployed: 34s

After:

  • Clean startup with no webapps deployed: 8s
  • Startup with BlogBuilder deployed: 27s

Try it out, it's worth a look.


November 16, 2005

Agile programming

Writing about web page http://www.dilbert.com/


October 12, 2005

Joel on Software – Set Your Priorities

Writing about web page http://www.joelonsoftware.com/articles/SetYourPriorities.html

Joel Spolsky is a clever guy. In case you've not come across him before, he's a genius software engineer who provides some great insight into the software development business.

If you're in IT, it wouldn't hurt to go back and read some of his archives as he is a great communicator with some great ideas.

His latest is all about setting priorities for features. If you're never sure which feature to do next…read it.


LDAP connection pooling

We recently had problems with load on our single sign on (SSO) server. Being the start of term, things are generally busier than the rest of the year and we often see higher load than normal. However, this was too far from normal to be right.

A bit of investigation showed that our JBoss instance had literally hundreds and hundreds of threads. The lsof utility is very handy in cases like this:

lsof -p <procid>

This revealed hundreds of open connections to our LDAP servers. Not good.

Looking at the LDAP code we have, there are two places where we make LDAP connections, or contexts as they are known in Java.

Hashtable env = new Hashtable();
env.put(Context.INITIAL_CONTEXT_FACTORY,
        "com.sun.jndi.ldap.LdapCtxFactory");
env.put(Context.PROVIDER_URL, "ldap://ourldap.warwick.ac.uk");
LdapContext ctx = new InitialLdapContext(env, null);
// do something useful with ctx
ctx.close();

This is pretty much how our code worked in both places. Importantly, I'd checked that the contexts were always closed… and they were.

This is where LDAP connection pooling came into the picture. It turned out that one piece of code (not written by us) used this:

env.put("com.sun.jndi.ldap.connect.pool", "true");

This turns on connection pooling. However, we didn't use pooling in the other bit of code, so one or the other wasn't working. Trying out pooling in both bits of code didn't improve things either, basically because this is a multi-threaded application with hundreds of requests a minute: if you just keep creating new LdapContexts from a brand new LdapCtxFactory, you are using a new LdapCtxFactory every time.

Thankfully our SSO application uses Spring, so it was simple enough to create an XML entry for the LdapCtxFactory and the environment config (see the update at the bottom of this entry) and plug the same LdapCtxFactory into the two places it was needed. At least now we were using the same factory.

We could now do this:

// Copy the shared base environment, then add the per-user credentials
Map env = new Hashtable();
env.putAll(getLdapEnv());
env.put("java.naming.security.principal", user);
env.put("java.naming.security.credentials", pass);
LdapContext ldapContext = (LdapContext)
        getLdapContextFactory().getInitialContext((Hashtable) env);

Here the base LDAP environment and the LdapCtxFactory are injected where they are needed; just the username and password to bind as are passed in dynamically.

To really know whether pooling is working, you need to turn on debugging for LDAP connection pooling by adding a Java option to your test/application/server. There are other handy options for tweaking the pooling behaviour as well:

-Dcom.sun.jndi.ldap.connect.pool.debug=fine
-Dcom.sun.jndi.ldap.connect.pool.initsize=20
-Dcom.sun.jndi.ldap.connect.pool.timeout=10000

The debugging will give you messages like this if pooling isn't working:

Create com.sun.jndi.ldap.LdapClient@c87d32[nds.warwick.ac.uk:389]
Use com.sun.jndi.ldap.LdapClient@c87d32
Create com.sun.jndi.ldap.LdapClient@c81a32[nds.warwick.ac.uk:389]
Use com.sun.jndi.ldap.LdapClient@c81a32
Create com.sun.jndi.ldap.LdapClient@a17d35[nds.warwick.ac.uk:389]
Use com.sun.jndi.ldap.LdapClient@a17d35
Create com.sun.jndi.ldap.LdapClient@1a7e35[nds.warwick.ac.uk:389]
Use com.sun.jndi.ldap.LdapClient@1a7e35

New connections are just being created every time with no reuse. What you should see is:

Use com.sun.jndi.ldap.LdapClient@17bd5d1
Release com.sun.jndi.ldap.LdapClient@17bd5d1
Create com.sun.jndi.ldap.LdapClient@cce3fe[nds.warwick.ac.uk:389]
Use com.sun.jndi.ldap.LdapClient@cce3fe
Release com.sun.jndi.ldap.LdapClient@cce3fe
Use com.sun.jndi.ldap.LdapClient@1922b38
Release com.sun.jndi.ldap.LdapClient@1922b38
Use com.sun.jndi.ldap.LdapClient@17bd5d1
Release com.sun.jndi.ldap.LdapClient@17bd5d1

As you can see, there are actually two differences here between a fully working connection pool and a well and truly broken one.

  1. There are very few creates and lots of reuse in the good code
  2. There are lots of releases after connection use in the good code

This is where we came across our second problem. Although in theory the connection pooling was working and I could see some reuse, it was still creating a lot of connections, and I was hardly seeing any 'Release' messages.

Chris hit the nail on the head by pointing out that NamingEnumerations could well be just like PreparedStatements and ResultSets in JDBC: it is all well and good closing the connection/context itself, but if you don't close the other resources, the connection won't actually be released.
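In code terms, the fix looks roughly like this (a sketch; the search base and filter are made up, and NamingException handling is left to the caller):

NamingEnumeration results = null;
try {
    results = ctx.search("ou=people,dc=warwick,dc=ac,dc=uk", filter, controls);
    while (results.hasMore()) {
        SearchResult result = (SearchResult) results.next();
        // ... use the result ...
    }
} finally {
    // Closing only the context is not enough: the pooled connection
    // is not released until the NamingEnumeration is closed too
    if (results != null) {
        results.close();
    }
    ctx.close();
}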

The proof of this shows up again in lsof or netstat. A context that has been closed but still has an open NamingEnumeration shows up like this:

java    21533 jboss   80u  IPv6 0x32376e2cf70   0t70743    TCP ssoserver:60465->ldapserver.warwick.ac.uk:ldap (ESTABLISHED)

However, when everything is properly closed, the connection should be released and left waiting to be torn down, like this:

java    21533 jboss   80u  IPv6 0x32376e2cf70   0t70743    TCP ssoserver:60465->ldapserver.warwick.ac.uk:ldap (TIME_WAIT)

Upon closing all NamingEnumerations, we finally got the perfect result: hundreds of requests a minute and only ever around 10–15 LDAP connections open at any one time.

So, lessons learnt.

  • When creating contexts, share the factory to use pooling
  • Make sure you close everything. If it has a close()...use it!
  • Occasionally take a look at the open connections and threads that your application has… it might surprise you.

Update:

Spring config:


<bean id="ldapContextFactory" class="com.sun.jndi.ldap.LdapCtxFactory" singleton="true"/>

<bean id="ldapEnv" class="java.util.Hashtable">
    <constructor-arg>
        <map>
            <entry key="java.naming.factory.initial"><value>com.sun.jndi.ldap.LdapCtxFactory</value></entry>
            <entry key="java.naming.provider.url"><value>ldaps://ourldap.ac.uk</value></entry>
            <entry key="java.naming.ldap.derefAliases"><value>never</value></entry>
            <entry key="com.sun.jndi.ldap.connect.timeout"><value>5000</value></entry>
            <entry key="java.naming.ldap.version"><value>3</value></entry>
            <entry key="com.sun.jndi.ldap.connect.pool"><value>true</value></entry>
        </map>
    </constructor-arg>
</bean>

Update:
We now do connection pooling over LDAPS as well, so we use the additional system property:

-Dcom.sun.jndi.ldap.connect.pool.protocol="plain ssl"

October 06, 2005

Bulk deleting bad data

I had to clear up some old bad data that was left over from a bit of bad code. Unfortunately the bad data didn't rear its ugly head until recently, by which time a lot of it had built up. It was also very hard to detect, because of the many places it could be referenced from: only if it had no references to it from any of 7 places did it need to be deleted.

This means doing a really horrible query, either like this:

select id from atable where
id not in (select id from anothertable)
and
id not in (select id from yetanothertable)
and
id not in (select id from moretables)
.....
.....

This is very, very, very slow.

Or, more efficiently, like this:

select id from atable a where
not exists (select id from anothertable b where a.id = b.id)
and
not exists (select id from yetanothertable c where a.id = c.id)
and
not exists (select id from moretables d where a.id = d.id)
.....
.....

However, when you are dealing with potentially hundreds of thousands of rows, it is still quite slow… but it does get there. The next problem is actually deleting the data once you've managed to select it. As a little test I thought I'd try to delete the whole lot in one go, but that just didn't work… too slow. Even if I did have the patience to leave it running for hours, I couldn't let it lock up the database like that for so long.

So, the only solution was to do it in batches. I wrote a quick Java program that would iterate through and do the deletes in small batches of 100 or so at a time. My first mistake was trying to reuse some Spring/Hibernate code I already had instead of going straight back to old school JDBC.

Although in theory you can get a Connection object from the Hibernate Session via session.connection(), it really is NOT the same as just getting a good old-fashioned JDBC connection. The deletes were taking absolutely ages, so I profiled it and noticed that Hibernate was still trying to do some of its funky stuff in the background, really slowing things down.

Plan B (or is it D by now?): Spring comes with a handy little JdbcTemplate which lets you do real JDBC, but without a lot of the exception handling and connection/statement/resultset closing pains. Finally… it worked.
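The batching itself only takes a few lines with JdbcTemplate (a sketch, assuming ids is the List of ids already selected with the not exists query above):

JdbcTemplate template = new JdbcTemplate(dataSource);
int batchSize = 100;
for (int i = 0; i < ids.size(); i += batchSize) {
    List batch = ids.subList(i, Math.min(i + batchSize, ids.size()));
    // build "delete from atable where id in (?,?,...)" for this batch
    StringBuffer sql = new StringBuffer("delete from atable where id in (");
    for (int j = 0; j < batch.size(); j++) {
        sql.append(j == 0 ? "?" : ",?");
    }
    sql.append(")");
    // each batch is its own statement, so the database is never tied up for long
    template.update(sql.toString(), batch.toArray());
}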

So, lesson of the day:

  • not exists type queries are faster than not in queries
  • Bulk deletes can be verrrrrrry slow
  • Batching deletes is better, but with real JDBC, not Hibernate SQL calls
