All entries for Thursday 06 October 2005

October 06, 2005

Domain hacks

Writing about web page http://www.xona.com/domainhacks/

Great way of finding a personal domain name using different international domain extensions. In theory I can get:

kieran domain hacks

<a href="http://kier.an/">link</a>  kier.an  .an  Netherlands Antilles  .an  [whois]
<a href="http://ki.er/an/">link</a> ki.er .er Eritrea .er [whois]
<a href="http://k.ie/ran/">link</a> k.ie .ie Ireland .ie [whois]
<a href="http://x.ki/eran/">link</a> x.ki .ki Kiribati .ki [whois]

Google Maps coolness: Frapper

Writing about an entry you don't have permission to view

Stuart has found a new google maps applications called Frapper which lets you map your own groups of people. Where are the Warwick bloggers from?

Bulk deleting bad data

I had to clear up some old bad data that was left over from a bit of bad code. Unfortunately the bad data didn't rear its ugly head until recently and a lot of data had built up. It was also very hard to detect the bad data because of many places it could be referenced from…only if it had no references to it from any of 7 places would it need to be deleted.

This means doing a really horrible query either like this:

select id from atable where
id not in (select id from anothertable)
and
id not in (select id from yetanothertable)
and
id not in (select id from moretables)
.....
.....

This is very, very, very slow.

A more efficient way of doing this is this:

select id from atable a where
not exists (select id from anothertable b where a.id = b.id)
and
not exists (select id from yetanothertable c where a.id = c.id)
and
not exists (select id from moretables d where a.id = d.id)
.....
.....

However, when you are dealing with potentially 100,000's of rows it is still quite slow…but it does get there. The next problem is actually deleting the data once you've managed to select it. As a little test I thought I'd try and delete the whole lot, but that just didn't work…too slow. Even if I did have the patience to leave it running for hours, I couldn't let it lock up the database like that for that long.

So, the only solution was to do it in batches. I wrote a quick java program that would iterate through an do the deletes in small batches of a 100 or so at a time. My first mistake was trying to reuse some Spring/Hibernate code I already had instead of going straight to old school and using JDBC.

Although in theory you can get a Connection object from the hibernate Session, via session.connection(), it really is NOT the same as just getting a good old fashioned JDBC connection. The deletes were taking absolutely ages, so I profiled it and noticed that hibernate was still trying to do some of its funky stuff in the background, really slowing things down.

Plan B (or is it D by now?). Spring comes with a handy little JdbcTemplate which lets you do real JDBC but without a lot of the exception and connection/statement/resultset closing pains. Finally…it worked.

So, lesson of the day:

  • not exists type queries are faster than not in queries
  • Bulk deletes can be verrrrrrry slow
  • Batching deletes is better, but with real JDBC not hibernate SQL calls

October 2005

Mo Tu We Th Fr Sa Su
Sep |  Today  | Nov
               1 2
3 4 5 6 7 8 9
10 11 12 13 14 15 16
17 18 19 20 21 22 23
24 25 26 27 28 29 30
31                  

Tags

Search this blog

Most recent comments

  • One thing that was glossed over is that if you use Spring, there is a filter you can put in your XML… by Mathew Mannion on this entry
  • You are my hero. by Mathew Mannion on this entry
  • And may all your chickens come home to roost – in a nice fluffy organic, non–supermarket farmed kind… by Julie Moreton on this entry
  • Good luck I hope that you enjoy the new job! by on this entry
  • Good luck Kieran. :) by on this entry

Galleries

Not signed in
Sign in

Powered by BlogBuilder
© MMXIX