All 25 entries tagged Tech


February 24, 2006

Designing for developers

Colin and I came across a rather interesting, and quite counter-intuitive, design problem yesterday.

Colin's working on a bit of framework for sitebuilder2 to manage long-running transaction locks. He and I discussed, briefly, the requirement a couple of days ago, which was that we wanted to be able to mark either a page or a hierarchy of pages as locked, for edit or admin access, for the duration of a use-case, request, or component of a request, with the option to either fail if the hierarchy can't be locked entirely, or continue with those portions that can be locked.

So, Colin went away and exercised his giant brain, and constructed a system which allowed for all of those options, with the option to support more if needed. It was elegant, simple, generic, and reusable.

It was also completely baffling.
By abstracting the design away from the use cases, and into the generic 'these kinds of locks on these kinds of objects', we'd ended up with an architecture that really only made sense if you were thinking about locks. If you were thinking (as most developers will be) about, say, editing a page, it was far from obvious how you should integrate with the framework.

So I thought about this for a while, and realised that actually, the fault was mine. In taking advantage of Colin's ability to turn out code at breakneck pace, I'd not bothered to think through which of the locking tactics we actually needed, and which ones just seemed like logical extensions of the system that we might need one day. I had fallen foul of the rule of YAGNI: don't over-generalise, because You Ain't Gonna Need It.

So, we looked again at the requirements, and realised that of the 24 scenarios we had originally identified, we only actually required 3: lock a page; test a hierarchy of pages for lockability at ADMIN; and test a hierarchy of pages for lockability at EDIT, failing if you can't lock the whole lot.
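To make the contrast concrete, here is a minimal sketch of what a use-case-shaped API for those three scenarios might look like. It is not the sitebuilder2 code: the class, the method names, and the in-memory map of lock holders are all invented for illustration, and it ignores the difference in permissions between ADMIN and EDIT.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these names come from sitebuilder2.
final class PageLockSketch {

    /** A page and its child pages; stands in for a real content node. */
    static final class Page {
        final String path;
        final List<Page> children = new ArrayList<>();

        Page(String path, Page... kids) {
            this.path = path;
            for (Page kid : kids) {
                children.add(kid);
            }
        }
    }

    // page path -> user currently holding the lock (assumed in-memory store)
    private final Map<String, String> holders = new HashMap<>();

    /** Scenario 1: lock a single page. Returns false if another user holds it. */
    boolean lockPage(Page page, String user) {
        String existing = holders.putIfAbsent(page.path, user);
        return existing == null || existing.equals(user);
    }

    /** Scenario 2: could this user lock the whole hierarchy for admin work? */
    boolean isHierarchyLockableForAdmin(Page root, String user) {
        return firstBlockedPath(root, user) == null;
    }

    /** Scenario 3: same test for editing, but fail loudly if any page is taken. */
    void requireHierarchyLockableForEdit(Page root, String user) {
        String blocked = firstBlockedPath(root, user);
        if (blocked != null) {
            throw new IllegalStateException(
                "Cannot lock " + blocked + ": held by " + holders.get(blocked));
        }
    }

    /** Depth-first search for the first page locked by somebody else. */
    private String firstBlockedPath(Page page, String user) {
        String holder = holders.get(page.path);
        if (holder != null && !holder.equals(user)) {
            return page.path;
        }
        for (Page child : page.children) {
            String blocked = firstBlockedPath(child, user);
            if (blocked != null) {
                return blocked;
            }
        }
        return null;
    }
}
```

The generic machinery (one tree walk) is still there, but it is private; what a developer editing a page sees is three methods named after the things they actually want to do.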

Armed with this simplification, suddenly it started to make a lot more sense. The annotations we were using began to look like something that would actually be meaningful to a developer focussed on a particular use-case. And a whole bunch of code was deleted, which is always good.

The moral of the story, then, is that generalised, reusable code is good up to a point. Sometimes you do want to have code that tells the story of the use-case you're implementing, and it can be important not to lose that in the rush to make the framework do everything.

February 23, 2006

I broke the internet

Sorry everyone. I’ve rebooted it, though, and it seems OK now.

More seriously, apologies to anyone who’s been seeing ropier-than-usual performance from pages. We’ve been trying to tune the server a bit, and, as is so often the way, it turned out to have some unexpected side effects.

All of the content on www2 is served out of Sitebuilder, a Java content management system. In order to get decent page-loading times, the Java app does a hell of a lot of caching*. This means that the app has a pretty big memory footprint (~3.5GB; the most you can allocate to a 32-bit JVM on Solaris). This in turn means that when the garbage collector runs, you get pauses of 15-20 seconds. That might not sound like much, but it’s bloody annoying when you’re browsing – it’s the click,click,click,click, ...wait…wait…ohThereItIs scenario.

So I’ve been testing out some alternative GC strategies to see if we can make it a bit better. The basic idea is to sacrifice a bit of CPU idle time to cut down on the big pauses – effectively swapping one 15-second pause every 10 minutes for a thousand 0.015-second ones (which will disappear into the background). Unfortunately this takes a fair bit of fiddling to get right; push it too far and the server spends all of its time doing tiddly GCs and none doing actual work. Which has been the state of affairs from time to time today. It takes some work to get an 8-way SPARC box to 100% utilisation, but with perseverance I’ve managed it. Oops.

To make things harder, the reliance on in-memory cache means that if you restart the server things are very slow for a while, until the caches rebuild. So we can’t just ‘apachectl graceful’ it every few minutes with a different config :( Anyway, the box load is back into the green. For now. No doubt the moment I get out the door it will go back through the roof until I get home and fix it :(

* Sitebuilder2, the work-in-progress replacement for Sitebuilder, does no caching, and just relies on very tight SQL and faster CPUs, with the option of a strategically-placed squid cache if we need it. Which I think, in the long run, is a better plan.

Update: It’s not broken yet. For the benefit of anyone coming here from Google, I thought it might be interesting to outline what changes I made:

  1. Switched from the default to the concurrent-mark-and-sweep GC.
  2. Increased the minimum size for the young generation to 150MB (previously it was 32MB, which was causing minor collections at a rate of > 1/s).
  3. Increased the RMI GC interval to about 2 hours (unfortunately we can’t get rid of it completely as we use some remote EJBs).

The big win was increasing the size of the YG; the old, small YG is what killed the CPU before. Increasing the size up to 150MB has caused the minor collection time to go up to about 0.3 of a second, but we’re only collecting once every 10-20 seconds. Once the GC interval drops below the average request duration, all hell breaks loose as the collections can never catch up. We’ll need to keep an eye on this when the load ramps up next week though.
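As a rough sanity check on those numbers, the fraction of wall-clock time lost to pauses is just pause length divided by the interval between pauses. The sketch below uses only the round figures quoted in this entry; the class and method names are made up for the example.

```java
// Back-of-envelope arithmetic only; the figures are the round numbers from the post.
final class GcOverheadSketch {

    /** Fraction of wall-clock time spent paused in GC. */
    static double pausedFraction(double pauseSeconds, double intervalSeconds) {
        return pauseSeconds / intervalSeconds;
    }

    public static void main(String[] args) {
        // Before: one 15 s full collection roughly every 10 minutes.
        System.out.println("full GC only:     " + pausedFraction(15, 600));

        // The intended trade: a thousand 0.015 s pauses in the same 10 minutes.
        // Same total pause time, but no single pause long enough to notice.
        System.out.println("many tiny pauses: " + pausedFraction(1000 * 0.015, 600));

        // After tuning: a 0.3 s minor collection every 10-20 s.
        System.out.println("150MB YG, best:   " + pausedFraction(0.3, 20));
        System.out.println("150MB YG, worst:  " + pausedFraction(0.3, 10));
    }
}
```

All four cases come out at a few percent of wall-clock time. The difference between a happy server and a melted one is therefore not the total, but whether individual pauses are short and far enough apart for requests to complete in between.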

Update on the update: Increasing the RMI GC interval above about an hour seemed to completely disable it, so I’ve set it back to exactly an hour, which seems to work. Also, I’ve dropped the MinHeapFreeRatio to 20%, because when our page-cache is fully loaded we need about 2.1GB of heap. The default MinHeapFreeRatio of 40% won’t allow for that on a 3GB heap, which meant that the server was continuously doing full collections until something caused the cache to empty.
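The MinHeapFreeRatio reasoning is simple arithmetic, sketched below under the assumption (mine, following the explanation above) that the JVM wants at least that percentage of the heap free after a collection. The entry never lists the actual command-line options, so the flag list in the code is a guess at the Sun HotSpot options that correspond to the changes described, not a copy of the real config.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative arithmetic; class and method names are invented for this sketch.
final class HeapRatioSketch {

    /** Most live data the heap can hold while still keeping the required share free. */
    static double maxLiveGb(double heapGb, int minHeapFreeRatioPercent) {
        return heapGb * (100 - minHeapFreeRatioPercent) / 100.0;
    }

    /** Does a fully loaded cache fit without violating the free-ratio target? */
    static boolean cacheFits(double heapGb, int minHeapFreeRatioPercent, double cacheGb) {
        return cacheGb <= maxLiveGb(heapGb, minHeapFreeRatioPercent);
    }

    // Assumption: HotSpot options matching the changes described in the post.
    // The post does not show the real command line. RMI intervals are in milliseconds.
    static final List<String> LIKELY_FLAGS = Arrays.asList(
        "-XX:+UseConcMarkSweepGC",
        "-XX:NewSize=150m",
        "-Dsun.rmi.dgc.client.gcInterval=3600000",
        "-Dsun.rmi.dgc.server.gcInterval=3600000",
        "-XX:MinHeapFreeRatio=20");

    public static void main(String[] args) {
        // 3GB heap at the default 40%: only 1.8GB may be live, less than the 2.1GB cache.
        System.out.println("40% free: fits = " + cacheFits(3.0, 40, 2.1));
        // 3GB heap at 20%: 2.4GB may be live, so the cache fits.
        System.out.println("20% free: fits = " + cacheFits(3.0, 20, 2.1));
        System.out.println(String.join(" ", LIKELY_FLAGS));
    }
}
```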

February 14, 2006

IDEA 5.1: You're a bit weird

Follow-up to Netbeans 5: Still not switching from Secret Plans and Clever Tricks

So, having irrationally given up on netbeans because it was too ugly and clunky, I thought I'd take another look at IDEA. I tried it a while back and got on with it quite well. This time, though, it just seemed to be an immense struggle.

Some background: the project I'm working on lives in CVS, and is worked on by half a dozen developers, mostly using eclipse. So anything that I do to it needs to not break things for them. Its layout looks something like this:

        -uk/ac/warwick/{java stuff}
        -uk/ac/warwick/{java stuff}
            -{an exploded war file structure}           
        -{docs, scripts, odds and ends}

First off, I naively tried checking out the project from CVS in the hope that IDEA could just work out how it was structured. No dice. IDEA checked the files out but refused to open them as a project. Then I tried using the eclipse exporter to make an IPR file. It sort of worked, but when I opened the project I couldn't see any of my source directories, or indeed any of the files below the root directory. It seemed that the exporter hadn't created any 'modules' which seems to be sort-of IDEA-ish for 'source folder'. So I created a couple of java modules by hand for my src and unit-test folders, and that kind of worked. But still I can't see any of my exploded war, or my docs directory. And my source files don't seem to have picked up the project's classpath – in fact they seem to be more or less completely disconnected from the root project. It's all very bizarre.

I really want to like IDEA, because I've seen in the past how nice it can be once you've got it going. But right now I feel like I'm chatting it up, but I only speak English and it only speaks French. If we can just get onto the same wavelength, I think we could get along quite well together…

Update: I'm getting there. I've worked out that I want just one java module with multiple source folders in it, and I've got the classpath sorted. I've even managed to make the fonts look nice (13pt Monaco, just like my eclipse settings). Now what I need is a quick-reference to the keybindings…

Update on the update: I found the quick-ref. I was quite getting to like IDEA, and then its crappy CVS integration allowed me to check in over the top of a conflict without even warning me. I broke the build :-( Since we do so much concurrent development here, a failing like that is pretty fatal. So I still won't be shelling out $500 for a copy. Ho hum, back to eclipse again. Shame really, as I was quite enjoying the inspection gadgets, the hierarchy, and the JSP editor.

January 12, 2006

Netbeans 5: Still not switching


Some people have been saying good things about the latest release of Sun's Netbeans Java IDE. I thought that in the interests of not missing out on anything I ought to give it a try. I have very few complaints about Eclipse, but one thing that does bite me from time to time is the sluggish performance of SWT on OSX, especially on my G4 powerbook. I'd had good experiences with IDEA's Swing-based IDE in the past, so I thought Netbeans might be worth a try.

So, download it and unpack it; up it comes. It took rather a long time to start (about 90s) on the powerbook, but that's its first time – maybe it'll be quicker next time.
Now, how do I import an eclipse project? Off to the website, discover I need to install a plug-in. WTF? I thought one of the advantages of NB was 'the best out of the box experience' – I'm installing plugins and I haven't written a line of code yet.

So, plugin installed, I locate my project and click 'import'. A progress bar whizzes to 100% in about 2 seconds. Neat. It stays like that for another 2 minutes. Not neat. Just as I'm about to kill -9 it, it comes back to me. Open the project, open a source file. UGLEEE serif fonts. ugh. Change fonts to monaco (monaco is the default for everything else on the mac… guess NB had to be different). Wait a few minutes for a background classpath scan to complete so I can see a structure view for the class.

Spot 'eclipse mode' keybindings. nice touch. Switch them on. Open a class. ctrl-O for the outline pop-up. Doesn't work. There isn't one. shift-apple-T for the 'open type' dialog. Doesn't work. (it's shift-ctrl-o for some reason). F3 to navigate to a type. Doesn't work. Give up and find keybindings crib sheet.

Navigate to a testcase. How to run it? Run->run file says 'class doesn't have a main method'. Guess I probably need a plug-in or something.

Give up. The GUI's not any faster than eclipse, it looks kinda weird, half the functionality I'm used to isn't there, and it's going to take weeks to relearn the keyboard bindings and make it work how I want it to.

Now, I grant you, I didn't give it much of a chance. But then, it's not as if I need to switch. Eclipse does everything I want it to, and I'm used to it. But when I tried IDEA I at least got the feeling 'oh, I could get to like that…' a few times. Netbeans still leaves me cold.

December 14, 2005

New kit from Sun


I don’t normally get very excited about server hardware. I mean, it’s just stuff. You buy it, fiddle with it for a week or two, and then it just sits and does its thing for a few years. Then it breaks and you get a new one.

But the new UltraSPARC T1 chip (the one in the new Sun Fire T1000 boxes) really is quite neat. For starters, it’s an 8 core chip. There are quite a few dual-cores out there – we’re using a lot of DC opterons right now – but this picks it up another level. For seconds, each core has 4-way chip multithreading, meaning that the unit as a whole can run 32 threads more or less in parallel. If you write multithreaded apps like, say, web servers, that’s a big deal.

And this thing is fast. It recently scored something over 50,000 tx/second on one of the Java benchmarks (SPECjbb, I think). To put that in perspective, Sun used to hold the record for that benchmark with about 400,000 tx/second – on a box with 106 Sparc III cpus (the same chip we have in our 8-way ‘big’ server). So one of these new chips can do the work of about 12 Sparc IIIs. We could replace our wardrobe-sized server with something the size of a pizza-box, and still get a performance boost.
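The comparison is just a ratio of per-CPU throughput. A quick sketch using the round figures quoted above (which are from memory, not official benchmark results; the names are invented for the example):

```java
// Back-of-envelope only: inputs are the approximate figures from the post.
final class BenchmarkSketch {

    /** How many of the old CPUs one new chip is worth, by benchmark throughput. */
    static double equivalentCpus(double newChipScore, double oldBoxScore, int oldBoxCpus) {
        double perOldCpu = oldBoxScore / oldBoxCpus;
        return newChipScore / perOldCpu;
    }

    public static void main(String[] args) {
        // ~50,000 tx/s on one new chip vs ~400,000 tx/s across 106 older CPUs.
        System.out.println(equivalentCpus(50_000, 400_000, 106));
    }
}
```

With these inputs the ratio comes out a little over 13, the same ballpark as the "about 12" above, and either way comfortably more than the 8 CPUs in the current big server.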

And what’s more, the power consumption on the boxes is tiddly. One of the downsides of the opteron boxes (Sun V40Z: very fast) is that the cooling fans are taken straight from the top of an RAF Sea King. The T1000s run very cool. What’s even neater is that when the box is only lightly loaded it’ll shut down cores on the chip to match the load, thus reducing the power consumption still further.

It’s also got a whole load of resilience features that you don’t get on uni- or dual-core chips; one or more cores, registers, or DRAMs can fail and the chip will just ignore them and carry on. So another reason for the wardrobe-sized server goes out the window.

And best of all, they’re cheap as chips! An equivalent 12-way sparc box would set you back about £100K. These things start at 3 grand. I can’t wait to get my hands on a few of them…

October 04, 2005

Http Digest Authentication

You may want to skip this entry; unless you're interested in HTTP it won't be terribly interesting. I'm testing out the theory that writing something down is a good way to understand it, since this is about the third time I've tried to get Digest Auth to stick in my head.

So, here goes:

Digest authentication (RFC 2617) is one of the standardised HTTP authentication mechanisms. It was designed to protect against some of HTTP Basic's more egregious failings (such as the fact that it passes the user credentials in plain text).

At its most basic, it works as follows:

  • Client requests a resource which is protected.

  • Server responds with HTTP 401, and the header line
        WWW-Authenticate: Digest realm="Some realm", domain="/urlspace", nonce="long_random_string"

  • Client re-submits the first request, this time with an additional header
        Authorization: Digest username="user",
            realm="Some realm",
            nonce="long_random_string",
            uri="the requested URI",
            response="md5 hash of username, realm, pwd, method, uri, and nonce"

  • Server verifies the hash and serves the response.

The optional qop (Quality of Protection) parameter in the WWW-Authenticate header specifies which additional safeguards are to be used. If the server specifies a qop (either auth, or auth-int, which also covers the message body), then the client returns a cnonce (client nonce) value, which is used as part of the hash. This stops a malicious proxy from specifying a nonce value designed to make cracking the hash possible.
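Since the whole point of this entry is to get it to stick, here is the hash calculation written out as a small sketch. The formulas are the ones from RFC 2617 (MD5 algorithm, with and without qop=auth); the class and method names are my own, and it leaves out auth-int and everything to do with parsing headers.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch of the RFC 2617 response calculation; names are illustrative.
final class DigestAuthSketch {

    /** Lower-case hex MD5 of a string. */
    static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.ISO_8859_1));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** HA1 covers the credentials: user, realm and password. */
    static String ha1(String user, String realm, String password) {
        return md5Hex(user + ":" + realm + ":" + password);
    }

    /** HA2 covers the request: method and URI. */
    static String ha2(String method, String uri) {
        return md5Hex(method + ":" + uri);
    }

    /** No qop: response = MD5(HA1:nonce:HA2). */
    static String response(String user, String realm, String password,
                           String method, String uri, String nonce) {
        return md5Hex(ha1(user, realm, password) + ":" + nonce + ":" + ha2(method, uri));
    }

    /** qop=auth: the nonce count and client nonce are folded into the hash too. */
    static String responseWithQopAuth(String user, String realm, String password,
                                      String method, String uri, String nonce,
                                      String nc, String cnonce) {
        return md5Hex(ha1(user, realm, password) + ":" + nonce + ":" + nc + ":"
            + cnonce + ":auth:" + ha2(method, uri));
    }
}
```

The server does exactly the same sum with its own copy of the password (or a stored HA1) and compares the results, so the password itself never crosses the wire. Feeding the worked example from the RFC into responseWithQopAuth is a handy way of checking an implementation.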

Most implementations refine this by using a nonce that varies with time. However, there are some performance issues to consider here; if you vary the nonce on every request then parallel (pipelined) requests become impossible. Since virtually every browser now supports pipelining, this will have a fairly serious impact. The nc (nonce-count) attribute in the Authorization header lets the server spot replayed requests while a nonce is being reused, and the optional nextnonce value in the Authentication-Info response header allows the server to periodically supply a fresh nonce for further use.

Digest authentication more or less completely disables proxy caching, unless the response is marked as 'Cache-control: public' or 'Cache-control: must-revalidate' (in the latter case, the cache must revalidate the request with the origin server before serving to the client).

Whilst digest authentication does provide reasonably good protection of user credentials, and (with a sufficiently short-lived nonce) can also prevent replaying of requests, it does nothing to protect against packet-sniffing to extract content. For this, HTTPS is required. (In which case, many of the motivations for not using Basic go away.)

Client support for digest authentication is good in modern browsers, but pretty shonky in V4 and earlier user-agents.

So, in summary:

  • HTTP Basic BAD
  • Digest Better
  • HTTPS + Basic good
  • HTTPS + Digest not appreciably better.

September 26, 2005


So, first day of term. I won't be getting too much coding done today. Instead, I'll be spending the day with one eye stuck to the green-screen (our application performance monitor) and the other on ganglia (the server monitor), to see how the uni web server stands up to the first day of term. This is the first time we've served the home pages from Sitebuilder, so there's a lot of extra load compared to previous years.

So far, not too bad. Request times are a bit slower than usual for logged in users, but we're handling about 1200 page impressions (about 3500 hits) per minute at the moment, and it seems to be OK.

By about Wednesday, if past experience is anything to go by, I'll have regained my faith in the server enough to concentrate on other stuff for more than 15 minutes at a time (assuming it doesn't break in the meantime) …

September 01, 2005

Can you do big applications in little languages?

Follow-up to Ruby vs Java from Secret Plans and Clever Tricks

In a comment on my previous post, Jon said

My overall impression of Ruby on Rails is that it might be good for getting things going quickly, but it's bad for building large, stable, maintainable systems

I think this is interesting enough to examine in more detail. Though before I start, I'd better add a disclaimer: I'm a Java programmer, and whatever I might say in the rest of this entry, I'm likely to remain one for the foreseeable (I hope!)

Anyway, on with the show. I think you can take two approaches to the statement above. Approach 1 is the easy one: Sure, you wouldn't write an airline reservation system in RoR, any more than you'd use J2EE to munge the output of top, but how large is large? Does any UK university have a bespoke web system which is too big to manage in RoR? Sometimes, Java programmers can be guilty of treating every application as if it were the flight control system for a 747, when it's really just CRUD for 3 database tables.

…which leads me to approach 2, the more interesting approach. What are the limiting factors for an app written in RoR? Or PHP+{Cake/Biscuit/Mojavi/etc.}, since it shares many of the same characteristics?

I think that scalability in terms of performance is a complete red herring. There are any number of mahooosive apps running on LAMP architectures – tens or hundreds of millions of hits a day. There are fewer for Rails, in part because it's much newer technology, but I don't see anything there that makes me think it wouldn't scale in the same way.

Similarly stability, at least in terms of uptime. PHP and Rails' shared-nothing, sessionless architectures actually (ISTM) make it easier to provide resilience in the form of load-balanced servers, and the periodic cycling of httpd workers makes worries about memory leaks and the like much less of a big deal. Again, looking at the real world there are loads of LAMP sites whose application uptime is up above 99.99%; I'd contend that there are very few web applications with an uptime requirement that couldn't be met with a scripting language-based architecture.

So we're left with questions of maintainability, which is where it gets interesting. There's absolutely no doubt in my mind that there are some awful bits of PHP out there running stuff on the internet. Not least because I've seen and had to clear up some of it. But there's some bloody terrible Java too. And when you consider Ruby, the picture gets even muddier; Ruby is at least as OO as Java, if not more. There's nothing inherent in Ruby that's any more likely to make you write crap than there is in Java.

At the end of the day, maintainability is, ISTM, a people issue and not a language issue. And this may be one area where Java scores. Because the barrier to entry for java apps is higher, (a) the average coding skill level is higher, and (b) the average team size (and thereby probability of at least one good coder involved in the project) is higher. But that just suggests that the same team ought to be able to produce equally maintainable code, regardless of platform – so they should choose whichever one makes the job easier.

There are a few things that do weigh heavily in Java's favour though: decent tooling (IDEs, build tools, etc) and, arguably more importantly, high quality libraries for core stuff like socket malarkey, threading, unicode handling and XML parsing.

Nonetheless, I'm skeptical of the claim that a scripting language "can't do" big complicated applications. It feels a little bit like something that (to paraphrase Cal Henderson) "Is said to be true because it would be good if it was true".

Still, I'm sticking with Java. It's like a favourite jumper; sure it might be a bit scratchy, and some of those new t-shirts the cool kids are wearing sure look good, but I can't quite bring myself to risk getting caught out in the cold :-)

August 31, 2005

Ruby vs Java


Actually, it's not a Ruby vs. Java post as such; if you want a language p*ssing contest you can look here.

However, following on from last week's Flickr event, I've been devoting a bit of time to thinking about the alternatives to our current J2EE development environment, and whether we can learn anything from them. I couldn't quite bring myself to try PHP, but Ruby seemed like a suitable point of comparison.

So… Language-wise, Ruby is quite nice. It's properly OO, dynamically typed, with a reasonable exception system. Using begin/end instead of { and } makes my toes curl a bit, but at least it's optional.

Rails is to Ruby as (approximately) JSP, Spring & Hibernate (or JDO & JSF) is to Java: an MVC-ish framework, a templating language and a persistence framework. It's really easy to do basic CRUD in; the framework does most of the work for you and there are code-generators to get you started. However, if you want that sort of thing in Java you can have it, with something like appfuse.

The (apparent) lack of a decent IDE is aggravating; I've got pretty used to just banging on ctrl-. ('fill the next bit in') and ctrl-1 ('fix this error') in eclipse, and having to go back to vi was a bit of a slap in the face. ISTM that this is one of the big disadvantages of a dynamically typed language. But the tradeoff is the instant deploy: change code, hit refresh, view results. I'd forgotten how efficient that makes things; I must try and get that working in eclipse again. Slow redeploys are especially a problem with Spring and Hibernate, both of which take ages to post-process a deploy for various reasons.

For my next trick, I'm going to try and do something which isn't quite standard CRUD, to see if Rails is trading off flexibility for ease-of-use or not.

August 05, 2005

AJAX libraries

Way back in January, I set out an aspiration to learn a bit more javascript, because 'rich' web applications are becoming the norm these days.

Of course, in the last 6 months things have moved on quite a bit. Then, if you wanted to do a google-suggest style autocomplete textbox, you pretty much had to code it yourself from the ground up. By now though, there are enough folk like me that just want it to work that people have started to produce libraries to do it. I've been trying a few out for the search engine front-end, and here are my observations so far:

DWR : It's good at what it does, but because it uses its own custom controller it's a bit hard to do any kind of unusual binding of the HTTP request to the java objects (our cookie-based authentication system, for example). There's supposed to be some Spring integration in the works, but right now it seems pretty limited.

Prototype : Has the benefit of a pretty big and active user base (The entire Ruby on Rails community), but the disadvantage, for a newbie like me, of having no documentation or example code whatsoever – if you can't read the script then you won't be able to use it. I'm sure this would be a good library to use if only I could work out how to drive it. : Fantastic visual effects, easy to use, good examples, but no actual AJAX (i.e. XMLHttpRequest) stuff over and above what Prototype provides, as far as I can see.

RICO : Nice and easy to use, good examples, easy AJAX code, but the AJAX callback syntax is a bit of a faff and it doesn't work on Safari. Their scrolling table/list implementation rocks, though. I want one of those.

DataRequestor : looks good, nice and simple, but I haven't tried it yet.

Right now I'm sticking with RICO for a little while, but I'm intending to keep looking around for the next few weeks to see if there's a library out there that can do everything I want.
