All 86 entries tagged Java


May 19, 2008

Spring extensible XML and web controllers

So, having got spring XML config extensions working quite nicely, I thought I’d have a go at another area of our codebase that’s been bugging me.

We have a few areas where we re-use the same MVC controller many times, with a different command and a different view. So the config looks something like this:

<bean name="/view/userbreakdown.htm" id="userBreakdownController" class="uk.ac.warwick.sbr.web.hitlog.HitLogStatsController">
        <property name="commandClass" value="uk.ac.warwick.sbr.hitlog.HitLogStatsUserInfoBreakdownCommand"/>
        <property name="statsView" value="hitLogUserBreakdownView"/>
</bean>

<bean name="/view/ipbreakdown.htm" id="ipBreakdownController" class="uk.ac.warwick.sbr.web.hitlog.HitLogStatsController">
        <property name="commandClass" value="uk.ac.warwick.sbr.hitlog.HitlogStatsIPInfoBreakdownCommand"/>
        <property name="statsView" value="hitLogIPBreakdownView"/>
</bean>

... continue for many more HitLogStatsControllers...

Hmm, I thought, wouldn’t this be nicer if we could just say

   <stats:controller name="/view/all.htm" id="summaryController" command="HitLogStatsSummaryCommand" view="hitLogAllView"/>

?

Well, it turns out to be harder than you might think. As you can see from the bean definitions, we’re using the BeanNameUrlHandlerMapping to let SMVC map requests onto controllers. This relies on setting the name attribute of a bean to the URL you want (you can’t use the ID, because slashes are illegal in ID attributes). N.B. this is an attribute of the bean definition, not a property of the bean.

So, we need to set the bean name. But this doesn’t appear possible using NamespaceHandlerSupport. The name isn’t actually an attribute of the BeanDefinition itself, rather it’s part of the BeanDefinitionHolder class. You set it using the 3-arg constructor of BeanDefinitionHolder. Alas, all beans defined by non-default XML have their BeanDefinitionHolders created for them in AbstractBeanDefinitionParser.parse, which calls the 2-arg version of the constructor (which doesn’t set a beanName). Default XML elements, by comparison, are created in BeanDefinitionParserDelegate, which uses the 3-arg version.

So, can we fix it? Making the custom parsing code call the 3-arg constructor would involve ripping a great deal of the guts of the XML parsing code out; not something I’d be too keen on. Maybe I should raise a JIRA with the Spring MVC team.

An easier solution might be to write a different HandlerMapping that uses a bean property (“path”, say) rather than bean names/aliases to store the URL path. This strikes me as a nicer solution (not least because it doesn’t overload the bean name with behaviour that’s nothing to do with names), though I don’t know whether it would perform as well (lookups presumably get cached, though, so it would be a one-off cost).
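To make the property-based idea concrete, here is a framework-free sketch of the lookup. It is not a real Spring HandlerMapping: the PathAware interface, the StatsController stand-in and the PathPropertyHandlerMapping class are all hypothetical names invented for illustration, and a real version would have to plug into Spring's own HandlerMapping machinery. It only shows that the path table can be built once from a bean property, so each request costs a single map lookup.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: a controller that knows its own URL path.
interface PathAware {
    String getPath();
}

// Hypothetical stand-in for a stats controller configured with a path and a view.
final class StatsController implements PathAware {
    private final String path;
    private final String viewName;

    StatsController(String path, String viewName) {
        this.path = path;
        this.viewName = viewName;
    }

    public String getPath() { return path; }
    String getViewName() { return viewName; }
}

// Builds the path -> handler table once, at construction time.
final class PathPropertyHandlerMapping {
    private final Map<String, PathAware> handlersByPath = new HashMap<>();

    PathPropertyHandlerMapping(List<? extends PathAware> handlers) {
        for (PathAware handler : handlers) {
            PathAware previous = handlersByPath.put(handler.getPath(), handler);
            if (previous != null) {
                throw new IllegalStateException("Duplicate path: " + handler.getPath());
            }
        }
    }

    // Returns the handler registered for the path, or null if there is none.
    PathAware getHandler(String requestPath) {
        return handlersByPath.get(requestPath);
    }
}
```

Because the table is filled once up front, the "one-off cost" is paid at context start-up rather than per request.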

Alternatively, I could convert the one-controller-several-commands model into several ThrowawayControllers (all inheriting a common base), and then use the annotation-based config to set them all up. This seems like it might be a neater long-term solution, so long as there’s nothing that’s too expensive to set up in the controllers (which can’t be pushed out into an injected service).


May 13, 2008

Making Spring XML config better

One of my pet peeves with Spring is the way that, left unchecked, it can grow yards and yards of inscrutable XML. Over the last few days, amongst other things, I’ve been looking at whether we can improve things.

Here’s an example to get started with. We use classes called ModelAccessors to hold references to data that our Spring WebFlow processes need. A typical flow might have half-a-dozen ModelAccessors, for all of its various bits of state. They look like this in the application context:

  <bean id="emptyFilesAccessor" class="uk.ac.warwick.sbr.webflow.FlowScopeModelAccessor">
    <constructor-arg index="0" value="duplicateFiles"/>
    <constructor-arg index="1" value="java.util.List"/>
  </bean>
  <bean id="invalidFileNamesAccessor" class="uk.ac.warwick.sbr.webflow.FlowScopeModelAccessor">
    <constructor-arg index="0" value="createdFiles"/>
    <constructor-arg index="1" value="java.util.List"/>    
  </bean>

  <bean id="uploadZipFileFormAccessor" class="uk.ac.warwick.sbr.webflow.FlowScopeModelAccessor">
    <constructor-arg index="0" value="uploadZipFileForm"/>
    <constructor-arg index="1" value="uk.ac.warwick.sbr.webflow.action.upload.UploadZipFileForm"/>
  </bean>
  ... continues for many more...

Now, there’s a couple of problems with this:
1) There’s 4 lines of XML for every accessor, and only 2 things ever change; the ID and the second constructor argument (the first arg is derivable from the ID). Even the second argument is usually just ‘java.util.List’

2) They’re not very communicative. The most important thing about this element (that it’s a ModelAccessor) is an attribute. There’s very little information about what those constructor-arguments actually mean. Surely we could have something a bit more expressive?

And luckily, there’s a simple fix for both of these problems. Spring has support for extending the XML context syntax by adding in your own custom namespaces. Define your extension in an XSD schema, write a parser plug-in, and away you go:

<sbr:model-accessor id="uploadZipFileFormAccessor" modelclass="uk.ac.warwick.sbr.webflow.action.upload.UploadZipFileForm"/>
  <sbr:list-model-accessor id="invalidFileNamesAccessor"/>
  <sbr:list-model-accessor id="emptyFilesAccessor" />

- much nicer.

So, how much effort is this to implement? Not that much, as it turns out. The instructions are a pretty good start. The only annoyance you’re likely to face is cryptic SAX parsing errors like this:

 org.springframework.beans.factory.xml.XmlBeanDefinitionStoreException: Line 8 in XML document from class path resource [uk/ac/warwick/sbr/spring/sbr-modelaccessor.xml] is invalid; nested exception is org.xml.sax.SAXParseException: cvc-complex-type.2.4.c: The matching wildcard is strict, but no declaration can be found for element 'sbr:model-accessor'.

What this means is that somewhere in either the header of your context.xml file, or in your META-INF/spring.handlers file, there’s a typo, so the parser can’t go from the xmlns: declaration to the xsi:schemaLocation entry to the spring.handlers mapping.

Spring provides a base class for the NamespaceHandler and BeanDefinitionParser, so there’s really not much work to do in implementing them:

    // In the NamespaceHandlerSupport subclass:
    public void init() {
        registerBeanDefinitionParser("model-accessor", new ModelAccessorBeanDefinitionParser());
    }

    final class ModelAccessorBeanDefinitionParser extends AbstractSingleBeanDefinitionParser {
        protected Class getBeanClass(Element element) {
            return FlowScopeModelAccessor.class;
        }
        protected void doParse(Element element, BeanDefinitionBuilder bean) {
            // first constructor arg: the model name, i.e. the id minus its "Accessor" suffix
            bean.addConstructorArgValue(element.getAttribute("id").replaceAll("Accessor$", ""));
            // second constructor arg: the model class name
            bean.addConstructorArgValue(element.getAttribute("modelclass"));
        }
    }

Easy!


May 01, 2008

The SpringSource Application Platform: why would I want it?

In my experience, software development projects can be divided into two worlds: those that can be done with a small team, and those that need a big team.

By “small team”, I mean about 4 people, and certainly no more than 6. Basically, if you need more than 2 pizzas to feed the team, it’s not small. Big teams, on the other hand, are typically more than 10. That’s because there’s an interesting phenomenon which happens on teams between about 6-10 people, which is that suddenly the effort to communicate between everyone in the team becomes much larger, so much so that adding more people actually makes things worse, until you get above about 10. Then you can start to form a normal hierarchy (2 sub-teams and a co-ordinating team) and get going again.*

Small teams, I have found, are much more productive per person. Before I came to Warwick I was involved in a 100-person development team (40 java coders, 20 business analysts/testers, 20 assorted managers and 20 admins/testers/other hangers-on) which eventually collapsed under its own inefficiency after 18 months. The best 6 people were pulled from the wreckage, and re-implemented the entire project in 6 months flat. This is not unusual. Sure, there are lots of projects that are simply too big for a 6-person team to pull off, but every year the range of stuff that can be accomplished by a small, focussed team gets larger and larger.

There’s a marked difference in the kinds of tools and frameworks that suit ‘small-team’ development and ‘big-team’. Small-team frameworks focus on enabling you to do as much as possible, as quick as possible. The poster-child for a small-team framework is surely Rails, but of course there are many others.
Big-team software is focussed on preventing other people from screwing up your stuff. In a big-team project, individual productivity isn’t that important compared to effective modularisation and decoupling, because you can always just add more programmers to go faster. J2EE is (or at least, was) the ne plus ultra of big-team frameworks, although with JEE5 (and even more so with 6) it’s making its way back towards the little guys.

It’s a bit like the difference between vertical-scaling software (buy the fastest single server you can) and horizontal-scaling (make it possible to run your software on lots of servers)

Now, I have a personal preference for small-team development. Slightly unusually, I also have a preference for java – small teams frequently prefer dynamically typed languages, as they fit closer with the ‘sacrifice safety for speed’ philosophy. The single biggest factor that’s enabled me to square this circle has been the Spring Framework – a set of libraries that give me the ability to get a simple web-based project up and running in next to no time, but knowing that whatever I’m going to need in the future, be it asynchronous message processing, WS-* remoting, distributed caching, flexible declarative security, or whatever, it will be available, and it will fit in with everything else.

So, I have to say that I was a teeny bit disappointed when I read about the new SpringSource Application Platform: an application server which is based on OSGi rather than EJB for its modularisation.
Now, I don’t doubt for a moment that OSGi is a much better technology than EJB for modularisation. Lots of folk are using it, for humungously complex projects like Eclipse, and it works really well.

What irks me, though, is this: why would I want OSGi modules? I’m quite happy with 1 great big WAR file, thanks. Neither I, nor my happy few developers, need the ability to break our app up into little bits, version them and then dynamically lazy-load them. In fact, I think lazy-loading a web app is a terrible idea, and I can do all the modularisation I need to with ivy at build time.

Of course, this is just sour grapes. Big-team developers do want this, and SpringSource have every right to give it to them. It’s just that I’d got kinda used to Spring spending time and effort on what I want, not what those enterprisey guys in suits were after.

Spring started out its life as a reaction against the excess baggage that J2EE development entailed. By their own admission, SpringSource have put a lot of time and effort into this product, and doubtless they will need to keep on doing so – and AFAICS that’s time and effort that’s not being spent on making ‘lightweight java’ easier. SpringSource, you have become what you beheld; are you content that you have done right?

* Astute readers will point out that the ‘web development’ team that I run has 12 people in; but in terms of product development it really runs as 3 or 4 2-3 person dev teams, plus a 3-4 person ops/support team (with some overlap, maths fans). We only function as a team of 12 when we need to take over a corner of the pub


October 31, 2007

Netbeans surprises me

Follow-up to Netbeans 5: Still not switching from Secret Plans and Clever Tricks

I’ve never been able to get on with Netbeans as a java IDE. Somehow, if you’re used to Eclipse it’s just too weird and alien, and things that ought to be simple seem hard. I’m sure that if you’re used to it, it’s very lovely, but I just can’t get started with it.

However, one thing Eclipse is not very good at, IME, is Ruby development. There are plugins, but I’ve never had much success with them; debugging support is patchy-going-on-broken, syntax highlighting / completion is super-basic, and it’s generally only one (small) step up from Emacs with ruby-mode and pabbrev.

(Note that I’m not talking about Rails development here, I’m talking about using Ruby to write stuff that would previously have been done in perl – sysadmin scripts, monitors, little baby apps and so on. Things of a couple of hundred lines or so – nothing very big, but enough that an unadorned text editor is a bit of a struggle.)

There are other Ruby IDEs of course, but they’re almost all (a) OSX specific (b) Windows specific, (c) proprietary, or (d) crap. I’d like something free, that runs on linux, but doesn’t suck, please.

Now, Sun have been making a big noise about their Ruby support generally for about the last 12 months or so, so I thought I’d grab a copy of the Ruby-specific Netbeans 6 bundle and try it out.

And, surprise surprise, it’s really good. Out of the box it almost just works – the only minor hackery I had to do was a manual install of the fastdebug gem, but the error message linked me to a web page explaining what I had to do and why. Debugging works, you can do simple refactorings, syntax highlighting and code completion are reasonably sophisticated. And it looks nice, performs well, and is all fairly intuitive to use, even for a dyed-in-the-wool eclipse-er like me.

So, three cheers for the Netbeans team, for filling the gaping void in the Ruby IDE space. Development still seems to be pretty active, so hopefully we can expect even more goodness in the months to come.


October 30, 2007

Spring 2.5 web mvc gripe

Spring 2.5 is almost upon us, so I thought I’d grab the RC and have a look at what’s new.

My eye was drawn immediately to the enhancements to the MVC layer; specifically, support for convention-over-configuration and annotation-based configuration. Both of these techniques should help to reduce the yards of XML needed to configure spring web applications (although to be fair, things have been getting better since 2.0 re-worked the config file formats).

Anyway, I started building a little demo, using the sample apps as a template. And came up against an interesting problem almost immediately. Here’s an excerpt from the docs, showing a MultiActionController using annotated configuration:

@Controller
public class ClinicController {

    @RequestMapping("/vets.do")
    public ModelMap vetsHandler() {
        return new ModelMap(this.clinic.getVets());
    }

    @RequestMapping("/owner.do")
    public ModelMap ownerHandler(@RequestParam("ownerId") int ownerId) {
        return new ModelMap(this.clinic.loadOwner(ownerId));
    }

Spot the obvious mistake? There is no need for the @RequestMapping annotation to repeat the name of the mapping if every method follows the same convention. Just take ‘Handler’ off the end of the method name and use that for the URL. Don’t make me type it twice!
What’s more annoying is that this works in the old-style MultiActionController – but if you go down that route your controller methods have to take HttpServletRequest/Response parameters and you can’t use the lovely new @RequestParam binding annotations. Gah!
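The convention I have in mind is easy enough to sketch with plain reflection. This is not how Spring resolves mappings; HandlerNameConvention and DemoClinicController are hypothetical names, and the ".do" suffix is simply copied from the docs excerpt. It ignores overloaded handler methods, which would collide on the same URL.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

final class HandlerNameConvention {
    private static final String SUFFIX = "Handler";

    // Maps e.g. vetsHandler() to "/vets.do" for every public method whose name ends in "Handler".
    static Map<String, Method> urlMappings(Class<?> controllerClass) {
        Map<String, Method> mappings = new TreeMap<>();
        for (Method method : controllerClass.getMethods()) {
            String name = method.getName();
            if (name.endsWith(SUFFIX) && name.length() > SUFFIX.length()) {
                String stem = name.substring(0, name.length() - SUFFIX.length());
                mappings.put("/" + stem + ".do", method);
            }
        }
        return mappings;
    }
}

// Hypothetical controller that follows the naming convention.
class DemoClinicController {
    public String vetsHandler() { return "vets"; }
    public String ownerHandler(int ownerId) { return "owner " + ownerId; }
    public String helper() { return "not mapped"; }
}
```

With something like this in the framework, the annotation would only be needed for methods that break the convention.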

If you’re content to have a separate controller per URL, with separate functions for GET/POST (SimpleFormController style), then the convention + annotation based approach works pretty well – so it’s a shame that they couldn’t finish the job and sort out MultiActionController as well; then we could have rails-style create/read/update/delete controllers. Oh well…


October 16, 2007

Writing functional Java with Google Collections

I’ve been experimenting with Google’s new Collections API, which is a kind of type-safe, stripped-down version of Jakarta Commons Collections, providing you with some (though not all) of the list-processing features that are commonplace in more functional languages – like each, collect, detect and inject in Ruby, for instance.

In theory, this should give a big win in terms of reducing code complexity and making for a better-decoupled and more testable design.

In practice, this turns out to be true, but with a somewhat unpleasant side effect. Java’s type-checking and lack of support for closures or code blocks means that when you switch to this style of coding, you end up introducing a lot of new little classes, typically anonymous inner classes for things like predicates and functions, which get used once and then thrown away.

For instance; this code has a cyclomatic complexity of 9, which is just barely acceptable. But it’s fairly readable, if you know what the object model looks like.

 for (Content content: page.getContents().values()) {
            for (ContentFetcher cf: content.getContentFetchers()) {
                if (cf instanceof AbstractFileBackedContentFetcher) {
                    String cfFile = ((AbstractFileBackedContentFetcher) cf).getFileName();
                    File directory = new File(rootDir, cfFile).getParentFile();
                    if (directory != null && directory.exists()) {
                        for (File subfile: directory.listFiles()) {
                            if (subfile.isFile() && !filenames.contains(subfile.getAbsolutePath())) {
                                files.add(subfile);
                                filenames.add(subfile.getAbsolutePath());
                            }
                        }
                    }
                }
            }
        }

Listifying it, we get something like this. Here the cyclomatic complexity is about 5 – much better – but we’ve had to introduce two new anonymous inner classes, and there are a lot of awfully long lines of code.

        Predicate<AbstractFileBackedContentFetcher> cfDirectoryExists = new Predicate<AbstractFileBackedContentFetcher>() {
            public boolean apply(AbstractFileBackedContentFetcher cf) {
                File dir = new File(rootDir, cf.getFileName()).getParentFile();
                return dir != null && dir.exists();
            }
        };
        FileFilter okToAddFile = new FileFilter() {
            public boolean accept(File pathname) {
                return pathname.isFile() && !filenames.contains(pathname.getAbsolutePath());
            }
        };

        for (Content content: page.getContents().values()) {

            Iterable<AbstractFileBackedContentFetcher> contentFetchersWithExistingDirs = filter(filter(
                    content.getContentFetchers(), AbstractFileBackedContentFetcher.class), cfDirectoryExists);

            for (AbstractFileBackedContentFetcher cf: contentFetchersWithExistingDirs) {
                File[] subfiles = new File(rootDir, cf.getFileName()).getParentFile().listFiles(okToAddFile);
                for (File subfile: subfiles) {
                    files.add(subfile);
                    filenames.add(subfile.getAbsolutePath());
                }
            }
        }

Is that better? I’m not convinced either way. The second block of code is structurally simpler, but it’s about 50% more code. As a general rule, less code is better code...

I think that if you have frequently-used predicates and transformation functions (the instanceOf predicate which GC uses in Iterables.filter(Iterable, Class) for instance), then it’s well worth the effort. But if you’re defining a predicate just to avoid having an if inside a foreach, then it’s less clear. Maybe when java 7 comes along and gives us closures (with some syntactic sugar to keep them concise) things will be better.

A related ugliness here (at least, if you’re used to the Ruby way of doing this) is google’s choice not to extend java.util.Iterable, but rather to have the collections methods defined static on the Iterables class. So instead of writing something like

GoogleIterable pages = new GoogleIterable(getPagesList())
files = pages.filter(WebPage.class).filter(undeletedPages).transform(new PagesToFilesTransformation())

we have to do

files = Iterables.transform(Iterables.filter(Iterables.filter(getPagesList(), WebPage.class), undeletedPages), new PagesToFilesTransformation());

which to my eye is a lot more confusing – it’s harder to see which parameters match up with which methods.

Update I’ve now written my own IternalIterable class, which wraps an Iterable and provides filter(), transform(), inject(), sort() and find() methods that just delegate to google-collections (except for inject() which is all mine :-) ). It’s not very long or complex, and it’s already tidied up some previously ugly code rather nicely.
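For the curious, a wrapper along those lines might look roughly like the sketch below. It is not my actual class: the name Chain is made up, it uses the java.util.function interfaces rather than delegating to google-collections, it copies into a list at every step instead of staying lazy, and it leaves out sort() and find() to keep things short.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical fluent wrapper; eager (copies into a list at each step) for brevity.
final class Chain<T> implements Iterable<T> {
    private final List<T> items = new ArrayList<>();

    Chain(Iterable<? extends T> source) {
        for (T item : source) {
            items.add(item);
        }
    }

    // Keep only the elements matching the predicate.
    Chain<T> filter(Predicate<? super T> predicate) {
        List<T> kept = new ArrayList<>();
        for (T item : items) {
            if (predicate.test(item)) {
                kept.add(item);
            }
        }
        return new Chain<>(kept);
    }

    // Keep only instances of the given class, narrowing the element type.
    <S extends T> Chain<S> filter(Class<S> type) {
        List<S> kept = new ArrayList<>();
        for (T item : items) {
            if (type.isInstance(item)) {
                kept.add(type.cast(item));
            }
        }
        return new Chain<>(kept);
    }

    // Apply the function to every element.
    <R> Chain<R> transform(Function<? super T, ? extends R> function) {
        List<R> mapped = new ArrayList<>();
        for (T item : items) {
            mapped.add(function.apply(item));
        }
        return new Chain<>(mapped);
    }

    // Left fold, like Ruby's inject.
    <A> A inject(A initial, BiFunction<A, ? super T, A> step) {
        A accumulator = initial;
        for (T item : items) {
            accumulator = step.apply(accumulator, item);
        }
        return accumulator;
    }

    public Iterator<T> iterator() {
        return items.iterator();
    }
}
```

The point is that each call returns another wrapper, so the calls read left to right and each argument sits next to the method it belongs to.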


August 29, 2007

Wishlist: thread–safe hibernate sessions

Hibernate is lovely, but I wish that it would provide me a thread-safe Session implementation. We’ve got a couple of servers which are more or less as fast as you can buy, in terms of single-threaded performance, and a few points in our app which could be faster. Now, these servers have 8 cores each, and typically only have one or two threads in the run queue at any given time; i.e. 6 of the cores are basically idle. So it would make a lot of sense if we could take a request, split it into 4 components, and distribute those components over 4 separate threads. If we could do that without too much overhead, we could actually switch from our relatively power-greedy opteron chips, to something a bit more frugal like a Sun T1, which would be nice.

However, our apps are processing hibernate persistent objects, which means that they need a session to operate within, which means they are bound to a single thread.

We could create 4 threads and give each one a new session, but that means that each thread will need to re-query for its data, since the hibernate L1 cache is bound to the session. So that won’t work. We could deploy a level 2 cache, but then we’d have to somehow manage to invalidate all the data that a particular request has loaded without affecting other, concurrent requests (we have no shared cache between requests, so that (a) we don’t need much heap, and (b) we can load-balance between multiple machines and JVMs without needing to share the cache).

So we’re stuck in an awkward position. We can’t multithread the app any further, because all the performance we’d gain by spreading processing over more cores, we’d lose by having to re-read the same data over and again from the DB.

What I really want, I guess, is a Level 1.5 cache – something that’s bound to the operation rather than the session, and that can be shared between multiple threads which are co-operating to do all of the processing that’s needed. Alas, it seems such a thing doesn’t exist.
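To illustrate the shape of the thing (and only the shape), here is a sketch of an operation-scoped cache in plain Java. OperationCache is a hypothetical name and nothing here is Hibernate API. It assumes the cached values are safe to share between threads, which is exactly what Hibernate persistent objects attached to a Session are not, so it shows the idea rather than solving the problem.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical "level 1.5" cache: created per operation, shared by that operation's worker threads,
// and simply discarded when the operation finishes.
final class OperationCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    OperationCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Loads each key at most once, however many threads ask for it.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }
}
```

Each request would create its own OperationCache, hand it to its worker threads, and let it be garbage-collected afterwards, so there is nothing to invalidate and nothing shared between requests.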

Update It would seem I’m not alone. “Uncle” Bob Martin blogs about exactly the same problem, albeit at a slightly higher level of abstraction.


February 20, 2007

Spring and the golden XML hammer

Writing about web page http://www.theserverside.com/tt/articles/article.tss?l=SpringLoadedObserverPattern

This article describes as best practice, one of the things that I’m really coming to dislike about the Spring Framework – the tendency to use XML for object construction for no better reason than ‘because I can’.

Now, I love Spring; it’s revolutionised the way I, and many others, write code, and for the better. But it does have a tendency to produce reams of XML. As a data format, I think XML is OK. It’s precise, and the tooling is good, though it’s a good deal more verbose than something like JSON or YAML, which, IMO, have 80% of the functionality with 20% of the overhead.

For aspects of an application which are genuinely configuration, such as the mapping of URLs to controllers, or configuration of persistence contexts, XML is better than code; no doubt about it. For the construction of object graphs, XML is sometimes better than code. But this example is just pushing it too far. It describes setting up an observer/observable pair, using the side-effects of spring’s MethodInvokingFactoryBean to call the addListener() method, rather than doing it in code.

Now, this is just clunky. Instead of one line of code that says

townCrier.addListener(townResident);

we have this

<bean id="registerTownResident1" 
class="org.springframework.beans.factory.config.MethodInvokingFactoryBean">
    <property name="targetObject"><ref local="townCrier"/></property>
    <property name="targetMethod"><value>addListener</value></property>
    <property name="arguments">
    <list>
      <ref bean="townResident1"/>
    </list>
    </property>
</bean>

Ten lines of XML. No static type-checking (I hope you’ve got a bunch of tests that verify your contexts…). The addListener invocation, the thing we’re trying to achieve here, is kind of buried; the bean that’s actually generated is never used; and the whole thing is far from obvious in its intent.

The only notional advantage I can see is that you can add and remove listeners without touching the code. But how much of an advantage is that? In most situations, where you’re using a method-invoking synchronous observer/subject pattern like this, listeners are part of the application, and not part of the configuration; you wouldn’t remove one without first consulting a developer anyway. When you’ve got genuinely replaceable listeners, then it’s more common IME to have some kind of an abstraction like a JMS queue or a message bus in between subject and listener, so that the listeners are registered with the queue, not the subject itself.

If it were up to me, I’d probably have a class called when the context is built (via an ApplicationListener maybe), which explicitly built up the subject/observer relations. If I had some configurable relationships, I might pass in a list of observers, but that’s about as far as it would go;

// set by IoC
public void setChangeEventListeners(List<ChangeEventListener> listeners) {
   this.changeListenersToRegister = listeners;
}

public void onContextRefreshed() {
   // configure a subject with a list of observers
   for (ChangeEventListener listener : this.changeListenersToRegister) {
       this.changeEventBroadcaster.addListener(listener);
   }

   // now hard-code a subject that won't need to change frequently
   auditLog.addListener(new log4j.Category("AUDIT_LOG"));

   // ... and so on
}

- this object starts to look a bit vague and ill-defined, doing a little with lots of objects, but that’s because really it’s just a part of the context/configuration; it’s not a part of the domain per se.

There are a few other options that, in some situations, might be better than this;

  • Give the subject a constructor that takes a list of observers, and let it wire them at construction time – then pass the list from within your XML context
  • If you can’t modify the subject itself, make a custom FactoryBean that takes the list of observers, constructs the subject and adds all the observers to it
  • One that requires a bit of divergence from the standard Spring usage. Have a context that’s defined by a bit of scripting code – JRuby, or BSH, or javascript/rhino, rather than by XML. That way you make your method calls more explicit, and allow developers to easily see what relationships are being built up , whilst still keeping some clear separation between the configuration and the java code. If you had loads of Observer/subject configuration to maintain, you could define a little DSL for it (or store it in a database) and have a custom context to parse the DSL and configure the beans.
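The first of those options, sketched in plain Java. The class and method names (TownCrier, TownResident, onChange, announce) are borrowed from or invented around the article's example and are purely illustrative; the XML context would then just pass a list of observer beans as a constructor argument.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical listener interface for the example.
interface ChangeEventListener {
    void onChange(String event);
}

// Subject that receives its observers at construction time.
final class TownCrier {
    private final List<ChangeEventListener> listeners;

    TownCrier(List<ChangeEventListener> listeners) {
        this.listeners = new ArrayList<>(listeners); // defensive copy
    }

    // Notify every registered observer, in registration order.
    void announce(String event) {
        for (ChangeEventListener listener : listeners) {
            listener.onChange(event);
        }
    }
}

// Observer that records what it hears.
final class TownResident implements ChangeEventListener {
    final List<String> heard = new ArrayList<>();

    public void onChange(String event) {
        heard.add(event);
    }
}
```

The wiring is still configurable, but the addListener-style plumbing lives in type-checked code rather than in a MethodInvokingFactoryBean.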

December 08, 2006

Solaris SMF manifest for a multi–instance jboss service

Today I have mostly been writing SMF manifests. We typically run several JBoss instances per physical server (or zone), using the JBoss service binding framework to take care of port allocations. I couldn’t find a decent SMF manifest that would be safe to use in a context where you’ve got lots of JBosses running, so I wrote my own. Here it is…

It’s still a tad rough around the edges.
  • It assumes you’ll name your SMF instances the same as your JBoss server instances
  • The RMI port for shutdowns is specified as a per-instance property – in theory one could parse it out of the service bindings file, but doing that robustly is just too much like hard work at the moment.
  • It assumes that you’ll want to run the service as a user called jboss, whose primary group is webservd – adjust to suit.
  • The jvm-opts instance property allows you to pass specific options (for example, heap size) into the JVM
  • It assumes that you’ll have a log directory per instance, located in /var/jboss/log/{instance name}-{rmi port}. The PID file is stored there, and the temp. file dir is set to there too (using /tmp for temporary files is a bad idea if you hoover your temp dir periodically, as you’ll delete useful stuff)
  • The stop method waits for the java process to terminate (otherwise restart won’t work). The start method doesn’t wait for the server to be ready and to have opened its HTTP listener, just for the VM to be created. I might add that next, although given that svcadm invocations are asynchronous there doesn’t seem much point.

The manifest itself:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
  <service name='application/jboss' type='service' version='0'>
    <instance name='default' enabled='true'>
      <dependency name='network' grouping='require_all' restart_on='error' type='service'>
        <service_fmri value='svc:/milestone/network:default'/>
      </dependency>
      <dependency name='sysconfig' grouping='require_all' restart_on='error' type='service'>
        <service_fmri value='svc:/milestone/sysconfig:default'/>
      </dependency>
      <dependency name='fs-local' grouping='require_all' restart_on='error' type='service'>
        <service_fmri value='svc:/system/filesystem/local:default'/>
      </dependency>
      <exec_method name='start' type='method' exec='/usr/local/jboss/bin/svc-jboss start' timeout_seconds='180'>
        <method_context>
            <method_credential user='jboss' group='webservd' />
        </method_context>
      </exec_method>
      <exec_method name='stop' type='method' exec='/usr/local/jboss/bin/svc-jboss stop' timeout_seconds='180'>
        <method_context/>
      </exec_method>
      <property_group name='jboss' type='application'>
        <propval name='instance-rmi-port' type='astring' value='1099'/>
        <propval name='jvm-opts' type='astring' value='-server -Xmx1G -Xms1G'/>
      </property_group>
    </instance>
    <stability value='Evolving'/>
    <template>
      <common_name>
        <loctext xml:lang='C'>JBoss J2EE application server</loctext>
      </common_name>
    </template>
  </service>
</service_bundle>

... and the service method

#!/usr/bin/sh
#

. /lib/svc/share/smf_include.sh

# General config
#
JAVA_HOME=/usr/java/
JBOSS_HOME=/usr/local/jboss
JBOSS_CONSOLE=/dev/null

# instance-specific stuff:
# sed the instance name out of the FMRI
JBOSS_SERVICE=`echo $SMF_FMRI | sed 's/.*:\(.*\)/\1/'`
JBOSS_SERVICE_RMI_PORT=`svcprop -p jboss/instance-rmi-port $SMF_FMRI`
SERVICE_JVM_OPTS=`svcprop -p jboss/jvm-opts $SMF_FMRI`

# Derived stuff
#
JBOSS_VAR=/var/jboss/jboss-3.2.7/${JBOSS_SERVICE}-${JBOSS_SERVICE_RMI_PORT}
PIDFILE=${JBOSS_VAR}/JBOSS_${JBOSS_SERVICE}.PID
JAVA=${JAVA_HOME}/bin/java
JAVA_OPTS="-Djava.io.tmpdir=${JBOSS_VAR} -Djava.awt.headless=true" 

if [ -z "$SMF_FMRI" ]; then
        echo "JBOSS startup script must be run via the SMF framework" 
        exit $SMF_EXIT_ERR_NOSMF
fi

if [ -z "$JBOSS_SERVICE" ]; then
        echo "Unable to parse service name from SMF FMRI $SMF_FMRI" 
        exit $SMF_EXIT_ERR_NOSMF
fi

jboss_start(){
        echo "starting jboss.." 
        JBOSS_CLASSPATH=${JBOSS_HOME}/bin/run.jar:${JAVA_HOME}/lib/tools.jar
        if [ ! -z "$SERVICE_JVM_OPTS" ]; then
           JAVA_OPTS="${JAVA_OPTS} ${SERVICE_JVM_OPTS}" 
        fi

        # JAVA_OPTS already includes SERVICE_JVM_OPTS (appended above), so pass it only once
        $JAVA -classpath $JBOSS_CLASSPATH $JAVA_OPTS org.jboss.Main -c ${JBOSS_SERVICE} >$JBOSS_CONSOLE 2>&1 & echo $! >${PIDFILE}
}

jboss_stop(){
        echo "stopping jboss.." 
        stop_service="--server=localhost:${JBOSS_SERVICE_RMI_PORT}" 
        JBOSS_CLASSPATH=${JBOSS_HOME}/bin/shutdown.jar:${JBOSS_HOME}/client/jnet.jar
        $JAVA -classpath $JBOSS_CLASSPATH org.jboss.Shutdown $stop_service
        PID=`cat ${PIDFILE}`
        echo "waiting for termination of process $PID ..." 
        pwait $PID
        rm $PIDFILE
}

case $1 in
'start')
        jboss_start
        ;;

'stop')
        jboss_stop
        ;;

'restart')
        echo "Restarting jboss" 
        jboss_stop
        jboss_start
        ;;

*)
        echo "Usage: $0 { start | stop | restart }" 
        exit 1
        ;;
esac
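
The least obvious line in the script is probably the sed expression that pulls the instance name out of the FMRI. Here is a minimal standalone sketch of it, assuming a POSIX shell; the helper name instance_of and the second instance name "sitebuilder" are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: return the instance name at the end of an SMF FMRI.
instance_of() {
    # The greedy ".*:" swallows everything up to the LAST colon,
    # so \1 captures only what follows it (the instance name).
    echo "$1" | sed 's/.*:\(.*\)/\1/'
}

instance_of "svc:/application/jboss:default"       # prints: default
instance_of "svc:/application/jboss:sitebuilder"   # prints: sitebuilder
```

Because the match is greedy, the colon after "svc" is skipped and only the final component survives, which is what the script then passes to JBoss as the -c argument.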

        

enjoy!

Postscript: I wrote above that parsing the service-bindings file to find the RMI port is too hard; this turns out not to be true. Praise be to Blastwave!

pkg-get install xmlstarlet

xml sel -t -v "/service-bindings/server[@name='${INSTANCE_NAME}']/service-config[@name='jboss:service=Naming']/binding/@port" service-bindings.xml 

November 22, 2006

Tuning Java 5 garbage collection for mixed loads

Once again, I find myself glaring balefully at the output of garbage collection logs and wondering where my CPU is going. Sitebuilder2 has a very different GC profile to most of our apps, and whilst it’s not causing user-visible problems, it’s always good to have these things under control.

So, SB2 has an interesting set of requirements. Simplistically, we can say it does 3 things:

1) Serve HTML pages to users
2) Serve files to users
3) Let users edit HTML/Files/etc

These three things have interestingly different characteristics. HTML requests generate a moderate amount of garbage, but almost always execute much quicker than the gap between minor collections. So, in principle, as long as our young generation is big enough, we should get hardly any old-generation garbage from them. Additionally, HTML requests need to execute quickly, else users will get bored and go elsewhere.

Requests for small files are rather similar to the HTML requests, but most of our file serving time is spent drip-feeding whacking great files (10MB and up) to slow clients. This kind of file-serving generates quite a lot of garbage, and it looks as if a lot of it sticks around for long enough that it ends up in the old gen. Certainly the requests themselves take much longer than the time between minor collects, so any objects which have a lifetime of the HTTP request will end up as heap garbage. Large file serving, though, is mostly unaffected by the odd GC pause. If your 50MB download hangs for a second or two halfway through, you most likely won’t notice.

Edit requests are a bit of a mishmash. Some are short and handle only a little data, others (uploading the aforementioned big files, for instance) are much longer running. But again, the odd pause here and there doesn’t really matter. There are orders of magnitude fewer edit requests than page/file views.

So, the VM is in something of a quandary. It needs a large heap to manage the large amounts of garbage generated by having multiple file-serving requests going on at any given time. And it needs to minimise the number of full GCs so as to minimise pauses for the HTML server. But the cost of doing a minor collection grows with the amount of old generation allocated, so a big, full heap implies a lot of CPU sucked up by the (parallel) minor collectors. It also means longer-running minor collections, and a greater chance of an unsuccessful minor collect, leading to a full GC.
(For reference, on our 8-way (4-processor) Opteron box, a minor collect takes about 0.05s with 100MB of heap allocated, and about 0.7s with 1GB of heap allocated.)
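
To put those two timings on the same scale, here is a rough back-of-the-envelope sketch that converts each into milliseconds of minor-collection pause per MB of allocated heap. It assumes a POSIX shell with an awk that accepts -v, and treats "1GB" as 1024MB; the function name ms_per_mb is just for illustration.

```shell
#!/bin/sh
# Pause cost per MB of allocated heap, using the two timings quoted above.
# Assumption: "1GB" is taken to mean 1024MB.
ms_per_mb() {
    # $1 = minor collection pause in seconds, $2 = allocated heap in MB
    awk -v secs="$1" -v mb="$2" 'BEGIN { printf "%.2f\n", secs * 1000 / mb }'
}

echo "100MB heap: $(ms_per_mb 0.05 100) ms per MB"   # 0.50
echo "1GB heap:   $(ms_per_mb 0.7 1024) ms per MB"   # 0.68
```

So roughly ten times the allocated heap costs about fourteen times the pause: the per-MB cost does not fall as the heap grows, it creeps up.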

So, an obvious solution presents itself. Divide and Conquer.

Have a VM (or several) dedicated to serving HTML. These should have a small heap, and a large young generation, so that parallel GCs are generally fast, and even a full collection is not going to take too long. This VM will be very consistent, since pauses should be minimal.

Secondly, have a VM for serving big files. This needs a relatively big heap, but it can be instructed to do full GCs fairly frequently to keep things under control. There will be the occasional pause, but it doesn’t matter too much. Minor collections on this VM will become rather irrelevant, since most requests will outlive the minor GC interval.

Finally, have a VM for edit sessions. This needs a whacking big heap, but it can tolerate pauses as and when required. Since the frequency of editor operations is low, the frequency of minor collects (and hence their CPU overhead) is also low.
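
As a sketch of what the three profiles might look like as JVM options, assuming a POSIX shell: the flags themselves (-Xms, -Xmx, -Xmn, -XX:+UseParallelGC) are standard HotSpot options, but every size below is a placeholder picked for illustration rather than a measured or recommended value, and the function name jvm_opts_for is hypothetical. How the file-serving VM would be told to run full GCs more often is deliberately left out.

```shell
#!/bin/sh
# Illustrative only: all heap sizes are placeholders, not tuned values.
jvm_opts_for() {
    case "$1" in
    html)
        # Small heap, most of it young generation, so even a full GC is short.
        echo "-server -Xms512m -Xmx512m -Xmn384m -XX:+UseParallelGC"
        ;;
    files)
        # Bigger heap: long-running downloads push their objects into the old gen.
        echo "-server -Xms1024m -Xmx1024m -XX:+UseParallelGC"
        ;;
    edit)
        # Biggest heap: edits are rare and can tolerate the odd pause.
        echo "-server -Xms2048m -Xmx2048m -XX:+UseParallelGC"
        ;;
    *)
        echo "unknown role: $1" >&2
        return 1
        ;;
    esac
}

for role in html files edit; do
    echo "$role: $(jvm_opts_for $role)"
done
```

Only the HTML profile pins the young generation size with -Xmn; the other two leave it to the VM's defaults, since minor collections matter much less there.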

The only downside is that we go from having 2 active app server instances to 6 (each function has a pair of VMs so we can take one down without affecting service). But that really only represents a few hundred MB of extra memory footprint, and a couple of dozen more threads on the box. It should, I hope, be a worthwhile trade-off.

