All entries for Tuesday 16 October 2007

October 16, 2007

Writing functional Java with Google Collections

I’ve been experimenting with Google’s new Collections API, which are a kind of type-safe, stripped-down version of Jakarta Commons Collections, providing you with some (though not all) of the list-processing features that are commonplace in more functional languages – like each, collect, detect and inject in Ruby for instance.

In theory, this should give a big win in terms of reducing code complexity and making for a better-decoupled and more testable design.

In practice, this turns out to be true, but with a somewhat unpleasant side effect. Java’s type-checking and lack of support for closures or code blocks means that when you switch to this style of coding, you end up introducing a lot of new little classes, typically anonymous inner classes for things like predicates and functions, which get used once and then thrown away.

For instance; this code has a cyclomatic complexity of 9, which is just barely acceptable. But it’s fairly readable, if you know what the object model looks like.

 for (Content content: page.getContents().values()) {
            for (ContentFetcher cf: content.getContentFetchers()) {
                if (cf instanceof AbstractFileBackedContentFetcher) {
                    String cfFile = ((AbstractFileBackedContentFetcher) cf).getFileName();
                    File directory = new File(rootDir, cfFile).getParentFile();
                    if (directory != null && directory.exists()) {
                        for (File subfile: directory.listFiles()) {
                            if (subfile.isFile() && !filenames.contains(subfile.getAbsolutePath())) {
                                files.add(subfile);
                                filenames.add(subfile.getAbsolutePath());
                            }
                        }
                    }
                }
            }
        }

listifying it, we get something like this. Here the cyclomatic complexity is about 5 – much better – but we’ve had to introduce two new anonymous inner classes, and there are a lot of awfully long lines of code.

        Predicate<AbstractFileBackedContentFetcher> cfDirectoryExists = new Predicate<AbstractFileBackedContentFetcher>() {
            public boolean apply(AbstractFileBackedContentFetcher cf) {
                File dir = new File(rootDir, cf.getFileName()).getParentFile();
                return dir != null && dir.exists();
            }
        };
        FileFilter okToAddFile = new FileFilter() {
            public boolean accept(File pathname) {
                return pathname.isFile() && !filenames.contains(pathname.getAbsolutePath());
            }
        };

        for (Content content: page.getContents().values()) {

            Iterable<AbstractFileBackedContentFetcher> contentFetchersWithExistingDirs = filter(filter(
                    content.getContentFetchers(), AbstractFileBackedContentFetcher.class), cfDirectoryExists);

            for (AbstractFileBackedContentFetcher cf: contentFetchersWithExistingDirs) {
                File[] subfiles = new File(rootDir, cf.getFileName()).getParentFile().listFiles(okToAddFile);
                for (File subfile: subfiles) {
                    files.add(subfile);
                    filenames.add(subfile.getAbsolutePath());
                }
            }
        }

Is that better? I’m not convinced either way. The second block of code is structurally simpler, but it’s about 50% more code. As a general rule, less code is better code...

I think that if you have frequently-used predicates and transformation functions (the instanceOf predicate which GC uses in Iterables.filter(Iterable,class) for instance), then it’s well worth the effort. But if you’re defining a predicate just to avoid having an if inside a foreach, then it’s less clear. Maybe when java 7 comes along and gives us closures (with some syntactic sugar to keep them concise) things will be better.

A related ugliness here (at least, if you’re used to the Ruby way of doing this) is google’s choice not to extend java.util.Iterable, but rather to have the collections methods defined static on the Iterables class. So instead of writing something like

GoogleIterable pages = new GoogleIterable(getPagesList())
files = pages.filter(WebPage.class).filter(undeletedPages).transform(new PagesToFilesTransformation())

we have to do

files = Iterables.transform(Iterables.filter(Iterables.filter(getPagesList(),WebPage.class),undeletedPages),new PagesToFilesTransformation()))

which to my eye is a lot more confusing – it’s harder to see which parameters match up with which methods.

Update I’ve now written my own IternalIterable class, which wraps an Iterable and provides filter(), transform(), inject(), sort() and find() methods that just delegate to google-collections (except for inject() which is all mine :-) ). It’s not very long or complex, and it’s already tidied up some previously ugly code rather nicely.


Most recent entries

Loading…

Search this blog

on twitter...


    Tags

    Not signed in
    Sign in

    Powered by BlogBuilder
    © MMXXI