If you've ever revisited some code and "tidied it up", you've probably performed a refactoring. If the code continued to work afterwards, then it satisfies some definitions of refactoring.
Don Roberts's PhD thesis was a stimulating read about specifying and implementing refactorings, albeit a bit thin in places: the claim that conservative static checking combined with liberal dynamic checking could produce exact results wasn't explored or justified.
Despite this, the thesis was very useful, especially for two things:
- The concept of extending the syntax of the language to produce a meta-language for pattern matching and specifying program transformations. The language used in the thesis was Smalltalk, so I need to think about a suitably Pythonic version of this concept.
- A formal basis for reasoning about refactorings:
A refactoring is an ordered triple R = (pre, T, P) where pre is an assertion that must be true on a program for R to be legal, T is the program transformation, and P is a function from assertions to assertions that transforms legal assertions whenever T transforms programs.
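To make the triple a little more concrete, here is a minimal sketch in Python. Everything in it is my own assumption rather than anything from the thesis: a "program" is reduced to a frozen set of defined names, an assertion is a predicate over such a set, and `Refactoring`, `rename`, `transform` and `post` are hypothetical names. In particular, reading P as "rewrite an assertion about the old program into one about the new program" is only my interpretation of the definition.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet

# Toy model (an assumption): a program is just the set of names it defines,
# and an assertion is any predicate over a program.
Program = FrozenSet[str]
Assertion = Callable[[Program], bool]


@dataclass(frozen=True)
class Refactoring:
    pre: Assertion                              # must hold for R to be legal
    transform: Callable[[Program], Program]     # T: the program transformation
    post: Callable[[Assertion], Assertion]      # P: rewrites assertions

    def apply(self, program: Program) -> Program:
        # Refuse to transform a program on which the precondition fails.
        if not self.pre(program):
            raise ValueError("precondition does not hold")
        return self.transform(program)


def rename(old: str, new: str) -> Refactoring:
    """Hypothetical 'rename a definition' refactoring on the toy model."""

    def pre(program: Program) -> bool:
        return old in program and new not in program

    def transform(program: Program) -> Program:
        return frozenset(new if name == old else name for name in program)

    def post(assertion: Assertion) -> Assertion:
        # Evaluate the old assertion against the program with the rename undone.
        def updated(program: Program) -> bool:
            undone = frozenset(old if name == new else name for name in program)
            return assertion(undone)
        return updated

    return Refactoring(pre, transform, post)


original = frozenset({"foo", "baz"})
refactoring = rename("foo", "bar")
renamed = refactoring.apply(original)
```

The point of carrying `post` around is that an assertion known to hold before the refactoring can be turned into one that holds afterwards without re-analysing the whole program, which is what makes chaining refactorings tractable.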
This is later extended to reason about dependencies between refactorings, using a method that sounds superficially similar to the one Darcs uses to represent changes. I've not properly read up on Darcs's patch theory, so at the moment I consider them similar only because they both use commutativity to establish independence.
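As a rough illustration of what "commutativity establishes independence" could mean, the sketch below checks whether two transformations give the same result in either order. It is neither the thesis's formulation nor Darcs's patch theory: the toy model (a program as a set of names) and the helpers `rename` and `commute_on` are assumptions of mine, and the check only covers one example program rather than proving anything in general.

```python
def rename(old, new):
    """Return a transformation that renames `old` to `new` in a set of names."""
    def transform(names):
        return frozenset(new if name == old else name for name in names)
    return transform


def commute_on(first, second, program):
    """True if applying the two transformations in either order agrees on `program`."""
    return first(second(program)) == second(first(program))


program = frozenset({"a", "b", "c"})

# Renaming unrelated names: the order does not matter.
independent = commute_on(rename("a", "x"), rename("b", "y"), program)

# The second rename acts on the name the first one introduces: the order matters.
dependent = commute_on(rename("a", "x"), rename("x", "z"), program)
```

Here `independent` comes out true and `dependent` false, which matches the intuition that the second pair has to be applied in a fixed order.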
The title of this post is a Python Abstract Syntax Tree. ASTs appear to be the only sensible way of transforming a program, although converting the changes back into source code whilst preserving formatting and comments is challenging. Ideas gleaned from the thesis include extending the AST to have a Comment node or storing "textual coordinates" on the nodes – something like this is already stored in order to provide sensible error diagnostics.
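For a feel of the "textual coordinates" idea, here is a small sketch using the standard library's `ast` module (my choice for the example; `ast.get_source_segment` needs Python 3.8 or later). It shows that nodes carry line and column offsets, and that the comment does not survive parsing. The sample source line is made up.

```python
import ast

source = "total = price * quantity  # compute the bill\n"
tree = ast.parse(source)

# Collect (node type, line, column) for every node that records a position.
coords = [
    (type(node).__name__, node.lineno, node.col_offset)
    for node in ast.walk(tree)
    if hasattr(node, "lineno")
]

# The positions are enough to recover the exact source text of a node.
binop = tree.body[0].value
segment = ast.get_source_segment(source, binop)

# The comment is discarded by the parser, so it appears nowhere in the tree.
comment_survives = "compute the bill" in ast.dump(tree)
```

The coordinates let a tool map a node back to its original text, but the comment really is gone, so preserving it needs either an extended tree or a separate pass over the source.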
At the moment, I am playing around with the Python standard library's compiler module to produce ASTs and writing Visitors to traverse them. A brief conversation in #pypy on Freenode suggested that it might be worth using PyPy instead. One reason given was that I'll need to produce a flow graph in order to do refactoring properly, and PyPy already does something along these lines.
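A visitor of the kind described might look something like the sketch below. Note that it uses the `ast` module and its `NodeVisitor` class rather than the `compiler` module, which was removed in Python 3; the class name `FunctionCollector` and the sample source are purely illustrative.

```python
import ast


class FunctionCollector(ast.NodeVisitor):
    """Record the name and line number of every function definition."""

    def __init__(self):
        self.functions = []

    def visit_FunctionDef(self, node):
        self.functions.append((node.name, node.lineno))
        # Keep walking so that nested definitions are found too.
        self.generic_visit(node)


source = """\
def outer():
    def inner():
        pass
    return inner
"""

collector = FunctionCollector()
collector.visit(ast.parse(source))
```

After the visit, `collector.functions` holds `("outer", 1)` followed by `("inner", 2)`. A refactoring tool would presumably gather rather more than names, but the traversal pattern is the same.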