Extract method is one of the core refactorings performed when cleaning up
a large function. A block of code is analysed for references to variables
defined before it, then moved into a new function with those variables passed
in as arguments. Any variables assigned in the block and used afterwards are
returned from the new function, which is trivial to do with tuples in Python.
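As a minimal sketch of those mechanics (report_before, report_after and summarise are names made up for illustration): the extracted block reads values, so that becomes a parameter, and it assigns total and mean, which are used afterwards, so both come back as a tuple.

```python
def report_before(values):
    # Original function: everything is inline.
    total = sum(values)
    mean = total / len(values)
    return "total=%d mean=%.1f" % (total, mean)


def summarise(values):
    # Extracted block: `values` is read here, so it is passed in.
    total = sum(values)
    mean = total / len(values)
    # Both names are used by the caller afterwards, so both are returned.
    return total, mean


def report_after(values):
    # The call site unpacks the tuple back into the original local names.
    total, mean = summarise(values)
    return "total=%d mean=%.1f" % (total, mean)
```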
Fowler in Refactoring suggests alternatives for languages like Java: replace
temporary values with query methods, or even create a method object instead
of a new method (i.e. encapsulate the state changes inside a new class which has
a “calculate” or similar method). The absurdity of this style of programming
was commented on in this excellent article by Steve Yegge.
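For comparison, here is roughly what a method object looks like when written in Python. This is only a sketch: PriceCalculation, price and the discount rule are invented for illustration and are not taken from Fowler's book.

```python
class PriceCalculation:
    """Hypothetical method object: the old function's locals become attributes."""

    def __init__(self, quantity, unit_price):
        self.quantity = quantity
        self.unit_price = unit_price
        self.base = 0
        self.discount = 0

    def calculate(self):
        # Each former block is now a method that mutates shared state.
        self._compute_base()
        self._compute_discount()
        return self.base - self.discount

    def _compute_base(self):
        self.base = self.quantity * self.unit_price

    def _compute_discount(self):
        # Made-up rule: 10% off when the base price is over 100.
        self.discount = self.base // 10 if self.base > 100 else 0


def price(quantity, unit_price):
    # The same calculation as a plain function, for contrast.
    base = quantity * unit_price
    discount = base // 10 if base > 100 else 0
    return base - discount
```

The class needs no multiple return values because every intermediate result lives on self, at the cost of a lot more ceremony.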
Anyway, here’s a fun problem with performing extract method automatically: how to
propagate early exits?
    i = f(i)
    # Begin extract method
    if i == 0:
        return
    # End extract method
    i = g(i)
To propagate the early return, the extracted function has to return a boolean
saying whether the caller should return, and the caller has to test it. That
means we must introduce temporaries, either with ugly automatic naming to
avoid conflicts:
    _cond_1, _ret_1_true, _ret_1_false = extracted(i)
or by asking the user for suitable names.
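One way the transformed code could look, as a sketch only. The bodies of f and g are stand-ins I made up, and the two-value return (a flag plus the value to return) is one possible encoding, not necessarily the one a real tool would pick.

```python
def f(i):
    return i - 1  # stand-in body; the real f is not given


def g(i):
    return i * 10  # stand-in body; the real g is not given


def original(i):
    i = f(i)
    if i == 0:
        return "zero"  # early exit inside the block to be extracted
    i = g(i)
    return i


def extracted(i):
    # Returns (should_exit, value_to_return) instead of returning directly.
    if i == 0:
        return True, "zero"
    return False, None


def refactored(i):
    i = f(i)
    _exit_1, _ret_1 = extracted(i)  # automatically named temporaries
    if _exit_1:
        return _ret_1  # the early return is replayed in the caller
    i = g(i)
    return i
```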
This problem also extends to other control flow statements: break, continue and
yield. (Of course, raise isn’t a problem, since an exception propagates out of
the extracted function on its own.)
Yield is particularly noteworthy because a yield must remain in the original
function for that function to remain a generator, i.e. any refactoring
around a yield needs a transformation beyond simple code movement.
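A small sketch of what goes wrong and one possible fix. All names here are invented, and yield from needs Python 3.3 or later; on older versions a for loop that re-yields each item plays the same role for plain iteration.

```python
def numbers_before(limit):
    # Original generator with the yield inline.
    for n in range(limit):
        if n % 2 == 0:
            yield n * n


def even_squares(limit):
    # Extracted block: it contains the yield, so it is now the generator.
    for n in range(limit):
        if n % 2 == 0:
            yield n * n


def numbers_broken(limit):
    # Naive code movement: no yield is left here, so this is an ordinary
    # function that creates a generator, discards it and returns None.
    even_squares(limit)


def numbers_after(limit):
    # Keeping a yield in the caller preserves its generator-ness.
    yield from even_squares(limit)
```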
In Python 2.5, yield was extended into an expression that can receive a value
from the caller (via send()), so data flow between a generator and its caller
became bidirectional.
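To make the bidirectional flow concrete, here is a tiny illustration (running_total and demo are made-up names): each send() pushes a value into the generator and gets the next yielded value back.

```python
def running_total():
    total = 0
    while True:
        # The value passed to send() becomes the result of the yield expression.
        received = yield total
        total += received


def demo():
    gen = running_total()
    first = next(gen)  # run to the first yield before anything can be sent
    return [first, gen.send(5), gen.send(10)]
```

This matters for extraction because a loop that simply re-yields items from an extracted generator would not forward sent values, whereas yield from (Python 3.3+) does.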
Another fun thing is detecting and rejecting invalid ranges of extract method:
Attempting to extract between any two of the function calls is invalid (apart
from moving the entire try/except/finally block). Similarly, trying to extract
from j() into the new function definition is invalid.
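A rough sketch of such a check using the standard ast module (end_lineno needs Python 3.8+). The helper names and SAMPLE are my own assumptions; SAMPLE is a stand-in, not the original listing. The idea is that a line range is only a candidate if it covers a run of complete, consecutive statements within a single block. It checks structure only and says nothing about return, break or yield inside the range.

```python
import ast

# Assumed stand-in source; line numbers start at 1 on "def outer():".
SAMPLE = """\
def outer():
    try:
        g()
    except ValueError:
        h()
    finally:
        i()
    j()

def later():
    k()
"""


def statement_lists(tree):
    """Yield every list of statements (function bodies, try bodies, ...)."""
    for node in ast.walk(tree):
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            # Skip expression-valued fields such as a lambda's body.
            if isinstance(block, list) and block and isinstance(block[0], ast.stmt):
                yield block


def is_extractable(source, first_line, last_line):
    """True if the range is a run of whole statements inside one block."""
    tree = ast.parse(source)
    for block in statement_lists(tree):
        for start in range(len(block)):
            if block[start].lineno != first_line:
                continue
            # The range must also end exactly where a later sibling ends.
            for stmt in block[start:]:
                if stmt.end_lineno == last_line:
                    return True
    return False
```

With this sample, the whole try statement (lines 2 to 7) is accepted, while g() through h() (lines 3 to 5) and j() through the end of later (lines 8 to 11) are rejected.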
EDIT: Fixed Python doc URL