A bit of functional Python will help you discover alternate approaches to how programs can be structured and why it’s important to practice such skills on small, well-defined programs.
In this article you will find:
- praise for deliberate practice and gaining experience on small, simple programs;
- some surprising connections between the object-oriented and functional paradigms;
- how the declarative approach is changing the way we think about software.
The German composer Robert Schumann wrote in an article dedicated to young musicians:
IX: Strive to play easy pieces well and beautifully; it is better than to render harder pieces only indifferently well.
Robert Schumann, Rules for Young Musicians
I try to do the very same kind of deliberate practice from time to time by writing little tools that make my life easier.
One that I’ve written recently is the doc command, so that I don’t have to think about how to access Docker configuration files in the projects I use.
In the projects I work with, I may use the original docker-compose command in one of the following ways:
- docker-compose [...] – provided that there’s a file in the main directory;
- docker-compose -f docker/development/docker-compose.yml [...] – using the file in the docker/development directory, which is a common practice at our company;
- ./vendor/bin/sail [...] – I have one project in the Laravel framework.
The aim of the doc tool is to make each of the above projects run with a single command:
doc [...]
So what, three ifs in a shell script and we’re done?
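For reference, the "three ifs" version could look roughly like this when written in Python. This is only a sketch: the choose_command name, the order of the checks, and the exists parameter (there so the function can be exercised without touching the disk) are assumptions, not part of the original tool.

```python
import os
from typing import Callable, Optional


def choose_command(
    argv: list[str],
    exists: Callable[[str], bool] = os.path.exists,
) -> Optional[list[str]]:
    """Pick a command with three explicit checks; the first match wins."""
    if exists("vendor/bin/sail"):
        return ["vendor/bin/sail", *argv]
    if exists("docker-compose.yml"):
        return ["docker-compose", *argv]
    if exists("docker/development/docker-compose.yml"):
        return ["docker-compose", "-f", "docker/development/docker-compose.yml", *argv]
    # None of the known layouts was found in the current directory.
    return None
```

Here the knowledge about which file maps to which command is tangled up with the control flow, which is exactly what the rest of the article pulls apart.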
Deliberate practice
I decided that I would practice my skills in writing declarative code.
I recommend this talk (after reading the article).
What exactly is a declarative approach? About two years ago, while watching another Kevlin Henney talk, I came across the answer. What was it? Let’s look at the code and try to figure it out together.
First version of the doc program
https://github.com/dragonee/dotfiles/blob/1331c158c1cee42615d398d1181f70e23b10ffdc/bin/doc
My first, original assumption about this program was that if it found a file at a particular location, it should invoke the relevant command with the relevant arguments.
This is reflected in the relation at the bottom of the file:
to_check = (
    ('vendor/bin/sail', sail_args),
    ('docker-compose.yml', docker_without_path_args),
    ('docker/development/docker-compose.yml', docker_with_path_args),
)
Mathematically speaking, to_check is a relation from the domain of relative paths into the domain of functions that return a list of arguments for the resulting command.
To make it work, you also need:
- three functions that generate the corresponding arguments: sail_args, docker_without_path_args, docker_with_path_args;
- a helper function print_and_run, which, as its name suggests, displays the resulting command and executes it;
- a bit of infrastructure code, configuring the program so that it can be interrupted by pressing Ctrl-C;
- and the program evaluation code.
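The helper and the Ctrl-C setup are not shown in the article, so here is a minimal sketch of what they might look like. The function bodies, the exit_quietly_on_ctrl_c name, and the exit code 130 are assumptions made for illustration; only the print_and_run name and its (func, path) call shape come from the text.

```python
import shlex
import signal
import subprocess
import sys
from typing import Callable


def print_and_run(func: Callable[..., list[str]], *args: str) -> int:
    """Build the command with func(*args), echo it, run it, return its exit code."""
    command = func(*args)
    # Show the user exactly what is about to be executed.
    print(shlex.join(command))
    return subprocess.call(command)


def exit_quietly_on_ctrl_c() -> None:
    """Assumed setup: make Ctrl-C end the program without a Python traceback."""
    signal.signal(signal.SIGINT, lambda signum, frame: sys.exit(130))
```

The side effects (printing and spawning a process) live in this one helper, so the argument functions can stay simple functions that return lists.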
What is program evaluation code? This is the code that ultimately causes the program to execute.
Nothing happens in the doc program up to line 57, apart from the program configuration and the definitions of functions that, if called by someone with such and such parameters, would return such and such a result. Only at line 57 lies the five-line essence of the program’s operation.
for path, func in to_check:
    if not os.path.exists(path):
        continue
    sys.exit(print_and_run(func, path))
This rather trivial procedure de facto transforms our program, written in a general-purpose language (Python), into a DSL (Domain-Specific Language) in which we communicate in terms of paths and argument functions, whose purpose is to run the corresponding docker-compose command.
Separating the code between the evaluation algorithm and the declarative code reduces our cognitive load when reading the code – but this happens if and only if the evaluation code thus becomes short and simple, and the declarative elements can be read independently of the rest of the code.
In order for declarative elements to be read independently of the rest of the code, it would be ideal if our functions operated on intuitive data types. And if a function has side effects, these should be immediately apparent – as in the case of the print_and_run function, whose meaning can be inferred from its name.
Iwo wrote more about naming in his article Scientific perspective on naming in programming.
Similarities
Having made sure the program was working as it should, I immediately had the urge to improve it. I had two reasons for this. The first was the thought:
What if we used a test function instead of a file path? Then we would have both a test function and a related argument-generating function.
Then the evaluation code would become even simpler:
for test, func in to_check:
    if not test():
        continue
    sys.exit(print_and_run(func()))
The second reason was the code smell I left in the underlying code. Since one argument-generating function took a path argument, all of them needed the same argument in their signatures, because the original evaluation code passed it to each function (by calling func(path)), regardless of whether the function needed it or not. For this reason, the signatures of the functions returning lists of arguments were not very readable:
def docker_with_path_args(path):
    ...
def sail_args(*args):
    ...
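To make the smell concrete, here is a guess at what the first-version argument functions might have looked like. Only the names and signatures come from the article; the bodies, including the exact sail and docker-compose arguments, are assumptions.

```python
import sys


def sail_args(*args: str) -> list[str]:
    """Accept and ignore whatever the evaluator passes (here: the path)."""
    return ["vendor/bin/sail"] + sys.argv[1:]


def docker_without_path_args(*args: str) -> list[str]:
    """Same workaround: the path is received but never used."""
    return ["docker-compose"] + sys.argv[1:]


def docker_with_path_args(path: str) -> list[str]:
    """The only function that actually needs the path."""
    return ["docker-compose", "-f", path] + sys.argv[1:]
```

Two of the three functions take *args purely to satisfy the evaluator, and a reader cannot tell from the signature what they are supposed to receive.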
All right, we know what we want to eliminate, but how do we do it?
Consequences
First of all, let’s compare the evaluators of our DSLs once again:
Original evaluator:
for path, func in to_check:
    if not os.path.exists(path):
        continue
    sys.exit(print_and_run(func, path))
And the ideal evaluator:
for test, func in to_check:
    if not test():
        continue
    sys.exit(print_and_run(func()))
If we use the second evaluator, we lose all information about the file path: the evaluator no longer passes it to the associated func function. What’s more, the evaluator doesn’t even know the concept of a file path, which is hidden behind the abstraction of the test function.
This means that the test function and the associated argument function must somehow agree on the path to the correct file. We have three options here:
A mindless solution
def test_if_docker_development_docker_compose_yml_exists():
    ...
def args_for_docker_development_docker_compose_yml():
    ...
Well, it’s not perfect. Let’s look further.
An object-oriented solution
Someone among you might get the idea that, after all, it should be enough to bind the two functions with an object:
class PathDockerTest(object):
    def __init__(self, path):
        self.path = path
    def test(self):
        ...
    def func(self):
        ...
And that would be a working solution. Can we do it any other way?
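Filled in, the class might look like the sketch below. The method bodies and the exact docker-compose arguments are assumptions made for illustration; the point is only that both methods read the same self.path.

```python
import os
import sys


class PathDockerTest:
    """Bind a test and an argument function to one shared path."""

    def __init__(self, path: str) -> None:
        self.path = path

    def test(self) -> bool:
        """Report whether the configuration file exists."""
        return os.path.exists(self.path)

    def func(self) -> list[str]:
        """Build docker-compose arguments that point at the stored path."""
        return ["docker-compose", "-f", self.path] + sys.argv[1:]


# Bound methods are zero-argument callables, so they fit the ideal evaluator.
entry = PathDockerTest("docker/development/docker-compose.yml")
to_check = ((entry.test, entry.func),)
```

The evaluator never sees the object, only the two bound methods, which already hints at the closure-based solution.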
A functional programming solution
The above is nothing but a lexical closure. JavaScript comes in handy here, as it clearly demonstrates that we can do exactly the same thing as in the example above without having to use objects per se.
function functions_from_path(path) {
    return [
        function() {
            // ...
        },
        function() {
            // ...
        }
    ]
}
const [test, func] = functions_from_path("docker-compose.yml");
And it’s true! Lexical closure, state (memory cells that we can modify), and procedures build an entire object model for us! It turns out that object-oriented and functional programming are not so far apart after all.
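The same idea can be sketched in Python, where nested functions capture path in the same way. The function bodies and the compose_test and compose_func names are assumptions for illustration.

```python
import os
import sys
from typing import Callable, Tuple


def functions_from_path(path: str) -> Tuple[Callable[[], bool], Callable[[], list[str]]]:
    """Return a (test, func) pair that both remember `path` via a closure."""

    def test() -> bool:
        # `path` is not a parameter here; it is captured from the outer call.
        return os.path.exists(path)

    def func() -> list[str]:
        return ["docker-compose", "-f", path] + sys.argv[1:]

    return test, func


compose_test, compose_func = functions_from_path("docker-compose.yml")
```

Compared with the class, the shared state is the captured variable rather than an attribute on self, but the two returned callables behave the same way from the evaluator's point of view.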
A modified functional programming solution
What I wanted was for the test function and the argument function to be unrelated until the last possible moment. To achieve this goal, I created some higher-order functions:
from functools import wraps
from typing import Callable, Tuple


def _curry_1(arg, wrapped: Callable) -> Callable:
    """
    Return a new function that takes the remaining n-1 arguments of `wrapped`.
    Bind its first argument to `arg`.
    """
    @wraps(wrapped)
    def new_func(*args, **kwargs):
        # Prepend the stored argument to whatever the caller passes.
        return wrapped(arg, *args, **kwargs)
    return new_func


def bind(arg, *funcs: Callable) -> Tuple[Callable, ...]:
    """Bind multiple functions to the same argument."""
    return tuple(_curry_1(arg, f) for f in funcs)
The first of these, _curry_1, creates a lexical closure around the wrapped function, storing its first argument. The name refers to the currying operation, and the number 1 in the name hints at the implementation: it is a so-called ‘unary currying’ that binds only a single argument (strictly speaking, this is partial application rather than full currying).
The other is a helper function that calls _curry_1 with the same argument on multiple functions. The name bind fits here, since it is this function that binds previously unrelated functions to each other through their first argument.
With these two functions, the to_check relation also changes:
to_check = (
    bind("vendor/bin/sail", os.path.exists) + (sail_args,),
    bind("docker-compose.yml", os.path.exists) + (docker_without_path_args,),
    bind(
        "docker/development/docker-compose.yml", os.path.exists, docker_with_path_args
    ),
)
All of a sudden, it appears that:
- I don’t have to write classes;
- I can use the os.path.exists function as a test function – I don’t need to write new functions;
- where needed, I bind the test and argument function, and where unnecessary, only the test function – the signatures match;
- I have just shown you in a few lines everything that my program can currently do.
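To see how the pieces fit together, here is a small self-contained sketch. _curry_1 and bind follow the definitions above; the simplified argument functions, the first_match helper (which returns the command instead of running it), and the example paths are assumptions made so the snippet can run anywhere.

```python
import os
from functools import wraps
from typing import Callable, Optional, Tuple


def _curry_1(arg, wrapped: Callable) -> Callable:
    @wraps(wrapped)
    def new_func(*args, **kwargs):
        return wrapped(arg, *args, **kwargs)
    return new_func


def bind(arg, *funcs: Callable) -> Tuple[Callable, ...]:
    return tuple(_curry_1(arg, f) for f in funcs)


def without_path_args() -> list[str]:
    return ["docker-compose"]


def with_path_args(path: str) -> list[str]:
    return ["docker-compose", "-f", path]


def first_match(to_check) -> Optional[list[str]]:
    """Hypothetical evaluator: return the arguments for the first passing test."""
    for test, func in to_check:
        if test():
            return func()
    return None


to_check = (
    # Only the test is bound; the argument function needs no path.
    bind("no/such/file.yml", os.path.exists) + (without_path_args,),
    # Both functions are bound to the same path (the current directory here).
    bind(os.curdir, os.path.exists, with_path_args),
)
```

Every entry ends up as a pair of zero-argument callables, which is why the evaluator can stay ignorant of paths altogether.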
Here is the code after changes:
https://github.com/dragonee/dotfiles/blob/1331c158c1cee42615d398d1181f70e23b10ffdc/bin/doc
(By the way, many thanks to Kamil Supera for Three friends of the better code style – Python; in the commit above you will also see the fixes that came from applying all three tools.)
A minor refactoring
The _curry_1 function is unnecessary, since it can be replaced by the functools.partial function from the standard Python library. Let’s remove it now that it’s clear what it does.
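A sketch of bind rewritten on top of functools.partial could look like this. The linked commit is the authoritative version; the simplified docker_with_path_args and the exists_here and build_args names are assumptions for illustration.

```python
import os
from functools import partial
from typing import Callable, Tuple


def bind(arg, *funcs: Callable) -> Tuple[Callable, ...]:
    """Bind multiple functions to the same first argument."""
    return tuple(partial(f, arg) for f in funcs)


def docker_with_path_args(path: str) -> list[str]:
    return ["docker-compose", "-f", path]


exists_here, build_args = bind(
    "docker/development/docker-compose.yml", os.path.exists, docker_with_path_args
)
```

One small difference from the hand-written closure: a partial object exposes the wrapped function and the bound arguments as .func and .args, but it does not copy the wrapped function's __name__ the way @wraps does.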
https://github.com/dragonee/dotfiles/commit/8b2c3d8f3399bdfb8203ee7f056522a7918b7e91
Getting rid of sys.argv from argument functions
Time for one last refactoring.
Up to now, each argument function has referred directly to sys.argv in its body. This is usually not a problem, because sys.argv rarely changes during program execution, so it does not complicate our reasoning about the program very much.
def docker_without_path_args() -> list[str]:
    """Construct call arguments for docker-compose command."""
    return [
        "docker-compose",
    ] + sys.argv[1:]
But if we wanted to write tests for our program, this would make things more difficult. So let’s remove sys.argv from the argument functions and move this list to where it belongs: the evaluator.
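One way this could look is sketched below; the linked commit is the authoritative version. The argv parameter name, the build_command helper, and the example to_check entries are assumptions for illustration.

```python
from functools import partial
from typing import Optional, Sequence


def docker_without_path_args(argv: Sequence[str]) -> list[str]:
    """Construct call arguments for docker-compose from explicit user arguments."""
    return ["docker-compose", *argv]


def docker_with_path_args(path: str, argv: Sequence[str]) -> list[str]:
    """Same, but pointing docker-compose at a specific configuration file."""
    return ["docker-compose", "-f", path, *argv]


def build_command(to_check, argv: Sequence[str]) -> Optional[list[str]]:
    """Hypothetical evaluator core: the only place that hands argv to a function."""
    for test, func in to_check:
        if test():
            return func(argv)
    return None


to_check = (
    (lambda: False, docker_without_path_args),
    (lambda: True, partial(docker_with_path_args, "docker/development/docker-compose.yml")),
)
```

In the real program the evaluator would pass sys.argv[1:], while a test can pass any list it likes, for example build_command(to_check, ["up", "-d"]).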
https://github.com/dragonee/dotfiles/commit/78f8cddcfb56796db3640afc75c035454305aede
Summary
Here is the final code, color-coded:
- green is the program configuration;
- blue is the declarative code, functions, and declarations of what would be done if it had to be done;
- red is the evaluator code.
By separating the code responsible for how we perform a task from the code responsible for what we want to accomplish, we arrive at relatively elegant code.
And Kevlin Henney’s answer, to which I referred earlier? It goes more or less like this:
But this is the clever bit, this is how the lazy evaluation of functional programming works. It’s so lazy it doesn’t happen. It’s kind of like a promise, it’s kind of like an idea.
It’s just like “If I were” – so this is the way you do; this is how Haskell and functional programming gets around things like side-effects […]. In a pure functional language you cannot have side-effects, because these are pure functions, “but let me describe to you, how I would do I/O. I’m not doing it, oh no no no no, doing is an imperative thing, but if I were to do it […] it would look like this. But I’m not doing I/O. And then you hand this off to the runtime – but you are still pure. It’s an outsourcing trick.”
Kevlin Henney, Get Kata (48m0s)
And so we outsourced the execution of our code to one part of it, leaving the rest of it much more declarative.
Below is the whole thing on GitHub.
https://github.com/dragonee/dotfiles/blob/78f8cddcfb56796db3640afc75c035454305aede/bin/doc
Co-founder and CIO of Makimo, deeply fascinated with philosophy, humans, technology and the future.