A few months ago, I wrote
the definitive guide about Python method declaration,
which had quite a good success. I still fight every day in
OpenStack to have the developers declare their methods
correctly in the patches they submit.
Automation plan
The thing is, I really dislike doing the same things over and over again.
Furthermore, I'm not perfect either, and I miss a lot of these kind of problems
in the reviews I made. So I decided to replace me by a program a more
scalable and less error-prone version of my brain.
In OpenStack, we rely on
flake8 to
do static analysis of our Python code in order to spot common programming
mistakes.
But we are really pedantic, so we wrote some extra hacking rules that we
enforce on our code. To that end, we wrote a
flake8 extension called
hacking. I really like these rules, I
even recommend to apply them in your own project. Though I might be biased or
victim of Stockholm syndrome. Your call.
Anyway, it's pretty clear that I need to add a check for method declaration in
hacking. Let's write a
flake8 extension!
Typical error
The typical error I spot is the following:
class Foo(object):
# self is not used, the method does not need
# to be bound, it should be declared static
def bar(self, a, b, c):
return a + b - c
That would be the correct version:
class Foo(object):
@staticmethod
def bar(a, b, c):
return a + b - c
This kind of mistake is not a show-stopper. It's just not optimized. Why you
have to manually declare static or class methods might be a language issue, but
I don't want to debate about Python misfeatures or design flaws.
Strategy
We could probably use some big magical regular expression to catch this
problem.
flake8 is based on the
pep8
tool, which can do a line by line analysis of the code. But this method would
make it very hard and error prone to detect this pattern.
Though it's also possible to do an
AST based analysis on on a per-file basis with
pep8. So
that's the method I pick as it's the most solid.
AST analysis
I won't dive deeply into Python AST and how it works. You can find plenty of
sources on the Internet, and I even talk about it a bit in my book
The Hacker's Guide to Python.
To check correctly if all the methods in a Python file are correctly declared,
we need to do the following:
- Iterate over all the statement node of the AST
- Check that the statement is a class definition (
ast.ClassDef
)
- Iterate over all the function definitions (
ast.FunctionDef
) of that class
statement to check if it is already declared with @staticmethod
or not
- If the method is not declared static, we need to check if the first argument
(
self
) is used somewhere in the method
Flake8 plugin
In order to register a new plugin in
flake8 via
hacking, we just need to
add an entry in
setup.cfg
:
[entry_points]
flake8.extension =
[ ]
H904 = hacking.checks.other:StaticmethodChecker
H905 = hacking.checks.other:StaticmethodChecker
We register 2
hacking codes here. As you will notice later, we are actually
going to add an extra check in our code for the same price. Stay tuned.
The next step is to write the actual plugin. Since we are using an AST based
check, the plugin needs to be a class following a certain signature:
@core.flake8ext
class StaticmethodChecker(object):
def __init__(self, tree, filename):
self.tree = tree
def run(self):
pass
So far, so good and pretty easy. We store the tree locally, then we just need
to use it in
run()
and
yield
the problem we discover following
pep8
expected signature, which is a tuple of
(lineno, col_offset, error_string,
code)
.
This AST is made for walking
The
ast
module provides the
walk
function, that allow to iterate easily on
a tree. We'll use that to run through the AST. First, let's write a loop that
ignores the statement that are not class definition.
@core.flake8ext
class StaticmethodChecker(object):
def __init__(self, tree, filename):
self.tree = tree
def run(self):
for stmt in ast.walk(self.tree):
# Ignore non-class
if not isinstance(stmt, ast.ClassDef):
continue
We still don't check for anything, but we know how to ignore statement that are
not class definitions. The next step need to be to ignore what is not function
definition. We just iterate over the attributes of the class definition.
for stmt in ast.walk(self.tree):
# Ignore non-class
if not isinstance(stmt, ast.ClassDef):
continue
# If it's a class, iterate over its body member to find methods
for body_item in stmt.body:
# Not a method, skip
if not isinstance(body_item, ast.FunctionDef):
continue
We're all set for checking the method, which is
body_item
. First, we need to
check if it's already declared as static. If so, we don't have to do any
further check and we can bail out.
for stmt in ast.walk(self.tree):
# Ignore non-class
if not isinstance(stmt, ast.ClassDef):
continue
# If it's a class, iterate over its body member to find methods
for body_item in stmt.body:
# Not a method, skip
if not isinstance(body_item, ast.FunctionDef):
continue
# Check that it has a decorator
for decorator in body_item.decorator_list:
if (isinstance(decorator, ast.Name)
and decorator.id == 'staticmethod'):
# It's a static function, it's OK
break
else:
# Function is not static, we do nothing for now
pass
Note that we use the special
for/else
form of Python, where the
else
is
evaluated unless we used
break
to exit the
for
loop.
for stmt in ast.walk(self.tree):
# Ignore non-class
if not isinstance(stmt, ast.ClassDef):
continue
# If it's a class, iterate over its body member to find methods
for body_item in stmt.body:
# Not a method, skip
if not isinstance(body_item, ast.FunctionDef):
continue
# Check that it has a decorator
for decorator in body_item.decorator_list:
if (isinstance(decorator, ast.Name)
and decorator.id == 'staticmethod'):
# It's a static function, it's OK
break
else:
try:
first_arg = body_item.args.args[0]
except IndexError:
yield (
body_item.lineno,
body_item.col_offset,
"H905: method misses first argument",
"H905",
)
# Check next method
continue
We finally added some check! We grab the first argument from the method
signature. Unless it fails, and in that case, we know there's a problem: you
can't have a bound method without the
self
argument, therefore we raise the
H905
code to signal a method that misses its first argument.
Now you know why we registered this second
pep8
code along with
H904
in
setup.cfg
. We have here a good opportunity to kill two birds with one stone.
The next step is to check if that first argument is used in the code of the
method.
for stmt in ast.walk(self.tree):
# Ignore non-class
if not isinstance(stmt, ast.ClassDef):
continue
# If it's a class, iterate over its body member to find methods
for body_item in stmt.body:
# Not a method, skip
if not isinstance(body_item, ast.FunctionDef):
continue
# Check that it has a decorator
for decorator in body_item.decorator_list:
if (isinstance(decorator, ast.Name)
and decorator.id == 'staticmethod'):
# It's a static function, it's OK
break
else:
try:
first_arg = body_item.args.args[0]
except IndexError:
yield (
body_item.lineno,
body_item.col_offset,
"H905: method misses first argument",
"H905",
)
# Check next method
continue
for func_stmt in ast.walk(body_item):
if six.PY3:
if (isinstance(func_stmt, ast.Name)
and first_arg.arg == func_stmt.id):
# The first argument is used, it's OK
break
else:
if (func_stmt != first_arg
and isinstance(func_stmt, ast.Name)
and func_stmt.id == first_arg.id):
# The first argument is used, it's OK
break
else:
yield (
body_item.lineno,
body_item.col_offset,
"H904: method should be declared static",
"H904",
)
To that end, we iterate using
ast.walk
again and we look for the use of the
same variable named (usually
self
, but if could be anything, like
cls
for
@classmethod
) in the body of the function. If not found, we finally yield the
H904
error code. Otherwise, we're good.
Conclusion
I've
submitted this patch to hacking,
and, finger crossed, it might be merged one day. If it's not I'll create a new
Python package with that check for flake8. The actual submitted code is a bit
more complex to take into account the use of
abc
module and include some
tests.
As you may have notice, the code walks over the module AST definition several
times. There might be a couple of optimization to browse the AST in only one
pass, but I'm not sure it's worth it considering the actual usage of the tool.
I'll let that as an exercise for the reader interested in contributing to
OpenStack.
Happy hacking!