wcEcoli Python style guide
I condensed the PEP8 and Google style guides here, along with our decisions. Attach ππΎ, ππΎ, and β annotations. -- Jerry
- See https://www.python.org/dev/peps/pep-0008/
- See Google's Python style guide
- See our March 12, 2018 meeting slides with notes on the tradeoffs between alternatives.
Style guides make recommendations among programming alternatives like imports, docstrings, names, and formatting. The point is to reduce the likelihood of some bugs, increase code readability and familiarity, and make it easier for programmers to collaborate and merge code. But don't overdo consistency.
For each guideline, we could decide to:
- Set an expectation. Point it out in code reviews.
- Soft target. Don't sweat it in code reviews.
- Don't adopt it.
and we can plan to:
- Adopt it for new code.
- Plan to change existing code gradually or rapidly.
-
ππΎIndentation: Stick with TABs in this project.
-
Set your editor's TAB stops to 4 spaces.
-
Python assumes TAB stops are at 8 spaces unless told otherwise.
-
Python 3 disallows mixing the use of tab and space indentation.
-
Set a nightly build script or a check-in script to flag SPACE indentation. It could use `python -t sourcefile.py`, or `tabnanny` which is less lenient but still allows mixtures that Python allows, or just search for any SPACE in indentation (although it's normal to use TABs followed by some SPACEs, esp. for odd indentation to line up with a `(` or for half-TAB indentation when tab stops are 8 spaces):
    find . -name "*.py" -o -name "*.pyx" | xargs grep '^\s* '
although this will include indentation in multi-line comments and strings and indentation that aligns a continuation line with the previous line's `(` or `[`.
-
(PEP8 recommends 4 spaces per indentation level but that's primarily for shared code.)
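The SPACE-indentation check described above could also be written in Python. This is a minimal sketch, not an existing wcEcoli tool: the helper name `space_indented_lines` is hypothetical, and like the grep approach it also flags continuation-line alignment and indentation inside multi-line strings.

```python
import re

# Matches a line whose leading whitespace contains at least one SPACE
# (zero or more TABs followed by a SPACE).
SPACE_INDENT = re.compile(r'^\t* ')


def space_indented_lines(source):
	"""Return the 1-based numbers of lines whose indentation contains a SPACE."""
	return [
		number
		for (number, line) in enumerate(source.splitlines(), start=1)
		if SPACE_INDENT.match(line)
	]


# Hypothetical sample: line 3 is SPACE-indented, line 4 mixes a TAB with SPACEs.
sample = "def f():\n\treturn 1\n    x = 2\n\t  y = 3\n"
flagged = space_indented_lines(sample)
```

A check-in script could read each `*.py` and `*.pyx` file, call this helper, and report the file name with the flagged line numbers.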
-
-
Use ASCII text in Python 2; UTF-8 in Python 3. Otherwise the file needs an encoding declaration.
-
ππΎThe line length soft target is 79 columns; harder target at 99 columns; no hard limit. The same limit for comments and docstrings.
-
A standard line length aids editor window layout and diff displays, but bio simulations might have many long names. It's annoying to stay within a hard limit but very useful to have a shared target.
-
(PEP8 recommends 79 columns, but 72 for comments and docstrings.)
-
A shell script to check for very long lines in source files:
find . -name '*.py' -exec awk '{ if (length($0) > max) max = length($0) } END { if (max > 199) print max, FILENAME }' {} \;
-
-
ππΎDon't use implicit relative imports (e.g. `import sibling` where `sibling` is in the same directory) because it can import the wrong file (e.g. `import random`), it can import the same module twice (really?), and it doesn't work in Python 3.
Instead use absolute imports (preferred) or explicit relative imports:
    from __future__ import absolute_import  # prevents implicit relative imports
    from . import sibling
    from path.to.mypkg import sibling
    from .sibling import example
-
ππΎPut imports at the top of the file.
Occasionally there are good reasons to break this rule, like `import pdb`.
Plan to fix cases like analysis plots that have imports nested within classes or functions.
-
ππΎImport separate modules on separate lines.
-
ππΎAvoid wildcard imports (`from <module> import *`).
- Never `import *` within a class or a function. That generates slow code and it won't compile in Python 3.
-
Use `if x is None:` or `if x is not None:` rather than `==` or `!=`, and likewise for other singletons like enum values (see pip enum34). It states a simpler aim. It's faster, esp. if it avoids calling a custom `__eq__()` method, and it might avoid exceptions or incorrect results in `__eq__(None)`.
-
ππΎPrefer to use Python's implied line continuation inside parentheses, brackets and braces over a backslash for line continuation.
-
Write docstrings for all public modules, functions, classes, and methods.
This is a nice goal.
ππΎ (Travis) - might be worth it to go back and try to add some for functions that already exist but this should essentially be required for any new pushes.
ππΎ (John) - Yes, this is very important. Even just a sentence would make much of the code comprehensible. I think we should also reevaluate what is 'public' or 'private' - I tried to use
`_leading_underscores` sensibly but the current framework was one of my first big OOP endeavors, and oftentimes the pattern was broken.
-
Comments that contradict the code are worse than no comments.
-
ππΎ Line continuation
PEP8 alternative:
    foo = long_function_name(var_one, var_two,
                             var_three, var_four)
preferred:
    def long_function_name(
            var_one, var_two, var_three,
            var_four):
        print(var_one)
preferred:
    def long_function_name(
            var_one, var_two, var_three,
            var_four
            ):
        print(var_one)
PyCharm is configurable but it implements this indentation style by default, and using the Refactor command to rename "long_function_name" will suitably adjust the indentation of the continuation lines.
β (Travis) - I think the second is preferable because sometimes if we line up with the start of the parenthesis it limits the amount of space we have to work with and creates too many new lines.
ππΎ (John) - Agreed, I prefer the latter. Often I'll try to group related arguments on one line.
-
Prefer to put a line break before a binary operator, but after is also OK.
-
ππΎ Put at least one space before an inline comment, then `#␣` (that's one space after the `#`).
β (John) - I don't quite get the logic on this one [two spaces before the `#`], except that without code formatting a `#` with one space on each side would look like an operator. Inline comments are already so hard to fit on a line that adding one more character sounds irritating.
-
Two blank lines between top-level definitions, one blank line between method definitions. Other blank lines sparingly.
β (John) - I'm pretty aggressive with whitespace, it helps me with readability.
-
Import order: standard library imports, blank line, third party imports, blank line, local imports.
-
Order:
    """Module docstring."""
    from __future__ import ...
    __all__ = ['a', 'b', 'c']  # and other "dunder" settings like __version__ and __author__
    imports
    code
-
`"""Doc strings in triple quotes."""` whether it's one line or many; whether `"""` or `'''`. The `"""` that ends a multiline docstring should be on a line by itself.
-
ππΎ Spacing like this (see the PEP8 doc for more info):
    # Put no spaces immediately within `()`, `[]`, or `{}`.
    spam(ham[1], {eggs: 2, salsa: 10})

    # Put a space between `,` `;` or `:` and any following item.
    demo = (0,) + (2, 3)
    if x == 4: print x, y; x, y = y, x

    # Put no space in a simple slice expression, but parentheses to clarify complicated slice precedence
    # or construct a slice object or put subexpressions into variables.
    ham[1:9], ham[1:9:3], ham[:9:3], ham[1::3], ham[1:9:]
    ham[(lower+offset):(upper+offset)]

    # Put no space in function application or object indexing.
    spam(1) < spam(2)
    dct['key'] += lst[index]

    # Don't line up the `=` on multiple lines of assignment statements.
    x = 1
    long_variable = (3, 10)

    # Spaces around keyword `=` are OK, unlike in PEP8, which recommends them only
    # when there's a Python 3 parameter annotation.
    c = magic(real=1.0, imag=10.5)
    c = magic(real = 1.0, imag = 10.5)
    def munge(input: AnyStr, sep: AnyStr = None, limit=1000): ...

    # Use spaces or parentheses to help convey precedence.
    # Put zero or one space on both sides of a binary operator (except indentation).
    hypot2 = x*x + y*y
Avoid trailing whitespace. A backslash followed by a space and a newline does not count as a line continuation marker.
β (John) - I prefer more spaces to less. As with line-breaks I tend to use white space to group ideas together, so smaller ideas in a larger expression will sometimes get no spaces. I don't think I will ever become accustomed to having no spaces around keyword arguments.
-
Avoid compound statements on one line, such as:
    if foo == 'blah': do_something()
-
Comments should be complete sentences. The first word should be capitalized unless it's an identifier that begins with a lower case letter.
-
Style names like this:
    ClassName
    ExceptionName  # usually ends with "Error"
    GLOBAL_CONSTANT_NAME
    function_name, method_name
    decorator_name
    local_var_name, global_var_name, instance_var_name, function_parameter_name
    camelCase  # OK to match the existing style
    __mangled_class_attribute_name
    _internal_name
    module_name
    package  # underscores are discouraged
- Public names (like a class used as a decorator) follow conventions for usage rather than implementation.
- Use a trailing "_" to avoid conflicting with a Python keyword or built-in name, like `yield_`, `complex_`, and `max_`.
- Don't invent `__double_leading_and_trailing_underscore__` special names.
- Always use `self` for the first argument of an instance method and `cls` for the first argument of a class method.
- Don't use `l`, `O`, or `I` for single character variable names.
- Don't make exceptions for scientific conventions like `Kcat` and math conventions like matrix `M`, and any name is better than a single letter.
- Avoid using properties for expensive operations. The attribute notation suggests it's cheap.
- Use the verb to distinguish methods like `get_value()` from `compute_value()`.
-
Documented interfaces are considered public, unless the documentation says they're provisional or internal. Undocumented interfaces are assumed to be internal.
β (John) - I'm a bit fuzzy on what is considered an 'interface' in Python.
-
The `__all__` attribute is useful for introspection.
Programming tips:
-
`if x is not None` is more readable than `if not x is None`.
-
When implementing ordering operations with rich comparisons, it's best to implement all six operations or use the `functools.total_ordering()` decorator to fill them out.
-
Use `def f(x): return 2*x` instead of `f = lambda x: 2*x` for more helpful stack traces.
-
Derive exceptions from Exception rather than BaseException unless catching it is almost always the wrong thing to do.
-
When designing and raising exceptions aim to answer the question "What went wrong?" rather than only indicating "A problem occurred."
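As an illustration of an exception that answers "What went wrong?", here is a minimal sketch. The names `UnknownMoleculeError` and `lookup_mass` are hypothetical and not taken from the wcEcoli code base.

```python
class UnknownMoleculeError(Exception):
	"""Raised when a molecule ID is missing from a lookup table."""

	def __init__(self, molecule_id, known_ids):
		self.molecule_id = molecule_id
		# Say which ID failed and what would have been accepted.
		message = 'Unknown molecule ID {!r}; expected one of {}'.format(
			molecule_id, sorted(known_ids))
		super(UnknownMoleculeError, self).__init__(message)


def lookup_mass(molecule_id, masses):
	"""Return the mass for molecule_id from the dict `masses`."""
	# Keep the try clause narrow: only the dictionary lookup can raise KeyError.
	try:
		return masses[molecule_id]
	except KeyError:
		raise UnknownMoleculeError(molecule_id, masses)
```

A caller that catches `UnknownMoleculeError` gets both a readable message and the offending ID in `exc.molecule_id`.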
-
In Python 2, use `raise ValueError('message')` instead of `raise ValueError, 'message'` (which is not legal in Python 3).
-
Use the bare `except:` clause only when printing/logging the traceback.
-
Use the form `except Exception as exc:` to bind the exception name.
-
Limit a `try` clause to a narrow range of code so it doesn't bury totally unexpected exceptions.
-
Use a `with` statement or try/finally to ensure cleanup gets done. For a file-like object that doesn't support the `with` statement, use `with contextlib.closing(urllib.urlopen("https://www.python.org/")):`.
-
In a function, make either all or none of the `return` statements return an explicit value.
- ππΎ Furthermore, have a consistent return type. Make a class instance, tuple, `namedtuple`, or dictionary to handle a union of different cases.
- ππΎ Any sort of failure should raise an explicit exception.
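One way to keep a consistent return type and signal failure explicitly is sketched below. The names `LineFit` and `line_through` are hypothetical, chosen only for illustration.

```python
from collections import namedtuple

# Every successful call returns the same type; failures raise an exception.
LineFit = namedtuple('LineFit', ['slope', 'intercept'])


def line_through(p, q):
	"""Return the LineFit for the line through the (x, y) points p and q."""
	(x1, y1) = p
	(x2, y2) = q
	if x1 == x2:
		# Raise rather than returning None or a sentinel value.
		raise ValueError(
			'Vertical line through x = {}: slope is undefined'.format(x1))
	slope = float(y2 - y1) / float(x2 - x1)
	return LineFit(slope=slope, intercept=y1 - slope * x1)


fit = line_through((0.0, 1.0), (2.0, 5.0))
```

Callers can then use `fit.slope` and `fit.intercept` without first checking what kind of value came back.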
-
Use string methods instead of the string module. They're faster and have the same API as Unicode strings.
-
String `.startswith()` and `.endswith()` are less error prone than string slicing.
-
Use e.g. `isinstance(obj, int)` instead of `type(obj) is type(1)` to check an object's type. Use `isinstance(obj, basestring)` to accept both str and unicode.
- ππΎ Better yet, avoid checking types except to catch common errors. It's cleaner to call different functions for distinct input patterns or use O-O dispatch.
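A small sketch of why `isinstance()` is the safer check: an exact `type()` comparison rejects subclasses. The `Count` class is a hypothetical example, not a wcEcoli type.

```python
class Count(int):
	"""A hypothetical int subclass, e.g. a molecule count."""


n = Count(3)

# The exact type comparison rejects the subclass instance.
exact_match = type(n) is type(1)

# isinstance() accepts instances of int and of any int subclass.
instance_match = isinstance(n, int)
```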
-
Use `a_string.join()` rather than looping over `a_string += stuff` to combine strings. It takes linear rather than n^2 time.
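For example, with a hypothetical list of names, both forms below build the same string, but the `join()` form does it in one pass:

```python
names = ['ATP', 'GTP', 'CTP']

# Repeated += may copy the growing string on every iteration.
slow = ''
for name in names:
	slow += name

# join() computes the total size once and builds the result in one pass.
fast = ''.join(names)
```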
See https://google.github.io/styleguide/pyguide.html
-
Use pylint. [And/or PyCharm inspections.]
-
Import packages and modules only, not names from modules.
-
Use full pathnames to import a module; no relative imports to help prevent importing a package twice.
-
Avoid global variables.
-
Use the operator module e.g. `operator.mul` over `lambda x, y: x * y`. There's also `operator.itemgetter(*items)`, `operator.attrgetter(*attrs)`, and `operator.methodcaller(name[, args...])`.
-
Don't use mutable objects as default values in a function or method definition.
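The reason is that a default value is created once, when the function is defined, and then shared by every call. A minimal sketch with hypothetical function names:

```python
def append_shared(item, items=[]):
	"""Buggy: the default list is created once and shared across calls."""
	items.append(item)
	return items


def append_fresh(item, items=None):
	"""Use None as the default and create a new list on each call."""
	if items is None:
		items = []
	items.append(item)
	return items


first = append_shared(1)
second = append_shared(2)  # same list object as `first`, now holding both items
third = append_fresh(1)
fourth = append_fresh(2)   # an independent list
```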
-
Use Python falsy tests for empty sequences and 0, e.g. `if sequence:` rather than `if len(sequence):` or `if len(sequence) > 0:`, but not for testing if a value is (not) `None`.
- But don't write `if value:` to test for a non-empty string. That can be confusing.
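A short sketch of keeping the two tests separate, using a hypothetical `summarize` function: `is None` detects a missing argument, while the falsy test detects an empty sequence.

```python
def summarize(counts=None):
	"""Describe a sequence of counts, distinguishing 'missing' from 'empty'."""
	if counts is None:
		# No argument was supplied at all.
		return 'no data provided'
	if not counts:
		# A sequence was supplied but it has no items.
		return 'empty'
	return '{} items'.format(len(counts))
```

Writing `if not counts:` as the first test would treat `None` and `[]` the same way, which hides the difference between the two cases.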
-
Avoid features such as metaclasses, access to bytecode, on-the-fly compilation, dynamic inheritance, object reparenting, import hacks, reflection, modification of system internals, etc. [at least without compelling reasons].
-
Use extra parentheses instead of backslash line continuation.
-
Don't use parentheses in return statements or conditional statements except for implied line continuation or tuples.
- ππΎ Prefer explicit tuple parentheses, definitely for 1-element tuples. `(x,) = func()` not `x, = func()`. `for (i, x) in enumerate(...):`.
- [PyCharm recommends `return x, y` over `return (x, y)`.]
ππΎ (John) - I personally have started using the latter since the output of the function will be a tuple, and the parentheses make that explicit. I also always use parentheses when tuple unpacking.
-
Write a doc string as a summary sentence, a blank line, then the rest.
-
A function must have a docstring, unless it's: not externally visible, short, and obvious. Explain what you need to know to call it.
-
Classes should have doc strings.
-
Don't put code at the top level that you don't want executed when the file is imported, unit tested, or pydoc'd.
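The usual way to follow this is a `main()` function behind a `__name__` guard. A minimal sketch, where the body of `main()` is a placeholder:

```python
def main():
	"""Do the script's work; runs only when called explicitly."""
	return 'ran as a script'


# Executed when the file is run directly, but not when it is imported,
# unit tested, or loaded by pydoc.
if __name__ == '__main__':
	print(main())
```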
-
It's common to write floating point literals like `1.0` or `0.1` rather than `1.` or `.1` for clarity, but NumPy uses `1.` and `0.1` so I guess we should follow suit.
ππΎ (John) - I strongly prefer 1.0 and 0.1; dangling periods look strange.
-
ππΎ Adopt `from __future__ import division` at the top of every file regardless of whether or not it is doing any division. Unintended integer division is an extremely common (and typically silent) error to hit in Python 2. Use `//` for explicit integer division.
- ππΎ Adopt `from __future__ import absolute_import`.
- Sooner or later, adopt `from __future__ import print_function`.
- These will help facilitate an eventual Python 3 transition.
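To illustrate the division rule, here is a small sketch with hypothetical values. With the `__future__` import, `/` is true division in both Python 2 and Python 3, and `//` states the intent to round down.

```python
from __future__ import division

total_count = 5
cells = 2

# True division: without the __future__ import, Python 2 would silently
# truncate this int/int quotient.
mean_count = total_count / cells

# Explicit integer (floor) division.
whole_groups = total_count // cells
```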