Python Coding Practices

Published on: Jun 24, 2022

Charles Gaillard

Maxime Churin

In software development, developers usually work in a collaborative and fast-evolving environment. For this reason, it is highly recommended that they follow common guidelines and coding practices that ensure efficiency, clarity, easy bug-spotting, and smooth onboarding. After all, code that is properly formatted and documented is more likely to be shared and used by the entire developer community.

We propose here a set of Python developer tools that can help you achieve this goal, tested with Python 3.8:

  • Black
  • Flake8
  • Isort
  • Mypy
  • Pydocstyle
  • Darglint

Black

Black is a quick formatting tool that can be used very easily on the command line.

Reformatting your code in black-style allows you to produce clean, readable code, making it easy to spot errors.

In this sample:

class Car:
  

   def __init__(self,
       brand: str ,  color: str  ) -> None:
       self.brand=   brand
       self.color=   color
   def __str__(self) -> str:
       return (
           f"Car(brand: {self.brand}, color: {self.color})"
       )

Running:

black sample.py

Returns:

reformatted sample.py

All done! ✨ 🍰 ✨
1 file reformatted.

And it transforms sample.py:

class Car:
    def __init__(self, brand: str, color: str) -> None:
        self.brand = brand
        self.color = color

    def __str__(self) -> str:
        return f"Car(brand: {self.brand}, color: {self.color})"

Recommended configuration

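[tool.black]
line-length = 100
skip-magic-trailing-comma = true

Note: Black reads this configuration from the [tool.black] table of pyproject.toml. skip-magic-trailing-comma stops Black from treating an existing trailing comma as a reason to explode a collection into one item per line.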
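To make the effect of the trailing comma concrete, here is a small illustrative sketch. The variable names are invented for this example. Both assignments are valid Black output; which one you get depends on the setting and on whether the source already had a trailing comma.

```python
# With Black's default settings, the trailing comma after "blue" is "magic":
# Black keeps the collection exploded, one item per line, even though it
# would fit on a single line.
colors_exploded = [
    "red",
    "green",
    "blue",
]

# With skip-magic-trailing-comma = true, Black ignores that comma and
# collapses the collection whenever it fits within the line length.
colors_collapsed = ["red", "green", "blue"]
```

The two lists are identical at runtime; only the layout of the source changes.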

Flake8

Flake8 is a tool that wraps pycodestyle, pyflakes, and mccabe:

  • Pycodestyle: spots formatting errors related to PEP 8 compliance. Codes start with E for errors and W for warnings. Below is a non-exhaustive list:
      • E1: indentation errors
      • E2: whitespace errors
      • E7: statement errors
      • W6: deprecation warnings
  • Pyflakes: spots inconsistencies in the code. Codes start with F. Below is a non-exhaustive list:
      • F4: import errors
      • F5: format errors
      • F6: incompatible assignment/comparison errors
      • F7: syntax errors
      • F8: variable-related errors
  • Mccabe: spots complexity errors. Codes start with C; the only code it raises is C901, for a complexity violation.
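To give an intuition of what a complexity check measures, here is a minimal sketch that approximates cyclomatic complexity by counting branching constructs with the standard ast module. It is not mccabe's actual algorithm, and the names approximate_complexity and SAMPLE are invented for this illustration. In Flake8 itself, C901 is only reported once a threshold is set with the max-complexity option.

```python
import ast

# Node types that open an additional path through the code.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp, ast.comprehension)


def approximate_complexity(source: str) -> int:
    """Return 1 plus the number of branching constructs found in the source."""
    complexity = 1  # a piece of code with no branch has a single path
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra decision points
            complexity += len(node.values) - 1
    return complexity


# Hypothetical function used only to exercise the counter.
SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''
```

Calling approximate_complexity(SAMPLE) counts two if statements, one for loop, and one boolean operator on top of the base path.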

In this sample.py:

import numpy as np
a=math.cos(math.pi)

Running:

flake8 sample.py

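Gives you the following errors (the reported positions suggest that the original file had one more line above the import and a trailing space at the end of the last line, neither of which is visible in the sample above):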

sample.py:2:1: F401 'numpy as np' imported but unused
sample.py:3:2: E225 missing whitespace around operator
sample.py:3:3: F821 undefined name 'math'
sample.py:3:12: F821 undefined name 'math'
sample.py:3:20: W291 trailing whitespace
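For reference, one way to clear these reports is sketched below. It assumes the intent of the sample was simply to compute the cosine of pi: the unused numpy import is dropped, the missing math import is added, and the spacing is fixed.

```python
import math

# Spaces around "=" fix E225, the math import fixes both F821 reports,
# and the line no longer ends with trailing whitespace (W291).
a = math.cos(math.pi)
```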

Recommended configuration

[flake8]
max-line-length = 100
extend-ignore = E203, E501
ban-relative-imports = parents

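We ignore those two errors:

  • E203: whitespace before ':', which conflicts with the way Black formats slices
  • E501: line too long, since line length is already handled by Black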
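The reason E203 (whitespace before ':') is ignored can be seen on a slice with non-trivial bounds. The snippet below is an illustrative sketch with invented variable names: Black puts a space on both sides of the colon in such slices, and pycodestyle would report that space as E203.

```python
ham = list(range(10))
lower, upper, offset = 1, 4, 2

# This is the layout Black produces for a slice with complex bounds.
# pycodestyle would flag the space before ":" as E203 if it were not ignored.
selected = ham[lower + offset : upper + offset]
```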

We also have a set of flake8 plugins that you can install:

  • flake8-use-fstring: check for % or .format and suggest using f-strings
  • flake8-print: check for print statements
  • flake8-tidy-imports: write tidier imports (bans imports from parent modules and above, i.e. with more than one .)
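As an illustration of the style the first two plugins push towards, here is a small sketch. The describe function and the logger setup are assumptions made for this example, not code from a real project.

```python
import logging

logger = logging.getLogger(__name__)


def describe(brand: str, color: str) -> str:
    """Return a short description of a car."""
    # flake8-use-fstring would flag "%s %s" % (color, brand) or
    # "{} {}".format(color, brand); an f-string is the suggested form.
    message = f"{color} {brand}"
    # flake8-print would flag print(message); logging is a common replacement.
    logger.info("Built description: %s", message)
    return message
```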

Isort

Isort is a command-line utility that sorts your imports alphabetically and separates them into sections: standard library, third party, first party, and finally imports from the local folder.

In this sample:

import numpy as np
import cv2
from a import b
from c import f, e, d
from typing import List, Tuple, Dict, Any
import os
import json
from .z import x, y
from .w import u, v

Running:

isort sample.py

Transforms sample.py:

import json
import os
from typing import Any, Dict, List, Tuple

import cv2
import numpy as np
from a import b
from c import d, e, f

from .w import u, v
from .z import x, y

Recommended configuration

[tool.isort]
profile = "black"
line_length = 100

We need to set the profile to “black” to avoid bad interactions between the two tools.
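The sectioning idea can be sketched in a few lines. This is a deliberately simplified illustration, not isort's implementation: the STDLIB set is a hand-written subset, the first-party section is left out, and modules are sorted purely alphabetically, whereas isort also places plain imports before from-imports inside each section.

```python
from typing import Dict, List

# Hypothetical, hand-written subset of the standard library (assumption).
STDLIB = {"json", "os", "typing"}


def section_of(module: str) -> str:
    """Return the section a module name would fall into."""
    if module.startswith("."):
        return "local"
    root = module.split(".")[0]
    return "stdlib" if root in STDLIB else "third_party"


def group_imports(modules: List[str]) -> Dict[str, List[str]]:
    """Group module names by section and sort each section alphabetically."""
    groups: Dict[str, List[str]] = {"stdlib": [], "third_party": [], "local": []}
    for module in modules:
        groups[section_of(module)].append(module)
    return {name: sorted(mods) for name, mods in groups.items()}
```

Applied to the module names of the sample above, group_imports puts json, os, and typing in the first section, the third-party packages in the second, and the relative imports last.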

Mypy

Mypy is a static type checker. To use it, you need to annotate your functions and variables with types. You can configure its level of strictness if you still want to leave some parts of your code dynamically typed, but keep in mind that a statically typed project enhances productivity and clarity, and adds safety barriers everywhere so that you can spot errors even before executing your code.

In this sample:

from typing import List, Tuple

a: str = "abc"
b: int = 3

c = a + b

d: List[int] = []
d.append(a)
d.append((a, b))

e: Tuple[int, int]
e = (a, b)

def sample_function(x: int, y: int) -> Tuple[int, List[int], str]:
    pass

e = sample_function(a, b)

Running:

mypy sample.py

Returns:

sample.py:6: error: Unsupported operand types for + ("str" and "int")
sample.py:9: error: Argument 1 to "append" of "list" has incompatible type "str"; expected "int"
sample.py:10: error: Argument 1 to "append" of "list" has incompatible type "Tuple[str, int]"; expected "int"
sample.py:13: error: Incompatible types in assignment (expression has type "Tuple[str, int]", variable has type "Tuple[int, int]")
sample.py:18: error: Incompatible types in assignment (expression has type "Tuple[int, List[int], str]", variable has type "Tuple[int, int]")
sample.py:18: error: Argument 1 to "sample_function" has incompatible type "str"; expected "int"
Found 6 errors in 1 file (checked 1 source file)
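For comparison, here is one possible version of the sample that type-checks. The fixes are assumptions about the intent (converting b to a string, storing integers in the list, and giving the result its own variable); other corrections would be equally valid.

```python
from typing import List, Tuple

a: str = "abc"
b: int = 3

# Convert explicitly instead of adding a str and an int.
c: str = a + str(b)

# Only integers go into a List[int].
d: List[int] = []
d.append(b)
d.append(len(a))

# The annotation now matches the actual content of the tuple.
e: Tuple[str, int] = (a, b)


def sample_function(x: int, y: int) -> Tuple[int, List[int], str]:
    """Return a tuple whose shape matches the declared return type."""
    return x + y, [x, y], f"{x}-{y}"


# A separate variable with the matching type holds the function result.
f: Tuple[int, List[int], str] = sample_function(b, len(a))
```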

Recommended configuration

[tool.mypy]
python_version = "3.8"
exclude = ["tests"]
# --strict
disallow_any_generics = true
disallow_untyped_defs = true
no_implicit_optional = true
warn_redundant_casts = true
warn_unused_ignores = true
warn_return_any = true
implicit_reexport = false
strict_equality = true
# --strict end

We choose to only require typing for production code and not for the tests.

For a new codebase, you should add the strict options but for legacy code, you should add mypy options iteratively.
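To show what two of these options ask for in practice, here is a minimal sketch with an invented greet function. With disallow_untyped_defs, every parameter and the return value must be annotated; with no_implicit_optional, a parameter that defaults to None must be declared Optional explicitly.

```python
from typing import Optional


# Writing "name: str = None" would be rejected under no_implicit_optional;
# the Optional[...] annotation states the intent explicitly.
def greet(name: Optional[str] = None) -> str:
    """Return a greeting, falling back to a generic one when no name is given."""
    if name is None:
        return "Hello, stranger"
    return f"Hello, {name}"
```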

Pydocstyle

Pydocstyle is a tool that checks your docstrings for compliance with most of PEP 257. Docstrings are an essential tool to document your code. We choose to adopt the Google style convention.

It implements different groups of errors:

  • D1: for Missing Docstrings
  • D2: for Whitespace Issues
  • D3: for Quote Issues
  • D4: for Docstring Content Issues

Recommended configuration

[tool.pydocstyle]
convention = "google"

In the previous undocumented sample:

class Car:
    def __init__(self, brand: str, color: str) -> None:
        self.brand = brand
        self.color = color

    def __str__(self) -> str:
        return f"Car(brand: {self.brand}, color: {self.color})"

Running:

pydocstyle sample.py

Returns:

sample.py:1 at module level:
        D100: Missing docstring in public module
sample.py:1 in public class `Car`:
        D101: Missing docstring in public class
sample.py:2 in public method `__init__`:
        D107: Missing docstring in __init__
sample.py:6 in public method `__str__`:
        D102: Missing docstring in public method

We should have:

"""A one line summary of the module or program, terminated by a period."""


class Car:
    """Summary of class here.

    Longer class information...
    """

    def __init__(self, brand: str, color: str) -> None:
        """Inits Car with blah.

        Args:
            brand: A string with the car brand.
            color: A string with the car color.
        """
        self.brand = brand
        self.color = color

    def __str__(self) -> str:
        """Performs operation blah."""
        return f"Car(brand: {self.brand}, color: {self.color})"
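The idea behind the D1 group (missing docstrings) can be approximated with the standard ast module. The sketch below is only an illustration, not pydocstyle's implementation: the missing_docstrings helper is invented for this example and, unlike pydocstyle, it does not distinguish public from private names.

```python
import ast
from typing import List


def missing_docstrings(source: str) -> List[str]:
    """Return the names of the module, classes, and functions lacking a docstring."""
    tree = ast.parse(source)
    missing: List[str] = []
    if ast.get_docstring(tree) is None:
        missing.append("<module>")
    for node in ast.walk(tree):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing


# The undocumented Car class from the sample, as a string.
UNDOCUMENTED = '''
class Car:
    def __init__(self, brand: str, color: str) -> None:
        self.brand = brand
        self.color = color

    def __str__(self) -> str:
        return f"Car(brand: {self.brand}, color: {self.color})"
'''
```

Running missing_docstrings(UNDOCUMENTED) reports the module, the class, and both methods, which mirrors the four locations listed in the pydocstyle output.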

Darglint

Darglint is a tool that verifies that the docstring matches the function or method implementation throughout its whole lifecycle. It prevents the documentation from becoming outdated when the signature changes. It works best in combination with a docstring style checker like pydocstyle.

It implements different groups of errors:

  • DAR0: Syntax, formatting, and style
  • DAR1: Args section
  • DAR2: Returns section
  • DAR3: Yields section
  • DAR4: Raises section
  • DAR5: Variables section

In the previous documented sample:

"""A one line summary of the module or program, terminated by a period."""


class Car:
    """Summary of class here.

    Longer class information...
    """

    def __init__(self, brand: str, color: str) -> None:
        """Inits Car with blah.

        Args:
            brand: A string with the car brand.
            color: A string with the car color.
        """
        self.brand = brand
        self.color = color

    def __str__(self) -> str:
        """Performs operation blah."""
        return f"Car(brand: {self.brand}, color: {self.color})"

Running:

darglint sample.py

Returns:

sample.py:__str__:20: DAR201: - return

We should have:

"""A one line summary of the module or program, terminated by a period."""


class Car:
    """Summary of class here.

    Longer class information...
    """

    def __init__(self, brand: str, color: str) -> None:
        """Inits Car with blah.

        Args:
            brand: A string with the car brand.
            color: A string with the car color.
        """
        self.brand = brand
        self.color = color

    def __str__(self) -> str:
        """Performs operation blah.

        Returns:
            str: object representation
        """
        return f"Car(brand: {self.brand}, color: {self.color})"
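The Car class only exercises the Args and Returns sections. As a complementary sketch, the hypothetical repaint function below also raises an exception, so its Google-style docstring needs a Raises section, which is what the DAR4 group checks.

```python
def repaint(current_color: str, new_color: str) -> str:
    """Describe a repaint from one color to another.

    Args:
        current_color: The color the car has now.
        new_color: The color to apply.

    Returns:
        A short sentence describing the change.

    Raises:
        ValueError: If new_color is empty.
    """
    if not new_color:
        raise ValueError("new_color must not be empty")
    return f"Repainted from {current_color} to {new_color}"
```

If a parameter were later renamed or the exception removed without updating the docstring, the signature and the documentation would diverge, which is the kind of drift Darglint is meant to catch.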

Integration

We would like to gather all these tool configurations in as few files as possible. Flake8 does not read pyproject.toml, so our recommendation is to use .flake8 + pyproject.toml.

.flake8

[flake8]
max-line-length = 100
extend-ignore = E203, E501
ban-relative-imports = parents

pyproject.toml

[tool.black]
line-length = 100
skip-magic-trailing-comma = true

[tool.isort]
profile = "black"
line_length = 100

[tool.mypy]
python_version = "3.8"
exclude = ["tests"]
# --strict
disallow_any_generics = true
disallow_untyped_defs = true
no_implicit_optional = true
warn_redundant_casts = true
warn_unused_ignores = true
warn_return_any = true
implicit_reexport = false
strict_equality = true
# --strict end

[tool.pydocstyle]
convention = "google"

Then if we want to run all of them using only one command, we have different options:

  • Sh script or Makefile

We need to define a list of dev dependencies like in requirements-dev.txt.

isort~=5.10.1
black~=22.3.0
flake8~=4.0.1
flake8-print==4.0.0
flake8-use-fstring==1.3
flake8-tidy-imports==4.7.0
pydocstyle[toml]==6.1.1
mypy==0.960
darglint~=1.8.1

We create a lint.sh file (darglint is launched automatically by flake8 when it is installed in the same environment, so there is no need to call it explicitly):

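#!/bin/bash
# Stop at the first failing tool so that the script exits with a non-zero status.
set -eo pipefail

# Formatters run first so that the checkers see the final code.
isort .
black .
flake8
mypy .
pydocstyle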

And we can run it:

./lint.sh
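The same sequence can also be driven from Python, which makes it possible to run every tool and report all failures at once instead of stopping at the first one. This is a minimal sketch under stated assumptions: the run_linters helper and the LINT_COMMANDS list are invented for this example, and the tools are expected to be installed in the current environment.

```python
import subprocess
from typing import List, Sequence

# Hypothetical command list mirroring the shell script (assumption).
LINT_COMMANDS: List[List[str]] = [
    ["isort", "."],
    ["black", "."],
    ["flake8"],
    ["mypy", "."],
    ["pydocstyle"],
]


def run_linters(commands: Sequence[Sequence[str]]) -> List[str]:
    """Run every command and return the names of the programs that failed."""
    failed: List[str] = []
    for command in commands:
        try:
            returncode = subprocess.run(list(command), check=False).returncode
        except FileNotFoundError:
            # The tool is not installed in the current environment.
            returncode = 1
        if returncode != 0:
            failed.append(command[0])
    return failed
```

A script could then call run_linters(LINT_COMMANDS) and exit with a non-zero status when the returned list is not empty.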

If you want to use a Makefile:

# Recipe lines must start with a tab character, not spaces.
.PHONY: lint
lint:
	isort .
	black .
	flake8
	mypy .
	pydocstyle

Then run:

make lint

  • Pre-commit [Preferred]

Pre-commit is an alternative that puts all the tools together and runs them inside dedicated isolated environments, so you do not even need to install them in your dev environment.

pip install pre-commit
pre-commit run -a my-hook

Additionally, it can be installed as a git hook that runs on the modified files at commit, push, etc., before they are submitted to the remote repository, which avoids cluttering the CI with lint and style problems.

pre-commit install
# followed by git add/commit/push, which trigger pre-commit

Recommended configuration

.pre-commit-config.yaml

repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.2.0
    hooks:
      - id: check-ast
      - id: check-builtin-literals
      - id: check-docstring-first
        exclude: tests
      - id: check-merge-conflict
      - id: check-yaml
      - id: check-toml
      - id: debug-statements
      - id: end-of-file-fixer
      - id: trailing-whitespace
  - repo: https://github.com/pycqa/isort
    rev: 5.10.1
    hooks:
      - id: isort
  - repo: https://github.com/psf/black
    rev: 22.3.0
    hooks:
      - id: black
  - repo: https://github.com/pycqa/flake8
    rev: "4.0.1"
    hooks:
      - id: flake8
        additional_dependencies:
          - flake8-use-fstring
          - flake8-print
          - flake8-tidy-imports
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: "v0.960"
    hooks:
      - id: mypy
        exclude: tests
        args: []
  - repo: https://github.com/pycqa/pydocstyle
    rev: 6.1.1
    hooks:
      - id: pydocstyle
        exclude: tests
        additional_dependencies: [toml]
  - repo: https://github.com/terrencepreilly/darglint
    rev: v1.8.1
    hooks:
      - id: darglint

Conclusion

Once you have converged on a configuration and chosen an integration method, your workflow will be stable and productive. Moreover, the configuration will most likely not evolve much in the future, so maintaining a codebase with such tools is a no-brainer.

In addition, when used from the start of a project, the cost is close to zero, but when applied to an existing codebase, it can take quite some time to solve all errors. The earlier you start, the better!

Featured image by Kuznetcov_Konstantin
