A Data Scientist’s Guide To Improving Python Code Quality | by Egor Howell | Aug, 2023

Tools and packages to write production-worthy Python code

Nowadays, data scientists are increasingly involved in the production side of deploying machine learning models. This means we need to write production-standard Python code, just like our fellow software engineers. In this article, I want to go over some of the key tools and packages that can help you create production-worthy code for your next model.


A linter is a tool that catches small bugs, formatting errors, and odd design patterns that can lead to runtime problems and unexpected outputs.

In Python, we have PEP8, which fortunately gives us a global style guide for how our code should look. Numerous Python linters check code against PEP8; my preference is flake8.


Flake8 is actually a combination of the Pyflakes, pycodestyle and McCabe linting packages. It checks for errors and code smells, and enforces PEP8 standards.

To install flake8, run pip install flake8. You can then use it by running flake8 <file_name.py>. It really is that simple!
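If you would rather trigger the check from a Python script, for example as a step before deployment, a minimal sketch could look like the one below. It assumes flake8 is installed for the interpreter running the script. The helper name run_flake8 and the file name my_module.py are hypothetical and not part of flake8.

```python
import subprocess
import sys


def run_flake8(path):
    """Run flake8 on `path` and return (exit_code, report).

    `run_flake8` is a hypothetical helper, not part of flake8 itself.
    Assumes flake8 is installed for the interpreter running this script.
    """
    completed = subprocess.run(
        [sys.executable, "-m", "flake8", path],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit code is expected when issues are found
    )
    return completed.returncode, completed.stdout


# Example usage (my_module.py is a placeholder file name):
# exit_code, report = run_flake8("my_module.py")
# print(report)
```

A non-zero exit code means flake8 reported something (or could not run), so a script can use it to stop a pipeline early.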

For example, let’s say we have the function add_numbers in a file flake8_example.py:

def add_numbers(a,b):
    result = a+ b
    return result

print(add_numbers(5, 10))

To call flake8 on this file, we execute flake8 flake8_example.py. Flake8 picks up several styling errors that we should correct to be in line with PEP8.
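For reference, here is one way the function could look after tidying it up. This is a sketch that assumes the reported issues are the missing space after the comma, the uneven spacing around the + operator, and the single blank line after the function definition (pycodestyle reports these kinds of problems under codes such as E231, E225 and E305).

```python
def add_numbers(a, b):
    # Space after the comma and on both sides of the operator
    result = a + b
    return result


# Two blank lines separate the function definition from top-level code
print(add_numbers(5, 10))
```

The behaviour is unchanged; only the layout differs.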

See the flake8 documentation for more information about flake8 and how to customise it for your needs.


Linters often just tell you what’s wrong with your code but don’t actively fix it for you. Formatters do fix your code: they speed up your workflow, ensure your code adheres to style guides, and make it more readable for other people.
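To make the distinction concrete, here is a toy sketch of the two roles for a single rule, trailing whitespace. Both function names are made up for illustration, and real linters and formatters cover far more rules than this.

```python
def lint_trailing_whitespace(source):
    """Linter-style check: report problems, leave the source untouched."""
    messages = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        if line != line.rstrip():
            messages.append(f"line {line_number}: trailing whitespace")
    return messages


def format_trailing_whitespace(source):
    """Formatter-style fix: return a cleaned copy of the source."""
    return "\n".join(line.rstrip() for line in source.splitlines()) + "\n"


messy = "x = 1  \ny = 2\n"
print(lint_trailing_whitespace(messy))           # the linter only reports
clean = format_trailing_whitespace(messy)        # the formatter rewrites
print(lint_trailing_whitespace(clean))           # nothing left to report
```

The linter leaves the decision to you, while the formatter applies the fix directly, which is why the two are often used together.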

