A Simple CI/CD Setup for ML Projects | by Marcello Politi | Dec, 2023



Apply best practices and learn to use GitHub Actions to build robust code

Introduction

Dealing with integration, deployment, scalability, and all the other topics that turn a Machine Learning project into a real product is a job of its own. There is a reason why different job positions exist, ranging from data scientist to ML Engineer and MLOps engineer. Still, even if you don’t need to be an expert on these topics, it is good to have some standard, well-defined practices that can help you when you kick off a project. In this article, I outline the best practices I’ve developed: a balance between code quality and the time invested in implementing them. I run my code on Deepnote, a cloud-based notebook that’s great for collaborative data science projects.

Start Simple — Readme

This may seem trivial, but try to keep a Readme file more or less up to date. If it costs you little time and you enjoy it, also try to make the Readme look good: include images, headers, icons, or whatever you like. The file must be clear and understandable. Remember that in a real project you will be working not only with other developers but also with salespeople and project managers, and every now and then they might have to read the Readme to understand what you are working on.

You can find here a really nice readme template!

Use virtual environments, your laptop will be happy

You probably know this better than I do: to develop a cool project we need external libraries. Often a lot of them! These libraries may have their own dependencies, and the versions they require may conflict from one project to another. That is why it is a good idea to create virtual environments. A virtual environment keeps your projects isolated from each other, so each one can have a completely different development environment. Usually, to do this in Python you use venv together with pip, or conda.
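To make the idea concrete, here is a minimal sketch that creates a virtual environment using only Python's standard-library `venv` module. The helper names `create_env` and `env_python` are hypothetical and chosen just for this illustration; the article itself does not prescribe a particular tool, so treat this as one possible approach rather than the recommended one.

```python
import os
import venv
from pathlib import Path


def create_env(env_dir, with_pip=False):
    """Create an isolated virtual environment in env_dir and return its path.

    Hypothetical helper for illustration only.
    """
    # with_pip=False keeps creation fast; pass True to bootstrap pip
    # inside the new environment.
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(env_dir)
    return Path(env_dir)


def env_python(env_dir):
    """Return the expected path of the interpreter inside the environment."""
    # Windows places executables in Scripts/, other platforms in bin/.
    if os.name == "nt":
        return Path(env_dir) / "Scripts" / "python.exe"
    return Path(env_dir) / "bin" / "python"
```

Calling `create_env(".venv", with_pip=True)` should have roughly the same effect as running `python -m venv .venv` in a terminal. Packages installed with that environment's interpreter then stay inside the `.venv` folder and do not touch other projects.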


