Success in ML Projects through Technical Drawings | by Benjamin Thürer | May, 2023

Improve workflows and expectation management in ML through technical drawings

Machine Learning (ML) projects are becoming increasingly popular in business as organizations strive to gain a competitive advantage or increase their market value through AI. However, unlike traditional software development or analytics projects, ML projects are harder to schedule: success is rarely known before the project is completed, and the work itself is less structured. One could also say:

“You don’t know if the ML project succeeds until the model is developed and deployed into production”

In other words, you want to set your team up for success with a structured workflow and reasonable expectations before starting a large ML project (if you want to know more about ML in general, the best tutorials you will find are from Cassie Kozyrkov). A key factor for success is effective communication that enables proper workflows and project management.

Communication in words is hard and can lead to misunderstandings, especially when people speak slightly different languages (the business side on one hand, the technology side on the other). In addition, describing complex relationships in words requires a lot of text or long meetings. Drawings, however, are easy to understand and can be very intuitive (provided they are done well). Sometimes, a single drawing (or picture) can replace a thousand words.

Together with my team, I started with a first technical drawing, mostly to help us keep an overview of our ML project and to ensure a standardized workflow. Over time, I realized that such drawings can also help in future projects and can inform others in the company, including leadership, about what an ML project entails. The feedback so far has been so positive that I would like to show here what we do and how we use these drawings for communication, deadlines, and setting expectations.

Complex technical drawings can be overwhelming, which is why it is better to start simple and add more layers later. This is a potential first drawing of an ML project:

Simplified drawing of the “inner loop” that uses features and a target to iteratively train a model.

As can be seen, this technical drawing is very high level and shows how incoming data creates the two ingredients needed to train a model: features and a target. Highlighted in red is the so-called “inner loop” of the ML cycle, describing in simplified form the iterative process of using data to train and improve a model.

Before we add more details to the drawing, let’s finish the process first and add what happens after a model has been trained:

Simplified drawing of a part of the “outer loop” that brings the trained model into production and produces output.

This drawing now adds the production pipeline and part of the so-called “outer loop”, where features are fed into the model on a scheduled basis and final predictions are written to a dataset or delivered directly to the user.
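As a rough sketch of this scheduled step, a batch prediction job could look like the following. Note that `DummyModel`, the column names, and the `dropna` cleaning step are hypothetical stand-ins for illustration, not the actual pipeline:

```python
import pandas as pd

class DummyModel:
    """Stand-in for a trained model (assumption for this sketch)."""
    def predict(self, X: pd.DataFrame):
        return X.sum(axis=1)

def run_batch_prediction(model, feature_store: pd.DataFrame) -> pd.DataFrame:
    """Pull the latest features, score them, and return a predictions dataset."""
    features = feature_store.dropna()  # naive cleaning, just for the sketch
    out = pd.DataFrame(index=features.index)
    out["prediction"] = model.predict(features)
    return out

# Scheduled run: score the current contents of the feature store.
preds = run_batch_prediction(
    DummyModel(),
    pd.DataFrame({"f1": [1.0, 2.0], "f2": [3.0, 4.0]}),
)
```

In a real setup, this function would be triggered by a scheduler and the resulting dataset written to storage or served to the user.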

At this point, we have a high-level but complete technical drawing of an ML project cycle. Now it is time to add some missing and very important details:

Monitoring is essential for every model that makes it into production and must be part of the drawing.

Monitoring is essential for every product, but especially for ML. Slight changes in features can have massive impacts on predictions, and vice versa. If you do not want your customers to do the monitoring for you (that is, waiting until they complain), it is highly recommended to monitor the data closely both before and after the model. If your model is re-trained and updated on the fly, additional monitoring would be needed (not shown in the drawing).
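One simple way to monitor the data before (and the predictions after) the model is a z-score check of the live values against a reference window. This is a minimal sketch, not our actual monitoring stack, and the threshold of 3 standard deviations is an arbitrary assumption:

```python
import statistics

def drift_alert(reference: list, live: list, threshold: float = 3.0) -> bool:
    """Flag drift when the live mean moves more than `threshold` reference
    standard deviations away from the reference mean (simple z-score check)."""
    mu = statistics.mean(reference)
    sigma = statistics.stdev(reference)
    z = abs(statistics.mean(live) - mu) / sigma
    return z > threshold
```

The same check can run on a feature distribution upstream of the model and on the prediction distribution downstream, so a shift on either side raises an alert before customers notice.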

The additional details now depend heavily on the audience of the drawing. In this example, the audience is an ML team with experienced data scientists on one side and a partly technical leadership team that needs to stay informed on the other. Given the mostly technical background, I would add more detail about the feature store first:

Good feature extraction is key to model performance and a scalable and maintainable feature extraction process will ensure faster turnaround.

Since we are working with tabular data (geospatial data over time), we do not “just” leave the data scientists and data engineers alone with the abstract term “feature store”. Together, we define in more detail up front what we believe adds the most value to the model and set up a data management system that can easily handle the required data types. For instance, we separate static context that has no timeline from historical dynamic context that has a timeline but is not regularly updated. In addition, we separate ongoing dynamic context that has a live feed. In the end, all of these datasets come together in one unified dataset, which we call the feature store, so that everyone has a single place to pull and test new features and to request that more features be added.
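A minimal sketch of how the three context types could merge into one unified feature store. The tables, column names, and join keys here are hypothetical illustrations, not the project's actual schema:

```python
import pandas as pd

# Static context: no timeline (e.g., properties of a location).
static_ctx = pd.DataFrame({
    "location_id": [1, 2],
    "area_km2": [4.2, 1.7],
})

# Historical dynamic context: has a timeline, not regularly updated.
historical_ctx = pd.DataFrame({
    "location_id": [1, 1, 2],
    "date": ["2023-01-01", "2023-01-02", "2023-01-01"],
    "visits": [120, 90, 40],
})

# Ongoing dynamic context: live feed, keyed by location and date.
live_ctx = pd.DataFrame({
    "location_id": [1, 2],
    "date": ["2023-01-02", "2023-01-01"],
    "live_events": [3, 0],
})

# Unified feature store: dynamic rows enriched with static and live context.
feature_store = (
    historical_ctx
    .merge(static_ctx, on="location_id", how="left")
    .merge(live_ctx, on=["location_id", "date"], how="left")
)
```

The left joins keep every historical row and attach whatever static and live context exists for it, which matches the idea of one place where everyone can pull features.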

For a data scientist, this is a great situation to be in. The features are already available and formatted in a standard way, so the fun (training a model) can immediately begin. But, before the data scientist can start, we need to adjust our inner loop in the drawing:

A detailed refinement of the inner loop adding preprocessing, the split into 4 datasets for modeling, and layers of evaluation.

This drawing shows a detailed refinement of the inner loop adding 3 important aspects to the training cycle:

  • Preprocessing: to keep the feature store scalable, any processing of an existing feature (like creating a lagged version for autoregression) happens on the fly during preprocessing, using views/pointer functions, rather than being added to the store as another redundant feature.
  • Split into 4 datasets: training a model is an iterative process. However, to improve across those iterations, we need insights into how to improve the model, and in gaining them we automatically introduce a bias that can lead to overfitting. Therefore, we create one dataset for training, one for validating the training, one for debugging and gaining insights for iterative improvements, and one reserved for the very end as the final test of whether the model meets the project's success criteria.
  • Layers of evaluation: with the 4 datasets in mind, we also introduce 3 different evaluations, based on the validation, debugging, and test sets. Before starting work on the inner loop, it is important to align on and write down what “success” actually means for the project in terms of these evaluations. Only when the test evaluation performs as planned is the model a success and ready for production.
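The first two aspects above can be sketched roughly as follows. The split fractions and the chronological ordering are assumptions for illustration, not the actual project setup:

```python
import pandas as pd

def add_lag_features(df: pd.DataFrame, column: str, lags: list) -> pd.DataFrame:
    """On-the-fly preprocessing: derive lagged versions of an existing feature
    instead of storing them redundantly in the feature store."""
    out = df.copy()
    for lag in lags:
        out[f"{column}_lag{lag}"] = out[column].shift(lag)
    return out

def four_way_split(df: pd.DataFrame, fracs=(0.6, 0.15, 0.15, 0.1)):
    """Chronological split into train / validation / debug / test sets.
    The last (test) slice is touched only once, at the very end of the project."""
    n = len(df)
    bounds = [int(n * sum(fracs[: i + 1])) for i in range(3)]
    return (
        df.iloc[: bounds[0]],
        df.iloc[bounds[0] : bounds[1]],
        df.iloc[bounds[1] : bounds[2]],
        df.iloc[bounds[2] :],
    )
```

Keeping the debug set separate from validation is the point: insights for iterative improvement come from the debug slice, so the validation and test evaluations stay as unbiased as possible.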

With this, we are almost done with a medium-detailed technical drawing of an ML project and only have to add a small thing:

A policy layer at the very end ensures that unexpected model behavior is filtered out or replaced.

One disadvantage of machine learning is that a model will always give you a result, no matter how little sense it makes. It does not matter how good your training data is: since you can never ensure that new incoming data follows the norms of your training data, you should expect some unpleasant model behavior at least occasionally. The idea of the policy layer is to filter or replace that behavior. For example, a model predicting how many people visit a location could easily predict negative counts in certain situations. The policy layer detects these and replaces them with hardcoded numbers or with output from a different model layered on top.
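A minimal sketch of such a policy layer for the visitor-count example. The fallback value of 0 is a hypothetical hardcoded choice; it could just as well be the output of a simpler model sitting on top:

```python
def apply_policy(predictions: list, fallback: float = 0.0) -> list:
    """Policy layer: visitor counts can never be negative, so replace any
    negative model output with a fallback value before it reaches the user."""
    return [p if p >= 0 else fallback for p in predictions]
```

The layer sits at the very end of the production pipeline, so no raw model output reaches a user or downstream dataset without passing through it.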

We have now arrived at a technical drawing covering our entire ML project. However, even though we went in small steps and built up the drawing layer by layer, looking at it can still be overwhelming. Imagine you needed even more detail: the added complexity would make the drawing less and less intuitive and, thus, useless.

This is where intuitive color coding or shapes can come in. For instance, we use color coding in the drawing to keep track of our progress. As an example, when a new feature has been added to the feature store and the code runs in production, the feature store task is marked green. That informs the entire team (and company) about the current progress of the project.

Green shapes highlight tasks that are “done”. Orange marks tasks that are “in progress”. All other tasks are in a “coming up” state.

In addition, discussing the drawing with the team makes setting deadlines much easier. The drawing can be split into sub-parts for which it is fairly predictable how long the work will take. For instance, setting up a feature store or bringing an existing model to production are predictable tasks. The inner loop is harder to predict, but estimates are possible here as well, for instance by restricting the time spent in the inner loop to X weeks. Once that time is up, the test evaluation criteria must be met or, otherwise, the project will be down-prioritized (but might be picked up again later).

The process from a product idea utilizing machine learning to a production-ready state can face unique challenges. Due to their complexity, ML projects are difficult to estimate, and with weak project management only a limited number of projects reach production. Technical drawings can overcome these limitations by providing visual clarity and aiding effective communication. Making the overall complexity of the project visible in an intuitive way helps the team and the entire organization understand what the project entails and set reasonable estimates. In this way, technical drawings contribute to more effective communication and workflows throughout the organization.

All images, unless otherwise noted, are by the author.