1d.works

Applying AI in the real world | Part 2

Feb

Sarunas Simaitis

Introduction

As AI matures as a technology, with massive amounts of research to tap into and many frameworks that make it easy to experiment with and deploy, it is becoming a tool that every business manager wants, and even needs, to have in their tool stack.

In the previous part, we elaborated on the challenges AI brings to projects. Here we share how to structure an AI project in order to keep it on point, within budget and on time.

How to structure an AI project

Research and proof of concept stage

Every project starts with research and prototype development, even when it is not in the plan. If this stage is merged with the development of the production version, there is a risk of confusing solving the actual problem with solving the software problems of the solution. That can lead to scope explosion, losing sight of the big picture and focusing on details which are not relevant to the problem.

Dedicating time to explore how the problem can be approached, what tools can be used and what tradeoffs they bring not only improves the quality of the end solution but also makes development of the final version smoother. For example, you may run into a problem which cannot be solved by sticking to the plan. If you have already explored alternatives, you can quickly adjust the plan to get around it. Otherwise, you might be forced to pause and work out a solution in the middle of the main version development, which could hurt your timeline significantly and, worse, unpredictably.

In order to make the best use of this stage, consider the following tips:

Get real data for available inputs and desirable outputs. More often than not, even though you are sure you have the right kind of data, you either don’t or don’t have enough of it. There is a reason why the breakthrough in deep learning happened shortly after the release of the massive image dataset ImageNet. You might be required to do costly data preparation (e.g. manual image labelling), and having some estimate of how long that would take is very valuable;
Explore existing research, tools, approaches and models. Even though you might be committed to developing the solution from scratch on your own, researching the work of others will allow you to skip past already-solved issues and truly develop a bleeding-edge solution. Moreover, a solution quite frequently benefits from combining aspects of multiple approaches (e.g. doing feature extraction with a pre-trained neural network and mapping the features to the output with a more data-efficient algorithm);
Try training, but don’t focus on it. There are many blog posts showing how to train your network with a few lines of code. However, that simplicity is deceptive in many real-life situations. To train a production-ready model, one needs a large dataset, a tailored parameter configuration, significant computing power and the time to get all of those right. It might be worthwhile to explore cloud training systems (e.g. Google Tables) to get a feel for what training could achieve;
Define and calculate performance metrics. This needs to be done carefully to make sure that the solution solves the problem at hand. Since the system has randomness, it is important to shift the uncertainty from expensive to cheap spots (i.e. allow large errors in cases where they do not matter and only small errors in cases where they do). Consider whether a false-positive error has the same impact as a false-negative one. Once the metrics are defined, try to get actual numbers to use in further decisions. If these metrics are defined properly, they become a very good communication tool between the client and the product developer;
Consider how to ensure the system stays valid after deployment. Since the data requirements are drafted during this stage, the appropriate data collection will usually happen in parallel with the production version development. Therefore, the ML algorithms will likely have been trained on a data sample which is not yet representative. It is important to consider how new data will be collected and how models will be updated while the system is deployed, so as to avoid design choices which would restrict maintenance;
Plan the production version development and estimate costs. Knowing how the problem can be approached and what tools and work are needed lets you answer what needs to be done, what a reasonable timeline is and what the requirements are. Together, these give the cost estimate.
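
The tip on performance metrics can be made concrete with a small sketch. It assumes a binary classifier and a cost per error type agreed with the client; the names `ConfusionCounts`, `count_outcomes` and `expected_cost`, as well as the cost values, are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    true_positive: int
    false_positive: int
    true_negative: int
    false_negative: int


def count_outcomes(y_true, y_pred):
    """Count the four outcome types for binary labels (truthy = positive)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def expected_cost(counts, cost_false_positive, cost_false_negative):
    """Average cost per prediction when the two error types are priced differently."""
    total = (counts.true_positive + counts.false_positive
             + counts.true_negative + counts.false_negative)
    if total == 0:
        raise ValueError("no predictions to evaluate")
    weighted = (counts.false_positive * cost_false_positive
                + counts.false_negative * cost_false_negative)
    return weighted / total


# Hypothetical labels; here a missed positive is assumed to cost five times a false alarm.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = count_outcomes(y_true, y_pred)
cost = expected_cost(counts, cost_false_positive=1.0, cost_false_negative=5.0)
```

A single number like this average cost is easier to discuss with a client than raw accuracy, because it already encodes which mistakes are expensive.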

Once the feasibility of the approach has been explored and reference numbers have been collected, it is time to judge whether the expected result would solve the problem, add value to your business and justify the costs. This stage is usually fairly short, which makes it a good cost-control checkpoint. If the prototype is severely lacking in accuracy, expect to spend much more time and resources on getting more data, training bigger models and tweaking the system. Ideally, the prototype is reasonably or nearly reasonably accurate and fine-tuning should be sufficient. Expecting to get the required data promptly and to train the models quickly is a common mistake.

Production-grade solution

Having collected the data and made the choice to commit to development, the next step will be familiar to those in other IT sectors: plan the work, execute and deploy. In more detail:

Plan the software layout;
Design the infrastructure;
Set up the infrastructure;
Develop the solution;
Fine-tune the parameters;
Deploy it!

Although this step is very similar to regular IT projects, one detail is worth noting: limit the effort spent on fine-tuning. The goal of this stage is to have a fully working system deployed, and that should be kept separate from making the system work better. You can continue tweaking and improving the models while the system is in use.

Support and fine-tuning

Now that the system is up and running, there are two issues to address.

Ensure the longevity of the system by defining the maintenance process

Usually, the time between the start of the project and deployment is too short to get a representative sample. E.g. the car wash project took 3 months, so the data we had spanned only one weather season and the weather was pretty similar across the whole sample. The deployed system is therefore very likely to face situations which were not present when the ML algorithms were trained, invalidating its results.

Use this stage to implement a continuous data sampling pipeline and to plan periodic evaluation of the system with that data. It is also necessary to define the quality threshold below which the ML algorithms must be retrained, and how to approach that retraining.
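
As a rough sketch of such a periodic evaluation, assume that some sampled predictions are later checked against the true outcome. The `PerformanceMonitor` class, the window size and the threshold below are hypothetical choices for illustration, not part of the system described here.

```python
from collections import deque


class PerformanceMonitor:
    """Tracks accuracy over the most recent verified predictions."""

    def __init__(self, window_size, quality_threshold):
        # Only the latest `window_size` outcomes are kept; older ones drop out.
        self.outcomes = deque(maxlen=window_size)
        self.quality_threshold = quality_threshold

    def record(self, prediction, actual):
        """Store whether a sampled prediction matched the verified outcome."""
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        """Accuracy over the current window, or None if nothing was recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """Flag retraining only once the window is full and accuracy is below the threshold."""
        if len(self.outcomes) < self.outcomes.maxlen:
            return False
        return self.accuracy() < self.quality_threshold


# Hypothetical usage: a window of 4 samples and a 75% accuracy threshold.
monitor = PerformanceMonitor(window_size=4, quality_threshold=0.75)
for prediction, actual in [(1, 1), (0, 0), (1, 1), (1, 0), (0, 1)]:
    monitor.record(prediction, actual)
retrain = monitor.needs_retraining()
```

Waiting for a full window before raising the flag avoids reacting to the first few samples after deployment; in practice the window and threshold would come from the metrics agreed during the research stage.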

Sometimes it is possible to build the system with some capacity to retrain on incremental changes, but it is also likely that the data after deployment will bring a surprise that requires deeper changes.

Each of the first iterations typically moves the system ahead considerably, and the system tends to converge after a few updates. After that, periodic evaluation is needed only to track fundamental changes in the data.

Estimate the benefits of improved system performance and decide how to approach the improvements

Now that the system is running and its actual benefits are becoming known, it is simpler to relate added value to ML performance. Unless you have spent the time and money to perfect the ML algorithms before deployment, there are improvements to be made.

Some of the improvements are clear, having been suggested by either the research stage or the production version development. For these, you weigh the potential value against the expected cost and make the choice.

Other improvements require data analysis and deeper insight. This stage requires careful evaluation, as the costs and value are rarely known in advance. In line with the Pareto principle, the last 20% of improvement is going to cost 80% of the effort, and this is that stage, so one has to be sure that the last 20% of performance is worth the cost.

This completes the structure which we’ve been using when approaching AI/ML-based software. Just as we’re not done figuring out AI, this structure is not final and is subject to change as new tools and new issues arise. Before jumping to conclusions, there is one more thing we would like to add.

How to increase the AI system’s transparency

As our previous post showed, data-based algorithms tend to be black boxes, which makes development, runtime and maintenance harder than with rule-based systems. For personnel to use the software to its full potential, an understanding of the system is very important. That understanding also helps to develop trust in the system, which is necessary for its use.

It is very easy to make an AI/ML algorithm totally opaque: just expose an API backed by some model trained on some data. Then it is enough for users to find a few cases where the algorithm gives a weird or wrong result, and they will deem the system bad and never use it again.

To address this and ensure the long-term success of AI/ML algorithms, we’ve relied on user interfaces to help people understand and work better with the system. A well-made UI has to help users understand the problem, the process and the solution. A few tips that we have found help to improve a UI:

Showcase the inputs to the system;
Showcase the intermediary result of the AI system whenever possible;
Showcase the final result of the system;
Allow users to input what their desired output would be and contrast it with the system’s response.
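
One way to back such a UI is to store every prediction together with everything the list above asks to show. The sketch below is only an assumption about how that record could look; `PredictionTrace`, its fields and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class PredictionTrace:
    """Everything the UI needs to display for a single prediction."""
    inputs: Any                                        # what the system received
    output: Any                                        # the final result
    intermediate: dict = field(default_factory=dict)   # intermediate results, if available
    expected_by_user: Optional[Any] = None             # what the user says it should have been

    def disagrees_with_user(self) -> bool:
        """True only when the user gave an expectation that differs from the output."""
        return (self.expected_by_user is not None
                and self.expected_by_user != self.output)


def disputed(traces):
    """Select the traces worth reviewing: those where user and system disagree."""
    return [trace for trace in traces if trace.disagrees_with_user()]


# Hypothetical traces from a queue-counting system.
traces = [
    PredictionTrace(inputs="frame_001.jpg", output=2,
                    intermediate={"detections": 2}, expected_by_user=2),
    PredictionTrace(inputs="frame_002.jpg", output=0,
                    intermediate={"detections": 0}, expected_by_user=3),
    PredictionTrace(inputs="frame_003.jpg", output=1),
]
to_review = disputed(traces)
```

Keeping the input next to the disputed output is what lets a reviewer tell whether the model or the input itself is at fault.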

Such UIs helped not only to debug the algorithms but also to find cases where the algorithm was not at fault. A funny example: when we were developing the queuing system for the car wash, we noticed that some cameras were not properly protected from rain, which made the images incomprehensible even for humans. Since those cameras had been installed before our project, the issue had gone unnoticed for quite a while, and our system helped identify it.

Conclusions

In the last two blog posts, we’ve described two computer vision products we made at two very different times. The contrast allowed us to identify what has changed in the last decade. We then shared the structure which we’re currently using when working on AI/ML products. The main takeaway is that, as project managers and business owners, we have to grow to handle the uncertainty of both the project and the solution and judge it against the value it would add. At the time of writing that is not an easy task, but it is a worthwhile one to learn, and we hope you’ll find our tips helpful in that endeavour.
