Is Predictive Maintenance using Machine Learning possible for you?

Predictive maintenance can help but does it always make sense to use machine learning? (Photo by rawpixel on Unsplash)

Predictive maintenance has diffused through the cycle of innovation to the extent that it is highly probable you have heard the term in one way or another. While the jury is still out on the best practices for implementing predictive maintenance, there is no doubt about the business value that drives much of the fanfare around the buzzword.

This, coupled with the increasing accessibility of the machine learning toolbox for handling large-scale sensor data, has led to many attempts at building ML pipelines for predicting failures and optimizing maintenance schedules. In this article, I aim to break down the baseline prerequisites that need to be present in order to leverage machine learning for predictive maintenance.

Full disclosure: I am one of the founders and AI lead at maiot, where we utilize machine learning for predictive maintenance based use-cases in the mobility sector.

Broadly speaking, machine learning folks deal with two different types of data. The first is what might come immediately to mind when thinking about maintenance in working assets: actual time-stream sensor data. However, this category of data is far more useful when it is combined with the second type, namely failure label data. The labels indicate what exactly went wrong with the asset, and they are usually the target to predict ahead of time.
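To make the two data types concrete, here is a minimal sketch of what they might look like side by side. The field names (`asset_id`, `oil_temp_c`, `root_cause`, etc.) are purely hypothetical, chosen for illustration:

```python
from datetime import datetime

# Type 1: time-stream sensor data — time-stamped readings per asset.
sensor_data = [
    {"timestamp": datetime(2023, 1, 1, 0, 0), "asset_id": "A1",
     "oil_temp_c": 82.0, "oil_pressure_bar": 4.1},
    {"timestamp": datetime(2023, 1, 1, 0, 1), "asset_id": "A1",
     "oil_temp_c": 83.5, "oil_pressure_bar": 4.0},
]

# Type 2: failure label data — what went wrong, on which asset, and when.
# These labels become the prediction target.
failure_labels = [
    {"timestamp": datetime(2023, 1, 1, 6, 30), "asset_id": "A1",
     "component": "engine", "root_cause": "oil seepage"},
]
```

The join key between the two is the asset identifier plus time: a model learns which sensor patterns on a given asset precede which labeled failures.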

Given the importance of labels, it makes sense to work ‘failure-backwards’. This means that the first step in any predictive maintenance project should be to look at exactly how much information you have regarding how the assets failed.

It is important to know not only WHEN a failure happened, but also HOW and WHAT happened. Broadly speaking, the more failures you have recorded with the same (or a similar) root cause, the more likely you are to capture a statistically significant failure pattern that can be used to predict future problems.

Simply put, the more information you have about your failures, the more precise your future predictions can be.

The sensor data itself is, of course, the basis for the failure predictions. Usually, there is a heavy reliance on the fact that the combination of sensor readings over time will somehow indicate degradation or failure patterns before something major happens. This makes it especially important that we use the sensors that are most relevant to the types of failures we are trying to predict. For example, if we are trying to predict oil seepage problems, then it would be highly likely that observing trends of oil temperature or pressure would be most indicative of failure.

In the context of modeling time series sensor data in machine learning, it is most beneficial if the data is recorded continuously, i.e., there are no gaps in time between recorded values. It is also easier if the recordings are regular, i.e., the sensors are all recording at roughly the same frequency. These two properties save a lot of hassle, and a lot of assumptions, when it comes to actually building the machine learning models.
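Checking both properties is straightforward. A minimal sketch, assuming you have a list of timestamps and know the intended sampling interval (the function name and tolerance are my own choices):

```python
from datetime import datetime, timedelta

def check_regularity(timestamps, expected_interval, tolerance=0.1):
    """Return the gaps where consecutive readings deviate from the
    expected sampling interval by more than `tolerance` (a fraction)."""
    gaps = []
    expected = expected_interval.total_seconds()
    for prev, curr in zip(timestamps, timestamps[1:]):
        delta = (curr - prev).total_seconds()
        if abs(delta - expected) > tolerance * expected:
            gaps.append((prev, curr, delta))
    return gaps

# Readings every 60 s, with one 5-minute hole in the middle.
ts = [datetime(2023, 1, 1, 0, m) for m in (0, 1, 2, 7, 8)]
gaps = check_regularity(ts, timedelta(seconds=60))
# One gap detected: 300 s between minute 2 and minute 7.
```

Running this kind of audit per asset and per sensor before any modeling tells you early whether you need to resample, interpolate, or discard stretches of data.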

Ask yourself this: If you were to try to predict when a certain model of car, let's say a Toyota Corolla, will have an engine failure, would it be more useful to look at 100 Toyota Corollas, or instead at ten each of 10 different Toyota models? Intuitively, it helps to model many instances of the same type of asset, rather than trying to define a model that accounts for all sorts of failures in all sorts of assets.

As for humans, so it is for machine learning: it is better if data is present for many different assets of the same (or a close) type. This extends not only to the make of the asset, but ideally also to similar operating conditions, geographical location, working mode, etc. The more data from similar machines, the better. The machine learning models can then look across many different assets and detect patterns of behavior that lead to similar failures in these assets. That gives us lots of data and, ideally, lots of failure labels for our predictive maintenance model.

Using the same logic as above, it is also extremely beneficial to have many recordings of very similar types of failures. For instance, it is better to have 100 Toyota Corollas with engine-related failures recorded than the same number of Corollas with a mix of engine, brake, gearbox, and other miscellaneous errors. Of course, this means you may have to expand your data collection to capture similar failures, but the more homogeneous the labels you can capture, the better.
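A quick way to gauge how homogeneous your labels are is simply to count failures per category. A sketch on a hypothetical failure log (asset IDs and component names invented for illustration):

```python
from collections import Counter

# Hypothetical failure log: one entry per recorded breakdown.
failures = [
    {"asset_id": "C01", "component": "engine"},
    {"asset_id": "C02", "component": "engine"},
    {"asset_id": "C03", "component": "brakes"},
    {"asset_id": "C04", "component": "engine"},
    {"asset_id": "C05", "component": "gearbox"},
]

counts = Counter(f["component"] for f in failures)
# Engine failures dominate, so an engine-failure model has the most
# labels to learn from; brakes and gearbox are too sparse on their own.
```

If one category clearly dominates, that is usually the most promising first target for a model; the long tail of rare failure types is better deferred until more labels accumulate.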

Alas, one must face the inevitability that it is impossible to predict every sort of failure that can possibly happen. For example, breakdowns that happen due to human error are sometimes just too stochastic to predict (at least for now!). Similarly, failures that happen suddenly, without any recognizable changes in any sensor values (even at high data-collection frequencies), are also highly unlikely to be predictable.

However, in my experience, the majority of failures tend to show signs early on before they happen (which is great news). A good way to see whether your failures show patterns is actually quite straightforward: look at some relevant sensor values before a certain failure happens. Are there any humanly obvious fluctuations that look suspicious? Or is everything behaving as it did before?
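This eyeball test can also be done numerically, e.g. by comparing the variability of a sensor in the hour before a failure against an earlier baseline window. A minimal sketch with synthetic oil-temperature data (the 3-sigma-style threshold and all names are illustrative assumptions, not a validated rule):

```python
import statistics
from datetime import datetime, timedelta

def window_std(readings, start, end):
    """Standard deviation of sensor values with timestamps in [start, end)."""
    values = [v for t, v in readings if start <= t < end]
    return statistics.pstdev(values) if len(values) > 1 else 0.0

failure_time = datetime(2023, 1, 2, 12, 0)

# Synthetic oil-temperature series: stable (+/- 0.2 deg) for two hours,
# then strongly fluctuating (+/- 5.0 deg) in the hour before the failure.
readings = [
    (failure_time - timedelta(minutes=m),
     82.0 + (0.2 if m > 60 else 5.0) * ((-1) ** m))
    for m in range(180, 0, -1)
]

baseline = window_std(readings, failure_time - timedelta(hours=3),
                      failure_time - timedelta(hours=1))
pre_failure = window_std(readings, failure_time - timedelta(hours=1),
                         failure_time)

# Crude rule of thumb: flag the failure as "shows early signs" if the
# pre-failure variability is several times the baseline.
suspicious = pre_failure > 3 * baseline
```

If a handful of your recorded failures pass a check like this, that is encouraging evidence that a model has something to learn from; if none do, the failures may be of the sudden, unpredictable kind discussed above.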

This criterion is perhaps the hardest to check off the list. Even if you cannot see any obvious signs of degradation, they might simply be too hard to detect without advanced data analytics. That’s why this one is good to take a look at, but it is certainly not a showstopper if you cannot spot anything!

One important decision when developing a predictive machine learning model is exactly how far ahead to make the predictions. Perhaps it is possible to get extremely high precision and recall (fancy metrics for evaluating how good your predictions are) while only predicting 1 minute in advance, which might not be the most useful in a real-world scenario. Similarly, you could predict weeks in advance by sacrificing model performance, when a few hours’ notice would be sufficient.

Hence, it is highly useful to have a clear picture in mind as to how many minutes or hours notice might be enough to prevent further damage. Generally, the farther ahead you try to predict in time, the less accurate you get (something backed up by intuition). One should be mindful of the real-world requirements when dealing with predictive maintenance.
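The prediction horizon typically enters the pipeline at the labeling step: each point in time is labeled positive if a failure occurs within the chosen horizon after it. A minimal sketch of this fixed-horizon framing (function and variable names are my own):

```python
from datetime import datetime, timedelta

def label_timestamps(timestamps, failure_time, horizon):
    """Label each timestamp 1 if the failure occurs within `horizon`
    after it, else 0 — the usual framing for a fixed-horizon
    failure-prediction classifier."""
    return [1 if t <= failure_time <= t + horizon else 0
            for t in timestamps]

failure = datetime(2023, 1, 1, 12, 0)
ts = [datetime(2023, 1, 1, h, 0) for h in range(6, 13)]  # hourly, 06:00-12:00

short_notice = label_timestamps(ts, failure, timedelta(hours=1))
long_notice = label_timestamps(ts, failure, timedelta(hours=6))
# A 1-hour horizon marks only the last two timestamps positive;
# a 6-hour horizon marks all seven positive.
```

Widening the horizon gives the model more positive examples and the operator more notice, but it also asks the model to spot fainter, earlier signals, which is exactly the accuracy trade-off described above.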

Luckily, coming up with answers for most of the above is not an overly complicated task, and would give a fair approximation of how well machine learning would work in your predictive maintenance use-case.

Shameless plug: If you think that your industry is in dire need of predictive maintenance, please don’t hesitate to reach out at hamza@maiot.io. At maiot, we offer a quick, hassle-free feasibility analysis to validate on the technological and business end whether predictive maintenance makes sense for you. Head over to our feasibility analysis page for more details.


Software Engineer turned ML Engineer. Interested in building tech products end-to-end. Co-creator of PicHance, you-tldr, and ZenML.
