The Analytics Compass Blog

Twice weekly articles to help SMB companies optimize business performance with data analytics and to improve their analytics expertise.


Predictive analytics for defect detection in manufacturing process


We have recently had the opportunity to offer predictive analytics solutions to the manufacturing industry in one of the most "traditional" of areas: defect detection.

Customer problem

The customer manufactures complex assembled components that go into automobiles. Despite a rigorous testing procedure and tight quality control, close to 10% of parts were still defective when they made their way into the final assembled automobiles. This resulted in a lot of warranty issues, and obviously the car company (the customer's customer) was far from happy. While the defect per se was not debilitating, i.e. it would not make the car non-functional, it caused unsightly cosmetic issues for the consumer. And which new car owner would like to see their new car look used or aged?

The problem could be addressed on several fronts:

  1. designing a more robust sub-system
  2. developing a more reliable testing protocol
  3. developing a more stringent or fool-proof inspection process

We chose to start by addressing #2 and #3 with the idea that the insights from predictive analytics would help us eventually tackle #1. In addressing #2, the first step was to identify root causes for the failure of current testing schemes in detecting potential problems. Our analyses aimed to achieve the following:

  • Identify if the measures collected during the testing adequately represent all the factors which potentially cause the problem.
  • Narrow down and rank a list of critical factors selected from the recorded measures which strongly influence the occurrence of the issue.

The value to customer? Clearly this would enable us to determine why a detection failure might occur and potentially IMPROVE upon the ABILITY/EFFICIENCY to detect failures in the future.

For identifying root causes, KeyConnect is a very handy tool. KeyConnect can identify the most important factors within the data and remove variables which do not contain much information or value. It quantifies the strength of the relationship between each pair of variables using mutual information, which is a better choice than many other techniques such as Principal Component Analysis (PCA) or Linear Discriminant Analysis, particularly when there are many nonlinear interactions between the variables. Additionally, interpreting the results of techniques such as PCA is time consuming and can be confusing to the end users of the analysis.

[Figure: key performance indicators ranked using KeyConnect]
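To make the mutual-information ranking described above a little more concrete, here is a minimal sketch in plain Python. It is not KeyConnect's implementation, which is not described in this article; the function names, the column names (`supplier`, `shift`, `defective`) and the four records are all invented for illustration. It also assumes every variable is categorical, so continuous test measurements would first need to be binned.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)             # marginal counts of x
    py = Counter(ys)             # marginal counts of y
    pxy = Counter(zip(xs, ys))   # joint counts of (x, y)
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        # p(x,y) / (p(x) * p(y)), written in terms of raw counts
        mi += p_joint * log2(p_joint * n * n / (px[x] * py[y]))
    return mi


def rank_factors(records, target):
    """Rank every non-target column by its mutual information with the target."""
    ys = [r[target] for r in records]
    factors = [k for k in records[0] if k != target]
    scores = {f: mutual_information([r[f] for r in records], ys) for f in factors}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical test records, made up purely for this example.
records = [
    {"supplier": "A", "shift": "day",   "defective": "no"},
    {"supplier": "A", "shift": "night", "defective": "no"},
    {"supplier": "B", "shift": "day",   "defective": "yes"},
    {"supplier": "B", "shift": "night", "defective": "yes"},
]

ranking = rank_factors(records, "defective")
for factor, score in ranking:
    print(f"{factor}: {score:.3f} bits")
```

In this toy data the supplier fully determines the outcome while the shift carries no information about it, so the supplier ranks first. Variables scoring near zero are the candidates for removal.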

As a next step, to address #3, we focused on creating a tool to act as an additional filter to identify potentially defective sub-systems. This is where we can apply advanced machine learning techniques to model the inspection process for fault or failure detection and deploy these models either in an assembly line setting or during the testing phase.

The value to customer? The machine learning algorithms can help to predict WHEN a detection failure is most likely to occur based on historical evidence, and give the manufacturer an opportunity to PROACTIVELY IMPROVE quality control to PREVENT the defective product from going into the field.

Data mining and predictive analytics have been employed in various accounting, controlling, and auditing related tasks by companies, financial institutions, insurance companies, and tax authorities. Accounting data is typically analyzed and leveraged to build models for detecting patterns and irregularities, improper accounting practices, dubious transactions, potential fraud, money laundering, and other undesired activities, as well as for transaction monitoring, credit scoring, credit default prediction, risk assessment and minimization, finding risk factors, and predicting expected future demand, prices, and sales. Our approach to solving #3 was to use RapidMiner’s capabilities to mirror some of the highly successful applications in more traditional predictive analytics areas such as credit default prediction and fraud detection.

The underlying idea behind applying machine learning to predict defect outcomes is as follows: build a model using historic data and train it on known cases so that it can classify outcomes (defective or not), then use this model to predict outcomes for unknown cases. The model would be validated using current data before being applied to actual inspection data, where the outcome is “Likelihood of failure”. Validation provides information about the efficacy of the algorithm by identifying true positives, false positives, true negatives and false negatives. This approach would provide an additional layer of filtering during the inspection process, allowing the manufacturer to improve confidence in identifying defective assemblies.

Data which could potentially be used for this type of model building would include, among others:

  • Plant environmental conditions
  • Operator/person
  • Geographic location of failures (consumers)
  • Month/date/season of failures
  • Month/date/season of manufacture of failed parts
  • Material suppliers
  • Lot numbers (raw materials or components)
  • Vehicle make and model
  • OEM or aftermarket
  • Manufacturing date and location
  • Installed by
  • Transportation, shipping, and packaging methods

[Figure: machine learning for defect detection process]
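The train, predict, and validate loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses a small categorical naive Bayes classifier as a stand-in, not the RapidMiner models built for the customer, and the feature names, labels, and records are all hypothetical.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Fit a categorical naive Bayes model on known (historic) cases."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature, label) -> counts per value
    values_seen = defaultdict(set)       # feature -> distinct values observed
    for row, label in zip(rows, labels):
        for feature, value in row.items():
            value_counts[(feature, label)][value] += 1
            values_seen[feature].add(value)
    return class_counts, value_counts, values_seen


def failure_likelihood(model, row, positive="defective"):
    """Estimated probability that `row` belongs to the positive class."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # class prior
        for feature, value in row.items():
            count = value_counts[(feature, label)][value]
            # add-one (Laplace) smoothing so unseen values do not zero the score
            score *= (count + 1) / (n_label + len(values_seen[feature]))
        scores[label] = score
    return scores[positive] / sum(scores.values())


def confusion_counts(model, rows, labels, threshold=0.5, positive="defective"):
    """Count true/false positives and negatives on held-out cases."""
    counts = Counter()
    for row, label in zip(rows, labels):
        predicted = failure_likelihood(model, row, positive) >= threshold
        actual = label == positive
        if predicted:
            counts["TP" if actual else "FP"] += 1
        else:
            counts["FN" if actual else "TN"] += 1
    return counts


# Hypothetical historic records with known outcomes (invented for this example).
history_rows = [
    {"supplier": "A", "season": "summer"},
    {"supplier": "A", "season": "winter"},
    {"supplier": "B", "season": "summer"},
    {"supplier": "B", "season": "winter"},
    {"supplier": "B", "season": "summer"},
    {"supplier": "A", "season": "winter"},
]
history_labels = ["ok", "ok", "defective", "defective", "defective", "ok"]

# Hypothetical held-out records used only for validation.
holdout_rows = [
    {"supplier": "B", "season": "winter"},
    {"supplier": "A", "season": "summer"},
]
holdout_labels = ["defective", "ok"]

model = train(history_rows, history_labels)
counts = confusion_counts(model, holdout_rows, holdout_labels)
print({k: counts[k] for k in ("TP", "FP", "TN", "FN")})
```

The `threshold` parameter is where the trade-off lives: lowering it flags more assemblies for extra inspection (fewer false negatives, more false positives), which may be the right choice when a missed defect is costlier than a second look.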

While traditional manufacturing models and product quality and failure models are often human-defined, and hence limited in complexity and in the number of variables considered, big data analytics approaches such as these allow manufacturers to automatically gather and analyse large numbers of sensor measurements over long periods of time and to deploy statistics and machine learning to generate complex models with a large number of influencing factors. Data mining software like RapidMiner and RapidAnalytics makes such large scale analysis and modelling feasible and supports the understanding and optimization of even complex manufacturing and quality problems.

How can you affordably implement predictive analytics in your manufacturing business? Sign up for our pilot program to find out how.

