Advances in Production Engineering & Management ISSN 1854-6250 
Volume 19 | Number 2 | June 2024 | pp 182–196 Journal home: apem-journal.org 
https://doi.org/10.14743/apem2024.2.500 Original scientific paper 
Optimization of reliability and speed of the end-of-line 
quality inspection of electric motors using machine learning 
Mlinarič, J.^a,b,*, Pregelj, B.^a, Boškoski, P.^a, Dolanc, G.^a, Petrovčič, J.^a
^a Jozef Stefan Institute, Jamova cesta 39, 1000 Ljubljana, Slovenia
^b Jozef Stefan International Postgraduate School, Ljubljana, Slovenia
ABSTRACT
Consistently maintaining high-end product quality in the production process is challenging. End-quality inspection must be highly sensitive to detect even minimal deviations, while being fast and accurate. However, quality inspection systems often face calibration intricacies, are time-consuming, and rely heavily on expert knowledge. They handle substantial data flows and inspect numerous features, some of which contribute minimally to the final grade. To address these challenges, the paper proposes employing statistically supervised machine learning methods for classification. Decision trees, Random forests, Bagging, and Gradient boosting classifiers are recommended for feature selection and accurate diagnosis, particularly for electric motor classification. By utilizing the feature importance attribute for feature selection, the proposed approach compares model accuracies, reducing ramp-up and commission times significantly. The study found that all suggested classifiers achieved high accuracy in classifying electric motors in an end-of-line quality inspection system. Moreover, they effectively reduced the number of features and optimized database operations. Utilizing a reduced feature set streamlined diagnostic algorithms, accelerated learning, and improved model interpretability, enhancing overall efficiency and comprehension. Furthermore, analysing the feature importance attribute could simplify diagnostic hardware and expedite quality inspection by eliminating unnecessary steps. Newly generated models can also verify expert decisions on feature selection and limit adjustments, enhancing efficiency in production processes.
ARTICLE INFO
Keywords:
Quality inspection;  
Fault detection;  
Machine learning;  
Feature selection and 
classification; 
Feature importance; 
Decision trees; 
Random forests; 
Bagging; 
Gradient boosting algorithm 
*Corresponding author: 
jernej.mlinaric@ijs.si 
(Mlinarič, J.) 
Article history:  
Received 20 February 2024 
Revised 8 May 2024 
Accepted 27 May 2024 
Content from this work may be used under the terms of 
the Creative Commons Attribution 4.0 International 
Licence (CC BY 4.0). Any further distribution of this work 
must maintain attribution to the author(s) and the title of 
the work, journal citation and DOI.
1. Introduction
Electric motors are among the most mass-produced devices, and they are manufactured on highly automated production lines equipped with 100 % end-of-line (EoL) quality inspection [1, 2]. Maintaining constant product quality, detecting faulty products, and preventing them from being delivered to customers or built into other devices and systems is highly important. With each further production or integration step involving a faulty part, replacement costs increase substantially [1]. For this reason, great attention is paid to the design and implementation of fully automated EoL quality inspection systems. These must be reliable and, at the same time, fast enough not to hinder the pace of the production line. In the presented case, the quality inspection of electric motors is performed in a non-invasive way, in which several variables are measured during a short test run of the motor. The measured variables comprise electrical parameters (voltage, current and power), speed, torque, vibrations at different points of the motor body, and sound at different rotational speeds [1]. The
mentioned signals are sampled at a high frequency (e.g. 50 kHz) and further processed by signal processing methods (e.g. digital filtering, fast Fourier transform, etc.). This reduces the amount of data considerably while preserving relevant information, but still results in a high number of calculated parameters (features), which form the basis for motor fault detection and isolation. A high number of features can be impractical for several reasons: a) not all features carry useful information, b) some features may carry the same information, c) comparing a high number of features against their threshold values may be time-consuming, and d) determining (learning) the feature threshold values is a demanding and time-consuming process.
With the ultimate goal of developing reliable and fast quality inspection methods, this paper deals with the problem of reducing the feature space to a limited subspace of relevant features that carry enough information for motor quality inspection. The feature space can be reduced by implementing machine learning methods that select only the relevant features. Since not all relevant features contribute the same amount of information to the final classification, an additional evaluation of the feature selection describes the influence of each observed feature. The features with minimal influence can therefore be eliminated from the learning procedure. This can significantly decrease the computational demand during the learning (ramp-up and commission time) and operation phases. Moreover, in certain cases it can even lead to the elimination of particular measurements (sensors), thus simplifying the inspection system hardware and software as well as speeding up the EoL testing procedure. The study presented in this paper is based on real industrial data derived from real EoL quality inspection systems installed at the production site of a renowned European mass producer of electric motors. The EoL quality inspection line, which is the subject of this paper, was designed and implemented by the authors of the paper.
The paper is organized as follows. In Section 2, the subject of inspection and the existing quality inspection procedure and system are briefly described, together with the structure of the measured data record and the resulting feature set generated by the inspection of one motor. Section 3 presents the machine learning algorithms (Decision tree, Random forest, Bagging and Gradient boosting) used for feature selection. In Section 4, the presented algorithms are evaluated and compared.
2. Problem description: Subject of inspection and quality inspection system  
The subjects of inspection are brushless DC (BLDC) motors for domestic and automotive applications. An example of such a motor, used for vacuum cleaning applications, is shown in Fig. 1. The addressed motors are manufactured by a renowned mass producer (Domel, Slovenia [3]). The production takes place on a fully automated assembly line equipped with a modular EoL quality inspection system, presented in Fig. 2. More details can be found in [2], which describes a similar system.
 
 
Fig. 1 Example of the BLDC motor (subject of EoL quality inspection) 
 
Fig. 2 Modular EoL quality inspection system 
 
The EoL quality inspection system assures “100 % quality inspection”, meaning that each produced motor undergoes the test procedure. In general, the faults fall into two categories: electrical and mechanical. The latter can be further divided into rotor, bearing and turbine faults. Rotor and bearing faults are comprehensively elaborated in [4], whereas the explanation of turbine faults can be found in [1, 2]. The main steps of the quality inspection procedure are described in Sections 2.1, 2.2 and 2.3 and in Fig. 3.
2.1 Measurement and data acquisition 
In the EoL quality inspection system, each motor is started several times, and during short test runs various motor parameters are measured by the automatic measuring and data acquisition system (Fig. 3, square 1). The following parameters are measured by sensors: electrical parameters (winding voltage and current, power, power supply current and voltage), vacuum pressure, rotation speed, vibrations at several points of the motor body, sound at low and high rotational speed, and also environmental conditions (ambient temperature and pressure) to compensate for their effect on motor performance. The results of the measurements are time series (waveforms) of the particular parameters, and they represent the “raw signals”. Depending on the observed parameter and the derived features, signals are acquired at a specific sampling frequency (typically 10-60 kHz) and measurement duration (typically from 0.1 up to 1 s), resulting in time series of various lengths (from 1000 samples to 30000 samples).
2.2 Feature extraction by signal processing 
To reduce the amount of data and to extract the relevant information, the raw signals are processed by signal processing methods, such as filtering (low-pass, high-pass, band-pass filters), down-sampling, averaging, frequency analysis, etc. The outcome of signal processing is a set of “features”, which are detailed in [4] and shown in Fig. 3 (square 2). They are in general:
• Root-Mean-Square (RMS) values of band-pass-filtered waveforms;
• Power of signals at particular frequencies;
• Aggregated/actual values obtained from specific measurement equipment.
Details of the feature extraction and signal processing algorithms are not described in this paper, as they are the subject of past research and development, elaborated in detail in [1, 5]. In this particular case, the signal processing algorithm generates 80 features, where each feature is represented by a floating-point numeric value.
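As a rough illustration of the first two feature types listed above, the following sketch computes a band-limited RMS value and the power at a single frequency for a synthetic test signal. The FFT-masking band-pass, the function names `band_rms` and `power_at`, and the test signal are assumptions made for this example only; the actual algorithms are those described in [1, 5].

```python
import numpy as np

def band_rms(signal, fs, f_lo, f_hi):
    """RMS of the signal content between f_lo and f_hi (Hz), using an
    idealized FFT-masking band-pass (an assumption, not the paper's filter)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    filtered = np.fft.irfft(spectrum, n=len(signal))
    return float(np.sqrt(np.mean(filtered ** 2)))

def power_at(signal, fs, f0):
    """Power of the sinusoidal component closest to f0 (Hz)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - f0)))
    amplitude = 2.0 * np.abs(spectrum[k]) / len(signal)
    return float(amplitude ** 2 / 2.0)

# Synthetic "raw signal": 0.1 s sampled at 50 kHz (5000 samples).
fs = 50_000
t = np.arange(5000) / fs
signal = 2.0 * np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 8000 * t)

rms_1k = band_rms(signal, fs, 900, 1100)   # RMS of the 1 kHz component
p_8k = power_at(signal, fs, 8000)          # power of the 8 kHz component
print(rms_1k, p_8k)
```

Each such scalar would correspond to one entry of the feature vector of a motor.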
2.3 Diagnostic result generation 
Based on the values of the features, the final diagnostic result of the inspected motor is generated by simple rules, as shown in Table 1. For each feature, it is checked whether it lies inside the specified range.
 
Table 1 Diagnostic result generation
Measurements     Features                                            Diagnostic result
Completed        All features are within specified ranges            Motor GOOD
Completed        One or more features are outside specified range    Motor BAD
Not completed    -                                                   UNDEFINED
Measurements may not be completed due to: measurement faults, sensor faults, motor manipulation faults, motor transport faults, etc.
 
At the start of the production of a new motor type, the motors from the test-production set (series 0) are assessed as good or bad by skilled experts. Based on experience, the experts select the features that are going to be used in diagnostic result generation. For this case-study system and motor type, 36 of the total 80 features were chosen, and for each of the chosen features two limit values (low and high) are set. In practice, all this is done manually by skilled experts. This is time-consuming and depends highly on expert skills. In addition, this method requires regular updates and fine-tuning of the limit values when mass production starts and production volumes increase [4].
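The rule-based logic of Table 1 can be sketched in a few lines. The function name `diagnose`, the two feature names picked for illustration, and the limit values are hypothetical; the real system uses 36 features with expert-defined limits.

```python
def diagnose(features, limits, measurement_completed=True):
    """Return the diagnostic result following the logic of Table 1."""
    if not measurement_completed:
        return "UNDEFINED"
    for name, (low, high) in limits.items():
        # A single feature outside its [low, high] range makes the motor BAD.
        if not low <= features[name] <= high:
            return "Motor BAD"
    return "Motor GOOD"

# Hypothetical low/high limits for two features (illustrative values only).
limits = {"BE_H1": (0.0, 1.5), "VA": (0.2, 0.8)}

result_good = diagnose({"BE_H1": 0.9, "VA": 0.5}, limits)
result_bad = diagnose({"BE_H1": 2.1, "VA": 0.5}, limits)
result_undefined = diagnose({}, limits, measurement_completed=False)
print(result_good, result_bad, result_undefined)
```

Every feature in use needs its own pair of limits, which is why the manual tuning effort grows with the number of selected features.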
Fig. 3 illustrates the entire procedure of measurement and data acquisition, feature extraction and diagnostic result generation. The whole process for one motor can be executed in under 30 s, but due to the parallel execution of the diagnosis steps, the motor inspection rate is one motor per 10 s, which means that every 10 s one motor exits the EoL inspection system.
The described quality inspection algorithm successfully detects motors of insufficient quality, but it has some drawbacks:
• The high number of original features (80) leads to a high number of feature range limits that must be defined;
• Some features carry similar information (redundancy);
• Some features carry no useful information;
• A skilled expert is required to remove redundant features and features that carry no relevant information, and to adjust the limit values of the features in use. This is difficult and time-consuming and becomes an issue during the start of production and the commissioning of new motor types.
Based on that, there are five main goals of the study presented in this paper:
• Automatic selection of relevant features (removing redundant features and features that do not carry relevant information);
• Decreasing the dependence on human expert skills;
• Automatic determination of feature limit values;
• Generating classification models with a set of features that holds 95 % of the useful information;
• Reducing the ramp-up and commission time of the quality inspection system.
 
 
Fig. 3 Data transformation during the procedure 
 
3. Feature selection using machine learning methods 
In this section, dedicated machine learning methods are used to automatically select the relevant features and set their threshold values. In general, feature selection is an effective way to deal with dimensionality [6], and it is often used in areas where a huge amount of data is obtained, such as gene identification [7-9], image classification and analysis [10-13], text classification [14-16] and, more recently, the monitoring of manufacturing processes and quality control [17-19]. Feature selection aims to identify and retain the most relevant features while discarding redundant or uninformative ones by determining the “degree of usefulness” of a specific feature. Reducing the number of features also reduces the number of feature limit values to be adjusted. The risk of the associated quality inspection errors [6] is decreased as well. However, within the set of informative features, some may be significantly more informative than others.
The methods presented not only eliminate non-informative features but also sort the remaining features according to their informativeness. The goal of this paper is therefore to assess whether the quality inspection can be successfully performed by using only a limited number of the most informative features. Feature selection reduces the dimensionality of the data; therefore, data mining algorithms can operate faster and more efficiently [6]. A reduced amount of input data simplifies the interpretation of tree-like machine learning methods [20]. Additionally, such simplified classification methods and reduced input datasets decrease the ramp-up and commission time of the quality inspection system and the whole production line. Since some redundant and non-informative features are removed, the sensors associated with the removed features can potentially be eliminated. Optimization and reordering of the diagnostic steps based on feature importance can speed up the quality inspection procedure.
3.1 Supervised machine learning classification methods 
Supervised machine learning methods were selected since labelled data for learning are available. These methods offer several advantages, including high reliability based on statistics, robustness, and a reduced need for expert knowledge (e.g. knowledge about the physical background of the system). However, in order to establish a supervised learning method, a sufficient amount of data from the production process is required. These methods are therefore suitable for mass-production manufacturing lines, like the one presented in this paper, where a lot of data is generated. In this paper, four different methods were tested and compared:
• Decision tree classifier (DT); 
• Random forest classifier (RF); 
• Bagging classifier (BG); 
• Gradient boosting classifier (GB). 
The decision tree classifier partitions the instance space through a recursive process, forming a tree model in which the top node (root) has no incoming edges, while internal nodes (test nodes) split the space based on attribute values. Internal nodes represent decision points, and bottom nodes (leaves) indicate decision outcomes [21, 22].
Random forest is a powerful ensemble learning method that combines multiple tree predictors. It belongs to the family of averaging methods, meaning that the driving principle is to build several estimators independently and then to average their predictions [23-25].
Bagging, short for bootstrap aggregating, is a powerful and straightforward method for constructing an ensemble of classifiers. It also belongs to the family of averaging methods. It combines the outputs of multiple classifiers for improved accuracy by training each on a subset of instances randomly drawn from the training set [22, 26, 27].
Gradient boosting is a technique for improving the performance of weak learners [28]. It belongs to the family of boosting methods, meaning that base estimators are built sequentially, with each one aiming to reduce the bias of the combined estimator. The technique combines weak models into a powerful ensemble and is particularly effective with decision trees [22, 25].
While the decision tree classifier (DT) stands alone, the remaining three classifiers (RF, BG,
GB) belong to the ensemble-based methods category. All methods here are “tree-like” classifiers 
and can be represented as decision trees using IF-THEN rules. The methods automatically gener-
ate IF-THEN rules for classification, including threshold values for each observed feature.  
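As a minimal sketch of how the four classifier types could be set up with scikit-learn, the example below trains them on a small synthetic dataset and prints the IF-THEN rules of the single decision tree. The data, the labelling rule, the feature names F1 to F4 and all hyperparameters are illustrative assumptions, not the settings used in the study.

```python
import numpy as np
from sklearn.ensemble import (BaggingClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data (NOT the paper's data): 300 "motors", 4 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
# Hypothetical labelling rule: GOOD (1) if the first feature is below 0.5, else BAD (2).
y = np.where(X[:, 0] < 0.5, 1, 2)

classifiers = {
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=20, random_state=0),
    "BG": BaggingClassifier(n_estimators=20, random_state=0),  # bags decision trees by default
    "GB": GradientBoostingClassifier(n_estimators=20, random_state=0),
}
for clf in classifiers.values():
    clf.fit(X, y)

# Text rendering of the learned rules, including the automatically set thresholds.
rules = export_text(classifiers["DT"], feature_names=["F1", "F2", "F3", "F4"])
print(rules)
```

In this toy case the printed tree contains a single threshold on F1, which mirrors how a learned threshold plays the role of a feature limit value.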
3.2 Data for learning and evaluation 
Data are needed to implement and test the methods. The data were generated by the existing EoL quality inspection system (Fig. 2) during the inspection of a total of 37440 motors. The generated data contain the raw time series of the measured signals and the extracted features (the 80 features mentioned above), as described in Section 2. For the machine learning algorithms, all features are used. The whole feature data set can be presented as a 37440 × 80 matrix. The quality inspection results (1 = Motor GOOD, 2 = Motor BAD, 0 = UNDEFINED) form a 37440 × 1 vector. The entire data set was divided into two parts: training data (75 % of all data) and testing data (25 % of all data). This arrangement is shown in Table 2.
The data records of all 37440 motors were randomly distributed between the training data and the testing data to compensate for a possible time drift of product quality. The same training and testing datasets were used during the training and testing of all four machine learning methods. The dataset remained unchanged throughout the entire training process. All involved features are named by symbols and anonymized to prevent the disclosure of sensitive technical data.
 
Table 2 Training and testing data arrangement
          F1   F2   …    F80   R
M1        X    X    X    X     X     Training data (75 %)
M2        X    X    X    X     X
…         X    X    X    X     X
M28080    X    X    X    X     X
M28081    X    X    X    X     X     Testing data (25 %)
…         X    X    X    X     X
M37440    X    X    X    X     X
M – motors, F – features, R – quality inspection results
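The random 75 %/25 % split described above can be sketched as follows. The placeholder arrays and the seed are assumptions; only the dimensions are taken from the text. A fixed seed is one way to make sure that all classifiers see the same split (scikit-learn's `train_test_split` with a fixed `random_state` would be an alternative).

```python
import numpy as np

n_motors, n_features = 37440, 80
rng = np.random.default_rng(seed=0)  # fixed seed: the same split for every classifier

# Placeholder arrays standing in for the real feature matrix and result vector.
X = np.zeros((n_motors, n_features))
y = rng.integers(0, 3, size=n_motors)  # 0 = UNDEFINED, 1 = Motor GOOD, 2 = Motor BAD

# Shuffle the motor indices so that production-time drift is spread over both sets.
shuffled = rng.permutation(n_motors)
n_train = int(0.75 * n_motors)
train_idx, test_idx = shuffled[:n_train], shuffled[n_train:]

X_train, y_train = X[train_idx], y[train_idx]
X_test, y_test = X[test_idx], y[test_idx]
print(X_train.shape, X_test.shape)  # (28080, 80) (9360, 80)
```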
3.3 Implementation 
Fig. 4 illustrates the proposed process of feature selection. 
 
 
Fig. 4 Flow chart of the procedure 
  
 
The process is carried out in six steps:
1. Generation of the original feature set (an input matrix of 80 features × 37440 motors and an output matrix of one feature, the grade, × 37440 motors) obtained by data acquisition and signal processing, as described in Section 2.
2. First machine learning run to train the classifiers. Feature selection is utilized to eliminate redundant features. Depending on the type of classifier, the number of selected features is reduced to 38 (DT classifier), 78 (RF classifier), 42 (BG classifier), and 35 (GB classifier).
3. Evaluation of the trained classifiers to present their performance and their capability to evaluate the features' classification impact and importance. The obtained results showed that:
a. The set of features with 95 % informativeness additionally reduces the feature space. Depending on the type of classifier, the number of selected features is additionally reduced to 17 (DT classifier), 51 (RF classifier), 22 (BG classifier), and 17 (GB classifier).
b. Certain features still persist despite their low importance.
Therefore, it was decided to check the performance of reduced classifiers with features that contain 95 % of all useful information.
4. Generation of reduced sets of training and testing data with the features that hold 95 % of the useful information for each classifier; the output matrix remains the same as in step 1.
5. Second machine learning run to train the classifiers with the new reduced dataset.
6. Evaluation of the new classifiers and comparison with the results of the classifiers from step 2.
All classification methods were implemented and tested in Python using the scikit-learn (sklearn) library. Cross-validation was employed during training (step 2 and step 5) to ensure robust model evaluation and to prevent model overfitting. For visualization and data manipulation, the matplotlib, numpy, and pandas libraries were also utilized.
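Steps 2 to 5 can be sketched with scikit-learn on synthetic data as shown below. The helper `features_for_share`, the synthetic dataset (in which only two of six features carry information), the use of a single decision tree and the 5-fold cross-validation are assumptions for illustration; the study's actual settings are not given in the text.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def features_for_share(importances, share=0.95):
    """Indices of the most important features whose importances
    add up to at least `share` of the total importance."""
    importances = np.asarray(importances, dtype=float)
    order = np.argsort(importances)[::-1]           # most important first
    cumulative = np.cumsum(importances[order]) / importances.sum()
    n_keep = int(np.searchsorted(cumulative, share)) + 1
    return order[:n_keep]

# Synthetic data: the label depends only on features 0 and 3; the rest is noise.
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 6))
y = ((X[:, 0] > 0) & (X[:, 3] > 0)).astype(int)

# Step 2: first training run on all features.
full_model = DecisionTreeClassifier(random_state=0).fit(X, y)

# Steps 3-4: keep the features that hold 95 % of the importance.
selected = np.sort(features_for_share(full_model.feature_importances_))

# Step 5: second training run on the reduced set, evaluated by cross-validation.
scores = cross_val_score(DecisionTreeClassifier(random_state=0),
                         X[:, selected], y, cv=5)
print(selected, scores.mean())
```

In scikit-learn, fitted tree, random forest and gradient boosting estimators expose `feature_importances_`; `BaggingClassifier` does not, so averaging the importances of its fitted `estimators_` would be one possible workaround.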
3.4 Presentation of the results and comparison of the methods 
Following the training phases, the trained algorithms were evaluated using the testing data, and the predicted output classes (quality inspection results) were compared with the actual output classes. For each method, the outcomes are presented in the form of the well-known confusion matrix (CM). The CM provides a numerical and visual representation of the classification algorithm's accuracy. It consists of columns representing the predicted output classes and rows representing the actual output classes. In the presented case, since there are three classes, the size of the CM is 3 × 3, as shown in Table 3. The diagonal elements represent correctly classified instances, while the off-diagonal elements represent misclassified instances.
 
Table 3 Confusion matrix structure
                                   PREDICTED
                       UNDEFINED   Motor BAD   Motor GOOD
ACTUAL   UNDEFINED
         Motor BAD
         Motor GOOD
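A small sketch of how such a 3 × 3 matrix can be filled from actual and predicted labels is given below, using the class codes from Section 3.2 (0 = UNDEFINED, 2 = Motor BAD, 1 = Motor GOOD) in the row and column order of Table 3. The label lists are made-up examples, not results from the study.

```python
import numpy as np

CLASS_ORDER = [0, 2, 1]  # UNDEFINED, Motor BAD, Motor GOOD (order of Table 3)

def confusion_matrix_3x3(actual, predicted):
    """Rows: actual class, columns: predicted class."""
    index = {label: i for i, label in enumerate(CLASS_ORDER)}
    cm = np.zeros((3, 3), dtype=int)
    for a, p in zip(actual, predicted):
        cm[index[a], index[p]] += 1
    return cm

# Made-up labels for eight motors.
actual    = [1, 1, 1, 2, 2, 0, 1, 2]
predicted = [1, 1, 2, 2, 1, 0, 1, 2]

cm = confusion_matrix_3x3(actual, predicted)
print(cm)
# [[1 0 0]
#  [0 2 1]
#  [0 1 3]]
```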
 
The CM provides valuable information about misclassification, but it does not directly capture the cost associated with each type of misclassification. To address this, a misclassification cost matrix can be introduced. This matrix assigns specific costs to different types of misclassification, considering the relative importance or impact of misclassifying different classes. The misclassification cost matrix has the same dimension as the confusion matrix and consists
of numerical values, as shown in Table 4. The diagonal elements of the misclassification cost matrix are set to zero (a correct classification represents no cost). The off-diagonal elements represent the costs associated with misclassified instances, with larger values indicating higher risks or costs. In our case, the highest cost is associated with the situation in which an actually faulty or undefined motor is recognized as a good one and delivered to the customer (actual class BAD or UNDEFINED, predicted class GOOD). The cost of this misclassification is set to 10. A bad motor that is predicted as undefined (and vice versa) presents a low misclassification cost; these cases are therefore graded with 0.5. A good motor classified as bad or undefined does not present any risk for the customer, but it represents an unnecessary waste of motors and is therefore marked with 1.
To calculate the overall misclassification cost, the CM and the misclassification cost matrix are multiplied (element-by-element multiplication), resulting in a new 3 × 3 matrix. The total cost is calculated by adding up all 9 elements of the resulting matrix. This value provides a measure of the total cost incurred due to misclassification. Ideally, a well-performing classifier would have a misclassification cost close to zero, indicating minimal misclassification and associated risks.
The accuracy of a classifier is a measure of its performance and is calculated as the ratio between the number of correctly classified elements and the total number of elements. The desired accuracy is close to 1 (100 %), indicating that almost all elements were classified correctly.
 
Table 4 Cost matrix
                                   PREDICTED
                       UNDEFINED   Motor BAD   Motor GOOD
ACTUAL   UNDEFINED     0           0.5         10
         Motor BAD     0.5         0           10
         Motor GOOD    1           1           0
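The total misclassification cost and the accuracy can be computed from a confusion matrix as sketched below. The cost values are those of Table 4; the confusion matrix itself is a hypothetical example, not a result from the study.

```python
import numpy as np

# Cost matrix from Table 4 (rows: actual, columns: predicted;
# order: UNDEFINED, Motor BAD, Motor GOOD).
COST = np.array([[0.0, 0.5, 10.0],
                 [0.5, 0.0, 10.0],
                 [1.0, 1.0, 0.0]])

def total_cost(cm):
    """Element-by-element product of CM and cost matrix, summed over all 9 cells."""
    return float(np.sum(cm * COST))

def accuracy(cm):
    """Correctly classified instances (diagonal) divided by all instances."""
    return float(np.trace(cm) / np.sum(cm))

# Hypothetical confusion matrix, for illustration only.
cm = np.array([[10, 1, 0],
               [2, 50, 3],
               [4, 5, 900]])

cost = total_cost(cm)  # 1*0.5 + 2*0.5 + 3*10 + 4*1 + 5*1 = 40.5
acc = accuracy(cm)     # 960 / 975
print(cost, acc)
```

In this example, the three bad motors predicted as good contribute 30 of the 40.5 cost units, although they are only 3 of the 15 misclassified motors; this is the weighting effect the cost matrix is meant to capture.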
 
4. Analysis of the results 
During machine learning, all methods generated their own IF-THEN rules for classification. As explained in Section 3.1, the chosen methods are “tree-like” and can be expressed with IF-THEN rules. Fig. 5 presents an example of the decision tree generated by the decision tree classifier. The figure shows a diagnosis procedure with a tree-like set of rules and leaves. Since the generated decision tree is very extensive, only the rule for the first branch is explained. The enlarged part of the figure shows the automatically generated threshold value for one particular feature (BE_H2). The enlarged part also shows the value of the Gini index [6, 21, 29], which is used as the splitting criterion [6]. Further, 27064 samples are involved in the classification process at this branch, of which 600 are marked as UNDEFINED, 1355 as Motor BAD and 25109 as Motor GOOD.
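For reference, the Gini index of a node follows from its class counts as one minus the sum of the squared class proportions. The sketch below applies this standard definition to the counts quoted above; the value printed in Fig. 5 is not reproduced in the text, so the result is only what these counts imply.

```python
def gini_index(class_counts):
    """Gini impurity: 1 - sum of squared class proportions."""
    total = sum(class_counts)
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

# Class counts at the first branch: UNDEFINED, Motor BAD, Motor GOOD.
node_counts = [600, 1355, 25109]

n_samples = sum(node_counts)    # 27064
gini = gini_index(node_counts)
print(n_samples, round(gini, 3))  # 27064 0.136
```

A value of 0 would mean a pure node containing a single class; the low value here reflects the dominance of good motors at this node.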
Each of the methods generates a similar decision tree scheme in which the rules are defined by the observed features and their threshold values. The threshold values are set automatically during machine learning, and since they are presented as real values, they can be easily checked and readjusted. Such an examination of the decision tree structure of each classifier enhances comprehension of the decision-making process employed by each method. This understanding is particularly valuable in industrial applications where interpretability is paramount, as it allows users to gain insights and interpret the decision-making process with clarity.
 
 
Fig. 5 Decision tree of the decision tree classifier 
 
Results of feature selection are collected in Table 5, which compares the classifiers. The results show that all methods successfully executed feature selection in step 2 (Fig. 4). The gradient boosting classifier selected the lowest number of important features, meaning that this method requires the smallest amount of data for successful classification. The least successful method in this respect is random forest.
However, the best-performing classifier is the one with the lowest number of wrongly classified motors and the lowest misclassification cost. By this measure, bagging outperformed all other classifiers, and gradient boosting yielded the worst performance. Fig. 6, Fig. 8, Fig. 10 and Fig. 12 present the confusion matrices for each classifier, generated at step 2.
 
Table 5 Comparison of observed classifiers
                                                                                 Decision tree   Random forest   Bagging   Gradient boosting
No. of selected features with original dataset after step 2                      38              78              42        35
Number of features for 95 % informativeness                                      17              51              22        17
Influence of 10 most important features                                          92 %            69 %            88.6 %    91.4 %
Accuracy of classifiers with original dataset (step 3)                           99.2 %          99.33 %         99.46 %   99.19 %
Accuracy of classifiers with reduced dataset (step 6)                            99.16 %         99.33 %         99.46 %   99.16 %
No. of wrongly classified motors of classifiers with original dataset (step 3)   75              53              51        76
No. of wrongly classified motors of classifiers with reduced dataset (step 6)    79              53              51        79
Misclassification cost of classifiers with original dataset (step 3)             426             477             312       679
Misclassification cost of classifiers with reduced dataset (step 6)              430             468             348       637
 
  
 
Further insights into feature selection across the different classifiers show that the most influential features hold the majority of the information useful for classification. The third row of Table 5 shows the share of information contained in the 10 most important features of each method. In all methods (except RF), the top ten features contain the majority of the information, and 17-51 features are required to achieve 95 % data informativeness. This comprehensive evaluation provides insights into the strengths and weaknesses of each classifier, considering both accuracy and misclassification cost. Table 6, which lists the 10 most influential features for each classifier in order of importance, shows that the classifiers recognize largely the same features as important (Figs. 14-17 visually illustrate the informativeness of the 10 most important features for each observed classifier). Table 7 lists all features involved in machine learning, and the columns mark whether each feature is recognized as important by each classifier. The rows of features that are important for all observed classifiers are coloured grey.
These findings can be highly beneficial for tuning classification parameters. Instead of adjusting the thresholds for all observed features, only the most influential features need to be addressed. Additionally, since some features do not contribute to the final classification decision, they do not need to be stored in the company database. Focusing only on the most informative features reduces the computational burden, decreases the data flow between the local computer and the company database, reduces the potential for communication errors, and ultimately requires less storage space.
 
Table 6 List of 10 most important features for each classifier
                         Decision tree   Random forest   Bagging   Gradient boost
1. important feature     BE_H1           BE_H2           BE_H1     BE_H1
2. important feature     BE_H2           BE_H1           BE_H2     BE_H2
3. important feature     BE_H3           BE_H3           BE_H5     VRC
4. important feature     VA              BE_H5           VRC       VA
5. important feature     VRC             BE_H4           VA        VA_H1
6. important feature     HW_H1S          BE_H6           HW_H1S    HW_H1S
7. important feature     VRL_H2          VRC             VA_H1     BE_H3
8. important feature     HW_H6           VA              BE_H3     BE_H6
9. important feature     AVR_V           HW_H1S          BE_H6     VRL_H2
10. important feature    VW_V            VA_H1           HW_H6     FR_H2
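The overlap between the four top-10 lists can be checked with a simple set intersection. The feature names below are copied from Table 6; the code itself is only an illustrative check and not part of the original study.

```python
top10 = {
    "DT": ["BE_H1", "BE_H2", "BE_H3", "VA", "VRC",
           "HW_H1S", "VRL_H2", "HW_H6", "AVR_V", "VW_V"],
    "RF": ["BE_H2", "BE_H1", "BE_H3", "BE_H5", "BE_H4",
           "BE_H6", "VRC", "VA", "HW_H1S", "VA_H1"],
    "BG": ["BE_H1", "BE_H2", "BE_H5", "VRC", "VA",
           "HW_H1S", "VA_H1", "BE_H3", "BE_H6", "HW_H6"],
    "GB": ["BE_H1", "BE_H2", "VRC", "VA", "VA_H1",
           "HW_H1S", "BE_H3", "BE_H6", "VRL_H2", "FR_H2"],
}

# Features that appear in the top-10 list of every classifier.
common = sorted(set.intersection(*(set(names) for names in top10.values())))
print(common)
# ['BE_H1', 'BE_H2', 'BE_H3', 'HW_H1S', 'VA', 'VRC']
```

Six of the ten features are shared by all four lists, and BE_H1 and BE_H2 occupy the first two positions for every classifier.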
 
Table 7 All features involved in machine learning and their recognition as important for each classifier (X – important, - – not important)
Feature name   DT   RF   BG   GB
VA             X    X    X    X
VA_H1          -    X    X    X
VA_H2          -    X    -    -
VA_H3          -    X    -    -
VA_H4          -    X    -    -
VA_H5          -    X    -    -
VA_H6          X    X    -    -
VA_H7          -    X    -    -
VA_H8          -    X    -    -
VA_H9          -    X    -    X
VA_H10         -    X    -    -
VA_H11         -    X    -    -
VA_H12         -    X    -    X
VA_H13         X    X    -    -
VA_H14         -    X    -    -
VA_H15         X    X    -    -
AVR_U          -    X    X    X
AVR_V          X    X    X    X
AVR_W          -    X    -    -
BM_U           -    X    -    -
BM_V           X    X    X    X
BM_W           X    X    -    -
FR_H1          X    X    X    X
FR_H2          X    X    X    X
BE_H0          -    -    -    -
BE_H1          X    X    X    X
BE_H2          X    X    X    X
BE_H3          X    X    X    X
BE_H4          X    X    X    X
BE_H5          X    X    X    X
BE_H6          X    X    X    X
CP             X    X    X    -
VW_U           -    X    X    X
VW_V           X    X    X    -
VW_W           X    X    X    X
VRC            X    X    X    X
VRC_H1         -    X    -    -
VRC_H2         X    X    -    -
VRC_H3         -    X    -    -
VRC_H4         X    X    X    -
VRC_H5         -    X    -    X
VRC_H6         -    X    -    -
VRC_H7         -    X    -    -
VRC_H8         -    X    -    -
VRC_H9         X    X    X    -
VRC_H10        -    X    -    -
VRC_H11        -    X    -    X
VRC_H12        X    X    -    -
VRC_H13        -    X    X    X
VRC_H14        -    X    -    X
VRC_H15        -    X    X    X
VRL            -    X    -    -
VRL_H1         -    X    X    -
VRL_H2         X    X    -    X
VRL_H3         -    X    -    -
VRL_H4         -    X    -    -
VRL_H5         -    X    X    X
VRL_H6         X    X    X    X
VRL_H7         -    X    X    -
VRL_H8         -    X    X    -
VRL_H9         X    X    X    X
VRL_H10        -    X    X    X
VRL_H11        -    X    -    -
VRL_H12        -    X    X    -
VRL_H13        X    X    -    -
VRL_H14        -    X    -    -
VRL_H15        -    X    -    -
MV             X    X    X    X
CW_U           X    X    -    -
CW_V           X    X    X    -
CW_W           X    X    X    -
HW_H1S         X    X    X    X
HW_H18         X    X    X    X
HW_H2          X    X    X    X
HW_H2S         X    X    X    X
HW_H3          X    X    X    X
HW_H4          X    X    X    X
HW_H6          X    X    X    X
HW_H9          -    X    X    -
REV            -    -    -    -
Finally, the performance of the classifiers trained with the features that hold 95 % of the
information (step 5) is evaluated (step 6). The results in Table 5 show that the accuracies and
performances of the classifiers do not change much. However, except for the RF and GB
classifiers, the cost of misclassification increased, meaning that the remaining classifiers
performed slightly worse (as shown in Figs. 6-13). Despite minor fluctuations in accuracy, the
overall performance remains relatively stable, which indicates the robustness of the chosen
methods. However, the increase in misclassification costs for certain classifiers indicates
potential areas for optimization in future iterations.
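
One plausible reading of the "95 % information" criterion is to rank the features by importance and keep the shortest prefix whose cumulative normalized importance reaches 0.95. The sketch below follows that reading. The exact procedure used in step 5 may differ, the function name is hypothetical, and the importance values are invented for illustration. In practice the importances could come from a trained model, for example the `feature_importances_` attribute of scikit-learn tree-based estimators, if that library is used.

```python
def select_by_cumulative_importance(importances, threshold=0.95):
    """Return the top-ranked features whose normalized importances
    sum to at least `threshold` (shortest such prefix of the ranking)."""
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must have a positive sum")
    # Rank features from most to least important.
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, value in ranked:
        if cumulative >= threshold:
            break
        selected.append(name)
        cumulative += value / total
    return selected


# Hypothetical importances (not values from the paper).
importances = {"BE_H1": 0.50, "BE_H2": 0.25, "VRC": 0.15, "VA": 0.06, "REV": 0.04}

selected = select_by_cumulative_importance(importances)
print(selected)
# ['BE_H1', 'BE_H2', 'VRC', 'VA']
```

With these made-up values the first three features cover 90 % of the importance, so a fourth is needed to pass the 95 % threshold and the least important feature is dropped.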
 
 
Fig. 6 Confusion matrix for decision tree classifier (38 features)

Fig. 7 Confusion matrix for decision tree classifier (17 features)
 
 
Fig. 8 Confusion matrix for random forest classifier (78 features)

Fig. 9 Confusion matrix for random forest classifier (51 features)

Fig. 10 Confusion matrix for bagging classifier (42 features)

Fig. 11 Confusion matrix for bagging classifier (22 features)

Fig. 12 Confusion matrix for gradient boost classifier (35 features)

Fig. 13 Confusion matrix for gradient boost classifier (17 features)
 
Fig. 14 Feature importance for decision tree classifier 
 
Fig. 15 Feature importance for random forest classifier 
 
Fig. 16 Feature importance for bagging classifier 
 
Fig. 17 Feature importance for gradient boost classifier 
5. Conclusion 
This study introduced and compared several classifiers for feature selection in the automated
end-of-line quality inspection of electric motors on a real manufacturing line. Decision tree,
Random forest, Bagging, and Gradient boosting classifiers were implemented and assessed based on
their complexity (number of selected features), accuracy, and the impact of the important
features. The initial goal of the study was achieved. All four tested classifiers demonstrated
high accuracy, proving their suitability for electric motor classification in an end-of-line
quality inspection system. All investigated classifiers reduced the number of features and thus
optimized the database operation. Further, the second step of feature selection, using a reduced
dataset of features that hold 95 % of the useful information, yielded high accuracy of the
trained classifiers. This reduced feature set simplifies the diagnostic algorithm, speeds up its
learning, and improves the interpretability of the observed models, making them more
understandable and explainable. In addition, the new classification models, learned with the
reduced dataset, simplify the end-of-line quality inspection, decrease the ramp-up and
commissioning time, eliminate unnecessary diagnostic steps, reduce equipment complexity (in some
cases eliminating the need for particular sensors), reduce costs, and minimize data flow.
Consequently, company databases are optimized. Because the learning procedures are fully
automated, reliance on specialized experts is reduced. The developed classification models can
also be used to verify experts' decisions regarding feature selection and threshold value
adjustment. In summary, this study provides insights into feature selection, practical
implications for industrial applications with regard to the robustness of the methods, and a
comprehensive evaluation of different classifiers in terms of accuracy and misclassification
cost, aiding the selection of the most suitable classifier for a specific application.
 
During research and implementation, two interesting and useful topics for future research were
identified: (1) transferability of classification models, and (2) condition monitoring of
production lines.
Within the first topic, it could be investigated whether classification models derived for one
motor type could be used for quality inspection of similar motor types, or whether they could
speed up the learning procedure for new motor types. Based on the manufacturer's experience, the
diagnosis procedure across various manufacturing lines follows a similar approach, leading to the
detection of similar faults and malfunctions across different products. Furthermore, common
features are identified, suggesting that the feature selection obtained from the observed
classifiers for a particular motor type could be applied to other products. This transferability
would be especially beneficial for small-series products. Some motor types are manufactured in
small series (e.g. up to a thousand pieces per year), so it is challenging to establish an
accurate classification model with a limited amount of learning data. Exploring the applicability
of these findings across various motor types can speed up the creation of quality inspection
algorithms for new motor types in the future. As new motor types are developed frequently and
produced in varying quantities, the transferability of the methods can establish a standardized
approach to implementing quality inspection algorithms for new motor types, reducing costs and
thus enhancing the whole manufacturing process.
The second topic concerns the possibilities of condition monitoring of the production line. In
normal conditions (when there is no degradation of the manufacturing process), the feature
importance attributes do not change significantly over time. On the other hand, an increase in
the importance of a particular feature may indicate an issue with a particular manufacturing
operation or with input material or components. Periodic evaluation of the feature importance
attributes can therefore help to detect faults or degradation in various steps of the
manufacturing process, or issues with input materials and components.
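
Such a periodic check could be sketched as follows. This is only an illustration of the idea under stated assumptions: the function name `importance_increases`, the alert threshold of 0.05, and all importance values are hypothetical, and a real deployment would need a threshold calibrated on historical data.

```python
def importance_increases(baseline, current, min_increase=0.05):
    """Return (feature, increase) pairs for features whose importance rose
    by at least `min_increase` relative to the baseline, largest first."""
    changes = {
        name: current.get(name, 0.0) - baseline.get(name, 0.0)
        for name in set(baseline) | set(current)
    }
    flagged = [(name, delta) for name, delta in changes.items()
               if delta >= min_increase]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical importances from a reference period and from a recent retraining.
baseline = {"BE_H1": 0.40, "BE_H2": 0.30, "VRC": 0.20, "VW_V": 0.10}
current = {"BE_H1": 0.35, "BE_H2": 0.27, "VRC": 0.18, "VW_V": 0.20}

alerts = importance_increases(baseline, current)
for name, delta in alerts:
    print(f"{name}: importance increased by {delta:.2f}")
```

In this made-up example only VW_V is flagged, which would prompt a closer look at the manufacturing step or the input components that influence that measurement.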
Acknowledgments 
This work was supported by the Slovenian Research Agency under Grants P2-0001 and L2-4454.
References 
[1] Juričić, Ð., Petrovčič, J., Benko, U., Musizza, B., Dolanc, G., Boškoski, P., Petelin, D. (2013). End-quality control in the 
manufacturing of electrical motors, In: Strmčnik, S., Juričić, Đ. (eds.), Case studies in control, Advances in industrial 
control, Springer, London, United Kingdom, 221-256, doi: 10.1007/978-1-4471-5176-0_8. 
[2] Benko, U., Petrovčič, J., Mussiza, B., Juričić, Đ. (2008). A system for automated final quality assessment in the man-
ufacturing of vacuum cleaner motors, IFAC Proceedings Volumes, Vol. 41, No. 2, 7399-7404, doi: 10.3182/ 
20080706-5-KR-1001.01251. 
[3] Domel, Domel d.o.o., from https://www.domel.com/sl, accessed September 26, 2023. 
[4] Boškoski, P., Petrovčič, J., Musizza, B., Juričić, Đ. (2011). An end-quality assessment system for electronically com-
mutated motors based on evidential reasoning, Expert Systems with Applications, Vol. 38, No. 11, 13816-13826, 
doi: 10.1016/j.eswa.2011.04.185. 
[5] Benko, U., Petrovčič, J., Juričić, Đ. (2005). In-depth fault diagnosis of small universal motors based on acoustic 
analysis, IFAC Proceedings Volumes, Vol. 38, No. 1, 323-328, doi: 10.3182/20050703-6-cz-1902.01856. 
[6] Rokach, L., Maimon, O. (2014). Data mining with decision trees, Theory and applications, 2
nd
 Edition, World Scien-
tific, New Jersey, USA, doi: 10.1142/9097. 
[7] Kim, S., Xing, E.P. (2009). Statistical estimation of correlated genome associations to a quantitative trait network, 
PLOS Genetics, Vol. 5, No. 8, Article No. e1000587, doi: 10.1371/journal.pgen.1000587. 
[8] Beisvag, V., Jünge, F.K.F., Bergum, H., Jølsum, L., Lydersen, S., Günther, C.-C., Ramampiaro, H., Langaas, M., Sandvik, 
A.K., Lægreid, A. (2006). GeneTools - application for functional annotation and statistical hypothesis testing, BMC 
Bioinformatics, Vol. 7, No. 1, Article No. 470, doi: 10.1186/1471-2105-7-470. 
[9] Kuehl, P.M., Weisemann, J.M., Touchman, J.W., Green, E.D., Boguski, M.S. (1999). An effective approach for analyz-
ing "prefinished" genomic sequence data, Genome Research, Vol. 9, No. 2, 189-194, doi: 10.1101/gr.9.2.189. 
[10] Núñez, J., Llacer, J. (2003). Astronomical image segmentation by self-organizing neural networks and wavelets, 
Neural Networks, Vol. 16, No. 3-4, 411-417, doi: 10.1016/s0893-6080(03)00011-x.  
[11] Chen, E.-L., Chung, P.-C., Chen, C.-L., Tsai, H.-M., Chang, C.-I. (1998). An automatic diagnostic system for CT liver 
image classification, IEEE Transactions on Biomedical Engineering, Vol. 45, No. 6, 783-794, doi: 10.1109/ 
10.678613. 
Mlinarič, Pregelj, Boškoski, Dolanc, Petrovčič 
 
196 
Advances in Production Engineering & Management 19(2) 2024 
 
[12] Makadia, A., Pavlovic, V., Kumar, S. (2008). A new baseline for image annotation, In: Forsyth, D., Torr, P., Zisser-
man, A. (eds.), Computer Vision – ECCV 2008. ECCV 2008. Lecture notes in computer science, Vol. 5304, Springer, 
Berlin, Heidelberg, Germany, doi: 10.1007/978-3-540-88690-7_24. 
[13] Chen, Z.-Y., Lin, W.-C., Ke, S.-W., Tsai, C.-F. (2015). Evolutionary feature and instance selection for traffic sign 
recognition, Computers in Industry, Vol. 74, 201-211, doi: 10.1016/j.compind.2015.08.007. 
[14] Deng, X., Li, Y., Weng, J., Zhang, J. (2019). Feature selection for text classification: A review, Multimedia Tools and 
Applications, Vol. 78, No. 3, 3797-3816, doi: 10.1007/s11042-018-6083-5. 
[15] Baccianella, S., Esuli, A., Sebastiani, F. (2014). Feature selection for ordinal text classification, Neural Computation, 
Vol. 26, No. 3, 557-591, doi: 10.1162/NECO_a_00558. 
[16] Baecchi, C., Uricchio, T., Bertini, M., Del Bimbo, A. (2016). A multimodal feature learning approach for sentiment 
analysis of social network multimedia, Multimedia Tools and Applications, Vol. 75, No. 5, 2507-2525, doi: 
10.1007/s11042-015-2646-x. 
[17] Chen, J., Wang, T., Gao, X., Wei, L. (2018). Real-time monitoring of high-power disk laser welding based on support 
vector machine, Computers in Industry, Vol. 94, 75-81, doi: 10.1016/j.compind.2017.10.003. 
[18] Guinea, D., Ruiz, A., Barrios, L.J. (1991). Multi-sensor integration—An automatic feature selection and state iden-
tification methodology for tool wear estimation, Computers in Industry, Vol. 17, No. 2-3, 121-130, doi: 10.1016/ 
0166-3615(91)90025-5. 
[19] Hidalgo-Mompeán, F., Gómez Fernández, J.F., Cerruela-García, G., Márquez, A.C. (2021). Dimensionality analysis 
in machine learning failure detection models. A case study with LNG compressors, Computers in Industry, Vol. 128, 
Article No. 103434, doi: 10.1016/j.compind.2021.103434. 
[20] EdwardI, G., Foster, D.P. (2000). Calibration and empirical Bayes variable selection, Biometrika, Vol. 87, No. 4, 
731-747, doi: 10.1093/biomet/87.4.731. 
[21] Breiman, L., Friedman, J., Olshen, R.A., Stone, C.J. (1984). Classification and regression trees, 1
st 
Edition, Chapman 
and Hall/CRC, New York, USA, doi: 10.1201/9781315139470. 
[22] Rokach, L. (2010). Pattern classification using ensemble methods, World Scientific, Singapore, doi: 10.1142/7238. 
[23] Breiman, L. (2001). Random forest, Machine learning, Vol. 45, No. 1, 5-32, doi: 10.1023/A:1010933404324. 
[24] Breiman, L. (1998). Arcing classifier (with discussion and a rejoinder by the author), Annals of Statistics, Vol. 26, 
No. 3, 801-849, doi: 10.1214/aos/1024691079. 
[25] Hastie, T., Tibshirani, R., Friedman, J. (2009). The elements of statistical learning, Data mining, inference, and pre-
diction, Second Edition, Springer, New York, USA, doi: 10.1007/978-0-387-84858-7. 
[26] Breiman, L. (1996). Bagging predictors, Machine Learning, Vol. 24, No. 2, 123-140, doi: 10.1007/bf00058655. 
[27] Breiman, L. (1999). Pasting small votes for classification in large databases and on-line, Machine Learning, Vol. 
36, No. 1, 85-103, doi: 10.1023/A:1007563306331. 
[28] Okun, O. (2011). Feature selection and ensemble methods for bioinformatics: Algoritmic classification and imple-
mentations, IGI Global, Hershey, Pennsylvania, USA, doi: 10.5555/2050025. 
[29] Gelfand, S.B., Ravishankar, C.S., Delp, E.J. (1991). An iterative growing and pruning algorithm for classification 
tree design, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 13, No. 2, 163-174, doi: 
10.1109/34.67645. 
[30] Igual, L., Segui, S. (2017). Introduction to data science; A Python approach to concepts, Techniques and applications, 
Springer, Cham, Switzerland. 
[31] Carletti, M., Masiero, C., Beghi, A., Susto, G.A. (2019). Explainable machine learning in Industry 4.0: Evaluating 
feature importance in anomaly detection to enable root cause analysis, In: Proceedings of 2019 IEEE International 
Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 21-26, doi: 10.1109/SMC.2019.8913901.