Abstract
To enhance the functionality and improve the safety of the next generation of vehicles, this paper collects extensive historical vehicle fleet data and creates a rule-based model using machine learning methods in order to detect faulty vehicles in a fleet. The method is illustrated in several detailed steps and validated using an electrical fault of the LV (low-voltage) battery as an example. The results can be used as input for the test bench tests of the following vehicle generations.
Development and testing in the automotive industry is increasingly taking place virtually due to ever shorter development cycles [1]. Bergmeir [2] presented enhanced machine learning and data mining methods for analysing large hybrid electric vehicle fleets based on load spectrum data.
The analysed fleet data consists of >12 000 vehicles of the same vehicle model equipped with an electric powertrain. Together, these had completed >225 million kilometres and >6 million charging operations at the time of the analysis. The fleet data are available as load spectra. A load spectrum refers to the totality of all loads occurring on a component over a certain period of time, triggered by torques, speeds, accelerations, temperatures, etc. The data are continuously recorded during customer operation, converted from the time domain to frequencies of occurrence, and stored. For the classification, the value range of each quantity is divided into classes and the occurrences per class are counted (see Fig. 1).

Fig. 1 Load spectra classification
If a component is replaced in the workshop due to a defect, this is stored in a separate database and the vehicles concerned are classified as faulty vehicles. In addition to the information on the vehicle, date and component, a high-level error description is also saved. Based on the high-level error description, an analysis of the most frequent error cases was carried out and these were examined for their causes using the method presented below. In this paper, the electrical fault of the LV battery is used as an example application.
| Faulty vehicles | Total vehicles | Load spectra classes |
|---|---|---|
| 215 | >8 500 | >500 |
The dataset studied also has the following special features that had to be considered in the development of the method. Most of the data included are not normally distributed. There are some linear dependencies between the variables, for example because many of them are time-based. In addition, the ratio of faulty vehicles to the rest of the fleet is extremely imbalanced, which also influenced the chosen evaluation metrics.
Various metrics exist for evaluating the quality of the created and trained models. In this paper, two metrics are used: the balanced accuracy score (BAS) and the recall score.
Balanced accuracy score (BAS): BAS calculates a weighted accuracy suitable for unbalanced datasets. Each sample is weighted with the inverse frequency of its true class. BAS is calculated according to

$$\mathrm{BAS}=\frac{1}{\sum_{i}\hat{w}_{i}}\sum_{i}1(\hat{y}_{i}=y_{i})\,\hat{w}_{i}\tag{1}$$

where $\hat{y}_{i}$ is the predicted value, $y_{i}$ is the real value, and $\hat{w}_{i}$ is calculated from the real value and the associated weighting $w_{i}$ according to

$$\hat{w}_{i}=\frac{w_{i}}{\sum_{j}1(y_{j}=y_{i})\,w_{j}}\tag{2}$$
Recall Score: The recall score is the ratio of true-positive predictions (tp) to the sum of true-positive and false-negative predictions (fn) and thus describes the algorithm's ability to find the positive samples. It is suitable for unbalanced datasets, where the positive samples are the under-represented group in the dataset. The corresponding equation is:
$$\mathrm{recall}=\frac{tp}{tp+fn}\tag{3}$$
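Both metrics are available in scikit-learn [5]; the following minimal sketch (with illustrative label vectors) computes them:

```python
from sklearn.metrics import balanced_accuracy_score, recall_score

# Illustrative labels: 1 = faulty vehicle, 0 = rest of the fleet
y_true = [0, 0, 0, 0, 1, 1]   # real values
y_pred = [0, 0, 1, 0, 1, 0]   # predicted values

bas = balanced_accuracy_score(y_true, y_pred)   # Eqs. (1) and (2)
rec = recall_score(y_true, y_pred)              # Eq. (3): tp / (tp + fn)
print(f"BAS = {bas:.3f}, recall = {rec:.3f}")   # BAS = 0.625, recall = 0.500
```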
The procedure developed is divided into seven steps, which are shown in Fig. 2.

Fig. 2 Seven steps of the method
In Step 1, pre-processing, the datasets are prepared for the subsequent algorithms. For the faulty vehicles, a dataset is created in which the load spectra at the time of the fault occurrence are stored. For the rest of the fleet, a dataset is created in which the most recent load spectra per vehicle are stored. The two datasets are then merged and a label is introduced that distinguishes the faulty vehicles from the rest of the fleet. Through the label, supervised learning algorithms can subsequently be applied.
Filtering removes the columns in which only the value 0 occurs or in which at least 10 % of the values are NaN. Then the rows in which NaN values still occur are removed, since no NaN values may be present in the dataset for the machine learning algorithms. Additionally, only vehicles with a mileage greater than 1 000 km are considered further.
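A minimal pandas sketch of this merging and filtering, assuming `mileage` is a column of the dataset and `label` is the introduced fault label:

```python
import pandas as pd

def preprocess(faulty: pd.DataFrame, fleet: pd.DataFrame) -> pd.DataFrame:
    # Label faulty vehicles (load spectra at fault occurrence) with 1,
    # the rest of the fleet (most recent load spectra) with 0
    df = pd.concat([faulty.assign(label=1), fleet.assign(label=0)],
                   ignore_index=True)

    df = df.loc[:, (df != 0).any(axis=0)]    # drop all-zero columns
    df = df.loc[:, df.isna().mean() < 0.10]  # drop columns with >=10 % NaN
    df = df.dropna()                         # drop remaining rows with NaN
    return df[df["mileage"] > 1_000]         # keep vehicles with >1 000 km
```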
During further analysis of the dataset, it was found that there are correlations between the individual features. For example, the values of the energy meters and current integrals of a component are directly linearly dependent. Such correlations generally do not improve machine learning models. Especially with random forest algorithms, correlations can lead to worse results. In this case, the existing correlations are determined by means of a correlation analysis and removed by deleting one of the affected features from the dataset.
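A common way to implement such a correlation analysis is to scan the upper triangle of the absolute correlation matrix and drop one feature of every highly correlated pair; the 0.95 threshold below is an assumed value, not taken from the paper:

```python
import numpy as np
import pandas as pd

def drop_correlated(df: pd.DataFrame, threshold: float = 0.95) -> pd.DataFrame:
    corr = df.corr(numeric_only=True).abs()
    # Keep only the upper triangle so each feature pair is inspected once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return df.drop(columns=to_drop)
```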
The dataset used is composed of various physical quantities. It contains date values, time-based and trip-based counter values, kilometre readings, energy counters and current integrals. The value ranges of the different variables can differ by several powers of ten. In order to weight all variables equally in the following evaluations and to avoid variables with a large value range (and thus large variance) dominating the description of the dataset, the individual variables must be scaled. For the problem at hand, a percentage scaling was applied. The load spectra with more than one class are considered (one class corresponds to one feature in the following) and the percentage distributions of the classes per load spectrum are calculated from the counter values. The resulting value range is between 0 and 1, where the value 1 means that only one class of the respective load spectrum has counter values. The advantage over standard scaling is that the scaled values can still be interpreted by the user. However, the percentage scaling can only be applied to load spectra with more than one feature, so it is combined with standard scaling.
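A sketch of the combined scaling, assuming `spectra` maps each load spectrum name to the list of its class columns (single-class spectra fall back to standard scaling):

```python
from sklearn.preprocessing import StandardScaler

def scale(df, spectra):
    df = df.copy()
    single = []
    for name, cols in spectra.items():
        if len(cols) > 1:
            # Percentage distribution of the classes within one load spectrum;
            # each row then sums to 1 over the spectrum's classes
            df[cols] = df[cols].div(df[cols].sum(axis=1), axis=0)
        else:
            single += cols
    if single:
        df[single] = StandardScaler().fit_transform(df[single])
    return df
```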
Step 2 is to visualise the dataset for the user. The aim is to highlight the faulty vehicles in comparison to the rest of the fleet in order to identify similarities where applicable. Since the dataset still consists of >500 features after filtering, a direct visualisation is not possible and data reduction methods must first be applied. These reduce the dataset to two dimensions, which can then be plotted for the user. The data reduction is carried out in two stages. In the first stage, principal component analysis (PCA) [6] is applied; in the second stage, t-SNE [7] reduces the data to the two dimensions shown in Fig. 3.

Fig. 3 Visualisation of the fleet and faulty vehicles
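A sketch of the two-stage reduction with scikit-learn [5], assuming `X_scaled` is the scaled feature matrix from Step 1; the number of principal components kept in the first stage (50) is an assumption:

```python
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X_pca = PCA(n_components=50).fit_transform(X_scaled)  # stage 1: PCA
X_2d = TSNE(n_components=2).fit_transform(X_pca)      # stage 2: t-SNE to 2D

# X_2d can now be scattered, highlighting the faulty vehicles by their label
```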
In Step 3, the outliers identified visually in Fig. 3 are removed from the dataset. Several outlier detection algorithms were compared for this purpose. The best result, with a balanced accuracy score of 84 %, was achieved by the local outlier factor algorithm [8], applied to the dataset reduced by t-SNE and subsequently resampled with SMOTEENN [9].
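A minimal sketch of applying the local outlier factor [8] to the t-SNE-reduced data; `n_neighbors=20` is scikit-learn's default, not a value reported in the paper:

```python
from sklearn.neighbors import LocalOutlierFactor

# fit_predict returns -1 for detected outliers and 1 for inliers
lof = LocalOutlierFactor(n_neighbors=20)
outlier_mask = lof.fit_predict(X_2d) == -1

X_clean, y_clean = X_2d[~outlier_mask], y[~outlier_mask]  # remove outliers
```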
In Step 4, the dataset was split into a training and a test dataset for further analysis. The StratifiedShuffleSplit algorithm from scikit-learn [5] was used, which preserves the ratio of faulty to non-faulty vehicles in both subsets.
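A sketch of the split, assuming NumPy arrays `X`, `y` and an 80/20 ratio (the paper does not state the proportions):

```python
from sklearn.model_selection import StratifiedShuffleSplit

sss = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(sss.split(X, y))   # preserves the class ratio
X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
```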
Step 5 addresses a challenge of the existing dataset: the imbalance between faulty and non-faulty vehicles. This imbalance is, for example, 1:35 for the electrical fault of the LV battery.
The imbalance becomes problematic with algorithms that weight all results equally to calculate the accuracy, especially since the state of interest usually represents the minority. The problem can be solved in two ways. On the one hand, some algorithms allow the accuracy calculation to be switched to a balanced accuracy score. On the other hand, the dataset can be adjusted by over- and undersampling algorithms in such a way that the imbalance is eliminated. Various algorithms were investigated for this purpose. The SMOTEENN algorithm [9] showed the best results and was selected accordingly. It is a combination of the oversampling algorithm SMOTE [11] and the undersampling algorithm ENN (edited nearest neighbours) [12].
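With the imbalanced-learn package, the resampling is a single call; applying it only to the training data is an assumption of this sketch:

```python
from imblearn.combine import SMOTEENN

# SMOTE oversamples the faulty class, ENN removes noisy samples afterwards
X_res, y_res = SMOTEENN(random_state=42).fit_resample(X_train, y_train)
```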
Step 6, feature selection, serves two purposes. First, it gives the user an overview of the relevant features. The dataset may still contain features that are, for example, not a cause but a consequence of the fault; these must be removed for further analysis.
Furthermore, it was investigated whether the model quality can be increased by removing irrelevant features, i.e. whether the number of features can be reduced while maintaining the same model quality. If so, a reduced dataset can be used for the subsequent rule-learning procedures, which directly shortens calculation times.
Three methods were investigated to determine the relevant features: Bergmeir's method [2], recursive feature elimination [13], and feature selection via a pipeline.
The method according to Bergmeir was not pursued further due to its long runtime, but only used as a reference.
The recursive feature elimination algorithm is applied after the data has been scaled in Step 1. The 10 most important features are selected and displayed. The user then has the option to remove features he or she considers unsuitable (expert step).
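A sketch using scikit-learn's RFE [13]; the random forest as underlying estimator is an assumption, and `X_scaled` is taken to be a DataFrame so that column names can be reported:

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

rfe = RFE(RandomForestClassifier(random_state=42), n_features_to_select=10)
rfe.fit(X_scaled, y)

top_features = X_scaled.columns[rfe.support_]  # the 10 most important features
print(list(top_features))                      # shown to the user (expert step)
```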
The third implemented and investigated method of feature selection uses a pipeline. A pipeline executes several algorithms one after the other and tests them with different parameters within a cross-validation. The procedure is shown in Fig. 4.

Fig. 4 Feature selection via pipeline
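A sketch of such a pipeline with scikit-learn; the concrete estimators (extremely randomized trees) and the parameter grid are illustrative assumptions:

```python
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline

# Feature selection and classification chained into one estimator
pipe = Pipeline([
    ("select", SelectFromModel(ExtraTreesClassifier(random_state=42))),
    ("clf", ExtraTreesClassifier(random_state=42)),
])
params = {
    "select__max_features": [10, 20, 50],
    "clf__n_estimators": [100, 300, 500],
}
# Random search over the parameters within a cross-validation
search = RandomizedSearchCV(pipe, params, n_iter=5, cv=5,
                            scoring="balanced_accuracy", random_state=42)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```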
Step 7 is to determine rules. A rule is a simple if-then statement consisting of a condition and a prediction. The prediction in the present use case is the detection of the faulty vehicles. This is already known. The condition is the cause of the faults occurring. This is the relevant aspect to be determined in Step 7. Within the framework of the developed method, the Skope Rules, IREP and RIPPER algorithms are used for this purpose.
Skope Rules: Skope-rules is an interpretable rule-based classifier [17]. The rules are semantically deduplicated based on the variables of which they are composed.
IREP & RIPPER: The algorithms IREP (incremental reduced error pruning) [18] and RIPPER (repeated incremental pruning to produce error reduction) [19] are based on the same principle, RIPPER being an extension of IREP. A rule is learned and added to a rule set; the data points covered by the rule are removed from the dataset, and a new rule is learned on the remaining data. This is repeated until a termination criterion is reached. The two algorithms differ in the definition of the termination criteria. In addition, an optimisation phase was added to the RIPPER algorithm.
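As a sketch, the skope-rules package [17] learns such if-then rules directly; the thresholds below are assumed values, and `feature_names` is taken to be the column names of the dataset:

```python
from skrules import SkopeRules

clf = SkopeRules(feature_names=feature_names,
                 precision_min=0.3, recall_min=0.1, n_estimators=30,
                 random_state=42)
clf.fit(X_train, y_train)

# Each learned rule is a condition string with its precision and recall
for rule, (precision, recall, _) in clf.rules_[:3]:
    print(rule, round(precision, 2), round(recall, 2))
```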
The algorithms RIPPER, IREP, and Skope Rules were applied to the problem at hand; the results are summarised in the following table.
| Metric | IREP | RIPPER | Skope Rules |
|---|---|---|---|
| BAS (%) | 79.1 | 79.7 | 82.3 |
| Recall (%) | 83.3 | 83.3 | 74.1 |
Furthermore, an example rule of the Skope Rules algorithm is listed (see the following table).
| Load spectra | Description | Rule condition 1 | Rule condition 2 |
|---|---|---|---|
| LK108_1_X5 | Outside temperature class 5 | >0.000 23 % | - |
| LK74_2_X6 | Current onboard charger class 6 | >0.002 5 % | ≤99.6 % |
| LK85_2_X5 | Charging power class 5 | >0.17 % | ≤99.6 % |
| PROD_DT | Production date | >30.04.2014 | ≤22.02.2015 |
| SALES_AREA_ID | Sales area | >221 | - |
`PROD_DT <= 22.02.2015 & LK74_2_X6 > 0.0025 % & LK85_2_X5 > 0.17 %`
To check the plausibility of the method, the rules determined were divided into the individual load spectra classes. A comparison was then made between the faulty vehicles and the rest of the fleet (reference) for these load spectra classes.

Fig. 5 Boxplot diagram of outside temperature
The distributions of the production date are shown in Fig. 6.

Fig. 6 Histogram of the production date
The influence of the charging power on the investigated fault, derived from the rules, is shown in Fig. 7.

Fig. 7 Boxplot diagram of the charging power
The comparison between faulty vehicles and the reference fleet for the load spectrum of the current of the onboard charger is shown in Fig. 8.

Fig. 8 Boxplot diagram of the current of the onboard charger

Fig. 9 Histogram of sales area
In summary, the rules derived with the presented method for the investigated electrical fault of the LV battery could be shown to be plausible: the rules identified the load spectra classes in which the faulty vehicles deviate from the rest of the fleet.
A method for identifying the causes of faults from fleet data was developed. It is divided into several steps with the aim of first processing the input data and then learning rules for detecting the faulty vehicles; the processing of the input data serves to improve the detection result. The method was implemented using Python scripts and demonstrated with the electrical fault of the LV battery. Subsequently, a plausibility check was performed and the functionality was confirmed. The results can be used as input for the test bench tests of the following vehicle generations.
References
[1] FIETKAU P, KISTNER B, MUNIER J. Virtual powertrain development[J]. Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering, 2020, 234(14): 3288.
[2] BERGMEIR P. Enhanced machine learning and data mining methods for analysing large hybrid electric vehicle fleets based on load spectrum data[D]. Stuttgart: University of Stuttgart, 2017.
[3] SIDERIS A, KALOGEROPOULOS E C, MOIROGIORGOU K. Data analysis techniques for predictive maintenance on fleet of heavy-duty vehicles[J]. International Journal of Mechanical and Mechatronics Engineering, 2021, 15(7): 300.
[4] GUYON I, BENNETT K, CAWLEY G, et al. Design of the 2015 ChaLearn AutoML challenge[C]// 2015 International Joint Conference on Neural Networks (IJCNN). Killarney: IEEE, 2015.
[5] PEDREGOSA F, VAROQUAUX G, GRAMFORT A, et al. Scikit-learn: machine learning in Python[J]. Journal of Machine Learning Research, 2011(12): 2825.
[6] JOLLIFFE I T. Principal component analysis[M]. 2nd ed. New York: Springer-Verlag, 2002.
[7] VAN DER MAATEN L, HINTON G E. Visualizing high-dimensional data using t-SNE[J]. Journal of Machine Learning Research, 2008(9): 2579.
[8] BREUNIG M M, KRIEGEL H P, NG R T, et al. LOF: identifying density-based local outliers[J]. ACM SIGMOD Record, 2000, 29(2): 93.
[9] BATISTA G E, PRATI R C, MONARD M C. A study of the behavior of several methods for balancing machine learning training data[J]. ACM SIGKDD Explorations Newsletter, 2004, 6(1): 20.
[10] ROUSSEEUW P J. Least median of squares regression[J]. Journal of the American Statistical Association, 1984, 79(388): 871.
[11] CHAWLA N V, BOWYER K W, HALL L O, et al. SMOTE: synthetic minority over-sampling technique[J]. Journal of Artificial Intelligence Research, 2002, 16: 321.
[12] WILSON D L. Asymptotic properties of nearest neighbor rules using edited data[J]. IEEE Transactions on Systems, Man, and Cybernetics, 1972, 2(3): 408.
[13] GUYON I, WESTON J, BARNHILL S, et al. Gene selection for cancer classification using support vector machines[J]. Machine Learning, 2002, 46: 389.
[14] GEURTS P, ERNST D, WEHENKEL L. Extremely randomized trees[J]. Machine Learning, 2006, 63: 3.
[15] BREIMAN L. Random forests[J]. Machine Learning, 2001, 45: 5.
[16] BERGSTRA J, BENGIO Y. Random search for hyper-parameter optimization[J]. Journal of Machine Learning Research, 2012, 13(1): 281.
[17] GAUTIER R, JAFFRE G, NDIAYE B. Interpretability with diversified-by-design rules: skope-rules, a Python package[CP]. 2018.
[18] FÜRNKRANZ J, WIDMER G. Incremental reduced error pruning[C]// Proceedings of the Eleventh International Conference on Machine Learning. New Brunswick: Rutgers University, 1994.
[19] COHEN W W. Fast effective rule induction[C]// Proceedings of the Twelfth International Conference on Machine Learning. Tahoe City: Elsevier, 1995.