Strojniški vestnik - Journal of Mechanical Engineering 62(2016)1, 60-75
© 2016 Journal of Mechanical Engineering. All rights reserved.
DOI:10.5545/sv-jme.2015.2905
Review Scientific Paper
Received for review: 2015-07-23
Received revised form: 2015-11-10
Accepted for publication: 2015-12-10

A Review of the Extrapolation Method in Load Spectrum Compiling

Jixin Wang* - Hongbin Chen - Yan Li - Yuqian Wu - Yingshuang Zhang
Jilin University, School of Mechanical Science and Engineering, China

The load spectrum is the basis of fatigue analysis and life prediction in engineering, and load extrapolation is an essential step in deriving a long-term load spectrum from a short-term one. Selecting a proper extrapolation method is of great significance given the various forms and characteristics of loads. Over the past few decades, several load extrapolation methods have been proposed, so the reasonableness and accuracy of load spectra extrapolated by different methods deserve close attention. This paper reviews the commonly used extrapolation methods and proposes some future areas of research. The critical factors, the advantages and disadvantages, and the application ranges of the extrapolation methods are summarized using literature and illustrations to provide guidance for method selection. In the future, further methods and applications of load extrapolation can be explored as statistics and computer software technology develop.
Keywords: short-term load spectrum, long-term load spectrum, load extrapolation, parametric extrapolation, nonparametric extrapolation, quantile extrapolation

Highlights
• This paper reviews the extrapolation methods commonly used in load spectrum compiling in engineering;
• The extrapolation methods are classified as the parametric extrapolation method, the nonparametric extrapolation method and the quantile extrapolation method;
• Characteristics of each extrapolation method are summarized using literature and illustrations;
• Guidance for selecting an extrapolation method and some research prospects in this field are proposed.

0 INTRODUCTION

In engineering, many mechanical structures and components are subjected to complex and random loads, which determine the fatigue reliability and life of the machinery [1] and [2]. It is therefore indispensable to conduct fatigue analysis and life prediction of structures and components based on a load spectrum [3] and [4]. Currently, load spectra are widely used in the fields of aerospace [5] and [6], vehicles [7] and [8], wind power [9] and [10], construction machinery [11] and [12], and others [13] and [14]. In practice, a long-term load spectrum contains the complete load information, but it is difficult to measure directly because of the restrictions of testing technology, time and cost. It is therefore necessary to derive a long-term load spectrum from a short-term one. The traditional load spectrum compiling method multiplies a short-term load spectrum by a constant proportionality coefficient [15] to [17]. Since this only repeats the data measured in a finite time, extreme loads that were not captured in the measurement, yet have a greater impact on damage, are ignored. Load extrapolation methods can overcome this limitation of the traditional method. With the development of statistics and computer software, new methods have been applied to load extrapolation.
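To make the limitation of the traditional method concrete, the following minimal Python sketch scales a short-term spectrum by a constant coefficient. The spectrum values, the coefficient and the function name `scale_spectrum` are illustrative assumptions, not data or code from the cited studies. Every cycle count grows by the same factor, but the largest load level remains exactly the one that was measured.

```python
def scale_spectrum(spectrum, coefficient):
    """Traditional compiling: multiply every cycle count by one constant.

    `spectrum` maps a load amplitude level to the number of cycles counted
    at that level in the short-term measurement (hypothetical units).
    """
    return {level: count * coefficient for level, count in spectrum.items()}


# Hypothetical short-term spectrum: amplitude level -> cycle count.
short_term = {100.0: 5000, 200.0: 1200, 300.0: 150, 400.0: 8}

# Assumed proportionality coefficient (target duration / measured duration).
long_term = scale_spectrum(short_term, coefficient=1000)

# The counts scale, but no amplitude above the measured maximum appears.
print(max(short_term), max(long_term))
```

Because the set of load levels is unchanged, any extreme load that did not occur during the measurement is absent from the scaled spectrum, which is the gap that extrapolation methods aim to close.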
In load spectrum compiling, different extrapolation methods may give different results. Selecting an appropriate load extrapolation method is therefore very important, but difficult in practice. To give a better understanding of the methods and to provide selection guidance, this paper reviews and summarizes several commonly used extrapolation methods based on the literature and illustrations. The extrapolation methods are classified as the parametric extrapolation method (PE), the nonparametric extrapolation method (NPE) and the quantile extrapolation method (QE). In PE, the sample data are assumed to obey a known distribution, and the parameters of the distribution function are estimated from the load sample. In NPE, no distribution is assumed; the extrapolated result is obtained from a nonparametric density estimate, which can take an arbitrary shape. When the sample data have different load characteristics owing to different working conditions and operating behaviors during testing, QE can break the data into a series of clusters and compute the damage of each rainflow matrix. Literature and illustrations are presented to evaluate the extrapolation methods, and the characteristics of the methods, such as the critical factors, the advantages and disadvantages, and the application ranges, are summarized. Some potential research prospects are also discussed. This review aims to be comprehensive, but complete coverage is an impossible task, so we apologize for any omissions.

*Corr. Author's Address: School of Mechanical Science and Engineering, Jilin University, Changchun, China, 1518051537@qq.com

1 EXTRAPOLATION METHODS

1.1 Parametric Extrapolation Method (PE)

PE consists of fitting the sample data with a distribution function and estimating its parameters.
Depending on the type of sample data, PE is divided into the parameter-estimate extrapolation method (PEE) and the extreme-value extrapolation method (EVE).

1.1.1 Parameter-Estimate Extrapolation Method (PEE)

PEE is a traditional extrapolation method that extrapolates a short-term load spectrum counted from a measured load time history. PEE includes one-dimensional extrapolation, in which only the amplitudes and their frequencies are extrapolated, and two-dimensional extrapolation, in which both the means and the amplitudes are extrapolated together with their frequencies [18] to [20]. In practice, the two-dimensional extrapolation method is commonly used, and its process is as follows:

1. Preprocess the measured load. The preprocessing mainly includes discretizing the analog signal, filtering the digital signal, eliminating the trend term, and checking for and eliminating abnormal peaks [21].

2. Transform the load time history into a short-term load spectrum. The rainflow counting method (RCM) is frequently used in PEE [16] and [18]. RCM, which was proposed by Matsuishi and Endo in 1968 and developed further in the following decades [22] and [23], is a procedure for determining the damaging load cycles in a load time history [24]; the cycles are usually summed into bins referenced by their mean values and amplitudes. For example, in Wang et al. [25], the outfield load spectrum was divided into one main cycle and four sub cycles by RCM.

3. Fit the amplitudes and mean values with distribution functions. The relationship between the mean values and frequencies usually obeys a normal distribution [20], while the relationship between the amplitudes and frequencies usually obeys a Weibull distribution [26].
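When the variables are assumed to obey a two-dimensional normal distribution, the probability density function introduced by Holling and Mueller [27] is:

$$f(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \frac{(x-\mu_1)^2}{\sigma_1^2} - \frac{2\rho(x-\mu_1)(y-\mu_2)}{\sigma_1\sigma_2} + \frac{(y-\mu_2)^2}{\sigma_2^2} \right] \right\}, \tag{1}$$

where $\mu_1$, $\mu_2$ are the mathematical expectations of $x$ and $y$, respectively, $\sigma_1$, $\sigma_2$ are the standard deviations of $x$ and $y$, respectively, and $\rho$ is the correlation coefficient. In the equation, $\mu_1$, $\mu_2$, $\sigma_1$, $\sigma_2$, $\rho$ are all constants, and $\sigma_1 > 0$, $\sigma_2 > 0$, $-1 < \rho < 1$.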
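As a small illustration of Eq. (1), the sketch below evaluates the standard two-dimensional normal density in plain Python. The function name `bivariate_normal_pdf` and all parameter values are hypothetical, chosen only to show how the five constants enter the formula; they are not fitted to any data set discussed in this review.

```python
import math


def bivariate_normal_pdf(x, y, mu1, mu2, sigma1, sigma2, rho):
    """Evaluate the two-dimensional normal density of Eq. (1)."""
    if sigma1 <= 0 or sigma2 <= 0 or not -1 < rho < 1:
        raise ValueError("need sigma1 > 0, sigma2 > 0 and -1 < rho < 1")
    zx = (x - mu1) / sigma1          # standardized first variable
    zy = (y - mu2) / sigma2          # standardized second variable
    one_minus_rho2 = 1.0 - rho * rho
    quad = zx * zx - 2.0 * rho * zx * zy + zy * zy
    norm = 2.0 * math.pi * sigma1 * sigma2 * math.sqrt(one_minus_rho2)
    return math.exp(-quad / (2.0 * one_minus_rho2)) / norm


# Hypothetical parameters (assumed values, for illustration only).
density_at_centre = bivariate_normal_pdf(
    50.0, 20.0, mu1=50.0, mu2=20.0, sigma1=10.0, sigma2=5.0, rho=0.3
)
print(density_at_centre)
```

In an actual two-dimensional extrapolation the five constants would be estimated from the rainflow-counted sample; here they are simply assumed.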

[...] $u_{\max}$ or $X < u_{\min}$. By POT, the maxima above the threshold $u_{\max}$ and the minima below the threshold $u_{\min}$ are randomly regenerated, and only these extreme values are extrapolated.

Threshold selection involves a trade-off. On the one hand, the level must be high enough that only true peaks, with Poisson arrival rates, are selected; too low a threshold leads to biased estimates [47]. On the other hand, the level must be low enough that sufficient data are selected to estimate the distribution parameters accurately and to keep the variance of the estimates small [47]. Johannesson [40] suggested a simple method that chooses the threshold so that the number of exceedances equals the square root of the number of cycles in the signal, which works well in many cases [48]. Other threshold-selection methods have also been proposed, for example by Davison [49], Ledermann et al. [50] and Walshaw [51].

Level upcrossings (LU): Johannesson and Thomas [17] proposed using LU to obtain the maxima and minima of the load cycles, then to determine the limiting shape of the rainflow matrix (RFM) and estimate the limiting RFM $G$ [17]:

$$G = (g_{ij}), \tag{5a}$$

$$g_{ij} = \lim_{z \to \infty} \frac{E[f_{ij}^{z}]}{z}, \tag{5b}$$

where the elements $f_{ij}^{z}$ of $F_z$ are the numbers of rainflow cycles in distance $z$ with a minimum in class $i$ and a maximum in class $j$, and $F_z$ is the rainflow matrix in distance $z$. This approach is based on an asymptotic theory for the crossings of extreme (high and low) levels. First, obtain a measured RFM $F$ [17]:

$$F = (f_{ij}), \tag{6}$$

where $f_{ij}$ is the number of cycles with minimum $i$ and maximum $j$. Then, calculate the LU from $F$ and determine a suitable threshold. The level upcrossings spectrum is calculated as follows [17]:

$$N = (n_k), \tag{7}$$

Fig. 1. Schematic diagram of BMM

Fig. 2. Schematic diagram of POT

where $n_k$ is the cumulative number of cycles running from a load level $i$ below $k$ to a load level $j$ above $k$: $n_k = \sum f_{ij}$, $i$ [...] $0$, $\int K(x)\,dx = 1$, (14)

The main process of NPE is as follows [66]:

1. Transform the measured load time history into a rainflow-counted histogram.

2. Select an appropriate kernel function and bandwidth, then use the nonparametric method in combination with the Monte Carlo method [67] to extrapolate the measured RFM to a lifecycle RFM.

3. Reconstruct a new load spectrum from the lifecycle RFM.

For NPE, much research has been conducted on the selection of the kernel function and bandwidth. Wang et al. [68] proposed a selection method for the kernel function in which a multi-criteria decision-making technique was successfully used to solve the selection problem. For bandwidth selection, Heidenreich et al. [69] reviewed bandwidth selection methods for kernel density estimation, some of which can be used in NPE. Sheather [70] presented two bandwidth determination methods: the Sheather-Jones plug-in bandwidth and least squares cross-validation. The Sheather-Jones plug-in bandwidth is widely used because of its good overall performance, but it is prone to over-smoothing in some situations, in which case least squares cross-validation can serve as a complement. In addition, Bayesian methods [71] and [72] have been used to estimate the adaptive bandwidth and the adaptive bandwidth matrix in univariate and multivariate kernel density estimates (KDEs).

Regarding applications of NPE, Dressler et al. [64] used a kernel density estimator to transform the discrete rainflow matrix into a smooth function that is easier to handle.
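The NPE steps listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: a Gaussian product kernel, a single fixed bandwidth, and a short list of hypothetical rainflow cycles stored as (minimum, maximum) pairs rather than a binned RFM. The names `gaussian_kernel`, `kde_density` and `extrapolate_cycles` are invented for this example; the reviewed studies select the kernel and bandwidth with dedicated methods rather than fixing them by hand.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal kernel: non-negative and integrates to one."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde_density(cycles, point, bandwidth):
    """Product-kernel density estimate at `point` = (minimum, maximum)."""
    total = 0.0
    for lo, hi in cycles:
        total += (gaussian_kernel((point[0] - lo) / bandwidth)
                  * gaussian_kernel((point[1] - hi) / bandwidth))
    return total / (len(cycles) * bandwidth ** 2)


def extrapolate_cycles(cycles, factor, bandwidth, rng):
    """Monte Carlo draw of `factor * len(cycles)` cycles from the KDE.

    Sampling from a Gaussian KDE amounts to picking a measured cycle at
    random and perturbing it with kernel-distributed noise.
    """
    sampled = []
    for _ in range(factor * len(cycles)):
        lo, hi = rng.choice(cycles)
        new_lo = lo + rng.gauss(0.0, bandwidth)
        new_hi = hi + rng.gauss(0.0, bandwidth)
        # Simplification (assumption): reorder so minimum <= maximum.
        sampled.append((min(new_lo, new_hi), max(new_lo, new_hi)))
    return sampled


# Hypothetical rainflow cycles as (minimum, maximum) load pairs.
measured = [(-10.0, 12.0), (-4.0, 5.0), (-25.0, 30.0), (-6.0, 8.0), (-3.0, 2.5)]
rng = random.Random(0)
long_term = extrapolate_cycles(measured, factor=100, bandwidth=2.0, rng=rng)
print(len(long_term), max(hi for _, hi in long_term))
```

Because the kernel spreads probability mass around each measured cycle, the sampled set can contain cycles slightly larger than any measured one, which simple repetition of the measurement cannot produce. The bandwidth controls how far that mass spreads, which is why its selection receives so much attention.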
In the literature, the RFM is regarded as a two-dimensional histogram of the opening and closing points of hysteresis loops, which can only be described by a nonparametric method because of its arbitrary shape. Socie [73] employed nonparametric kernel smoothing techniques to transform the discrete rainflow histogram of cycles into a probability density histogram and extrapolated the short-term measured load to an expected long-term one; that work also points out the key role of the bandwidth in KDE. Johannesson and Thomas [17] considered kernel smoothing to be a feasible smoothing technique and a well-established statistical method for nonparametric estimation, and proposed a kernel smoother to estimate the RFM for cycles with small and moderate amplitudes. Mattetti et al. [74] extrapolated the RFM by NPE when carrying out accelerated structural tests on tractors.

1.3 Quantile Extrapolation Method (QE)

Load extrapolation is difficult when working conditions and operating behaviors vary, as they commonly do in engineering. Under these circumstances, the quantile extrapolation method (QE) can take the various conditions and behaviors into consideration and optimize the extrapolation results. The main process of QE is as follows [64]:

1. Break the data set of the rainflow-counted histogram into a series of clusters $B_1, B_2, \dots, B_m$ with similar variables and damages.

2. Compute the damage of each original RFM $R$ by Miner's rule. Damage vectors [64]:

$$(D_1(R_1), \dots, D_m(R_1)), \ \dots, \ (D_1(R_n), \dots, D_m(R_n))$$

are obtained for all original RFMs, where $R_1, R_2, \dots, R_n$ represent the influence of various conditions and behaviors.

3. Estimate the expected damage for the x% quantile. The quantile damage vector $(q_1, q_2, \dots, q_m)$, which describes the damage distribution between the individual clusters of the rainflow matrix, is used to construct the rainflow matrix. The original rainflow matrices are superposed such that [64]:

$$R_G = R_1 + \cdots + R_n. \tag{15}$$

4. Construct and extrapolate the corresponding RFM. The extrapolation of the resulting matrix [64] is:

$$R_E = \mathrm{extrapol}(R_G), \tag{16}$$

where $R_E$ represents the extrapolated result and is made up of the basic process and peak values.

Socie and Pompetzki [66] described a method for statistically extrapolating a single measured service load time history to an expected long-term load spectrum. Because operating behaviors differ between users, the extrapolation method was extended to combine data from several users. The extrapolated load spectrum then represents the more severe users in the population, and the optimization effect was obvious. Mattetti et al. [74] introduced a method for accelerated tests on tractors and employed QE to calculate rainflow matrices for 20 tasks repeated in five different working forms. In the selected sample, the 95th percentile of the most damaging conditions is considered.

In load spectrum compiling, QE is usually combined with other extrapolation methods, and it is also an important component of computer software.

1.4 Classification of the Extrapolation Methods

The extrapolation methods are integrated into one figure for clarity. Fig. 3 shows the classification and pivotal elements of the methods.

2 CASE ANALYSIS

In this section, some illustrations and examples are presented to evaluate and demonstrate the extrapolation methods.

2.1 Case Analysis of PEE

In PEE, the distributions of the sample data affect the extrapolation results [26]. In this section, the load on an axle shaft of a loader powertrain was taken as the research object. Because a loader's operation has clearly distinct working segments, the operation process was divided into six sections.
In this paper, the loads on the axle shaft in the spading and the no-load backward sections are used to illustrate the characteristics of PEE, with a focus on load amplitudes that have different characteristics and distributions. The Weibull distribution was employed, with the fitting results shown in Fig. 4 and Fig. 5. Compared with Fig. 4a, the fit in the middle of Fig. 5a diverges more markedly from the distribution function. In Fig. 5b, the tail of the fit diverges seriously from the skew line, as in Fig. 5c. Based on this comparison, the fit between the function and the load is better in the spading section. So, when PEE is applied to extrapolating the load on an axle shaft with different characteristics, the repeatability of the result will be affected. The distributions of the sample data therefore influence the fitting error in PEE, and the fitting error leads to inaccuracy in the extrapolation results.

2.2 Case Analysis of EVE

In EVE, both the data extraction and the selection of the fitting function affect the extrapolation results. During data extraction, the selection of the threshold or block size is important [75] to [78], as it influences the data utilization ratio and the distribution characteristics of the extreme values. Fitting precision varies with the load characteristics, and the extrapolation results of EVE depend on the fitting precision. Several examples illustrate the influence of different thresholds on the fitting precision. In this section, the load on an axle shaft of a loader powertrain in the spading section was used. The automatic threshold selection method proposed in Thompson [79] was adopted to determine the original threshold. Based on the sample data, the automatic threshold was set to 2897.3 Nm, which yielded 5734 exceedances.
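The sensitivity of EVE to the threshold can be illustrated with a short Python sketch that extracts exceedances at several thresholds and fits a generalized Pareto distribution to them. Several assumptions apply: the peaks are synthetic (they are not the loader data), the thresholds are arbitrary, and the parameters are estimated by the method of moments, which is substituted here only to keep the example free of external libraries; the case study itself uses maximum likelihood. The function names are invented for this example.

```python
import random


def extract_excesses(peaks, threshold):
    """Return the amounts by which peaks exceed the threshold (POT step)."""
    return [p - threshold for p in peaks if p > threshold]


def fit_gpd_moments(excesses):
    """Method-of-moments estimates (shape, scale) of a GPD for the excesses.

    Uses mean = scale / (1 - shape) and
    variance = scale**2 / ((1 - shape)**2 * (1 - 2 * shape)),
    which hold for shape < 1/2.
    """
    n = len(excesses)
    mean = sum(excesses) / n
    var = sum((e - mean) ** 2 for e in excesses) / (n - 1)
    ratio = mean * mean / var          # equals 1 - 2 * shape in theory
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * mean * (ratio + 1.0)
    return shape, scale


# Synthetic torque peaks in Nm (assumption): 2000 plus an exponential
# tail with mean 500, so the true GPD shape is 0 and the true scale is 500.
rng = random.Random(0)
peaks = [2000.0 + rng.expovariate(1.0 / 500.0) for _ in range(20000)]

results = {}
for threshold in (2500.0, 3000.0, 3500.0):
    excesses = extract_excesses(peaks, threshold)
    shape, scale = fit_gpd_moments(excesses)
    results[threshold] = (len(excesses), shape, scale)
    print(threshold, len(excesses), round(shape, 3), round(scale, 1))
```

Raising the threshold reduces the number of exceedances available for fitting, so the estimates become more variable. This is the trade-off between data utilization and fitting precision that the case study examines.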
Fig. 3. Classification of the extrapolation methods

The exceedances were fitted with a generalized Pareto distribution (GPD) whose parameters were estimated by the maximum likelihood method, and the results are shown in Fig. 6. To show the effect of the threshold on fitting precision, 2000 Nm and 3500 Nm were selected as alternative thresholds to extract values, thus