The increasing use of artificial neural networks in the prediction of yarn quality properties calls for constant improvement of the models. This research work reports the use of a novel training algorithm, christened extreme learning machines (ELM), to predict yarn tensile strength (strength). ELM was compared to the Backpropagation (BP) algorithm and to a hybrid algorithm composed of differential evolution and ELM, named DE-ELM. The three yarn strength prediction models were trained up to a mean squared error (mse) of 0.001. This is an arbitrary level of mse that was selected to enable a comparative study of the performance of the three algorithms. According to the results obtained in this research work, the BP model needed the longest training time, while the ELM model recorded the shortest training time. The training time of the DE-ELM model fell between those of the other two models. The correlation coefficient (R2) of the BP model was lower than that of the ELM model. Of the three models, the DE-ELM model gave the highest R2 value.
Yarn quality properties can be predicted by modelling selected inputs and outputs of the cotton spinning system. This approach is widely applied in the study of the fiber to yarn process, where yarn properties can be predicted using mathematical, statistical and artificial neural network (ANN) models, among others. Comparisons of ANN models with mathematical and statistical models have also been undertaken, and the reports indicated that the ANN models have comparatively higher prediction efficiency (Guha et al. 2001; Ureyen and Gurkan 2008a, b; Majumdar and Majumdar 2004). The efficiency of the ANN algorithms has enabled the design of yarn quality prediction models, which can be used in the spinning industry (Furferi and Gelli 2010). The increasing use of ANN models in the textile industry warrants more attention to ensure that necessary improvements are made. This is expected to improve the cotton spinning process. For these reasons, this study set out to improve the ANN models used in the prediction of cotton yarn tensile strength (strength). This was done by comparing the working of Backpropagation (BP) models with the Extreme Learning Machines (ELM) algorithm during the prediction of yarn tensile strength. Further improvement of the ELM algorithm was undertaken to produce more efficient prediction models based on a hybrid of the differential evolution (DE) and ELM algorithms, christened the DE-ELM algorithm.
Yarn quality prediction models
The design of ANN models suitable for the prediction of yarn quality properties could take a variety of forms. As explained by Cybenko (1989), an ANN model with one hidden layer is robust enough and can be used to design yarn property prediction models. The three layers of the ANN model can be designed using a multi-layer perceptron (MLP). In this research work a one hidden layer MLP was designed and used for the prediction of yarn tensile properties. The architecture of an MLP with one hidden layer, as explained by several researchers (Ham and Kostanic 2003; Huang et al. 2006a), consists of several elements: input to hidden layer weights, hidden layer biases, the hidden layer transfer function, hidden layer to output layer weights, output layer biases and the output layer transfer function. The training of the MLP used to predict yarn properties in this research work was initially implemented using a BP based algorithm, namely Levenberg-Marquardt. The BP based algorithms select the initial weights and biases using a random process, which is likely to search for the weights and biases in a local area of the vector space, hence leading to the problem of local minima. The selected weights and biases are normally updated using an iterative process. The iterative process could, however, slow down the working of the algorithms.
Attempts to improve the efficiency and speed of the BP algorithms were reported by Huang et al. (2006a, b), who suggested that the efficiency and speed of the ANN model can be improved by randomly selecting the input weights and hidden layer biases. This is because the input weights and hidden layer biases of an MLP then need not be iteratively updated. If the output layer transfer function is eliminated, the network becomes a linear system, and hence the hidden layer to output layer weights can be determined analytically. These modifications introduced a new training algorithm christened ELM. ELM has been tested by Huang et al. (2006a) in several fields, which include medical and forestry studies, and proved to be faster and more efficient than the BP algorithm. To date there are no reports available in the public domain on the use of ELM in the cotton spinning process. This paper attempts to fill this void.
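The ELM idea described above can be illustrated with a minimal sketch. It is not the implementation used in this research work: the sigmoid hidden layer, the uniform sampling range of -1 to 1, the function names `elm_train` and `elm_predict`, and the synthetic data are all assumptions made for illustration. The input weights and hidden biases are drawn once at random, and the output weights are obtained in a single least-squares step using the pseudoinverse.

```python
import numpy as np

def elm_train(X, y, n_hidden, rng):
    """Train a single-hidden-layer network in the ELM manner (illustrative)."""
    n_features = X.shape[1]
    # Input weights and hidden biases are drawn at random and never updated.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Sigmoid hidden layer output (assumed transfer function).
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    # Linear output layer: least-squares solution of H @ beta = y.
    beta = np.linalg.pinv(H) @ y
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Synthetic stand-in data: 96 samples with 14 inputs (not the yarn data).
rng = np.random.default_rng(0)
X = rng.normal(size=(96, 14))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

W, b, beta = elm_train(X, y, n_hidden=41, rng=rng)
train_mse = float(np.mean((elm_predict(X, W, b, beta) - y) ** 2))
print(f"training mse: {train_mse:.6f}")
```

Because no iterative weight update takes place, the cost of training is dominated by one pseudoinverse computation, which is consistent with the short training times reported for ELM.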
While ELM may be faster than BP algorithms, there is still room for improvement. Given that ELM computes the output weights based on prefixed input weights and hidden layer biases, there is a possibility that a set of non-optimal or unnecessary input weights and hidden layer biases is selected. Furthermore, the problem of local minima, which is common in BP algorithms, may also exist in ELM, albeit to a lower degree. As suggested by Zhu et al. (2005), the problems experienced while using ELM as a network training algorithm can be minimized by using the DE algorithm for the initial selection of weights and biases. This idea can be implemented by combining the DE and ELM algorithms to form a hybrid training algorithm. The hybrid algorithm is hereafter referred to as DE-ELM for lack of a better name.
Designing and training of prediction models
Input factors and data pre-processing
The cotton fiber-to-yarn process involves processing cotton lint through a set of machines to produce yarn. The manufactured yarn is expected to meet some quality standards so that it can be produced at optimum productivity and perform within set standards in the subsequent processes. The quality of cotton yarn can be evaluated using yarn quality properties, which include yarn elongation, strength, work of rupture, hairiness and unevenness. The aim of this research work was to study the prediction of yarn strength. While the selection of input factors was based on published works (Mwasiagi et al. 2008, 2012) (see Table 1), data pre-processing was also undertaken to ensure better performance of the models. The cotton lint and yarn samples were collected from Kenyan factories and tested according to testing standards in a fiber and yarn laboratory. After collection and testing of the samples, the data used in this research work was prepared. The collected data consisted of 144 samples, each made up of 19 input factors. The data was subdivided at random into three sets (training, validation and testing) in the ratio of 4:1:1, respectively. The validation data was used to limit overfitting of the BP network.
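As an illustration of the random 4:1:1 subdivision, the sketch below splits 144 sample indices into training, validation and testing sets of 96, 24 and 24 samples. The function name `split_indices` and the random seed are assumptions; the original work does not state how the random split was generated.

```python
import numpy as np

def split_indices(n_samples, ratio=(4, 1, 1), seed=0):
    """Randomly split sample indices into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # random ordering of all samples
    total = sum(ratio)
    n_train = n_samples * ratio[0] // total
    n_val = n_samples * ratio[1] // total
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]              # remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_indices(144)
print(len(train_idx), len(val_idx), len(test_idx))  # 96 24 24
```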
To further ensure a high quality of the input data, the 19 input factors were pre-processed using principal component analysis (PCA) as discussed by Chattopadhyay et al. (2001) and Bernstein et al. (1988). PCA was designed to come up with a new set of variables that have as little correlation with one another as possible. The level of correlation allowable can be determined based on the percentage confidence limit selected by the researcher. In this research work a 95 % confidence limit was selected which reduced the data set to 14 inputs. As given in Table 1, the five inputs removed from the initial data set were ring frame draft, traveller weight, fibre reflectance, fibre trash weight and fibre trash area.
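The text does not spell out how the 95 % limit was applied. One common reading, assumed here purely for illustration, is that enough principal components are retained to explain 95 % of the total variance. The sketch below computes that count on synthetic correlated data; the function name `components_for_variance` is hypothetical. Note that it counts components and does not reproduce the selection of specific input factors listed in Table 1.

```python
import numpy as np

def components_for_variance(X, threshold=0.95):
    """Number of principal components needed to reach the variance threshold."""
    # Standardize so that every input factor contributes on the same scale.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # Squared singular values are proportional to the component variances.
    s = np.linalg.svd(Xs, compute_uv=False)
    explained = s ** 2 / np.sum(s ** 2)
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative)), cumulative

# Synthetic stand-in for 144 samples of 19 correlated input factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(144, 5))
mixing = rng.normal(size=(5, 19))
X = latent @ mixing + 0.1 * rng.normal(size=(144, 19))

k, cumulative = components_for_variance(X, 0.95)
print(f"components needed for 95 % of the variance: {k}")
```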
Data pre-processing undertaken in this research work also included data normalization, which is a process of scaling the input factors in a data set so as to improve the accuracy of the subsequent numeric computations. One way to normalize an input factor is to subtract its mean from each of its values and then divide the result by its standard deviation (Demuth et al. 2005; MathWorks et al. 2004). Data normalization is necessary to ensure that the operation of the network is optimized. Long training times can be caused by the presence of an input factor whose magnitude is much larger or smaller than that of the other input factors. By normalizing the data as described above, all the values of the data fall within a comparable range, and the impact of the input factors can be judged based on the pattern shown by the input factors rather than on their numeric magnitude. Data normalization therefore has two main advantages: it reduces the scale of the network and ensures that input factors with large numeric values do not overshadow those with smaller numeric values. After the network had been trained, the outputs were transformed back to the original scale by reversing the normalization process, so that the results could be interpreted with ease.
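The normalization and its reversal can be sketched as follows. The function names and the synthetic data are assumptions for illustration; the statistics are computed once and reused, so that reversing the transformation recovers the original values exactly.

```python
import numpy as np

def fit_scaler(X):
    """Per-factor mean and standard deviation."""
    return X.mean(axis=0), X.std(axis=0)

def normalize(X, mean, std):
    return (X - mean) / std

def denormalize(Z, mean, std):
    return Z * std + mean

# Synthetic factors with very different numeric magnitudes.
rng = np.random.default_rng(0)
X = rng.normal(size=(144, 3)) * np.array([1.0, 100.0, 0.01]) + np.array([5.0, 300.0, 0.2])

mean, std = fit_scaler(X)
Z = normalize(X, mean, std)               # each column now has mean 0 and std 1
X_back = denormalize(Z, mean, std)        # original scale recovered
```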
Designing and training of prediction models
The architecture of the BP strength prediction model was designed using Cybenko theorem (Cybenko 1989). Using the network design procedure reported by Mwasiagi et al. (2012), the final BP network had 14 inputs, one input layer, one hidden layer and one output (yarn strength). The ELM and DE-ELM were designed according to the reports of Huang et al. (2006a) and Zhu et al. (2005) respectively.
The BP yarn strength prediction models were trained using the Levenberg-Marquardt Backpropagation (LMBP) algorithm, which is one of the faster BP training algorithms used in the training of prediction models (Hagan and Menhaj 1994; Demuth et al. 2005), until the set target error of 0.001 was attained. This is an arbitrary level that was selected for the purpose of comparing the three algorithms used in this research work. The performance of the strength prediction model trained using the BP algorithm was monitored as the number of hidden layer neurons was increased from 2 in steps of 1 until the set target error of 0.001 was attained.
To study the predictive power of the algorithms, three performance factors were considered: mean squared error (mse), training time and correlation coefficient (R2). These are the factors commonly reported by many researchers (Cheng and Adams 1995; Chattopadhyay et al. 2004; Desai et al. 2004; Majumdar and Majumdar 2004; Majumdar et al. 2005; Huang et al. 2006a; Ureyen and Gurkan 2008a, b; Mehment 2009). The ELM models were trained in the same manner as the Levenberg-Marquardt BP (LMBP) models, except that the training algorithm used was ELM. Finally, the yarn strength prediction model was trained using the DE-ELM algorithm. The performance of the DE-ELM yarn strength prediction models was monitored as the number of generations was varied from 1 to 10 in steps of 1. For every generation number, the performance of the DE-ELM model was also monitored as the number of neurons was varied from 1 to 10.
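The three performance factors can be computed as in the sketch below. It assumes that R2 denotes the squared Pearson correlation between measured and predicted values, which the text does not state explicitly; the helper names and the four sample values are hypothetical.

```python
import time
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between measured and predicted values."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def r_squared(y_true, y_pred):
    """Squared Pearson correlation (assumed meaning of R2)."""
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return float(r ** 2)

def timed(fn, *args):
    """Run fn(*args) and return its result with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Hypothetical measured and predicted strength values.
measured = np.array([10.0, 12.0, 14.0, 16.0])
predicted = measured + np.array([0.1, -0.1, 0.1, -0.1])

err, elapsed = timed(mse, measured, predicted)
r2 = r_squared(measured, predicted)
```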
Results and discussions
Prediction of yarn strength using BP algorithms
The yarn strength prediction model, trained using the BP algorithms, was used to predict yarn strength using the 14 inputs as discussed earlier. The performance of the strength prediction model, trained using the BP algorithm as the number of neurons was varied from 2 in steps of 1 until the set target error of 0.001 was attained, is given in Table 2. The strength model was able to attain the set target error when the number of neurons in the hidden layer reached 10.
The BP trained strength model exhibited a typical network behavior whereby the mse showed a steady improvement as the number of neurons in the hidden layer increased. The mse reached the set target (0.001) when the number of hidden neurons was 10. The R2 value measured using the testing data was 0.917.
Prediction of yarn strength using ELM algorithm
The prediction of yarn strength using the ELM model was carried out in the same manner as for the BP trained model, and the results are given in Table 3.
The performance of the ELM strength prediction model improved rapidly especially when the number of neurons was varied from 2 to 15 in steps of 1. Thereafter the change in the mse value was relatively smaller, with the set target error of 0.001 being attained when the number of neurons was 41.
The time needed for network training kept increasing as the number of neurons was increased. When the number of neurons was 41 and the set target error (0.001) had been attained, the time needed for training was 0.0311 s. This is much lower than the 1.438 s needed by the BP trained model. The much lower training time of the ELM models could be due to the time saved by not iteratively updating the weights and biases, as is done in the BP trained models. As reported by Huang et al. (2006a), this is one of the main advantages of the ELM trained models when compared to the BP models.
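The procedure of adding one neuron at a time until the target error is reached, while recording the training time, can be sketched as follows. This is an illustration on synthetic data, not a reproduction of Table 3; the ELM trainer, the upper limit of 60 neurons and the function names are assumptions.

```python
import time
import numpy as np

def train_elm_mse(X, y, n_hidden, rng):
    """Train an ELM with n_hidden neurons and return its training mse."""
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    beta = np.linalg.pinv(H) @ y
    return float(np.mean((H @ beta - y) ** 2))

def grow_until_target(X, y, target=0.001, start=2, max_hidden=60, seed=0):
    """Increase the hidden layer in steps of 1 until the mse target is met."""
    rng = np.random.default_rng(seed)
    log = []                                   # (neurons, mse, seconds)
    for n_hidden in range(start, max_hidden + 1):
        t0 = time.perf_counter()
        err = train_elm_mse(X, y, n_hidden, rng)
        log.append((n_hidden, err, time.perf_counter() - t0))
        if err <= target:
            break                              # target attained, stop growing
    return log

# Synthetic stand-in data (not the yarn data).
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 14))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]

log = grow_until_target(X, y)
for n_hidden, err, seconds in log[-3:]:
    print(n_hidden, f"{err:.6f}", f"{seconds:.4f} s")
```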
The 41 neurons strength prediction model recorded an R2 value of 0.988, when exposed to the testing data. This is an improvement of 7.7 % when compared to the BP trained model, and it could be an indication of better generalization and ability to avoid local minima trap.
The ELM yarn strength prediction model therefore recorded a shorter training time and a better correlation coefficient (R2), but needed many more neurons in the hidden layer than the BP trained yarn strength model.
Prediction of yarn strength using DE-ELM algorithm
Having established that the ELM model needed 41 neurons in the hidden layer to attain the set target error of 0.001, DE-ELM was used to train the yarn strength prediction model with an aim of reducing the number of neurons. In the DE-ELM hybrid training algorithm the initial selection of the weights and biases was done using the differential evolution algorithm which is a global search algorithm, and thereafter they were updated analytically using the ELM algorithm. Using the DE algorithm the number of generations (G) was increased from 1 to 10 in steps of 1 and the number of neurons was varied from 2 to 10 in steps of 1. The results of the above-mentioned experiments are given in Table 4.
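A simplified sketch of the hybrid idea is given below. It is not the algorithm of Zhu et al. (2005) as used in this research work: it assumes a basic DE/rand/1/bin scheme, uses the training mse as the only fitness measure, and all parameter values (`pop_size`, `F`, `CR`, the seed) and function names are hypothetical. Each DE individual encodes the input weights and hidden biases, and the output weights are computed analytically for every candidate.

```python
import numpy as np

def hidden_output(X, theta, n_hidden):
    """Decode theta into input weights and biases, return sigmoid outputs."""
    n_features = X.shape[1]
    W = theta[: n_features * n_hidden].reshape(n_features, n_hidden)
    b = theta[n_features * n_hidden:]
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def fitness(theta, X, y, n_hidden):
    """Training mse after solving the output weights analytically (ELM step)."""
    H = hidden_output(X, theta, n_hidden)
    beta = np.linalg.pinv(H) @ y
    return float(np.mean((H @ beta - y) ** 2)), beta

def de_elm(X, y, n_hidden=2, generations=5, pop_size=20, F=0.5, CR=0.9, seed=0):
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_hidden + n_hidden
    pop = rng.uniform(-1.0, 1.0, size=(pop_size, dim))
    scores = np.array([fitness(p, X, y, n_hidden)[0] for p in pop])
    history = [float(scores.min())]            # best mse per generation
    for _ in range(generations):
        new_pop, new_scores = pop.copy(), scores.copy()
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = a + F * (b - c)           # DE/rand/1 mutation
            cross = rng.random(dim) < CR       # binomial crossover mask
            cross[rng.integers(dim)] = True    # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            trial_score = fitness(trial, X, y, n_hidden)[0]
            if trial_score <= scores[i]:       # greedy selection
                new_pop[i], new_scores[i] = trial, trial_score
        pop, scores = new_pop, new_scores
        history.append(float(scores.min()))
    best = pop[np.argmin(scores)]
    return best, fitness(best, X, y, n_hidden)[1], history

# Synthetic stand-in data (not the yarn data).
rng = np.random.default_rng(0)
X = rng.normal(size=(96, 14))
y = np.tanh(X[:, 0]) + 0.3 * X[:, 1]

best_theta, beta, history = de_elm(X, y, n_hidden=2, generations=5)
print([f"{h:.5f}" for h in history])
```

Because selection is greedy, the best mse cannot get worse from one generation to the next, which mirrors the trend of improvement with the number of generations. The extra fitness evaluations per generation are also the reason such a hybrid is expected to train more slowly than plain ELM.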
The results of using DE-ELM to train the yarn prediction model, as given in Table 4, indicated that the set target mse value of 0.001 could be attained with every hidden layer size tested (2 to 10 neurons). The only difference was in the number of generations needed. The 2 neuron model needed the most generations (5), the 10 neuron model needed 3 generations, and all the other models fell in between. It is therefore clear that while the ELM model needed 41 neurons in the hidden layer to attain the set target mse value (0.001), the DE-ELM strength prediction model could attain the set target error with as few as 2 neurons in the hidden layer and 5 generations of the DE-ELM algorithm. Two neurons is also far fewer than the 10 neurons needed by the BP trained model. The use of DE for the initial selection of the weights and biases therefore seems to improve the performance of the ELM model drastically.
The general trend of the DE-ELM model was such that the mse value improved as the number of neurons and generations were increased. In Table 5, a boundary between the models which did not attain the set target error (0.001) and those which attained the set target is shown. It is worth noting that the R2 values for the models are all higher (0.990 and above) than the R2 value depicted by the BP trained models (R2 value of 0.917) and the ELM trained models (R2 value of 0.988).
The ability of the DE-ELM algorithm to reach the set mse target (0.001) while using fewer neurons in the hidden layer could be attributed to the fact that it was able to optimize the selection of the initial weights and biases. This better performance came at the cost of more training time when compared to the ELM model. However, the time taken by the DE-ELM model is much lower than the time needed by the BP model.
In summary it is clear that the DE-ELM model inherited the advantages of the ELM models discussed earlier on. The only disadvantage is the higher training time, which could be due to the fact that the DE algorithm needs time to first select the weights and biases.
Comparison of BP, ELM and DE-ELM strength prediction models
Three types of models have been designed and trained to predict yarn strength in this research work. The performance of the yarn strength prediction models can be compared by using Table 6.
The ELM model needed many neurons in the hidden layer (41) to reach the set target error of 0.001. This is one of the disadvantages of the ELM algorithm as reported by Zhu et al. (2005). The ELM algorithm needed less time to train, when compared to the other models. The DE-ELM model gives very good performance with a reduced number of neurons and a higher R2 value. Its training speed is slower than that of the ELM model but still much faster than that of the BP model.
The predicted and measured values for the 2 neuron model with 5 generations are given in Fig. 1. The predicted strength values traced the measured values so closely that the success rate was 99.2 %. This implies that the error rate is <1 %. This could be a sign of very good network generalization.
Yarn strength prediction models using BP, ELM and DE-ELM algorithms were designed and trained up to an mse of 0.001. The performance of the BP algorithm was compared to that of two non-BP algorithms, namely ELM and DE-ELM, during the prediction of yarn strength.
The BP trained model needed 10 neurons in the hidden layer and 1.438 s to attain the set mse target of 0.001, making it the slowest of the three algorithms. The ELM model exhibited the shortest training time (0.0311 s) but needed 41 neurons in the hidden layer. The DE-ELM hybrid model needed 2 neurons in the hidden layer and 5 generations, and its training time (0.7188 s) was shorter than that of the BP model but much longer than that of the ELM model. The hybrid model (DE-ELM) gave the highest prediction efficiency (R2 of 0.992), while the BP and ELM models recorded R2 values of 0.917 and 0.988, respectively.
Bernstein, I. R., Garbin, C. P., & Teng, G. K. (1988). Applied multivariate analysis (pp. 157–197). Berlin: Springer.
Huang, G. B., Chen, L., & Siew, C. K. (2006a). Universal approximation using incremental constructive feedforward networks with random hidden nodes. IEEE Transactions on Neural Networks, 17(4), 879–892.
Majumdar, A., Majumdar, P. K., & Sarkar, B. (2005). Application of an adaptive neuro-fuzzy system for the prediction of cotton yarn strength from HVI fibre properties. Journal of the Textile Institute, 96(1), 55–60.
Ureyen, M. E., & Gurkan, P. (2008a). Comparison of artificial neural network and linear regression models for prediction of ring spun yarn properties. I: Prediction of yarn tensile properties. Fibers and Polymers, 9(1), 87–91.
Ureyen, M. E., & Gurkan, P. (2008b). Comparison of artificial neural network and linear regression models for prediction of ring spun yarn properties. II: Prediction of yarn hairiness and unevenness. Fibers and Polymers, 9(1), 92–96.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.