Intelligent fault diagnosis system based on big data

To address practical problems in life-cycle health monitoring and diagnosis of large complex equipment, machine-learning algorithms are applied to mine equipment operation big data, an expert knowledge base is established, and the diagnosis rules related to faults are obtained, so that intelligent online monitoring and remote diagnosis of the equipment health condition are realised. The system uses an uncertain fault prediction method and a hybrid intelligent algorithm to discover the hierarchical association between operation feature big data and operation faults, to extract the features of operation faults, and to diagnose operation faults intelligently. It effectively improves the sensitivity, robustness, and accuracy of monitoring and diagnosis. On a cloud service platform based on the Internet of things, the system realises intelligent fault prediction and diagnosis, establishes a proactive maintenance system, improves production efficiency, and ensures production safety.


Introduction
With the continuous development of science and technology, fault diagnosis has become a basic technology of equipment management and maintenance in industrial production. It plays an important role in ensuring the safe and reliable operation of equipment, improving the efficiency of management and maintenance, and generating economic benefits.
The history of equipment fault diagnosis technology includes three stages: the first stage was based on material life analysis and estimation; the second stage was based on sensor and computer technology; and the third stage is intelligent diagnosis. Intelligent diagnosis has been developed gradually in recent years, as the development of artificial intelligence (AI) technology provides intelligent technical solutions for equipment fault diagnosis. It is an inevitable trend that the diagnostic process based on numerical computation and signal processing will be replaced by the diagnostic process based on knowledge processing and knowledge reasoning.
Fault diagnosis is a comprehensive problem that involves cybernetics, signal processing, pattern recognition, computer science, AI, electronic technology, statistical mathematics, and other disciplines. With the deep integration of information technology and industrialisation, the application of a new generation of information technology, such as the Internet, the Internet of things (IOT), big data, and cloud computing, has driven industry into a new era. The volume of data produced, collected, and processed by enterprise equipment has also increased greatly. Through advanced technologies such as health condition perception, high-speed data transmission, distributed computing, and diagnostic analysis brought by the Internet and the IOT, information technology and the industrial system are deeply integrated, and the research and development of enterprises are being renewed [1].
The condition monitoring and fault diagnosis of large complex equipment, such as gas turbines, is usually characterised by complexity, uncertainty, correlation, and hierarchy [2]. Therefore, it is difficult to diagnose weak and compound faults accurately and in time. At the same time, because of the complexity of the equipment and the high degree of automation of the system, the amount of data to be analysed is very large, and it is unrealistic to analyse the data manually. In recent years, AI technologies, such as expert systems, fuzzy logic, neural networks (NN), and genetic algorithms, have been applied to the intelligent fault diagnosis of equipment and have achieved remarkable results in practice [3][4][5]. However, these methods are effective only under certain conditions. Their application is greatly restricted by problems such as incomplete diagnostic information, manually determined fuzzy membership functions, knowledge acquisition in expert systems, and imbalanced training samples. Therefore, hybrid intelligent fault diagnosis technology, which combines AI with modern signal processing and feature extraction methods, has emerged in the last few years [6].
This paper studies an intelligent fault diagnosis system based on big data. To address the problems in life-cycle health monitoring and diagnosis of enterprise-class equipment, machine-learning algorithms are applied to mine equipment data, an expert knowledge base is established, and the diagnosis rules related to faults are obtained, so that intelligent and efficient online monitoring and remote diagnosis of the equipment health condition are realised. Based on learning from big data, the system uses the uncertain fault prediction method to realise the hierarchical association of operation feature big data and operation faults, the feature extraction of operation faults, and fault diagnosis based on the hybrid intelligent algorithm. On a cloud service platform based on the IOT, the system realises intelligent fault prediction and diagnosis for complex equipment and establishes a proactive maintenance system for the enterprise. The effectiveness of the system has been verified in engineering applications. Compared with traditional fault diagnosis systems, the system can effectively improve the sensitivity, robustness, and accuracy of monitoring and diagnosis, predict faults earlier, and reduce the misdiagnosis rate and the missed diagnosis rate.

Intelligent fault diagnosis method
The traditional fault diagnosis method based on a mathematical model is used to understand the variation of parameters and the scope of the theory in the equipment operation process, and is designed to analyse the relationship between various abnormal conditions and the parameters of the model. Because of the complexity of large complex equipment, it is difficult to build an accurate model, and thus it is not easy to fulfil the diagnosis task effectively. Therefore, intelligent fault diagnosis technology was introduced to solve this problem [7]. Intelligent fault diagnosis is developed on the basis of expert systems, fuzzy mathematics, fault tree analysis (FTA), and machine-learning methods [8].

Expert system
An expert system is a computer system that can solve complex problems in a certain field with a large number of knowledge-based rules and reasoning methods. The goal of the expert system is to convert the knowledge and experience of domain experts into a formal representation, and to simulate the reasoning process of human experts on a computer.
An expert system usually consists of six parts: human-machine interaction interface, knowledge base, reasoning machine, interpreter, integrated database, and knowledge acquisition, as shown in Fig. 1.
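The interplay of knowledge base and reasoning machine can be illustrated with a minimal forward-chaining sketch. The rule contents and fact names below are hypothetical and chosen only for illustration; they are not the rules of the system described in this paper.

```python
# Knowledge base: each rule is (set of required facts, fact to conclude).
# All rule contents are hypothetical examples.
RULES = [
    ({"high_vibration", "high_bearing_temperature"}, "bearing_wear_suspected"),
    ({"bearing_wear_suspected", "metal_particles_in_oil"}, "bearing_damage"),
]


def forward_chain(facts, rules):
    """Reasoning machine: apply rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


observed = {"high_vibration", "high_bearing_temperature", "metal_particles_in_oil"}
derived = forward_chain(observed, RULES)
```

Here the second rule fires only after the first has added its conclusion, which is the chaining behaviour a reasoning machine provides on top of a flat rule list.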

Fuzzy mathematics
The fault condition of many diagnostic objects is fuzzy. An effective way to diagnose such faults is to apply fuzzy mathematics theory. The diagnosis method based on fuzzy mathematics does not need to establish an accurate mathematical model. With the proper use of local functions and fuzzy rules, the intelligent fuzzy diagnosis can be realised through fuzzy reasoning [9]. The intelligent diagnosis method based on fuzzy mathematics is shown in Fig. 2.
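As a rough sketch of fuzzy reasoning, the following example fuzzifies two measurements with triangular membership functions and evaluates a single rule. The breakpoints, the variable names, and the rule itself are assumptions made for illustration, not values from the described system.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def bearing_fault_degree(vibration_mm_s, temperature_c):
    # Fuzzify the crisp measurements (breakpoints are hypothetical).
    vibration_high = triangular(vibration_mm_s, 4.0, 8.0, 12.0)
    temperature_high = triangular(temperature_c, 70.0, 90.0, 110.0)
    # Rule: IF vibration is high AND temperature is high THEN bearing fault.
    # The fuzzy AND is taken as the minimum of the two memberships.
    return min(vibration_high, temperature_high)
```

The result is a degree between 0 and 1 rather than a yes/no answer, which is what allows fuzzy diagnosis to work without an accurate mathematical model.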

Fault tree analysis
FTA is built on prior knowledge of faults and their causes. The diagnosis process starts with a system fault and keeps asking 'why does this happen' to add sub-trees step by step. Through this heuristic search, the root cause of the fault is revealed. Fault tree-based diagnosis methods are widely used in practical engineering, but most of them are combined with other methods.
When the fault tree is established, the most undesired fault condition is first analysed as the top event of the fault tree. Then, all direct causes that lead to this failure are found, followed by all the direct factors that lead to each of these next-level events. The analysis proceeds layer by layer until the events can no longer be decomposed [10]. The establishment process of the fault tree model is shown in Fig. 3.
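The layered structure of a fault tree can be sketched as a small recursive evaluation over AND/OR gates. The tree below, including all event names, is a hypothetical example rather than a tree taken from the described system.

```python
# Each intermediate event maps to (gate, children); names are hypothetical.
FAULT_TREE = {
    "turbine_trip": ("OR", ["lubrication_failure", "overspeed"]),
    "lubrication_failure": ("AND", ["main_pump_failed", "backup_pump_failed"]),
}


def occurs(event, tree, active_basic_events):
    """Return True if `event` occurs given the set of active basic events."""
    if event not in tree:  # leaf: a basic event that cannot be decomposed
        return event in active_basic_events
    gate, children = tree[event]
    results = [occurs(child, tree, active_basic_events) for child in children]
    return all(results) if gate == "AND" else any(results)
```

For instance, `occurs("turbine_trip", FAULT_TREE, {"main_pump_failed"})` is false because the AND gate also requires the backup pump to fail.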

Machine learning
Machine learning promises an answer to many of the old and new challenges of manufacturing and is widely discussed by researchers and practitioners alike. These data-driven approaches are able to find highly complex and non-linear patterns in data and to transform raw data into mathematical models, which are then applied for prediction, detection, classification, regression, or forecasting.
However, the field of machine learning is very diverse, and many different algorithms, theories, and methods are available. For many manufacturing practitioners, this represents a barrier to the adoption of these powerful tools and thus may hinder the utilisation of the vast amounts of data that are increasingly available. Machine-learning methods can be divided into two categories: traditional machine-learning algorithms and neural networks.

Traditional algorithms:
Classification algorithms: The study of parallel or improved strategies for different classification algorithms, such as support vector machine classification, decision tree classification, and NN classification, has become the main research direction of classification learning in the big data environment. In recent years, the deep learning technology developed on the basis of NN has gradually demonstrated its unique advantages in dealing with big data. With more hidden layers, the multilayer NN has more flexible and richer expressiveness than the perceptron. It can be used to build more complex mathematical models and can reveal the complex and rich information hidden in mass data, thus making more accurate predictions.
Clustering algorithms: Clustering learning is one of the first methods used in pattern recognition and data mining tasks, and is used to study big databases in various applications. The classical clustering algorithms face many challenges in the big data environment, such as large data volume and high data dimension. Improving the existing clustering algorithms and proposing new clustering algorithms are the key problems of big data clustering research.
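To make the clustering idea concrete, the sketch below runs k-means (Lloyd's algorithm) on one-dimensional values. The choice of k-means, the sample values, and the initial centroids are assumptions for illustration; the text above does not name a specific clustering algorithm.

```python
def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm on 1-D points with caller-supplied initial centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if empty.
        centroids = [sum(c) / len(c) if c else old
                     for c, old in zip(clusters, centroids)]
    return centroids, clusters


# Hypothetical vibration amplitudes with two operating regimes.
AMPLITUDES = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centres, groups = k_means(AMPLITUDES, [0.0, 6.0])
```

Every iteration scans all points, which is why the data volume and dimension mentioned above become the limiting factors for classical clustering on big data.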

Deep neural network:
In knowledge acquisition, the NN does not need knowledge engineers to organise, summarise, and digest the expert's knowledge; it only needs to be trained with examples of problems solved by domain experts. In knowledge representation, knowledge is expressed implicitly in the NN, and the knowledge about a certain problem is expressed in the same network. The network therefore offers high universality, easy access to knowledge, and parallel associative reasoning. In knowledge reasoning, the NN achieves reasoning through interactions between neurons.
The expression, self-organising, and self-learning ability of NN can overcome the defects when the input data are not fully understandable.
The application of NN in fault diagnosis is mainly concentrated in two aspects: one is a classifier to recognise the fault pattern, and the other is a dynamic prediction model to predict the fault. The NN of the intelligent fault diagnosis system has three types of layers: input layer, middle layer, and output layer, as shown in Fig. 4.
The input layer is responsible for receiving all kinds of collected information from the system. The middle layer, which is also called the hidden layer, contains multiple hidden neurons, whose outputs are connected to the inputs of other neurons through weight coefficients. A deep neural network (DNN) is an artificial neural network (ANN) with multiple hidden layers between the input and output layers. The output layer gives the fault classification result after this transformation. After network training is completed, the NN is able to give classification diagnosis results quickly for new input information.
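The flow from input layer through hidden layer to output layer can be sketched as a plain forward pass. The network size, the sigmoid and softmax functions, and the hand-picked, untrained weights are all assumptions for illustration; the paper does not specify the architecture used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases):
    """Weighted sum per neuron: weights[j][i] links input i to neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    shifted = [math.exp(s - max(scores)) for s in scores]
    total = sum(shifted)
    return [v / total for v in shifted]


def forward(features, w_hidden, b_hidden, w_out, b_out):
    hidden = [sigmoid(z) for z in dense(features, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))


# Untrained, hand-picked weights: 2 inputs -> 3 hidden neurons -> 2 classes.
W_HIDDEN = [[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]]
B_HIDDEN = [0.0, 0.1, -0.1]
W_OUT = [[1.0, -1.0, 0.5], [-1.0, 1.0, -0.5]]
B_OUT = [0.0, 0.0]

probabilities = forward([0.9, 0.3], W_HIDDEN, B_HIDDEN, W_OUT, B_OUT)
```

In a trained network the weights would be learned from labelled fault examples; once they are fixed, classification of a new input costs only this one pass.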

Big data-based intelligent diagnosis
Because of the increasing number of monitoring points, the high sampling frequency, and the long data collection time in monitoring the operating condition of large complex equipment, the data to be dealt with keep growing over time, which drives the evolution of diagnosis technology. The applications used to predict and diagnose equipment faults have been forced to change [11].
The operation mechanism of large equipment is complex, its parameters and structure are uncertain, and the influences of bad working conditions and interference on the system are coupled, so there is a lot of uncertainty in the process of fault prediction. On the basis of a comparison of the typical fault prediction methods based on randomness, fuzziness, and grey theory, researchers have studied fault prediction methods based on uncertainty. In fact, different intelligent algorithms have advantages and disadvantages for different objects. The key is to select the proper method according to the specific circumstances and then optimise its efficiency and accuracy.
Combining the features of the big data of complex equipment with machine-learning theory, the fault prediction and diagnosis method based on operation big data learning includes four aspects: the hierarchical correlation between the equipment operation feature big data and operation faults; fault feature extraction; multi-sensor data fusion; and fault diagnosis based on the hybrid intelligent algorithm.

Approaches to analyse complex equipment big data
The approaches to analysing the operation big data of complex equipment cover several aspects: the hierarchical correlation between the equipment operation feature big data and operation faults; dimensionality reduction; data fusion; fault diagnosis based on the hybrid intelligent algorithm; the divide-and-conquer strategy; and parallel and distributed algorithms.

Hierarchical correlation:
Through the detailed analysis of the causes of the operation failure of complex equipment, a hierarchical structure model of the cause of the operation failure is established, the data of the operation features of the complex equipment are collected and analysed, and the association mapping between the features of the complex equipment and the cause of the operation failure is established.

Dimensionality reduction:
Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables [12]. It can be divided into feature selection and feature extraction.
To solve the high-dimension problem of equipment operation features, a fault feature extraction method based on rough set attribute reduction can be adopted: the redundant attributes of the operation feature big data are eliminated, and the feature parameters related to the operation faults of complex equipment are obtained [13].
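Rough set attribute reduction can be sketched as follows: an attribute is treated as redundant if removing it does not change how well the remaining attributes determine the decision. The greedy search order, the decision table, and all attribute names are assumptions for illustration; practical reduction algorithms are more elaborate.

```python
from collections import defaultdict


def dependency(rows, attributes, decision):
    """Fraction of rows whose values on `attributes` determine the decision."""
    decisions_by_class = defaultdict(set)
    size_by_class = defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in attributes)
        decisions_by_class[key].add(row[decision])
        size_by_class[key] += 1
    consistent = sum(size for key, size in size_by_class.items()
                     if len(decisions_by_class[key]) == 1)
    return consistent / len(rows)


def reduce_attributes(rows, attributes, decision):
    """Greedily drop attributes whose removal keeps the dependency unchanged."""
    full = dependency(rows, attributes, decision)
    kept = list(attributes)
    for attribute in attributes:
        trial = [a for a in kept if a != attribute]
        if trial and dependency(rows, trial, decision) == full:
            kept = trial
    return kept


# Hypothetical discretised decision table.
TABLE = [
    {"vibration": "high", "temperature": "high", "load": "low", "fault": "yes"},
    {"vibration": "high", "temperature": "low", "load": "high", "fault": "yes"},
    {"vibration": "low", "temperature": "high", "load": "low", "fault": "no"},
    {"vibration": "low", "temperature": "low", "load": "high", "fault": "no"},
]
reduct = reduce_attributes(TABLE, ["vibration", "temperature", "load"], "fault")
```

In this toy table the fault label follows the vibration level alone, so temperature and load are eliminated as redundant.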

Multi-source data fusion:
Multi-source data fusion is a technology that integrates information from multiple sources into a unified model. It involves the theory of probability and statistics, information theory, signal processing, pattern recognition, fuzzy mathematics, AI etc. In recent years, it has developed into a highly comprehensive new technology [14].
Data fusion technology generally includes three levels: data level, feature level, and decision level. The intelligent algorithm of data fusion in the intelligent fault diagnosis system mainly includes feature extraction, pattern recognition, and decision. The system uses a comprehensive algorithm to associate and identify the relevant information from the measured data of multiple sensors and related databases, and makes decisions and evaluations accordingly. Multi-source data fusion enhances the reliability and robustness of the system, enhances the credibility of the data, and improves the diagnostic accuracy.
The commonly used methods of data fusion can be summarised as two main categories: the classical method and the modern method. The classical methods are weighted average, maximum likelihood estimation, least square, Kalman filtering, Bayesian estimation, classical reasoning, Dempster-Shafer (D-S) evidence reasoning, quality factors etc. The modern methods include cluster analysis, logic template, entropy theory, voting method, fuzzy logic theory, generating rule, NN, genetic algorithm, fuzzy integral theory, rough set theory, wavelet analysis theory, expert system etc. The data fusion of intelligent fault diagnosis system mainly adopts fuzzy logic theory, NN, rough set theory, and expert system algorithm [15][16][17]. Practice has proved that this kind of AI algorithm has better robustness and parallel processing ability in multi-sensor data fusion.
The multi-sensor data fusion method based on least squares and support vector machines is shown in Fig. 5.
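Of the classical fusion methods listed above, the weighted average is the simplest to illustrate. The sketch below weights each sensor reading by the inverse of its noise variance; it assumes independent, unbiased sensors with known variances, and it is not the least squares and support vector machine method of Fig. 5.

```python
def fuse(readings):
    """Fuse (value, variance) pairs by inverse-variance weighting.

    Assumes independent, unbiased sensors with known noise variance.
    Returns the fused value and the variance of the fused estimate.
    """
    weights = [1.0 / variance for _, variance in readings]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, readings)) / total
    return value, 1.0 / total
```

For example, fusing a reading of 10.0 (variance 4.0) with a reading of 20.0 (variance 1.0) gives a value close to the more reliable sensor, and the fused variance is smaller than that of either sensor alone.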

Hybrid intelligent algorithm:
Data-driven fault diagnosis is a method of analysing, mining, and extracting information from industrial data. The essence of this method is to turn high-dimensional, correlated data that are difficult to analyse into low-dimensional, uncorrelated information that is easy to understand, and thus to help the equipment personnel better understand the current operation condition of the equipment. However, the effect of the data-driven approach is highly dependent on the quality of the process data.
The diagnosis method based on expert knowledge makes a systematic analysis of gas turbine operation through simple and direct objective description, in order to describe the cause of failure and to predict possible faults. Because of its simple operation and strong recognition ability, this method can diagnose the cause of gas turbine failure with simple knowledge and reasoning. It can detect signs of early wear in the gas turbine before the equipment fails, provide accurate diagnosis information early, give the equipment operation and maintenance engineers considerable lead time, and help the plant to plan equipment maintenance more scientifically before the equipment is seriously damaged or a failure occurs. However, this method is only suitable for the initial testing of gas turbines: because the coverage of the knowledge base is limited, deep faults cannot be diagnosed well.
To improve the reliability of equipment fault diagnosis in an industrial big data environment, a hybrid intelligent fault detection model is designed. The hybrid intelligent algorithm combines the knowledge engineering-based method with the data-driven method: it mainly uses the data-driven method to detect the fault and the knowledge engineering-based method to diagnose the fault. On the basis of the collected feature data of the complex equipment, machine-learning algorithms, such as clustering and decision trees, are applied to mine the big data, and the expert knowledge base is established to obtain the diagnosis rules related to faults and to improve the reliability of fault detection and diagnosis. The fault diagnosis process based on the hybrid intelligent algorithm is shown in Fig. 6.
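The division of labour between data-driven detection and knowledge-based diagnosis can be sketched in a few lines. The z-score detector, the threshold of 3, the feature names, and the rules are all hypothetical stand-ins; the system described here uses clustering, decision trees, and an expert knowledge base whose details are not given.

```python
import statistics


def detect(baseline, current, threshold=3.0):
    """Data-driven step: flag features deviating from the healthy baseline."""
    abnormal = set()
    for name, history in baseline.items():
        mean = statistics.mean(history)
        spread = statistics.stdev(history)
        if spread > 0 and abs(current[name] - mean) / spread > threshold:
            abnormal.add(name)
    return abnormal


# Knowledge-based step: hypothetical rules mapping symptom sets to causes.
KNOWLEDGE_BASE = [
    ({"vibration", "bearing_temperature"}, "bearing wear"),
    ({"exhaust_temperature"}, "combustion anomaly"),
]


def diagnose(abnormal):
    return [cause for symptoms, cause in KNOWLEDGE_BASE if symptoms <= abnormal]


BASELINE = {
    "vibration": [2.0, 2.1, 1.9, 2.0, 2.2],
    "bearing_temperature": [70.0, 71.0, 69.5, 70.5, 70.0],
    "exhaust_temperature": [500.0, 502.0, 498.0, 501.0, 499.0],
}
sample = {"vibration": 4.5, "bearing_temperature": 82.0, "exhaust_temperature": 500.5}
faults = diagnose(detect(BASELINE, sample))
```

Keeping the two steps separate means the detector can be retrained on new data without touching the rules, and the rules can be extended without retraining the detector.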

Divide-and-conquer strategy and sampling of big data:
The divide-and-conquer strategy with parallel processing is the basic strategy of big data processing and a computing paradigm for dealing with big data problems. In recent years, with the development of distributed and parallel computing, the divide-and-conquer strategy has become particularly important. Its design idea is to divide a big problem that is difficult to solve directly into smaller problems of the same kind, solve them separately, and combine their results [18]. How to learn the distribution knowledge of big data and optimise load balancing is a problem to be solved urgently. At the same time, it is necessary to filter the sample space according to certain performance standards and to eliminate redundant and noisy data, so as to avoid the inefficient execution of processing algorithms caused by the large number of attributes and records in big data. In this way, the computation time and space consumption can be reduced as much as possible without reducing, and possibly even while improving, performance.
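As a small illustration of divide and conquer, the sketch below computes the mean and variance of a long signal by summarising fixed-size chunks independently and then merging the partial summaries. The pairwise merge formula is a standard one for combining variances; the chunking scheme and function names are assumptions, and a non-empty input is assumed.

```python
from functools import reduce


def summarise(chunk):
    """Solve one sub-problem: count, mean and sum of squared deviations."""
    n = len(chunk)
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return n, mean, m2


def merge(left, right):
    """Combine two partial summaries without revisiting the raw data."""
    n_a, mean_a, m2_a = left
    n_b, mean_b, m2_b = right
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2


def chunked_statistics(values, chunk_size):
    chunks = [values[i:i + chunk_size] for i in range(0, len(values), chunk_size)]
    n, mean, m2 = reduce(merge, map(summarise, chunks))
    return mean, m2 / n  # mean and population variance
```

Because each chunk is summarised independently, the `summarise` calls could run on different workers, with only the three-number summaries being exchanged.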

Parallel and distribution algorithm of big data:
The most widely used strategy for applying traditional machine-learning algorithms in a big data environment is to parallelise or distribute the existing learning algorithms. A parallel or distributed algorithm is executed by a number of processes and/or processors to accomplish a particular task. However, it is much harder to design correct parallel or distributed algorithms than their sequential counterparts. The reason is that it is hard to foresee all possible behaviours of a parallel/distributed system, and the iterative nature of data analysis algorithms complicates the problem further. At present, research on big data parallel and distributed algorithms has made significant progress in many circumstances, and has achieved efficient analysis and processing of big data of a certain magnitude.
MapReduce is the most widely used parallel programming model at present. With the rise of its open-source implementation Hadoop, MapReduce has formed a large-scale data analysis system, which has become a standard for mass data processing in academia and industry. Compared with the traditional parallel computing technology, MapReduce has the characteristics of large data processing systems, such as linear scalability, high availability, ease of use, fault tolerance, load balance, and robustness. Moreover, users only need to pay attention to the high-level processing logic related to specific applications when using the MapReduce parallel programming model, while the low-level complex parallel transactions, such as input distribution, task partition and scheduling, inter-task communication, fault tolerance processing, and load balancing, are delegated to the execution engine. At the same time, with user-defined programmable interfaces, such as input-output flow processing, task scheduling, and intermediate data partition and sorting, the model achieves an excellent balance between scalability and programmability [19].
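The MapReduce programming model can be imitated in a single process to show which parts the user writes. The sketch below is not the Hadoop API; the record format, the sensor names, and the choice of a per-sensor peak as the reduction are assumptions for illustration.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit one (key, value) pair per raw record 'sensor_id,value'."""
    sensor_id, value = record.split(",")
    yield sensor_id, float(value)


def shuffle(pairs):
    """Shuffle: group all values that share a key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse one key's values to a single result (here the peak)."""
    return key, max(values)


def run_job(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


RECORDS = ["bearing_1,0.8", "bearing_2,1.1", "bearing_1,2.4", "bearing_2,0.9"]
peaks = run_job(RECORDS)
```

In a real MapReduce engine only `map_phase` and `reduce_phase` would be user code; the grouping done here by `shuffle`, along with distribution and fault tolerance, is handled by the execution engine.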

Big data analysis of equipment operation
With the help of equipment operation big data analysis technology, the equipment can be maintained proactively when a failure is about to happen. This maintenance is based on the results of condition detection and fault diagnosis, and is a proactive way of maintenance. It is especially important for key equipment that has a high utilisation rate and a great impact on production after failure. The flow of big data analysis technology is shown in Fig. 7, which combines big data-driven discrimination with expert knowledge base discrimination. By combining fault mode and key factor analysis, a comprehensive fault diagnosis record is formed as the basis of the fault solution [20].

IOT based cloud services
The IOT has been widely applied in many fields because of its three aspects: comprehensive perception, stable transmission, and intelligent application. It brought new development space for equipment monitoring and diagnosis. Various sensors, data acquisition boxes, computers and servers, according to the agreed protocol rules, use the IOT to complete the real-time collaborative acquisition, intelligent processing, timely feedback of the health condition information of the equipment, and build a flexible, open, scalable, reconfigurable and real-time interactive equipment health condition database. Based on this database, the dynamic adaptive feature component reflecting the cause of equipment failure is extracted. Thus, a new intelligent monitoring and management model, which is integrated with condition detection, fault prediction, online monitoring and diagnosis, remote monitoring and diagnosis, and remote operation and maintenance is realised [21].
The intelligent fault diagnosis system based on big data of the IOT is shown in Fig. 8. The system integrates information sensing technology, network technology, and intelligent computing technology.
i. The sensing layer uses the sensors installed on the equipment to collect information, and uses short-distance communication technology to transmit the data to the site gateway or the host computer, realising the acquisition and transmission of the sensing layer data.
ii. The network layer transfers the health monitoring signals and operation parameter data of the equipment to the branch servers, from which they are uploaded to the headquarters of the company to realise centralised storage of the health condition information of the equipment.
iii. The application layer mainly realises the analysis of the monitoring signals, fault feature extraction, fault diagnosis, and prediction. First, it uses rich and mature data pre-processing algorithms to reorganise, mine, and reason over the data; then the human-machine interaction interface shows the valuable core information and conclusions to the users. Finally, the functional requirements of monitoring and diagnosing the equipment health condition based on the concept of the IOT are met, and intelligent management, application, and service are realised.
The system can also access the distributed control system (DCS) owned by the enterprise for integrated management. The equipment operation and maintenance management software based on the IOT, including the monitoring module and the management module, realises the functions of sensing interface service, data analysis service, early warning and measure service, map data service, operation data analysis service, network analysis service, data query service, operation and maintenance records, and user management. It provides the production management, quality inspection, and operation and maintenance managers with all the basic information of life-cycle equipment, operation monitoring values, specific details of operation and maintenance, accurate fault location, rapid generation of maintenance plans, professional monitoring data analysis, and statistical analysis chart reports on demand.
The main functions of the intelligent diagnosis system based on the IOT technology are fault alarm, trend analysis of the health condition of the equipment, provision of decision-making information for equipment maintenance, remote management, and mobile office.

Application analysis
The gas turbine has many advantages, such as high thermal efficiency, high power, small volume, and low pollution, and it is widely used in industrial power generation. It is difficult to diagnose early, weak, and compound faults accurately and in time because of the complexity of the structure and mechanism, the uncertainty of parameters and structures, and the dynamic time-varying and strongly coupled behaviour. Once a failure occurs, it causes huge economic losses and risks to life safety, and the maintenance cost is also very high.
An enterprise adopts a preventive maintenance strategy with periodic overhaul to prevent shutdown of its gas turbine. The overhaul interval is set rather conservatively to be safe, which results in excessive maintenance. Therefore, the enterprise urgently needs a technical solution to predict the fault location, the fault type, and its deterioration trend in order to establish a proactive maintenance system.
The system installed a certain number of sensors on the gas turbine. The installation positions include the low pressure compressor, high pressure compressor, gearbox, starting system, hydraulic pump, compressor rear frame, low pressure turbine, and power turbine. Combined with the monitoring parameters of the DCS that the factory already used, this constitutes complete monitoring of the operation condition of each region of the gas turbine. At that time, the scheduled overhaul of the factory was near. As all the condition values detected by the system were very consistent and behaved linearly under normal operation conditions, the factory decided to postpone the overhaul. The decision saved the factory millions in overhaul costs and tens of millions in production losses.
Three months later, the sensor installed in the low pressure turbine area detected an unstable condition that indicated a slight fault of the equipment. The system determined that this slight fault came from an abnormal bearing. Since the system detected no equipment slippage, particle debris pollution, or periodic impact events, it was possible that the seal of this bearing was damaged. According to the influence of this slight fault on the equipment, the system judged that the equipment could continue to run for about 5-6 months before being repaired. The factory began to make an equipment maintenance plan and purchase spare parts. As the factory no longer needs to pre-order spare parts, it reduces purchase funds and the long-term occupation of inventory.
The system has collected and stored a large amount of operational data through online operation for more than 1 year. The AI algorithm of the system realises fault prediction based on operation big data analysis and machine learning, and proves the effectiveness of the intelligent algorithm for fault diagnosis. The system can predict and diagnose abnormal dynamic load, bearing wear and damage (including ball and seal), local surface damage, shaft imbalance, lubricating oil deterioration, blade friction, foreign object damage etc.
The modelling of the hybrid intelligent algorithm is based on knowledge engineering and data-driven methods. The hybrid intelligent fault diagnosis method combines the various monitoring values in the system, uses the data-driven method to detect the fault, and uses the knowledge engineering method to diagnose the fault. On the basis of the collected feature data of the gas turbine, the machine-learning algorithm is applied to data mining of big data, the expert knowledge base is established, and the diagnosis rules related to faults are obtained. In this way, the reliability of fault detection and diagnosis is improved, and the most effective method for fault diagnosis of the gas turbine is obtained.
The application practice verifies that the intelligent fault diagnosis system based on the operation big data has the following advantages:
i. Failures can be detected in the shortest reaction time.
ii. Minor damage can be detected early.
iii. The misdiagnosis rate and missed diagnosis rate are reduced.
iv. It has better fault separation ability. The different fault types and locations can be accurately distinguished.
v. It has better fault identification ability. The fault size can be accurately identified, and its deterioration trend over time can be predicted.
vi. It has stronger robustness and higher reliability. The system can accomplish the task of fault diagnosis correctly in the presence of noise and interference.
vii. The self-adaptive ability is strong. The system is adaptive to the changed objects and can make full use of the new information generated by the changes to improve itself.

Conclusion
This paper addresses the problem of fault diagnosis of large and complex industrial equipment. On the basis of studying the fault diagnosis model, the machine-learning algorithm is applied to data mining of operation big data. The expert knowledge base is established, the diagnosis rules related to the fault are obtained, and the hybrid intelligent algorithm is adopted to realise intelligent fault diagnosis based on both operation big data analysis and knowledge. The cloud service technology based on the IOT is applied to the system. Under the three-layer framework of the system, consisting of the perception layer, the network layer, and the application layer, an intelligent and efficient monitoring and diagnosis model covering online monitoring, remote monitoring, remote diagnosis, and fault matching recognition of the equipment health condition is realised.
In summary, the intelligent fault diagnosis system based on big data provides technical means for changing the equipment maintenance ideas and methods, and realises earlier fault prediction and more accurate fault location. The preventive maintenance system of the enterprise equipment turns to the proactive maintenance system based on the prediction of the health condition of the equipment. It has achieved the goal of improving safety, improving equipment utilisation, reducing costs, and increasing economic benefits.