Business Intelligence and Machine Learning Methods for Predictive Maintenance in Greek Railways

Extracting useful information supports business decision-making. In every mechanical process, application, or service, the periodic maintenance of the necessary equipment is expensive, and therefore technicians and supervisors are responsible for making proper decisions. Railway companies produce a huge amount of data which, with appropriate processing and smart business systems, can yield quality information and knowledge. This paper presents the benefits of business intelligence, together with the machine learning and data mining techniques involved, for the Greek railway companies, which use obsolete maintenance procedures. In addition, the present situation is studied and the needs and requirements of the railway companies are recorded. Low-cost, open-source machine learning and data mining tools are also examined that can help create a new decision support strategy for developing predictive maintenance in the Greek railways, forming a new complete business intelligence system. Finally, the results and the motives of the railway companies are presented, with a view to creating applications that can constitute the basic tool for improving decision-making by the company's administration.


Introduction
Making correct decisions is one of the most important responsibilities of a company's executives. It is very important that the administration can make valid and precise decisions quickly, based on selected data processed with appropriate tools [1]. The need for improved information and a better decision-making process was the impetus for the development of Business Intelligence systems [2]. A complete Business Intelligence system consists of a group of technologies, information applications, processes, and actions that collect, analyze, present, and distribute business information, which is consequently transformed into knowledge [3]. The goal of business intelligence is to optimize all processes of a business in as short a time as possible by making the best decisions. It includes various aspects, such as analysis, predictive modeling, performance management, data mining, and machine learning.
Data mining is a process that allows the extraction of useful information from a company's business data, and it is an important tool that can assist in the improvement, diagnosis, and risk prediction of a business process [4].
Machine learning, through software and information applications, learns to act on input data in the way that people use their previous experience as part of their learning process, making their actions faster and more efficient [5]. Combining data mining and machine learning techniques with existing or new information systems can transform a company's data into strategic business decisions [6]. The integration of all these systems, resulting in the digital transformation of a company, constitutes the business intelligence that should be adopted for the task [7]. Every company, in order to accomplish its task, uses and exploits its available equipment along with other productive factors. The operational maintenance of a company's equipment is considered a costly process, and its implementation requires fast and appropriate decisions that derive from information processing and accurate information [8].
Maintenance is a business process capable of contributing profitability, productivity, and safety to every business. It is defined as the business process of keeping the equipment in a desirable condition of efficient operation, and it aims to increase the reliability of the equipment, improve production time, and decrease cost [9]. The types of maintenance are:
• Reactive maintenance: The equipment is used continuously until a malfunction arises. The task is then interrupted, the malfunction is repaired, and the task continues.
• Preventive maintenance: The basic principle of preventive maintenance is the execution of maintenance tasks according to the manufacturer's standards. It is carried out at intervals set by the manufacturer, and the spare parts or equipment components specified in the operating manual are replaced in order to minimize malfunctions and preserve the proper operation of the equipment.
• Predictive maintenance: It is applied through regular inspections, special measurements of the equipment, the application of new technologies, and the replacement of parts. It takes place shortly before the onset of a malfunction, aiming at a significant decrease in cost compared with the predefined parts replacement suggested by the manufacturer, and it takes prediction as its basic principle [10] [11].
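The three maintenance types differ mainly in what triggers an intervention. The following minimal Python sketch illustrates that difference under stated assumptions: the `ComponentState` fields, the wear measurement in millimetres, and all thresholds are hypothetical and do not come from the railway companies' procedures.

```python
from dataclasses import dataclass


@dataclass
class ComponentState:
    """Hypothetical snapshot of one piece of equipment."""
    failed: bool             # a malfunction has already occurred
    hours_in_service: float  # operating hours since the last replacement
    wear_mm: float           # wear measured at the latest inspection


def reactive_due(state: ComponentState) -> bool:
    # Reactive: intervene only after a malfunction has arisen.
    return state.failed


def preventive_due(state: ComponentState, interval_hours: float) -> bool:
    # Preventive: intervene at the interval fixed by the manufacturer,
    # regardless of the measured condition.
    return state.failed or state.hours_in_service >= interval_hours


def predictive_due(state: ComponentState, wear_limit_mm: float,
                   margin_mm: float) -> bool:
    # Predictive: intervene shortly before the expected malfunction,
    # i.e. once measured wear comes within a safety margin of the limit.
    return state.failed or state.wear_mm >= wear_limit_mm - margin_mm


state = ComponentState(failed=False, hours_in_service=4000.0, wear_mm=2.0)
print(reactive_due(state))
print(preventive_due(state, interval_hours=5000.0))
print(predictive_due(state, wear_limit_mm=2.5, margin_mm=0.6))
```

In this illustrative case only the predictive rule calls for an intervention, because the measured wear is already close to the assumed limit although no failure has occurred and the manufacturer's interval has not elapsed.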
In a constantly changing environment in which financial growth depends on economical methods of transportation, the transportation of people and goods by rail is an urgent need. The railway is one of the most inexpensive methods of transportation, and research into it leads to intensive financial activity [12]. Transportation companies use new applications for the optimal exploitation of data and the extraction of useful conclusions, and railway companies rely more and more on the effective exploitation of information [13]. Rail transport performs a broad range of processes to achieve its main target, which is the safety of passengers and goods [14]. Railway transportation processes produce a huge amount of data which, with the appropriate tools offered by new technologies, can be analyzed to support meaningful and precise decisions for financial growth and the provision of optimal services [15].
The rolling stock of the railway network is an important factor in the proper operation of railway transportation, as any failure hinders the achievement of its strategic goals: the safety of passengers and goods, reliability, and the speed of the transportation service provided. The rolling stock consists of vehicles divided into tractive stock (the engines) and hauled stock (the wagons). The railways are obliged to perform transportation with safety, speed, and comfort for passengers and with full protection for goods [16]. These mandatory requirements impose constant monitoring of the rolling stock before the departure of a train as well as during its route. Systematic maintenance of the trains is essential, since many parts are subject to wear during operation. Wear has to be dealt with in time, before it causes damage and disruptions to traffic or affects the train more severely. Rolling stock maintenance is defined as the function of every railway operation that preserves the reliability of the trainsets for smooth operation, while observing the principles of staff safety and environmental protection [17]. The aim of the rolling stock maintenance sector is the optimal, trouble-free operation of the trainsets over a long period at minimal cost. In the railway industry, the rolling stock maintenance sector ought to develop the optimal and most efficient maintenance procedures, limiting expenses in order to improve the competitive position of the business while keeping the level of service offered the same or higher [18].
Open Journal of Applied Sciences
With the business intelligence approach, it is feasible to avoid dysfunctions in a train that put passengers at risk, while simultaneously minimizing the operational cost and advancing the quality of the services provided.
Business intelligence solutions supply tools with appropriate technologies to help collect, integrate, store, process, and analyze stored business data. Low-cost, open-source business intelligence tools that can be changed or improved and adapted to the requirements of their users are the WEKA machine learning software and the KNIME data analysis software. The two programs directly incorporate the latest technological developments.
This part of the paper briefly presents relevant research that uses the WEKA machine learning software and the KNIME data mining software; it is the starting point for developing a smart tool that supports innovative processes within a business intelligence system for making the right decisions. Article [19] contributes to the creation of a new complete business intelligence system for the maintenance process of Greek railway rolling stock. It describes the optimization of the manufacturing process based on readings and prototypes as new data mining approaches, and it uses WEKA on the Advanced Manufacturing Analytics platform to pinpoint hidden patterns in manufacturing-related data. Additionally, study [20] applies machine learning techniques in WEKA to pinpoint problems that affect the availability of an aircraft fleet; the impact of spare tool availability was assessed, and an intelligent system was developed to minimize the cost of the spare tools. Article [21] proposes a systematic approach for predicting the maintenance cost of road construction equipment. It discusses the steps of data collection, analysis, modeling, and validation, and presents the development of different types of models by analyzing seven algorithms with the WEKA software. The data mining approach of paper [22] can help transform maintenance processes on Greek railways. That paper presents the development of a simple and easy-to-use model for the early prediction of oil pump damage in the oil and gas industry. The data analysis is based on real historical sensor data.
Algorithms are applied on the KNIME platform that successfully detect and classify possible errors, ensuring the precision of the prediction. Additionally, article [23] discusses the results of an industrial project oriented toward integrating KNIME data mining tools into an Enterprise Service Bus (ESB) platform. It concludes that the KNIME data mining engine and client web services can be incorporated into existing information systems and linked with Industry 4.0. The purpose of paper [24] is the design and application of a predictive maintenance method for farm machines. The suggested method is based on the analysis of historical data to predict future failures of a machine, and it was applied using the open-source software tool KNIME. A framework for improving the effectiveness of document processing is presented in paper [25]. The suggested approach uses information retrieval and passage identification through machine learning techniques. The KNIME software was used to apply the method and the proposed document management.
The remainder of this work presents the basic scientific knowledge relating to the field of business intelligence and describes the methods of machine learning and data mining. The research approach to the subject follows, examining the ways, techniques, and low-cost, open-source tools of machine learning and data mining that can help reorganize maintenance processes, especially the maintenance of railway rolling stock. Then, the problems faced by Greek railway companies are recorded, followed by the positive results of using business intelligence systems with the methods and tools of machine learning (WEKA) and data mining (KNIME). Finally, conclusions are drawn from the implementation of a new comprehensive business intelligence system proposed to railway companies in order to create a new decision support strategy.

Business Intelligence
Modern information and communication technologies are combined with the historical, mostly inactive data of companies in order to provide fast, secure, and reliable information to the people responsible for making decisions [26]. Business intelligence is the combination of applications, techniques, processes, business actions, and data analysis with innovative techniques that transform data into information, which leads executives to more and better knowledge [27]. A business intelligence system is a company information system that processes data from the internal and external environment of the company and provides the administration with data in order to make appropriate and meaningful decisions quickly [28] [29]. It is a distinct category of system whose elements are the people, the processes, and the equipment, which interact and collaborate to process data and provide the user (the company) with information [30]. Every business intelligence system is built from successive levels forming a pyramid. The raw historical data lie at the initial stage and the making of the final decisions at the final stage, as shown in Figure 1. Reaching the final stage involves processes such as data preparation, data exploration, data visualization, and data mining, the last ideally through the application of machine learning algorithms.

Data Mining
Data mining focuses on the exploratory analysis of data for the discovery of knowledge. Working on huge databases with machine learning algorithms, as with every decision support system, it aims at the detection, analysis, and classification of a huge volume of information that will be useful for making the right decisions [31].

Classification-Regression
Classification and regression are supervised learning tasks that aim to discover relationships within a set of data. They are data mining techniques that predict the value of a dependent variable from other, independent variables, using a set of training data on which a predictive model is built and adjusted. The type of the dependent variable defines the difference between the two techniques: classification is used to predict a dependent variable that takes discrete (nominal) values, whereas regression is applied to predict or interpret continuous (numerical) values [32] [33].
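The distinction can be illustrated with a minimal Python sketch in which the same nearest-neighbour lookup serves both tasks and only the type of the target changes. The nearest-neighbour method, the single feature (distance travelled since the last service), and all values are assumptions made for illustration; they are not taken from the cited works.

```python
from collections import Counter


def nearest_targets(train, x, k):
    """Targets of the k training rows whose feature value is closest to x."""
    ranked = sorted(train, key=lambda row: abs(row[0] - x))
    return [target for _, target in ranked[:k]]


def classify(train, x, k=3):
    # Nominal target: return the most frequent label among the neighbours.
    votes = Counter(nearest_targets(train, x, k))
    return votes.most_common(1)[0][0]


def regress(train, x, k=3):
    # Numerical target: return the mean value of the neighbours.
    values = nearest_targets(train, x, k)
    return sum(values) / len(values)


# Hypothetical training data: (thousand km since last service, target).
labelled = [(10, "ok"), (20, "ok"), (30, "ok"),
            (60, "worn"), (70, "worn"), (80, "worn")]
measured = [(10, 0.5), (20, 1.0), (30, 1.5),
            (60, 3.0), (70, 3.5), (80, 4.0)]

print(classify(labelled, 65))  # a nominal value such as "ok" or "worn"
print(regress(measured, 65))   # a continuous value (wear in mm)
```

The training features are identical in both cases; the choice between the two functions follows solely from whether the dependent variable is nominal or numerical.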

Machine Learning
Machine learning studies and constructs algorithms that can learn from the data and information provided and make predictions based on them. Solving problems in smart ways relies on the ability of systems to learn by using previous knowledge and experience. Applications built on machine learning algorithms make predictions based on the data and make decisions that are expressed as the result [34].

Decision Trees
The construction of decision trees systematizes the data, improves the ability to manage them for analysis, and provides the prospect of making rational decisions. Decision trees are easy to interpret even for non-professional users, because the knowledge is presented in an easily readable form and the extracted rules are comprehensible. Importantly, decision trees require minimal data preparation and handle both nominal and numeric input attributes [35]. Two characteristic algorithms widely used in classification and regression are C4.5 and M5, respectively.
• The Iterative Dichotomiser 3 (ID3), together with its successor C4.5 and the commercial version C5.0, is the prevailing algorithm for building decision trees [36]. The criterion used to split on attributes is called Information Gain, and it expresses the decrease in entropy that arises if a set of observations (S) is divided into subsets based on the values of attribute A. Information Gain is measured quantitatively in terms of entropy, which in general expresses the degree of diversity in a set of data. To split the observations, ID3 chooses the attribute with the highest information gain [37].
• The M5 algorithm is based on decision trees and is used when the target takes continuous values and there are both qualitative and quantitative data [38]. The initial M5 algorithm was invented by R. Quinlan, and Yong Wang made some improvements. In M5, the models defined at the leaves are linear regression models trained on the examples that fall in the corresponding regions [39]. The splitting criterion treats the standard deviation of the class values that reach a node as a measure of the error at that node and calculates the expected reduction in error from testing each attribute at the node. The attribute that maximizes the expected error reduction is then selected. Splitting stops when the class values of the instances that reach a node vary only slightly or only a few instances remain [40].
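Both splitting criteria compare an impurity measure before a split with its size-weighted average after the split. In the usual notation, Gain(S, A) = H(S) - sum over values v of A of (|S_v| / |S|) * H(S_v), where H is the entropy, and the standard deviation reduction is SDR = sd(T) - sum over subsets T_i of (|T_i| / |T|) * sd(T_i). The Python sketch below computes both under simplifying assumptions: every attribute is nominal and is split on all of its values, and the rows, attribute names, and measurements are hypothetical.

```python
import math
from collections import Counter, defaultdict
from statistics import pstdev


def entropy(labels):
    """Shannon entropy (in bits) of a list of nominal labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def partition(rows, attribute):
    """Group the rows by the value they take for the given attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row)
    return list(groups.values())


def weighted_after(rows, attribute, target, measure):
    # Size-weighted average of the impurity measure over the subsets.
    return sum(len(g) / len(rows) * measure([r[target] for r in g])
               for g in partition(rows, attribute))


def information_gain(rows, attribute, target):
    # ID3 criterion: decrease in entropy caused by the split.
    before = entropy([r[target] for r in rows])
    return before - weighted_after(rows, attribute, target, entropy)


def sd_reduction(rows, attribute, target):
    # M5 criterion: expected reduction in standard deviation.
    before = pstdev([r[target] for r in rows])
    return before - weighted_after(rows, attribute, target, pstdev)


# Hypothetical inspection records.
rows = [
    {"vibration": "high", "depot": "A", "status": "fault", "wear_mm": 4.0},
    {"vibration": "high", "depot": "B", "status": "fault", "wear_mm": 3.0},
    {"vibration": "low", "depot": "A", "status": "ok", "wear_mm": 1.0},
    {"vibration": "low", "depot": "B", "status": "ok", "wear_mm": 2.0},
]

for attribute in ("vibration", "depot"):
    print(attribute,
          information_gain(rows, attribute, "status"),
          sd_reduction(rows, attribute, "wear_mm"))
```

On these illustrative rows, `vibration` scores higher than `depot` on both criteria, so a tree built with either criterion would test it first.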

Machine Learning and Data Mining for Maintenance
Machine learning is a method used to devise complex models and algorithms that lead to prediction, which is at the centre of many applications that contribute to predictive maintenance. In machine learning, the algorithms and prediction systems are based on the collection of data and observations. The predictive ability that machine learning offers can optimize maintenance, improve quality, and strengthen production [41] [42]. Data mining makes it possible to analyze and classify useful information from big data sets with the use of machine learning algorithms. The supply of useful information via the data mining process leads to the right decisions for optimizing maintenance, as shown in Figure 2, and for the goals set case by case [43].

Machine Learning and Data Mining for Maintenance of Rolling Stock
Machine learning includes algorithms that allow a computing system to learn on its own based on previous knowledge and experience. Data mining consists of applying machine learning methodologies to large databases. Rolling stock maintenance produces a huge mass of data [44]. Data mining and machine learning produce information and results that are useful tools for upgrading and updating the underlying procedures (as shown in Figure 3), creating new innovative techniques that decrease the time that trainsets are unavailable as well as the cost of the equipment and the spare parts to be serviced [45].

Machine Learning Tool-WEKA
WEKA is open-source machine learning software developed at the University of Waikato in New Zealand, and it is publicly available under the terms of the GNU General Public License. WEKA offers a great variety of machine learning algorithms. All the algorithms take their input in the form of a relatively simple table, which can be read from a file or produced by a simple database query, and from which a model such as a decision tree can be constructed. Reasonable default values ensure that decision-making is possible with minimal effort. WEKA allows the design of configurations for data processing and, combined with the graphical representation of algorithms, can be used for predictive maintenance processes [46] [47].
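One file form of such a table is WEKA's ARFF format, which the text above does not name explicitly. The Python sketch below writes a small ARFF document; the relation name, attribute names, and data rows are hypothetical and serve only to show the layout of the header and data sections.

```python
def to_arff(relation, attributes, rows):
    """Render a table as ARFF text.

    attributes: list of (name, kind) pairs, where kind is either the
    string "numeric" or a list of the allowed nominal values.
    rows: list of tuples with one value per attribute.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            # Nominal attributes list their allowed values in braces.
            kind = "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {kind}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical maintenance records for illustration only.
attributes = [
    ("km_since_service", "numeric"),
    ("vibration", ["low", "high"]),
    ("status", ["ok", "fault"]),
]
rows = [(12000, "low", "ok"), (64000, "high", "fault")]

print(to_arff("rolling_stock", attributes, rows))
```

A file produced this way could be opened in WEKA and used as the input table for a tree learner, with the last attribute (`status` in this sketch) chosen as the class.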

Data Mining Tool-KNIME
The development of KNIME was started in 2004 by a team of software engineers at the University of Konstanz in Germany; it is open-source data analysis software released under the GNU General Public License. KNIME is a business intelligence tool that includes a complete development environment and an extensible plug-in system. As an open-source data processing platform, KNIME allows the user to carry out many tasks easily and quickly, such as the processing, conversion, analysis, and visual presentation of data. The user can manage a huge amount of data without being limited by the number of data rows or processors during the analysis. It is compatible with all major operating systems, including Linux, Windows, and Mac, since it is implemented on the Java platform, while its design allows users with no programming experience to use it with ease. The way KNIME has been created and developed allows the integration of various loading modules and offers complete procedures for presenting the final results [48] [49].

Recording Problems in a Greek Company Managing Railway Rolling Stock
Railway transport in Greece is available in four railway ranges, and the services are managed by different companies and organizations. Two companies are responsible for the maintenance of rolling stock on the Greek railway network, one for the urban rail network of Athens and one for the rest of the railway network outside the urban fabric. Each company carries out maintenance for the smooth operation of the rolling stock, with the ultimate goal of a safe and reliable railway system. The maintenance methods used by both companies are the same.
10) Decrease of the cost arising from hand-written reports on paper and ink.
11) Monitoring, documentation, and comparison of the operating cost of the trains on every railway line, so that the company's administration can make the right decisions.
12) Automation of the processes, which will lead to the elimination of unnecessary bureaucracy.

Results of Business Intelligence at the Greek Railway Rolling Stock
A new business intelligence information system that incorporates machine learning techniques and the data mining process with open-source tools (WEKA, KNIME) will ensure the low-cost analysis of the huge amount of inactive data that are produced through the maintenance processes. Their interpretation will generate quality information for optimal decision-making by the administration of every company, aiming to transform the maintenance process into predictive maintenance as shown in Figure 4, and enabling the companies to develop financially while offering improved services to citizens. The addition of applications to the present information system that the Greek companies manage aims at ensuring the reliability and availability of the rolling stock, decreasing the time that trains are unavailable, and increasing the life expectancy of the rolling stock. The applications can be the basic tool for improving the decision-making process of the company's administration [50]. The suggested applications aim at the evolution of the rolling stock management process.
• Goals of the innovation recommendation:
1) Right and effective management of the equipment.
2) Improved planning of maintenance tasks.
3) Systematic monitoring of the implementation of the maintenance tasks.
2) Planning and performing the maintenance based on the resources available (living and non-living resources).
3) Decrease of the total cost of maintenance, which can reach up to 40%, with the elimination of overtime and of the purchase of inefficient spare parts.

Conclusion
With the application of business intelligence systems at the transport section and especially at the Greek railway companies that use outdated processes, the im-