Classification Approach for Intrusion Detection in Vehicle Systems

Vehicular ad hoc networks (VANETs) enable wireless communication among Vehicles and Infrastructures. Connected vehicles are promising in Intelligent Transportation Systems (ITSs) and smart cities. The main objective of VANET is to improve the safety, comfort, driving efficiency and waiting time on the road. VANET is unlike other ad hoc networks due to its unique characteristics and high mobility. However, it is vulnerable to various security attacks due to the lack of centralized infrastructure. This is a serious threat to the safety of road traffic. The Controller Area Network (CAN) is a bus communication protocol which defines a standard for reliable and efficient transmission between in-vehicle parts simultaneously. The message moves through CAN bus from one node to another node, but it does not have information about the source and destination address for authentication. Thus, the attacker can easily inject any message to lead to system faults. In this paper, we present machine learning techniques to cluster and classify the intrusions in VANET by KNN and SVM algorithms. The intrusion detection technique relies on the analysis of the offset ratio and time interval between the messages request and the response in the CAN.


Introduction
Advancement in technology has brought about the concept of intelligent vehicles which are considered to be more efficient and safer for the users.Intelligent vehicles tend to be connected to other vehicles, roadside infrastructure, such as the traffic management system and the internet, hence making them to be among the Internet of Things.However, such high levels of connectivity have meant that intelligent vehicles are at risks of cyber-attacks which might interfere with different aspects of the vehicle, such as its communication systems, endangering the security and privacy of the vehicle as well as putting the lives of its passengers at risk [1] [2] [3] [4].Connected vehicle technology has always been aimed at solving the challenges that are occasionally experienced with intelligent transport systems.An Intelligent Transport System usually allows intelligent vehicles to be in a position to communicate with the roadside infrastructure, other vehicles on the road and other road users.The communication system of an intelligent vehicle is usually referred to as Vehicle-to-Everything (V2X) or it is also referred to as the VANET, an abbreviation for Vehicular Ad hoc Networks [5].
An ordinary VANET communication system is usually responsible for three main types of communication to be considered a smart automobile.Those types of communication are Vehicle-to-Infrastructure (V2I), Vehicle-to-Vehicle (V2V), and Vehicle-to-Pedestrian (V2P).V2I involves the vehicle communicating with the roadside infrastructures, such as location sensors and other traffic monitoring systems.V2V involves a smart automobile being able to share information with other vehicles on the road.V2P involves the communication between the vehicle and pedestrians on the road.A cyber-attack on this communication system of a specific car or the ITS system is likely to result in endangering the security and privacy of the vehicle as well as putting the lives of its passengers at risk [6] [7].
There have been numerous concerns about the privacy and security of intelligent vehicles and the intelligent transport systems with various attacker models for smart vehicles being experienced.Among these concerns are cyber security threats on the VANET communication system where cyber attackers may exploit any potential weaknesses within the system to jam and spoof its signal.This would result in the whole V2X system being affected through deceptive signaling and delaying of the signal so as to ensure that the message transmitted is distorted and does not achieve its intended purposes [8] [9].
Other security threats faced by smart automotive may include hacking through the internet, as connected vehicles have access to the internet, or physical access to the vehicle intelligence system [10].For example, in 2016 Charlie Miller and Chris Valasek, who are security experts, wirelessly hacked the intelligence system of the Jeep Cherokee.Miller and Valasek were able to demonstrate that the Jeep Cherokee intelligence system had security vulnerability when they were able to compromise its entertainment system, steering and brakes, and its air conditioning system while the driver of the car was still driving [11].Another example is with the Nissan Leaf, where its companion application became exploited by hackers using its identity number that is usually printed on the vehicle's windows.This vulnerability allowed the hackers to take control of the heating and air conditioning system [12].Tesla Motors is considered to have the best cyber security on its intelligent vehicle system due to the amount of resources and time that is continually spent tacker to open the doors as well as start and drive away with the car.However, the attackers would require having physical access to this car if they were to execute such a plan.For this reason, the risk was impractical, though Tesla addressed this vulnerability immediately.

CAN Bus
Controller Area Network (CAN) is a bus communication protocol that can be utilized as a standard for reliable and efficient transmission between vehicle nodes in real time.In such network, broadcast messages must transmit from one node to another on the bus and there is no information about the source and the destination address for the validation.This security hole leads to inject any message by an attacker that can course to system malfunction.ECU is an embedded system which used in today's vehicles to control the engine and other components' functions.It is a computer with inside pre-programmed and programmable computer chips that is almost like a personal computer.The car's engine computer ECU makes the engine function the engine using sensors to control all engine functions [13] [14].The engine ECU is the most vehicles is contacted to onboard diagnostic connector and then report all diagnostic information to all other ECUs.This technique is helping in reducing the amount of wire that is needed or not needed to go to every ECU in order to test them [15].Tuohy et al. [16] stated in their analysis research that intra-vehicle networks demonstrate that each electronic sensor in a vehicle requires a new ECU device and subsystem and calls for standardization of automotive networks in intra-vehicular networks.There are many V2V wireless communication protocols like vehicular collision warning communication protocol, direction-aware broadcast forwarding routing protocol, etc. [17].The study [18] introduced a comprehensive state of the art survey on Integrated Vehicle Dynamics Control (IVDC) where they discuss several methodologies of IVDC and control strategies of coordination between ECU subsystems.Inter-vehicle communications through media like infrared and microwave using various kinds of protocols, enable vehicles to obtain data to deliver road traffic safety and efficiency, which is otherwise impossible to measure with on-board sensors [19].Reliability of inter-vehicle communication is one of the main aspects to ensuring wide range deployment of cooperative vehicle systems.The inter-vehicle communication allows vehicles to exchange message within a short broadcast range [20] [21].There are different communication protocols are established to support the communication.The most important protocol is Controller Area Network (CAN).It is a serial-bus communication protocol that supporting to connect sensors and actors with ECUs [22].ECUs can be attacked and intruded.Bypassing network security protections in vehicular systems, and embedding malicious code are instances of attacks that can avoid a large number of safety-critical systems.An attacker can get remote code execution of ECU in Wireless Engineering and Technology automotive vehicles through interfaces like Bluetooth.The attacker can influence the behavior of vehicles such as steering braking, acceleration and display, etc.
As we mentioned earlier, due to the weakness of in-vehicle, attackers will be strongly motivated to exploit the vulnerabilities of CAN Bus.In this work, we investigated two types of attacks that occurs in-vehicle: DoS and Fuzzy attacks.
We proposed a classification method to detect these kinds of attacks.

DoS Attack
In DOS attacks the server of the network is flooded with too many requests.As VANETs are using wireless technologies, it is easy to launch a DOS attack very easily.As a result of it, VANET service receivers may not receive, requested services at real time and lead to catastrophic results [23] [24].
In DOS attacks, the attacker can inject high priority messages in a very short period of time into the bus "Figure 1".Not only that, it is easy to gain control of a node in the network by the attacker.Therefore, it is easy to send the highest priority identifiers.Thus, the network is flooded very quickly and eventually will lead to accidents and on.

Fuzzy Attack
In a fuzzy attack the attacker injects messages of randomly spoofed identifiers with arbitrary data.As the result of it, all nodes of the network receive lots of functional messages and it may lead to malfunction of the network.This may lead to mal behavior in vehicles.
To launch a fuzzy attack, the attacker observed in vehicle messages and selected the target identifier/s.This may course to unexpected behaviors.Following "Figure 2

CAN Bus
Recently, the research into CAN bus security has grown because of some dem- There are many researches that have introduced to solve the vulnerabilities of CAN-Bus.Cho and Shin [34]    Hu et al. [36] proposed an efficient hidden Markov model (HMM) training method for system-call-based anomaly intrusion detection.

Intrusion Detection with Machine Learning
Other approaches have discussed detection attack with packet data.Detection insertion attacks by using packet message entropy [37].The drawback of this method was not estimated against attacks that influence only the message contents of a packet.Markovitz et al. [38] introduced a novel domain-aware anomaly detection system for in-car CAN bus traffic.They discovered the presence of semantically-meaningful content field, Multi-Value field and counter or sensor fields through inspection of real CAN bus communication.This method could not have assessed with attack situations.Another approached done by Taylor et al. [39] proposed an anomaly detector based on a Long Short-Term Memory neural network to detect CAN-bus attacks.The system works by learning to predict the coming data word from any destination by bus.The bits that highly in the actual next word are flagged as anomalies.Their detection system is increasing in time CAN bus traffic rather than different data stream.
There is not a serious security system need for traditional vehicles because there is no network to communicate with external network.However, Controller Area Network (CAN) connects the parts of vehicle together.Vehicles become computerized and connected to external networks.If the security is achieved, then safety will be achieved as well.It is important to detect and prevent the attacks in order to protect the safety of people.Therefore, there have been many researches are working in detecting and preventing attacks that target vehicles.Hoppe et al. [40] proposed a scheme for in-vehicle intrusion detection based on the investigation of the rate of messages.Due to the number of messages on CAN bus that includes the sum the normal and attacks messages, they analyzed the rates of messages per seconds in order to detect anomalous message rates.
Muter et al. [41] introduced a method for anomaly detection.The proposed technique proved that there is no false-positive error, but if attacker injects messages and could not outcome and break and effect the CAN, then their algorithm cannot detect the attack at all.Fuad et al. [42] proposed an effective misbehavior detection model based on machine learning techniques.The method has four phases: data acquisition, data sharing, analysis and decision making.They used Artificial Network (ANN) methods using the feed forward and the back propagation algorithms.It works by classifying and training based on historical data from both normal or malicious data.They used a real traffic dataset which called (NGSIM), so that's making their model more effective.
Hortelano et al. [43] proposed evaluate the efficacy of watchdog modules for intrusion detection in VANETs.The scheme works by controlling all coming packets so the system could decide if there is attack or not.There are a lot of IDS was proposed based on watchdog module in vehicle systems.They introduced three contributions in their proposed system.First, they stated that this module is compatible protocol and can work with any protocol in ad hoc routing.
Second, it is a high detection system and can work with low detection system.Wireless Engineering and Technology Finally, it can promise the prior properties and efficiency reducing the most of false positives and false negatives.
Van Herrewege et al. [44] proposed improve to CAN bus messages by adding a message authentication protocol.They stated that the standard authentication protocol is not appropriate to CAN bus.Also, they presented a CAN bus protocol "CANAuth" which is compatible lightweight message authentication protocol, but a pre-shared key need to be known by all the nodes that could making verify messages.
Matsumoto et al. [45] introduced a technique of preventing unauthorized data transmission in CAN.All data on the bus is monitored by a protected ECU.It broadcasts an error message when identifying a spoofed message, and that occurred before the unauthorized transmission is finalized.Most of previous researches investigate about message rate-based intrusion detection on CAN bus, so they need to gather a huge amount of CAN bus messages and the goal is to compute the distribution of a message.Therefore, the modern vehicles have limited computer pow for their devices in order to detect and response immediately.To solve this problem Song et al. [25] suggest a light-weight intrusion detection scheme.The main contribution is simplifying detection algorithm to respond faster and reduce the usage of computing power.

The Proposed Intrusion Detection System
We propose an intrusion detection system that determines the intrusions in vehicles.We use two algorithms based on KNN and SVM to detect the DoS and  [1], DATA [2], DATA [3], DATA [4], DATA [5], DATA [6], DATA [7], flag."Table 1" shows the explaining of some attributes.
The structure of the two datasets is similar, though they represent different types of attacks.The DoS dataset represents DoS attacks, where it involves injecting message of "0000" CAN ID every 0.3 millisecond, we note that "0000" is the most dominant.However, fuzzy attack dataset represents injecting messages of totally random CAN ID and Data vales every 0.0 milliseconds.DOI: 10.4236/wet.2018.9400787 Wireless Engineering and Technology We mentioned before that both datasets are similar, so we applied the same preprocess on to both of them.First, we added appropriate header names to each dataset as they are unmarked with headers.Then, we removed unnecessary columns, which were the Timestamp as we do not have a time-series analysis.
We removed the missing data as well.We also hexadecimal data into decimal format.Finally, we marked the normal messages with 1 and the injected messages with 0. For classification, we used two algorithms of the most well-known classification techniques: Support Vector Machine and K-Nearest Neighbor.First, we made a preprocessing for data as we mentioned above.Then, we extracted the features of each dataset.After that, we implemented KNN and SVM, and we will explain how they work in the next step."Figure 4" shows our proposed model.

K-Nearest Neighbor
There are various analysis algorithms used in machine learning.KNN is a non-parametric algorithm intended for classification and regression.It has various advantages over other machine learning algorithms: ease of interpreting algorithm's output, low calculation, and high predictive power [23].It is a simple algorithm that stores all existing case and classifies new cases by a same measure such as distance functions.Any case in this algorithm is classified based on a majority vote of its neighbors, with the case being allotted to the class most common between its K-nearest measured by a distance function.For example, if K is an integer K = 1, then K is assigned to the class of its nearest neighbor [24].

Support Vector Machine (SVM)
SVM is a supervised machine learning algorithm, and it can be used for either classification or regression challenges [47] [48].However, it is mostly used in classification problems.Therefore, there are many applications of SVM like in E-commerce, Stock marketing, etc.Not like other machine learning algorithms, SVM is based on the concept of decision planes that defines decision boundaries.
It is a type of graphical approach.

Performance Evaluation and Discussion
Through this analysis, we used Python, and the main reason we used python is because it has a free library for the Python programming language called Scikit-learn which helps a lot with machine learning.It delivers a range of supervised and unsupervised learning algorithms by a consistent interface in Python.It features various classification, regression and clustering algorithms including SVM, KNN, linear regression, etc. Actually, Scikitis is designed to interoperate with the Python numerical and scientific libraries like NumPy and SciPy.This library is focused on modeling data but not on loading, manipulating, and summarizing data.In order to detect the intrusive data, we planned to use clustering techniques like using K-means and K-medoids to detect the outlier sample and isolate it from the original data, but we received the data so we used the classification algorithms.Therefore, there is no signal processing and no noise, so we do not need to use a filter like Kalman.

Metric Comparison
Figure 1.DoS attack scenarios on in-vehicle network.

Figure 2 .
Figure 2. DoS attack scenarios on in-vehicle network.
onstrations of the vulnerabilities of in-vehicle networks.Previous approaches to detecting attacks on the CAN bus have mainly been based on timing information.CAN packets are normally transmitted at a regular frequency, so controlling frequency detection to defense against most attacks[25] [26][27].There are several approaches have been provided to be a very effective at detecting inserted and missing packets[28] [29][30] [31][32].There are many of researchers have been proposed anomaly detection methods for connected vehicles.The most of these methods and proposes were related to Controller Area Network (CAN Bus).They are based on characteristics of in-vehicle architectures and networks.Manufacturers must pay attention to CAN Bus standard weakness that might impact the security of vehicles.CAN is a standard that allows communication between numerous mechanisms in modern automobiles.Corey Thuen, senior security consultant at IOActive, explained the attackers can exploit many vulnerabilities in the technology systems of modern vehicles and 27% of these weaknesses due to exploit the CAN protocol.Exploit the CAN is lead to control the connected car.The goal of designing CAN is for half-duplex and high-speed transmission bus inter vehicle network.It delivers up to 1Mbps communication rate[33].The automotive manufactures are commonly used CAN protocol.In CAN, each Electronics Control Unit (ECU) sends a message to the vehicle network using a data packet.There is no clear destination for CAC packet.Therefore, ECU sends the message alone with its ID number and then the ECU on the destination re-Wireless Engineering and Technology trieves the message by sender ID.There are four mains frames inside in the communication inside CAN: the data frame, the remote frame, the error frame and the overload frame.Most of the communication comes over data frames, which creates of the data field, acknowledge field, arbitration field, and Cyclic Redundancy Check (CRC) field.Also, the arbitration field comprises an 11-bit identifier field and a Remote Transmission Request (RTR) field, that is used in arbitration and must be set to a foremost bit of a data frame.It will follow by 8-byte, then the cycle redundancy check field."Figure3" shows the structure of the data frame.
Intrusion detection methods have been deliberate to help the network prevent malicious attacks.Machine learning has been studied extensively in intrusion detection in VANET.In literature, quite number of effective intrusion detection techniques are developed based on machine learning techniques, based on the statement that the forms of the attack packets differ from those the normal packets like other ad hoc network types.

Figure 3 .
Figure 3. CAN data frame structure caption.
Fuzzy attacks.Through the analysis, we use two car-hacking datasets: "DoS dataset" and "fuzzy dataset", which are provided by the Hacking and Countermeasure Research Lab (HCRL) [46].These datasets came from real vehicles by connecting CAN traffic by the OBD-II port.Then, they got the performing of the message injection attacks.Each dataset has 300 intrusions of message injections and each intrusion achieved from 3 to 5 seconds.Each dataset needs from 30 -40 minutes of the CAN traffic.The DoS data set has 3,665,771 numbers of the messages, 3,078,250 messages are normal while 587,521 injected messages.It has 12 columns.Fuzzy dataset contains 3,838,860 rows of messages and 12 columns.It has 2,759,492 normal messages and 1,079,368 injected messages.In DoS attacks the injecting message occurs of '0 × 000' CAN ID, while in fuzzy attack Injecting messages are spoofed random CAN ID and DATA values.The 12 attributes are: Timestamp, CAN ID, DLC, DATA [0], DATA

Figure 6 .
Figure 6.Comparison between metrics for DoS dataset.
[35]osed a Clock-based IDS (CIDS) based Intrusion Detection system to protect in-vehicle Electronic Control Units (ECUs) and mitigate attacks in-vehicle network.The CIDS is able to extract clock skews from message intervals, fingerprints the transmitter ECUs, and models their clock behaviors using Recursive Least Squares (RLS).Then, CIDS could detect intrusions by CUSUM analysis based on the thus-constructed model.They clarify that the experiment applied on real vehicles and CIDS is able to detect various types of in vehicle network intrusions.Wand and Sanjay[35]proposed VeCure applied security method for vehicle systems, which is able to solve the message authentication vulnerability of the CAN Bus.The characteristics of VeCure method are.It's compatible with the modern connected vehicle systems.Besides, it can provide a trust structure.It is considered a novel message authentication method with offline computation ability to decrease the delay and cost of online message processing.VeCure provides a message authentication on the CAN in order to isolate the spoofed messaged injected from a targeted or compromised ECU and OBD-II port.

Table 1 .
Data attributes of CAN.
Figure 8. Accuracy.detectDoS and the Fuzzy attacks which occur on CAN Bus.We use two data sets, one for DoS Attack and other one for Fuzzy attack which is created by HCRL.We preprocessed the data and, then implemented the KNN and SVM algorithms.Both of them provided great results, however, KNN gave better performance than SVM.In future work, we will use some other classification algorithms and make the comparison to get the best one for IDS in vehicles.Besides, we will work to come up with a new method to prevent some attacks on CAN DOI: 10.4236/wet.2018.9400791 Wireless Engineering and Technology