A Privacy Preserving Federated Learning System for IoT Devices Using Blockchain and Optimization
1. Introduction
The integration of the Internet of things (IoT) has driven technological advances in most sectors of human endeavor, such as healthcare [1], power [2], agriculture [3], manufacturing [4], and entertainment [5] [6]. The IoT facilitates autonomous communication between sensors or devices and real-time decision-making based on the vast computational power of cloud and edge computing infrastructures [7]. The sensors generate a massive amount of data, which is central to IoT applications. Conventional methods for sharing this data rely on centralized storage and management, which are prone to threats and data leakage. Additionally, centralized systems are known to suffer from single points of attack or failure [8]. Furthermore, data privacy concerns cannot be ruled out when conventional data sharing methods are used. Since data owners may behave differently or unreliably and data is unevenly distributed, shared models are likely to be compromised. Federated learning addresses this by performing training and model updating without sharing the actual data: it is a distributed machine learning approach in which data samples never need to be exchanged. Moreover, problems of data heterogeneity, privacy preservation, and data availability are addressed in federated systems while minimizing model bias, which makes federated learning a better machine learning paradigm for IoT [7]. However, federated learning still faces several challenges that limit its suitability for real-world IoT applications.
The challenges are resource management problems, especially for resource-constrained IoT devices or sensors when they become central servers; single points of failure for central servers, since any disruption can make collaborative learning inefficient; and poor scalability as the number of participating devices increases. Federated averaging is a popular method for updating the global model during training: the global model is updated by averaging the local models, and devices then receive the updated global model [9]. However, in real-world situations, federated averaging may fail to converge, especially if the data is not evenly distributed among devices and the number of data samples varies appreciably between devices. The reason is that, in the early phases of training, the average-based global model does not always outperform the local models. To this end, there is a need for more efficient methods of updating the global model. Additionally, more robust decentralized systems are needed in the context of federated learning.
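To make the averaging step concrete, the following is a minimal sketch of federated averaging over flat weight vectors. It assumes the common formulation in which each client's weights are scaled by its share of the total number of samples; the function name `fed_avg` and the example values are illustrative assumptions, not taken from [9].

```python
from typing import List, Sequence


def fed_avg(local_weights: Sequence[Sequence[float]],
            sample_counts: Sequence[int]) -> List[float]:
    """Return the sample-weighted average of the clients' weight vectors."""
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per client")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(local_weights[0])
    global_weights = [0.0] * dim
    for weights, count in zip(local_weights, sample_counts):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        # Each client contributes in proportion to its local data size.
        for i, value in enumerate(weights):
            global_weights[i] += value * count / total
    return global_weights


# Hypothetical example: three clients with very uneven data sizes.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
counts = [10, 30, 60]
averaged = fed_avg(clients, counts)
print(averaged)
```

With uneven sample counts, the client holding most of the data dominates the result, which illustrates why the averaged model can be a poor fit for clients with few or differently distributed samples.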
Nowadays, blockchain technology serves as one of the security mechanisms in IoT applications, allowing data to be shared in a decentralized and distributed fashion. Federated learning incorporated into blockchain protects participants from intrusion attacks using advanced machine learning algorithms [10]. Blockchain improves transparency and trust in IoT systems, while federated learning facilitates fast distribution and model training across multiple IoT devices. Table 1 provides the list of abbreviations and their meanings. Motivated by the limitations of [9], this study proposes a blockchain-based federated learning system using an enhanced weighted mean of vectors optimization algorithm for the IoT environment. The specific contributions of this study are as follows:
Table 1. The list of abbreviations used in this paper.
Abbreviation | Full name
AI | Artificial intelligence
IoT | Internet of things
IoMT | Internet of medical things
QoS | Quality of service
PPFLEC | Privacy protection federated learning under edge computing
FedMSQE | Federated learning with minimum square quantization error
FL-PMT | Federated learning-based person movement identification
BiLSTM | Bidirectional long short-term memory
SCALT | Scalable and transferable classification system
MI | Mutual information
QTA | Quality-oriented task allocation
TCL | Trust-based collaborative learning approach
EMT | Encrypted model training scheme
DPoS | Delegated proof of stake
CNN | Convolutional neural network
CHCT | Chameleon hash scheme with a changeable trapdoor
RMB | Redactable medical blockchain
RSA | Rivest-Shamir-Adleman
DES | Data encryption standard
AES | Advanced encryption standard
ODMS-FL | Optimal data management and secured federated learning
MODE | Multi-objective differential evolution
IMOAO | Improved multi-objective Aquila optimizer
OBL | Opposition-based learning
MOCSA | Multi-objective cuckoo search algorithm
PMQoSR | Priority-based multi-objective quality of service routing
WLFA | Whale lion fireworks optimization
MOEA/D | Multi-objective evolutionary algorithm based on decomposition
NSGA-III/OBL | Non-dominated sorting-based genetic algorithm incorporated with opposition-based learning
ODNN | Optimized deep neural network
IoTFECNN | Internet of things feature extraction convolutional neural network
CSA | Capuchin search algorithm
FedAvg | Federated averaging
EINFO | Enhanced weIghted meaN oF vectOrs
1) To propose a security and privacy preservation system for IoT using blockchain-based federated learning.
2) To improve federated learning using an enhanced weighted mean of vectors optimization algorithm.
3) To enhance the delegated proof of stake (DPoS) consensus protocol using a credibility status method for the selection of delegates.
The rest of this paper is organized as follows: Section 2 presents the literature review in three subsections: federated learning systems for IoT, federated learning based on blockchain technology for IoT, and multi-objective optimization problems for IoT. Section 3 provides the proposed system model and problem formulation, while Section 4 discusses the simulation results. Section 5 concludes the paper with recommendations for future work.
2. Literature Review
This section discusses the literature review in three subsections as follows:
2.1. Federated Learning for Internet of Things
Today, the Internet based on 5G and 6G has enabled the deployment of billions of IoT devices [11]. This implies that a massive amount of data will be generated by IoT devices, creating room for big data. Unfortunately, most IoT devices are controlled by central systems, which incur high communication and storage costs and raise security and privacy concerns. Additionally, robust algorithms are required for aggregating data on IoT platforms. To this end, federated learning provides a promising way to overcome these limitations. Federated learning is a data-driven machine learning paradigm that ensures collaborative learning between different participants without disclosing sensitive information about them. Thus, it minimizes the costs of storage and communication while maintaining privacy. Federated learning has certain advantages for the IoT, such as scalability, improved model performance, and privacy preservation. Scalability is achieved because multiple IoT devices can leverage their limited computation resources, including hardware and storage, in parallel, which is especially useful for low-bandwidth IoT devices. Due to the distributed nature of federated learning, more devices can join the network without incurring extra costs on a centralized server. Since a single IoT device may be resource-constrained, it may have insufficient data to train a high-quality model. With federated learning, a single IoT device can collaborate with other devices to train high-quality models without compromising privacy. This means that raw data never leaves the devices during model training; only model update parameters are shared between participants and the server. A review in [12] identified system heterogeneity and statistical heterogeneity as challenges associated with federated learning for the Internet of medical things (IoMT).
The system heterogeneity challenge occurs when each device in federated learning has distinct hardware, processing, communication, and storage capabilities. It limits the maximum efficiency achievable when deploying federated learning and may increase the need for straggler mitigation and fault tolerance. Statistical heterogeneity occurs when the number of data points differs markedly between devices, which may increase model complexity in analysis and evaluation as well as the risk of stragglers. The authors in [13] presented a framework based on federated learning for healthcare IoT. The proposed framework is suitable for decentralized databases while achieving quality of service (QoS) and privacy. However, issues of scheduling and coordination in federated learning are not addressed. To this end, the authors in [14] proposed a privacy protection scheme for federated learning under edge computing (PPFLEC). The scheme preserves privacy using shared secrets and weight masks. It also protects devices against collusion attacks and device dropout while ensuring consistency and integrity using digital signatures. However, the deployment of neural network models is expensive and challenging. To resolve this challenge, a fixed-point quantizer with stochastic rounding is adopted, but how to achieve the minimum square quantization error is not resolved. The work in [15] proposed federated learning with minimum square quantization error (FedMSQE) to minimize the quantization error for every participant in federated learning. However, the problem of malicious participants degrading the model's quality by sharing low-quality data is not addressed. Therefore, the work in [16] addressed this problem by proposing a clustering-based approach that utilizes social content data for selecting participants. Here, different groups of edge participants were established using group-specific federated learning.
Aggregation is performed using the models of different edge groups to achieve a robust global model. However, intrusion attacks from different edge groups are not addressed. To this end, the authors in [17] proposed a method based on federated learning to detect unwanted intrusions among participants. This method guarantees the privacy and security of the local training models while sharing gradient parameters with the central global server. Afterwards, the server aggregates and shares the improved detection algorithm with the participants. However, the memory and computational costs of training on unlabeled data on a cloud server are not addressed. This limitation was tackled by [18] via the federated learning-based person movement identification (FL-PMT) system. Deep reinforcement learning is used in the system to auto-label the unlabeled data, which is then used to train the model. In this case, the edge server sends parameters, rather than sensor data, through the cloud. In addition, bidirectional long short-term memory (BiLSTM) is employed in a number of smart healthcare system operations to categorize data. However, how to determine the contribution rate of each participant is not considered. The authors in [19] presented an IoT healthcare system based on federated learning. Instead of using dataset size to estimate the contribution rate, the method uses the quality of each participant's dataset. In addition, a dropout-tolerant strategy keeps the federated learning process from being terminated as long as the number of online participants is equal to or greater than a predetermined threshold. The proposed method is also resistant to attacks such as model inversion and reconstruction. However, how to improve latency is not considered. The study in [20] makes use of clustered federated learning and edge computing to diagnose COVID-19.
The proposed method trains a multi-modal machine learning model that can diagnose COVID-19 in both X-ray and ultrasound images, enabling intelligent processing of visual input at the edge. However, health data may contain unknown classes in both distribution and appearance. Therefore, the authors in [21] proposed a scalable and transferable classification system, known as SCALT. The system is a one-classifier-per-class federated learning system that consists of a one-dimensional convolutional network used for feature extraction and an individual mini-classifier for every class. When a new class emerges, scaling is simple because only a mini-classifier needs to be trained. The feature extractor is updated only when the system is moved to a new task. Another work in [22] combined deep neural networks, federated learning, and mutual information (MI) for effective feature selection and extraction. The proposed method can be used for anomaly detection of intrusions in the IoT network. In addition, IoT devices store information locally for model training, while the models' weights are modified and shared with a centralized server. However, none of the works in [11]-[22] solved the problem of global model aggregation in synchronous and asynchronous federated learning contexts.
2.2. Federated Learning Based on Blockchain Technology for Internet of Things
The IoT paradigm allows a significant number of physical devices to be connected to the Internet, where massive amounts of data are generated. The data generated from IoT devices is useful for training high-quality machine learning models, more specifically deep learning models. This means that patterns can be easily inferred and appropriate intelligent decisions can be derived from data via learning models. Unfortunately, the IoT paradigm is centralized, which makes it vulnerable to privacy leakage, scalability issues, security threats, etc. Federated learning, among many solutions, reduces privacy leakage through collaborative training without disclosing sensitive information, but it is still prone to security threats, trust issues, and single points of failure or attack. By integrating blockchain technology, federated learning for IoT systems can be holistically secure, and problems of trust and single points of attack can be resolved [23]. A survey in [24] discussed the current challenges of blockchain-based federated learning for IoT applications, which include privacy leakage; the need for a lightweight blockchain for federated learning and IoT that balances privacy and security, storage and communication cost, and scalability and power consumption; lazy clients; stragglers; statistical heterogeneity; system heterogeneity; unsupervised federated learning; and artificial intelligence (AI)-enabled smart contracts. Additionally, an efficient real-time privacy policy is required to secure participants' data in healthcare IoT applications. Therefore, the authors in [25] presented a blockchain and federated learning-empowered secure architecture for privacy preservation in smart healthcare. Blockchain-based cloud platforms are also employed for privacy and security. However, personalized healthcare demands cannot be met by a one-size-fits-all model due to the diversity of health problems among patients.
A blockchain-enabled personalized federated learning system that enables users to train personalized models without directly uploading sensitive data was proposed in [26] as a solution to this issue. However, only a subset of IoT devices with limited training data is selected to perform federated learning tasks due to budget constraints. This problem was overcome by the authors in [27], who presented a blockchain-based federated learning market to decentralize federated learning via blockchain, maximize the amount of training data obtained under a given budget, and provide data for devices with limited resources. They proposed the quality-oriented task allocation algorithm (QTA) to assign suitable devices to complete federated learning tasks while optimizing training quality under a set budget, and the trust-based collaborative learning approach (TCL) for data sharing among trusted devices. In order to fend off attacks from malicious devices, an encrypted model training scheme (EMT) employs countervailable differential privacy technology. Furthermore, a fair reward distribution is guaranteed using the proposed contribution-driven DPoS consensus mechanism.
However, the heterogeneity problem is not resolved. The work in [28] addressed the heterogeneity problem in federated learning by designing system, model, and data tiers to reduce heterogeneity and propagate high-quality models for each federated client. In addition, blockchain-enabled federated learning lowers latency and consumption while preserving privacy. Two technological challenges were not solved, namely how to provide correct global model aggregation via a centralized federated learning server and how to develop techniques that encourage federated learning clients to contribute their computing resources and time. The authors of [29] addressed these issues by putting forth a decentralized solution based on federated learning and blockchain. They also proposed a trust-decentralized loop federated learning consensus protocol to manage resources within IoMT. During the aggregation process, suitable features are chosen using a hybrid weighted-leader exponential distribution optimization algorithm, which accounts for the different degrees of variation exhibited by the features. These chosen features are then sent to the training phase through the proposed pyramid squeeze attention generative adversarial networks in order to categorize the data as positive or negative. To achieve more efficiency in blockchain-based federated learning, the authors in [30] integrated blockchain and federated learning in a fog-IoT network. The proposed approach utilizes the distributed structure of the fog-IoT network to produce an adaptive network for IoT devices while preserving privacy. However, the problems of latency, data sufficiency, traceability, and privacy concerns due to the increasing amount of heterogeneous data are not solved. The authors in [31] proposed a federated learning-enabled blockchain-based framework, known as PPFchain, to guarantee the security and privacy preservation of IoT devices.
In PPFchain, cryptographic primitives and a federated learning model are employed to ensure privacy in off-chain fog nodes, while blockchain is used to achieve low cost and high performance in the network. Another work in [32] presented a technique based on blockchain and deep learning to preserve the privacy of electronic health records. In this technique, a convolutional neural network (CNN) is employed to distinguish between normal and abnormal users, while blockchain-based federated learning processes abnormal users and removes them from the database along with their access to the health records. However, federated learning remains susceptible to tampering, because malicious individuals can readily modify model updates. To address this limitation, the authors of [33] created a chameleon hash scheme with a changeable trapdoor (CHCT) for secure federated learning in industrial IoT environments. Under the proposed scheme, the use of trapdoors is subject to a number of restrictions. Furthermore, a redactable medical blockchain (RMB) is built as an implementation of the CHCT scheme. However, a blockchain-based federated learning approach may not be the most plausible solution, as models can easily be manipulated and intercepted during transmission between users and servers. Therefore, the authors in [34] proposed a system that ensures the security of the model transmitted between the components of federated learning. The model is encrypted with the Rivest-Shamir-Adleman (RSA), data encryption standard (DES), and advanced encryption standard (AES) algorithms, while the checksum is determined using a hash function and stored along with a private key in the blockchain. However, there is a need to improve existing solutions for safeguarding and effectively administering the sensitive data generated by IoT devices.
The authors in [35] designed an optimal data management and secured federated learning (ODMS-FL) system that integrates blockchain. The proposed system is capable of addressing the unique requirements of IoMT, such as decentralized data administration, federated learning, and security. In addition, data management is enhanced using blockchain, which facilitates the retrieval, adequate storage, and exchange of data without compromising privacy.
2.3. Multi-Objective Optimization Problems for the Internet of Things
The IoT paradigm allows interactions among systems, processes, software, and technology over the Internet. It provides room for improving system efficiency but increases factors such as time of operation, energy consumption, delay, and workload, which may conflict with one another. To mitigate this problem, the factors can be formulated as a single-objective or multi-objective optimization problem. In [36], the authors provided a multi-objective evolutionary algorithm in which rapid mutation operators and multi-objective differential evolution (MODE) are employed to address the problem of stagnation at a local optimum. They considered the objectives of IoT services, such as load, energy consumption, delay, and service cost, as the basis for the multi-objective optimization problem. The Pareto front in the local space provides sufficient diversity and accelerates the rate of convergence. However, because of the resource constraints of IoT devices, task completion time increases along with delay. Therefore, the authors in [37] presented an improved multi-objective Aquila optimizer (IMOAO) with a Pareto front to offload tasks from devices to fog nodes while minimizing delay. Opposition-based learning (OBL) is used to improve the IMOAO algorithm and achieve sufficient diversity. However, improper deployment of IoT devices on either a federated learning system or fog computing can lead to resource and bandwidth waste, a rise in energy consumption, and a poor QoS level. Thus, in [38], the authors proposed a mechanism to reduce bandwidth wastage, energy consumption, and single points of failure. A multi-objective problem is also formulated to minimize both energy consumption and delay between the components of the IoT network. The resulting combinatorial optimization problem is solved using the multi-objective cuckoo search algorithm (MOCSA).
However, the energy constraints of IoT devices lead to an improper estimation model for the control and monitoring of end-to-end communication and sensing. It is also challenging to achieve QoS requirements in IoT networks. Therefore, in [39], a multi-objective optimization method for QoS routing is proposed to distinguish traffic classes while achieving better data communication. A mechanism based on energy-efficient priority-based multi-objective QoS routing (PMQoSR) is also designed to ensure QoS and energy efficiency in IoT networks. The whale lion fireworks optimization method with fitness function routing (WLFA) is an optimization technique combining three hybrid algorithms, which the proposed system uses to control routing performance based on QoS criteria. The WLFA uses priority labels and time delay patterns while transferring data to the destination in order to minimize localization errors, prevent congestion, and choose the shortest path through the network. However, IoT device selection, which is an NP-hard problem, needs to be solved. The authors in [40] formulated the IoT device selection problem as a multi-objective problem and incorporated OBL into the general framework of a multi-objective evolutionary algorithm based on decomposition (MOEA/D). Convergence and diversity are also enhanced using a many-objective algorithm, known as the non-dominated sorting-based genetic algorithm incorporated with OBL (NSGA-III/OBL). However, there is a challenge with IoT service placement in fog. Therefore, in [41], the authors provided a conceptual framework based on fog-cloud control to optimize IoT service placement. An automated planning model is formulated to manage service requests because of the heterogeneity of IoT resources and applications. In addition, automated evolutionary-based particle swarm optimization was employed to solve the IoT service placement problem while maximizing resource utilization and improving the QoS of fog.
However, achieving optimal security is challenging in the IoT network. To this end, the authors in [42] improved a meta-heuristic-based clustering protocol to achieve optimal communication. Here, an optimized deep neural network (ODNN) is used to detect malicious IoT devices based on their energy characteristics. Energy characteristics are determined using optimal cluster head selection, optimal routing, and neighborhood-based spider monkey optimization. However, due to the dynamic nature of the IoT environment, intrusion detection systems are paramount. In [43], the authors presented a multi-objective evolutionary CNN to detect intrusions. In the proposed approach, the CNN is the classifier that detects intrusions, while MOEA/D is used to simplify the parameter tuning process of the CNN. Specifically, MOEA/D simultaneously optimizes the two competing goals of the CNN model, namely detection performance and model complexity. This is achieved through a novel encoding scheme that converts the CNN's topological architecture into a chromosome. It allows MOEA/D to produce a variety of CNN models with different intrusion detection performances and complexities. Similar work in [44] designed an IoT feature extraction CNN (IoTFECNN) to detect anomalies in the IoT network. In addition, a binary multi-objective enhanced Capuchin search algorithm (CSA), called BMECapSA, is proposed to efficiently select features. However, there is a lack of collaboration between application containers, and there are resource allocation problems in the IoT-based cloud. The work in [45] presented a container model named Band-area application container to represent the variety of real-world things. An artificial fish swarm algorithm is also proposed for optimizing container-enabled task scheduling. Another work in [46] presented multi-objective combinatorial convex optimization to minimize execution cost, together with a blockchain-enabled cost-efficient scheduling algorithm framework to address deadline and security challenges in IoMT.
However, it is more challenging to satisfy the requirements of industrial IoT systems. Therefore, the authors in [47] presented a multi-agent deep reinforcement learning-based offloading method to satisfy the various requirements of different tasks in cloud-edge device computing.
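Several of the works surveyed in this subsection rely on a Pareto front to trade off conflicting objectives such as energy consumption and delay. The following minimal sketch shows how non-dominated solutions are identified when every objective is minimized. The candidate values and the function names `dominates` and `pareto_front` are illustrative assumptions and do not come from any of the cited works.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (energy, delay) pairs for five candidate deployments.
candidates = [(5.0, 9.0), (6.0, 7.0), (7.0, 8.0), (9.0, 4.0), (8.0, 6.0)]
front = pareto_front(candidates)
print(front)
```

In this example the candidate (7.0, 8.0) is dropped because (6.0, 7.0) is better in both objectives, while the remaining candidates each represent a different trade-off between the two objectives.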
3. The Proposed System Model
The proposed system model is presented in Figure 1. It comprises three layers: federated learning, an optimization algorithm, and a blockchain network. IoT device security and privacy can be achieved through the use of federated learning and blockchain technology. In this research, we adopt Paillier encryption for security and privacy instead of differential privacy or anonymization techniques, which could complicate model training and auditing. This study assumes that IoT devices have limited resources and are unable to transmit data and conduct both local and global training at the same time. Future work is expected to address this limitation by lowering the overall computation cost and transmission delay of the proposed system through robust machine learning and game-theoretic methodologies. During local and global model training, blockchain enables federated clients, i.e., IoT devices, to exchange model parameters through uploading and downloading. Additionally, every aspect of IoT device operation is protected against both internal and external threats. The IoT devices in this study are in charge of local task initialization and model bootstrapping [48].
Algorithm 1. The proposed blockchain-based federated learning with enhanced weighted mean vector optimization algorithm.
1: Parameter: [...]
2: Initialization: [...]
3: Generate initial population using Equation (15)
4: Determine best vector
5: for [...] to T do
6:   for [...] to N do
7:     Client n performs local training using Equation (1)
8:     Client n updates Equation (2)
9:   end for
10:  if [...], then
       Equations (3) and (4)
11:  else
       Equations (5) and (6)
12:  end if
13:  if [...], then
       Equation (7)
14:  else if [...], then
       Equation (8)
15:  else
       Equation (9)
16:  end if
17:  if [...], then
18:    Equation (10)
19:  else
       [...]
20:  end if
21:  [...]
22:  [...], where [...] is a random number
23:  if [...] then
24:    [...]
25:    Update [...]
26:  end if
27: end for
IoT devices can create new blocks on the blockchain by acting as validators and initiators. In a specific case, a validator turns into a miner based on how many stakes it has. Therefore, the DPoS consensus protocol is adopted in this work [49]. Miners hash all of the recently generated blocks and digitally sign them. In the blockchain, the signed block is contained as a transaction. Over the network, the validators disseminate confirmation messages for blocks. Prior to the creation of any block in the blockchain, the miners are in charge of making this determination. Upon receiving the blocks from a miner, validators have to confirm their authenticity by comparing the signatures on each block with those that have previously been stored in the blockchain. A new block is formed when the miner obtains a sufficient number of confirmation messages from the validators, provided that the number of valid messages exceeds the number of invalid messages. The most recent block is then appended to the end of the blockchain in chronological order. Each block of the blockchain consists of the nonce, the previous hash value, the current hash value, and the address of the data block, as shown in Figure 1. The data block, nonce, and hash value from the previous block are used by the IoT devices to create the new hash value of the current block. Each IoT device has a memory address to store all blockchain transactions and a wallet address to store cryptocurrency. In our proposed system, DPoS counters the activities of malicious nodes by maintaining network security and integrity via frequent delegate voting and delegate rotation. Furthermore, in the DPoS consensus protocol, IoT devices select delegates to vote on behalf of those who chose them, and a delegate is dismissed if it underperforms. The delegate is given the opportunity to elect validators to propagate new blocks. This means that the computational power required to mine or create new blocks is drastically reduced [49].
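The block layout and confirmation rule described above can be sketched as follows. This is an illustrative sketch only: the use of SHA-256, the JSON serialization, and the names `Block`, `compute_hash`, `make_block`, and `is_confirmed` are assumptions, since the text does not specify the hash function or the message format.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    nonce: int
    previous_hash: str
    current_hash: str
    data_address: str  # address of the data block, not the data itself


def compute_hash(nonce: int, previous_hash: str, data_block: bytes) -> str:
    """Hash the data block, nonce, and previous hash (SHA-256 is assumed)."""
    payload = json.dumps(
        {"nonce": nonce, "previous_hash": previous_hash,
         "data": data_block.hex()},
        sort_keys=True,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_block(nonce: int, previous_hash: str, data_block: bytes,
               data_address: str) -> Block:
    """Build a block whose current hash links it to its predecessor."""
    current = compute_hash(nonce, previous_hash, data_block)
    return Block(nonce, previous_hash, current, data_address)


def is_confirmed(confirmations: List[bool]) -> bool:
    """A block is accepted when valid messages outnumber invalid ones."""
    valid = sum(1 for ok in confirmations if ok)
    invalid = len(confirmations) - valid
    return valid > invalid


# Hypothetical two-block chain and a validator vote on the second block.
genesis = make_block(0, "0" * 64, b"genesis", "addr-0")
update = make_block(1, genesis.current_hash, b"encrypted model update", "addr-1")
chain = [genesis]
if is_confirmed([True, True, False]):
    chain.append(update)  # appended in chronological order
```

Because each hash covers the previous hash, altering an earlier data block changes every later hash, which is what lets validators detect tampering by comparing against stored values.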
In the process of selecting delegates, the reputation of delegates is considered rather than their stakes alone. Note that every IoT device is assigned a node in the blockchain, which implies that each node has a distinct reputation status. To this end, the number of nodes that vote changes accordingly. At the start of each election, the credibility status of a node is calculated as follows:
(11)
where
is the number of votes,
is the reputation score, which can be calculated using page ranking [50], direct or indirect trust computation, etc., and
is the degree of honesty. If
, then the node behaves honestly and can be selected as a delegate provided it has more stakes; otherwise, the node is dishonest. The credibility values from Equation (11) are sorted in ascending order to determine which node will be selected first as a delegate.
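The Paillier encryption adopted in the proposed system is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of their plaintexts. The following toy sketch illustrates this property on gradient values. The small primes, the fixed-point scale of 1000, and all function names are illustrative assumptions; the key size is far too small for real use, and a practical deployment would rely on a vetted library with keys of at least 2048 bits.

```python
import math
import secrets

# Toy parameters (assumption): two small Mersenne primes, insecure by design.
P, Q = 2**13 - 1, 2**17 - 1
SCALE = 1000  # fixed-point scale used to encode real-valued gradients


def keygen(p: int, q: int):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # mu is the modular inverse of L(g^lam mod n^2), with L(x) = (x - 1) // n.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    n, g = pub
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m % n, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    m = ((pow(c, lam, n * n) - 1) // n * mu) % n
    return m - n if m > n // 2 else m  # map back to signed integers


def add_encrypted(pub, c1: int, c2: int) -> int:
    """Ciphertext product corresponds to plaintext sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen(P, Q)
client_a = [0.125, -0.5]   # hypothetical local gradients
client_b = [0.25, 0.75]
enc_a = [encrypt(pub, round(v * SCALE)) for v in client_a]
enc_b = [encrypt(pub, round(v * SCALE)) for v in client_b]
enc_sum = [add_encrypted(pub, x, y) for x, y in zip(enc_a, enc_b)]
summed = [decrypt(pub, priv, c) / SCALE for c in enc_sum]
print(summed)
```

Under these assumptions, an aggregator can combine encrypted gradients without ever seeing the individual values, and only the holder of the private key recovers the sum.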
Figure 1. The proposed blockchain-based federated learning for IoT environment.
The federated learning layer consists of a central server and federated clients. Federated learning is designed on the principle of training machine learning models over decentralized servers or devices holding local data without disclosing sensitive information. This addresses the concerns of security and privacy. In Figure 1, federated clients (i.e., IoT devices) train on local data using the consensus machine learning model, e.g., CNN. Local model training and updates are carried out on the federated client, and the local model is transmitted to the server, while the proposed optimization algorithm is used to optimize the global gradient parameter. Using Paillier encryption, the server uploads encrypted model and gradient parameters to, and downloads them from, the blockchain. The server receives the model parameters and processes them further. Global gradient updates and aggregated model parameters are sent from the federated server to the federated clients. In a federated learning system, federated averaging, named FedAvg, is a common strategy that updates the global model by averaging the local models and then sends the updated global model to the devices for their local model update [9] [51]. However, FedAvg may diverge in real-world scenarios, particularly when the data is not uniformly distributed across devices and the quantity of data samples differs noticeably between devices. This is because the average-based global model is not always superior to local models in the early stages of the training process. Moreover, several variants of FedAvg exist, covering theoretical guarantees, momentum methods for clients, adaptive FedAvg, and lazy and quantized gradients [52]. Unlike the work in [53], which excludes the central server from federated learning in a peer-to-peer environment, this study solves the problem of FedAvg by proposing an enhanced weIghted meaN oF vectOrs, named EINFO, for optimizing the gradients collected from devices during local model training. Note that EINFO is derived from the studies in [54] [55]. The proposed EINFO optimization algorithm addresses the problem of balancing exploration and exploitation found in most optimization algorithms. Let the vector
, and we consider the case of decentralized federated learning with momentum where client
holds the appropriate copy of parameters
and computes the unbiased estimate of
, which is defined as [52]
(12)
where
is the data distribution of the nth client and
is the loss function associated with the training data
. The client n updates its local parameters
as the weighted average of its neighbors:
where N is the number of clients. The client updates its parameters as
, where learning rate
and
is the global gradient vector. For simplicity, at every
iteration, each client computes
(13)
where
. We optimize Equation (13) using the proposed EINFO optimization algorithm, and the weighted mean
is calculated as [54].
(14)
where
, which is calculated based on
wavelet function; during the optimization process, the wavelet function is used to produce noticeable fluctuations.
is a constant number called the dilation parameter, and
is the fitness function. In the proposed EINFO algorithm, the weighted mean vector for the search space is calculated while the population is generated based on a set of vectors that describe the possible solutions. Besides, the proposed EINFO is described in the following stages:
3.1. Population Initialization Stage
The proposed EINFO optimization algorithm consists of a population of x vectors in an N-dimensional search space with k decision variables, which corresponds to the number of clients in the federated learning. The random population is defined as follows:
(15)
where
and
are the lower and upper bounds of the population,
is a random number, and there are two control parameters: weighted mean factor
where
, r is a random value,
is the maximum number of generations, and scaling factor
. The control parameters are employed to amplify the obtained vector through an updating rule that depends on the size of the search space. Moreover, the control parameters can be tuned dynamically on the basis of population generation.
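The initialization stage can be sketched as follows. Equation (15) is not reproduced here, so the sketch assumes the common uniform form, in which each component is the lower bound plus a random number in [0, 1) times the bound width. The function name `init_population`, its parameters, and the bounds in the usage note are hypothetical.

```python
import random


def init_population(pop_size, dim, lower, upper, seed=None):
    """Draw pop_size vectors of length dim uniformly between lower and upper.

    Assumed form (not taken from the paper): x = lower + rand * (upper - lower),
    with rand drawn independently in [0, 1) for every component.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    return [
        [lower + rng.random() * (upper - lower) for _ in range(dim)]
        for _ in range(pop_size)
    ]
```

For example, `init_population(30, 5, -600.0, 600.0, seed=0)` produces 30 candidate vectors of 5 variables each, all inside the chosen bounds.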
3.2. Updating Rule Stage
In this stage, a new vector is created using the updating rules. The updating rules increase population diversity during the search procedure. We modified the mean-based rule of [54] using the inverse square law. The mean-based rule is derived from the weighted mean of the random vectors, where it begins with the initial population and moves to the next solution using the weighted mean information. Moreover, the updating rule accelerates the convergence rate, which helps reach the optimum solution and improves the algorithm’s performance. The mean-based rule is designed on the basis of the best
, better
, and worst
solutions. The mean-based rule is defined as follows:
(16)
where
is a random value.
(17)
where
is a constant with a small value and
is the fitness function such that
(18)
(19)
(20)
(21)
(22)
where
(23)
(24)
(25)
(26)
This study adds convergence acceleration to the updating rules of the proposed EINFO optimization algorithm to enhance global search capability via the best solutions. The convergence acceleration is defined as follows:
(27)
Therefore, the new vector
is defined as:
(28)
4. Simulation Result
This section evaluates the performance of the proposed EINFO optimization algorithm and compares its performance with the existing INFO algorithm [54]. We consider the population size
of 30 and the maximum iteration
of 500. The values of other parameters used in this study are given in Table 2. Initially, the study uses Griewank’s function as the optimization benchmark to evaluate the algorithms. Moreover, other optimization objective functions are considered in the evaluations. The proposed system is implemented in Python 3.10 on a machine with 8 GB of RAM and a 1.60 GHz processor.
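For reference, Griewank’s function in its standard form can be written as a short Python sketch. The function name `griewank` is illustrative, and the dimension and search bounds used in the experiments are not stated here, so none are assumed.

```python
import math


def griewank(x):
    """Standard Griewank function: 1 + sum(x_i^2) / 4000 - prod(cos(x_i / sqrt(i))).

    The index i starts at 1. The global minimum is 0 at x = (0, ..., 0).
    """
    sum_term = sum(xi * xi for xi in x) / 4000.0
    prod_term = math.prod(
        math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1)
    )
    return 1.0 + sum_term - prod_term
```

The product of cosines creates many regularly spaced local minima around the global minimum, which is why the function is a common test of how well an optimizer balances exploration and exploitation.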
Table 2. The list of parameters and variables used in this paper.
Variable/parameter | Meaning | Value
NV | Number of voters | 30
RS | Reputation score | [0, 1]
α | Degree of honesty | [0, 1]
c | Constant number | 2
d | Constant number | 4
 | Maximum population | 30
 | Number of iterations | 500
N | Number of clients | 1000
4.1. Evaluation of the Objective Function
Table 3 reports the performance of the proposed EINFO, evaluated in terms of best cost and execution time. After around 30 iterations, the proposed EINFO exhibits the smallest optimal cost and surpasses the existing approach in terms of average objective function values. The proposed EINFO’s performance shows that the algorithm is capable of appropriate exploration and search. Additionally, as Figure 2 illustrates, there is a suitable balance between exploration and exploitation, preventing early convergence. By utilizing the inverse square law of Equation (14), an improvement over the work in [54], the updating rule enhances both local and global search. However, there is a tradeoff between best cost and execution time, as presented in Table 3.
Table 3. Comparison of the proposed EINFO with INFO in terms of best cost and execution time.
Model | Best cost | Execution time (s)
INFO | 7.8744e−54 | 3.1160
Proposed EINFO | 4.3501e−05E | 3.3649
Figure 2. Evaluation based on convergence.
The convergence analysis presented in Figure 2 indicates that the proposed EINFO avoids premature convergence because it can search both locally and globally in the solution space, finding a high density of solutions in the vicinity of the global optimum. It indicates that the proposed EINFO can effectively discover the optimal solutions by examining promising regions of the search space that have the best cost.
4.2. Evaluation of the Proposed Federated Learning
Table 4 shows the performance of the proposed system in terms of accuracy, sensitivity, and specificity metrics. Specificity is the true negative rate and reveals more about the model than the accuracy metric, especially if the numbers of positive and negative instances are imbalanced. Sensitivity, the true positive rate, is defined analogously [56]. This study compares the proposed model with the multi-layer perceptron (MLP) [57], radial basis function neural network (RBFNN) [58], and extreme learning machine (ELM) [59].
In the table, it is observed that the proposed system model outperforms existing models in terms of accuracy, specificity, and sensitivity with higher values. The performance is achieved because of the optimized gradient parameter of the model using the proposed EINFO optimization algorithm. Moreover, the binary classification of the model is enhanced.
Table 4. Comparison with other models.
Model | Accuracy | Sensitivity | Specificity
MLP | 0.95 | 0.94 | 0.95
RBFNN | 0.90 | 0.90 | 0.91
ELM | 0.82 | 0.80 | 0.86
Proposed system (CNN) | 0.96 | 0.97 | 0.96
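The three metrics reported in Table 4 follow from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name `classification_metrics` and the label convention (1 for positive, 0 for negative) are assumptions for illustration, and it does not reproduce the paper's evaluation pipeline or data.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Sensitivity: true positive rate, TP / (TP + FN).
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: true negative rate, TN / (TN + FP).
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return accuracy, sensitivity, specificity
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, since a model can reach high accuracy while performing poorly on the minority class.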
4.3. Evaluation of Proposed Blockchain-based Federated Learning System
In the proposed system, the blockchain may split into different states, known as forks, which creates room for compromise. A fork implies that there are disagreements among delegates. In this scenario, two or more delegates solve a nonce of the blockchain simultaneously [60]. The probability that a fork occurs follows a Poisson distribution, which is as follows:
(29)
where
is block propagation delay and
is the delegate;
is the expected time to generate blockchain, and
is the degree of honesty. The block propagation delay of the proposed blockchain-based federated learning is defined as:
(30)
where
Kbits is the block header size,
Kbits is the transaction size,
is the block height, and
Mbits is the capacity of the transmission link. As shown in Figure 3, we assume different values of
and L. It is observed that the probability of fork creation slowly approaches zero for each value of
and
. It implies that a new block can be propagated if the value of
. Furthermore, only delegates with high reputation scores can mine and propagate new blocks. As the number of delegates increases, the probability of having a fork approaches zero, irrespective of the client’s transmission capacity.
Using Equation (11), it is observed in Figure 4 that the credibility status approaches one as the number of clients increases along with
. It implies that delegates have a high level of reputation. Furthermore, credibility status is directly proportional to reputation scores. Hence, the proposed DPoS consensus protocol is efficient in resolving the problem of indiscriminate propagation of new blocks.
In Figure 5, the evaluation of the proposed system model in terms of communication latency is shown for different numbers of clients. It is observed from the figure that
and
are considered in computing the communication delay.
Figure 3. Evaluation of blockchain-based federated learning.
Figure 4. Evaluation of proposed delegated proof of stake.
We consider different communication delays (in percent) to analyze the proposed system. Besides, when the
and communication delay is 10% and above, the communication response time is minimal because fewer clients serve as validators and delegates. It implies that communication delays among clients are minimized. Also, as N continues to increase, the proposed system achieves better response times for different values of communication delay. It shows the efficacy of the proposed system in terms of communication delay.
Figure 5. Evaluation of response time.
5. Conclusion
The paper proposes an enhanced weighted mean of vectors optimization algorithm, EINFO, in a blockchain-based federated learning system. The proposed EINFO tackles the drawbacks of federated averaging during global update and model training, which arise when data is not uniformly distributed across devices and the quantity of data samples differs between devices. The EINFO algorithm optimizes the shared model parameters by employing a well-defined structure and updating the vector positions through local searching, vector combining, and updating rules. The weighted mean vector based on the inverse square law is used to create new vectors and enhance the model convergence rate to expand the exploration and exploitation capabilities. To choose validators and miners and to propagate new blocks, a delegated proof of stake based on the reliability of blockchain nodes is proposed. Federated learning is integrated into the blockchain to protect nodes from both external and internal threats. Extensive simulations are run to determine how well the proposed system performs relative to existing models in the literature. The simulation results show that the proposed system outperforms existing schemes in terms of accuracy, sensitivity, and specificity. In the future, this study hopes to collaborate with relevant stakeholders for the real-time implementation of the proposed system model. Moreover, this paper does not discuss the communication model between the IoT devices as they exchange model updates with the server. A privacy and security analysis will also be carried out in future work to ascertain the robustness of the proposed system model against security and privacy threats and attacks such as the 51% attack, impersonation attacks, and transaction hacking.
Similarly, since the storage capacity of the blockchain is limited, we hope to use an off-chain storage system such as IPFS, so that the blockchain stores the data address instead of the real data. Furthermore, analyses of convergence speed and computational complexity will be conducted in future work.