A Systematic Survey for Differential Privacy Techniques in Federated Learning
1. Introduction
Machine learning is a branch of artificial intelligence and computer science that focuses on using data and algorithms to mimic how humans learn (Bishop and Nasrabadi [1]). Many fields, such as natural language processing (Nadkarni et al. [2]), computer vision (Jarvis [3]), and bioinformatics (Fatima and Pasha [4]), have benefited from machine learning. An essential prerequisite for the success of machine learning is the availability of large amounts of high-quality data. With such data, machine learning models can discover patterns and perform tasks that are difficult for humans to carry out, such as fraud detection (Bolton and Hand [5]), face recognition (Zhao et al. [6]), and speech recognition (Reddy [7]).
Conventional machine learning models are trained centrally, meaning that all the training data must be gathered in one location (e.g., a data center). As a result, they face a number of challenges in practice. The first challenge is the breach of privacy (Ji et al. [8]). Numerous data privacy breaches have resulted from centralizing personal data in one place (Ouadrhiri and Abdelhadi [9]). Google+, for instance, exposed personally identifiable information such as the name, email address, occupation, gender, age, and relationship status of approximately 500,000 users because of a system vulnerability in 2018. Facebook compromised the personal information (e.g., phone numbers, login IDs, and full names) of approximately 533 million users in 2021. Twitter compromised 5.4 million accounts in 2022, including phone numbers, locations, URLs, and profile pictures. The second challenge is the problem of data silos. Several countries around the world have passed strict laws to safeguard the privacy of personal information, including the General Data Protection Regulation of the European Union and the Data Security Law of the People’s Republic of China. While these laws protect the security of data, they also restrict the flow of data, resulting in the problem of data silos.
A major challenge in the field of artificial intelligence today is how to resolve the problem of data silos while maintaining data privacy and security. Towards this end, Google proposed federated learning in 2017 (McMahan et al. [10]). Federated learning is a distributed machine learning approach that enables multiple devices to collaboratively train the same machine learning model by exchanging only model parameters or intermediate results rather than local data, thus balancing data privacy protection against shared computation. In federated learning, the data of each participant is stored locally instead of being gathered on the central server, so the security of each participant’s data can be maintained to a certain extent. Since its introduction, federated learning has rapidly received attention from both academia and industry, and many mature federated learning frameworks have been developed, for example, TensorFlow Federated developed by Google, PySyft developed by OpenMined (Ziller et al. [11]), and FATE developed by WeBank (Liu et al. [12]).
While the data of each participant is stored locally in a federated learning framework, there is still a risk of data leakage (Li et al. [13]). In the course of training a federated learning model, each participant must transmit information about the model parameters (e.g., gradients) to the central server. A number of studies show that the central server can infer participants’ local data from gradient information, which is known as an inference attack (Nasr et al. [14]). To make federated learning more secure, scholars have introduced differential privacy techniques into federated learning. Combining differential privacy techniques with federated learning can effectively mitigate the data leakage problem of federated learning models.
Although some literature reviews on differentially private federated learning exist, they have shortcomings. First, federated learning is developing rapidly and many of the latest research results have not yet been reviewed, so this paper summarizes the latest research results on differentially private federated learning. Second, research on the optimization of differentially private federated learning models is also important for the development of this field but has received less attention from scholars. Therefore, this paper summarizes the research on optimization techniques for differentially private federated learning models. Finally, differentially private federated learning has been applied in many fields, and this paper summarizes these applications.
This paper reviews the recent advances in differential privacy techniques for federated learning. The rest of this paper is structured as follows. In Section 2, we review some background knowledge about federated learning and differential privacy. In Section 3, we review the recent advances in federated learning with central differential privacy, local differential privacy, and distributed differential privacy, respectively. In Section 4, we review the algorithm optimization techniques and communication cost optimization techniques for differentially private federated learning models. In Section 5, we review some recent applications of differentially private federated learning models. In Section 6, we propose some future directions. In Section 7, we draw conclusions.
2. Fundamental Principles
In this section, we review some of the basic concepts of federated learning and differential privacy.
2.1. Federated Learning
In a federated learning model, multiple devices or clients collaborate in training a machine learning model by exchanging only model parameters or intermediate results without exchanging local data. Federated learning therefore consists of two key components: the central server and the federated learning clients. Let $\mathcal{N} = \{1, 2, \dots, N\}$ be the set of federated learning clients. Each client $i \in \mathcal{N}$ has a local data set $D_i$ with data structure $(X_i, Y_i)$, where $X_i$ represents the features and $Y_i$ represents the labels. The central server is responsible for deciding the architecture of the machine learning model (e.g., logistic regression) and then sends the model information and the model initialization parameters $w^0$ to all federated learning clients. Following the central server’s instructions, each client downloads the initial parameters, trains the machine learning model with its local data, and uploads the model parameters to the central server after the training is completed. The central server aggregates the parameters uploaded by all clients to form the global model parameters. To make the trained model more effective, the above process is repeated until the model converges. Specifically, federated learning can be divided into the following three steps: initialization, local training, and global aggregation. The workflow of federated learning is shown in Figure 1.
Figure 1. Workflow of federated learning.
Step 1. (Initialization) The central server decides the architecture of the machine learning model and sends the initial parameters $w^0$ to each client.
Step 2. (Local Training) In the t-th round, client i downloads the parameter $w^t$ from the central server and updates the parameter by minimizing its loss function based on its local data set $D_i$, i.e.,
$$w_i^{t+1} = \arg\min_{w} F_i(w), \tag{1}$$
$$F_i(w) = \frac{1}{|D_i|} \sum_{j \in D_i} f_j(w), \tag{2}$$
where $f_j(w)$ is the loss function for data sample j and is dependent on the underlying machine learning model; $F_i(w)$ is the loss function for client i; $|D_i|$ is the number of data samples in $D_i$. Equations (1) and (2) are usually solved by the stochastic gradient descent method, i.e.,
$$w_i^{t+1} = w^t - \eta \nabla F_i(w^t), \tag{3}$$
where $\nabla F_i(w^t)$ is the gradient of $F_i$ at $w^t$ and $\eta$ is the learning rate. Client i uploads the intermediate results (e.g., $w_i^{t+1}$) to the central server.
Step 3. (Global Aggregation) The central server collects the intermediate results from each client and updates the global model parameters $w^{t+1}$ through the global model aggregation algorithm. For example, under FedAvg [10],
$$w^{t+1} = \sum_{i \in \mathcal{N}} \frac{|D_i|}{\sum_{k \in \mathcal{N}} |D_k|} \, w_i^{t+1}. \tag{4}$$
In addition to FedAvg, many variants of FedAvg (e.g., FedProx [15], FedPAQ [16], Turbo-Aggregate [17], FedMA [18], HierFAVG [19]) can also be used.
For more details on federated learning models, the readers may refer to Yang et al. [20], Rehman and Gaber [21], and Ludwig and Baracaldo [22].
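The three steps of initialization, local training, and global aggregation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [10]: it assumes a one-parameter linear model $y = w x$ with squared loss, full-batch gradient steps, and synthetic data, and the helper names (`local_update`, `fedavg_round`, `make_client`) are hypothetical.

```python
import random

def local_update(w, data, lr=0.1, epochs=5):
    """Local training: refine the global parameter w on one client's data.

    Assumed model (illustrative only): y = w * x with squared loss.
    """
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def fedavg_round(w_global, client_datasets):
    """Global aggregation: average client parameters weighted by data set size."""
    total = sum(len(d) for d in client_datasets)
    return sum(len(d) / total * local_update(w_global, d) for d in client_datasets)

def make_client(n, rng, true_w=3.0):
    """Synthetic local data set: y = true_w * x plus a little noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, true_w * x + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(42)
clients = [make_client(n, rng) for n in (20, 50, 30)]

w = 0.0  # initialization: parameter chosen by the server
for _ in range(30):  # communication rounds
    w = fedavg_round(w, clients)
```

Only the scalar `w` travels between the server and the clients; the `(x, y)` pairs never leave `local_update`, which is the property federated learning relies on.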
2.2. Differential Privacy
Differential privacy (Dwork and Roth [23]) is a data protection technique based on probability theory. The idea behind it is that if, for two adjacent databases (i.e., two databases differing in only one record), the statistics derived from them cannot be used to deduce that single record, then the records in the database are considered secure.
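To this end, Dwork and Roth [23] first give the definition of the distance between databases and the definition of a randomized algorithm, and then give the concept of differential privacy. A randomized algorithm $\mathcal{M}$ with domain A and (discrete) range B is associated with a mapping $M: A \to \Delta(B)$ from A to the probability simplex over B, denoted $\Delta(B) = \{x \in \mathbb{R}^{|B|} : x_i \ge 0 \text{ for all } i \text{ and } \sum_{i=1}^{|B|} x_i = 1\}$. Databases x and y are collections of records from a universe $\mathcal{X}$ and are represented by their histograms (i.e., $x \in \mathbb{N}^{|\mathcal{X}|}$, in which each entry $x_i$ represents the number of elements in the database x of type $i \in \mathcal{X}$), so the distance between x and y can be given by $\|x - y\|_1$, where $\|\cdot\|_1$ is the $\ell_1$-norm.
Definition 1. (Dwork and Roth [23]) A randomized algorithm $\mathcal{M}$ with domain $\mathbb{N}^{|\mathcal{X}|}$ is $(\varepsilon, \delta)$-differentially private if for all $S \subseteq \mathrm{Range}(\mathcal{M})$ and for all $x, y \in \mathbb{N}^{|\mathcal{X}|}$ such that $\|x - y\|_1 \le 1$:
$$\Pr[\mathcal{M}(x) \in S] \le e^{\varepsilon} \Pr[\mathcal{M}(y) \in S] + \delta. \tag{5}$$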
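As described above, the definition of differential privacy guarantees privacy theoretically, but implementation requires perturbing the data by adding noise. By defining
$$\Delta f = \max_{x, y \in \mathbb{N}^{|\mathcal{X}|},\ \|x - y\|_1 = 1} \|f(x) - f(y)\|_1 \tag{6}$$
as the $\ell_1$-sensitivity of a deterministic algorithm $f: \mathbb{N}^{|\mathcal{X}|} \to \mathbb{R}^k$, Dwork and Roth [23] add a random variable $Y \sim \mathrm{Lap}(\Delta f / \varepsilon)$ to the deterministic algorithm, propose the Laplace mechanism, and prove that the Laplace mechanism preserves $(\varepsilon, 0)$-differential privacy.
Definition 2. (Dwork and Roth [23]) Given a deterministic algorithm $f: \mathbb{N}^{|\mathcal{X}|} \to \mathbb{R}^k$, the Laplace mechanism is defined as:
$$\mathcal{M}_L(x, f(\cdot), \varepsilon) = f(x) + (Y_1, \dots, Y_k), \tag{7}$$
where $Y_1, \dots, Y_k$ are i.i.d. random variables drawn from $\mathrm{Lap}(\Delta f / \varepsilon)$.
Theorem 1. (Dwork and Roth [23]) The Laplace mechanism preserves $(\varepsilon, 0)$-differential privacy.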
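As an illustration of the Laplace mechanism, the sketch below perturbs a counting query, whose $\ell_1$-sensitivity is 1 because adding or removing one record changes the count by at most 1. It is a minimal example rather than a hardened implementation: the function names and the toy data are hypothetical, and floating-point sampling of this kind is known to need extra care in production systems.

```python
import random

def laplace_noise(scale, rng):
    """Sample Lap(scale) as the difference of two i.i.d. exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_mechanism(values, l1_sensitivity, epsilon, rng=None):
    """Add i.i.d. Lap(l1_sensitivity / epsilon) noise to every coordinate of `values`."""
    if epsilon <= 0 or l1_sensitivity <= 0:
        raise ValueError("epsilon and l1_sensitivity must be positive")
    rng = rng or random.Random()
    scale = l1_sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]

# Toy database (hypothetical): how many people are at least 40 years old?
ages = [23, 35, 41, 29, 52, 67, 38]
true_count = sum(1 for a in ages if a >= 40)

# Counting queries have l1-sensitivity 1; epsilon = 0.5 is an arbitrary choice.
noisy_count = laplace_mechanism([true_count], l1_sensitivity=1.0, epsilon=0.5,
                                rng=random.Random(0))[0]
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of a less accurate count.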
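By defining the sensitivity of a utility function $u: \mathbb{N}^{|\mathcal{X}|} \times \mathcal{R} \to \mathbb{R}$ over an arbitrary range $\mathcal{R}$ as
$$\Delta u = \max_{r \in \mathcal{R}} \ \max_{x, y:\ \|x - y\|_1 \le 1} |u(x, r) - u(y, r)|, \tag{8}$$
McSherry and Talwar [24] propose the exponential mechanism and prove that the exponential mechanism preserves $(\varepsilon, 0)$-differential privacy.
Definition 3. (McSherry and Talwar [24]) The exponential mechanism $\mathcal{M}_E(x, u, \mathcal{R})$ selects and outputs an element $r \in \mathcal{R}$ with probability proportional to $\exp\left(\frac{\varepsilon \, u(x, r)}{2 \Delta u}\right)$.
Theorem 2. (McSherry and Talwar [24]) The exponential mechanism preserves $(\varepsilon, 0)$-differential privacy.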
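The exponential mechanism is easiest to see on a selection problem. The following sketch privately picks the most common item in a small list of votes, using the vote count as the utility function (its sensitivity is 1, since one record changes any count by at most 1). The function name and the toy data are hypothetical, and the utility here is passed as a function of the candidate only, with the database captured in a closure.

```python
import math
import random

def exponential_mechanism(candidates, utility, sensitivity, epsilon, rng=None):
    """Pick one candidate with probability proportional to
    exp(epsilon * utility(r) / (2 * sensitivity))."""
    rng = rng or random.Random()
    scores = [epsilon * utility(r) / (2.0 * sensitivity) for r in candidates]
    top = max(scores)
    # Subtracting the maximum score avoids overflow and leaves the ratios unchanged.
    weights = [math.exp(s - top) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]

# Toy database (hypothetical): each record is one vote.
votes = ["a", "b", "a", "c", "a", "b"]
candidates = ["a", "b", "c"]

def vote_count(r):
    """Utility of candidate r: the number of votes it received."""
    return votes.count(r)

winner = exponential_mechanism(candidates, vote_count, sensitivity=1.0,
                               epsilon=1.0, rng=random.Random(0))
```

With a large `epsilon` the true winner is returned almost surely, while a small `epsilon` makes the output close to uniform over the candidates.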
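By defining
$$\Delta_2 f = \max_{x, y \in \mathbb{N}^{|\mathcal{X}|},\ \|x - y\|_1 = 1} \|f(x) - f(y)\|_2 \tag{9}$$
as the $\ell_2$-sensitivity of a deterministic algorithm $f: \mathbb{N}^{|\mathcal{X}|} \to \mathbb{R}^k$, Nikolov et al. [25] add a random variable $Y \sim \mathcal{N}(0, \sigma^2)$ to the deterministic algorithm, propose the Gaussian mechanism, and prove that the Gaussian mechanism preserves $(\varepsilon, \delta)$-differential privacy.
Definition 4. (Nikolov et al. [25]) Given a deterministic algorithm $f: \mathbb{N}^{|\mathcal{X}|} \to \mathbb{R}^k$, the Gaussian mechanism is defined as:
$$\mathcal{M}_G(x, f(\cdot), \varepsilon) = f(x) + (Y_1, \dots, Y_k), \tag{10}$$
where $Y_1, \dots, Y_k$ are i.i.d. random variables drawn from $\mathcal{N}(0, \sigma^2)$ and $\sigma > 0$ is the noise scale.
Theorem 3. (Nikolov et al. [25]) Let $\varepsilon \in (0, 1)$ be arbitrary. For $c^2 > 2 \ln(1.25 / \delta)$, the Gaussian Mechanism with parameter $\sigma \ge c \Delta_2 f / \varepsilon$ is $(\varepsilon, \delta)$-differentially private.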
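A short sketch shows how the noise scale of the Gaussian mechanism is typically calibrated. It uses the classical bound $\sigma = \sqrt{2 \ln(1.25/\delta)} \, \Delta_2 f / \varepsilon$ for $\varepsilon \in (0, 1)$, taken at the boundary of the condition on $c$ as is common in practice. The function names are hypothetical, and tighter calibrations exist that are not shown here.

```python
import math
import random

def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Noise scale from the classical bound, valid for 0 < epsilon < 1."""
    if not 0 < epsilon < 1:
        raise ValueError("the classical bound requires 0 < epsilon < 1")
    if not 0 < delta < 1:
        raise ValueError("delta must lie in (0, 1)")
    c = math.sqrt(2.0 * math.log(1.25 / delta))  # boundary value of c
    return c * l2_sensitivity / epsilon

def gaussian_mechanism(values, l2_sensitivity, epsilon, delta, rng=None):
    """Add i.i.d. N(0, sigma^2) noise to every coordinate of `values`."""
    rng = rng or random.Random()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return [v + rng.gauss(0.0, sigma) for v in values]

# Example: a 3-dimensional statistic with l2-sensitivity 1 (assumed values).
noisy = gaussian_mechanism([0.2, -0.1, 0.4], l2_sensitivity=1.0,
                           epsilon=0.5, delta=1e-5, rng=random.Random(0))
```

Unlike the Laplace mechanism, the noise here scales with the $\ell_2$-sensitivity, which is why the Gaussian mechanism is usually preferred for high-dimensional gradients.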
For more details about differential privacy, the readers may refer to Dwork [26], Dwork and Roth [27], and Ji et al. [8].
3. An Overview of Federated Learning with Differential Privacy
During the federated learning process, the federated clients need to transmit parameters (e.g., gradients) to the central server, which may leak the clients’ local data. To protect this data, the federated clients and the central server must use data protection techniques. Differential privacy is a probability-based data privacy protection technique that has been successfully applied in federated learning. The studies of differential privacy techniques in federated learning can be divided into three categories: federated learning with central differential privacy, federated learning with local differential privacy, and federated learning with distributed differential privacy. In this section, we provide an overview of federated learning with differential privacy. Specifically, we first review the recent advances in federated learning with central differential privacy (Subsection 3.1), then those in federated learning with local differential privacy (Subsection 3.2), and finally those in federated learning with distributed differential privacy (Subsection 3.3).
3.1. Federated Learning with Central Differential Privacy
In federated learning with central differential privacy, a trusted central server adds noise to the global parameters to protect local data. The workflow of federated learning with central differential privacy is shown in Figure 2.
Figure 2. Workflow of federated learning with central differential privacy.
Geyer et al. [28] note that vanilla federated learning can be subject to differential attacks, thus initiating the study of federated learning with central differential privacy. In their approach, a trusted server adds noise to the aggregated results in order to protect against differential attacks. According to numerical experiments, this approach provides data security protection at the expense of accuracy. Triastcyn and Faltings [29] propose Bayesian differential privacy as a means of providing more precise privacy loss bounds. Their experiments show that Bayesian differential privacy significantly reduces noise, reduces the number of communication rounds, and improves the accuracy of trained models by up to 10%. Wei et al. [30] propose a novel approach, NbAFL, which adds artificial noise to parameters on the client side prior to aggregation. NbAFL can satisfy central DP under different levels of protection by properly adapting the variances of the artificial noise. Furthermore, the authors develop a theoretical convergence bound for the loss function of the FL model trained with NbAFL. Bernau et al. [31] examine inference attacks on central differential privacy. Zhang et al. [32] propose a clipping-enabled FedAvg, which combines the clipping technique with federated learning and central differential privacy. They demonstrate the relationship between clipping bias and the distribution of the clients’ updates by analyzing the convergence of FedAvg with clipping. Hu et al. [33] present a new differentially private FL scheme referred to as Fed-SMP, which provides client-level DP guarantees while maintaining high model accuracy. To minimize the impact of privacy protection on model accuracy, Fed-SMP employs a new technique called Sparsified Model Perturbation (SMP), which involves sparsifying local models before perturbing them with additive Gaussian noise. Extensive experiments on real-world datasets demonstrate Fed-SMP’s capability to improve model accuracy while simultaneously reducing communication costs. Table 1 summarizes the recent advances in central differential privacy.
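The server-side noise addition that characterizes central differential privacy can be sketched as follows. This is a generic clip-average-perturb pattern written under stated assumptions, not the algorithm of any one paper cited above: updates are plain lists of floats, and `clip_norm` and `noise_multiplier` are hypothetical parameter names. Mapping `noise_multiplier` to a concrete $(\varepsilon, \delta)$ guarantee requires a privacy accountant, which is omitted.

```python
import math
import random

def clip_l2(update, clip_norm):
    """Scale `update` down so that its l2-norm is at most `clip_norm`."""
    norm = math.sqrt(sum(u * u for u in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [u * factor for u in update]

def central_dp_aggregate(updates, clip_norm, noise_multiplier, rng=None):
    """Trusted server: clip each client update, average, then add Gaussian noise.

    After clipping, replacing one client changes the average by at most
    2 * clip_norm / n in l2-norm, so the noise is scaled to clip_norm / n.
    """
    rng = rng or random.Random()
    n, dim = len(updates), len(updates[0])
    clipped = [clip_l2(u, clip_norm) for u in updates]
    mean = [sum(u[k] for u in clipped) / n for k in range(dim)]
    sigma = noise_multiplier * clip_norm / n
    return [m + rng.gauss(0.0, sigma) for m in mean]

# Two clients, two parameters (toy values).
client_updates = [[3.0, 4.0], [0.3, 0.4]]
noisy_mean = central_dp_aggregate(client_updates, clip_norm=1.0,
                                  noise_multiplier=1.0, rng=random.Random(0))
```

Note that the server sees every raw update before clipping, which is exactly why this setting needs the server to be trusted.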
3.2. Federated Learning with Local Differential Privacy
In federated learning with central differential privacy, a necessary condition for securing client data is that the central server is trusted. If the central server is honest but curious, the local clients’ data will be leaked to it (e.g., Li et al. [34] and Melis et al. [35]). A more secure framework is therefore federated learning with local differential privacy, in which each client adds noise to the parameters it uploads to the central server in order to secure its local data. The workflow of federated learning with local differential privacy is shown in Figure 3.
Table 1. Summary of contributions in central differential privacy.
Figure 3. Workflow of federated learning with local differential privacy.
Local differential privacy is first formalized by Kasiviswanathan et al. [36]. They show that a concept class is learnable by a local differentially private algorithm if and only if it is learnable in the statistical query model. Erlingsson et al. [37] propose a privacy-preserving mechanism called Randomized Aggregatable Privacy-Preserving Ordinal Response, or RAPPOR. They demonstrate that RAPPOR allows the collection of statistics on the population of client-side strings with strong privacy guarantees for each client and without linking the reports of the clients. Liu [38] proposes a generalized Gaussian (GG) mechanism based on $\ell_p$ global sensitivity and demonstrates that the GG mechanism reaches DP at a specified level of privacy. Truex et al. [39] focus on federated learning with high-dimensional, continuous, high-precision client data, a setting in which existing LDP protocols cannot be applied. The authors therefore propose LDP-Fed, which provides a formal differential privacy guarantee for repeated collection of model parameters in federated neural network training over multiple individual participants’ private datasets. Sun et al. [40] examine whether differential privacy can protect against backdoor attacks and demonstrate that norm clipping and weak differential privacy mitigate attacks without affecting overall performance. Xu et al. [41] address the situation where sensitive data about each user must be collected from multiple services independently and can be combined. The authors focus on preventing the privacy guarantee from being compromised during the joint collection of data and on how to analyze perturbed data from different services jointly. Towards this end, they propose mechanisms and estimation methods to process multidimensional analytical queries. Naseri et al. [42] investigate the robustness of local and central differential privacy techniques in FL. Their experiments show that these techniques can defend against backdoor attacks. Wang et al. [43] propose a local differential privacy-based framework (named FedLDA) for federated learning of LDA models, as well as a novel LDP mechanism called Random Response with Priori (RRP). According to theoretical results, the framework provides guarantees regarding both data privacy and model accuracy. Girgis et al. [44] propose a novel shuffle privacy model, in which each client randomizes its response and the server only receives a random shuffle of the clients’ responses. In sub-sampled shuffled models, numerical results demonstrate significant improvements in privacy guarantee over the state-of-the-art approximate differential privacy guarantee. Wei et al. [45] propose a user-level differential privacy (UDP) algorithm that adds artificial noise to the shared models before they are uploaded to the servers. By varying the variances of the artificial noise processes, they demonstrate that the UDP framework can achieve $(\varepsilon, \delta)$-LDP for the i-th mobile terminal with adjustable privacy levels. Furthermore, they derive a theoretical upper bound for the convergence of UDP. Zhou et al. [46] propose a novel privacy-preserving federated learning framework for edge computing (PFLF). In PFLF, each client and the central server add noise before sending data. To protect the privacy of clients, they develop a flexible arrangement mechanism for determining the optimal number of training iterations for each client, and prove that PFLF guarantees the privacy of clients and servers during the entire training process. As Thapa et al. [47] observe, federated learning and split learning are two distributed machine learning methods that perform similar tasks. Federated learning does not provide as much privacy as split learning, whereas split learning is slower than federated learning. Thapa et al. [47] combine federated learning with split learning and present a novel approach, splitfed learning (SFL), as well as a revised architectural configuration that incorporates differential privacy to enhance data privacy and model robustness. Wu et al. [48] present FedPerGNN, a federated GNN framework that is capable of both effective and privacy-preserving personalization. The experimental results on six datasets demonstrate that FedPerGNN achieves 4.0 to 9.6 percent lower errors than the state-of-the-art federated personalization methods under good privacy protection. To secure cross-silo federated learning, Wang et al. [49] propose a three-plane approach in which local differential privacy is applied to user data before it is uploaded. According to theoretical results, LDP provides strong data privacy protection while retaining user data statistics, thereby maintaining high utility. Zhang et al. [50] propose federated f-differential privacy, a new notion specifically tailored to federated settings, based on the framework of Gaussian differential privacy. They then design PriFedSync as a generic framework for private federated learning. Table 2 summarizes the recent advances in local differential privacy.
Table 2. Summary of contributions in local differential privacy.
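The defining feature of the local model, namely that perturbation happens on the client before anything is uploaded, can be illustrated with binary randomized response, a classic $\varepsilon$-LDP primitive. The sketch below is a toy example with hypothetical function names and synthetic bits; it is not the protocol of any specific paper in Table 2. Each client reports its true bit with probability $e^{\varepsilon} / (1 + e^{\varepsilon})$ and the flipped bit otherwise, and the untrusted server debiases the average.

```python
import math
import random

def randomized_response(bit, epsilon, rng):
    """Client side: keep the true bit with probability e^eps / (1 + e^eps)."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

def estimate_mean(reports, epsilon):
    """Server side: unbiased estimate of the fraction of clients holding a 1.

    E[report] = (1 - p) + (2p - 1) * mean, which is inverted here.
    """
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

rng = random.Random(0)
true_bits = [1] * 6000 + [0] * 14000           # 30% of clients hold a 1
reports = [randomized_response(b, 1.0, rng) for b in true_bits]
estimate = estimate_mean(reports, 1.0)
```

The server never sees a true bit, yet the population statistic is still recoverable; the price is extra variance, which grows as `epsilon` shrinks.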
3.3. Federated Learning with Distributed Differential Privacy
Both central differential privacy and local differential privacy have shortcomings. Central differential privacy requires a trusted central server; if the central server is malicious, the data of the federated learning clients will be compromised, and a trusted central server is hard to find in practice. Under local differential privacy, each client must add a large amount of noise to the intermediate results it uploads in order to satisfy the local differential privacy condition. Although the local data of the clients are then secure, the aggregated results obtained by the federated server contain too much noise, which leads to poor privacy-utility trade-offs. To address these shortcomings, scholars have proposed distributed differential privacy, a model that guarantees data security while limiting the amount of added noise. In this model, each federated learning client adds only a small amount of noise, chosen so that the aggregated result at the central server satisfies central differential privacy. Because this small amount of noise cannot by itself guarantee the security of local data, the clients also use the secure aggregation technique (Bonawitz et al. [51]), so that the federated server obtains only the aggregate of the intermediate parameters of all clients and not the intermediate parameters of any individual client, thus securing the clients’ local data. The workflow of federated learning with distributed differential privacy is shown in Figure 4.
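The interplay between small per-client noise and secure aggregation can be sketched with pairwise additive masks that cancel in the sum. The example below is a heavily simplified simulation and not the protocol of [51]: a single random generator stands in for pairwise shared secrets, there is no dropout handling, values are scalars in fixed-point encoding, and rounded Gaussian noise stands in for a proper discrete noise distribution. All names and constants are hypothetical.

```python
import random

MODULUS = 2 ** 32   # all arithmetic is done modulo this value (assumed)
SCALE = 1000        # fixed-point scaling factor (assumed)

def encode(x):
    """Map a real value to an integer modulo MODULUS."""
    return int(round(x * SCALE)) % MODULUS

def decode(v):
    """Map an integer modulo MODULUS back to a signed real value."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE

def masked_uploads(values, sigma_total, rng):
    """Each client adds a small share of the noise, then pairwise masks.

    Per-client noise has variance sigma_total^2 / n, so the summed noise
    has variance sigma_total^2 (before rounding).
    """
    n = len(values)
    sigma_client = sigma_total / n ** 0.5
    uploads = [encode(v + rng.gauss(0.0, sigma_client)) for v in values]
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.randrange(MODULUS)          # secret shared by clients i and j
            uploads[i] = (uploads[i] + mask) % MODULUS
            uploads[j] = (uploads[j] - mask) % MODULUS
    return uploads

def server_sum(uploads):
    """The server only learns the sum; the masks cancel modulo MODULUS."""
    return decode(sum(uploads) % MODULUS)

values = [0.5, -1.25, 2.0]                         # one scalar per client (toy data)
uploads = masked_uploads(values, sigma_total=0.1, rng=random.Random(7))
noisy_total = server_sum(uploads)
```

Each individual upload looks uniformly random to the server, while the decoded sum carries only the combined noise of all clients.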
Figure 4. Workflow of federated learning with distributed differential privacy.
Dwork et al. [52] propose a binomial mechanism and prove that it achieves $(\varepsilon, \delta)$-differential privacy. Agarwal et al. [53] apply the binomial mechanism to federated learning and propose a stochastic k-level quantization method and a randomized rotation method. Their results show that the binomial mechanism with the stochastic k-level quantization method and the randomized rotation method can achieve nearly the same utility as the Gaussian mechanism, yet requires fewer representation bits. Canonne et al. [54] propose a discrete Gaussian mechanism. According to the authors, discrete Gaussian noise can provide essentially the same level of privacy and accuracy as continuous Gaussian noise, both theoretically and experimentally. Building on the work of Canonne et al. [54], Kairouz et al. [55] apply the discrete Gaussian mechanism to federated learning with secure aggregation, and their results show that the model provides central differential privacy with a bounded mean squared error. Agarwal et al. [56] propose a new multi-dimensional Skellam mechanism, whose noise is the difference of two independent Poisson random variables. According to their findings, even when the precision of the Skellam mechanism is low, it provides the same privacy-accuracy trade-off as the continuous Gaussian mechanism. Bao et al. [57] propose a Skellam mixture mechanism (SMM) based on injecting random noise from a mixture of two shifted symmetric Skellam distributions. The SMM is found to satisfy $(\varepsilon, \delta)$-differential privacy. By applying SMM to federated learning with distributed SGD, the authors show that SMM improves model utility by eliminating the step of rounding the gradients. Chen et al. [58] propose a Poisson binomial mechanism (PBM), which encodes local information as parameters of a binomial distribution, resulting in discrete outputs. Theoretical results show that PBM satisfies $(\varepsilon, \delta)$-approximate differential privacy, with explicit bounds on its communication cost and its MSE. Chen et al. [59] characterize the fundamental communication cost required to obtain the best accuracy achievable under $\varepsilon$ central differential privacy. Theoretical results show that $\tilde{O}(\min(n^2 \varepsilon^2, d))$ bits per client, where n is the number of clients and d is the dimension, are both sufficient and necessary for obtaining the best accuracy achievable under $\varepsilon$-differential privacy. Cheu et al. [60] investigate the shuffling method in distributed differential privacy and show that this model provides the power of the central model while avoiding the need to trust a central server and the complexity of cryptographic secure function evaluation. Jiang et al. [61] focus on the client dropout problem in distributed differential privacy and propose a distributed differentially private FL framework, named Hyades. Results show that Hyades is capable of managing client dropout in various realistic scenarios and achieving the optimal privacy-utility trade-off. Table 3 summarizes the recent contributions in distributed differential privacy.
Table 3. Summary of contributions in distributed differential privacy.
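Several of the mechanisms in this subsection operate on discrete values, so clients first quantize their updates. A minimal sketch of unbiased stochastic k-level quantization is given below; it follows the general idea of rounding to one of k evenly spaced levels at random so that the expected value is preserved, and it is not claimed to match the exact procedure of any cited paper. Function names and the value range are hypothetical.

```python
import random

def stochastic_quantize(x, x_min, x_max, k, rng):
    """Round x to one of k evenly spaced levels in [x_min, x_max].

    The upper neighbouring level is chosen with probability equal to the
    relative distance from the lower one, which makes the result unbiased.
    Returns the level index in 0..k-1 (about log2(k) bits per coordinate).
    """
    if k < 2 or x_min >= x_max or not x_min <= x <= x_max:
        raise ValueError("need k >= 2 and x_min <= x <= x_max with x_min < x_max")
    if x >= x_max:
        return k - 1
    step = (x_max - x_min) / (k - 1)
    pos = (x - x_min) / step
    lower = min(int(pos), k - 2)
    p_up = pos - lower
    return lower + 1 if rng.random() < p_up else lower

def dequantize(level, x_min, x_max, k):
    """Map a level index back to its real value."""
    return x_min + level * (x_max - x_min) / (k - 1)

rng = random.Random(3)
levels = [stochastic_quantize(0.3, -1.0, 1.0, 4, rng) for _ in range(20000)]
average = sum(dequantize(l, -1.0, 1.0, 4) for l in levels) / len(levels)
```

Because the rounding is unbiased, averaging many quantized updates recovers the true mean, and integer noise can then be added to the level indices before secure aggregation.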
4. Optimization Techniques in Federated Learning with Differential Privacy
Applying differential privacy techniques to federated learning models can effectively protect clients’ data security. However, differential privacy techniques also lead to problems such as decreased model accuracy and increased communication costs. Scholars have therefore started to optimize federated learning models with differential privacy, and the main directions of optimization are algorithm accuracy and communication cost.
Zhou and Tang [62] design a differentially private distributed algorithm based on the stochastic variance reduced gradient (SVRG) algorithm, which is capable of preventing the learning server from accessing and inferring private training data. The authors further quantify its impact on learning in terms of convergence rate and show that noise added at each gradient update results in a bounded deviation from the optimal way of learning. Hu et al. [63] address the issue of privacy-preserving techniques for federated learning models under heterogeneous client data sets. They show that the model satisfies $(\varepsilon, \delta)$-differential privacy when the Gaussian mechanism is used by each client. Van Dijk et al. [64] propose a new algorithm for asynchronous federated learning that eliminates waiting times while at the same time reducing overall network communication. They demonstrate how their algorithm can be made differentially private by adding Gaussian noise. Girgis et al. [65] focus on the stochastic gradient descent algorithm for solving federated learning models with local differential privacy and propose a distributed, communication-efficient, and locally differentially private stochastic gradient descent algorithm (CLDP-SGD), along with a detailed analysis of its communication, privacy, and convergence trade-offs. Zhang et al. [32] examine the impact of clipping on federated learning with differential privacy and provide a convergence analysis of a differentially private (DP) FedAvg algorithm. Zhang et al. [66] propose a federated learning scheme based on differential privacy and mechanism design. In addition to differential privacy mechanisms, two dominant-strategy truthful, individually rational, and budget-balanced mechanisms are designed to motivate clients to participate in training. Experiments demonstrate the effectiveness of the proposed scheme. Using optimal private linear operators on adaptive streams, Denisov et al. [67] present an improved differentially private SGD algorithm. The proposed algorithm achieves significant improvements on a notable problem in federated learning with user-level differential privacy.
Lian et al. [68] present COFEL, a novel federated learning system that reduces communication time through layer-based parameter selection and enhances privacy protection through local differential privacy. In addition, they propose the COFEL-AVG algorithm for global aggregation as well as a layer-based parameter selection method, which enables the selection of the most valuable parameters for global aggregation in order to optimize the communication and training process. Amiri et al. [69] discuss the communication costs brought about by differential privacy and present a novel algorithm for compressing client-server communications through quantization in order to achieve both differential privacy and reduced communication overhead. Liu et al. [70] propose a Projected Federated Averaging (PFA) scheme to explicitly model and leverage the heterogeneous privacy requirements of different clients and optimize utility for the joint model while minimizing communication cost. Truex et al. [71] present a novel approach that combines differential privacy and SMC, thus enabling users to reduce the growth of noise injection as the number of parties increases without sacrificing privacy while maintaining a pre-defined rate of trust. Table 4 summarizes the optimization techniques in differentially private federated learning models.
5. Applications of Differentially Private Federated Learning
Differential privacy-preserving federated learning techniques can effectively address data security issues in federated learning, and thus have achieved important applications in many fields.
Table 4. Summary of contributions in optimization techniques.
Andrés et al. [72] investigate the application of differential privacy techniques in protecting customer geolocation data. They present a mechanism for achieving geo-indistinguishability by adding controlled random noise to the user’s location. Wang et al. [73] consider local differential privacy protection for both qualitative data (e.g., categorical data) and discrete quantitative data (e.g., location data). They derive a k-subset mechanism and an efficient extension of the k-subset mechanism for categorical data and discrete quantitative data, respectively. Zhao et al. [74] study the application of federated learning in the Internet of Vehicles. In this context, user data, such as traffic information and vehicle registration information, may be exposed. The authors therefore propose a novel local differential privacy mechanism, named Three-Outputs, to protect the privacy of clients’ data, and propose LDP-FedSGD to train the model. Cao et al. [75] examine the application of differentially private federated learning in the context of the Power Internet of Things. The authors propose IFed, a novel federated learning framework that takes into account the trade-off between local differential privacy, data utility, and resource consumption, to allow electric providers who normally have adequate computing resources to assist users in the Power Internet of Things. Jia et al. [76] propose a blockchain-enabled differentially private federated learning scheme for the Industrial Internet of Things (IIoT). Extensive experimental results show that the proposed scheme and working mechanism perform better on the selected indicators. Olowononi et al. [77] propose the use of FL together with differential privacy to improve the resiliency of vehicular cyber-physical systems to adversarial attacks in connected vehicles. Liu et al. [78] propose a federated learning framework for distributed medical institutions to collaboratively learn a prediction model. In comparisons with state-of-the-art methods and in in-depth ablation experiments, the proposed method performs better on two medical image segmentation tasks. Kaissis et al. [79] present PriMIA, a differentially private federated learning framework for image analysis. They evaluate its performance and privacy guarantees theoretically and empirically, and demonstrate that the protections provided prevent gradient-based model inversion attacks from regenerating usable data. Adnan et al. [80] investigate the application of differentially private federated learning to the analysis of histopathology images. Comparing the conventional machine learning model with the federated learning model, the authors find that the federated learning model achieves similar performance while providing strong privacy guarantees. Zhang et al. [81] investigate the application of differentially private federated learning models to industrial cyber-physical systems. The authors propose a Privacy-Enhanced Momentum Federated Learning framework called PEMFL, which incorporates differential privacy (DP), momentum federated learning (MFL), and chaos-based encryption methods. Theoretical analysis and experimental results demonstrate the accuracy and privacy security of PEMFL. Liu et al. [82] investigate the application of differentially private federated learning to wireless sensor networks. By integrating hybrid differential privacy into federated learning, the authors propose a secure and reliable federated learning algorithm, whose validity is demonstrated by a theoretical analysis and an experimental evaluation on real-world datasets. Table 5 summarizes the applications of differentially private federated learning models.
6. Future Directions
Table 5. Summary of contributions in applications of DPFL.
Differential privacy techniques have had some success in federated learning. Enterprises such as Microsoft, Apple, and Google have applied differential privacy-preserving federated learning models to their operations (Cormode et al. [83] ). However, existing research still has gaps. In this section, we discuss three possible future research directions for differential privacy techniques in federated learning: research on the conditions for the use of differential privacy, research on the design of differential privacy features-based algorithms, and research on the combination of game theory and distributed differential privacy.
6.1. Research on the Conditions for the Use of Differential Privacy
Most existing studies on differential privacy-preserving federated learning assume that the central server or the federated learning clients use differential privacy techniques from the beginning of model training, and they do not discuss the conditions under which these techniques should be used. Attacking a federated learning model is costly, so even a malicious individual will weigh costs against benefits before launching an attack. If the cost of an attack exceeds the benefit gained from it, a rational attacker will not attack the model, and there is no need to use differential privacy techniques under such conditions. Therefore, it is necessary to analyze the conditions under which an attacker initiates an attack from the perspective of the attacker’s utility, and thus determine the conditions under which differential privacy techniques should be used.
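The cost-benefit argument above can be written as a simple inequality. The following is only an illustrative sketch: the symbols B (benefit of a successful attack), C (cost of mounting it), p (probability that the attack succeeds), and U (the attacker's expected utility) are assumptions introduced here and are not part of the surveyed literature.

```latex
% Hypothetical attacker model: all symbols are illustrative assumptions.
U = pB - C, \qquad p \in [0, 1], \quad B \ge 0, \quad C \ge 0.

% A rational attacker launches the attack only when the expected utility is positive:
U > 0 \iff C < pB.

% Under this model, differential privacy would be needed only in the regime C < pB.
```

In such a model, the open research question is how to estimate p, B, and C for a concrete federated learning deployment.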
6.2. Research on the Design of Differential Privacy Features-Based Algorithms
The stochastic gradient descent algorithm is commonly used to solve large-scale differential privacy-preserving federated learning models. Scholars have made a series of optimizations to the convergence and convergence speed of the stochastic gradient descent algorithm. However, in some cases, the results of using the stochastic gradient descent algorithm to solve differential privacy-preserving federated learning models still fail to meet the requirements for industrial use. One possible reason is that the stochastic gradient descent algorithm is a general-purpose algorithm whose design does not fully consider the characteristics of differential privacy-preserving federated learning models. Therefore, it is necessary to develop algorithms with better convergence and higher accuracy based on the features of differential privacy-preserving federated learning models.
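For reference, the general-purpose baseline discussed above can be sketched as a single differentially private SGD step in the widely used clip-and-noise style: each per-example gradient is clipped to a norm bound, Gaussian noise is added to the sum, and the result is averaged. This is a minimal sketch under stated assumptions, not the method of any surveyed paper. The function name `dp_sgd_step` and the parameters `clip_norm` and `noise_multiplier` are hypothetical, and no privacy accounting is performed.

```python
import numpy as np


def dp_sgd_step(weights, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One clip-and-noise SGD step (illustrative sketch, no privacy accounting).

    per_example_grads has shape (batch_size, n_params).
    """
    batch_size = per_example_grads.shape[0]
    # L2 norm of each example's gradient, shape (batch_size,).
    norms = np.linalg.norm(per_example_grads, axis=1)
    # Shrink gradients whose norm exceeds clip_norm; leave the others unchanged.
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale[:, None]
    # Gaussian noise calibrated to the clipping bound, added to the summed gradient.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=weights.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / batch_size
    return weights - lr * noisy_mean


# Example usage with synthetic gradients (values are arbitrary).
rng = np.random.default_rng(0)
weights = np.zeros(3)
grads = rng.normal(size=(8, 3)) * 10.0
new_weights = dp_sgd_step(weights, grads, lr=0.1, clip_norm=1.0,
                          noise_multiplier=1.1, rng=rng)
```

The clipping bound and the noise scale are fixed across steps and parameters in this sketch. These are the kinds of model-agnostic choices that an algorithm tailored to differential privacy-preserving federated learning might adapt.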
6.3. Research on the Combination of Game Theory and Distributed Differential Privacy
Distributed differential privacy can improve the accuracy of federated learning models while protecting the security of federated learning clients’ data. However, distributed differential privacy requires secure aggregation techniques, which impose high communication costs on federated learning models. Therefore, federated learning models with distributed differential privacy usually include a client selection step. Existing client selection methods are mainly based on probability theory: the probability that a federated learning client is selected is constructed from the norm of the gradients that the client uploads. This characterization is too simple and does not sufficiently consider the contribution of each client to the federated learning model. A more reasonable approach is to apply cooperative game theory and mechanism design theory to model the game relationships between the central server and the federated learning clients, and among the clients themselves, so as to characterize each client’s contribution more accurately and select higher-quality clients to participate in the federated learning model. Therefore, it is essential to conduct research that combines game theory with distributed differential privacy.
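The contrast drawn above can be illustrated with a small sketch. The first function assigns selection probabilities proportional to gradient norms, as in the probability-based methods described. The second computes exact Shapley values, a standard contribution measure from cooperative game theory, used here as one possible instance of the game-theoretic direction rather than a method proposed in the surveyed work. The function names, the additive `utility` function, and the `quality` scores are hypothetical assumptions for illustration.

```python
import itertools
import math

import numpy as np


def norm_based_probabilities(grad_norms):
    """Selection probability proportional to each client's gradient norm."""
    norms = np.asarray(grad_norms, dtype=float)
    return norms / norms.sum()


def shapley_values(n_clients, utility):
    """Exact Shapley value of each client for a coalition utility function.

    utility maps a frozenset of client indices to a real number.
    The cost is exponential in n_clients, so this only suits small examples.
    """
    values = np.zeros(n_clients)
    n_fact = math.factorial(n_clients)
    for i in range(n_clients):
        others = [c for c in range(n_clients) if c != i]
        for size in range(n_clients):
            # Shapley weight for coalitions of this size that exclude client i.
            weight = (math.factorial(size)
                      * math.factorial(n_clients - size - 1) / n_fact)
            for coalition in itertools.combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of client i to coalition s.
                values[i] += weight * (utility(s | {i}) - utility(s))
    return values


# Hypothetical data: client 2 uploads large gradients but contributes little.
grad_norms = [4.0, 1.0, 5.0]
quality = [0.5, 0.4, 0.1]  # assumed per-client contribution to model utility


def utility(coalition):
    # Assumed additive utility; real utilities would come from model evaluation.
    return sum(quality[c] for c in coalition)


probs = norm_based_probabilities(grad_norms)
contrib = shapley_values(len(quality), utility)

# Sample two distinct clients under the norm-based rule.
rng = np.random.default_rng(0)
selected = rng.choice(len(grad_norms), size=2, replace=False, p=probs)
```

In this toy setting the norm-based rule favors client 2, while the Shapley values rank it last. Exact Shapley computation is exponential in the number of clients, so a practical scheme would need an approximation.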
7. Conclusion
Differential privacy techniques are a key design element of federated learning systems. In this work, we extensively survey state-of-the-art approaches and identify several future research directions. First, we introduce the workflow of federated learning and the theoretical foundations of differential privacy techniques. Then, we give an overview of three paradigms arising from the combination of differential privacy techniques and federated learning models, namely centralized differential privacy, local differential privacy, and distributed differential privacy. After this, we review studies on optimizing federated learning models oriented to differential privacy preservation. Finally, we review the applications of differential privacy-preserving federated learning models in various domains. In conclusion, differential privacy techniques play a crucial role in federated learning systems. We hope that this survey will encourage more researchers to devote themselves to this field.
Acknowledgements
The authors would like to thank the editors and anonymous reviewers for their constructive comments, which helped improve the study significantly. This work was supported by the National Natural Science Foundation of China [grant number 72201022].