Enabling Privacy Preservation and Decentralization for Attribute-Based Task Assignment in Crowdsourcing

Crowdsourcing allows people who are endowed with certain skills to accomplish special tasks in return for incentives. Although state-of-the-art crowdsourcing schemes guarantee low overhead and considerable quality, most of them expose task content and users' attribute information to a centralized server. Such servers are vulnerable to single points of failure, leakage of users' private information, and a lack of transparency. We therefore explore an alternative design for task assignment based on the emerging decentralized blockchain technology. While a public blockchain brings these advantages, moving to open operation requires additional techniques and design to preserve the privacy of users' information. To address this issue, we propose a secure task assignment scheme that enables task content preservation and anonymous attribute requirement checking. Specifically, by adopting cryptographic techniques, the proposed scheme enables a task requester to safely place his task on a transparent blockchain. Furthermore, the proposed scheme divides the attribute verification process into public pre-verification and requester verification, so that the requester only needs to check the identity of the worker instead of verifying the attributes one by one, thereby preserving the identity of the worker while significantly reducing the requester's computational burden. Additionally, the security analysis demonstrates that unrelated entities cannot learn the task content or identity information from the data uploaded by the requester and the worker. The performance evaluation shows the low computational overhead of our scheme.


Introduction
Crowdsourcing is a powerful method that has emerged in the landscape of problem solving: it outsources work originally done by a designated party to an unknown group of people in an open manner [1]. It enables tasks to be completed by specific professionals on demand, which significantly reduces costs and improves the quality of the solution. Owing to these advantages, many large organizations have successfully brought it to market, such as ImageNet [2], Amazon Mechanical Turk [3] and UBER [4]. These applications mainly cover areas where devices have poor or even no computing capacity, and they still require further improvement.
All participants in the crowdsourcing process can be divided into three roles: requester, worker and platform. Specifically, the party that publishes tasks is the requester, and the party that works on those tasks is the worker. The middleman between the requester and the worker is the platform, which is responsible for storing the tasks and maintaining the correct execution of the whole process. Many crowdsourcing applications share a similar structure: the requester submits the task content along with the reward to the platform, and then the workers accept the task and submit their solutions within a fixed time. After that, the requester confirms the quality of the solution and pays the pre-declared reward to the worker.
Although these crowdsourcing applications have achieved considerable success, some key challenges still need to be addressed. One critical aspect is the lack of a credible guarantee on the quality of the work. Workers who have accepted the work may not have the corresponding skills to provide valuable answers [5] [6]. Statistical aggregation algorithms can tolerate some low-quality answers [7] [8] [9], but they still waste resources. A straightforward approach is to customize credentials based on the background of each worker and make sure that workers only accept tasks within their capabilities [10] [11].
Currently, these credentials are usually distributed by various agencies. The crowdsourcing system will ask workers to upload these credentials in order to achieve capacity limitations during the task assignment process.
Another aspect is data confidentiality. Traditional centralized platforms typically obtain task content in plaintext, so a compromise of the platform results in the disclosure of users' information. Most existing solutions therefore assume that the platform behaves honestly during the protocol, which is impractical [12]. Various examples have shown the potential threats of platform compromises, such as UBER, which has been affected by unreliable order issues and users' data leakage [13] [14]. To address this, an alternative design needs to be explored to achieve secure task assignment based on a more open and distributed infrastructure.
The blockchain is a decentralized and intelligent infrastructure [15] [16]. Compared to traditional distributed solutions, blockchain enables the masses to join as participants, making it an ideal starting point. In this paper, we adopt the design of the consortium chain because it has the best performance. In the blockchain, data is initially verified by the agencies, then encapsulated into a block and appended to the existing chain. The remaining network participants perform the verification. Once a participant has verified a chain, that participant can recognize any subsequent change to it. This feature has spawned countless fascinating decentralized applications [17] [18] [19]. Implementing task assignment on the blockchain alleviates concerns about single points of failure. However, the open setting of the blockchain may pose a more serious threat to data confidentiality.
To solve the security issues in the task assignment process, this paper uses skill credentials to restrict access to task content, which is achieved through Attribute-based Encryption (ABE) [20]. Specifically, each skill corresponds to exactly one attribute. Depending on the attributes owned by the worker, the authority distributes only the credential keys of those attributes to the worker.
By applying credentials during the encryption process, the requester can ensure that his task content is only visible to those who fully satisfy the task's access control settings. Unfortunately, this method only preserves the privacy of the requester and still requires disclosure of the worker's identity, because a task can accept multiple solutions: without restrictions, workers would be motivated to submit their solutions multiple times so as to obtain more rewards than they have earned. Traceable Attribute-Based Signature [21] allows a signer who owns a set of attributes to sign a message and convince the recipient of the signature that the signer owns certain attributes. It introduces a special tracing authority that is capable of revealing the identity of the signer, but this also brings back the weakness of centralization, so the technique cannot be adopted directly.
Correspondingly, the requester should be responsible for his own task. The requester is given the ability to reveal and verify the identity of the participants in his task. However, the requester does not need the true identity of the worker. Therefore, the identity shown to the requester is replaced by an anonymous account approved by the authority. To fulfill these verification requirements, we design a novel scheme based on the ABE scheme proposed by Lewko and Waters [22] and extend its functionality accordingly. We propose a credential that is constructed by binding the anonymous account to the worker's attributes. Only the owner of the credential can use it for decryption. The worker can then hide his real identity in the credential and form a proof. Anyone can check the validity of the proof and confirm that the prover satisfies certain attributes, but only the person designated by the prover (the requester of the corresponding task) can reveal the identity from the proof.
Our contributions: In this paper, our main contributions are as follows.
1) A secure attribute-based task assignment scheme is proposed, which can preserve information security on a transparent blockchain. Moreover, everyone can verify the correctness of the process without revealing the identity of the worker.
2) We preserve the privacy of the worker with a random anonymous account, so that workers can change their identity at any time, which prevents the requester from discovering associations among participants in different tasks.
3) We design the attribute verification protocol in two parts: public pre-verification and requester verification. Most of the verification work is performed by the blockchain, while only some steps are performed privately by the requester, who knows the extra information. This significantly reduces the computation cost on the requester's side. Moreover, the requester can prove a worker's misbehavior by exposing the additional information he knows.
4) We implemented a proof-of-concept prototype, and the experimental results show the validity and feasibility of our proposed scheme.
The rest of this paper is organized as follows. Section 2 reviews the related work on task assignment for crowdsourcing systems. We present models and goals in Section 3. Next, the details of our scheme are presented in Section 4. The privacy discussion and performance evaluation are presented in Sections 5 and 6 respectively, followed by a conclusion in Section 7.

Attribute-Based Encryption
ABE was first proposed by Sahai and Waters [23]. In an ABE system, each user has a unique ID and a set of attributes. In general, ABE can be divided into two categories: Key-Policy Attribute-Based Encryption (KP-ABE) [24] and Ciphertext-Policy Attribute-Based Encryption (CP-ABE) [25]. In KP-ABE, the ciphertext is associated with a set of attributes, and the user's private key is associated with an access structure. The user can decrypt the ciphertext if and only if the attributes in the ciphertext satisfy the access structure of the user's private key. However, the encryptor does not have complete control over the encryption policy in KP-ABE.
In CP-ABE, ciphertext is created with an access structure, and the user's private key is generated based on the user's attributes. The user can decrypt the ciphertext if and only if the attribute of the user's private key satisfies the ciphertext access policy. In doing so, the encryptor is enabled to determine the access control of the ciphertext.
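The CP-ABE access decision described above can be sketched as a plain boolean check. This is only an illustration of when decryption is permitted, with no cryptography involved; the policy encoding, the `satisfies` helper and the attribute names are assumptions made for this example and do not come from any ABE library.

```python
from typing import Union

# A policy is either an attribute name or a tuple (gate, [sub-policies]),
# where gate is "AND" or "OR". This encoding is hypothetical.
Policy = Union[str, tuple]


def satisfies(attributes: set, policy: Policy) -> bool:
    """Return True if the attribute set satisfies the boolean access policy."""
    if isinstance(policy, str):
        return policy in attributes
    gate, children = policy
    results = [satisfies(attributes, child) for child in children]
    if gate == "AND":
        return all(results)
    if gate == "OR":
        return any(results)
    raise ValueError(f"unknown gate: {gate}")


# Hypothetical ciphertext policy chosen by the encryptor:
# translator AND (medical OR legal)
task_policy = ("AND", ["translator", ("OR", ["medical", "legal"])])

can_decrypt = satisfies({"translator", "legal"}, task_policy)
```

In a real CP-ABE scheme this decision is enforced by the mathematics of the ciphertext and keys rather than by an explicit check, which is why the encryptor, not a server, controls access.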
These schemes use a centralized approach with only one key distribution center (KDC), so they inherit the weaknesses of centralization, such as a single point of failure. Multi-authority ABE was proposed to address this problem. In multi-authority ABE, the entire attribute set is divided into N disjoint sets managed by N authorities. Under this setting, each authority only knows part of the user's attributes, and the user is required to obtain private keys from all KDCs. Based on this model, many attribute-based encryption schemes with multiple authorities have been proposed, but they still rely on a semi-honest central authority [26] [27] [28], or cannot resist collusion attacks by users [29]. The work proposed by Jung et al. [20] can tolerate the compromise of up to $N-2$ authorities and does not require a trusted server. However, in their work it is difficult to modify the number of authorities after setup. On the other hand, the work by Lewko et al. [22] cannot prevent the authority from learning the user's key during the key generation phase.

Centralized Crowdsourcing Systems
Many crowdsourcing systems are built in a centralized manner [3] [30] [31]. In order to understand the capabilities of workers and the tasks they are interested in, the platform requires the worker to complete their profile before joining.
Correspondingly, the platform needs to learn the plain text of the task content so that the task content can be sent to the worker. During this process, requesters and workers submit their private information in exchange for platform services.
This type of information is known and stored by a single party and is therefore vulnerable to a variety of attacks and privacy leakage. In a system with limited task content, such as MTurk [3], workers only need to complete some human intelligence tasks. A worker only needs to pass a non-robot test to become a qualified worker. This convenience allows workers to change their accounts at low cost.
Dynamo [32] specifically designed a wrapper for it, using pseudo IDs to provide unlinkability, but it is difficult to extend to multi-attribute task content and greatly limits its scope of application.

Distributed Crowdsourcing Systems
In spatial crowdsourcing (SC), the geographic location of workers and requesters is considered private information and should not be known to the platform or to unrelated people. Liu et al. [33] propose a model that divides the server into an SC server and a crypto service provider (CSP). The users encrypt their locations using the public key provided by the CSP and hand the ciphertexts over to the SC server. The SC server then performs calculations on the ciphertexts and passes the results to the CSP.
The CSP then decrypts and publishes the results, but only the eligible workers can restore the location. This model requires that both the SC server and the CSP are semi-honest and does not consider the case of collusion, so the degree of decentralization is very limited. In addition, the requester's geographic location is still known to the SC server, which leaks the requester's privacy.

Decentralized Crowdsourcing Systems
Li et al. [34] use a reputation system to regulate workers' behaviors. Although workers use pseudonyms as their identity, the linkability between different tasks exposes the interests of workers. Moreover, a worker who changes identity loses his existing reputation, which is very costly for workers. Lu et al. [35] propose a private and anonymous crowdsourcing system based on common-prefix-linkable anonymous authentication. Each task has a unique prefix. Unless a worker proves his identity twice under the same prefix, he stays anonymous and unlinkable across tasks.
However, these systems still treat the task content as open-access data, which leaks the privacy of the requester. In addition, in order to make users identifiable, these systems use a registration authority to identify users, which makes the degree of decentralization of the system questionable.

System Model
Our system model is shown in Figure 1. It contains four entities as follows:
Authority: The authority has the right to endorse certain abilities in specific areas and provide qualified workers with keys that correspond to their anonymous accounts and capabilities. Note that each ability is treated as a single attribute. In addition, the authorities act as proposers of blockchain blocks, that is, they are responsible for packing the information sent to the contract into blocks and appending them to the existing chain. Other entities can obtain the chain and verify it.
Smart Contract: The contract receives and stores the task content ciphertext posted by the requester and the attribute proofs submitted by the worker. It validates the legitimacy of each proof to detect misconduct, thereby ensuring a fair judgment in a dispute between the requester and the worker.
Requester: The requester encrypts the task content according to the attribute requirements of his task and submits the task ciphertext to the contract. When a worker accepts the task and submits the attribute proof, the requester needs to verify the legality of the worker's anonymous identity and requires the contract to make a judgment when the verification fails.
Worker: The worker creates an anonymous account in advance and obtains the credential keys from the authorities based on his or her attributes. Using these keys, he can decrypt the task ciphertext which satisfies the required policy and submit the appropriate attribute proof when deciding to accept the task.

Security Model
The authorities are semi-honest which means they follow our proposed scheme in general. We assume authorities are interested in which worker is using the key they distributed to participate in the work, but they will not collude with users or other authorities. Note that our system inherits the weakness of the blockchain. Although the authorities are semi-honest, the blockchain can resist 51% attacks. However, such attacks against blockchain infrastructure are considered out of scope.
The smart contract runs on the blockchain, which guarantees its availability and integrity but not its confidentiality. Other entities can directly read its data through the blockchain, but cannot tamper with it.
The requester is also assumed to be semi-honest. His request can only be assigned to a valid worker when the task assignment process is properly executed, so the requester will follow the scheme in general. In particular, we assume that he is interested in the identity information of the workers involved in his task.
Workers are untrusted since they are random users. They may collude with other workers to accept a task that they are not allowed to accept, or attempt to accept a task more than once.
In our scenario, we define the security of worker and requester information as follows:
Task content security: The task content ciphertext should only be decryptable by workers who fully satisfy the task attribute requirements.
Worker identity security: When the worker decides to accept a task, he uploads a proof to the contract. The security of this proof can be divided into three main cases:
1) Given a proof, no entity can restore the worker's global identity from the proof.
2) Given two workers who have accepted the attribute key distributed by the authority, and a proof constructed by one of them, the authority cannot distinguish which worker constructed the proof.
3) Given two anonymous accounts and a proof constructed by one of them, other workers who can decrypt the task cannot distinguish which account constructed the proof.

Epoch
Tasks generally involve time-related restrictions such as deadlines. Therefore, workers should check the consistency of time with the blockchain when accepting tasks. We introduce the epoch into the process of task acceptance. Each epoch has a stamp, and the worker's request is legal only when the worker uses the stamp of the current epoch in his message. Figure 2 shows an example in which every three blocks form an epoch. The stamp of an epoch is the hash of the last block of the previous epoch. Since the hash value of a block has only a negligible probability of collision, if the worker's message is not packed in a block of a certain epoch, the worker can be sure that his message has expired. This prevents messages from being packed into blocks after a long time.
In the proposed scenario, the consortium chain does not need to propose the block through the proof-of-work, so the block time is stable. Therefore, the duration of each epoch does not have a large deviation.
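The epoch rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: SHA-256 stands in for the block hash, blocks are plain byte strings, heights are 0-based, and the names `EPOCH_LENGTH`, `epoch_stamp` and `is_message_valid` are hypothetical rather than part of the proposed system.

```python
import hashlib

EPOCH_LENGTH = 3  # blocks per epoch, matching the three-block example


def block_hash(block: bytes) -> str:
    """Hash of a block; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(block).hexdigest()


def epoch_of(height: int) -> int:
    """Epoch index of the block at the given 0-based height."""
    return height // EPOCH_LENGTH


def epoch_stamp(chain: list, epoch: int) -> str:
    """Stamp of an epoch: the hash of the last block of the previous epoch."""
    if epoch < 1:
        raise ValueError("epoch 0 has no previous epoch to derive a stamp from")
    last_of_previous = chain[epoch * EPOCH_LENGTH - 1]
    return block_hash(last_of_previous)


def is_message_valid(chain: list, message_stamp: str, packing_height: int) -> bool:
    """A message is legal only if it carries the stamp of the epoch
    in which it is packed into a block."""
    epoch = epoch_of(packing_height)
    return epoch >= 1 and message_stamp == epoch_stamp(chain, epoch)


# Toy chain of nine blocks, i.e. epochs 0, 1 and 2.
chain = [f"block-{i}".encode() for i in range(9)]
stamp = epoch_stamp(chain, 1)                    # hash of block 2
fresh = is_message_valid(chain, stamp, 4)        # packed within epoch 1
expired = not is_message_valid(chain, stamp, 6)  # packed in epoch 2
```

A message carrying the stamp of epoch 1 is accepted at heights 3 to 5 and rejected afterwards, which mirrors the expiry behaviour described in the text.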

Proposed Secure Task Assignment Scheme
In this section, the proposed secure attribute-based task assignment scheme will be described in detail. To give a better understanding, the main notations will be listed in Table 1.

Scheme Overview
As shown in Figure 3, the proposed scheme consists of five steps. In step 1, the

Scheme Construction
The proposed scheme is based on ABE with multiple authorities proposed by Lewko and Waters [22]. In the task publication and task decryption phases, their ABE scheme serves as the encryption method that preserves the security of the task content. The concrete construction is as follows.
System Initialization: In the global settings, select a prime $p$, groups $\mathbb{G}_1$ and $\mathbb{G}_T$ of order $p$, a map [...] The public key of attribute $i$ will be posted on the contract as public knowledge.
Worker registration: Workers can create new anonymous accounts at any time and request the appropriate credential keys from the authority. Workers are not allowed to have multiple qualified anonymous accounts at the same time. When a worker applies for a new anonymous account through his global identity $W_{id}$, the authority, which knows the association between the worker's global identity and the anonymous account, can announce that the worker's previous anonymous account has been revoked. Because the authorities are semi-honest, the revocation of anonymous accounts can be done reliably. Authorities are assumed to jointly maintain and disclose a set of revoked accounts $Rset_{account}$. To create an anonymous account, the worker picks a random number [...]
Task publication: According to the attribute requirements of the task, the requester constructs the linear secret-sharing scheme (LSSS) matrix $R$ with $\rho$ mapping its rows to attributes, and then encrypts the task content as follows:
1) Select two random values $s, k \in \mathbb{Z}_p$ and generate an asymmetric key pair, where $PK_{requester}$ is the public key. The task description is combined with $k$ as the message $M$.
2) Choose a random vector [...]
[...] The contract checks the correctness of [...] by Equation (6). Finally, the worker can determine the correctness of his decryption through $k$ in the task content $M$ and $C_1$ in $CT_{task}$.
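To make the LSSS encoding used in task publication more concrete, the following sketch shares a secret over a toy policy and reconstructs it. It is a generic illustration of linear secret sharing, not the paper's construction: the prime, the matrix for the hypothetical policy "A AND B", the share symbols $\lambda_x$ and the reconstruction coefficients $c_x$ are all assumptions introduced here, and no pairing or group operations are involved.

```python
import random

P = 2**61 - 1  # a small prime standing in for the group order p (assumption)

# LSSS matrix for the hypothetical policy "A AND B"; RHO maps rows to attributes.
R = [[1, 1],
     [0, -1]]
RHO = ["A", "B"]


def make_shares(matrix, secret, p, rng):
    """Compute lambda_x = R_x . v mod p for v = (secret, r_2, ..., r_n)."""
    width = len(matrix[0])
    v = [secret] + [rng.randrange(p) for _ in range(width - 1)]
    return [sum(a * b for a, b in zip(row, v)) % p for row in matrix]


def reconstruct(shares, coefficients, p):
    """Recover the secret as sum_x c_x * lambda_x mod p."""
    return sum(c * lam for c, lam in zip(coefficients, shares)) % p


rng = random.Random(0)
secret = 123456789
shares = make_shares(R, secret, P, rng)

# For "A AND B", c = (1, 1) gives c_A * R_A + c_B * R_B = (1, 0),
# so holders of both rows can recover the secret.
recovered = reconstruct(shares, [1, 1], P)
```

A holder of only one row sees a share masked by the random value, which is the property that keeps a worker without all required attributes from recovering the secret exponent.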
Proof publication: After learning the content of the task, if the worker decides to accept the task, he needs to construct a proof that his attributes meet the requirements and that he has not accepted this task before. This proof is published on the contract for verification by other entities.
Proving that the worker can meet the requirements is equivalent to proving that the worker has an unrevoked anonymous account with the credential keys corresponding to the $R_x$ used in the decryption, so the worker constructs the proof [...]
Proof verification: The verification of the proof is divided into two parts: the contract verifies the correctness of all parameters except [...] If Equation (11) holds, it demonstrates that the worker has knowledge of $t_x$, so $P_{x,2}$ is not obtained by the worker based on any $APK$, and it does not contain any information about [...]. If the public verification succeeds, the next part is verified by the requester. Since the requester owns the private key corresponding to $PK_{requester}$, he can recover the plaintext of $P_{x,7}$ and $P_{account}$. First, the requester checks that the anonymous account claimed in $P_{account}$ is not in the revoked account set $Rset_{account}$ and does not equal any other account that has accepted this task; then he checks the following equations: [...] Equation (14) and Equation (15) prove the correct construction of $P_{x,5}$ and its relevance to $P_{account}$, which means that the attributes in the above public verification do belong to this anonymous account. If the above check fails, the requester can reveal the plaintext of $P_{x,7}$ and $P_{account}$, and the contract can then repeat this process to determine which entity is misbehaving. Otherwise, the requester accepts the worker's participation and the task assignment process ends.
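The first, non-cryptographic part of the requester's verification (the account is not revoked and has not already accepted this task) can be sketched as follows. This is a minimal illustration only: it omits the pairing checks of Equations (14) and (15), and the account strings, the two sets and the helper names are hypothetical.

```python
def check_account(account: str, revoked: set, accepted: set) -> bool:
    """First requester-side check: the account is unrevoked and new to this task."""
    return account not in revoked and account not in accepted


# Hypothetical state for one task.
revoked_accounts = {"acct-old"}   # stands in for the published revoked account set
accepted_accounts = {"acct-7"}    # accounts that have already accepted this task


def try_accept(account: str) -> bool:
    """Record the account as a participant if it passes the check."""
    if check_account(account, revoked_accounts, accepted_accounts):
        accepted_accounts.add(account)
        return True
    return False


first_attempt = try_accept("acct-9")     # new, unrevoked account
second_attempt = try_accept("acct-9")    # same account tries again
revoked_attempt = try_accept("acct-old")  # revoked account
```

Rejecting the second attempt is what prevents a worker from collecting the reward for one task several times under the same anonymous account.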

Security Analysis
In this section, we analyze how our protocol preserves the security of worker and requester information.

Task Content Security
We first argue that the task content can only be decrypted by workers who fully satisfy the access policy. Since our scheme is based on ABE with multiple authorities [22], this part of our scheme has the same security level. Consider first the case where the worker cannot satisfy the task's access policy, that is, for any combination of attributes that satisfies the access policy, the worker does not own the credential keys corresponding to all attributes in the combination. In this case, the worker cannot find any subset of $R_x$ that can satisfy [...], so the probability of computing $e(g,g)^s$ is negligible.
Next, we argue that multiple workers cannot collude to access task content that they cannot access individually. Suppose that there is a group of workers such that, for any combination of attributes that satisfies the access policy, no single worker has the credential keys corresponding to all attributes in the combination, but for at least one combination the credential keys owned by multiple workers together cover all the attributes in the combination. However, as shown in Equation (7) [...] is used in the construction of the proof. We analyze an extreme case where an entity knows all the information of a qualified worker except the secret key of the anonymous account $USK_z$. This scenario is reasonable, because the key is the only secret that the worker will not share with others. In this case, the entity can perform the first three steps of the task decryption process, but in step four, the entity cannot calculate the task content $M$ according to Equation (8) due to the lack of $USK_z$. So the entity cannot decrypt the task content.

Worker Identity Security
The identity information of a worker consists of the anonymous account $UPK_z$ and the global identity $W_{id}$. We therefore discuss whether other entities can obtain information about these two parameters from the worker's proof. As for $W_{id}$, it exists in the form of [...] in the proof, where $g^d$ is a one-time pad known only to the worker himself, so the worker's global identity cannot be obtained by any other entity from the proof.
Next, we argue that the authorities and other workers cannot obtain information about the anonymous account from the worker's proof. The anonymous account information of the worker only exists in [...] In $P_{x,5}$, based on the DDHI problem, [...] In addition, $P_{x,5}$ only has a meaningful pairing operation with $C_{x,4}$, but the result reduces to the same case as $P_{x,4}$ described above.

Performance Evaluation
In this section, we used computational cost as a metric to analyze the performance of our scheme. We used the JPBC library [36] Ver. 2.0.0 as the implementation of cryptographic operations. The implementation used a 160-bit elliptic curve group on the curve $y^2 = x^3 + x$ over a 512-bit finite field. All processes were evaluated using a single thread of an AMD Ryzen CPU.

Requester's Computational Cost
In our scheme, the requester's calculation was mainly divided into two parts: task publication and proof verification (partial). In Figure 4(a), we illustrated the computational overhead of the requester in a task with only one worker. We used the calculation time as the y-axis and the number of attributes included in the task as the x-axis. Note that the number of attributes that appear in the proof is related to the access policy. Here we took the worst case that requires all attributes. It can be seen from Figure 4(a) that the task encryption time and the proof verification time increase linearly according to the number of attributes.
Although single verification time is within a reasonable range, this may become a major burden on the requester as the number of workers increases.

Worker's Computational Cost
The computational overhead of workers was also divided into two parts: task decryption and proof publication. Figure 4(b) shows the computational overhead of the worker in these two processes; the y-axis represents the computation time, and the x-axis represents the number of attributes used in the process. As shown in Figure 4(b), the cost of both task decryption and proof publication increases linearly with the number of attributes. Task decryption is computationally cheap, which suits the fact that a worker needs to decrypt a large number of tasks when selecting one. In contrast, the cost of proof publication is high, but it is still reasonable compared to the time required for workers to complete their tasks.

Authority's Computational Cost
In Figure 4(c), we compared the proportion of computing overhead between the blockchain and the requester in the verification work; the y-axis is the calculation time and the x-axis is the number of attributes used in the proof. It can be seen that the overhead of the verification work increases linearly with the number of attributes, and the ratio between the two remains constant at around 5:3. That is to say, although we have shifted more than half of the burden to the blockchain verifier and greatly reduced the computing time of the requester, the ratio is not large enough for the requester to easily deal with its work.
Next, Figure 4(d) shows the overhead of the authority in distributing a new key to the worker; the y-axis is the calculation time and the x-axis is the number of attributes the worker has. It can be seen that although the calculation time increases linearly with the number of attributes, the computation is quite fast. This means that the worker can change the anonymous account after each task is completed without incurring much cost for the authority.
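As a quick check of the 5:3 ratio reported above, and assuming the first figure refers to the blockchain verifier (which matches the statement that more than half of the burden is shifted to it), the shares of the total verification time work out as:

```latex
\[
\frac{5}{5+3} = 62.5\% \ \text{(blockchain)}, \qquad
\frac{3}{5+3} = 37.5\% \ \text{(requester)}.
\]
```

Under this reading the requester still carries more than a third of the verification time for every worker.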

Effectiveness
In Figure 4(e), we studied the performance impact of introducing anonymous accounts into ABE. Since our modification only affects the decryption process, we compared the computational overhead of decryption with that of the original scheme.
The y-axis is the calculation time and the x-axis is the number of attributes used in the decryption. It can be seen that our scheme introduces only a constant cost, which is negligible relative to the overall decryption overhead. Finally, Figure 4(f) shows the effect of the size of the task content on the calculation overhead; the y-axis is the calculation time and the x-axis is the size of the task content. The number of attributes is set to 16. As shown in Figure 4(f), when the size of the task content is less than 2 m, the impact of the size on the calculation time is less than 10%, so it is not a key factor affecting the overhead. Obviously, this size is too small for files such as pictures and videos. Note, however, that the contract does not process such data; it is sufficient to store only a description of how to access the actual data. In this case, 2 m is more than enough.

Conclusion
In this paper, we proposed a secure attribute-based task assignment scheme which can preserve information security on a transparent blockchain. First, the proposed scheme preserves the privacy of requesters and workers through anonymous accounts and attribute-based encryption. Second, the proposed scheme is compatible with blockchain, so it avoids the weaknesses of centralization and provides transparency. In addition, by dividing the verification process into public pre-verification and requester verification, the computing burden of the requester can be greatly reduced. Finally, we analyzed the privacy and performance of the proposed protocol to show that it achieves satisfactory security and efficiency. In future work, we will consider the attribute value as part of the requester's privacy for stronger security and further improve the performance of the task assignment scheme.