The Improvement Path of the Legal Regulation of Public Data Openness in China
1. Introduction
Digital technologies are being widely applied to the management and services of the Chinese Government. As the process of digital China continues to develop and the construction of a digital government advances, more and more government departments and public service departments are digitally performing their functions, resulting in an increasing amount of data being generated and collected.
In 2021, China’s 14th Five-Year Plan explicitly encouraged the opening of public data to third parties for deeper utilization. In 2022, the State Council, in its Opinions on Building a Data Base System and Better Utilizing the Role of Data Elements (also known as the “Twenty Articles on Data”), further proposed strengthening the aggregation, sharing, and open development of public data, and encouraged public data to be provided to society in the form of products and services on the premise of protecting personal privacy and ensuring public safety. For public data that do not carry personal information and do not affect public security, the scope of supply and use should be expanded in accordance with their purpose. At the same time, enterprises of all kinds are allowed and encouraged to use existing public data to provide public-welfare services, provided that they comply with relevant laws and regulations. In 2023, the State Council issued the “Overall Layout Plan for the Construction of Digital China”, which again pointed out the need to promote the aggregation and utilization of public data and to build national data resource bases for important fields such as public health, science and technology, and education.
Opening public data has become one of the most important ways for China to utilize public data, activate the value of data, and empower the development of the physical economy. What public data includes and how it should be opened have become new issues that need attention. However, China has not yet accurately defined public data (Shen, 2023), lacks unified rules for opening public data (Zhang, Xiao, & Ning, 2023), and its public data opening platforms remain flawed (Yan, 2024). To address these problems, this paper reconstructs the concept of public data, reviews the existing rules and platforms for opening public data, and puts forward suggestions for improvement.
2. The Definition of Public Data
Clarifying the scope of public data is a prerequisite for opening and utilizing public data. Only by accurately defining the connotation and extension of public data can a solid foundation be laid for the formulation of legal standards for the opening up of public data, ensuring that the standards point to a clear and effective direction, avoiding excessive intervention by public power in the flow of data, preventing increased burdens on private subjects, and, at the same time, promoting the orderly flow of data in accordance with the law, giving full play to the due value of data, and spurring the development of the economy and society.
2.1. Current Status of the Definition of Public Data in China
At present, the existing legal provisions in China do not specify the scope of public data, and the academic community has not yet agreed on a unified definition. Public data is often confused with government data and government affairs data. This conceptual ambiguity affects the quantity and quality of the public data that are collected and acquired. It also makes it impossible to determine the scope of application of the law on the opening of public data, which greatly reduces the law’s regulatory effect. Finally, it impedes the subsequent sharing and utilization of public data, leading to infringement, uneven distribution of benefits, improper use and other problems (Huang & Lai, 2018).
2.1.1. Definition of Public Data in Chinese Policy and Law
China’s policy documents, laws and regulations do not distinguish between public data, government data, and government affairs data. In 2020, the State Council’s “Opinions on Building a Better Institutional Mechanism for the Market-based Allocation of Elements of Production,” in the section on promoting the openness and sharing of government data, called for establishing systems and norms to promote the openness of public data such as enterprise registration, transportation and meteorological data. The Data Security Law promulgated in 2021 also does not directly define the scope of public data. However, in Chapter V, “Government Data Security and Openness”, it defines government data as data collected and used by state organs and organizations authorized by laws and regulations to manage public affairs in order to perform their statutory duties.
Standards for public data are also not harmonized in local legislation. According to the Chengdu Public Data Management Application, public data is equivalent to government data. The Regulations on Public Data in Zhejiang Province expanded the scope of subjects of public data from organizations authorized by laws to organizations authorized by laws and regulations. Relevant regulations in Beijing and Guangdong Province categorize data generated from the provision of public services as public data. Under the Shenzhen Special Economic Zone Data Regulations, the data generated by public service institutions and enterprises in the process of performing public services are also included in the scope of public data.
2.1.2. Academic Debates on the Concept of Public Data
Some scholars consider public data to include government data, government affairs data, etc. (Ren, 2023). Others argue that government data should refer only to data collected by administrative organs and their internal agencies, while government affairs data are data collected by organizations authorized to exercise administrative functions. On this view, information collected, organized or maintained by public utilities, data created by State-owned and private enterprises entrusted by the Government with public financial support, and data in the hands of those enterprises that are relevant to the government and of significant public interest are also public data (Zheng, 2018).
However, some scholars strictly limit the concept of public data to data formed in the process of performing management duties and public services by institutions exercising public power, or arising from public authority. They raise concerns about the expansion and generalization of the concept: expanding it blurs the boundary between the public data openness system and the data element circulation system, which not only contradicts the different underlying logics of the two systems, but also has a huge impact on the data element market system (Wang & Wang, 2023).
2.2. Definition of Public Data by Other Countries and International Organizations
In the OPEN Government Data Act (2019), the U.S. government defines public data as data assets held by the federal government, including data that can be or has been released to the public in an open format and can be found by searching Data.gov, as well as data that is in the global public domain or, if necessary, released under an open license. Consistent with its commitment to the free flow of data as a means of leveraging its value, the U.S. classifies and regulates public data in order to make better use of it. It also seeks to facilitate collaboration between the government and nongovernmental organizations (NGOs), nonprofit organizations, citizens, schools, and private and state-owned enterprises to explore opportunities to co-develop data products based on public data and to drive innovation in both the public and private sectors in accordance with laws and regulations.
The EU’s Directive (EU) 2019/1024 of the European Parliament and of the Council of 20 June 2019 on open data and the re-use of public sector information establishes a definition of public data that includes data obtained by public-sector bodies and public undertakings, as well as publicly funded research data. In addition, the EU has established a new concept of “high-value datasets” to refer to dense data that are inclusive and of clear social, economic or environmental value.
The United Nations Survey on E-Government, published in 2020, suggests that public data includes all data available in the public domain, such as data created by governments, academia (e.g., scientific data), civil society and the private sector. Government data is one of the subsets of public data (United Nations, 2020).
The Open Data Charter agrees that a broader definition of public data should be adopted, specifically including data held by regional, local and city governments, international government agencies and public services departments; data created for government by external agencies; and data held by external agencies that is relevant to government programs and services and is of significant public interest (Open Data Charter, 2015).
2.3. Re-Conceptualization of Public Data
The same legal act may have different meanings in different areas of law, but this does not affect the act itself. Whether or not the public value of public data is treated separately in administrative law and civil law, the opening, sharing, circulation and utilization of public data will not be affected. Therefore, from a utilitarian point of view, it makes little practical sense to strictly differentiate between the openness system and the circulation system of public data, or to restrict the scope of interpretation of public data.
Therefore, when determining the connotation and extension of public data, it is appropriate to choose a broad definition, with the goal of promoting data opening and sharing, including all types of data through as clear a form of expression and as broad a coverage as possible, ensuring that there is a sufficient legal basis for data opening and sharing and a sound protection system, so as to reduce the obstacles to the flow of public data. In addition, when formulating the corresponding laws and regulations, care should be taken to ensure the accuracy of the formulation of the provisions, to avoid the cross-use of public data with concepts such as “government data” and “government affairs data”, which in fact cover less than public data.
Defining the concept of public data should be based on a comprehensive judgement of the public nature of the data acquisition subject and source scenario, as well as the public interest of the specific content of the data. Therefore, when setting the boundaries of public data, we can start from the subject element, the behavioral element and the content element. In terms of the subject element, government departments and organizations authorized by laws and regulations, institutions providing public services, and state-owned and private enterprises that receive financial support from the government can all be holders of public data. Enterprises that are not entrusted or subsidized by the government but operate in industries or fields of public interest, such as large Internet platform enterprises, may also become holders of public data. However, this needs to be strictly limited to cases where failure to categorize their data as public data would have a serious adverse impact on citizens and society (Ma, 2024); otherwise, it would place an unnecessary public law burden on private subjects.
With regard to the behavioral element, public data shall be generated, collected and acquired by the above-mentioned institutions in the performance of their statutory functions or the provision of public services to society. Administrative organs and institutions rely on the authorization of the public to perform their functions and provide public services in accordance with the law, thus conferring natural public attributes on the data obtained in the process. At the same time, the data obtained when receiving government funding support for research also has public attributes because the funding comes from the public, and also needs to be included in the boundaries of public data. It can be argued that citizens, having paid for it in advance, certainly have the right to use it free of charge (Xing, 2021).
In terms of content elements, public data should be of public interest, and public interest is the basis for the establishment of the concept of public data, which is the fundamental characteristic that distinguishes it from other types of data such as personal data and enterprise data. In other words, the attribute judgment of public data depends on whether its application scenario and goal have public interest (Smichowski, 2019). On the one hand, the collection and acquisition of public data mostly take place in the fields of education, medical care, communication, water, electricity and gas, public transportation, etc., which are closely related to the public’s daily life, involving the interests of the vast majority of people, and in special cases, involving the broader interests of national security. On the other hand, even if data are not collected for public interest purposes, they can be transformed into public data if they are utilized in the public interest for the purpose of public decision-making, social innovation and addressing social challenges.
The mobility of data means that public data, enterprise data and personal data are constantly flowing among different subjects. Platform companies collect large amounts of data closely related to public interests in the course of their operations, while the wide application of the Internet of Things and wearable devices also makes data harder to acquire and control. In these circumstances, the boundary between public data and enterprise data or personal data is becoming increasingly blurred (Xia, 2024). Only by correctly recognizing the scope of public data can we prepare for the subsequent opening and utilization of public data.
3. Design of Rules for Public Data Openness
Opening and sharing public data, expanding data utilization scenarios, and innovating data products and services are not only integral to optimizing the national governance system and constructing a digital government under the rule of law, but also key initiatives to activate the potential of public data, cultivate the data element market, and promote the development of the digital economy.
3.1. Current Status of Public Data Openness Rules in China
China has not yet formulated unified rules for open public data, but has adopted the practice of local legislation first, whereby local governments formulate norms according to their specific conditions and realistic needs to manage public data opening activities within their own jurisdictions.
In 2019, Beijing and Shanghai were the first to introduce provisional measures for open public data. Subsequently, Guizhou, Shanxi, Zhejiang and other provinces have also promulgated the basic rules for opening public data in their provinces. With the increase in practical experience in public data opening, some local governments have further introduced corresponding implementation rules to continuously improve the provisions for public data opening.
An analysis of the regulations of various provinces and cities shows that China manages the public data to be opened mainly through government-compiled catalogs and openness plans. According to the degree of confidentiality of their content, public data are categorized into non-open data, conditionally open data and unconditionally open data. Public data that involve personal information, commercial secrets or national security, or that laws and regulations stipulate cannot be opened, are categorized as non-open data. Public data that place high requirements on data security and processing capacity, are highly time-sensitive, or need continuous access are categorized as conditionally open data. All other public data are unconditionally open data. At the same time, non-open data that have been cleansed and declassified, or for which the right holders have given consent, can be transformed into conditionally open data or unconditionally open data.
However, China’s current rules on the opening of public data are formulated by local governments, and this legislation ranks low in the legal hierarchy. The design of the rules also lacks effective coordination, supervision and effectiveness evaluation mechanisms. In addition, the opening of public data suffers from uneven development and a lack of sustainability. This can lead to unstable expectations of public data openness among public data users and to suspicion that data governance deviates from the rule of law (Wang & Huang, 2022).
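The three-tier categorization described above can be illustrated with a minimal sketch. It assumes each dataset is annotated with simple yes/no attributes; all names below (OpenLevel, Dataset, classify and the field names) are hypothetical and are not taken from any local regulation, and the sketch deliberately ignores the question of which release route (cleansing or consent) is appropriate for which kind of restriction.

```python
from dataclasses import dataclass
from enum import Enum


class OpenLevel(Enum):
    NON_OPEN = "non-open"
    CONDITIONAL = "conditionally open"
    UNCONDITIONAL = "unconditionally open"


@dataclass
class Dataset:
    # Hypothetical attributes; real catalogs define their own metadata.
    name: str
    has_personal_info: bool = False
    has_commercial_secrets: bool = False
    affects_national_security: bool = False
    opening_prohibited_by_law: bool = False
    high_security_or_processing_needs: bool = False
    highly_time_sensitive: bool = False
    needs_continuous_access: bool = False
    cleansed_and_declassified: bool = False
    right_holder_consented: bool = False


def classify(ds: Dataset) -> OpenLevel:
    """Assign one of the three openness tiers to a dataset."""
    restricted = (
        ds.has_personal_info
        or ds.has_commercial_secrets
        or ds.affects_national_security
        or ds.opening_prohibited_by_law
    )
    # Restricted data may be re-tiered once cleansed or consented to.
    released = ds.cleansed_and_declassified or ds.right_holder_consented
    if restricted and not released:
        return OpenLevel.NON_OPEN
    if (
        ds.high_security_or_processing_needs
        or ds.highly_time_sensitive
        or ds.needs_continuous_access
    ):
        return OpenLevel.CONDITIONAL
    # Everything else falls into the residual, unconditionally open tier.
    return OpenLevel.UNCONDITIONAL
```

For example, under these assumptions a dataset containing personal information is non-open, but the same dataset falls into one of the two open tiers once it has been cleansed and declassified. In practice the decision also depends on the review procedures of the responsible authorities, which a simple rule of this kind cannot capture.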
3.2. Comparative Study on the Design of Rules for Public Data Openness
The openness of public data is not only a topic that needs to be addressed in China. Other countries are equally concerned with it and with its significance for realizing the value of public data and stimulating the development of the digital economy. The United States, the European Union and other countries and international organizations have also formulated special rules on this issue.
3.2.1. Design of Rules for Public Data Openness in the United States
The U.S. Open Government Directive, issued in 2009, already required the federal government to make government data available to the public on its website. In 2013, the Open Data Policy required agencies to create comprehensive open data inventories and public data lists. In 2019, the OPEN Government Data Act, enacted as part of the Foundations for Evidence-Based Policymaking Act, elevated this requirement to a statutory obligation for the government. It mandates that the government make data available in a platform-independent, machine-readable, publicly accessible format that does not prevent the public from re-purposing it, while remaining mindful of privacy and data security (Lin, 2023).
The United States has also established the Chief Data Officer (CDO) and the Chief Data Officer Council (CDOC) to manage and oversee public data openness. The chief data officer is responsible for meeting agency data needs, managing agency data assets, encouraging agencies and the public to use public data, and reviewing agencies’ open data efforts. The chief data officer council is responsible for evaluating and further optimizing agencies’ open data efforts, improving the federal government’s data collection efforts, and promoting open data sharing among agencies.
3.2.2. Design of Rules for Public Data Openness in the EU
The EU’s policies and rules for public data openness include Directive (EU) 2019/1024 (enacted in 2019), A European Strategy for Data (proposed in 2020), and the Data Governance Act (adopted in 2022). Directive (EU) 2019/1024 establishes general rules for the openness of public data, requiring governments to make public data available in a machine-readable form that allows for bulk downloads. A European Strategy for Data sets out the EU’s policy measures and investment strategy for the development of the digital economy over the next five years, proposing the long-term openness of government-held data and the building of an interoperable data space to give back to and benefit society. The Data Governance Act allows natural and legal persons to access and re-use public data in a secure processing environment dominated by the public sector, subject to the protection of personal information and trade secrets, and also responds to issues of intellectual property rights and personal information that are not covered by the Directive.
In June 2024, the High-Value Dataset Implementing Regulation (HVD Implementing Regulation) took effect, bringing new changes to the EU’s public data openness policy. The HVD Implementing Regulation makes more detailed provisions on the high-value datasets proposed in Directive (EU) 2019/1024. It requires that high-value data be made available free of charge in national data portals and accessible through application programming interfaces and bulk downloads, and it specifies which high-value datasets should be made available to the public and the ways and procedures for opening them. This helps the government and the public make better use of public data while ensuring the consistency and fairness of the EU’s data access policy. In addition, the regulation provides for a mechanism to assess the effectiveness of implementation: departments are required to produce a statement every two years describing the measures they have taken to implement the regulation.
3.3. Improvement of the Design of Rules for Public Data Openness in China
The results of active exploration by local governments and the governance experience of other countries and international organizations have established a solid cornerstone for China to optimize the regulation of open public data. China should summarize its domestic experience and lessons learned as soon as possible, and combine them with international practices of open public data regulation to improve its own open public data mechanism, release more valuable data, promote the sharing and reuse of public data, and drive scientific and technological innovation and the development of industries and the economy.
The formulation of unified legal norms on public data openness is a necessary prerequisite for ensuring that public data openness is “legally enforceable” and for improving the regulation of public data openness in China. The practice of local governments leading the development of open public data rules may lead to data silos and data monopolization, which is not conducive to exchanges between localities and between local governments and data users, and may even exacerbate barriers to data sharing and utilization. Local governments, in the process of developing public data openness in their regions, may also experience the phenomenon of deviating from the central government’s development path for public data openness. Therefore, breaking down the rule barriers and grasping the development direction of open public data across the country through unified legislation is the first task in regulating open public data.
When formulating unified rules for the opening of public data, it is necessary to define the purpose of the legislation, and build a framework that includes at least the basic principles of public data opening, division of responsibilities, approach to openness, monitoring and evaluation mechanisms, and platform construction on the premise of clarifying the scope of public data.
With regard to the purpose of legislation, it should be emphasized that the opening up of public data is not only to meet the requirements of openness and transparency in administrative procedures, but also to revitalize the value of public data and to prepare for the promotion of the development and utilization of public data by society.
With regard to the basic principles of openness, three points should be made clear. First, public data opening activities need to comply with laws and regulations; in other words, the purpose, procedures and subject qualifications must be legal, and the prohibitions of the law must not be violated. Second, the opening of public data needs to be transparent: the standards and procedures of data openness should be made public, so as to safeguard the public’s right to know and to ensure that there is no discrimination in the way members of the public access open data. Third, data security must be strictly observed; that is, public data opening activities must be carried out on the premise of protecting personal information and commercial secrets and maintaining national security. Finding an appropriate balance between open data access and confidentiality for companies and individuals is one of the most important and complex tasks in the digital society (Palfrey & Gasser, 2012), and the opening of public data certainly cannot avoid this requirement, especially since public data may also involve important national interests.
With regard to the division of responsibilities, the allocation of responsibilities for public data openness between the central and local authorities and between local departments should be clarified to ensure that public data openness is carried out in an orderly manner throughout the country. The authorities responsible for public data openness and the authorities responsible for managing it should also be identified, the scope of responsibilities of each authority should be delineated, and a communication and coordination mechanism should be established between the authorities, so as to put the requirements of the rules on the openness of public data into practice.
With regard to the approach to openness, the system of managing public data through lists should be adhered to and optimized, so that the corresponding authorities can conveniently carry out the work of opening public data and members of the public can easily look up the information they need. The hierarchical and classified opening of public data should be further improved, and the types of public data to be opened should be determined in accordance with the degree of confidentiality of the data and the scenarios of their application. Public data that are of immediate national interest or of great significance to scientific and technological innovation and industrial development should be prioritized for opening, and the machine-readability and re-usability of open data should be ensured so that the value of the data can be fully realized. When conditionally open public data are disclosed, the qualification examination of applicants should be strengthened, and the identity of the applicant, the purpose of the application and the scope of openness should be clarified. Data that may involve personal privacy, commercial secrets or even national security should be carefully examined by specialized data security supervisory and management departments, based on clear audit criteria, before a decision is made on whether to open them. When such data are opened, it should be ensured that the sensitive information has been completely removed, and a mechanism for return visits, inspections and accountability should be set up at a later stage, so as to prevent applicants from making improper use of the conditionally open data.
With regard to the supervision and evaluation mechanism, the responsible authority should regularly report to the management authority on its work on open data, while the management authority needs to formulate corresponding evaluation criteria to examine the results of the responsible authority’s work, and at the same time make suggestions on how to continue to optimize the work on public data openness. The responsible authority and the management authority should also set up a special public communication channel to regularly learn about the specific needs of the public for data openness, and adjust and update the plan and list of public data openness according to the needs.
4. Platform Construction for Public Data Openness
Public data opening is an activity in which data holders, led by the government, share their data with the public through an open platform; it is essentially the supply of data. The utilization of public data relies on the open platform and the market competition mechanism to trade the right to use data and derivative products and services; it is essentially the matching of supply and demand for public data (Shang, 2024). The public data opening platform is therefore not only a key to realizing public data opening, but also a link connecting public data opening and utilization.
4.1. Current Status of China’s Platform Construction for Public Data Openness
China opens public data by category through open data platforms. To obtain conditionally open data, qualified applicants must apply to the data open platform or other relevant data carriers. The applicant should meet the conditions set in advance by the data carrier or open platform, including but not limited to requirements for use, data security, technical capacity, credit and utilization feedback. For unconditionally open public data, natural persons, legal persons and unincorporated organizations need not register or apply; they can obtain the data directly through the open platform by way of data download or interface call.
As of August 2023, China had established 22 provincial public data open platforms and 204 city public data open platforms (including municipalities, sub-provinces and prefectural-level administrative regions). This is a significant increase over the number of platforms in 2012, when the first ones went live.
However, as with the rules on public data openness, local governments are also “doing things their own way” when it comes to the construction of public data open platforms. The government has excessive discretion over whether to open data and what data to open, and there is no uniform standard. The content and quality of open data also fall short of the public’s needs, and the degree of openness of highly useful public data, such as data on public security, commerce, industries and market supervision, is insufficient (Wang, 2021). The user experience of some public data open platforms is poor, and the public is left at a loss when faced with numerous public data resources.
4.2. Comparative Study on the Platform Construction for Public Data Openness
Under the requirements of the US Open Government Directive, in order to further improve the public data openness system and enhance the quality of data openness, the United States launched a data openness platform, data.gov, in 2009, which not only provides public data, but also provides data analysis tools, data project incubation resources, and related cases to help users explore the value of public data. In 2019, after the OPEN Government Data Act was enacted, the quantity and quality of the platform’s public data disclosure was further enhanced. To date, data.gov has collected and aggregated nearly 300,000 usable datasets and dataset collections from more than one hundred organizations, with more than one million views per month, playing an important role in unlocking the potential of data to help decision-making and drive innovation and economic development.
Data.europa.eu is the open platform for EU public data, managed by the Publications Office of the European Union. The platform provides more than 1.3 million public data sets from the EU in the fields of economics, finance, agriculture, energy, environment and so on. Users can not only search for relevant public data within a specific field, but also search for the content they need by country. In addition, the platform provides a series of learning courses to help users recognize and utilize public data.
4.3. Improvement of China’s Platform Construction for Public Data Openness
The public data open platform is a fundamental facility necessary for promoting the open sharing and subsequent utilization of public data. It is also a centralized manifestation of the results of the government’s work on public data openness. Therefore, optimizing the construction of the public data open platform is an important part of improving the public data open system.
In view of the current situation of the construction and operation of public data open platforms, China first needs to coordinate the establishment and operation of local public data open platforms on the basis of unified open standards and formats, and promote the interconnection and interoperability of existing open platforms, so as to facilitate users’ access to data and the government’s supervision and management of the opening of public data.
Secondly, with regard to the use of the public data open platform, China should not only continue to increase the content of open public data, but also improve the search function of the platform, increase the recommendation of related public data resources, and set up communication and consulting boards, so as to optimize the experience of using the open platform. At the same time, China should strengthen the maintenance of the public data open platform, encourage, support and guide the development and utilization of platform security technologies, and cultivate data security awareness among public data users, so as to prevent the security risks that may arise from the opening and sharing of public data. In addition, China should provide users with course resources for querying and utilizing public data, so as to improve citizens’ ability to access and utilize public data.
While improving the existing public data open platform, it is more important for China to accelerate the construction process of a national unified public data open platform, so as to form a “one-stop” service for the opening and utilization of public data, and to respond to the realistic requirements of China’s unified rules for the opening of public data.
5. Conclusion
As an important part of data resources, public data is characterized by its large scale, high quality and diverse collection methods, and it holds great political, social and economic value. Opening up public data is the precursor to utilizing public data and obtaining its value. In practice, the opening and the utilization of public data are closely linked and synergistic. Data opening is only the beginning, and data utilization is the means of tapping the value of public data and driving economic development.
China still has a long way to go in opening and sharing public data. On the basis of summarizing the existing practical achievements and learning from advanced international experience, the government should clarify the specific scope of public data and speed up the formulation of nationally unified legal norms on public data opening. At the same time, it should promote the interconnection and interoperability of local public data open platforms and build, at an early date, a national public data open platform that covers the whole country and offers richer and more diversified data, thereby stimulating new vitality in public data, driving scientific and technological innovation and industrial prosperity, seizing the initiative in the development of the digital economy and forging a new development paradigm.