Personal Data v. Big Data in the EU: Control Lost, Discrimination Found

We live in the Big Data age. Firms process enormous amounts of raw, unstructured and personal data derived from innumerable sources. Users consent to this processing by ticking boxes when using movable or immovable devices and things. The users' control over the processing of their data appears today mostly lost. As algorithms sort people into groups for various purposes, both legitimate and illegitimate, fundamental rights are endangered. This article examines the lawfulness of the data subject's consent to the processing of their data under the new EU General Data Protection Regulation. It also explores the possible inability to fully anonymize personal data and provides an overview of specific "private networks of knowledge", which firms may construct in violation of people's fundamental rights to data protection and to non-discrimination. As the Big Data age is here to stay, law and technology must together reinforce the beneficent use of Big Data, to promote the public good, but also people's control over their personal data, the foundation of their individual right to privacy.


Introduction
In the age of Big Data (King & Forder, 2016: p. 698; Giannakaki, 2014: p. 262), information (Lessig, 2006: pp. 180-185; Summers & DeLong, 2001) fully confirms its etymological origin (Araka, Koutras, & Makridou, 2014: pp. 398-399) and becomes abundantly available (Himma, 2007). It constitutes a mass-produced good (Battelle, 2005), consumed as a commodity rather than leveraged as a tool for the personal growth of the individual or the development of democratic societies (Koelman, 2006). Personal data, i.e. "any information relating to an identified or identifiable natural person" (Article 4(1) of the GDPR), are no exception. What renders data "personal" is not actual identification, but the capacity to identify, directly or indirectly, one person (Tene, 2008: p. 16). To sum up, if there is a capacity to identify the individual to whom a piece of data, such as a recorded heart rate, relates, the data are personal, and in particular health data (Panagopoulou-Koutnatzi, 2015b), and are fully regulated by the GDPR.
Given the above practices, which reveal at the very least a significant loss of the user's control over her personal data, this paper examines the validity and lawfulness of the data subject's consent to the processing of their personal data, studies the apparent inability to anonymize such data and provides an overview of specific "private networks of knowledge", which any digital company is able to build (to own and to control) in violation of the fundamental right to non-discrimination.

The Subject's Consent to Data Processing
One of the fundamental principles of data protection law in Europe and beyond is respect for personal autonomy (Bottis, 2014: p. 148). Legal provisions on personal data safeguard constitutionally protected rights to informational self-determination (Kang, Shilton, Estrin, Burke, & Hansen, 2012: p. 820). Hence, it has been consistently argued that the fundamental right to the protection of personal data (Article 8(1-2) of CFREU; Article 16(1) of TFEU) refers to control by the subject over the processing of her data (Oostveen & Irion, 2016; Rengel, 2014). The key tool for legal control of personal data is the subject's consent to the processing (Tene & Polonetsky, 2013: pp. 260-263; Solove, 2013: p. 1894; A29DPWP, 2011). The European lawmaker recently regulated the protection of natural persons with regard to the processing of personal data and the free movement of such data (GDPR), took into account these aspects of control (Recitals (7) and (68) of the GDPR) and provided that the data subject's prior consent shall be a necessary prerequisite for the lawfulness of data processing (Article 6(1)(a) of the GDPR). In particular, under the GDPR, the collection and processing (Article 4(2) of the GDPR) of personal data shall be lawful if the data subject has given consent to the processing of his or her personal data (Recitals (4) and (42) of the GDPR) for one or more specific purposes (Article 6(1)(a), Recital (32) of the GDPR). Moreover, "consent" of the data subject means any freely given, specific, informed and unambiguous indication of the data subject's wishes by which he or she, by a statement or by a clear affirmative action, signifies agreement to the processing of personal data relating to him or her (Recitals (42), (43), Articles 4(11), 7(4) of the GDPR).
One would assume, therefore, that a "single mouse-click" on any privacy policy's box, by which users may give their consent, should not be considered to fulfill the criterion of a "freely given, specific, informed and unambiguous" indication of the data subject's wishes by which the individual signifies agreement to the processing. Quite the opposite is true: under Recital 32 of the GDPR, consent can also be given by "ticking a box when visiting an internet website" (the repealed Directive 95/46/EC made no mention of the capacity to give consent simply by ticking a box).
Thus, the data subject's consent to the collection and processing of her personal data may be validly and lawfully given by a single "mouse-click" on the box of a webpage whose terms of use and privacy policy almost nobody reads (Turow, Hoofnagle, Mulligan, Good, & Grossklags, 2006: p. 724; Pingo & Narayan, 2016: p. 4; Gindin, 2009; Chesterman, 2017). Given that, as documented, in most cases users "generously click" on any box that may "pop up" (Vranaki, 2016: p. 29), private enterprises legally (and with the individual's "freely given, specific, informed and unambiguous" indication of wishes) process (e.g. collect, record, organize, structure, store, adapt, alter, retrieve, consult, use, disclose, disseminate, make available, combine, restrict, erase or destroy) personal data.

Anonymizing Data: A Failure?
In several cases, after having collected personal data, firms anonymize them. This means that "effective" measures are taken and the data are further processed in a manner which renders the re-identification of the individual impossible (Hon, Millard, & Walden, 2011; Stalla-Bourdillon & Knight, 2017). Anonymization constitutes further processing (A29DPWP, 2014) and always comes after the collection of data. Hence, given the legislated validity of consent that users have already given, often by a single "mouse-click", companies may legally anonymize their collections of personal data. Anonymized (ex-personal) data can then be freely used, e.g. shared with third parties or sold, as the rules of data protection do not apply to "personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable" (Recital (26) of the GDPR).
But in the age of Big Data, there is probably no safe way to render personal data truly anonymous (Scholz, 2017: p. 35; Schneier, 2015: pp. 50-53). Even after "anonymization", the data subject remains technologically identifiable (Ohm, 2010: p. 1701; Sweeney, 2000; Golle, 2006; Gymrek, McGuire, Golan, Halperin, & Erlich, 2013; Bohannon, 2013; Narayanan & Shmatikov, 2008). The inability to anonymize personal data in a Big Data environment is due to the collection and correlation of a huge volume of data from multiple sources. The result is that "countless conclusions" can be drawn about an individual, who may thereby be identified, directly or indirectly (Tene & Polonetsky, 2013: p. 257; Cunha, 2012: p. 270). In other words, anonymization can only be achieved in "Small Data" environments, given that the volume and the variety of data processed in the world of Big Data facilitate and encourage the (re)identification of any individual (Mayer-Schönberger & Cukier, 2013: p. 154).
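A minimal sketch, in Python, may illustrate the kind of linkage attack the literature cited above describes (Sweeney, 2000; Narayanan & Shmatikov, 2008): a dataset stripped of names is joined with an auxiliary public dataset on a few quasi-identifiers, and the individuals re-emerge. All records, field names and values here are hypothetical and serve only to illustrate the mechanism, not any real dataset.

```python
# Hypothetical illustration of re-identification by linking an "anonymized"
# dataset to an auxiliary public one via shared quasi-identifiers.

# "Anonymized" health records: direct identifiers removed, but quasi-identifiers
# (ZIP code, birth date, gender) retained.
anonymized_records = [
    {"zip": "10115", "birth_date": "1980-03-14", "gender": "F", "diagnosis": "diabetes"},
    {"zip": "10115", "birth_date": "1975-11-02", "gender": "M", "diagnosis": "asthma"},
]

# Auxiliary dataset containing names (e.g. a public register or social-media list).
public_records = [
    {"name": "Jane Doe", "zip": "10115", "birth_date": "1980-03-14", "gender": "F"},
    {"name": "John Roe", "zip": "10115", "birth_date": "1975-11-02", "gender": "M"},
]

QUASI_IDENTIFIERS = ("zip", "birth_date", "gender")

def reidentify(anon, public):
    """Link records whose quasi-identifiers match exactly."""
    matches = []
    for a in anon:
        key = tuple(a[q] for q in QUASI_IDENTIFIERS)
        for p in public:
            if tuple(p[q] for q in QUASI_IDENTIFIERS) == key:
                matches.append({"name": p["name"], "diagnosis": a["diagnosis"]})
    return matches

print(reidentify(anonymized_records, public_records))
# [{'name': 'Jane Doe', 'diagnosis': 'diabetes'}, {'name': 'John Roe', 'diagnosis': 'asthma'}]
```

The point of the sketch is simply that removing names is not enough: as long as a handful of attributes are rare in combination, correlation with any other available source restores identification.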
We see, therefore, the anonymization of personal data, in a Big Data environment, portrayed as a failure. The same technology which reassured us that we could not be identified, and so our personal data could be used for noble purposes such as medical research, now betrays us. A huge data set is almost magically, and reassuringly, turned anonymous, and then, with the addition of a piece of information or two, it is turned back, at some later point in time, to full identification (De Hert & Papaconstantinou, 2016: p. 184). If this is the case, where is our consent in this situation? A "single click" consent to this processing is from the outset pointless. The very specific purpose of the processing for which the individual has to give her initial consent has often, at the time of the "mouse-click", not even been decided yet by the firm that is the controller. Thus, when users in fact ignore the final purpose (Steppe, 2017: p. 777; A29DPWP, 2008) for which consent is given (Bitton, 2014: p. 13), it is fair to argue that they have lost control over their data (Solove, 2013: p. 1902). If no genuine consent can be given and if anonymization is indeed practically impossible, then there is no control at all (Carolan, 2016; Danagher, 2012). But this loss of control contrasts strongly with the goals and principles of the right to the protection of personal data, which in Europe is constitutional in rank. It defeats the raison d'être of all European data protection legislation since 1995.

Knowledge and the Fundamental Right to Non-Discrimination
Although the right to the protection of personal data is fundamental, probably not many people are aware of it, and far fewer have been documented to exercise the powers it grants them (O'Brien, 2012; Hill, 2012). That people fail to exercise their rights or do not care about their personal data does not mean that this "apathy" should be "applauded" (Tene & Polonetsky, 2013: p. 263). A very important reason why individuals should be required to demonstrate greater interest in their data protection is that control over the processing of personal data enables the data controller to know (Mayer-Schönberger & Cukier, 2013: pp. 50-61; Cohen, 2000: p. 402). In fact, in the Big Data environment, control over the processing of personal data enables any firm to build its own "private networks of knowledge" (Powles & Hodson, 2017). These networks can lead, or perhaps have already led, to the accumulation of power, a power of unprecedented extent and nature resting in "private hands". This power may undermine the fundamental right to equality and non-discrimination (Article 21 of CFREU). As early as 1993, Gandy spoke of a digital environment where databases profiled consumers and sorted them into groups, each of which was given different opportunities (Gandy, 1993). Some years later, other scholars (Gilliom, 2001; Haggerty & Ericson, 2006) built on Gandy's theory and explained the ways in which the then-new tools and datasets were used by governments and private companies alike to sort people and discriminate against them. Today, as these authors argue, private enterprises focus on human beings and study users' behaviors, movements and desires, so as to "mathematically" predict people's trustworthiness and calculate each person's potential as a worker, a criminal or a consumer. "Private algorithms", which process users' data, are seen as "weapons of math destruction" that threaten democracy and the universal value of equality (O'Neil, 2016: pp. 2-3, p. 151).
Today's "free" Internet is paid for mainly by advertising, for the needs of which tons of personal data are collected (Richards & King, 2016: pp. 10-13).
Processing these data with the help of cookies enables firms to identify the user and to detect her online, or even offline, activities (Lam & Larose, 2017; Snyder, 2011). Thereafter, the user's data are used by private parties to profile people (Article 4(4) of the GDPR) and to create "target groups" at which personalized ads can be aimed (Förster & Weish, 2017: p. 19). In the Big Data environment, profiling or sorting consumers into groups may indeed be extremely effective. But the line between lawful sorting and profiling in favor of private interests and unlawful discrimination, contrary to the principle of equal treatment, based on the personal data collected is blurry (Gandy, 2010; Article 21(1) of CFREU). It is also alarmingly disappearing, as users are discriminated against on the grounds of their personal data not only in advertising, but in general, whenever private companies provide services or simply operate by analyzing users' data and "training their machines" (Mantelero, 2016: pp. 239-240; Crawford & Schultz, 2014: pp. 94-95, p. 98; Veale & Binns, 2017; Hastie, Tibshirani, & Friedman, 2009).
Given the correlations that Big Data allows and encourages, any private company that knows, for example, a user's gender, origin or native language may discriminate against her (Boyd & Crawford, 2011; Panagopoulou-Koutnatzi, 2017). This can happen through sorting or profiling not only on the grounds of this information, but also on the grounds of multiple other personal data (Tene & Polonetsky, 2013: p. 240), which the private party may discover by combining a huge volume of data, such as the exact address where the user lives, the fact that a consumer suffers from diabetes or that she is the mother of three minors (O'Neil, 2016: pp. 3-5, pp. 130-134, p. 151; Rubinstein, 2013: p. 76). Hence, a private company can use these data to create a system that sorts people into lists, puts the most promising candidates on top, and "picks" the latter to fill the vacant posts in the company (O'Neil, 2016: pp. 3-5, pp. 130-134), as the sketch below illustrates.
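The following hypothetical sketch, assuming invented data, fields and weights, illustrates the sorting mechanism just described: a hiring system ranks candidates by a seemingly neutral score, yet a proxy variable (here, a ZIP code correlated with origin or income) silently reintroduces a protected characteristic. It is not a reconstruction of any real firm's system.

```python
# Hypothetical candidate-sorting sketch: no "protected" field is used directly,
# but a ZIP-code weight learned from past (biased) hiring data acts as a proxy
# for a protected characteristic, so the ranking still discriminates.

candidates = [
    {"name": "A", "years_experience": 4, "zip": "10115"},
    {"name": "B", "years_experience": 6, "zip": "20095"},
    {"name": "C", "years_experience": 6, "zip": "10115"},
]

# Invented weights: the ZIP code contributes to the score even though it says
# nothing about a candidate's ability to do the job.
ZIP_WEIGHT = {"10115": 1.0, "20095": -0.5}

def score(candidate):
    """Compute a 'neutral-looking' score that quietly encodes a proxy attribute."""
    return 2.0 * candidate["years_experience"] + ZIP_WEIGHT.get(candidate["zip"], 0.0)

# Sort people into a list, put the most promising on top, and "pick" from the top.
ranked = sorted(candidates, key=score, reverse=True)
print([c["name"] for c in ranked])  # ['C', 'B', 'A']: equal experience, unequal outcome
```

The design point is that candidates B and C have identical experience, yet the proxy variable alone decides their order, which is precisely the blurred line between profiling and discrimination discussed above.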
To sum up, sorting or profiling by "private algorithms", in favor of private interests and at the expense of people's fundamental right to equality and non-discrimination; analyzing and correlating data so as to project the "perfect ad" (see A29DPWP, 2013: p. 46), to promote the "appropriate good" at the "appropriate price" (Turow & McGuigan, 2014; EDPS, 2015: p. 19), to predict criminal behavior (Chander, 2017: p. 1026) or to "evaluate" the accused before sentencing courts (State v. Loomis, 2016): all these actions place almost insurmountable barriers to regulating the processing of personal data (Crawford & Schultz, 2014: p. 106). Knowledge and power seem to accumulate in the hands of private entities in violation of people's fundamental rights. Firms can, and do, dictate "privacy policies and terms of processing of data", while users keep ticking boxes with their eyes closed (Manovich, 2011). This reality calls for solutions that will enable people to regain control over their personal data, and thus over themselves (Mitrou, 2009).

Conclusion
The processing of personal data has served several economically and socially useful purposes (Manovich, 2011; Prins, 2006: pp. 226-230; Knoppers & Thorogood, 2017). The processing of Big Data is even more promising. At the same time, however, the lawfulness of the mass processing of personal data in the Big Data environment is questioned by many scholars. Although it is very important to examine this lawfulness in each emerging program or piece of software, during the use of which consent is "grabbed by a mouse-click", it is much more important to understand the real conditions of this personal data processing, which many of us experience every day, and almost all of us experience many times each and every day.
The mass collection of personal data in an environment in which people do not meaningfully participate, in a setting of possibly opaque and discriminatory procedures (for example, predicting people's behavior in general via an algorithm and then applying this prediction to a particular person), should concern all of us deeply. This is especially so when people cannot know the purpose, or are even unaware of the very fact, of processing and, hence, never give their consent in any meaningful way. The "consent fallacy" (i.e. the inability of the individual websurfer to form and express free, conscious and informed choices, Mitrou, 2009: p. 466; Mitrou, 2017: p. 77) is accentuated to the highest possible degree. The processing of massive amounts of personal data, in combination with the accumulation of knowledge and power in "private networks" in violation of the fundamental right to non-discrimination, calls for a new, progressive approach to the legal provisions that protect personal data, and also for the development of new technology inserting privacy protection into the very design of information systems dealing with Big Data (De Hert & Papaconstantinou, 2016).
With the General Data Protection Regulation, the European legislator made a significant effort to protect people's rights over their personal data. Simultaneously, firms constantly devise and/or use new technologies of data processing. This brings back to the discussion table some older academic opinions (Samuelson, 2000) that view the commodification of personal data as a potential way, or even the