<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.4 20241031//EN" "JATS-journalpublishing1-4.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="1.4" xml:lang="en">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">jss</journal-id>
      <journal-title-group>
        <journal-title>Open Journal of Social Sciences</journal-title>
      </journal-title-group>
      <issn pub-type="epub">2327-5960</issn>
      <issn pub-type="ppub">2327-5952</issn>
      <publisher>
        <publisher-name>Scientific Research Publishing</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.4236/jss.2025.1312030</article-id>
      <article-id pub-id-type="publisher-id">jss-148369</article-id>
      <article-categories>
        <subj-group>
          <subject>Article</subject>
        </subj-group>
        <subj-group>
          <subject>Business</subject>
          <subject>Economics</subject>
          <subject>Social Sciences</subject>
          <subject>Humanities</subject>
        </subj-group>
      </article-categories>
      <title-group>
        <article-title>Differential Privacy Implementation for Anonymous Student Feedback on Campus Safety and Belonging</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name name-style="western">
            <surname>Liu</surname>
            <given-names>Emma</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <name name-style="western">
            <surname>Guo</surname>
            <given-names>Joyce</given-names>
          </name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
      </contrib-group>
      <aff id="aff1"><label>1</label> Kent School, Kent, CT, USA </aff>
      <aff id="aff2"><label>2</label> Booth School of Business, The University of Chicago, Chicago, IL, USA </aff>
      <author-notes>
        <fn fn-type="conflict" id="fn-conflict">
          <p>The authors declare no conflicts of interest regarding the publication of this paper.</p>
        </fn>
      </author-notes>
      <pub-date pub-type="epub">
        <day>09</day>
        <month>12</month>
        <year>2025</year>
      </pub-date>
      <pub-date pub-type="collection">
        <month>12</month>
        <year>2025</year>
      </pub-date>
      <volume>13</volume>
      <issue>12</issue>
      <fpage>399</fpage>
      <lpage>410</lpage>
      <history>
        <date date-type="received">
          <day>10</day>
          <month>10</month>
          <year>2025</year>
        </date>
        <date date-type="accepted">
          <day>23</day>
          <month>12</month>
          <year>2025</year>
        </date>
        <date date-type="published">
          <day>26</day>
          <month>12</month>
          <year>2025</year>
        </date>
      </history>
      <permissions>
        <copyright-statement>© 2025 by the authors and Scientific Research Publishing Inc.</copyright-statement>
        <copyright-year>2025</copyright-year>
        <license license-type="open-access">
          <license-p> This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link> ). </license-p>
        </license>
      </permissions>
      <self-uri content-type="doi" xlink:href="https://doi.org/10.4236/jss.2025.1312030">https://doi.org/10.4236/jss.2025.1312030</self-uri>
      <abstract>
        <p>This paper is a review of differential privacy in data collection. Differential privacy is a mathematical framework that protects individual privacy when third parties collect and analyze sensitive information. The system works by adding carefully controlled mathematical noise to datasets to conceal any specific person’s data in the analysis. We will further explore its current applications in the fields of healthcare and public policy and detail our program developed upon this foundation. Utilizing differential privacy to maintain anonymity, the program is a survey that collects student feedback within a high school or college setting. The goal of this project is to help schools better understand student experiences and concerns while ensuring that personal information remains confidential and protected.</p>
      </abstract>
      <kwd-group kwd-group-type="author-generated" xml:lang="en">
        <kwd>Differential Privacy</kwd>
        <kwd>Data Anonymization</kwd>
        <kwd>Local Differential Privacy</kwd>
        <kwd>Student Feedback</kwd>
        <kwd>Campus Safety</kwd>
        <kwd>Privacy-Preserving Data Collection</kwd>
        <kwd>Educational Data Ethics</kwd>
        <kwd>Laplace Mechanism</kwd>
        <kwd>Privacy Budget</kwd>
        <kwd>Anonymous Surveys</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec1">
      <title>1. Introduction</title>
      <p>In today’s data-driven society, organizations collect huge amounts of personal data to inform decisions and improve services. However, such heightened data usage raises serious concerns about the privacy of individuals. Traditional anonymization methods are widely used: hospitals remove patient names from medical records before sharing data for research, schools strip student names from academic performance data, and companies remove customer identifiers before analyzing purchasing patterns. However, these methods often fail to fully protect identities. This occurs because they fail to account for indirect identifiers, such as age, gender, ZIP code, etc. Such pieces of information, when combined with other publicly available data, can be used to re-identify individuals. The cross-referencing process works by finding unique or rare combinations of characteristics. For instance, there might be only one 65-year-old male living in a specific small town who visited a cardiologist on a particular date, making him easily identifiable even in “anonymous” medical data. Anyone with access to these identifiers could cross-reference to pinpoint an individual, making the data collection insecure. The consequences of such re-identification can be severe: individuals may face institutional discrimination or even legal repercussions based on their private data. For students, this could lead to disciplinary actions. Hence, differential privacy was developed to protect against this kind of risk, ensuring that no individual’s data has a significant influence on the output. By “no significant influence”, we mean that the statistical results should remain virtually the same whether any particular person’s data is included or excluded from that dataset.</p>
      <p>There are two main types of differential privacy: local and central differential privacy. In central differential privacy, noise is added by the collector to the entire dataset right before it is released. In local differential privacy, which is the method that our implementation utilizes, noise is added to each individual’s data before it is sent to the collector. Local differential privacy provides stronger privacy guarantees since the raw data is protected from the collector. However, it typically requires more noise to achieve the same effectiveness as central differential privacy.</p>
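<p>The contrast between the two models can be sketched in a few lines of Python (a hypothetical illustration, not the survey implementation described later; the ratings and parameters are invented):</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings on a 1-10 scale; epsilon and values are illustrative.
true_ratings = np.array([7, 3, 9, 5, 6, 8, 4, 7, 6, 5], dtype=float)
epsilon = 1.0
sensitivity = 9.0  # U - L = 10 - 1: one response changes a sum by at most 9

# Central DP: a trusted collector computes the exact sum, then adds ONE noise draw.
central_sum = true_ratings.sum() + rng.laplace(0, sensitivity / epsilon)

# Local DP: every respondent perturbs their own value before sending it, so the
# collector never sees raw data; the aggregate then carries n noise draws.
local_sum = (true_ratings + rng.laplace(0, sensitivity / epsilon, true_ratings.size)).sum()
```

<p>Both sums estimate the true sum (60 here) without bias, but the local estimate’s noise variance is n times the central one’s, which is why local differential privacy needs more noise for the same accuracy.</p>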
      <p>Differential privacy is a mathematical framework that tackles this issue. By inserting carefully calibrated randomness into datasets, differential privacy renders the presence or absence of any individual’s data insignificant to the overall outcome. Inserting randomness means adding mathematical noise, small random values, either to individual responses or to the final statistical results. For medical data, for example, one might add small random values to a patient’s age or test results; in a survey, one could randomly flip some “yes” answers to “no” with a small probability. The randomness is not arbitrary but calibrated to provide strong privacy guarantees while maintaining the usefulness of the data for statistical analysis. Differential privacy requires careful implementation and can reduce data accuracy, especially for small datasets. The process is typically handled by data scientists or privacy engineers who implement the algorithms. While differential privacy provides strong mathematical guarantees, it can be “broken” in practice if implemented incorrectly or if the privacy parameters are set poorly. Furthermore, differential privacy has significant practical limitations that restrict its applicability. First, it operates on a “privacy budget”: each query or analysis consumes part of this budget, and once it is exhausted, no further queries can be made without compromising the privacy guarantees, so organizations must decide in advance which analyses they want to perform. Second, all potential queries must be known at the outset of data collection, preventing researchers from asking new questions that arise during the course of a study. These constraints make differential privacy challenging to implement in research environments where questions evolve based on preliminary findings.</p>
      <p>Confidentiality in data collection requires systems that cannot reveal anything substantial about any single person, even to a third party that has access to external information. This level of security is especially important in sensitive environments like schools, where students may feel reluctant to give honest feedback. Differential privacy solves this need by offering a strict, quantifiable guarantee of privacy. It allows the students to answer without fear that their responses will be traced back to them.</p>
      <p>Many school surveys struggle to balance transparency and privacy because they often ask sensitive questions about bullying, mental health, or substance use. Traditional school climate surveys all face this challenge: students may not provide honest answers for fear that their responses could be traced back to them, potentially leading to disciplinary action or unwanted attention from faculty. This project integrates differential privacy into the world of student feedback. Students can trust differential privacy guarantees because they rest on rigorous mathematical proofs rather than policies. The randomness insertion happens automatically through software running on secure school servers or cloud systems, ensuring that the privacy protection is applied consistently. We address this problem by developing a program that allows students in high school or college to give honest feedback through a differentially private survey. In this way, we aim to allow schools to make informed, data-driven decisions while maintaining the trust and confidentiality of students.</p>
      <sec id="sec1dot1">
        <title>Technical Background</title>
        <p>To further understand differential privacy, we need to define some key terms.</p>
        <p>1) Computation—any mathematical operation performed on data.</p>
        <p>2) Dataset—a collection of information about individuals, like survey responses.</p>
        <p>3) Randomized algorithm—a computer program whose output depends on random number generation, so it can produce different results even when run repeatedly on the same input data.</p>
        <p>The core of differential privacy lies in the idea of limiting how much information any single data point contributes to the output of a computation.</p>
        <disp-formula id="FD1">
          <mml:math display="inline">
            <mml:mrow>
              <mml:mi>P</mml:mi>
              <mml:mi>r</mml:mi>
              <mml:mrow>
                <mml:mo>[</mml:mo>
                <mml:mrow>
                  <mml:mi>M</mml:mi>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:mi>D</mml:mi>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>∈</mml:mo>
                  <mml:mi>S</mml:mi>
                </mml:mrow>
                <mml:mo>]</mml:mo>
              </mml:mrow>
              <mml:mo>≤</mml:mo>
              <mml:msup>
                <mml:mtext>e</mml:mtext>
                <mml:mi>ε</mml:mi>
              </mml:msup>
              <mml:mo>×</mml:mo>
              <mml:mi>P</mml:mi>
              <mml:mi>r</mml:mi>
              <mml:mrow>
                <mml:mo>[</mml:mo>
                <mml:mrow>
                  <mml:mi>M</mml:mi>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:msup>
                      <mml:mi>D</mml:mi>
                      <mml:mo>′</mml:mo>
                    </mml:msup>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>∈</mml:mo>
                  <mml:mi>S</mml:mi>
                </mml:mrow>
                <mml:mo>]</mml:mo>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>This mathematical equation captures the principle of differential privacy protection ([<xref ref-type="bibr" rid="B3">3</xref>]). <italic>M</italic> represents a randomized algorithm, <italic>D</italic> represents our original dataset, and <italic>D</italic><italic>'</italic> represents a “neighboring” dataset that is identical to <italic>D</italic> except for one person’s data being added or removed. <italic>S</italic> represents any possible set of outputs that our algorithm might produce.</p>
        <p>This equation states that the probability (<italic>Pr</italic>) that a randomized algorithm (<italic>M</italic>), when run on dataset (<italic>D</italic>), produces an output in some set (<italic>S</italic>) can be at most e<sup>ε</sup> times larger than the probability of producing an output in that same set when run on a neighboring dataset (<italic>D</italic><italic>'</italic>) that differs by one person’s data.</p>
        <p>Here, epsilon (ε) is the privacy loss parameter, which controls how much privacy protection we get. We cannot simply choose epsilon to be incredibly small because that would require adding so much noise that our results would become meaningless. If we set epsilon to exactly 0, we would need infinite noise, making our data completely unusable. Differential privacy fundamentally requires randomness: a deterministic algorithm cannot satisfy the guarantee, and repeated queries must be tracked through the privacy budget, since otherwise an attacker could run the same analysis many times and average out the noise.</p>
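<p>The guarantee can be checked concretely with randomized response, a simple local mechanism discussed later in this section (the choice ε = ln 3 here is an illustrative assumption):</p>

```python
import math

epsilon = math.log(3)  # chosen so that e^epsilon = 3
# Randomized response: report the true yes/no answer with probability
# p = e^eps / (e^eps + 1), and the opposite answer otherwise.
p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)  # = 0.75

pr_yes_given_yes = p_truth       # Pr[output "yes" | true answer "yes"]
pr_yes_given_no = 1 - p_truth    # Pr[output "yes" | true answer "no"]

# The definition Pr[M(D) in S] <= e^eps * Pr[M(D') in S] holds with equality:
ratio = pr_yes_given_yes / pr_yes_given_no  # = 3 = e^epsilon
```

<p>No observed answer shifts the odds of any underlying truth by more than a factor of e<sup>ε</sup>, which is exactly the bound in the definition above.</p>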
        <p>The mechanism most commonly used to achieve this is the Laplace Mechanism, which adds carefully calibrated noise. The amount of noise added depends on the sensitivity of the function and the epsilon value. This mathematical framework provides provable privacy guarantees, meaning we can mathematically demonstrate that privacy will be protected. It also maintains statistical utility, meaning the noisy data can still be used to draw accurate conclusions about overall trends and patterns. However, this doesn’t work equally well for all functions. Simple counting queries and basic statistics work well because they have low sensitivity and require minimal noise. Complex analyses may become unreliable because they suffer from noise accumulation and higher sensitivity. For example, an analysis trying to detect relationships between student safety and academic performance might conclude no relationship exists when the noise drowns out the actual correlation.</p>
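<p>As a minimal sketch of the Laplace mechanism on a low-sensitivity counting query (the data and ε below are assumptions for illustration):</p>

```python
import numpy as np

rng = np.random.default_rng(42)

def laplace_count(data, predicate, epsilon):
    # A counting query changes by at most 1 when one person is added or
    # removed, so its sensitivity is 1 and Laplace scale 1/epsilon suffices.
    true_count = sum(1 for d in data if predicate(d))
    return true_count + rng.laplace(0, 1.0 / epsilon)

# Hypothetical safety ratings; privately count how many fall below 5.
ratings = [7, 3, 9, 2, 6, 8, 4, 7, 6, 5]
noisy_count = laplace_count(ratings, lambda r: r < 5, epsilon=1.0)
```

<p>With sensitivity 1 and ε = 1, the noise has standard deviation about 1.4, so the count stays useful while any single student’s presence is hidden.</p>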
        <p>While pure differential privacy uses the guarantee</p>
        <disp-formula id="FD2">
          <mml:math display="inline">
            <mml:mrow>
              <mml:mi>P</mml:mi>
              <mml:mi>r</mml:mi>
              <mml:mrow>
                <mml:mo>[</mml:mo>
                <mml:mrow>
                  <mml:mi>M</mml:mi>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:mi>D</mml:mi>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>∈</mml:mo>
                  <mml:mi>S</mml:mi>
                </mml:mrow>
                <mml:mo>]</mml:mo>
              </mml:mrow>
              <mml:mo>≤</mml:mo>
              <mml:msup>
                <mml:mtext>e</mml:mtext>
                <mml:mi>ε</mml:mi>
              </mml:msup>
              <mml:mo>×</mml:mo>
              <mml:mi>P</mml:mi>
              <mml:mi>r</mml:mi>
              <mml:mrow>
                <mml:mo>[</mml:mo>
                <mml:mrow>
                  <mml:mi>M</mml:mi>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:msup>
                      <mml:mi>D</mml:mi>
                      <mml:mo>′</mml:mo>
                    </mml:msup>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>∈</mml:mo>
                  <mml:mi>S</mml:mi>
                </mml:mrow>
                <mml:mo>]</mml:mo>
              </mml:mrow>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>practical systems often adopt approximate DP, adding a small failure probability <inline-formula><mml:math><mml:mrow><mml:mi> δ </mml:mi><mml:mo> &gt; </mml:mo><mml:mn> 0 </mml:mn></mml:mrow></mml:math></inline-formula> :</p>
        <disp-formula id="FD3">
          <mml:math display="inline">
            <mml:mrow>
              <mml:mi>P</mml:mi>
              <mml:mi>r</mml:mi>
              <mml:mrow>
                <mml:mo>[</mml:mo>
                <mml:mrow>
                  <mml:mi>M</mml:mi>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:mi>D</mml:mi>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>∈</mml:mo>
                  <mml:mi>S</mml:mi>
                </mml:mrow>
                <mml:mo>]</mml:mo>
              </mml:mrow>
              <mml:mo>≤</mml:mo>
              <mml:msup>
                <mml:mtext>e</mml:mtext>
                <mml:mi>ε</mml:mi>
              </mml:msup>
              <mml:mo>×</mml:mo>
              <mml:mi>P</mml:mi>
              <mml:mi>r</mml:mi>
              <mml:mrow>
                <mml:mo>[</mml:mo>
                <mml:mrow>
                  <mml:mi>M</mml:mi>
                  <mml:mrow>
                    <mml:mo>(</mml:mo>
                    <mml:msup>
                      <mml:mi>D</mml:mi>
                      <mml:mo>′</mml:mo>
                    </mml:msup>
                    <mml:mo>)</mml:mo>
                  </mml:mrow>
                  <mml:mo>∈</mml:mo>
                  <mml:mi>S</mml:mi>
                </mml:mrow>
                <mml:mo>]</mml:mo>
              </mml:mrow>
              <mml:mo>+</mml:mo>
              <mml:mi>δ</mml:mi>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>This <inline-formula><mml:math><mml:mrow><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mi> ε </mml:mi><mml:mo> , </mml:mo><mml:mi> δ </mml:mi></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> form facilitates mechanisms like the Gaussian mechanism (often preferred for high-dimensional or composition-heavy workloads). The key design knob is global sensitivity <inline-formula><mml:math><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi></mml:mrow></mml:math></inline-formula> , the maximum change to a function’s output when one person’s data changes. Mechanisms scale noise to <inline-formula><mml:math><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi></mml:mrow></mml:math></inline-formula> :</p>
        <p>Laplace mechanism (pure DP, previously discussed): add <inline-formula><mml:math><mml:mrow><mml:mtext> Lap </mml:mtext><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mn> 0 </mml:mn><mml:mo> , </mml:mo><mml:mi> b </mml:mi></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math><mml:mrow><mml:mi> b </mml:mi><mml:mo> = </mml:mo><mml:mrow><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi></mml:mrow><mml:mo> / </mml:mo><mml:mi> ε </mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
        <p>Gaussian mechanism (approx. DP): add <inline-formula><mml:math><mml:mrow><mml:mi> N </mml:mi><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mn> 0 </mml:mn><mml:mo> , </mml:mo><mml:msup><mml:mi> σ </mml:mi><mml:mn> 2 </mml:mn></mml:msup></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math><mml:mi> σ </mml:mi></mml:math></inline-formula> proportional to <inline-formula><mml:math><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi><mml:msqrt><mml:mrow><mml:mn> 2 </mml:mn><mml:mi> ln </mml:mi><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mn> 1.25 </mml:mn></mml:mrow><mml:mo> / </mml:mo><mml:mi> δ </mml:mi></mml:mrow></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow></mml:msqrt></mml:mrow></mml:math></inline-formula>.</p>
        <p>Exponential mechanism: select categories (e.g., argmax) with DP when outputs are non-numeric.</p>
        <p>Randomized response/k-ary randomized response (local DP): privatize discrete answers by flipping with calibrated probabilities; aggregate with an unbiased estimator.</p>
        <p>For means of bounded numeric responses in <inline-formula><mml:math><mml:mrow><mml:mrow><mml:mo> [ </mml:mo><mml:mrow><mml:mi> L </mml:mi><mml:mo> , </mml:mo><mml:mi> U </mml:mi></mml:mrow><mml:mo> ] </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> , <inline-formula><mml:math><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi></mml:mrow></mml:math></inline-formula> depends on context:</p>
        <p>Central DP, mean over n users: <inline-formula><mml:math display="inline"><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi><mml:mo> = </mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mi> U </mml:mi><mml:mo> − </mml:mo><mml:mi> L </mml:mi></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow><mml:mo> / </mml:mo><mml:mi> n </mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula>.</p>
        <p>Local DP, per-response release: if you release a privatized individual value <inline-formula><mml:math><mml:mrow><mml:mi> x </mml:mi><mml:mo> ∈ </mml:mo><mml:mrow><mml:mo> [ </mml:mo><mml:mrow><mml:mi> L </mml:mi><mml:mo> , </mml:mo><mml:mi> U </mml:mi></mml:mrow><mml:mo> ] </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, the sensitivity of <inline-formula><mml:math><mml:mrow><mml:mi> f </mml:mi><mml:mrow><mml:mo> ( </mml:mo><mml:mi> x </mml:mi><mml:mo> ) </mml:mo></mml:mrow><mml:mo> = </mml:mo><mml:mi> x </mml:mi></mml:mrow></mml:math></inline-formula> is <inline-formula><mml:math><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi><mml:mo> = </mml:mo><mml:mi> U </mml:mi><mml:mo> − </mml:mo><mml:mi> L </mml:mi></mml:mrow></mml:math></inline-formula>. Scale/clip first to minimize <inline-formula><mml:math><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi></mml:mrow></mml:math></inline-formula>.</p>
        <p>Rule of thumb for accuracy: If each of <inline-formula><mml:math><mml:mi> n </mml:mi></mml:math></inline-formula> users releases <inline-formula><mml:math><mml:mrow><mml:mi> x </mml:mi><mml:mo> + </mml:mo><mml:mtext> Lap </mml:mtext><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mn> 0 </mml:mn><mml:mo> , </mml:mo><mml:mrow><mml:mi> Δ </mml:mi><mml:mo> / </mml:mo><mml:mi> ε </mml:mi></mml:mrow></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> , the sample mean remains unbiased and the added noise’s standard deviation is <inline-formula><mml:math><mml:mrow><mml:msqrt><mml:mn> 2 </mml:mn></mml:msqrt><mml:mo> ⋅ </mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mo> ( </mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi></mml:mrow><mml:mo> / </mml:mo><mml:mi> ε </mml:mi></mml:mrow></mml:mrow><mml:mo> ) </mml:mo></mml:mrow></mml:mrow><mml:mo> / </mml:mo><mml:mrow><mml:msqrt><mml:mi> n </mml:mi></mml:msqrt></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> . A handy back-of-the-envelope for expected absolute error of the mean is <inline-formula><mml:math><mml:mrow><mml:mo> ≈ </mml:mo><mml:mfrac><mml:mn> 2 </mml:mn><mml:mrow><mml:msqrt><mml:mrow><mml:mi> π </mml:mi><mml:mi> n </mml:mi></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mo> ⋅ </mml:mo><mml:mfrac><mml:mrow><mml:mi> Δ </mml:mi><mml:mi> f </mml:mi></mml:mrow><mml:mi> ε </mml:mi></mml:mfrac></mml:mrow></mml:math></inline-formula> .</p>
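<p>A quick simulation confirms the back-of-the-envelope error formula (the values of n, ε, and Δf below are illustrative assumptions):</p>

```python
import numpy as np

rng = np.random.default_rng(1)
n, epsilon, delta_f = 1000, 1.0, 9.0  # e.g., responses bounded in [1, 10]
scale = delta_f / epsilon             # Laplace scale b for each user's release

# Each of n users adds Lap(0, b); average the noise over n, repeated many times.
trials = 2000
mean_noise = rng.laplace(0, scale, size=(trials, n)).mean(axis=1)
empirical_abs_error = np.abs(mean_noise).mean()

# Back-of-the-envelope prediction: 2 / sqrt(pi * n) * (delta_f / epsilon)
predicted = 2 / np.sqrt(np.pi * n) * delta_f / epsilon
```

<p>The empirical mean absolute error lands close to the predicted value, reflecting the central limit theorem at work on the averaged Laplace noise.</p>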
        <p>Every released statistic spends part of the privacy budget. With basic composition, <inline-formula><mml:math><mml:mi> ε </mml:mi></mml:math></inline-formula> (and <inline-formula><mml:math><mml:mi> δ </mml:mi></mml:math></inline-formula> ) add across queries. For a 10-question survey, you can:</p>
        <p>1) Allocate a uniform budget (e.g., total <inline-formula><mml:math><mml:mrow><mml:msub><mml:mi> ε </mml:mi><mml:mrow><mml:mi> t </mml:mi><mml:mi> o </mml:mi><mml:mi> t </mml:mi></mml:mrow></mml:msub><mml:mo> = </mml:mo><mml:mn> 2 </mml:mn><mml:mo> ⇒ </mml:mo><mml:mi> ε </mml:mi><mml:mo> = </mml:mo><mml:mn> 0.2 </mml:mn></mml:mrow></mml:math></inline-formula> per item), or,</p>
        <p>2) Use utility-weighted budgeting, granting larger <inline-formula><mml:math><mml:mi> ε </mml:mi></mml:math></inline-formula> to items the school deems critical for decision-making (e.g., safety) and making smaller <inline-formula><mml:math><mml:mi> ε </mml:mi></mml:math></inline-formula> elsewhere.</p>
        <p>Advanced composition (and privacy accountants) can slightly improve cumulative guarantees, but the high-level takeaway holds: plan your analysis up front and track spending so repeated re-queries don’t silently degrade protections.</p>
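<p>Under basic composition, both allocation strategies reduce to a few lines of arithmetic (the total budget and the weights below are hypothetical choices, not values from our deployment):</p>

```python
total_epsilon = 2.0
n_questions = 10

# 1) Uniform: every question gets an equal share of the budget.
uniform = [total_epsilon / n_questions] * n_questions  # 0.2 per item

# 2) Utility-weighted: safety-critical items (here, the first two) receive
# larger shares; the weights would be chosen by the school in practice.
weights = [3, 3, 1, 1, 1, 1, 1, 1, 1, 1]
weighted = [total_epsilon * w / sum(weights) for w in weights]
# Either way, the per-item epsilons sum to the 2.0 total being spent.
```

<p>Tracking the running sum of spent ε in this way is the simplest form of a privacy accountant.</p>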
      </sec>
    </sec>
    <sec id="sec2">
      <title>2. Applications</title>
      <p>Now that we have established the technical foundation, we can examine the current implementations of differential privacy.</p>
      <sec id="sec2dot1">
        <title>2.1. Healthcare</title>
        <p>One example of differential privacy’s application in healthcare is its use in the diagnosis of coronary heart disease. In [<xref ref-type="bibr" rid="B5">5</xref>], a study is described that developed a differentially private algorithm specifically for “diagnosing coronary heart disease using medical records” (p. 2273). The algorithm allowed personal health data, such as data collected through smartphones or smartwatches, to be perturbed with noise before leaving the device. This preserved privacy by never disclosing individual-level data in its raw form. Specifically, the algorithm maintained diagnostic accuracy rates above 85%, which is considered clinically useful for screening purposes. However, widespread clinical adoption of such differential privacy systems remains limited, as most are still in research phases due to regulatory concerns. The main drawbacks include reduced accuracy for rare conditions and difficulty integrating with existing medical record systems. Despite the privacy-preserving noise, the algorithm maintained strong performance at “predictive modeling”, showing that one can build effective diagnostic tools without compromising patient privacy. The authors also acknowledge, however, that a broad challenge of differential privacy for medical research is that “diminished accuracy in small datasets is problematic” (p. 2269). This shows both the potential and the challenges of differential privacy in real clinical practice.</p>
      </sec>
      <sec id="sec2dot2">
        <title>2.2. Public Policy</title>
        <p>Another example of differential privacy in use is within the U.S. Census Bureau for public policy. According to [<xref ref-type="bibr" rid="B4">4</xref>], the Census Bureau began applying differential privacy to protect individuals in publicly released demographic statistics. Traditionally, census microdata was anonymized by removing names and direct identifiers, but researchers later discovered that combining datasets could still lead to reidentification. The paper explains that differential privacy provides “provable privacy protection against a wide range of potential attacks” (p. 3) by adding noise to outputs, ensuring that the inclusion or exclusion of any one person has little effect on the result, as stated previously. This prevents privacy breaches even when datasets are queried repeatedly. However, the authors also note challenges: “providing information about small or sparse subpopulations is hard to do while providing strong privacy guarantees” (p. 5). Social scientists have raised concerns that this could limit studies on “poverty, inequality, immigration, internal migration, and more” (p. 8). Still, the case of the Census demonstrates the seriousness with which policymakers are beginning to adopt differential privacy to enhance transparency and confidentiality in national statistics.</p>
        <p>From our case studies, we can observe the following key points:</p>
        <p>1) Differential privacy works best with large datasets where individual noise has less impact on overall patterns.</p>
        <p>2) The technique requires careful parameter tuning to balance privacy and utility.</p>
        <p>3) Institutional adoption faces practical challenges beyond the technical implementation ([<xref ref-type="bibr" rid="B1">1</xref>]).</p>
        <p>4) The approach may inadvertently harm research on marginalized communities who most need policy attention.</p>
        <p>These insights inform our approach to student surveys, where we must consider both the benefits of honest feedback and the potential drawbacks of reduced accuracy for smaller student subgroups.</p>
      </sec>
    </sec>
    <sec id="sec3">
      <title>3. Methods</title>
      <sec id="sec3dot1">
        <title>3.1. Threat Model and Re-Identification Pathways</title>
        <p>A rigorous privacy analysis begins with an explicit threat model. We assume an adversary who (i) can observe released aggregates or noisy records, (ii) may hold auxiliary data (e.g., social media, public records, or institutional rosters), and (iii) can issue or infer multiple statistics over time. Re-identification typically exploits (a) linkage attacks, where quasi-identifiers (age, ZIP code, time of event) are matched across datasets, (b) differencing attacks, which subtract near-identical aggregates to infer a single record’s contribution, and (c) composition attacks, where multiple releases gradually erode privacy guarantees. Differential privacy directly addresses (b) and (c) by bounding how much any single record can influence any output; careful product and platform design (e.g., rate limiting, access control) addresses (a).</p>
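<p>The differencing attack in (b), and how calibrated noise blunts it, can be illustrated in a few lines (the cohort values, Δf, and ε are invented for the example):</p>

```python
import numpy as np

rng = np.random.default_rng(7)

# Two exact aggregates that differ by one person leak that person's value.
cohort = [7, 3, 9, 5, 6]                   # includes "Alice", whose rating is 6
without_alice = cohort[:-1]
leaked = sum(cohort) - sum(without_alice)  # recovers Alice's 6 exactly

# With Laplace noise scaled to one response's range (delta_f = 9, eps = 0.5),
# the same subtraction yields only a wide, deniable estimate.
scale = 9 / 0.5
noisy_diff = (sum(cohort) + rng.laplace(0, scale)) - (
    sum(without_alice) + rng.laplace(0, scale))
```

<p>The noisy difference has standard deviation far larger than the 1-to-10 response range, so the subtraction no longer pins down any individual’s answer.</p>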
      </sec>
      <sec id="sec3dot2">
        <title>3.2. Algorithm and Code Implementation</title>
        <p>Our differential privacy survey system follows this algorithm design:</p>
        <p>1) Initialization: Set up survey questions and privacy parameters (epsilon, upper and lower bounds, etc.).</p>
        <p>2) Initial Response Collection: Collect respondent’s answer to each question.</p>
        <p>3) Noise Addition: Apply the Laplace mechanism to add calibrated noise to each response.</p>
        <p>4) Data Storage: Store the noisy responses.</p>
        <p>In this program, the Laplace mechanism is used within a custom Python class to collect and privatize responses. Each survey answer is recorded and then randomized using Laplace noise before being stored. By incorporating this noise at the point of data collection, the system enforces local differential privacy, ensuring that privacy is preserved even before the data is aggregated or transmitted. This approach is particularly valuable in environments like schools, where trust and anonymity are essential for honest feedback.</p>
        <p>from diffprivlib.mechanisms import LaplaceTruncated</p>
        <p># Survey that applies local differential privacy at collection time</p>
        <p>class DiffPrivSurvey:</p>
        <p> def __init__(self, questions, epsilons=None, lower=1, upper=10):</p>
        <p> self.questions = questions</p>
        <p> # Allow per-question ε; default to uniform if not provided </p>
        <p> self.epsilons = epsilons or [10] * len(questions)</p>
        <p> self.lower = lower</p>
        <p> self.upper = upper</p>
        <p> self.all_responses = []</p>
        <p> def _clamp(self, v):</p>
        <p> return max(self.lower, min(self.upper, v))</p>
        <p> def collect_response(self, question_idx, question):</p>
        <p> response = int(input(f"{question} ({self.lower}-{self.upper}): "))</p>
        <p> x = self._clamp(response)</p>
        <p> delta = self.upper - self.lower</p>
        <p> mech = LaplaceTruncated(epsilon=self.epsilons[question_idx], sensitivity=delta, lower=self.lower, upper=self.upper)</p>
        <p> noisy_response = mech.randomise(x)</p>
        <p> return round(noisy_response)</p>
        <p> def run_survey(self):</p>
        <p> for i, q in enumerate(self.questions):</p>
        <p> response = self.collect_response(i, q)</p>
        <p> self.all_responses.append(response)</p>
        <p> def get_responses(self):</p>
        <p> return self.all_responses</p>
        <p> def print_responses(self):</p>
        <p> print(self.all_responses)</p>
        <p>questions = [</p>
        <p> "I feel safe and secure while on campus.",</p>
        <p> "I feel like I belong at this school.",</p>
        <p> "I feel comfortable expressing my identity at school.",</p>
        <p> "Bullying or harassment is handled fairly by the school.",</p>
        <p> "I feel stressed or overwhelmed by schoolwork.",</p>
        <p> "I am provided with adequate academic support.",</p>
        <p> "I am provided with adequate mental health resources.",</p>
        <p> "I trust that the school takes student feedback seriously.",</p>
        <p> "Students at this school are treated equally regardless of race, gender, or background.",</p>
        <p> "I feel motivated to do well academically at this school.",</p>
        <p>]</p>
        <p>survey = DiffPrivSurvey(questions)</p>
        <p>survey.run_survey()</p>
        <p>survey.print_responses()</p>
        <p>I feel safe and secure while on campus. (1-10): 8</p>
        <p>I feel like I belong at this school. (1-10): 5</p>
        <p>I feel comfortable expressing my identity at school. (1-10): 9</p>
        <p>Bullying or harassment is handled fairly by the school. (1-10): 2</p>
        <p>I feel stressed or overwhelmed by schoolwork. (1-10): 4</p>
        <p>I am provided with adequate academic support. (1-10): 6</p>
        <p>I am provided with adequate mental health resources. (1-10): 7</p>
        <p>I trust that the school takes student feedback seriously. (1-10): 1</p>
        <p>Students at this school are treated equally regardless of race, gender, or background. (1-10): 4</p>
        <p>I feel motivated to do well academically at this school. (1-10): 9</p>
        <p>[7, 5, 6, 3, 3, 6, 7, 3, 5, 8]</p>
        <p>The DiffPrivSurvey class is designed to collect sensitive student feedback while preserving individual privacy through differential privacy techniques. It defines a list of Likert-scale questions on students’ experiences and perceptions of their school environment, ranging from feelings of safety to access to mental health resources. When the survey is run, each question is presented to the respondent, who enters a numerical answer from 1 to 10. Instead of storing the raw response directly, the system adds mathematically calibrated noise to each answer to ensure that individual data points cannot be precisely traced back to any participant. This allows for meaningful data aggregation while protecting personal information.</p>
        <p>For each response, the Laplace mechanism, implemented via the IBM diffprivlib library ([<xref ref-type="bibr" rid="B7">7</xref>]), adds noise using a randomization process defined by the user’s privacy budget, epsilon. The mechanism introduces noise in proportion to the sensitivity of the function being computed. The sensitivity equals 9 because it bounds how much a single person’s response can change the output: on a 1-to-10 scale, the largest possible change is 10 − 1 = 9.</p>
        <p>The privacy loss parameter <inline-formula><mml:math display="inline"><mml:mi> ε </mml:mi></mml:math></inline-formula> controls the balance between privacy protection and data accuracy in differential privacy. Smaller <inline-formula><mml:math display="inline"><mml:mi> ε </mml:mi></mml:math></inline-formula> values provide stronger privacy guarantees but inject more noise, reducing utility; larger <inline-formula><mml:math display="inline"><mml:mi> ε </mml:mi></mml:math></inline-formula> values weaken privacy but yield more accurate results.</p>
        <p>In practice, ε should be determined by the intended analytical accuracy, the number of individuals contributing, and any additional safeguards in the system (such as local perturbation, encryption, or limited data access).</p>
        <p>For local differential privacy systems such as ours, the effect of <inline-formula><mml:math display="inline"><mml:mi> ε </mml:mi></mml:math></inline-formula> interacts strongly with the sample size <italic>n</italic>. Because each participant adds noise to their own response, the noise in the aggregated mean decreases roughly as <inline-formula><mml:math display="inline"><mml:mrow><mml:mrow><mml:mn> 1 </mml:mn><mml:mo> / </mml:mo><mml:mrow><mml:msqrt><mml:mi> n </mml:mi></mml:msqrt></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> . The approximate standard error due to Laplace noise for bounded values in <inline-formula><mml:math display="inline"><mml:mrow><mml:mrow><mml:mo> [ </mml:mo><mml:mrow><mml:mi> L </mml:mi><mml:mo> , </mml:mo><mml:mi> U </mml:mi></mml:mrow><mml:mo> ] </mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> with sensitivity <inline-formula><mml:math display="inline"><mml:mrow><mml:mi> Δ </mml:mi><mml:mo> = </mml:mo><mml:mi> U </mml:mi><mml:mo> − </mml:mo><mml:mi> L </mml:mi></mml:mrow></mml:math></inline-formula> is:</p>
        <disp-formula id="FD4">
          <mml:math display="inline">
            <mml:mrow>
              <mml:msub>
                <mml:mi>S</mml:mi>
                <mml:mrow>
                  <mml:mi>D</mml:mi>
                  <mml:mi>P</mml:mi>
                </mml:mrow>
              </mml:msub>
              <mml:mo>≈</mml:mo>
              <mml:mfrac>
                <mml:mrow>
                  <mml:msqrt>
                    <mml:mn>2</mml:mn>
                  </mml:msqrt>
                  <mml:mi>Δ</mml:mi>
                </mml:mrow>
                <mml:mrow>
                  <mml:mi>ε</mml:mi>
                  <mml:msqrt>
                    <mml:mi>n</mml:mi>
                  </mml:msqrt>
                </mml:mrow>
              </mml:mfrac>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>Thus, as <inline-formula><mml:math><mml:mi> n </mml:mi></mml:math></inline-formula> grows, even modest <inline-formula><mml:math display="inline"><mml:mi> ε </mml:mi></mml:math></inline-formula> values produce reliable estimates; conversely, with small <inline-formula><mml:math><mml:mi> n </mml:mi></mml:math></inline-formula> , higher <inline-formula><mml:math display="inline"><mml:mi> ε </mml:mi></mml:math></inline-formula> is needed to maintain usable signal quality. <bold>Table 1</bold> illustrates the expected accuracy trade-off for a 1 - 10 Likert scale (<inline-formula><mml:math display="inline"><mml:mrow><mml:mi> Δ </mml:mi><mml:mo> = </mml:mo><mml:mn> 9 </mml:mn></mml:mrow></mml:math></inline-formula> ) at different <inline-formula><mml:math display="inline"><mml:mi> ε </mml:mi></mml:math></inline-formula> and <inline-formula><mml:math><mml:mi> n </mml:mi></mml:math></inline-formula> values.</p>
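<p>The standard-error approximation above can be computed directly; this minimal sketch (a hypothetical helper, not part of the survey system) reproduces, for example, the ε = 10, n = 10 case. Note that the approximation assumes untruncated Laplace noise, so entries in Table 1 at small ε, where the truncated mechanism clips extreme draws, can fall below this analytic value.</p>

```python
import math

def laplace_se(delta, epsilon, n):
    """Approximate standard error of the mean of n locally privatized
    responses under (untruncated) Laplace noise with sensitivity delta:
    S_DP ~= sqrt(2) * delta / (epsilon * sqrt(n))."""
    return math.sqrt(2) * delta / (epsilon * math.sqrt(n))

# 1-10 Likert scale (delta = 9), epsilon = 10, n = 10:
print(round(laplace_se(9, 10, 10), 2))  # 0.4
```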
        <p><bold>Table 1</bold><bold>.</bold> Expected accuracy (in Likert points) for different sample sizes and privacy parameters.</p>
        <table-wrap id="tbl1">
          <label>Table 1</label>
          <table>
            <tbody>
              <tr>
                <td>
                  <italic>n</italic>
                </td>
                <td>ε = 0.5</td>
                <td>ε = 1</td>
                <td>ε = 2</td>
                <td>ε = 5</td>
                <td>ε = 10</td>
              </tr>
              <tr>
                <td>10</td>
                <td>4.0</td>
                <td>2.8</td>
                <td>1.4</td>
                <td>0.6</td>
                <td>0.4</td>
              </tr>
              <tr>
                <td>50</td>
                <td>1.8</td>
                <td>1.3</td>
                <td>0.6</td>
                <td>0.3</td>
                <td>0.18</td>
              </tr>
              <tr>
                <td>200</td>
                <td>0.9</td>
                <td>0.6</td>
                <td>0.3</td>
                <td>0.13</td>
                <td>0.09</td>
              </tr>
            </tbody>
          </table>
        </table-wrap>
        <p>(Values are approximate <inline-formula><mml:math><mml:mrow><mml:msub><mml:mi> S </mml:mi><mml:mrow><mml:mi> D </mml:mi><mml:mi> P </mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></inline-formula> in Likert points.)</p>
        <p>This relationship highlights that <inline-formula><mml:math><mml:mi> ε </mml:mi></mml:math></inline-formula> cannot be evaluated in isolation. For large surveys (hundreds of participants), <inline-formula><mml:math><mml:mi> ε </mml:mi></mml:math></inline-formula> values between 0.5 and 2 often provide strong privacy with minimal loss of accuracy. </p>
        <p>However, in smaller studies or classroom-level pilots, <inline-formula><mml:math><mml:mrow><mml:mi> ε </mml:mi><mml:mo> = </mml:mo><mml:mn> 10 </mml:mn></mml:mrow></mml:math></inline-formula> can be reasonable: it keeps added noise below half a point on a 10-point scale while still randomizing individual answers enough to discourage re-identification.</p>
        <p>Moreover, because our system enforces local perturbation—each student’s response is privatized before submission—the effective privacy risk is already reduced compared with central models. Within this context, <inline-formula><mml:math><mml:mrow><mml:mi> ε </mml:mi><mml:mo> = </mml:mo><mml:mn> 10 </mml:mn></mml:mrow></mml:math></inline-formula> offers a practical, transparent trade-off between privacy and interpretability, allowing schools to make meaningful use of aggregated feedback.</p>
        <p>Each noisy response is generated and stored in self.all_responses, a list that represents all modified answers for a single survey run. The run_survey() method iterates over each question and collects responses with added noise, and print_responses() outputs the final result as a list of values. These values are privacy-preserving approximations of the original responses. Importantly, although individual responses are obscured, aggregated patterns across many responses can still be meaningfully analyzed, which is the central benefit of applying differential privacy in surveys. For example, the average scores across questions will converge to the true values as sample size increases, since the zero-mean noise cancels out in expectation. This implementation ensures that even if survey data is accessed or analyzed later, the privacy of any one individual remains protected.</p>
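<p>The convergence claim can be checked with a quick simulation on synthetic data (random 1 - 10 answers, ε = 10 as in our deployment): the gap between the noisy mean and the true mean shrinks roughly as <inline-formula><mml:math><mml:mrow><mml:mn>1</mml:mn><mml:mo>/</mml:mo><mml:msqrt><mml:mi>n</mml:mi></mml:msqrt></mml:mrow></mml:math></inline-formula>.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
true_responses = rng.integers(1, 11, size=10_000)  # synthetic raw answers, 1-10
scale = 9 / 10  # Laplace scale b = delta / epsilon, with delta = 9, epsilon = 10

# Error of the noisy mean shrinks roughly as 1 / sqrt(n).
for n in (10, 100, 10_000):
    sample = true_responses[:n]
    noisy = sample + rng.laplace(0.0, scale, size=n)
    print(n, round(abs(noisy.mean() - sample.mean()), 3))
```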
        <p>Alternatively, the survey can treat each Likert item as a distinct categorical label, applying k-ary randomized response (kRR). Here, each participant reports their true category with probability </p>
        <disp-formula id="FD5">
          <mml:math display="inline">
            <mml:mrow>
              <mml:mi>p</mml:mi>
              <mml:mo>=</mml:mo>
              <mml:mfrac>
                <mml:mrow>
                  <mml:msup>
                    <mml:mtext>e</mml:mtext>
                    <mml:mi>ε</mml:mi>
                  </mml:msup>
                </mml:mrow>
                <mml:mrow>
                  <mml:msup>
                    <mml:mtext>e</mml:mtext>
                    <mml:mi>ε</mml:mi>
                  </mml:msup>
                  <mml:mo>+</mml:mo>
                  <mml:mi>k</mml:mi>
                  <mml:mo>−</mml:mo>
                  <mml:mn>1</mml:mn>
                </mml:mrow>
              </mml:mfrac>
            </mml:mrow>
          </mml:math>
        </disp-formula>
        <p>and reports one of the other <inline-formula><mml:math><mml:mrow><mml:mi> k </mml:mi><mml:mo> − </mml:mo><mml:mn> 1 </mml:mn></mml:mrow></mml:math></inline-formula> categories uniformly at random otherwise. This method ensures that all outputs remain valid Likert choices and eliminates the risk of producing out-of-range values. It is especially suitable for estimating proportions or histograms across categories (e.g., “What fraction of students strongly agree?”), rather than for numeric averages.</p>
        <p>Both techniques provide the same formal privacy guarantees but make different utility trade-offs. The numeric approach retains higher accuracy for aggregate averages, while the categorical approach offers cleaner interpretability and stronger discrete protection for individual answers.</p>
      </sec>
      <sec id="sec3dot3">
        <title>3.3. Data Collection and Submission Protocol</title>
        <p>Although the current DiffPrivSurvey implementation does not yet include cryptographic token management or duplicate-response detection, these mechanisms are conceptually part of the broader system design. In a production deployment, each participant could receive a single-use, anonymous token or blind-signed credential before responding. The server would then verify the token and mark it as used, ensuring one response per participant without storing personally identifiable information. Such a mechanism prevents duplicate submissions and mitigates risks of data inflation or manipulation while preserving anonymity ([<xref ref-type="bibr" rid="B6">6</xref>]).</p>
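<p>The one-response-per-token idea can be sketched as follows. This is a conceptual illustration only: the class and method names are hypothetical, and a real deployment would issue tokens via blind signatures so the server cannot link a token to its recipient, which this naive version does not provide.</p>

```python
import secrets

class TokenIssuer:
    """Conceptual sketch of single-use anonymous submission tokens.
    NOT a cryptographic blind-signature scheme: here the issuer could
    correlate issuance and redemption; blinding would prevent that."""

    def __init__(self):
        self._valid = set()   # tokens issued but not yet used
        self._spent = set()   # tokens already redeemed

    def issue(self):
        # The token carries no identity; it only proves eligibility.
        token = secrets.token_urlsafe(32)
        self._valid.add(token)
        return token

    def redeem(self, token):
        if token in self._valid and token not in self._spent:
            self._spent.add(token)  # enforce one response per token
            return True
        return False

issuer = TokenIssuer()
t = issuer.issue()
print(issuer.redeem(t))  # True  (first submission accepted)
print(issuer.redeem(t))  # False (duplicate rejected)
```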
      </sec>
      <sec id="sec3dot4">
        <title>3.4. Handling Missing and Partial Responses</title>
        <p>Our prototype assumes full participation, but real surveys often include nonresponse. Drechsler and Bailie [<xref ref-type="bibr" rid="B2">2</xref>] show that imputing missing values or applying nonresponse weights can increase sensitivity and complicate differential privacy guarantees. To avoid this, our system assigns a separate privacy budget εᵢ per question: only answered items consume budget, and unanswered ones simply reduce the effective n for that question. We do not perform imputation inside the DP mechanism, preventing additional sensitivity. Aggregate outputs report each n and adjust confidence intervals accordingly, making the trade-off between privacy and statistical power transparent.</p>
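<p>The per-question handling can be sketched as below (a hypothetical helper, not part of the DiffPrivSurvey class): skipped items, marked None, are dropped rather than imputed, so each question reports its own effective n.</p>

```python
def per_question_aggregate(responses):
    """Aggregate noisy responses question by question, where None marks a
    skipped item. Skipped items consume no privacy budget and simply
    reduce the effective n for that question; no imputation is performed."""
    stats = []
    for answers in responses:  # one list of noisy values per question
        observed = [a for a in answers if a is not None]
        n = len(observed)
        mean = sum(observed) / n if n else None
        stats.append({"n": n, "mean": mean})
    return stats

# Hypothetical noisy data: the second question has one nonrespondent.
data = [[7, 5, 6, 8], [4, None, 5, 6]]
print(per_question_aggregate(data))
# [{'n': 4, 'mean': 6.5}, {'n': 3, 'mean': 5.0}]
```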
      </sec>
    </sec>
    <sec id="sec4">
      <title>4. Discussion</title>
      <sec id="sec4dot1">
        <title>4.1. Aggregation &amp; Accuracy Reporting</title>
        <p>To maintain transparency, publish utility diagnostics alongside results:</p>
        <p>Sampling variance vs. privacy variance. Report the standard error of each mean as <inline-formula><mml:math><mml:mrow><mml:msqrt><mml:mrow><mml:mfrac><mml:mrow><mml:msup><mml:mi> s </mml:mi><mml:mn> 2 </mml:mn></mml:msup></mml:mrow><mml:mi> n </mml:mi></mml:mfrac><mml:mo> + </mml:mo><mml:mfrac><mml:mrow><mml:mn> 2 </mml:mn><mml:msup><mml:mi> Δ </mml:mi><mml:mn> 2 </mml:mn></mml:msup></mml:mrow><mml:mrow><mml:mi> n </mml:mi><mml:msup><mml:mi> ε </mml:mi><mml:mn> 2 </mml:mn></mml:msup></mml:mrow></mml:mfrac></mml:mrow></mml:msqrt></mml:mrow></mml:math></inline-formula> , where <inline-formula><mml:math><mml:mrow><mml:msup><mml:mi> s </mml:mi><mml:mn> 2 </mml:mn></mml:msup></mml:mrow></mml:math></inline-formula> is the empirical variance of the (noisy) responses, <inline-formula><mml:math><mml:mi> n </mml:mi></mml:math></inline-formula> is the sample size, and the second term approximates the additional DP noise for local Laplace.</p>
        <p>Confidence intervals. Use the combined variance for CIs so stakeholders see how uncertainty shrinks as <inline-formula><mml:math><mml:mi> n </mml:mi></mml:math></inline-formula> grows.</p>
        <p>Release policy. Withhold subgroup statistics unless <inline-formula><mml:math><mml:mi> n </mml:mi></mml:math></inline-formula> exceeds a threshold to mitigate outlier influence and protect small cohorts.</p>
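<p>The combined standard error can be computed as in this sketch (a hypothetical helper; the input list reuses the sample noisy responses from Section 3.2, with Δ = 9 and ε = 10):</p>

```python
import math

def dp_confidence_interval(noisy, delta, epsilon, z=1.96):
    """Approximate 95% CI for the mean of locally privatized responses,
    combining sampling variance s^2/n with the additional Laplace-noise
    variance 2*delta^2 / (n*epsilon^2)."""
    n = len(noisy)
    mean = sum(noisy) / n
    s2 = sum((x - mean) ** 2 for x in noisy) / (n - 1)  # empirical variance
    se = math.sqrt(s2 / n + 2 * delta ** 2 / (n * epsilon ** 2))
    return mean - z * se, mean + z * se

# Noisy responses from the sample survey run (delta = 9, epsilon = 10):
lo, hi = dp_confidence_interval([7, 5, 6, 3, 3, 6, 7, 3, 5, 8], 9, 10)
print(round(lo, 2), round(hi, 2))  # 3.92 6.68
```

<p>With only ten respondents the interval spans nearly three Likert points, which illustrates why subgroup releases below a minimum n should be withheld.</p>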
      </sec>
      <sec id="sec4dot2">
        <title>4.2. Platform Hardening Beyond DP</title>
        <p>When implementing a privacy-preserving system, developers must address potential issues beyond the scope of differential privacy itself. IP addresses must be hidden by VPNs or proxy servers, and timestamps can be randomized or delayed to prevent timing-based correlation attacks. In addition, the survey infrastructure must be secured by encrypted links, secure hosting environments, and data retention policies. A production-grade deployment should also include:</p>
        <p>Client integrity: prevent multiple submissions (e.g., blind-signed, rate-limited tokens) without tracking identities; resist replay by expiring tokens.</p>
        <p>Transport &amp; storage: TLS in transit, KMS-backed encryption at rest, and strict key rotation.</p>
        <p>Metadata minimization: strip or coarsen device, network, and timing fields; use upload jitter to reduce timing correlation.</p>
        <p>Access control &amp; logging: role-based access with short-lived credentials; log data access (on aggregates, not raw inputs).</p>
        <p>Query governance: a privacy accountant to enforce total <inline-formula><mml:math><mml:mi> ε </mml:mi></mml:math></inline-formula>; dashboards that show “budget spent” per survey wave.</p>
        <p>Red-team simulations: regularly test linkage and differencing attacks using synthetic adversarial datasets.</p>
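<p>The privacy-accountant idea under query governance can be sketched as follows (a hypothetical class using basic sequential composition, where budgets simply add across releases; real accountants may use tighter composition theorems):</p>

```python
class PrivacyAccountant:
    """Minimal sketch of query governance: track cumulative epsilon spent
    across releases and refuse any release that would exceed the total
    budget. Assumes basic sequential composition (budgets add)."""

    def __init__(self, total_budget):
        self.total_budget = total_budget
        self.spent = 0.0

    def request(self, epsilon):
        if self.spent + epsilon > self.total_budget:
            return False  # deny: release would exceed the budget
        self.spent += epsilon
        return True

acct = PrivacyAccountant(total_budget=10.0)
print(acct.request(4.0))  # True
print(acct.request(4.0))  # True
print(acct.request(4.0))  # False -- only 2.0 of the budget remains
print(acct.spent)         # 8.0
```

<p>A "budget spent" dashboard would simply surface the accountant's running total per survey wave.</p>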
        <p>Only through this integration can educational institutions achieve the privacy protection needed to promote candid student feedback.</p>
      </sec>
      <sec id="sec4dot3">
        <title>4.3. Ethical and Equity Considerations</title>
        <p>Even with strong privacy, differential privacy can reduce signals for small or marginalized subgroups. To avoid silencing these voices, couple DP with:</p>
        <p>Minimum-n thresholds and multi-wave aggregation (pool across time) to raise sample size without increasing <inline-formula><mml:math><mml:mi> ε </mml:mi></mml:math></inline-formula>.</p>
        <p>Decision safeguards: when subgroup estimates are too noisy, defer high-stakes decisions or collect additional data with consent and higher ε explicitly communicated to participants.</p>
        <p>Transparent communication: publish plain-language summaries explaining how privacy noise works and what uncertainty means for policy choices.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <title>References</title>
      <ref id="B1">
        <label>1.</label>
        <citation-alternatives>
          <mixed-citation publication-type="other">Cummings, R., Desfontaines, D., Evans, D., Geambasu, R., Huang, Y., Jagielski, M. et al. (2024). Advancing Differential Privacy: Where We Are Now and Future Directions for Real-World Deployment. <italic>Harvard</italic><italic>Data</italic><italic>Science</italic><italic>Review,</italic><italic>6,</italic> 1-123. https://doi.org/10.1162/99608f92.d3197524 <pub-id pub-id-type="doi">10.1162/99608f92.d3197524</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1162/99608f92.d3197524">https://doi.org/10.1162/99608f92.d3197524</ext-link></mixed-citation>
          <element-citation publication-type="other">
            <person-group person-group-type="author">
              <string-name>Cummings, R.</string-name>
              <string-name>Desfontaines, D.</string-name>
              <string-name>Evans, D.</string-name>
              <string-name>Geambasu, R.</string-name>
              <string-name>Huang, Y.</string-name>
              <string-name>Jagielski, M.</string-name>
            </person-group>
            <year>2024</year>
            <pub-id pub-id-type="doi">10.1162/99608f92.d3197524</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B2">
        <label>2.</label>
        <citation-alternatives>
          <mixed-citation publication-type="web">Drechsler, J., &amp; Bailie, J. (2024). <italic>The Complexities of Differential Privacy for Survey Data</italic>. https://arxiv.org/abs/2408.07006</mixed-citation>
          <element-citation publication-type="web">
            <person-group person-group-type="author">
              <string-name>Drechsler, J.</string-name>
              <string-name>Bailie, J.</string-name>
            </person-group>
            <year>2024</year>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B3">
        <label>3.</label>
        <citation-alternatives>
          <mixed-citation publication-type="confproc">Dwork, C. (2011). The Promise of Differential Privacy: A Tutorial on Algorithmic Techniques. In <italic>2011 IEEE 52nd Annual Symposium on Foundations of Computer Science</italic> (pp. 1-2). IEEE. https://doi.org/10.1109/focs.2011.88 <pub-id pub-id-type="doi">10.1109/focs.2011.88</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/focs.2011.88">https://doi.org/10.1109/focs.2011.88</ext-link></mixed-citation>
          <element-citation publication-type="confproc">
            <person-group person-group-type="author">
              <string-name>Dwork, C.</string-name>
            </person-group>
            <year>2011</year>
            <pub-id pub-id-type="doi">10.1109/focs.2011.88</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B4">
        <label>4.</label>
        <citation-alternatives>
          <mixed-citation publication-type="web">Feldman, V. (2020). <italic>Differential Privacy: Issues for Policymakers</italic>. Simons Institute. https://simons.berkeley.edu/news/differential-privacy-issues-policymakers</mixed-citation>
          <element-citation publication-type="web">
            <person-group person-group-type="author">
              <string-name>Feldman, V.</string-name>
            </person-group>
            <year>2020</year>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B5">
        <label>5.</label>
        <citation-alternatives>
          <mixed-citation publication-type="journal">Ficek, J., Wang, W., Chen, H., Dagne, G., &amp; Daley, E. (2021). Differential Privacy in Health Research: A Scoping Review. <italic>Journal</italic><italic>of</italic><italic>the</italic><italic>American</italic><italic>Medical</italic><italic>Informatics</italic><italic>Association,</italic><italic>28,</italic> 2269-2276. https://doi.org/10.1093/jamia/ocab135 <pub-id pub-id-type="doi">10.1093/jamia/ocab135</pub-id><pub-id pub-id-type="pmid">34333623</pub-id><ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1093/jamia/ocab135">https://doi.org/10.1093/jamia/ocab135</ext-link></mixed-citation>
          <element-citation publication-type="journal">
            <person-group person-group-type="author">
              <string-name>Ficek, J.</string-name>
              <string-name>Wang, W.</string-name>
              <string-name>Chen, H.</string-name>
              <string-name>Dagne, G.</string-name>
              <string-name>Daley, E.</string-name>
            </person-group>
            <year>2021</year>
            <pub-id pub-id-type="doi">10.1093/jamia/ocab135</pub-id>
            <pub-id pub-id-type="pmid">34333623</pub-id>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B6">
        <label>6.</label>
        <citation-alternatives>
          <mixed-citation publication-type="web">Greenstadt, R., &amp; Miers, I. (2015). <italic>ANONIZE: A Large-Scale Anonymous Survey System</italic>. Johns Hopkins University Department of Computer Science. https://www.infoq.com/articles/anonize-large-scale-anonymous-survey-system/</mixed-citation>
          <element-citation publication-type="web">
            <person-group person-group-type="author">
              <string-name>Greenstadt, R.</string-name>
              <string-name>Miers, I.</string-name>
            </person-group>
            <year>2015</year>
          </element-citation>
        </citation-alternatives>
      </ref>
      <ref id="B7">
        <label>7.</label>
        <citation-alternatives>
          <mixed-citation publication-type="web">Holohan, N. (2025). Differential Privacy Library. <italic>GitHub</italic>. https://github.com/IBM/differential-privacy-library</mixed-citation>
          <element-citation publication-type="web">
            <person-group person-group-type="author">
              <string-name>Holohan, N.</string-name>
            </person-group>
            <year>2025</year>
          </element-citation>
        </citation-alternatives>
      </ref>
    </ref-list>
  </back>
</article>