Journal of Information Security

Volume 15, Issue 4 (October 2024)

ISSN Print: 2153-1234   ISSN Online: 2153-1242

Google-based Impact Factor: 3.25

Protecting LLMs against Privacy Attacks While Preserving Utility

PP. 448-473
DOI: 10.4236/jis.2024.154026

ABSTRACT

The recent interest in deploying Generative AI applications built on large language models (LLMs) has brought significant privacy concerns to the forefront, notably the leakage of Personally Identifiable Information (PII) and other confidential or protected information memorized during training, particularly during fine-tuning or customization. This inadvertent leakage of sensitive information typically occurs when the models are subjected to black-box attacks. To address the growing concern of safeguarding private and sensitive information while simultaneously preserving its utility, we analyze the performance of Targeted Catastrophic Forgetting (TCF). TCF protects targeted pieces of sensitive information within datasets through an iterative pipeline that significantly reduces the likelihood of such information being leaked or reproduced by the model under black-box attacks, such as the autocompletion attack considered in this work. The experiments conducted using TCF demonstrate its capability to reduce the extraction of PII while preserving the context and utility of the target application.
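
The autocompletion attack mentioned above probes a deployed model in a black-box manner and checks whether memorized PII is reproduced. The following Python sketch is only an illustration of what such a probe might look like; it is not the authors' pipeline, and query_model, the probe prefixes, and the PII list are hypothetical placeholders.

```python
# Illustrative sketch of a black-box autocompletion probe for PII leakage.
# All names below (query_model, PROBE_PREFIXES, KNOWN_PII) are assumptions
# for demonstration, not the method or data used in the paper.

def query_model(prompt: str) -> str:
    """Placeholder for a black-box call to the deployed LLM (e.g. an HTTP API)."""
    raise NotImplementedError("connect this to the model under test")

# Prefixes that invite the model to autocomplete sensitive details it may have memorized.
PROBE_PREFIXES = [
    "Patient John Doe can be reached at phone number",
    "The account number on file for Jane Roe is",
]

# Known sensitive strings from the fine-tuning data (ground truth for the audit).
KNOWN_PII = ["555-0142", "4021-7788-9012"]

def extraction_rate(prefixes, pii_items) -> float:
    """Fraction of known PII items reproduced verbatim in any completion."""
    completions = [query_model(p) for p in prefixes]
    leaked = sum(any(item in c for c in completions) for item in pii_items)
    return leaked / len(pii_items)
```

A defense such as TCF would aim to drive this extraction rate toward zero while leaving the model's answers on ordinary, non-sensitive prompts unchanged.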

Share and Cite:

Dhingra, G., Sood, S., Wase, Z. M., Bahga, A. and Madisetti, V. K. (2024) Protecting LLMs against Privacy Attacks While Preserving Utility. Journal of Information Security, 15, 448-473. doi: 10.4236/jis.2024.154026.


Copyright © 2025 by authors and Scientific Research Publishing Inc.


This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.