Machine Learning with Individualized Privacy Guarantees; Private Prompt Learning for Large Language Models – double research seminar - IDEAS NCBR – Intelligent Algorithms for Digital Economy

Franziska Boenisch

Title: Machine Learning with Individualized Privacy Guarantees

Abstract:

When applying machine learning (ML) in sensitive domains, we have to ensure privacy protection to avoid leaking private training data from the model. The standard approach to implementing privacy is to integrate differential privacy (DP) into the training procedure. To train a model with DP, one sets a privacy budget that represents the maximal privacy violation any individual is willing to face by contributing their data to the training set. We argue that this approach is limited because different individuals may have different privacy expectations. Thus, setting a uniform privacy budget across all data points may be overly conservative for some individuals or, conversely, not sufficiently protective for others. Building on the standard algorithms for privacy-preserving ML, we propose Individualized DP (IDP) algorithms for machine learning. Our algorithms not only respect individuals' privacy preferences but also use training data more efficiently, which results in better ML model utility and thereby supports a broader practical deployment of privacy-preserving ML in sensitive domains.
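The abstract does not say how the IDP algorithms assign individual budgets. As a purely illustrative sketch, and not necessarily the speakers' method, one conceivable mechanism is to subsample training points at rates that depend on their owners' privacy preferences, since a point that enters fewer training batches incurs less privacy loss. The group names, the rates, and the function `sample_batch` are hypothetical.

```python
import random

def sample_batch(budget_groups, sampling_rates, rng=None):
    """Poisson-subsample a batch with a per-group inclusion probability.

    budget_groups:  list with one group label per training point (hypothetical labels).
    sampling_rates: dict mapping a group label to its inclusion probability.
    Returns the indices of the points selected for this training step.
    """
    rng = rng or random.Random()
    batch = []
    for idx, group in enumerate(budget_groups):
        # Each point is included independently with its own group's rate.
        if rng.random() < sampling_rates[group]:
            batch.append(idx)
    return batch

# Illustrative values only: points with a stricter privacy preference
# are sampled less often than points with a more relaxed one.
groups = ["strict", "relaxed", "relaxed", "strict"]
rates = {"strict": 0.01, "relaxed": 0.05}
batch = sample_batch(groups, rates, rng=random.Random(0))
```

In an actual DP training pipeline the rates would have to be derived from a privacy accountant so that each group's target budget is met; the sketch leaves that step out.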

Bio:

Franziska is a tenure-track faculty member at the CISPA Helmholtz Center for Information Security, where she co-leads the SprintML lab. Previously, she was a Postdoctoral Fellow at the University of Toronto and the Vector Institute, advised by Prof. Nicolas Papernot. Her current research centers on private and trustworthy machine learning. Franziska obtained her Ph.D. at the Computer Science Department of Freie Universität Berlin, where she pioneered the notion of individualized privacy in machine learning. During her Ph.D., Franziska was a research associate at the Fraunhofer Institute for Applied and Integrated Security (AISEC), Germany. She received a Fraunhofer TALENTA grant for outstanding female early career researchers and the German Industrial Research Foundation prize for her research on machine learning privacy.

Adam Dziedzic

Title: Private Prompt Learning for Large Language Models

Abstract:

Large language models (LLMs) are excellent in-context learners. However, the sensitivity of data contained in prompts raises privacy concerns. Our work first shows that these concerns are valid: we instantiate a simple but highly effective membership inference attack against the data used to prompt LLMs. To address this vulnerability, one could forego prompting and resort to fine-tuning LLMs with known algorithms for private gradient descent. However, this comes at the expense of the practicality and efficiency offered by prompting. Therefore, we propose to privately learn to prompt. We first show that soft prompts can be obtained privately through gradient descent on downstream data. However, this is not the case for discrete prompts. Thus, we orchestrate a noisy vote among an ensemble of LLMs presented with different prompts, i.e., a flock of stochastic parrots. The vote privately transfers the flock’s knowledge into a single public prompt. We show that LLMs prompted with our private algorithms closely match the non-private baselines.
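The noisy vote described above can be illustrated with a minimal sketch. It assumes each LLM in the ensemble has already produced a class label for one query, and it uses Gaussian noise on the vote counts; the abstract does not specify the noise distribution, so that choice, the function name `noisy_vote`, and all values are assumptions.

```python
import random

def noisy_vote(votes, num_classes, noise_scale, rng=None):
    """Return the class with the highest noised vote count.

    votes:       list of predicted class indices, one per ensemble member.
    num_classes: number of possible classes.
    noise_scale: standard deviation of the Gaussian noise added to each
                 count (assumed noise type; larger means more privacy).
    """
    rng = rng or random.Random()
    counts = [0] * num_classes
    for v in votes:
        counts[v] += 1
    # Perturb every count so that no single member's vote can be inferred
    # with certainty from the released label.
    noisy = [c + rng.gauss(0.0, noise_scale) for c in counts]
    return max(range(num_classes), key=lambda k: noisy[k])

# Hypothetical votes from five differently prompted LLMs on one query.
label = noisy_vote([0, 1, 1, 2, 1], num_classes=3, noise_scale=1.0,
                   rng=random.Random(0))
```

Labels released this way for public inputs could then serve as demonstrations in a single public prompt, which is the knowledge transfer the abstract refers to. The privacy accounting that determines `noise_scale` is omitted here.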

Bio:

Adam is a tenure-track faculty member at CISPA, co-leading the SprintML group. His research is focused on secure and trustworthy Machine Learning as a Service (MLaaS). Adam designs robust and reliable machine learning methods for training and inference of ML models while preserving data privacy and model confidentiality. Adam was a Postdoctoral Fellow at the Vector Institute and the University of Toronto, and a member of the CleverHans Lab, advised by Prof. Nicolas Papernot. He earned his PhD at the University of Chicago, where he was advised by Prof. Sanjay Krishnan and worked on input and model compression for adaptive and robust neural networks. Adam obtained his Bachelor's and Master's degrees from Warsaw University of Technology in Poland. He also studied at DTU (Technical University of Denmark) and carried out research at EPFL, Switzerland. Adam also worked at CERN (Geneva, Switzerland), Barclays Investment Bank in London (UK), Microsoft Research (Redmond, USA), and Google (Madison, USA).

We encourage you to attend the event in person. To do so, please send an email to registration@ideas-ncbr.pl with your first name, last name, and affiliation. You can also join via Zoom:

https://us06web.zoom.us/j/87389672377?pwd=jquBwLiNB1vU2uMHGqMPMfGBJQTvZY.1

Meeting ID: 873 8967 2377

Passcode: 172318

Looking forward to seeing you at IDEAS!
