
Machine Unlearning in Large Language Models (LLMs): Opportunities and challenges

OKOYE, OBINNA KENNETH
2024/2025

Abstract

The astonishing success of AI-generated content has led to a resurgence in the popularity of machine learning (ML) technologies. However, the performance of machine learning models relies heavily on large volumes of data collected from massive numbers of distributed clients or customers. At the same time, given the potential for misuse, legislators worldwide have introduced laws and regulations that mandate user data deletion upon request. These include the European General Data Protection Regulation [1], the California Consumer Privacy Act [2], and Canada's proposed Consumer Privacy Protection Act [3], which stipulate that ML service providers are obligated to ensure the "right to be forgotten" [4] for clients, for instance at the end of a contract, allowing them to remove the effects of their data from well-trained models. Machine unlearning is the process of making a machine learning model "forget" information it previously learned; it is, in a sense, the opposite of machine learning. Instead of teaching the model to find patterns and make predictions, machine unlearning removes patterns or predictions that are no longer needed or correct. It should be noted that laws and regulations alone will not stop irresponsible misuse of machine learning and machine unlearning algorithms. The goal of unlearning is to obtain a model identical to the alternative model that would have been obtained by training on the dataset after removing the points that must be forgotten.
In this paper we discuss different types of large language models and their applications in various fields; approaches and methods that guarantee removal of users' information upon request in large language models; the motivations for implementing digital forgetting; the types of digital forgetting; the needs, opportunities, and solutions adopted by multiple researchers to correctly implement unlearning methods in large language models; the variety of solutions adopted; and, finally, a survey on machine unlearning in large language models.
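The goal stated above, obtaining a model identical to one trained on the dataset without the forgotten points, can be illustrated by the exact-unlearning baseline: retraining from scratch on the retained data. The following is a minimal, self-contained sketch with hypothetical names (the toy `train` function simply fits a mean predictor); it illustrates the definition only, not a method proposed in this paper.

```python
# Exact unlearning via retraining from scratch (toy illustration).
# A "model" here is just the mean of the training labels.

def train(dataset):
    """Toy 'training': fit a model that predicts the mean label."""
    labels = [y for _, y in dataset]
    return sum(labels) / len(labels)

def exact_unlearn(dataset, forget_set):
    """Retrain on the dataset with the forget set removed.

    By construction, the result is identical to a model that
    never saw the forgotten points."""
    retained = [example for example in dataset if example not in forget_set]
    return train(retained)

data = [("a", 1.0), ("b", 2.0), ("c", 3.0), ("d", 4.0)]
forget = [("d", 4.0)]

model_unlearned = exact_unlearn(data, forget)
model_reference = train([("a", 1.0), ("b", 2.0), ("c", 3.0)])
print(model_unlearned == model_reference)  # True: exact unlearning
```

Retraining from scratch is exact but expensive for large models, which is precisely why the approximate unlearning methods surveyed in this paper are of interest.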
Keywords: Machine Unlearning, Large Language Model, Privacy

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12608/87138