Does fine-tuning GPT-3 with the OpenAI API leak personally-identifiable information? [PDF]
Machine learning practitioners often fine-tune generative pre-trained models like GPT-3 to improve model performance at specific tasks. Previous works, however, suggest that fine-tuned machine learning models memorize and emit sensitive information from ...
A. Sun+6 more
semanticscholar +3 more sources
Enabling Efficient Personally Identifiable Information Detection with Automatic Consent Discovery
Personal data leakage prevention has now become a critical issue for implementing data management and sharing in many industries. Several data privacy regulations such as General Data Protection Regulation (GDPR), Health Insurance Portability and ...
S. Fugkeaw, Pattavee Sanchol
semanticscholar +3 more sources
Personally Identifiable Information (PII) Detection in the Unstructured Large Text Corpus using Natural Language Processing and Unsupervised Learning Technique [PDF]
Personally Identifiable Information (PII) has gained much attention with the rapid development of technologies and the exploitation of information relating to an individual.
Poornima Kulkarni, Cauvery N K
semanticscholar +2 more sources
The integration of Artificial Intelligence (AI) into big data analytics represents a pivotal shift in the management of Personally Identifiable Information (PII) within the financial sector.
S. Olabanji+5 more
semanticscholar +3 more sources
On the leakage of personally identifiable information via online social networks [PDF]
For purposes of this paper, we define "Personally identifiable information" (PII) as information which can be used to distinguish or trace an individual's identity either alone or when combined with other information that is linkable to a specific individual.
Craig E. Wills+1 more
openaire +4 more sources
Privacy in Public Archives: Managing Personally Identifiable Information in Special Collections
Archivists aim to make research and manuscripts accessible to the public. However, accessibility becomes tricky when donors or institutions enforce limitations. Sometimes limitations need to be enforced, especially when dealing with sensitive information
Zachary Stein
semanticscholar +3 more sources
Coronavirus disease 2019 (COVID-19) is a zoonosis, that is, a disease transmitted from animals to humans. Because it is highly epizootic, it has forced public health experts to implement smartphone-based applications to trace its
Molla Rashied Hussein+4 more
semanticscholar +2 more sources
A pseudonymized corpus of occupational health narratives for clinical entity recognition in Spanish [PDF]
Despite the high creation cost, annotated corpora are indispensable for robust natural language processing systems. In the clinical field, in addition to annotating medical entities, corpus creators must also remove personally identifiable information ...
Jocelyn Dunstan+8 more
doaj +2 more sources
Cyber-Assets at Risk (CAR): The Cost of Personally Identifiable Information Data Breaches [PDF]
Severe financial consequences of data breaches enforce organizations to reconsider their cybersecurity investment. Although attack frequency and trends seem similar per industry, the impact of a data breach may exponentially increase depending on the ...
Sarah Bouazzaoui+4 more
semanticscholar +3 more sources
Simulating Data Breaches: Synthetic Datasets for Depicting Personally Identifiable Information through Scenario-based Breaches [PDF]
Abhishek Sharma, May Bantan
openalex +2 more sources