Best Practices for Handling PII in Databases

Protecting personally identifiable information (PII) is a core responsibility for researchers and practitioners. From the earliest stages of study design through the development of data management protocols, safeguarding participants must remain a central priority. Clear communication is essential: participants should understand what information is collected and how it will be used, and their informed consent must be secured before any data is gathered.

What is Personally Identifiable Information?

GDPR Definition

"Any information relating to an identified or identifiable natural person" (GDPR Article 4(1)). This includes names, ID numbers, location data, online identifiers, and factors specific to a person's physical, physiological, genetic, mental, economic, cultural or social identity.

Special Consideration for Small Populations

In small organizations or communities, demographic characteristics that wouldn't normally identify someone can become PII. Being the only person of a certain ethnicity, gender, or age group in a small population can make someone easily identifiable even without their name.
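
A minimal sketch of how this risk might be checked before sharing a dataset: count how many records share each combination of demographic values and flag any combination held by fewer than k people (a k-anonymity style check). The field names, the sample records, and the threshold k are illustrative assumptions, not values prescribed by this guidance.

```python
from collections import Counter

def small_groups(records, quasi_identifiers, k=5):
    """Return value combinations shared by fewer than k records.

    quasi_identifiers: fields that are not names but may identify someone
    in combination (hypothetical examples: gender, age group).
    """
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}

# Hypothetical survey of a small team: no names are stored,
# yet one respondent is the only member of their group.
records = [
    {"gender": "F", "age_group": "30-39"},
    {"gender": "F", "age_group": "30-39"},
    {"gender": "F", "age_group": "30-39"},
    {"gender": "M", "age_group": "30-39"},
]

# k=3 is an arbitrary threshold chosen for this example.
risky = small_groups(records, ["gender", "age_group"], k=3)
print(risky)
```

Any combination returned by such a check would need to be coarsened (for example, wider age bands) or suppressed before the data leaves the protected environment.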

Implementation Phases

📋 Before Collection

  • Minimize PII collection: gather only what is necessary
  • Obtain free, prior and informed consent (FPIC) from participants
  • Avoid collecting names when possible
  • Use unique identifiers instead of names
  • Plan separate tables for PII
  • Submit the study to an Ethics Review Board
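
As an illustration of "unique identifiers instead of names", the sketch below issues random participant IDs that carry no information about the person. The "P-" prefix, the ID length, and the function name are assumptions made for this example; any scheme works as long as the ID cannot be derived from, or traced back to, personal details.

```python
import secrets

def new_participant_id(existing_ids):
    """Return a random ID that is not already in use.

    The ID comes from a cryptographically secure generator and encodes
    nothing about the person (no initials, birth dates, or site codes).
    """
    while True:
        candidate = "P-" + secrets.token_hex(8)  # "P-" plus 16 hex characters
        if candidate not in existing_ids:
            return candidate

# Issue three IDs, tracking those already handed out.
issued = set()
for _ in range(3):
    issued.add(new_participant_id(issued))

print(sorted(issued))
```

IDs built from initials or dates of birth would defeat the purpose, since they can be reversed by anyone who knows the participant.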

📊 During Collection

  • Use professional devices only
  • Password-protect all devices
  • Lock devices when not in use
  • Follow data collection protocols

💾 After Collection

  • Separate PII into different tables
  • Keep names and geography in one table
  • Keep demographics in a separate table
  • Apply distance buffers to geographic data
  • Link tables with unique IDs only
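
The steps above can be sketched with Python's built-in sqlite3 module. This is one possible layout under stated assumptions: the table and column names are hypothetical, and "distance buffer" is interpreted here as random displacement of each coordinate within a fixed radius, which is one common approach; the guidance itself does not specify the method or the radius.

```python
import math
import random
import sqlite3

EARTH_RADIUS_M = 6_371_000

def displace(lat, lon, max_distance_m, rng=random):
    """Shift a point by a random bearing and a random distance up to max_distance_m.

    Uses a flat-earth approximation, adequate for short distances
    away from the poles.
    """
    bearing = rng.uniform(0, 2 * math.pi)
    # The square root spreads points evenly over the disc instead of
    # clustering them near the true location.
    distance = max_distance_m * math.sqrt(rng.random())
    dlat = distance * math.cos(bearing) / EARTH_RADIUS_M
    dlon = distance * math.sin(bearing) / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    return lat + math.degrees(dlat), lon + math.degrees(dlon)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE identity (        -- names and geography
        participant_id TEXT PRIMARY KEY,
        full_name TEXT, latitude REAL, longitude REAL
    );
    CREATE TABLE demographics (    -- kept apart from names
        participant_id TEXT PRIMARY KEY,
        age_group TEXT, gender TEXT
    );
    CREATE TABLE responses (       -- research data, linked by ID only
        participant_id TEXT, question TEXT, answer TEXT
    );
""")

pid = "P-0001"  # hypothetical participant ID
# Arbitrary example coordinates; the 2000 m radius is an assumption.
lat, lon = displace(10.0, 20.0, max_distance_m=2000, rng=random.Random(42))
conn.execute("INSERT INTO identity VALUES (?, ?, ?, ?)", (pid, "Example Name", lat, lon))
conn.execute("INSERT INTO demographics VALUES (?, ?, ?)", (pid, "30-39", "F"))
conn.execute("INSERT INTO responses VALUES (?, ?, ?)", (pid, "q1", "yes"))

# Analysis joins responses to demographics through the ID alone;
# the identity table is never touched.
joined = conn.execute("""
    SELECT r.question, r.answer, d.age_group
    FROM responses r JOIN demographics d USING (participant_id)
""").fetchall()
print(joined)
```

Because the tables share only the participant ID, access to the identity table can be restricted to a much smaller group than access to the research data.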

🔒 Security Measures

  • Encrypt all databases and backups
  • Strong passwords & two-factor authentication
  • Role-based access controls
  • Comprehensive audit logging
  • Regular security reviews
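
To make role-based access and audit logging concrete, here is a minimal sketch in which every access request is checked against the requester's role and recorded whether or not it succeeds. The role names, table names, and in-memory audit list are assumptions for illustration; most database systems provide their own role and grant mechanisms, which would normally be used instead of application code like this.

```python
from datetime import datetime, timezone

# Hypothetical roles mapped to the tables each may read.
ROLE_PERMISSIONS = {
    "data_manager": {"identity", "demographics", "responses"},
    "analyst": {"demographics", "responses"},
    "field_assistant": {"responses"},
}

# Stand-in for an audit store; in practice this would be append-only
# and outside the control of the users being audited.
audit_trail = []

def request_access(user, role, table):
    """Check the role's permissions and record the attempt, allowed or not."""
    allowed = table in ROLE_PERMISSIONS.get(role, set())
    audit_trail.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "table": table,
        "allowed": allowed,
    })
    return allowed

ok = request_access("user_a", "analyst", "responses")     # permitted
denied = request_access("user_a", "analyst", "identity")  # refused, but still logged

for entry in audit_trail:
    print(entry["user"], entry["table"], entry["allowed"])
```

Logging refused attempts as well as successful ones is what makes the trail useful during the regular security reviews listed above.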

Data Retention Timeline

📅 10 Years: Delete names, photos, phone numbers, emails, addresses, and ID numbers

🗂️ 50 Years: Delete all raw data; keep summary data only
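
The timeline could be automated along the following lines. This sketch assumes the clock starts on the collection date, which the timeline does not state explicitly, and the field names in DIRECT_IDENTIFIERS are hypothetical stand-ins for the identifiers listed above.

```python
from datetime import date

# Hypothetical column names for the direct identifiers in the timeline.
DIRECT_IDENTIFIERS = {"name", "photo", "phone", "email", "address", "id_number"}

def years_between(start, end):
    """Whole years elapsed from start to end."""
    years = end.year - start.year
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years

def retention_action(collected_on, today):
    """Map a record's age to the action the timeline calls for."""
    age = years_between(collected_on, today)  # assumption: measured from collection
    if age >= 50:
        return "delete_raw_data"
    if age >= 10:
        return "delete_direct_identifiers"
    return "retain"

def strip_direct_identifiers(record):
    """Return a copy of the record without direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

record = {
    "participant_id": "P-0001",
    "name": "Example Name",
    "email": "person@example.org",
    "age_group": "30-39",
}
action = retention_action(date(2012, 3, 15), date(2024, 3, 15))
cleaned = strip_direct_identifiers(record) if action == "delete_direct_identifiers" else record
print(action, cleaned)
```

Backups and exported copies fall under the same schedule, so a job like this would need to cover them as well as the live database.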

Quick Reference Checklist

🎯 Planning & Collection

  • Minimize PII collection
  • Use unique identifiers
  • Plan separate database tables
  • Submit the study to an Ethics Review Board
  • Consider small population risks
  • Use professional devices only

🔐 Storage & Security

  • Separate PII from other data
  • Apply geographic distance buffers
  • Encrypt databases and backups
  • Implement strong authentication
  • Set role-based access levels
  • Monitor with audit logs

⏰ Data Retention

  • Document retention periods in the data management plan (DMP)
  • 10 years: Delete direct identifiers
  • 50 years: Delete all raw data
  • Keep only summary statistics
  • Review access permissions regularly
  • Ensure regulatory compliance
