We found the following R packages for de-identifying data or for checking how well data have been de-identified (in alphabetical order):
Name | Description | Data type | Privacy models | More info | License | Maintenance | GitHub stars |
---|---|---|---|---|---|---|---|
Datacheck | Open-source R package and web app to check for the presence of common identifiers | Tabular data (CSV) | - | Project report and demo | MIT | Active | 0-10 |
sdcMicro | R package and web app to apply generalization, top and bottom coding, and recoding, and to analyze privacy risks and utility | Tabular microdata (.Rdata, .sav, .sas7bdat, .csv, .txt, .dta) | k-anonymity | Documentation, demo | GPL-v2 | Active | 10-100 |
sdcTable | R package to apply statistical disclosure control to tables | Tables (e.g., frequency tables) | Suppression | Documentation | GPL-v2 | Active | 0-10 |
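
The table lists k-anonymity as a privacy model without spelling out what it measures. As a rough, package-independent illustration: a dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k records. The sketch below is not the API of sdcMicro or of any other package listed above. The function name `k_anonymity`, the column names, and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    values on all quasi-identifier columns (0 for an empty dataset)."""
    if not rows:
        return 0
    # Count how many rows fall into each combination of quasi-identifier values.
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


# Hypothetical toy records; column names are assumptions for this sketch.
records = [
    {"age_band": "20-29", "zip3": "123", "diagnosis": "flu"},
    {"age_band": "20-29", "zip3": "123", "diagnosis": "asthma"},
    {"age_band": "30-39", "zip3": "456", "diagnosis": "flu"},
    {"age_band": "30-39", "zip3": "456", "diagnosis": "diabetes"},
    {"age_band": "30-39", "zip3": "456", "diagnosis": "flu"},
]

# Smallest group size when age band and ZIP prefix are the quasi-identifiers.
k = k_anonymity(records, ["age_band", "zip3"])
print(k)
```

With age band and ZIP prefix as quasi-identifiers, the smallest group here holds two records, so the toy data are 2-anonymous. Treating more columns as quasi-identifiers can only keep k the same or lower it, which is why generalization and recoding are typically applied before k is measured.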