Date of last review: 2023-04-02
When you use personal data in your research project, you will likely also need to analyse those data, often with a script. In this chapter, we discuss the following scenarios for analysing personal data:
- “Regular” data analysis (“data-to-code”), where the data are brought to the “script” or analysis software in order to analyse them.
- “Code-to-data” scenario, where a script or analysis software is run on the data, without moving the data elsewhere.
- Federated analysis scenario, where a script or analysis software runs on multiple datasets that are in different locations, without moving those datasets elsewhere.
Additionally, we discuss relatively new cryptographic techniques that can help secure the analysis of personal or otherwise sensitive data.
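To make the federated analysis scenario more concrete, here is a minimal sketch in Python, assuming two sites that each hold their own dataset and only share aggregate statistics. The names `SiteSummary`, `summarise_locally` and `pooled_mean`, as well as the example values, are hypothetical and chosen for illustration only; real federated platforms add authentication, approval steps and disclosure checks on top of this idea.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SiteSummary:
    """Aggregate statistics a site shares instead of individual records."""
    n: int
    total: float


def summarise_locally(values: Sequence[float]) -> SiteSummary:
    """Runs where the data are stored; only the count and the sum leave the site."""
    return SiteSummary(n=len(values), total=float(sum(values)))


def pooled_mean(summaries: Sequence[SiteSummary]) -> float:
    """Runs at the coordinating researcher; combines the per-site aggregates."""
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no observations in any site summary")
    return sum(s.total for s in summaries) / n


# Hypothetical measurements held at two different institutions.
site_a = [52.0, 61.0, 47.0]
site_b = [58.0, 64.0]

# Each site computes its own summary; the raw values are never pooled.
summaries = [summarise_locally(site_a), summarise_locally(site_b)]
print(pooled_mean(summaries))
```

In this sketch the individual values never leave their site; only a count and a sum do. Note that aggregates can still be disclosive when a site contributes very few records, so a real setup would typically enforce a minimum group size before releasing any summary.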
Which scenario should I choose?
Which scenario suits your project depends on, among other things:
- Your dataset: does it contain personal data? How large is the dataset? Do you know the data structure and analysis method beforehand?
- Which computing facility is most suitable:
  - Local (e.g., laptop), on campus (e.g., cluster at Geosciences), from a national trusted party (e.g., SURF), or external (e.g., Amazon, Microsoft)?
  - Located in the Netherlands, Europe, or in a non-EEA country?
  - Small or large amount of computing power (CPUs/cores/threads or GPUs, memory size, disk space, etc.)?
- Which software you need to run on the data, e.g., R, Python, SPSS, or any other scripting language or analysis package:
  - Does the software require root user access to install and/or configure?
  - Does the software require paid licenses (e.g., MATLAB)?
  - Can the software be installed in advance, or does it need to be updated during analyses (e.g., with additional packages from a repository)?
- Whether and with whom you are collaborating on your project.
Tools and support
We have created an overview of secure computing software and services in this GitHub repository. Keep in mind that this is by no means a complete list!
If you work at Utrecht University, you can ask the Research engineering team for help with choosing a suitable computing solution. If you have already chosen a solution, but are not sure whether it is safe to use, you can contact Information Security or your privacy officer for help.