Cryptographic techniques
On this page: encryption, cryptography, security, collaboration, confidential computing, mpc, homomorphic encryption
Date of last review: 2023-04-02
Besides the scenarios described previously, several cryptographic techniques can be applied to protect sensitive data in the analysis phase. Here, we discuss secure multiparty computation, confidential computing, and homomorphic encryption.
Although there is some overlap in functionality and purpose between these three techniques, they are generally still considered to be distinct and can be combined to enhance security.
These cryptographic techniques are relatively new and are not yet available as distinct services for direct application in research. For now, they are listed here for information purposes.
Secure multiparty computation
Secure multiparty computation (also referred to as “MPC”) is a set of cryptographic techniques that allows multiple parties to jointly perform analyses on distributed datasets, as if they had a shared database, and without revealing the underlying data to each other. Among those techniques are secure set intersection (securely investigating which elements multiple databases have in common), homomorphic encryption (see below), and others.
When to use
The benefits of MPC are that no raw data are shared between the parties, the computation is guaranteed to be performed correctly, and there is a degree of control over who receives the result of the computation (i.e., the results are not necessarily combined in a central location). MPC is therefore a good way of implementing Privacy by Design in your project when you work with personal data.
Contrary to federated analysis, MPC is suitable for linking “vertically partitioned” datasets, i.e., when different organisations have different (types of) information on the same people and thus want to link those different data sources.
Implications for research
- The computation in MPC is truly joint: all parties need to agree beforehand on the specific analysis to be performed and on what will be revealed as the result of the computation.
- There is no one-size-fits-all MPC solution: different use cases ask for different implementations of MPC.
- Additional computational resources are required to generate random secrets and distribute data over the multiple parties.
Example
- MPC was used by a medical insurance company and hospital to determine the effectiveness of a personal lifestyle app for diabetes. In this example, it was possible to calculate average medical cost for different patient groups, based on whether they used the app or not, without revealing patient information between the insurance company and the hospital.
- You can find a simplified example on jointly calculating average income here.
You can find more information about secure multiparty computation on https://securecomputation.org/, in this report, and on the website of TNO.
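The idea behind jointly calculating an average, as in the income example above, can be illustrated with additive secret sharing, one common building block of MPC. The following is a minimal sketch rather than a production protocol. It simulates all parties in a single process, and the names `MODULUS`, `share` and `secure_sum` are hypothetical, chosen for illustration only. Each party splits its private value into random shares that individually reveal nothing, and only the sum of all shares is reconstructed.

```python
import secrets

# Public modulus agreed on by all parties (assumption: larger than any possible sum).
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split `value` into `n_parties` random shares that sum to `value` mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Simulate a joint sum in which no party sees another party's raw value."""
    n = len(private_values)
    # Party i splits its value into n shares; share j is sent to party j.
    distributed = [share(v, n) for v in private_values]
    # Party j adds up the shares it received and publishes only this partial sum.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # Combining the published partial sums yields the total, and nothing else.
    return sum(partial_sums) % MODULUS


# Illustrative (made-up) incomes held by three different parties.
incomes = [42_000, 55_000, 61_000]
total = secure_sum(incomes)
average = total / len(incomes)
```

In a real deployment the shares would be exchanged over secure channels between separate machines, and the protocol would need protection against parties that deviate from it. This is part of why additional computational resources are required.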
Confidential computing
Confidential computing is a technique that protects data in use through a (hardware-based) Trusted Execution Environment (TEE). This environment makes sure that data within it are kept confidential (data confidentiality) and that both the data and the code running in the TEE cannot be modified or deleted (data and code integrity). The TEE uses embedded encryption keys and makes sure that the analysis stops running when malware or unauthorised access is detected. Moreover, data and code are even invisible to the operating system, cloud provider and any virtual machines.
There are many possible applications of this technique, for example:
- You want to protect against unauthorised access during the analysis of sensitive data.
- You want to analyse sensitive data, and it is necessary to use an untrusted cloud platform or infrastructure.
- You want to prevent the analysis script from leaking or manipulating data.
Confidential computing should be used together with encryption of data at rest and in transit, with restricted access to the decryption keys. It also requires a way to verify that the TEE is trustworthy (attestation), which is an active field of study. You can read more on the website of the Confidential Computing Consortium.
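The link between attestation and restricted key access can be sketched conceptually as follows: the data owner releases the decryption key only if the code running in the TEE matches an approved version. This is a strong simplification under stated assumptions. In practice the measurement is produced and signed by the hardware, and that signature must be verified as well. The function names `measure` and `release_key`, the example scripts, and the key value are all hypothetical.

```python
import hashlib
import hmac


def measure(code: bytes) -> str:
    """Return a fingerprint (SHA-256 hash) of the code to be run in the TEE."""
    return hashlib.sha256(code).hexdigest()


def release_key(reported_measurement: str, expected_measurement: str, key: bytes):
    """Hand out the decryption key only if the reported code fingerprint matches."""
    # compare_digest avoids leaking information through timing differences.
    if hmac.compare_digest(reported_measurement, expected_measurement):
        return key
    return None


# The data owner approves one specific analysis script in advance.
approved_script = b"result = mean(costs)"
expected = measure(approved_script)
data_key = b"hypothetical-data-key"

# The approved script receives the key; a modified script does not.
granted = release_key(measure(approved_script), expected, data_key)
denied = release_key(measure(b"upload(costs)"), expected, data_key)
```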
(Fully) homomorphic encryption
Where “regular” encryption focuses on data at rest (e.g., in storage) or data in transit (e.g., when transferring data), homomorphic encryption allows analyses to be performed on encrypted data (“data in use”). During the analysis, both the data and the computation result remain encrypted, unless they are decrypted by the decryption key owner. This technique can be applied both in confidential computing and in secure multiparty computation.
There are multiple types of homomorphic encryption: partially, somewhat and fully homomorphic encryption. The latter is the most promising, as it allows an unlimited number of additions and multiplications to be performed on the encrypted data.
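To illustrate what computing on encrypted data means, the following is a toy sketch of the Paillier cryptosystem, a partially homomorphic scheme that supports addition only (it is not fully homomorphic). The primes are far too small to be secure, the function names are chosen for illustration, and the code assumes Python 3.8 or later for the modular inverse via `pow`. It is not a substitute for a vetted library.

```python
import math
import secrets


def keygen(p=1009, q=1013):
    """Build a toy Paillier key pair from two small primes (insecure, demo only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this form is valid for generator g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n with the public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(n, private_key, c):
    """Decrypt ciphertext c with the private key (lam, mu)."""
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return (x - 1) // n * mu % n


def add_encrypted(n, c1, c2):
    """Multiplying two ciphertexts yields an encryption of the sum of the plaintexts."""
    return c1 * c2 % (n * n)


n, private_key = keygen()
c_total = add_encrypted(n, encrypt(n, 1200), encrypt(n, 345))
total = decrypt(n, private_key, c_total)  # 1545, computed without decrypting the inputs
```

The party performing `add_encrypted` needs only the public value `n` and never sees 1200, 345, or their sum. Only the holder of the private key can decrypt the result.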
Currently, the practical use of homomorphic encryption is limited, because it can require a lot of computational resources and is therefore relatively slow. New implementations are being developed, however; see this website for a list of available implementations. Another limitation is that you cannot interact with the data during the analysis, so you cannot check whether the analysis was successful. To work around this, you could first develop and test your algorithms on a synthetic dataset.