Introduction

Scientific research often includes some form of personal data. However, researchers may be unaware of what personal data are or whether they are being collected. With the implementation of the General Data Protection Regulation (GDPR) in 2018, stricter legal requirements apply to handling, sharing and publishing personal data. In our own experience, the number and complexity of questions on handling personal data in scientific research at Utrecht University (UU) are increasing.

Our goal at Research Data Management Support (RDM Support) is to assist researchers with any issues surrounding the management of their research data, including research data that contain personal data. To understand how we can best help researchers with their privacy-related questions and needs, we wanted to investigate:

  1. To what extent are UU researchers aware of privacy legislation and practices?
  2. What data privacy issues do UU researchers typically run into?
  3. What support do UU researchers need to handle personal data?

To answer these questions, we set up an online survey and planned one-on-one meetings with a selection of UU researchers. This report describes our methods and full results of both the online survey and the one-on-one meetings. For a full summary of the results and recommendations on how to move forward, please refer to the Recommendations report.

This survey was part of a larger project, the Data Privacy Project, a data support effort led by RDM Support at UU that aims to provide actionable and FAIR (Findable, Accessible, Interoperable, Reusable) information and tools for researchers to handle personal data in their research.

Methods

Survey questions

The Data Privacy Survey was created by the project coordinators (DH, NM) with input from a wide variety of experts, consisting of privacy experts, data managers, data consultants, and IT staff (e.g., information security, research engineering). The survey consisted of 19-23 questions (the exact number differing depending on given answers), which were separated into 5 sections:

  1. About you and your research: 5 questions about the faculty, department, position, and type of (personal) data the researcher works with.
  2. Measures and documents: 4-6 questions about researchers’ familiarity and use of privacy-related measures (e.g., processing register, Data Protection Impact Assessment, encryption, pseudonymisation, etc.), data storage, and informed consent forms.
  3. Data sharing, archiving and publication: 3-4 questions about researchers’ data sharing and publishing practices.
  4. Finding support: 3-4 questions about the awareness of several data support channels.
  5. Improving our services: 4 questions about researchers’ challenges, needs, and suggestions to improve UU’s data privacy support.
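As a quick consistency check, the per-section question counts listed above add up to the stated total of 19-23 questions. The minimal Python sketch below only restates the numbers from the list; it is not part of the survey tooling.

```python
# (minimum, maximum) number of questions per survey section, as listed above.
sections = {
    "About you and your research": (5, 5),
    "Measures and documents": (4, 6),
    "Data sharing, archiving and publication": (3, 4),
    "Finding support": (3, 4),
    "Improving our services": (4, 4),
}

# Sum the lower and upper bounds separately to get the overall range.
minimum = sum(low for low, _ in sections.values())
maximum = sum(high for _, high in sections.values())

print(f"{minimum}-{maximum} questions")
```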

The duration of 10 minutes that we estimated in advance proved to be relatively accurate: respondents spent a median of 8.4 minutes on the survey, although the variation was very large (range: 0.4 minutes to 5.1 days).

Procedure online survey

The survey was created in Qualtrics and distributed from March 21st, 2022 onwards via several communication channels to reach as many UU researchers as possible:

  • An email was sent to all academic staff at UU through central communication channels. The reasoning was that this would be the most effective way to reach as many UU researchers as possible, accepting that we would likely miss a small group of non-academic personnel who are also involved in research in some way.
  • A mention in several Faculty newsletters.
  • Social media messages (e.g., on Twitter).
  • A news item on the UU intranet.
  • Via data support colleagues, who were asked to point researchers they were in contact with to the survey.

Participation in the survey was voluntary. All results reported here come from researchers who gave their active consent, worked at Utrecht University and indicated that they worked with some type of research data.

Online survey respondents

UU-wide

The survey was filled out by 176 UU researchers. As can be seen in the figure below, we received responses from each UU faculty, but the distribution was not equal: the faculties of Science and Social and Behavioural Sciences (FSBS) were overrepresented, whereas the number of responses from the Faculty of Geosciences and the Faculty of Medicine was rather low. This can be explained by the sheer size of the faculties (e.g., the Faculty of Science is UU’s largest faculty), but also by the types of research performed there. For example, research at the Faculty of Geosciences largely involves natural scientific data, rather than data from or relating to humans. The Faculty of Medicine is located at the University Medical Center Utrecht (UMCU) and was not of primary interest for the current survey, which was sent to UU researchers only. A small number of respondents indicated that they worked at another part of the organisation, mainly University College Utrecht and the University Corporate Office.

Most of the survey respondents were relatively early career researchers (PhDs, junior researchers and postdoctoral researchers). We suspect that this is because 1) there simply are more employees in these positions than there are in the others, and 2) early-career researchers may experience a greater need for support with respect to handling personal data.

Note that in the faculty-specific figures, the number of respondents from the entire faculty was not always equal to the sum of respondents per department. This is because the Department question was not mandatory and allowed multiple responses: some respondents were counted in several departments, whereas others are not displayed in the Department plot because they did not answer that question.
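To illustrate why department counts need not add up to the faculty total, the following minimal Python sketch tallies an optional, multiple-response question. The respondent records and department names are made up for illustration and are not survey data; the actual figures were produced with the project's R scripts, which may work differently.

```python
from collections import Counter

# Hypothetical answers to an optional, multiple-response Department question.
# Each entry is one respondent; an empty list means the question was skipped.
department_answers = [
    ["Biology"],
    ["Biology", "Chemistry"],   # one respondent counted in two departments
    ["Chemistry", "Physics"],   # another respondent counted twice
    [],                         # respondent who left the question blank
    ["Chemistry"],
]

def tally_departments(answers):
    """Count how often each department was selected across all respondents."""
    counts = Counter()
    for selected in answers:
        counts.update(selected)
    return counts

n_respondents = len(department_answers)
department_counts = tally_departments(department_answers)
sum_of_departments = sum(department_counts.values())

# The two totals differ because of double counting and skipped answers.
print(n_respondents, dict(department_counts), sum_of_departments)
```

In this made-up example, five respondents yield six department selections, so the department plot and the faculty total cannot be expected to match.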

Science

There were 44 respondents from the Science faculty in the online survey. It took them a median of 5.7 minutes to complete it (ranging from 0.6 minutes to 5.1 days). In the graph below, the representation of each department within the Science faculty is visualised.

FSBS

There were 41 respondents from the Faculty of Social and Behavioural Sciences (FSBS) in the online survey. It took them a median of 8.9 minutes to complete it (ranging from 0.4 minutes to 0.3 days). In the graph below, the representation of each department within FSBS is visualised.

Humanities

There were 27 respondents from the Faculty of Humanities in the online survey. It took them a median of 8.5 minutes to complete it (ranging from 0.9 minutes to 1.3 days). In the graph below, the representation of each department within the Faculty of Humanities is visualised.

Veterinary Medicine

There were 23 respondents from the Faculty of Veterinary Medicine in the online survey. It took them a median of 10.4 minutes to complete it (ranging from 1.4 minutes to 0.2 days). In the graph below, the representation of each department within the Faculty of Veterinary Medicine is visualised.

LEG

There were 21 respondents from the Faculty of Law, Economics, and Governance (LEG) in the online survey. It took them a median of 10.1 minutes to complete it (ranging from 3.5 minutes to 0.9 days). In the graph below, the representation of each department within the Faculty of Law, Economics, and Governance (LEG) is visualised.

Geo

There were 14 respondents from the Faculty of Geosciences in the online survey. It took them a median of 7.1 minutes to complete it (ranging from 2.9 minutes to 0 days). In the graph below, the representation of each department within the Faculty of Geosciences is visualised.

One-on-one meetings

Besides the online survey, we organised one-on-one meetings with researchers, to hear about their personal experiences, challenges and needs concerning the handling of personal data in their research. Survey respondents could voluntarily leave their email address at the end of the survey to be contacted by us. These meetings were semi-structured and revolved around the following questions:

  • What made you leave your email address in the Data Privacy Survey? Related: What are your general experiences in handling personal data in research?
  • Which difficulties do you run into when handling personal data?
  • What support would you need to help you handle personal data in your research?
  • Do you have a concrete need for support at the moment?

All of the one-on-one meetings were conducted online and took approximately 30 minutes. One of the project coordinators (DH) was always present, together with either the other project coordinator (NM) or the relevant faculty privacy officer. The privacy officer was only invited to a meeting after the researcher had consented to this.

UU-wide

Of the survey respondents, 40 researchers left their email address to be contacted. Of those, 28 researchers indicated that they were willing to meet with us. Below, the distribution across faculties can be seen for all interviewees. Notably, this distribution seemed to mirror the faculty distribution in the entire survey relatively well.

Science

Of the survey respondents, 7 researchers from the Faculty of Science left their email address to be contacted. Online meetings were conducted with all 7 of them, each with the project coordinators. Below, the distribution across positions can be seen for all interviewees from the Faculty of Science.

FSBS

Of the survey respondents, 9 researchers from the Faculty of Social and Behavioural Sciences (FSBS) left their email address to be contacted. Online meetings were conducted with 6 of them, most with the faculty privacy officer present. Below, the distribution across positions can be seen for all interviewees from the Faculty of Social and Behavioural Sciences (FSBS).

Humanities

Of the survey respondents, 7 researchers from the Faculty of Humanities left their email address to be contacted. Online meetings were conducted with 5 of them, all with one of the faculty privacy officers present. Below, the distribution across positions can be seen for all interviewees from the Faculty of Humanities.

Veterinary Medicine

Of the survey respondents, 7 researchers from the Faculty of Veterinary Medicine left their email address to be contacted. Online meetings were conducted with 5 of them, most with one of the faculty privacy officers present. Below, the distribution across positions can be seen for all interviewees from the Faculty of Veterinary Medicine.

LEG

Of the survey respondents, 7 researchers from the Faculty of Law, Economics, and Governance (LEG) left their email address to be contacted. Online meetings were conducted with 2 of them, both with one of the privacy officers present. Below, the distribution across positions can be seen for all interviewees from the Faculty of Law, Economics, and Governance (LEG).

Geo

Of the survey respondents, 1 PhD/junior/postdoctoral researcher from the Faculty of Geosciences left their email address to be contacted and agreed to meet online.

Analysis

From the raw Qualtrics output, we first cleaned and split the data into different data files (cleaned and closed survey responses, open text responses, email addresses, see the pseudonymise-data.R script for details). Both the open text responses and the notes made during the one-on-one meetings were separately and manually coded to enable the extraction of action points (see the file codes-open-text-responses-meetings.csv for the codes used).
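The splitting step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the contents of pseudonymise-data.R: the function name, the column names and the example rows are all hypothetical, and the real script may clean and separate the data differently.

```python
def split_survey_rows(rows, closed_fields, open_fields, email_field):
    """Split raw survey rows into three separate collections.

    Closed responses, open text responses and email addresses are kept
    apart so that identifying details do not travel with the closed answers.
    All field names are supplied by the caller (hypothetical here).
    """
    closed, open_text, emails = [], [], []
    for row in rows:
        # Keep only the closed (multiple-choice) answers for analysis.
        closed.append({field: row.get(field) for field in closed_fields})
        # Keep open text answers only when something was actually written.
        answers = {field: row[field] for field in open_fields if row.get(field)}
        if answers:
            open_text.append(answers)
        # Store email addresses on their own, without any survey answers.
        if row.get(email_field):
            emails.append(row[email_field])
    return closed, open_text, emails


# Made-up example rows; these are not survey data.
raw_rows = [
    {"faculty": "Science", "position": "PhD",
     "challenges": "Unsure about consent forms", "email": "a@example.org"},
    {"faculty": "Humanities", "position": "Postdoc",
     "challenges": "", "email": ""},
]

closed, open_text, emails = split_survey_rows(
    raw_rows, ["faculty", "position"], ["challenges"], "email"
)
print(len(closed), len(open_text), len(emails))
```

Keeping the three outputs in separate files means that the closed responses can be analysed and plotted without the open text or contact details being loaded at all.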

Below we report on the descriptive statistics or summaries from the survey questions and notes made during the one-on-one meetings. As we did not formulate hypotheses, no statistical analyses were performed.

Data and material availability

All survey-related documentation can be found in the dedicated GitHub repository.

Documentation

Code

The repository contains all scripts and documents used to clean the data and write the reports.

Data

As the original dataset contains personal information (demographic information, open text responses, email addresses, etc.), and no consent was obtained to share those details, we are unable to share them in this repository. We did however create dummy data files using Qualtrics’s “Generate responses” functionality (for the fake raw survey data) and the website “Mockaroo.com” (for the interview and open text response data). These files can be used to regenerate the current report, but will not create any realistic results.

To reproduce this report:

  1. Clone the repository from GitHub.
  2. In RStudio, open the file plot-data.R
  3. In the first dependencies block of plot-data.R, change reproduce <- "no" to reproduce <- "yes" so that the fake data files are read into the report.
  4. Open the file data-privacy-survey-report.Rmd and knit the document into a “fake” survey report.

Results

Types of research data

UU-wide

To investigate the types of research that were represented in the sample, we asked which types of data, and specifically which types of personal data, researchers worked with. Most researchers indicated that they used tabular, textual, code and audio data as their primary research data formats. In terms of personal data types, demographic information, contact information and direct identifiers were most common. This may be because these types of personal data are indeed the most common, but possibly also because these are the types that researchers most readily recognised as personal data.

Types of research data

Datatype                            Count
Audio data                             56
Bio-medical samples and data           28
Code/theoretical models                60
Geographical data                      25
Images                                 38
Other                                   8
Physical samples                       12
Physiological measurements             28
Tabular data                          125
Textual data                          107
Video data                             39

Types of personal data

Personal Datatype                   Count
Contact information                    82
Demographic information               102
Derived personal data                  30
Direct identifiers                     66
Health/physical information            26
Human behaviour                        32
None                                   18
Other                                  11
Sensitive demographic information      36
Sensitive direct identifiers            9


When comparing faculties, it seemed that most personal data were processed in the Faculty of Social and Behavioural Sciences (FSBS), although the faculties of Science and Veterinary Medicine also seemed to process a considerable amount of personal data. In the faculty-specific sections, we look further into the types of (personal) data processed within each faculty.

Science

As with the university-wide data, researchers from the Science faculty indicated that they also most often used tabular data, textual data and code/theoretical models in their research. The same goes for the types of personal data: the same top-3 types were used here as in the entire university (demographic information, contact information, direct identifiers).

Types of research data

Datatype                            Count
Audio data                              8
Bio-medical samples and data            8
Code/theoretical models                24
Geographical data                       4
Images                                 13
Other                                   2
Physical samples                        6
Physiological measurements              5
Tabular data                           34
Textual data                           27
Video data                              9

Types of personal data

Personal Datatype                   Count
Contact information                    13
Demographic information                17
Derived personal data                  11
Direct identifiers                     13
Health/physical information             8
Human behaviour                         9
None                                   12
Other                                   2
Sensitive demographic information       4
Sensitive direct identifiers            4


As a large part of the researchers from the Science faculty indicated not to use any personal data whatsoever in their research, it may be useful to investigate whether this differed per department. In the figure below, the division of personal data types per Science department is plotted. As can be expected, the departments of Biology, Physics and Mathematics did not seem to process much personal data, whereas in the Information and Computing Sciences and Pharmaceutical Sciences departments, the most personal data were processed. This makes sense considering the types of research performed in these departments.

FSBS

As with the university-wide data, researchers from the Faculty of Social and Behavioural Sciences (FSBS) indicated that they also most often used tabular data, textual data and code/theoretical models in their research. The same goes for the types of personal data: demographic and contact information were used most often. Interestingly, sensitive demographic information was used much more often relative to the UU-wide responses.

Types of research data

Datatype                            Count
Audio data                             12
Bio-medical samples and data            4
Code/theoretical models                12
Geographical data                       6
Images                                  3
Other                                   4
Physical samples                        1
Physiological measurements              8
Tabular data                           33
Textual data                           18
Video data                             10

Types of personal data

Personal Datatype                   Count
Contact information                    20
Demographic information                32
Derived personal data                   8
Direct identifiers                     14
Health/physical information             6
Human behaviour                         8
Other                                   3
Sensitive demographic information      16
Sensitive direct identifiers            1


When comparing the types of personal data across departments of the faculty, the departments of Sociology and Cultural Anthropology seemed to process the least personal data, although these departments may also simply be underrepresented in the survey sample. Most other departments processed a considerable amount of personal data, as can be expected from a faculty whose research focuses almost exclusively on humans.

Humanities

As with the university-wide data, researchers from the Faculty of Humanities indicated that they also most frequently used tabular data, textual data and audio data in their research. The same goes for the types of personal data: the same top-3 types were present here as in the entire university (demographic information, contact information, direct identifiers).

Types of research data

Datatype                            Count
Audio data                             12
Bio-medical samples and data            2
Code/theoretical models                 4
Images                                  5
Physiological measurements              2
Tabular data                           13
Textual data                           18
Video data                              6

Types of personal data

Personal Datatype                   Count
Contact information                    10
Demographic information                14
Derived personal data                   2
Direct identifiers                     11
Health/physical information             1
Human behaviour                         5
None                                    2
Other                                   2
Sensitive demographic information       4
Sensitive direct identifiers            1


When looking at the different departments within the faculty, it was clear that mostly the departments of Languages, Literature and Communication, and of Media and Culture Studies processed personal data in their research. Especially researchers at the department of Languages, Literature and Communication seemed to process a lot of personal data, which makes sense considering the type of research performed there.

Veterinary Medicine

When comparing the types of data that researchers at the Faculty of Veterinary Medicine indicated to work with, it was clear that researchers at this faculty worked with slightly different types of data, i.e., biological and physiological data were more common among these researchers. Despite this, the most frequent types of personal data do correspond well to the university-wide numbers: demographic information, direct identifiers and contact information were also the most common types of personal data researchers at this faculty seemed to deal with.

Types of research data

Datatype                            Count
Audio data                              4
Bio-medical samples and data           13
Code/theoretical models                10
Geographical data                       6
Images                                 10
Other                                   1
Physical samples                        4
Physiological measurements             13
Tabular data                           20
Textual data                           13
Video data                              9

Types of personal data

Personal Datatype                   Count
Contact information                    16
Demographic information                15
Derived personal data                   5
Direct identifiers                     11
Health/physical information             8
Human behaviour                         3
None                                    2
Other                                   1
Sensitive demographic information       2
Sensitive direct identifiers            2


When looking at each department separately, it became clear that researchers in the department of Population Health Sciences processed the large majority of the personal data within the faculty. Of course, it may also be possible that this department was simply better represented in the survey than the other departments.

LEG

As with the university-wide data, researchers from the Faculty of Law, Economics and Governance (LEG) indicated that they also most frequently used tabular data, textual data and audio data in their research. The same goes for the types of personal data: the same top-3 types were present here as in the entire university (demographic information, contact information, direct identifiers).

Types of research data

Datatype                            Count
Audio data                             12
Code/theoretical models                 1
Geographical data                       2
Images                                  3
Tabular data                           13
Textual data                           17
Video data                              2

Types of personal data

Personal Datatype                   Count
Contact information                    14
Demographic information                14
Derived personal data                   2
Direct identifiers                     14
Health/physical information             3
Human behaviour                         3
Sensitive demographic information       8


When looking at the different departments within the faculty, researchers in the department of Governance seemed to process personal data the most often, followed by the department of Law. Of course, it is possible that the other departments process more personal data than the respondents indicated.

Geo

As with the university-wide data, researchers from the faculty of Geosciences indicated that they also most frequently used textual and tabular data. However, in contrast to the university-wide data, they were followed by geographical data and code/theoretical models. This is to be expected considering the types of research performed in the faculty. In terms of personal data, the same top-3 types were present here as in the entire university (demographic information, contact information, direct identifiers).

Types of research data

Datatype                            Count
Audio data                              4
Code/theoretical models                 7
Geographical data                       7
Images                                  2
Physical samples                        2
Tabular data                           10
Textual data                           11
Video data                              2

Types of personal data

Personal Datatype                   Count
Contact information                     9
Demographic information                 7
Derived personal data                   2
Direct identifiers                      4
Human behaviour                         2
None                                    2
Other                                   1
Sensitive demographic information       1
Sensitive direct identifiers            1


When comparing departments, it became immediately clear that most personal data seemed to be processed in the department of Sustainable Development, and some also in the Human Geography and Spatial Planning department. This makes sense, as the research performed at the third department (Earth sciences) usually does not focus on human behaviour and thus does not involve much personal data, if any.

Current practices

The first part of the survey addressed the researchers’ current practices in handling personal data in their research.

Protective measures

UU-wide

With respect to organisational and technical measures used to protect personal data, most researchers indicated that they pseudonymise/anonymise their data. As this question was self-reported, we cannot assess whether the researchers’ data were actually sufficiently pseudonymised/anonymised in line with the GDPR. Many researchers also seemed to implement access control and encryption, and to complete a Data Management Plan (DMP) during their project(s). This is to be expected, as most funders nowadays require a DMP and many DMP templates explicitly address topics like pseudonymisation, access control and encryption. On the other hand, GDPR-specific assessments such as Data Protection Impact Assessments (DPIAs) or privacy reviews were used least. At the time of the survey, these assessments were still carried out only on a case-by-case basis.

Science

Below the protective and planning measures used by researchers of the Faculty of Science are visualised.

FSBS

Below the protective and planning measures used by researchers of the Faculty of Social and Behavioural Sciences (FSBS) are visualised.

Humanities

Below the protective and planning measures used by researchers of the Faculty of Humanities are visualised.

Veterinary Medicine

Below the protective and planning measures used by researchers of the Faculty of Veterinary Medicine are visualised.

LEG

Below the protective and planning measures used by researchers of the Faculty of Law, Economics and Governance (LEG) are visualised.

Geo

Below the protective and planning measures used by researchers of the Faculty of Geosciences are visualised.

Storage media

UU-wide

When working with personal data, it is important to choose a sufficiently secure storage medium. Fortunately, most researchers indicated that they relied on storage solutions that are provided or recommended by UU, which in most cases are indeed safe for storing personal data. Nonetheless, some researchers indicated that they used non-UU solutions, including cloud solutions that UU advises against using. Several researchers also indicated that they used other storage solutions, such as those of external institutions (e.g., University Medical Center, Trimbos Institute, Central Bureau of Statistics) and repositories (e.g., DANS EASY, CLARIAH).

Science

In the graph below, you can see how often different storage media were reportedly used by researchers from the Faculty of Science.

FSBS

In the graph below, you can see how often different storage media were reportedly used by researchers from the Faculty of Social and Behavioural Sciences (FSBS).

Humanities

In the graph below, you can see how often different storage media were reportedly used by researchers from the Faculty of Humanities.

Veterinary Medicine

In the graph below, you can see how often different storage media were reportedly used by researchers from the Faculty of Veterinary Medicine.

LEG

In the graph below, you can see how often different storage media were reportedly used by researchers from the Faculty of Law, Economics and Governance (LEG).

Geo

In the graph below, you can see how often different storage media were reportedly used by researchers from the Faculty of Geosciences.

Data Protection Impact Assessment (DPIA)

UU-wide

A Data Protection Impact Assessment (DPIA) is a legal instrument to assess the risks involved for data subjects, and helps determine the necessary safeguards to reduce those risks to an acceptable level. Despite it being an important legal instrument, most respondents indicated that they had never carried one out or even heard of it. A minority of the sample had heard of it or had completed one. Currently, the desired scenario is that researchers get help from a privacy officer when performing a DPIA, and the results seemed to indicate that this was indeed what happened in the majority of cases.

Science

Below the experience and help received with DPIAs is displayed for the Faculty of Science.

FSBS

Below the experience and help received with DPIAs is displayed for the Faculty of Social and Behavioural Sciences (FSBS).

Humanities

Below the experience and help received with DPIAs is displayed for the Faculty of Humanities.

Veterinary Medicine

Below the experience and help received with DPIAs is displayed for the Faculty of Veterinary Medicine.

LEG

Below the experience and help received with DPIAs is displayed for the Faculty of Law, Economics and Governance (LEG).

Geo

Below the experience and help received with DPIAs is displayed for the Faculty of Geosciences.

Data sharing practices

UU-wide

To investigate how often data were being shared and under which circumstances, we asked researchers whether and with which parties they typically share their research data, and which measures they usually take to do so securely. While some researchers indicated that they did not share their data at all, most researchers seemed to share research data only within the organisation, and otherwise within the European Economic Area (EEA). This is relatively good news, as such transfers usually require few, if any, additional safeguards. Responses in the “Other” category, however, suggest that a lot of data actually were shared, for example with co-authors at another institution, with students, or in “pseudonymised form”.

Concerning the measures used before sharing data, most researchers indicated that they pseudonymised their data, although we cannot assess the quality of such pseudonymisation. Researchers also seemed to use approved tools and agreements, and to provide access without transferring the data. Only a few researchers indicated that they involved a data expert while transferring data, and the use of Standard Contractual Clauses appeared limited.

Science

Below the data sharing practices across the Faculty of Science are visualised.

FSBS

Below the data sharing practices across the Faculty of Social and Behavioural Sciences (FSBS) are visualised.

Humanities

Below the data sharing practices across the Faculty of Humanities are visualised.

Veterinary Medicine

Below the data sharing practices across the Faculty of Veterinary Medicine are visualised.

LEG

Below the data sharing practices across the Faculty of Law, Economics and Governance (LEG) are visualised.

Geo

Below the data sharing practices across the Faculty of Geosciences are visualised.

Data publishing

UU-wide

Open science and privacy are often seen as conflicting: personal data cannot simply be shared, as sharing requires at the very least a valid legal basis and additional safeguards. In our experience, not many datasets containing personal data have therefore been shared for reuse to date, and the survey respondents seemed to confirm this. The majority of researchers indicated that they did not publish their data, or published them only in anonymised form (again, we cannot be certain whether the data were indeed fully anonymised).

The primary reasons for not publishing data appeared to be that researchers were still working on the data, did not want or need to publish their data, or could not anonymise the data. Other reasons given by the respondents included that publishing data was too much effort, that publication was undesirable, or that it was simply never considered.

Science

For the Faculty of Science, no researchers indicated that they published datasets, only metadata if applicable. Therefore, there were no respondents who filled out the question about the data format in which the data were published.

FSBS

Below you can see for the Faculty of Social and Behavioural Sciences (FSBS) in which format they published their data if they did (left), or if not, which reasons they had to not publish data (right).

Humanities

Below you can see for the Faculty of Humanities in which format they published their data if they did (left), or if not, which reasons they had to not publish data (right).

Veterinary Medicine

For the Faculty of Veterinary Medicine, no researchers indicated that they published datasets, only metadata if applicable. Therefore, there were no researchers who filled out the question about the data format in which the data were published.

LEG

Below you can see for the Faculty of Law, Economics and Governance in which format they published their data if they did (left), or if not, which reasons they had to not publish data (right).

Geo

For the Faculty of Geosciences, no researchers indicated that they published datasets, only metadata if applicable. Therefore, there were no researchers who filled out the question about the data format in which the data were published.

Existing support channels

The second part of the survey concerned the visibility and use of existing support channels.

Faculty privacy officer

When asked whether respondents knew who their faculty privacy officer was, a little over half of the researchers indicated that they did not (Yes: 64 (48%), No: 69 (52%)).
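The percentages above follow directly from the reported counts, as the short Python sketch below shows. It uses only the two counts given in the text; the function name is illustrative. Note that the counts sum to 133, so not all 176 respondents answered this question.

```python
def share(count, total):
    """Return a count as a whole-number percentage of the total."""
    return round(100 * count / total)

# Counts reported for the faculty privacy officer question.
yes, no = 64, 69
total = yes + no

print(total, share(yes, total), share(no, total))
```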

When comparing faculties (see below), it was striking that in the faculties where the most personal data seem to be processed, a small majority of researchers was not aware of who their faculty privacy officer was. This suggests either that these researchers had simply never required help from their privacy officer, or that the faculty privacy officers could increase their visibility within these faculties.

Looking for help

UU-wide

When asked whether researchers had ever looked for support in the form of information, tools, or in-person support, an overwhelming majority indicated that they had, as can be seen in the graph below. Most researchers who looked for support indicated, however, that they did not always find the support they were looking for. Together with the results from the previous question, this suggests that the visibility of the current support channels could be improved. Note, however, that there were some differences between faculties (see the faculty-specific sections).

Science

Below are the results for the Faculty of Science when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

FSBS

Below are the results for the Faculty of Social and Behavioural Sciences (FSBS) when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

Humanities

Below are the results for the Faculty of Humanities when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

Veterinary Medicine

Below are the results for the Faculty of Veterinary Medicine when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

LEG

Below are the results for the Faculty of Law, Economics and Governance (LEG) when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

Geo

Below are the results for the Faculty of Geosciences when asked whether researchers had looked for information, support or tools in handling personal data, and whether they had found what they were looking for:

Channels used to find support

UU-wide

The graph below indicates which channels researchers used most when looking for information about handling personal data. Researchers consulted all of the listed options to some extent, but usage differed considerably between channels. The university website and intranet were the most visited online resources for information about handling personal data. Notably, colleagues also appeared to play an important role in informing researchers about how to handle personal data. This suggests that well-informed researchers also help their colleagues become better informed. It also suggests that in-person support may be a more effective way of increasing awareness of privacy-related practices than more “distant” information sources.

Science

Below you can find the channels used by researchers from the Faculty of Science to find information about handling personal data:

FSBS

Below you can find the channels used by researchers from the Faculty of Social and Behavioural Sciences (FSBS) to find information about handling personal data:

Humanities

Below you can find the channels used by researchers from the Faculty of Humanities to find information about handling personal data:

Veterinary Medicine

Below you can find the channels used by researchers from the Faculty of Veterinary Medicine to find information about handling personal data:

LEG

Below you can find the channels used by researchers from the Faculty of Law, Economics and Governance (LEG) to find information about handling personal data:

Geo

Below you can find the channels used by researchers from the Faculty of Geosciences to find information about handling personal data:

Challenges and needs (survey)

UU-wide

As can be seen above, most researchers experienced privacy as an obstacle to open science and research data management in some way. It is therefore important to offer support in this area. Researchers' views on what this support should look like, however, varied somewhat. As can be seen below, accessible information and visible support channels were the most wanted improvements to the current support, closely followed by UU-wide policy on the topic and privacy-related walk-in hours.

Science

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Science:

FSBS

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Social and Behavioural Sciences (FSBS):

Humanities

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Humanities:

Veterinary Medicine

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Veterinary Medicine:

LEG

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Law, Economics and Governance (LEG):

Geo

Below you can find the preferred ways of improving privacy-related support as indicated by researchers from the Faculty of Geosciences:

Challenges and needs (open questions, meetings)

As mentioned in the Methods section, the responses to the following open questions and the notes taken during the one-on-one meetings with researchers were coded to allow for easier analysis. From the survey, the following questions were coded:

  • “Which challenges concerning the handling of personal data do you run into most often?”
  • “What specific information or tools about handling personal data are you missing from existing sources?”
  • “What can we do better to support you in handling personal data in research?” (responses to the “Other” option)

The codes that were assigned in both the survey and the meeting notes can be seen in the word cloud below: the larger the font, the more often the code was assigned. Please note that there is overlap in researchers between the open questions in the survey and the meeting notes, and therefore some codes may have been applied twice for the same researcher.
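To illustrate the overlap mentioned above, the sketch below tallies codes in two ways: once per assignment (so a researcher who raised the same point in both the survey and a meeting counts twice) and once per researcher. It is a minimal Python illustration, not the analysis code behind this report: the records, the researcher identifiers, and the linear font scaling in `font_sizes` are all hypothetical.

```python
from collections import Counter

# Hypothetical coded records: (researcher id, source, code). Not survey data.
assignments = [
    ("r1", "survey", "visibility"),
    ("r1", "meeting", "visibility"),
    ("r2", "survey", "visibility"),
    ("r2", "survey", "templates"),
    ("r3", "meeting", "bureaucracy"),
]

# Count every assignment: r1 contributes twice to "visibility".
per_assignment = Counter(code for _, _, code in assignments)

# Count each researcher at most once per code, which removes the overlap.
unique_pairs = {(researcher, code) for researcher, _, code in assignments}
per_researcher = Counter(code for _, code in unique_pairs)


def font_sizes(freqs, smallest=10, largest=40):
    """Scale font size linearly with frequency (an assumed, simple scaling)."""
    top = max(freqs.values())
    return {
        code: smallest + (largest - smallest) * n / top
        for code, n in freqs.items()
    }


print(per_assignment["visibility"], per_researcher["visibility"])
# prints: 3 2
```

In this toy example the most frequent code gets the largest font either way, but its count drops from 3 to 2 once each researcher is counted only once.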

UU-wide

Science

Below you can find the word cloud for the Faculty of Science only:

FSBS

Below you can find the word cloud for the Faculty of Social and Behavioural Sciences (FSBS) only:

Humanities

Below you can find the word cloud for the Faculty of Humanities only:

Veterinary Medicine

Below you can find the word cloud for the Faculty of Veterinary Medicine only:

LEG

Below you can find the word cloud for the Faculty of Law, Economics and Governance (LEG) only:

Geo

Below you can find the word cloud for the Faculty of Geosciences only:

Most common challenges and needs

Below we highlight the most frequently mentioned challenges and needs expressed by researchers in the survey (open questions) and one-on-one meetings, along with the number of times they were mentioned. A full summary of the results and recommendations can be found in the Recommendations report.

Visibility and findability

Researchers said it should be clear where to go for help, for example to which person or webpage (mentioned 27 times). Some researchers indicated that they did not know where to go for help with privacy-related matters; others mentioned that more support personnel should be available (mentioned 8 times). One researcher said they would prefer having a single place to go to for all data-related questions.

Closely related to visibility is findability. Many researchers pointed out that available information was difficult to find, or that it was unclear which source should be followed. As an example, one researcher from the Faculty of Humanities was convinced that they had found some useful information on the RDM Support website. Upon closer inspection, however, this information turned out to come from another data management-related website carrying the UU logo.

“UU offers an awful lot, but you have to visit many websites to find everything […] A clear overview of where to go for which information would be nice.”

Specific tools

Some (26) researchers expressed a need for a concrete tool, or for improvements to an existing one, such as:

  • Speech-to-text conversion tool (mentioned 6 times)
  • Tools to blur video images or otherwise pseudonymise video data (mentioned 4 times)
  • High-performance computing for critical personal data (mentioned 1 time)

Hands-on support

Many (24) researchers indicated that support staff could sometimes provide support in a more hands-on fashion, rather than abstract advice and telling researchers how not to do things. Some researchers added that privacy professionals sometimes had a tendency to cling to the letter of the law, leading to significant delays in their project, instead of looking at how to concretely solve existing issues in practice (mentioned 11 times):

“Because of these rules, we are sometimes holier than the Pope.”
“Actual getting-your-hands-dirty support: not the kind that tells you what to do, but also the kind that helps you by doing.”

Less bureaucracy

Processes were often experienced as time-inefficient, and sometimes longer and more bureaucratic than necessary (mentioned 22 times). For example, the DPIA process was mentioned explicitly 10 times, as well as having to fill out too many forms with overlapping content (e.g., Privacy Scan, Data Management Plan, DPIA). Some (9) researchers argued that (part of) this burden should be relieved or carried by support staff:

“Fewer actions aimed at training scientific staff on the subject matter, and more taking work off this group's hands.”
“Sharing data costs a lot of time and is inefficient when you do not do it often.”

Unclear processes and guidelines

Many (18) researchers complained that it was unclear what was expected of them when they processed personal data in their research, or that they would like to have more, or more practical guidelines on this topic (mentioned 14 times), for example on:

  • What steps do researchers need to take? (3 researchers)
  • Who should researchers ask for help in which situation?
  • Who is responsible or has the authority to make the right decisions when handling personal data? Do privacy officers/DPO have to be seen as gatekeepers (similar to an ethical committee) or as advisers? Do researchers have to follow the advice, or is that their own responsibility? (4 researchers)
  • When rules change, how does that affect what researchers need to do? (1 researcher)
  • How do processes and requirements differ for different types of research (e.g., student projects vs. large longitudinal projects)? (3 researchers)

Information and education

A large part of the respondents expressed a need for more (clear) information and education with respect to handling personal data in research. In general, researchers indicated that the information offered to them should be clearer (3 researchers), simpler (9 researchers), and consistent across resources (8 researchers), possibly offered in the form of templates (14 researchers). Fortunately, some researchers indicated that they were already happy with existing materials (4 researchers).

Information for specific research or data types

Many researchers indicated a need for more tailored information for specific types of data or research (9 researchers), for example for ethnographic data (mentioned 2 times), historical data (mentioned 2 times), or video data (mentioned 3 times).

“The information is simply far too generic; there should be templates per type of research.”
“Many templates and explanatory models talk about data, data packages and metadata, but those words are not embedded in historical research. Confusion quickly arises about what exactly historians are supposed to do with archival material in light of privacy.”

Frequently asked questions

A selection of researchers used the space in the open questions and/or the meetings to ask knowledge-related questions. Below are examples of the most commonly asked questions:

  • When are data still personal (mentioned 9 times), how to anonymise personal data (mentioned 3 times), and when are data anonymised sufficiently (mentioned 5 times)?
  • What are the privacy-related requirements for students? (mentioned 7 times)
  • How to store different types of personal data? (mentioned 6 times)
  • Data sharing: what data can be shared and with whom? (mentioned 7 times)
  • How to find a balance between informing data subjects too little vs. providing too much privacy-related information that will scare them off or hurt their trust in research? (mentioned 5 times)
  • How to balance open science and privacy? (mentioned 5 times). Related: when is reusing data for different purposes allowed? (e.g., education data, data collected by students, mentioned 4 times)
  • How to collaborate with multiple institutions? (mentioned 4 times)

Education and onboarding

The need for more educational resources was also recognised by a selection of the researchers. Concretely, they said, researchers could be educated more in the following ways:

  • Privacy and/or research data management as part of the master or PhD curriculum (mentioned 8 times)
  • Privacy as part of the onboarding procedure of new employees (mentioned 5 times)
  • Mandatory privacy training for supervisors, principal investigators, professors, and/or teachers (mentioned 4 times)
  • A course on how to handle personal data in research (mentioned 2 times)

“There is no one who tells at the start of your PhD how you should handle your data. […] I think new PhD students should get a basic course on data management and privacy.”

… or simply no issues

Notably, there were also researchers who indicated that they had not run into issues (yet), or that they had received sufficient and useful help (mentioned 9 times). For example:

“The data manager and privacy officer of the faculty of humanities help a lot. This support is essential!”
“So far I have not had many problems. Our department's institutional review board always looks critically at research proposals, in particular also at the handling of personal data.”

Summary

A full summary of the results and recommendations can be found in the Recommendations report.

Discussion

In order to interpret the results described in this report correctly, there are a few matters that need to be taken into account:

  • The respondents of the survey represented a selection of all of UU’s scientific personnel. It is likely that at least part of the respondents had a specific reason to participate. For example, they may have explicitly wanted to raise their voice because of bad experiences they had had. Although it is important that these concerns be heard, they should not be interpreted as speaking for all UU researchers.
  • It is important to note that we, the writers of this report, cannot be considered fully objective in interpreting the survey results. The feedback received via this survey also involves our own services and possibly work performance. Although we have attempted to interpret the results of the survey without bias, it is possible that we have introduced some, especially where the open questions and one-on-one meetings are concerned.

Technical information

This report was created in R markdown, and was last generated on 2023-05-22. It was created in the following local environment:

## R version 4.3.0 (2023-04-21 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19042)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=Dutch_Netherlands.utf8  LC_CTYPE=Dutch_Netherlands.utf8   
## [3] LC_MONETARY=Dutch_Netherlands.utf8 LC_NUMERIC=C                      
## [5] LC_TIME=Dutch_Netherlands.utf8    
## 
## time zone: Europe/Amsterdam
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] wordcloud2_0.2.1      readxl_1.4.2          kableExtra_1.3.4.9000
##  [4] knitr_1.42            gridExtra_2.3         lubridate_1.9.2      
##  [7] forcats_1.0.0         stringr_1.5.0         dplyr_1.1.2          
## [10] purrr_1.0.1           readr_2.1.4           tidyr_1.3.0          
## [13] tibble_3.2.1          ggplot2_3.4.2         tidyverse_2.0.0      
## [16] data.table_1.14.8    
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.6        utf8_1.2.3        generics_0.1.3    xml2_1.3.4       
##  [5] stringi_1.7.12    hms_1.1.3         digest_0.6.31     magrittr_2.0.3   
##  [9] evaluate_0.20     grid_4.3.0        timechange_0.2.0  fastmap_1.1.1    
## [13] cellranger_1.1.0  jsonlite_1.8.4    httr_1.4.6        rvest_1.0.3      
## [17] fansi_1.0.4       viridisLite_0.4.2 scales_1.2.1      jquerylib_0.1.4  
## [21] cli_3.6.1         rlang_1.1.1       ellipsis_0.3.2    munsell_0.5.0    
## [25] withr_2.5.0       cachem_1.0.8      yaml_2.3.7        tools_4.3.0      
## [29] tzdb_0.4.0        colorspace_2.1-0  webshot_0.5.4     vctrs_0.6.2      
## [33] R6_2.5.1          lifecycle_1.0.3   htmlwidgets_1.6.2 pkgconfig_2.0.3  
## [37] pillar_1.9.0      bslib_0.4.2       gtable_0.3.3      glue_1.6.2       
## [41] systemfonts_1.0.4 highr_0.10        xfun_0.39         tidyselect_1.2.0 
## [45] rstudioapi_0.14   farver_2.1.1      htmltools_0.5.5   labeling_0.4.2   
## [49] rmarkdown_2.21    svglite_2.1.1     compiler_4.3.0

  1. Research Data Management Support, Utrecht University, ORCID: 0000-0003-3282-8083

  2. Research Data Management Support, Utrecht University, ORCID: 0000-0003-1412-4402

  3. The Data Privacy Project was funded by Utrecht University’s Research IT program and a Digital Competence Center grant from the Dutch Organization for Scientific Research (NWO).