Searching for Data
What
As open science becomes the standard in academia, more (meta)data are published in repositories across the web. Before creating new data, it is therefore worth searching for existing datasets that may be reused or built upon.
Why
Searching for existing data can save time and resources while enabling comparison and improvements in research methodology and data quality. That said, existing datasets were often collected for a specific study design, so you may need to adapt your research questions or methods to ensure compatibility.
Who
Researchers at all career stages can benefit from searching for existing data. Librarians and data stewards can provide support in identifying suitable repositories and search strategies.
When
Searching for data is especially useful in the early stages of a project, such as during proposal writing or study design. However, it can also be valuable later on, for example when validating results, performing comparisons, or extending analyses.
Where
Data can be deposited in generic or domain-specific repositories. They may also be curated and described within catalogues.
Examples of generic repositories include DataverseNL, Yoda, and Zenodo.
For the social sciences and humanities, national infrastructures such as ODISSEI and CLARIAH serve as catalogues that provide access to datasets.
Registries such as re3data and FAIRsharing can help you identify suitable repositories for your discipline.
In addition, the following discovery platforms can help you find datasets across repositories:
- OpenAlex: a database containing a wide range of scholarly outputs, including datasets.
- DataCite Commons: a discovery service aimed at making research outputs findable and connected.
- OpenAIRE and its Dutch counterpart, the Netherlands Research Portal.
- Open Data Europe: a portal providing access to a large collection of datasets from the EU and beyond.
How
You can search for data by constructing Boolean search queries, just as you would in a systematic literature search. The Boolean operators AND, OR, and NOT combine concepts and broaden or narrow the results: OR groups synonyms, AND requires every concept to be present, and NOT excludes unwanted terms. Wildcards (e.g. *) capture variations of a search term.
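As a rough illustration, a Boolean query can be assembled programmatically from lists of synonyms. The sketch below is only one way to do this: the function names `any_of` and `build_query` are hypothetical, the search terms are invented examples, and the exact syntax each platform accepts (quoting, wildcards, operator spelling) may differ, so check the search help of the repository you are using.

```python
def quote(term):
    """Wrap multi-word phrases in double quotes so they are searched as a unit."""
    return f'"{term}"' if " " in term else term


def any_of(terms):
    """Combine synonyms for ONE concept with OR, wrapped in parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_query(concepts, exclude=()):
    """AND the concept groups together, then append NOT for each excluded term."""
    query = " AND ".join(any_of(group) for group in concepts)
    for term in exclude:
        query += f" NOT {quote(term)}"
    return query


# Illustrative terms only (assumption): two concepts, each with synonyms,
# where * stands for a wildcard that matches any word ending.
query = build_query(
    concepts=[["loneliness", "social isolation"], ["adolescen*", "teen*"]],
    exclude=["simulation"],
)
print(query)
```

The resulting string can then be pasted into the search box of a repository or discovery platform. Keeping each concept in its own list makes it easy to add synonyms (broadening the search) or add a further concept group (narrowing it).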