A persistent identifier (PID) can be assigned to a dataset or other research object to maximize its longevity. It consists of a unique ID and a service that keeps the ID pointing to the object, even if the object changes location over time. PIDs provide findability, persistence and authenticity to digital research objects and allow interoperability between collections and repositories. Because of this, PIDs are essential for an object to be FAIR. In practice, researchers use PIDs to reference datasets and other research outputs in academic and other types of publications.
Several PID systems exist, including DOI, Handle, ARK and PURL. Some repositories use one of these systems, while others mint their own PIDs upon submission. Data repositories indexed at re3data.org can be filtered by PID system.
Digital Object Identifier
The Digital Object Identifier (DOI) is the most widely adopted PID for research objects and publications. DOIs are minted and maintained by various Registration Agencies (RAs).
Academic publishers are either members of an RA or have arrangements with one, and provide DOIs for their publications. Important DOI Registration Agencies include CrossRef, which assigns DOIs to academic publications, and DataCite, which assigns them to research data and other research outputs.
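To illustrate how a DOI pairs an identifier with a resolver service, the following minimal Python sketch turns a DOI string into a link through the public doi.org resolver. The function names and the loose syntax check are assumptions made for this example, not part of any official DOI tooling, and the DOI used in the usage note is hypothetical.

```python
import re
from urllib.parse import quote

# Loose shape check (an assumption for this sketch, not the full DOI syntax):
# "10.", a numeric registrant code, a slash, then a non-empty suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes that people commonly put in front of a bare DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip a leading 'doi:' or resolver URL and check the basic shape."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI-shaped string: {raw!r}")
    return doi


def doi_to_url(raw: str) -> str:
    """Build a resolver link; the resolver redirects to the current location."""
    return "https://doi.org/" + quote(normalize_doi(raw), safe="/")
```

For a hypothetical identifier, doi_to_url("doi:10.1234/example.5678") produces a link under https://doi.org/. The link stays valid even if the object later moves, because only the record held by the resolver needs updating.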
Many repositories work with their own persistent identifier systems. For example, ENA assigns accession numbers that hierarchically identify the different levels of a submission (project, sample, assay, etc.); see the ENA documentation for details. Similar identifier systems are used by other EBI repositories, such as UniProt, ArrayExpress, MetaboLights, BioModels, EMPIAR and EGA. Accession numbers are assigned by the repository as part of the dataset submission process.
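As a rough sketch of how hierarchical accession numbers can be told apart, the Python snippet below maps an accession string to a submission level by its prefix. The patterns are simplified assumptions modelled on ENA's published accession formats and should be checked against the current ENA documentation before use. The function name and the accessions in the usage note are hypothetical.

```python
import re
from typing import Optional

# Simplified, assumed patterns: a level name paired with a prefix-based regex.
ACCESSION_PATTERNS = [
    ("project", re.compile(r"^PRJ[EDN][A-Z]\d+$")),
    ("study", re.compile(r"^[EDS]RP\d{6,}$")),
    ("sample", re.compile(r"^[EDS]RS\d{6,}$")),
    ("experiment", re.compile(r"^[EDS]RX\d{6,}$")),
    ("run", re.compile(r"^[EDS]RR\d{6,}$")),
]


def classify_accession(accession: str) -> Optional[str]:
    """Return the submission level an accession appears to belong to, or None."""
    candidate = accession.strip().upper()
    for level, pattern in ACCESSION_PATTERNS:
        if pattern.match(candidate):
            return level
    return None
```

With these assumed patterns, classify_accession("PRJEB12345") returns "project" and classify_accession("ERR123456") returns "run", while a string that matches no pattern returns None.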
How to add identifiers to datasets
Accession numbers and DOIs are minted and assigned to a digital research object by the repository or broker as part of the submission process.
Tools and tutorials
The PID Guide helps users select a suitable PID system.