Data Use Conditions
Data Use Ontology at EGA
The EGA is committed to its involvement in the work of GA4GH. To enhance data discoverability and streamline data access, the EGA has implemented the Data Use Ontology (DUO), based on consent codes as described in Dyke et al. 2017. The DUO codes are displayed on the live dataset page of your submission. They advise any would-be requester on how the data can be used, and they make datasets easier to find, because users can search on these codes to find applicable datasets.
The table below lists the current DUO codes. Add them to the policy section of your submission in Webin, or use them in your XML when submitting programmatically. These terms are verified against the current version here.
For each policy, please select at most one primary code and, if appropriate, any number of secondary category codes from the table below.
Term | Label | Description |
---|---|---|
DUO:0000001 | data use permission | A data item that is used to indicate consent permissions for datasets and/or materials, and relates to the purposes for which datasets and/or material might be removed, stored or used. |
DUO:0000004 | no restriction | This consent code primary category indicates there is no restriction on use. |
DUO:0000006 | health/medical/biomedical research and clinical care | This primary category consent code indicates that use is allowed for health/medical/biomedical purposes; does not include the study of population origins or ancestry. |
DUO:0000007 | disease-specific research and clinical care | This primary category consent code indicates that use is allowed provided it is related to the specified disease. |
DUO:0000011 | population origins or ancestry research | This primary category consent code indicates that use of the data is limited to the study of population origins or ancestry. |
DUO:0000012 | research-specific restrictions | This secondary category consent code indicates that use is limited to studies of a certain research type. |
DUO:0000015 | no general methods research | This secondary category consent code indicates that use includes methods development research (e.g., development of software or algorithms) only within the bounds of other use limitations. |
DUO:0000016 | genetic studies only | This secondary category consent code indicates that use is limited to genetic studies only (i.e., no phenotype-only research). |
DUO:0000017 | data use modifier | Data use modifiers indicate additional conditions for use. |
DUO:0000018 | not-for-profit use only | This requirement indicates that use is limited to not-for-profit organizations. |
DUO:0000019 | publication required | This requirement indicates that requestor agrees to make results of studies using the data available to the larger scientific community. |
DUO:0000020 | collaboration required | This requirement indicates that the requestor must agree to collaboration with the primary study investigator(s). |
DUO:0000021 | ethics approval required | This requirement indicates that the requestor must provide documentation of local IRB/ERB approval. |
DUO:0000022 | geographical restriction | This requirement indicates that use is limited to within a specific geographic region. |
DUO:0000024 | publication moratorium | This requirement indicates that requestor agrees not to publish results of studies until a specific date. |
DUO:0000025 | time limit on use | This requirement indicates that use is approved for a specific number of months. |
DUO:0000026 | user-specific restriction | This requirement indicates that use is limited to use by approved users. |
DUO:0000027 | project-specific restriction | This requirement indicates that use is limited to use within an approved project. |
DUO:0000028 | institution-specific restriction | This requirement indicates that use is limited to use within an approved institution. |
DUO:0000029 | return to database/resource | This requirement indicates that the requestor must return derived/enriched data to the database/resource. |
DUO:0000042 | General Research Use | This data use limitation indicates that use is allowed for health/medical/biomedical purposes and other biological research, including the study of population origins or ancestry. |
DUO:0000043 | clinical care use | This data use modifier indicates that use is allowed for clinical use and care. |
DUO:0000044 | population origins or ancestry research prohibited | This data use modifier indicates use for purposes of population, origin, or ancestry research is prohibited. |
DUO:0000045 | not for profit organisation use only | This data use modifier indicates that use of the data is limited to not-for-profit organizations. |
DUO:0000046 | non-commercial use only | This data use modifier indicates that use of the data is limited to not-for-profit use. |
Note: For the consent code DUO_0000007, where data is restricted to use on a specific disease, please accompany the code with an appropriate ontology term from MONDO. For example, if the data is restricted to research into juvenile idiopathic arthritis, the code should be displayed as DUO_0000007; MONDO:0011429
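The selection rule and the note above can be checked mechanically before contacting the helpdesk. The following is a minimal sketch, not an EGA tool: the names `PRIMARY_CODES`, `check_policy_codes`, and `disease_specific` are hypothetical, and the set of primary codes is an assumption that includes only the terms the table explicitly describes as "primary category".

```python
# Assumption: only codes the table explicitly labels "primary category" are
# treated as primary here; extend the set if your policy uses other terms.
PRIMARY_CODES = {
    "DUO:0000004",  # no restriction
    "DUO:0000006",  # health/medical/biomedical research and clinical care
    "DUO:0000007",  # disease-specific research and clinical care
    "DUO:0000011",  # population origins or ancestry research
}


def check_policy_codes(codes):
    """Return a list of problems found in one policy's DUO code selection."""
    problems = []
    primaries = [c for c in codes if c in PRIMARY_CODES]
    if len(primaries) > 1:
        problems.append("more than one primary code: " + ", ".join(primaries))
    return problems


def disease_specific(mondo_id):
    """Pair DUO_0000007 with a MONDO term in the display form given in the note."""
    return f"DUO_0000007; {mondo_id}"


print(check_policy_codes(["DUO:0000006", "DUO:0000018", "DUO:0000019"]))
print(disease_specific("MONDO:0011429"))
```

An empty list from `check_policy_codes` means the selection has at most one primary code; any number of other codes is accepted.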
Submission via Webin or Submission Portal
Once you have chosen the appropriate codes from above, please contact the EGA helpdesk with the following details:
- The box to which you are submitting
- The policy accession to which the codes should be added
Programmatic Submissions via XML and REST
If you are submitting programmatically using XML, follow the example below. The DATA_USES element stores the DUO codes, one DATA_USE element per code. If the DUO code has no modifiers (i.e., no additional ontology is used to modify the term), it should be written as shown here:
<DATA_USES>
  <DATA_USE ontology="DUO" code="0000014" version="17-07-2016"/>
</DATA_USES>
Here, version identifies the DUO build from which the term is taken, and defaults to the most recent version. The version is required in case the term changes in a future version of DUO.
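When the XML is generated by a script, the unmodified case can be produced with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch under stated assumptions: the helper name `data_uses` is hypothetical, and the default version string is simply copied from the example above rather than taken from any current DUO release.

```python
import xml.etree.ElementTree as ET


def data_uses(codes, version="17-07-2016"):
    """Build a DATA_USES element holding one unmodified DATA_USE per code."""
    root = ET.Element("DATA_USES")
    for code in codes:
        # Attribute names mirror the example: ontology, code, version.
        ET.SubElement(
            root,
            "DATA_USE",
            {"ontology": "DUO", "code": code, "version": version},
        )
    return root


xml_text = ET.tostring(data_uses(["0000014"]), encoding="unicode")
print(xml_text)
```

Building the element tree rather than concatenating strings guarantees that the attributes are separated and quoted correctly.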
A modifier can be attached to a DUO code to provide further information on a data use condition, such as an EFO term that restricts the condition to a specific disease. In the example below, DUO:0000007 is modified with two EFO terms:
<DATA_USE ontology="DUO" code="0000007" version="17-07-2016">
  <MODIFIER>
    <DB>EFO</DB>
    <ID>0001645</ID>
  </MODIFIER>
  <MODIFIER>
    <DB>EFO</DB>
    <ID>0001655</ID>
  </MODIFIER>
</DATA_USE>
In addition, the URL element can be used to reference a specific URL related to the data use, such as a specific version of an ontology or an associated link.
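A modified data use can be generated in the same way as it is written by hand. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `data_use_with_modifiers` is hypothetical, the default version string is copied from the example, and the URL element is left out because its position in the XML is not shown here.

```python
import xml.etree.ElementTree as ET


def data_use_with_modifiers(code, modifiers, version="17-07-2016"):
    """Build a DATA_USE element with one MODIFIER child per (db, id) pair."""
    use = ET.Element(
        "DATA_USE", {"ontology": "DUO", "code": code, "version": version}
    )
    for db, term_id in modifiers:
        modifier = ET.SubElement(use, "MODIFIER")
        ET.SubElement(modifier, "DB").text = db
        ET.SubElement(modifier, "ID").text = term_id
    return use


use = data_use_with_modifiers(
    "0000007", [("EFO", "0001645"), ("EFO", "0001655")]
)
ET.indent(use)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(use, encoding="unicode"))
```

Term identifiers are passed as strings so that leading zeros are preserved.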
The final XML, containing two separate DUO codes (the two DATA_USE elements) to describe a data use condition, would look like this:
<POLICY_SET>
  <POLICY alias="ena-POLICY-BABRAHAM-23-03-2017-09:47:38:853-62" center_name="BABRAHAM" accession="EGAP00001000615" broker_name="EGA">
    <IDENTIFIERS>
      <PRIMARY_ID>EGAP00001000615</PRIMARY_ID>
      <SUBMITTER_ID namespace="BABRAHAM">ena-POLICY-BABRAHAM-23-03-2017-09:47:38:853-62</SUBMITTER_ID>
    </IDENTIFIERS>
    <TITLE>Data Access Agreement for PCHiC, RNA-Seq, ChIP-Seq</TITLE>
    <DAC_REF accession="EGAC00001000523">
      <IDENTIFIERS>
        <PRIMARY_ID>EGAC00001000523</PRIMARY_ID>
      </IDENTIFIERS>
    </DAC_REF>
    <POLICY_FILE>ftp://ftp.ebi.ac.uk/pub/contrib/pchic/EGA_Data_Access_Request_DIL.docx</POLICY_FILE>
    <DATA_USES>
      <DATA_USE ontology="DUO" code="0000007" version="17-07-2016">
        <MODIFIER>
          <DB>EFO</DB>
          <ID>0001645</ID>
        </MODIFIER>
        <MODIFIER>
          <DB>EFO</DB>
          <ID>0001655</ID>
        </MODIFIER>
      </DATA_USE>
      <DATA_USE ontology="DUO" code="0000014" version="17-07-2016"/>
    </DATA_USES>
  </POLICY>
</POLICY_SET>
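Before submitting, it can help to read the policy XML back and confirm which data uses it declares. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`. The function name `list_data_uses` is hypothetical, and `POLICY_XML` is a trimmed copy of the policy above that keeps only the accession and the DATA_USES block.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example policy: only the parts read below are kept.
POLICY_XML = """
<POLICY_SET>
  <POLICY accession="EGAP00001000615">
    <DATA_USES>
      <DATA_USE ontology="DUO" code="0000007" version="17-07-2016">
        <MODIFIER><DB>EFO</DB><ID>0001645</ID></MODIFIER>
        <MODIFIER><DB>EFO</DB><ID>0001655</ID></MODIFIER>
      </DATA_USE>
      <DATA_USE ontology="DUO" code="0000014" version="17-07-2016"/>
    </DATA_USES>
  </POLICY>
</POLICY_SET>
"""


def list_data_uses(policy_xml):
    """Return (accession, term, modifiers) tuples for every DATA_USE found."""
    found = []
    for policy in ET.fromstring(policy_xml).iter("POLICY"):
        for use in policy.iter("DATA_USE"):
            term = f'{use.get("ontology")}:{use.get("code")}'
            modifiers = [
                f'{m.findtext("DB")}:{m.findtext("ID")}'
                for m in use.findall("MODIFIER")
            ]
            found.append((policy.get("accession"), term, modifiers))
    return found


for accession, term, modifiers in list_data_uses(POLICY_XML):
    print(accession, term, modifiers)
```

A parse error at this stage usually points to malformed markup, such as attributes that are not separated by whitespace or a DATA_USE element that is never closed.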