Data Use Conditions


Data Use Ontology at EGA

The EGA is committed to its involvement in the work of GA4GH. To enhance data discoverability and streamline data access, the EGA has implemented the Data Use Ontology (DUO), which is based on the consent codes described in Dyke et al. 2017. DUO codes are displayed on the live dataset page of your submission to advise any would-be requestor on how the data can be used. They also improve discoverability, because users can search on these codes to find applicable datasets.

The table below lists the current DUO codes. Add them to the policy section of your submission in Webin, or use them in your XML when submitting programmatically. These terms are verified against the current version here.

For each policy, please select at most one primary code and, if appropriate, any number of secondary category codes from the table below.

  Term | Label | Description
  DUO:0000001 | data use permission | A data item that is used to indicate consent permissions for datasets and/or materials, and relates to the purposes for which datasets and/or material might be removed, stored or used.
  DUO:0000004 | no restriction | This consent code primary category indicates there is no restriction on use.
  DUO:0000006 | health/medical/biomedical research and clinical care | This primary category consent code indicates that use is allowed for health/medical/biomedical purposes; does not include the study of population origins or ancestry.
  DUO:0000007 | disease-specific research and clinical care | This primary category consent code indicates that use is allowed provided it is related to the specified disease.
  DUO:0000011 | population origins or ancestry research | This primary category consent code indicates that use of the data is limited to the study of population origins or ancestry.
  DUO:0000012 | research-specific restrictions | This secondary category consent code indicates that use is limited to studies of a certain research type.
  DUO:0000015 | no general methods research | This secondary category consent code indicates that use includes methods development research (e.g., development of software or algorithms) only within the bounds of other use limitations.
  DUO:0000016 | genetic studies only | This secondary category consent code indicates that use is limited to genetic studies only (i.e., no phenotype-only research).
  DUO:0000017 | data use modifier | Data use modifiers indicate additional conditions for use.
  DUO:0000018 | not-for-profit use only | This requirement indicates that use is limited to not-for-profit organizations.
  DUO:0000019 | publication required | This requirement indicates that requestor agrees to make results of studies using the data available to the larger scientific community.
  DUO:0000020 | collaboration required | This requirement indicates that the requestor must agree to collaboration with the primary study investigator(s).
  DUO:0000021 | ethics approval required | This requirement indicates that the requestor must provide documentation of local IRB/ERB approval.
  DUO:0000022 | geographical restriction | This requirement indicates that use is limited to within a specific geographic region.
  DUO:0000024 | publication moratorium | This requirement indicates that requestor agrees not to publish results of studies until a specific date.
  DUO:0000025 | time limit on use | This requirement indicates that use is approved for a specific number of months.
  DUO:0000026 | user-specific restriction | This requirement indicates that use is limited to use by approved users.
  DUO:0000027 | project-specific restriction | This requirement indicates that use is limited to use within an approved project.
  DUO:0000028 | institution-specific restriction | This requirement indicates that use is limited to use within an approved institution.
  DUO:0000029 | return to database/resource | This requirement indicates that the requestor must return derived/enriched data to the database/resource.
  DUO:0000042 | General Research Use | This data use limitation indicates that use is allowed for health/medical/biomedical purposes and other biological research, including the study of population origins or ancestry.
  DUO:0000043 | clinical care use | This data use modifier indicates that use is allowed for clinical use and care.
  DUO:0000044 | population origins or ancestry research prohibited | This data use modifier indicates use for purposes of population, origin, or ancestry research is prohibited.
  DUO:0000045 | not for profit organisation use only | This data use modifier indicates that use of the data is limited to not-for-profit organizations.
  DUO:0000046 | non-commercial use only | This data use modifier indicates that use of the data is limited to not-for-profit use.
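
The selection rule (at most one primary code per policy, any number of secondary codes) can be checked before codes are sent to the helpdesk or written into XML. The Python sketch below is illustrative only. The function name check_policy_codes is hypothetical, and the set of primary codes is an assumption taken solely from the table rows whose descriptions say "primary category"; EGA's own validation may differ.

```python
# Assumption: only the codes whose table descriptions mention
# "primary category" are treated as primary here.
PRIMARY_CODES = {"DUO:0000004", "DUO:0000006", "DUO:0000007", "DUO:0000011"}


def check_policy_codes(codes):
    """Return a list of problems found in one policy's DUO code selection.

    An empty list means the selection satisfies the
    "at most one primary code" rule.
    """
    problems = []
    primaries = [code for code in codes if code in PRIMARY_CODES]
    if len(primaries) > 1:
        problems.append(
            "more than one primary code selected: " + ", ".join(primaries)
        )
    return problems


# One primary code plus two secondary codes: no problems reported.
print(check_policy_codes(["DUO:0000007", "DUO:0000019", "DUO:0000021"]))

# Two primary codes: one problem reported.
print(check_policy_codes(["DUO:0000004", "DUO:0000006"]))
```

The check deliberately does nothing beyond counting primary codes; it does not confirm that a code exists in the current DUO version.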

Note: For the consent code DUO_0000007, where data use is restricted to a specific disease, please accompany the code with an appropriate ontology term from MONDO. For example, if the data are restricted to research into juvenile idiopathic arthritis, the code should be displayed as DUO_0000007; MONDO:0011429


Submission via Webin or Submission Portal

Once you have chosen the appropriate codes from the table above, please contact the EGA helpdesk with the following details:

  • The box to which you are submitting
  • The policy accession to which the codes should be added

We will then make the necessary changes so that the codes are added to the policy and displayed on our website.


Programmatic Submissions via XML and REST

If you are submitting programmatically using XML, follow the examples below. The DATA_USES element stores the DUO codes, with one DATA_USE element per code. If a DUO code has no modifiers (that is, no additional ontology term is used to qualify it), it should be written as follows:

<DATA_USES>
   <DATA_USE ontology="DUO" code="0000014" version="17-07-2016"/>
</DATA_USES>

The version attribute identifies the build of DUO from which the term is taken, and defaults to the most recent version. It is needed in case the term changes in a future version of DUO.
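
For submitters who generate their XML from a script, the unmodified form can be produced with Python's standard xml.etree.ElementTree module. This is a minimal sketch rather than an EGA-provided tool: the function name build_data_uses is hypothetical, and the element and attribute names are copied from the example above.

```python
import xml.etree.ElementTree as ET


def build_data_uses(codes, version):
    """Build a DATA_USES element holding one unmodified DATA_USE per DUO code.

    `codes` are the numeric parts of the DUO terms (for example "0000014");
    `version` is the DUO build string to record on every DATA_USE.
    """
    data_uses = ET.Element("DATA_USES")
    for code in codes:
        ET.SubElement(
            data_uses,
            "DATA_USE",
            {"ontology": "DUO", "code": code, "version": version},
        )
    return data_uses


# Reproduce the single-code example, using the version string shown there.
element = build_data_uses(["0000014"], "17-07-2016")
print(ET.tostring(element, encoding="unicode"))
```

Serialising through a library rather than by string concatenation avoids hand-written markup errors such as missing whitespace between attributes.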

A DUO code can be qualified with a modifier to provide further information on a data use condition, such as an EFO term that narrows the condition to a specific disease. For example, DUO:0000007 is modified below with two EFO terms:

<DATA_USE ontology="DUO" code="0000007" version="17-07-2016">
  <MODIFIER>
    <DB>EFO</DB>
    <ID>0001645</ID>
  </MODIFIER>
  <MODIFIER>
    <DB>EFO</DB>
    <ID>0001655</ID>
  </MODIFIER>
</DATA_USE>
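
The modified form can be generated in the same way. The sketch below, again using Python's standard xml.etree.ElementTree module, attaches one MODIFIER child per (DB, ID) pair. The function name build_data_use and the tuple layout are hypothetical choices made for illustration; the element names come from the example above.

```python
import xml.etree.ElementTree as ET


def build_data_use(code, version, modifiers=()):
    """Build one DATA_USE element, optionally qualified by modifiers.

    `modifiers` is an iterable of (db, term_id) pairs, for example
    ("EFO", "0001645"). Each pair becomes one MODIFIER child element
    containing a DB element and an ID element.
    """
    data_use = ET.Element(
        "DATA_USE", {"ontology": "DUO", "code": code, "version": version}
    )
    for db, term_id in modifiers:
        modifier = ET.SubElement(data_use, "MODIFIER")
        ET.SubElement(modifier, "DB").text = db
        ET.SubElement(modifier, "ID").text = term_id
    return data_use


# DUO:0000007 qualified by the two EFO terms used in the example.
element = build_data_use(
    "0000007", "17-07-2016", [("EFO", "0001645"), ("EFO", "0001655")]
)
print(ET.tostring(element, encoding="unicode"))
```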

In addition, the URL element can be used to reference a specific URL related to the data use, such as a specific version of an ontology or an associated link.

The final XML, containing two separate DUO codes to describe a data use condition, would look like this:

<POLICY_SET>
  <POLICY alias="ena-POLICY-BABRAHAM-23-03-2017-09:47:38:853-62" center_name="BABRAHAM" accession="EGAP00001000615" broker_name="EGA">
    <IDENTIFIERS>
      <PRIMARY_ID>EGAP00001000615</PRIMARY_ID>
      <SUBMITTER_ID namespace="BABRAHAM">ena-POLICY-BABRAHAM-23-03-2017-09:47:38:853-62</SUBMITTER_ID>
    </IDENTIFIERS>
    <TITLE>Data Access Agreement for PCHiC, RNA-Seq, ChIP-Seq</TITLE>
    <DAC_REF accession="EGAC00001000523">
      <IDENTIFIERS>
        <PRIMARY_ID>EGAC00001000523</PRIMARY_ID>
      </IDENTIFIERS>
    </DAC_REF>
    <POLICY_FILE>ftp://ftp.ebi.ac.uk/pub/contrib/pchic/EGA_Data_Access_Request_DIL.docx</POLICY_FILE>
    <DATA_USES>
      <DATA_USE ontology="DUO" code="0000007" version="17-07-2016">
        <MODIFIER>
          <DB>EFO</DB>
          <ID>0001645</ID>
        </MODIFIER>
        <MODIFIER>
          <DB>EFO</DB>
          <ID>0001655</ID>
        </MODIFIER>
      </DATA_USE>
      <DATA_USE ontology="DUO" code="0000014" version="17-07-2016"/>
    </DATA_USES>
  </POLICY>
</POLICY_SET>
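
Before submitting, it can be useful to confirm that a policy document parses as XML and carries the intended codes. The sketch below reads a policy set and lists each DUO code with its modifiers. It is an illustration under stated assumptions: the function name list_data_uses is hypothetical, the sample string is a shortened stand-in for a full policy, and the check covers well-formedness only, not validity against EGA's schema.

```python
import xml.etree.ElementTree as ET


def list_data_uses(policy_xml):
    """Return (code, modifiers) pairs for every DATA_USE in a policy document.

    ET.fromstring raises xml.etree.ElementTree.ParseError if the text is
    not well-formed XML, so a successful call also confirms that the
    markup can be parsed.
    """
    root = ET.fromstring(policy_xml)
    results = []
    for data_use in root.iter("DATA_USE"):
        code = "{}:{}".format(data_use.get("ontology"), data_use.get("code"))
        modifiers = [
            "{}:{}".format(m.findtext("DB"), m.findtext("ID"))
            for m in data_use.findall("MODIFIER")
        ]
        results.append((code, modifiers))
    return results


# Shortened stand-in for a policy document (identifiers and title omitted).
sample = """<POLICY_SET>
  <POLICY accession="EGAP00001000615">
    <DATA_USES>
      <DATA_USE ontology="DUO" code="0000007" version="17-07-2016">
        <MODIFIER><DB>EFO</DB><ID>0001645</ID></MODIFIER>
        <MODIFIER><DB>EFO</DB><ID>0001655</ID></MODIFIER>
      </DATA_USE>
      <DATA_USE ontology="DUO" code="0000014" version="17-07-2016"/>
    </DATA_USES>
  </POLICY>
</POLICY_SET>"""

for code, modifiers in list_data_uses(sample):
    print(code, modifiers)
```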