Unaligned/Raw read sequence submission

All metadata required for raw (unaligned) reads can be submitted and edited using EGA Webin, which should be sufficient for most submitters.

Large-scale and/or frequent submitters may wish to submit their metadata programmatically to our REST server.

**Metadata submitted as xmls or through the Webin tool will be made publicly available to view on the EGA website and other EBI resource/partner websites**

Unaligned/Raw read metadata


The metadata objects required for read submissions are as follows:

Study: information about the sequencing study

Samples: information about the sequencing samples

Experiment: information about the libraries, platform; associated with study, sample(s) and run(s)

DAC: contains information about the Data Access Committee (DAC)

Policy: contains the Data Access Agreement (DAA); associated with DAC

Dataset: contains the collection of runs/analysis data files to be subject to controlled access; associated with Policy

**Study, samples, DAC and policy metadata can all be registered prior to uploading files**
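The associations between these objects can be pictured as a small data model. The sketch below is illustrative only: the class names, field names and the `can_register_before_upload` helper are assumptions made for this example and do not reflect EGA's XML schema or the Webin forms.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical in-memory model of the metadata objects listed above.
# Field names are illustrative, not EGA's actual schema.

@dataclass
class Study:
    title: str

@dataclass
class Sample:
    alias: str

@dataclass
class Experiment:
    study: Study            # associated with a study
    samples: List[Sample]   # and with one or more samples
    platform: str

@dataclass
class DAC:
    name: str

@dataclass
class Policy:
    dac: DAC                # a policy is associated with a DAC
    agreement: str          # text of the Data Access Agreement (DAA)

@dataclass
class Dataset:
    policy: Policy          # a dataset is associated with a policy
    file_accessions: List[str] = field(default_factory=list)

# Objects that can be registered before any data files are uploaded.
REGISTER_BEFORE_UPLOAD = {"Study", "Sample", "DAC", "Policy"}

def can_register_before_upload(kind: str) -> bool:
    """Return True if this object type does not need uploaded files."""
    return kind in REGISTER_BEFORE_UPLOAD

# Example: a dataset reaches its DAC through its policy.
dac = DAC(name="Example DAC")
policy = Policy(dac=dac, agreement="Terms and conditions of data use")
dataset = Dataset(policy=policy)
```

Modelling the DAC as a field of the policy, and the policy as a field of the dataset, mirrors the "associated with" chain in the list above.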


Using EGA Webin

Go to the EGA Webin page and log in using your submission account name and password.  

Components may be registered individually (e.g. study, samples, DAC and policy) or together by selecting Experiments and reads (if your data files have been uploaded).

Experiments and reads (data files must be uploaded)
Data Access Committee (DAC)
Data access policy
Dataset (data files must be uploaded)


Register your Study

  • Go to the New Submission tab
  • Choose Register study (project), click Next and complete the web form
  • Click submit to accession your study

To cite the study accession number in a publication, we suggest the following template:

"Sequence data has been deposited at the European Genome-phenome Archive (EGA), which is hosted by the EBI and the CRG, under accession number EGASXXXXXXXXXXX.
Further information about EGA can be found on https://ega-archive.org and in "The European Genome-phenome Archive of human data consented for biomedical research" (http://www.nature.com/ng/journal/v47/n7/full/ng.3312.html)."
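If you prepare manuscripts with a script, the template can be filled in as sketched below. The function name and the accession check are assumptions for this example: the check takes the EGASXXXXXXXXXXX placeholder to mean "EGAS" followed by 11 digits, and the accession used here is a made-up value, not a real study.

```python
import re

# Assumption: a study accession is "EGAS" followed by 11 digits, matching
# the length of the EGASXXXXXXXXXXX placeholder in the template.
STUDY_ACCESSION = re.compile(r"EGAS\d{11}")

def data_availability_statement(accession: str) -> str:
    """Fill the suggested publication template with a study accession."""
    if not STUDY_ACCESSION.fullmatch(accession):
        raise ValueError(f"not a study accession: {accession!r}")
    return (
        "Sequence data has been deposited at the European Genome-phenome "
        "Archive (EGA), which is hosted by the EBI and the CRG, under "
        f"accession number {accession}. Further information about EGA can "
        "be found on https://ega-archive.org and in \"The European "
        "Genome-phenome Archive of human data consented for biomedical "
        "research\" "
        "(http://www.nature.com/ng/journal/v47/n7/full/ng.3312.html)."
    )

# Made-up accession, for illustration only.
statement = data_availability_statement("EGAS00001000001")
```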


Register your Samples

All samples should have 'Gender', 'Donor id (Subject id)' and 'Phenotype' attributes.

Gender should be described as 'male', 'female' or 'unknown'. If the gender is 'unknown' due to a known sex chromosome aneuploidy, please create a user-defined attribute called 'Sex chromosome karyotype' and add the appropriate value, for example, 'XXY'.

Donor id (Subject id) should be a de-identified subject handle.  If unknown, please add 'unknown' to the field.

Phenotypes should, where possible, be given as an Experimental Factor Ontology (EFO) accession. If no term describes your phenotype, please use free text. All sample phenotypes considered important for further analysis of the data should be provided (for example, tumour type). Additional phenotype attributes can be created by defining your own attributes; use the notation 'phenotype2', 'phenotype3', etc.

  • Go to the New Submission tab
  • Choose Register samples and click Next
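Before entering many samples, it can help to check a sample sheet against the attribute rules described above. The following is a minimal sketch: the dictionary keys (`gender`, `donor_id`, `phenotype`) and the `check_sample` function are hypothetical names chosen for this example, not Webin field identifiers.

```python
# Allowed values for the Gender attribute, as stated in the guidance above.
ALLOWED_GENDERS = {"male", "female", "unknown"}

# Hypothetical keys standing in for 'Gender', 'Donor id (Subject id)'
# and 'Phenotype'.
REQUIRED_KEYS = ("gender", "donor_id", "phenotype")

def check_sample(attrs: dict) -> list:
    """Return a list of problems found in one sample's attributes."""
    problems = []
    for key in REQUIRED_KEYS:
        # A required attribute must be present and non-blank;
        # 'unknown' is an acceptable donor id.
        if not str(attrs.get(key, "")).strip():
            problems.append(f"missing attribute: {key}")
    gender = attrs.get("gender")
    if gender and gender not in ALLOWED_GENDERS:
        problems.append(f"gender must be one of {sorted(ALLOWED_GENDERS)}")
    return problems

# A sample with a user-defined karyotype attribute and a second phenotype.
sample = {
    "gender": "unknown",
    "donor_id": "donor-001",
    "phenotype": "tumour",
    "phenotype2": "free-text description",
    "Sex chromosome karyotype": "XXY",
}
issues = check_sample(sample)
```

User-defined attributes such as 'phenotype2' are passed through unchecked, since the guidance only constrains the three required attributes.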






Submit experiments and runs (data files MUST be uploaded)

  • Go to the New Submission page
  • Choose I wish to do a complete submission and follow the online prompts, which will guide you through adding information or selecting existing accessions for your study, samples, experiments and runs.
  • Once completed, please register your Data Access Committee (DAC), Data Access Agreement (DAA) and dataset(s) to conclude your metadata submission.
  • Your samples, Data Access Committee (DAC) and Data access policy may also be registered before your read files have been uploaded. 


Register your Data Access Committee (DAC)

Further information on the role of your DAC can be found here.

  • Go to the New Submission tab
  • Choose Register Data Access Committee (DAC), click Next and follow the online prompts


Register your Data access policy

Your Data access policy provides the terms and conditions of data use; it is also referred to as the Data Access Agreement (DAA).

Completion of a DAA by the applicant/s should form part of the application process to the Data Access Committee (DAC).

  • Go to the New Submission tab
  • Choose Register Data access policy, click Next and follow the online prompts


Submitting your Dataset

The dataset lists the data files, identified by their run (EGARXXXXXXXXXXX) and analysis (EGAZXXXXXXXXXXX) accessions, that make up the dataset, and links this collection of data files to a specified Data Access Committee and Data access policy.

As a result, you must have registered your Reads and experiments, Data Access Committee (DAC) and Data access policy before submitting your Dataset.
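These prerequisites can be checked before you open the dataset form. The sketch below is an assumption-laden illustration: `check_dataset` is a hypothetical helper, the accession pattern assumes an EGAR or EGAZ prefix followed by 11 digits (inferred from the placeholders above), and the DAC and policy arguments are only tested for being non-empty because their identifier format is not described here.

```python
import re

# Assumption: run and analysis accessions are "EGAR" or "EGAZ" followed
# by 11 digits, as suggested by the placeholders in the text.
FILE_ACCESSION = re.compile(r"(EGAR|EGAZ)\d{11}")

def check_dataset(file_accessions, dac, policy) -> list:
    """Return a list of reasons why the dataset is not ready to submit."""
    problems = []
    if not file_accessions:
        problems.append("no run or analysis accessions listed")
    for acc in file_accessions:
        if not FILE_ACCESSION.fullmatch(acc):
            problems.append(f"not a run/analysis accession: {acc}")
    if not dac:
        problems.append("Data Access Committee (DAC) not registered")
    if not policy:
        problems.append("Data access policy not registered")
    return problems

# Made-up accessions and names, for illustration only.
ready = check_dataset(
    ["EGAR00000000001", "EGAZ00000000002"],
    dac="my registered DAC",
    policy="my registered policy",
)
```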

Please consider the number of datasets that your submission consists of; for example, a case control study is likely to consist of at least two datasets. In addition, we suggest that multiple datasets should be described for studies using the same samples but different sequence technologies. Please contact EGA Helpdesk for further assistance.

  • Go to the New Submission tab
  • Choose Submit Dataset and click Next
  • Select/Register Data Access Committee (DAC) and Data access policy
  • Register your dataset

  • After submitting your dataset you should contact the EGA Helpdesk to provide a release date for your dataset.


Datasets are automatically held (i.e. not released) unless they are affiliated to a study that has already been released. 



What happens after the submission of a dataset?

Once you have completed the registration of your dataset(s), please contact helpdesk@ega-archive.org to provide a release date for your study. Datasets affiliated to existing studies that have already been released should be released automatically.

Please note that all datasets affiliated to unreleased studies are automatically placed on hold until the authorised submitter or DAC contact instructs helpdesk@ega-archive.org to release the study.
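The hold rule can be expressed as a short check, which may be useful if you track many datasets locally. This is only a sketch of the rule as described above; `held_datasets` and the identifiers used are hypothetical, and the actual release status is managed by EGA.

```python
def held_datasets(dataset_to_study: dict, released_studies: set) -> list:
    """List datasets that remain on hold.

    A dataset is held unless the study it is affiliated to
    has already been released.
    """
    return sorted(
        dataset
        for dataset, study in dataset_to_study.items()
        if study not in released_studies
    )

# Hypothetical local bookkeeping: dataset name -> affiliated study name.
affiliations = {"dataset-A": "study-1", "dataset-B": "study-2"}
on_hold = held_datasets(affiliations, released_studies={"study-1"})
```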

When your study progresses to our live site, the named DAC contacts will be given access to the EGA user management tools to create and manage EGA accounts with access permissions to the dataset(s) affiliated to the study.

Further information regarding the role of the Data Access Committee can be found here.

Finally, your data is archived within our databases and prepared for encrypted distribution upon the request of permitted EGA account holders.

We strongly advise you NOT to delete your data until we confirm that your data has been successfully archived.