Submission FAQ

This page provides answers to some common questions asked by submitters.  

If you have any questions regarding submission to the EGA, please contact the EGA-Helpdesk.

Subscribe to the EGA submitter announcement list to receive the latest updates

 

The EGA is the archive to use at the EBI if your original consent agreements require your data to be subject to controlled access. The EGA will not accept your data unless you can confirm that your consents require controlled-access distribution.

For consent agreements enabling full open public access, consider submitting to the following archives at the EBI:

ArrayExpress, the European Nucleotide Archive (ENA) and the Database of Genomic Variants archive (DGVa).

The EGA provides submitters with a completely free, secure and permanent archiving solution for sharing data worldwide.

Submitters retain complete ownership over data and may submit data in stages and control access permissions to the data once submitted.

We support controlled access for named consortium members prior to publication; typically 6-12 months pre-publication.

Each organization that has deposited data in the EGA is given a publicly viewable website on our system, which contains a user-submitted description of the organization, the experiments and the data used in the study, together with a link back to the organization's website.

In addition, each study is assigned a stable and unique accession number that may be referred to in future publications.

Throughout the data submission process the EGA will continue to consult with submitters to ensure that the data is accurately represented, that the formal data access application is in place and the granularity of data access has been set correctly.

We also provide an EGA helpdesk, which supports users and submitters.

Data submitted to the EGA may also, where appropriate, be integrated with other resources available at the EBI, such as Ensembl and ArrayExpress.

Submissions made to the EGA will also be cross-linked in the study catalog at the NCBI resource, the Database of Genotypes and Phenotypes (dbGaP), with a link to the study in the EGA. However, data files can only be obtained from the EGA.

The EGA accepts de-identified data with an approved Data Access Committee (DAC) plan; the DAC is responsible for all data access decisions.

Data that does not need to be subject to controlled access can be submitted to other EBI archive resources.

Submissions to the EGA come in a variety of formats and sizes, so it is difficult for us to say exactly how long a submission will take. We therefore advise all submitters to allow as much time as possible and to anticipate that the submission process will take at least one month.

You will receive your accession number upon submitting your study.xml or registering your study using the online metadata submission tool, the EGA Submitter Portal. Full instructions for the submission process will be provided in your submission pack.

We suggest using the template below:

"Sequence data has been deposited at the European Genome-phenome Archive (EGA), which is hosted by the EBI and the CRG, under accession number EGASXXXXXXXXXXX.
Further information about EGA can be found at https://ega-archive.org and in 'The European Genome-phenome Archive of human data consented for biomedical research' (http://www.nature.com/ng/journal/v47/n7/full/ng.3312.html)."
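As a rough illustration, the first sentence of the template above could be filled in programmatically. This is only a sketch: the function name is hypothetical, and the accession pattern (EGAS followed by 11 digits) is an assumption inferred from the 11 placeholder characters in the template.

```python
import re

# Assumed accession shape: "EGAS" followed by 11 digits, matching the
# 11 placeholder characters in the suggested template.
ACCESSION_PATTERN = re.compile(r"EGAS\d{11}")

TEMPLATE = (
    "Sequence data has been deposited at the European Genome-phenome "
    "Archive (EGA), which is hosted by the EBI and the CRG, under "
    "accession number {accession}."
)


def data_availability_statement(accession: str) -> str:
    """Return the suggested sentence with the accession filled in."""
    if not ACCESSION_PATTERN.fullmatch(accession):
        raise ValueError(f"unexpected accession format: {accession!r}")
    return TEMPLATE.format(accession=accession)


if __name__ == "__main__":
    # Placeholder accession used purely for illustration.
    print(data_availability_statement("EGAS00000000000"))
```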

A Data Access Committee (DAC) is responsible for making the data access decisions for the data submitted.  A DAC may consist of a single individual or group of individuals. 

A DAC makes data access decisions based on the data access application form and the completed Data Access Agreement (DAA) submitted by applicants.

Click here for further information on creating a Data Access Committee.

Our accepted data types include all manufacturer raw data formats from array-based and next-generation sequencing platforms. Processed or analysed data, such as genotypes and structural variants, as well as additional information (e.g. quality scores and intensity values), may all be uploaded to our databases.

We also accept and distribute phenotype data as well as clinical data for secondary use that may be associated with the samples.  

Email our EGA-Helpdesk for more information

The EGA set-up consists of a secure computing facility for data processing and a shared EBI set-up for data submissions and distribution of data via data requests made through the EGA website.

All distributed data is encrypted and can only be accessed using an encryption key, which is distributed to users by post or courier.

Our security protocols for log-in and downloading data have been successfully applied to other EBI-hosted EU projects containing restricted data.

Data files are uploaded into private submission drop boxes using FTP or Aspera protocols, which are provided as part of the submission procedure.

All submitters must use EgaCryptor, which encrypts your files, generates md5sums and uploads the files to your submission dropbox.

Data files are then uploaded using FTP or Aspera.
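For submitters who script their transfers, an FTP upload of already-encrypted files might look like the sketch below, which uses Python's standard `ftplib`. The function names are hypothetical, and the host, user name and password are placeholders for the dropbox details supplied during the submission procedure; nothing here is an official EGA client.

```python
from ftplib import FTP
from pathlib import Path


def upload_files(ftp, paths):
    """Upload each file in binary mode through an ftplib-style connection.

    `ftp` is any object offering ftplib's `storbinary(cmd, fp)` method.
    Returns the remote file names in upload order.
    """
    uploaded = []
    for path in map(Path, paths):
        with path.open("rb") as handle:
            # Store the file under its own name in the current remote directory.
            ftp.storbinary(f"STOR {path.name}", handle)
        uploaded.append(path.name)
    return uploaded


def upload_to_dropbox(host, user, password, paths):
    """Connect, log in, upload the files and close the connection.

    `host`, `user` and `password` are placeholders for the dropbox
    details issued to the submitter.
    """
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return upload_files(ftp, paths)
```

Separating `upload_files` from the connection handling keeps the transfer loop easy to exercise without a live server.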

All submissions require policy documentation.
This consists of the 'Data Access Agreement' (DAA), the 'Data Processing Agreement' and the 'Authorised Submitters Formulary'.

All data submitted to and distributed by the EGA must be encrypted with GnuPG, which ensures that the data is kept secure and accessed exclusively by permitted EGA personnel and users. All submitters must use EgaCryptor to create EGA-compliant files prior to uploading.

There is a delay between the data upload and the availability of the files in the Submitter Portal. For this reason, files can be linked with samples only a few hours after the upload.

We require pre- and post-encryption md5sum values to be provided for all submitted files, so that we can ensure that file integrity has been maintained during the transfer process. Md5sums are generated automatically by the EgaCryptor tool provided.
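To illustrate what such a checksum is, the sketch below computes an MD5 digest with Python's standard `hashlib` and compares it with a recorded value. It is not a replacement for EgaCryptor, the function names are hypothetical, and it does not reproduce the file format the EGA requires for md5sum values.

```python
import hashlib


def md5sum(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large data files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(path, expected_hex):
    """Compare a file's MD5 digest with a previously recorded value."""
    return md5sum(path) == expected_hex.strip().lower()
```

Running `md5sum` on a file before encryption and again on the encrypted output gives the two values (pre- and post-encryption) referred to above.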

**YOUR SUBMISSION WILL NOT BE ACCEPTED AND MAY BE SIGNIFICANTLY DELAYED IF YOU DO NOT PROVIDE MD5SUM VALUES FOR ALL DATA FILES IN THE FORMAT REQUIRED**

**PLEASE CONTACT THE EGA HELPDESK PRIOR TO SENDING YOUR HARD DISK**

Encrypted data files can be transferred to a user supplied hard drive, which should be sent to:

EGA Helpdesk
Centre for Genomic Regulation (CRG)
C/ Dr. Aiguader, 88
PRBB Building - EGA office (068.01)
08003 Barcelona, Spain

Important:
To ensure that no customs charges are applied, please describe the goods as 'Intellectual Property Rights - no commercial value'. We reserve the right to refuse delivery or seek reimbursement of costs if this instruction is not followed.

Please ensure that ALL data you transfer to your hard disk is encrypted with the EGA public key, which may be obtained by contacting helpdesk@ega-archive.org. Files may also be encrypted and md5sums generated using EgaCryptor.
 
We are happy to return all hard disks provided return postage is paid.

We have methods in place for the secure removal of deposited controlled access data. Contact EGA-helpdesk for further details.

After submission the EGA team will process the data into databases and archive the original files. Members of the EGA will then consult with the submitter to ensure that the data is represented accurately on the website and the formal arrangement for data access application has been set correctly.

 

If you have any further questions please do not hesitate to contact the EGA Helpdesk: helpdesk@ega-archive.org

Metadata provided will be made publicly available to view on the EGA website and on other EBI resource and partner websites. It is the submitter's responsibility not to submit sensitive metadata.