Federated EGA
Federated EGA Vision Statement
The Federated EGA is the primary global resource for discovery and access of sensitive human omics and associated data consented for secondary use, through a network of national human data repositories to accelerate disease research and improve human health.
Over the last 10 years, most individual-level human omics data have been generated in the context of research consortia and shared via global repositories such as the European Genome-phenome Archive (EGA). Many countries now have emerging personalized medicine programmes which are generating data from national or regional initiatives. Thus, human genomics is undergoing a step change from being a research-driven activity to one funded through healthcare initiatives.
Genetic data generated in a healthcare context is subject to more stringent information governance than research data and often must comply with national legislation. To address this need, the Federated EGA provides a network of connected resources to enable transnational discovery of and access to human data for research while also respecting jurisdictional data protection regulations. By providing a solution to emerging challenges around secure and efficient management of human omics and associated data, the Federated EGA fosters data reuse, enables reproducibility, and accelerates biomedical research.
Overview
The EGA project is currently a collaboration between EMBL-EBI and the CRG, regulated by agreements between the two institutions. The Federated European Genome-phenome Archive (EGA) will be a distributed network of repositories for sharing human omics data and phenotypes. Typically, a node is an organization or project that hosts human genetic data so that the data can remain within its jurisdiction. The Federated EGA gathers metadata about the omics data collections stored in national or regional archives and makes those collections discoverable across the EGA network.
EGA contributes the Federated EGA model, requirements, and experience to several communities and projects, such as GA4GH, the ELIXIR Federated Human Data Implementation Study, and the ELIXIR Federated Human Data community.
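The federation model described in this overview (data stays at the node, metadata is gathered centrally for discovery) can be illustrated with a minimal sketch. All class names, fields, and identifiers here (`Node`, `CentralIndex`, `DS-0001`, and so on) are hypothetical and chosen for illustration only; they are not the EGA data model or any EGA API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetMetadata:
    """Public, non-sensitive description of a dataset (hypothetical fields)."""
    dataset_id: str
    title: str
    node: str  # name of the node that holds the actual data


@dataclass
class Node:
    """A federated node: the data payloads never leave this object."""
    name: str
    _files: dict = field(default_factory=dict)     # dataset_id -> bytes, stays local
    _metadata: dict = field(default_factory=dict)  # dataset_id -> DatasetMetadata

    def deposit(self, dataset_id: str, title: str, payload: bytes) -> None:
        self._files[dataset_id] = payload
        self._metadata[dataset_id] = DatasetMetadata(dataset_id, title, self.name)

    def public_metadata(self) -> list:
        # Only metadata is exposed to the rest of the network.
        return list(self._metadata.values())


class CentralIndex:
    """Collects metadata from nodes so datasets are discoverable network-wide."""

    def __init__(self) -> None:
        self._records = {}

    def harvest(self, node: Node) -> None:
        for record in node.public_metadata():
            self._records[record.dataset_id] = record

    def search(self, term: str) -> list:
        term = term.lower()
        hits = (r for r in self._records.values() if term in r.title.lower())
        return sorted(hits, key=lambda r: r.dataset_id)


node_a = Node("node-a")
node_a.deposit("DS-0001", "Cardiac exomes", b"sensitive bytes")
index = CentralIndex()
index.harvest(node_a)
for hit in index.search("cardiac"):
    print(hit.dataset_id, "is hosted by", hit.node)
```

In this sketch a search result tells a researcher which node hosts a dataset, while access to the data itself would still be requested from that node under its own jurisdiction's rules.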
Documentation
Title | Version | Description |
---|---|---|
Structure and Organization | ||
EGA Federation: Structure and Organization | 1.1 | The structure of an EGA federated network and service expectations. We organise the EGA into three types of nodes: Central EGA, Federated EGA nodes and EGA Community nodes; we outline the goals of such an organization, and summarize the commitments and services provided by the nodes. |
Strategic Committee | ||
EGA Federation Strategic Committee | 1.1 | The EGA Federation Strategic Committee terms of reference describe the purpose and objectives of the committee, which is to provide direction and strategic planning for the Federated EGA project. The committee receives input from the EGA Strategic Committee and provides feedback for the EGA strategic roadmap. |
Operations Committee | ||
EGA Federation Operations Committee | 1.1 | The EGA Federation Operations Committee terms of reference describe the purpose and objectives of the operations committee, which is to review operational performance and coordinate the technical implementation roadmaps of EGA Federated and Community nodes. The committee receives advice from the EGA Federation Strategic Committee and provides operational reporting to it. |
Guidelines | ||
Node Operations guidelines | 2.0 | The EGA Federated Node Operations guidelines give an overview of the operational areas that require resources in order to create a Federated EGA node. The document is based on more than 10 years of experience in establishing and operating the EBI and CRG Central EGA nodes. It breaks the operational areas of responsibility down into Helpdesk Services, Technical Operations, Software Development, and IT Infrastructure. |
Available Software
LocalEGA is federated storage software for sensitive data.
Resource | Link |
---|---|
Software | Main LocalEGA software Repository |
Documentation | Main LocalEGA software Documentation |
Local EGA Software
A portable toolkit to securely deposit and share human sensitive data - Local EGA, Mini-Symposium Federated Human Data, Elixir All Hands Meeting, 2020
Federated EGA APIs
Below is a list of the GA4GH standards and APIs implemented by the Federated EGA. Visit EGA-GA4GH for the full list of standards that are currently available or planned for implementation at EGA.
Standard | Purpose | Specification Version | Supported Version | Implementation |
---|---|---|---|---|
Beacon | Supports discovery of genomic variants and individuals | V1.0.1 | V0.3 | Specification Documentation Endpoint |
Crypt4GH | Enables direct byte-level compatible random access to encrypted genetic data stored in community standards (e.g. CRAM, VCF) | V1.0 | V1.0 | Specification Documentation Endpoints |
Data Use Ontology (DUO) | Allows users to semantically tag datasets with usage restrictions so that datasets can be discovered automatically based on a researcher's authorization level or intended use. | 2021-02-23 | 2021-02-23 | Specification Documentation Endpoint |
htsget | A protocol for secure, efficient, and reliable access to sequencing read and variation data. | V1.3.0 | V1.0.0 | Specification Documentation Endpoint |
refget | Enables access to reference sequences using an identifier derived from the sequence itself. | V1.2.6 | N/A | Specification Documentation Endpoint |
Researcher IDs (passport, visa) | Specify the collection of researchers that may access a dataset at any given time, and the credentials they must supply. | V1.0.1 | V1.0.1 | Specification Documentation Endpoint |
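The refget row above mentions identifiers "derived from the sequence itself". As a rough sketch of what that means, the snippet below computes two content-based checksums for a sequence: an MD5 hex digest and a SHA-512 digest truncated to 24 bytes and base64url-encoded, here given an `SQ.` prefix. The function names are hypothetical, and which algorithms and prefixes a particular refget server accepts is an assumption to check against the specification version in use.

```python
import base64
import hashlib


def sha512t24u(data: bytes) -> str:
    """SHA-512 digest truncated to 24 bytes, base64url-encoded (32 characters)."""
    digest = hashlib.sha512(data).digest()
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


def sequence_identifiers(sequence: str) -> dict:
    """Return content-derived identifiers for a sequence (illustrative helper)."""
    # Assumption: sequences are upper-cased before hashing so that
    # "acgt" and "ACGT" map to the same identifier.
    normalized = sequence.upper().encode("ascii")
    return {
        "md5": hashlib.md5(normalized).hexdigest(),
        "ga4gh": "SQ." + sha512t24u(normalized),
    }


print(sequence_identifiers("ACGT"))
```

Because the identifier depends only on the sequence content, two nodes holding the same reference sequence compute the same identifier without having to coordinate names.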
API | Purpose | EGA API Version | Implementation |
---|---|---|---|
Submission API | For submitting metadata to EGA following the INSDC object schemas. Implements DUO. | V1.0.0 | Specification Documentation Endpoint |
Permissions API | For getting and setting permissions to EGA objects. Implements Researcher ID. | V1.0.0 | Specification Documentation Endpoint |
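To illustrate how DUO tags on submitted datasets could drive automated discovery, here is a deliberately simplified sketch that matches a researcher's stated purpose against a dataset's tag. The classes and the `is_permitted` function are hypothetical and are not part of the Submission or Permissions APIs. The four DUO term identifiers are quoted from memory and should be verified against the DUO release in use, and real DUO matching also involves modifiers and ontology reasoning that this sketch ignores.

```python
from dataclasses import dataclass
from typing import Optional

# DUO term identifiers (verify against the DUO release you use).
NO_RESTRICTION = "DUO:0000004"
GENERAL_RESEARCH_USE = "DUO:0000042"
HEALTH_MEDICAL_BIOMEDICAL = "DUO:0000006"
DISEASE_SPECIFIC = "DUO:0000007"


@dataclass(frozen=True)
class DatasetTag:
    """Usage restriction attached to a dataset (hypothetical structure)."""
    dataset_id: str
    duo_code: str
    disease: Optional[str] = None  # only meaningful for disease-specific use


@dataclass(frozen=True)
class ResearchPurpose:
    """What a researcher says they intend to do (hypothetical structure)."""
    is_health_research: bool
    disease: Optional[str] = None


def is_permitted(tag: DatasetTag, purpose: ResearchPurpose) -> bool:
    """Simplified check of whether a purpose is compatible with a dataset tag."""
    if tag.duo_code in (NO_RESTRICTION, GENERAL_RESEARCH_USE):
        return True
    if tag.duo_code == HEALTH_MEDICAL_BIOMEDICAL:
        return purpose.is_health_research
    if tag.duo_code == DISEASE_SPECIFIC:
        return purpose.disease is not None and purpose.disease == tag.disease
    return False  # unknown codes are denied by default


tags = [
    DatasetTag("DS-0001", GENERAL_RESEARCH_USE),
    DatasetTag("DS-0002", DISEASE_SPECIFIC, disease="diabetes"),
]
purpose = ResearchPurpose(is_health_research=True, disease="asthma")
print([t.dataset_id for t in tags if is_permitted(t, purpose)])
```

Denying unknown codes by default is a design choice of this sketch: it errs on the side of not exposing a dataset whose restriction the matcher does not understand.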