VT Life Science Data Integration

From EGIWiki
Revision as of 01:21, 28 November 2014 by Nunolf

General Project Information

  • Project title: Integrating ELIXIR reference datasets within the European Grid Infrastructure
  • Proposers: Fotis E. Psomopoulos, Giacinto Donvito
  • Coordinator: Fotis E. Psomopoulos
  • Start date:
  • End date:


Significant work has been done in EGI to help the deployment and discovery of services, where “services” can be either computationally oriented (such as batch queues) or application oriented (such as web services, ready-to-use applications embedded in portal gateways, or applications encapsulated in Virtual Machine Images). In bioinformatics, however, many services used for analysis rely on public reference datasets. These reference datasets are growing large, and users struggle to discover, download and compute with them. There is therefore an increasing demand to run computations where the reference datasets are located. EGI members already host some biological reference datasets across the infrastructure, but EGI currently provides neither discovery capabilities for the available datasets nor guidelines for those who wish to use them or to replicate additional datasets onto EGI sites.

The project will facilitate the discovery of existing reference datasets in EGI, and will develop and deploy services that allow data providers, resource providers and researchers to replicate life science reference datasets, and life science researchers to use these datasets in analysis applications.


  1. Identify existing life science datasets in EGI
  2. Identify reference datasets for replication
  3. EGI AppDB extension to a dataset registry
  4. Tools for data replication
  5. Analysis tools to work with data replicas
  6. Integration with ELIXIR Registry