VT Scalable Access to Federated Data



General Project Information


Several solutions exist for federated storage management in High Throughput Computing, whether grid-style or cloud-based, but they are not yet widely available in EGI as validated platforms capable of meeting the performance requirements of Research Infrastructures. The problem to be addressed is the processing and visualization of large datasets whose volume makes transfer infeasible and instead requires moving the computation to the data.

For example, "large amounts of image stacks or volumetric data are produced daily at brain research sites around the world. This includes human brain imaging data in clinics, connectome data in research studies, whole brain imaging with light-sheet microscopy and tissue clearing methods or micro-optical sectioning techniques, two-photon imaging, array tomography, and electron beam microscopy." Similar requirements are emerging from other areas like structural biology and life sciences.

A key challenge in making such data available is to make it accessible without moving large amounts of data. Typical dataset sizes can reach the terabyte range, while a researcher may want to view or access only a small subset of the entire dataset.



TASK1 (DONE) Invite TCB, OMB, competence centres and user communities to participate; identify infrastructure providers contributing resources to the testbed

TASK2 (DONE) Define a list of relevant use cases for scalable big data access requiring co-location of compute and data

TASK3 (IN PROGRESS) Performance testing in different test scenarios



