Difference between revisions of "EGI ENVRI"
Revision as of 10:29, 16 August 2013
General Project Information
- Leader: Malgorzata Krakowian (malgorzata.krakowian @ egi.eu)
- Start Date: 1.02.2013
- Meetings: Indico page (slides and minutes from meetings)
Mailing lists:
- envri-eiscat3d @ mailman.egi.eu - for EISCAT-3D study case
Motivation
A study case was set up to identify existing services and solutions from EGI that could address the data pre-processing, post-processing and publishing needs of two ESFRI projects, EISCAT_3D and EURO-ARGO. The outcome of the pilot is expected to be directly applicable to EISCAT_3D and EURO-ARGO, and indirectly to the other ESFRI projects in ENVRI. In cooperation with the EISCAT_3D and EURO-ARGO representatives in ENVRI, EGI.eu will try to find the most suitable solutions for pre-processing primary data and for post-processing it for publication.
Members
- (Study case leader & EGI Operations Officer) Malgorzata Krakowian malgorzata.krakowian [at] egi.eu
- (Technical Outreach Manager) Gergely Sipos gergely.sipos [at] egi.eu
- (User Community Support Officer) Nuno Ferreira nuno.ferreir [at] egi.eu
- (EGI Chief Operations Officer) Tiziana Ferrari
- Dr. Ingemar Haggstrom Ingemar.Haggstrom [at] eiscat.se
EURO-ARGO:
- Thierry Carval Thierry.Carval [at] ifremer.fr
- Jerome Detoc Jerome.Detoc [at] ifremer.fr (data management, big data)
- Yin Chen ChenY58 [at] cardiff.ac.uk
- Malcolm Atkinson
- Alex Hardisty
- Yannick Legré
- Paul Martin
- Alun Preece
- Ari Lukkarinen ari.lukkarinen [at] csc.fi
- Antti Pursula
- Ville Savolainen
Resources
- Google doc for study case (internal for EGI)
- Euro-Argo architecture (image file)
- EISCAT-3D architecture (image file)
External tools and presentations (could be useful)
- The IT Challenges For Research Infrastructures In Physics presentation (CRISP ESFRI cluster project)
ENVRI wiki
- Euro-Argo description
- Euro-Argo data management - ftp server
- EISCAT-3D description
- Analyse Common Requirements for Data Processing
- Iceland Volcano Study Case
Progress
EISCAT-3D
The design of the next generation incoherent scatter radar system, EISCAT_3D, opens up opportunities for physicists to explore many new research fields. It also introduces significant challenges in handling large-scale experimental data, which will be generated at great speed and in great volume. This challenge is typically referred to as a big data problem, and it requires solutions beyond the capabilities of conventional database technologies.
To identify existing and new services that can tackle the EISCAT_3D big data challenge, a collaboration was formed in February 2013 among EISCAT_3D, EGI and the EUDAT infrastructures under the ENVRI project. A document is emerging from this collaboration; it outlines a project that would take the first steps towards defining the EISCAT_3D big data strategy.
The document is a working draft and is available at https://documents.egi.eu/secure/ShowDocument?docid=1839 (Towards a Big Data Strategy for EISCAT-3D, external link). It is updated on a regular basis. If you wish to join this collaboration or suggest updates to the document, please email Malgorzata Krakowian (malgorzata.krakowian @ egi.eu). Further information about the collaboration is at https://wiki.egi.eu/wiki/EGI_ENVRI
Key points in the document:
- The EISCAT archive includes 60 TB of correlated data collected between 1981 and 2013. There are interfaces to interact with the data, but the data is not searchable.
- The collaboration would make the archive searchable by:
  - moving the data to EGI storage (grid or cloud, to be decided later)
  - registering the data in EGI file catalogues and metadata services
  - studying, identifying and, if feasible, implementing data mirroring and indexing strategies that would suit the envisaged processing strategies of EISCAT_3D. The studies could cover areas such as the use of Hadoop or other big-data technologies on EGI.
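To illustrate what "searchable" could mean in practice, the following is a minimal sketch of a metadata index over archived files. It is not the design from the strategy document and does not use any EGI catalogue API. The class names, the metadata fields (`logical_name`, `replica_url`, `observed_on`, `experiment`), the example paths and the `storage.example.org` URLs are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ArchiveEntry:
    """One archived file and the metadata used to find it (hypothetical fields)."""
    logical_name: str   # name under which the file is registered in a catalogue
    replica_url: str    # where a copy of the file is stored
    observed_on: date   # date on which the data was collected
    experiment: str     # label of the experiment that produced the data


class MetadataIndex:
    """In-memory stand-in for a file catalogue plus metadata service."""

    def __init__(self) -> None:
        self._entries: List[ArchiveEntry] = []
        self._by_experiment: Dict[str, List[ArchiveEntry]] = defaultdict(list)

    def register(self, entry: ArchiveEntry) -> None:
        # Registration keeps the full list and a per-experiment index.
        self._entries.append(entry)
        self._by_experiment[entry.experiment].append(entry)

    def search(
        self,
        experiment: Optional[str] = None,
        start: Optional[date] = None,
        end: Optional[date] = None,
    ) -> List[ArchiveEntry]:
        # Narrow by experiment through the index, then filter by date range.
        if experiment is None:
            candidates = self._entries
        else:
            candidates = self._by_experiment.get(experiment, [])
        return [
            e for e in candidates
            if (start is None or e.observed_on >= start)
            and (end is None or e.observed_on <= end)
        ]


# Illustrative entries only; these are not real EISCAT files.
index = MetadataIndex()
index.register(ArchiveEntry("/archive/1999/run-001",
                            "https://storage.example.org/run-001",
                            date(1999, 3, 14), "exp-a"))
index.register(ArchiveEntry("/archive/2005/run-002",
                            "https://storage.example.org/run-002",
                            date(2005, 11, 2), "exp-b"))
index.register(ArchiveEntry("/archive/2012/run-003",
                            "https://storage.example.org/run-003",
                            date(2012, 6, 30), "exp-a"))

hits = index.search(experiment="exp-a", start=date(2000, 1, 1))
print([h.logical_name for h in hits])  # prints ['/archive/2012/run-003']
```

In a real deployment the index would live in a catalogue or metadata service rather than in memory, and the set of indexed fields would follow from the requirements collected from EISCAT scientists and data managers.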
The following questionnaires are used to collect requirements from EISCAT scientists and data managers:
For scientists: https://www.surveymonkey.com/s/ENVRI-EISCAT_Scientists
For data managers: https://www.surveymonkey.com/s/ENVRI-EISCAT_Data_managers
- 'Towards a Big Data Strategy for EISCAT-3D' presentation (pdf) given at the 16th EISCAT International Symposium 2013
Euro-ARGO
A short document will be prepared that outlines the scope of the EURO-ARGO - EGI collaboration. The document would be used by EGI and by EURO-ARGO to invite contributors into the collaboration (NGIs, institutes).
Key points in the document:
- EURO-ARGO architecture and interfaces to interact with the data.
- The collaboration would make the archive searchable by:
  - moving the data to EGI storage (grid or cloud?)
  - registering the data in EGI file catalogues and metadata services
- Advertisement of the data to potentially interested EGI user communities