Competence centre EISCAT 3D

CC Coordinator: Ingemar Häggström

CC members' list: cc-eiscat3d AT mailman.egi.eu


Target user communities: EISCAT_3D users

List of organizations representing the user communities: EISCAT Scientific Association

Duration of the CC: 30 months

Starting at Project Month: March 2015

Ending at Project Month: August 2017



The design of the next-generation incoherent scatter radar system, EISCAT_3D, introduces significant challenges in handling large-scale experimental data, which will be generated at high rates and in large volumes. The CC will build an e-Infrastructure to meet the requirements of the EISCAT_3D data system, support the EISCAT science community in acquiring, curating, accessing and processing the data, and train data scientists who can explore new approaches to solving problems through new data-centric ways of conceptualising, organising and carrying out research activities.

Objectives

  1. To build a common e-Infrastructure that meets the requirements of a big scientific data system such as the EISCAT_3D data system.
  2. To demonstrate that the developed e-Infrastructure can support the EISCAT science community in acquiring, curating, accessing and processing the data.
  3. To train data scientists who can explore new approaches to solving problems through new data-centric ways of conceptualising, organising and carrying out research activities, leading to new discoveries and significant scientific breakthroughs.


Tasks

Task 1: User Support and Training

(M15-M30) Effort: 3 PM

Objectives: Disseminate the possibilities that the portal offers to users.

Arrange training events co-located with other activities for EISCAT users. The yearly radar school and the yearly EISCAT_3D user meeting are suitable for co-location of training events. Two training events are planned. The first focuses on the basic functionality of the portal and is held after the portal is available on a production platform. The second focuses on the new features implemented in the mini-projects (Tasks 3 and 4). CSC is the lead partner and will provide the training and the training material. EISCAT will contribute relevant use cases for the training.

Task 2: Deploy the portal as a production system

(M1-M12) Effort: 6 PM

Objectives: Deploy the pilot as a production system.

The first portal, based on DIRAC, was released in March 2016. This version serves as a proof of concept and is currently open for user review and feedback. See the section "First portal - proof of concept" below for details.

Task 3: Basic reanalysis within the portal

(M13-M24) Effort: 6 PM

Objectives: Enabling basic reanalysis within the portal.

This task enables basic reanalysis within the portal with user-specified constraints. Instead of downloading large sets of data, users should be able to do some basic reanalysis of the low-level data with constraints other than those used in the standard analysis. The basic constraints define the volumes of space and the time intervals to integrate. In total there are about 100 analysis parameters to set, and the portal should identify them and provide tools to change them.

EISCAT leads this task, providing specifications for the new capabilities. CSC provides the development effort to integrate these into the portal.
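
As an illustration of the two basic constraints named in this task (the volume of space and the time intervals to integrate), the following Python sketch shows one way such user input could be represented and checked. It is not the portal's implementation: the class name `ReanalysisConstraints`, its fields, and the choice of expressing the volume as a range interval in kilometres are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ReanalysisConstraints:
    """Hypothetical container for user-specified reanalysis constraints."""

    start: datetime          # beginning of the period to reanalyse
    end: datetime            # end of the period to reanalyse
    integration: timedelta   # length of each time interval to integrate
    min_range_km: float      # lower edge of the volume (assumed unit: km)
    max_range_km: float      # upper edge of the volume (assumed unit: km)

    def validate(self) -> None:
        """Raise ValueError if the constraints are inconsistent."""
        if self.end <= self.start:
            raise ValueError("end must be after start")
        if self.integration <= timedelta(0):
            raise ValueError("integration must be positive")
        if not 0 <= self.min_range_km < self.max_range_km:
            raise ValueError("range limits must satisfy 0 <= min < max")

    def integration_windows(self):
        """Return (window_start, window_end) pairs covering [start, end)."""
        self.validate()
        windows = []
        t = self.start
        while t < self.end:
            # The last window is shortened so that it does not pass `end`.
            windows.append((t, min(t + self.integration, self.end)))
            t += self.integration
        return windows
```

A real interface would have to expose many more of the roughly 100 analysis parameters; the sketch only covers the two that the task description singles out.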

Task 4: Use level 3 data as metadata

(M8-M30) Effort: 12 PM

Objectives: Enable the use of level 3 data of EISCAT as metadata.

Set up the level 3 data of EISCAT to complement the radar metadata. The level 3 data of EISCAT are the derived ionospheric physical parameters, such as densities, temperatures and drifts, at selected volumes of space. These data are currently stored in a separate database, Madrigal, with no connection to the lower levels of the data. This task should complement the radar metadata (the parameters of the radar hardware) with the physical parameters. Currently the only link between the data sets is the time stamps, but it is desirable to use a set of identifiers to follow more clearly how the different levels of data have been formed: exactly which set of level 1 data was used to derive the level 2 data and, finally, the level 3 data. Expand the pilot portal so that users can search the data based on the expanded metadata set and download selected levels of data. Investigate further how the access rights of the data should be followed.

EISCAT leads this complex task, specifying how to implement these advanced features. SNIC and CSC provide the development and testing resources.
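
The idea of linking data levels through identifiers rather than time stamps can be sketched as follows. This is only a minimal illustration under stated assumptions: the `DataProduct` record, the `derived_from` field, the identifier strings and the `trace_to_level1` helper are hypothetical and do not describe the Madrigal database or any agreed EISCAT schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProduct:
    """Hypothetical record: one data product and the products it was derived from."""

    identifier: str
    level: int
    derived_from: List[str] = field(default_factory=list)


def trace_to_level1(identifier: str, catalogue: Dict[str, DataProduct]) -> List[str]:
    """Return the sorted identifiers of all level 1 products behind `identifier`."""
    found = set()
    seen = set()
    stack = [identifier]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against accidental cycles
        seen.add(current)
        product = catalogue[current]
        if product.level == 1:
            found.add(product.identifier)
        stack.extend(product.derived_from)
    return sorted(found)


# Illustrative catalogue: two level 1 sets feed one level 2 set,
# which in turn feeds one level 3 set. All identifiers are made up.
example_catalogue = {
    "l1-a": DataProduct("l1-a", 1),
    "l1-b": DataProduct("l1-b", 1),
    "l2-a": DataProduct("l2-a", 2, ["l1-a", "l1-b"]),
    "l3-a": DataProduct("l3-a", 3, ["l2-a"]),
}
```

With explicit links of this kind, `trace_to_level1("l3-a", example_catalogue)` walks back from a level 3 product to the level 1 sets it came from without relying on matching time stamps.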

Task 5: Exploitation

(M13-M30) Effort: 2 PM

Objectives: Maintain the portal on a production platform.

Deploy, maintain and update the production version of the portal. SNIC leads this task, providing suitable resources for operating the portal. EISCAT provides new data to be added to the archive.


First portal - proof of concept

The first portal was released in March 2016 based on DIRAC as a proof of concept. This version is now open for user review and feedback.

The proof of concept focuses on data cataloguing and provides two key services for the user:

  1. Discover EISCAT data through metadata (instead of file location or physical file name).
  2. Download batches of EISCAT files from the storage after they have been discovered through the portal server.

Computing services are not part of this version (they will be part of the second portal version).

The aims of the proof of concept are:

  1. Assess the suitability of the DIRAC system for the purposes of the EISCAT portal.
  2. Establish a baseline file structure to access the EISCAT files through the portal. The structure will be improved in the future to optimise access management (access control, PIDs, frequent queries, etc.).
  3. Establish a baseline metadata schema to discover EISCAT data through metadata via the portal. The schema will be improved in the future to optimise access management.
  4. Collect feedback about data organisation for the EISCAT_3D data model (for example, on the most suitable separation of data and metadata) as input to the data organisation activity of EISCAT_3D.

The proof of concept includes a DIRAC Storage Element (SE) service running at the EISCAT institute. The DIRAC SE makes the EISCAT Level 2 data file system accessible to the File Catalogue and to the users. The total EISCAT Level 2 dataset is 70-80 TB, of which a subset will be deployed on the DIRAC SE server. This Storage Element service exposes the files to the DIRAC service portal. The portal uses a MySQL server component (hosted by CYFRONET in Poland) to catalogue the files. The file structure on the server and the metadata schema in the catalogue replicate those used in the EISCAT database for level 2 files (see Appendix 1 of Deliverable 6.3). The current metadata (in the SQL database) are location (site, start time, end time) and access rights. Other ‘metadata’ are embedded in the files themselves. In the first prototype only the SQL metadata will be used in the DIRAC metadata catalogue. (In a second phase additional metadata can be extracted from the files.)
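
To make the idea of discovery through metadata concrete, the sketch below builds a small catalogue with the metadata fields listed above (site, start time, end time, access rights) and looks up files by site and time span. It is an illustration only: it uses Python's built-in `sqlite3` as a stand-in for the MySQL component, it does not use the DIRAC File Catalogue API, and the table layout, column names, site codes and file paths are assumptions made for the example.

```python
import sqlite3


def build_catalogue(rows):
    """Create an in-memory catalogue; the schema is a hypothetical stand-in."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE files ("
        " path TEXT PRIMARY KEY,"
        " site TEXT,"
        " start_time TEXT,"   # ISO 8601 strings compare correctly as text
        " end_time TEXT,"
        " access TEXT)"
    )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def find_files(conn, site, start, end):
    """Return paths of files at `site` whose time span overlaps [start, end)."""
    cursor = conn.execute(
        "SELECT path FROM files"
        " WHERE site = ? AND start_time < ? AND end_time > ?"
        " ORDER BY path",
        (site, end, start),
    )
    return [row[0] for row in cursor]


# Made-up entries; real site codes and paths will differ.
example_rows = [
    ("/eiscat/siteA/2016-03-01/file1", "siteA",
     "2016-03-01T10:00:00", "2016-03-01T11:00:00", "public"),
    ("/eiscat/siteA/2016-03-05/file2", "siteA",
     "2016-03-05T10:00:00", "2016-03-05T11:00:00", "public"),
    ("/eiscat/siteB/2016-03-01/file3", "siteB",
     "2016-03-01T10:00:00", "2016-03-01T11:00:00", "restricted"),
]
```

The point of the example is that the user asks for a site and a time span and receives file paths back, rather than having to know the file locations in advance.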

How to get access to the portal

  1. Get a personal X.509 certificate from a recognised Certification Authority (CA). This can be done in two ways:
    1. From your national IGTF CA: https://www.igtf.net/ (requires a personal visit to the CA)
    2. From the TERENA Certificate Service (TCS), only if your institute is recognised in the TCS portal: https://www.digicert.com/secure/saml/discovery/?entityID=https%3A%2F%2Fwww.digicert.com%2Fsso&returnIDParam=idp
  2. Import the X.509 certificate into a web browser and register for membership in the eiscat.se Virtual Organisation: https://perun.metacentrum.cz/perun-registrar-cert/?vo=eiscat.se
  3. Using the same browser, go to http://dirac.egi.eu/DIRAC/, then log in by clicking the 'Secure connection' button in the bottom-right corner.
  4. Make sure you are using your eiscat.se profile (the list box in the bottom-right corner, to the left of 'Secure connection').

How to provide feedback about the portal

Feedback about the portal is collected in a dedicated queue of the EGI RT ticketing system. You need an EGI SSO account to submit feedback tickets. You can set up an EGI SSO account at http://egi.eu/sso. Please create a separate ticket for each feedback item you have:

  1. Go to the ticket submission interface https://rt.egi.eu/rt/Ticket/Create.html?Queue=74
  2. Log in with your EGI SSO account
  3. Add the following details into the ticket:
  • Status - new
  • Owner - Victor Mendez
  • Subject - <subject of your feedback>
  • Describe the issue below:
    • Attach file: <if applicable>
    • <textual description>

Workspace for CC members

Open actions

  1. Continue testing the DIRAC-EISCAT portal prototype (new on April 1).
    1. Instructions about how to get access and how to provide feedback are at https://wiki.egi.eu/w/index.php?title=Competence_centre_EISCAT_3D#First_portal_-_proof_of_concept
    2. Victor needs to enable access to CC members manually because the VOMS integration does not work. (VOMS is down?)
  2. Arrange a session for EISCAT community members to see or test the DIRAC prototype portal
    1. Tutorial at upcoming EISCAT symposium, Thu 19 May. Planning a 1 hour splinter session with organiser committee.
  3. Easy access for testers is necessary to expand the tester group. The need to obtain a national X.509 certificate should be eliminated (new on April 1). Explore the following options, with Gergely leading:
    1. How to centralise certificate issuing for EISCAT testers. Can the EGI CA issue certificates for this? What should the issuing protocol be?
    2. Federated login into DIRAC using the new EGI AAI proxy service. Action: set up a meeting between the DIRAC team and the EGI AAI team
  4. DIRAC further development (Victor) (New on April 22):
    1. Add file download functionality (both for portal and command line)

Working documents for portal specification

  1. EISCAT_3D System Overview: https://docs.google.com/document/d/1qJ90AhkDkTTrQo-lyPbAYO3BaZWn3GoefjMXJSlmiBk
  2. Portal specification: https://docs.google.com/document/d/1KywhcUnctyxKOTK5efbaIo2VMitfB3K3menVz7-4jlY
  3. EISCAT_3D data model draft: https://docs.google.com/document/d/1RfxkiTbq754UlqOdP22FgbsfW9aOLpYKfScTNouqETk

Deprecated - ENVRI Pilot in EGI-InSPIRE - OpenSource Geospatial Catalogue

This section is about a previous EGI- and EISCAT-related project that was supported in 2013-2014. The Competence Centre effort made it obsolete. This section is kept for archival purposes only.

OSGC is an Open Source implementation of an OpenSearch GeoSpatial Catalogue compliant with the OGC 10-32r3 specification, developed by EGI.eu under the ENVRI project.

OSGC provides a catalogue engine built on top of a PostgreSQL+PostGIS database, which exposes a customizable OpenSearch interface. Most of the application configuration can be set from the Admin web interface, while Data Administrators have a separate Dropbox interface, which eases the management of the catalogue and the data storage, and a Data Gateway interface, which controls access to data and produces data access statistics.

A stand-alone client web interface, written in HTML5 and JavaScript, can be used to query this catalogue and other compliant OpenSearch catalogues.
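
For readers unfamiliar with OpenSearch, a client query amounts to filling in a URL template published by the catalogue. The Python sketch below shows that substitution step for the standard `{searchTerms}` parameter and the `geo:box`, `time:start` and `time:end` parameters of the OpenSearch Geo and Time extensions. It is not taken from OSGC: the `fill_template` helper, the host `catalogue.example.org`, the query parameter names in the template and the sample values are all assumptions for illustration.

```python
import re
from urllib.parse import quote


def fill_template(template, values):
    """Substitute OpenSearch template parameters such as {geo:box?}.

    A trailing "?" inside the braces marks a parameter as optional;
    optional parameters without a value are replaced by an empty string.
    """

    def replace(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if name in values:
            # Keep "," and ":" readable; they occur in boxes and timestamps.
            return quote(str(values[name]), safe=",:")
        if optional:
            return ""
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([^{}?]+)(\??)\}", replace, template)


# Hypothetical template; a real catalogue publishes its own in its
# OpenSearch description document.
EXAMPLE_TEMPLATE = (
    "https://catalogue.example.org/search"
    "?q={searchTerms}&bbox={geo:box?}&start={time:start?}&end={time:end?}"
)

example_url = fill_template(
    EXAMPLE_TEMPLATE,
    {
        "searchTerms": "eiscat",
        "geo:box": "19.0,69.0,21.0,70.0",       # west,south,east,north
        "time:start": "2014-01-01T00:00:00Z",
    },
)
```

Because the template comes from the catalogue itself, the same client code can in principle query any catalogue that publishes a compatible template.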

Features

  • OpenSearch catalogue engine with customizable output formats, product metadata, query schema and input formats (for data injection).
  • Web admin interface (to offer the catalogue as a Platform-As-A-Service on the Cloud).
  • 'Dropbox feature', to automatically extract metadata, register it into the catalogue and optionally push the data file into Cloud or other connected storage.
  • Data Gateway interface, to control access to data, produce data access statistics and bridge non-http protocols.
  • OpenSearch web client interface, which can be run remotely or as a standalone application (for integration into Cloud Virtual Laboratories PaaS services) and supports cumulative download (with shopping-cart functionality).

Known limitations/next steps to evolve the system into a production version

  • Test system scalability
  • Develop pre-processing service to extract further metadata from the data
  • Develop visualisation service
  • Develop Near Real Time tool to import data automatically from receiving stations
  • Evolve admin panel based on user requirements
  • Evolve client panel based on user requirements

Resources and documentation