DCH-RP:PoC 1 Belgium
Revision as of 17:32, 2 May 2013
In Belgium, the first Proof of Concept will involve the following CH institutes:
- KIK
- KMKG
- KB
- RA
EGI Resources
EGI, as an e-Infrastructure provider, is supporting the DCH-RP project with resources as follows:
Grid storage Resources
Provider: Vrije Universiteit Brussel (VUB) - EGI site name "begrid-vub-ulb"
Storage: Grid SE storage, based on DPM, with SRM interface(?)
Cloud Storage
Cloud storage will be handled through the EGI Federated Clouds Task Force. The exact provider needs to be determined.
Proof of Concept scenarios
The Belgian partners would like to investigate Scenarios 1 and 2 (coming from DCH-RP Deliverable D3.1) as follows in PoC 1:
For Scenario 1, we would like to check the integrity of, and audit, the data that is archived locally. In effect this amounts to a risk analysis of the chosen archiving method. We do not have to invent the procedure: it is described in the document “Trustworthy Repositories – Audit and Certification” http://www.digitalrepositoryauditandcertification.org/pub/Main/ReferenceInputDocuments/trac.pdf More information is also available at "Risk-analysis for E-depots: DRAMBORA" http://www.repositoryaudit.eu/
Of course we want to do this with other partners.
The current scenarios do not include a grid storage solution. I think this is mentioned in the project in connection with the use of the eCSGW. In Belgium we are very interested in testing the possibilities of such a storage solution: how fast can data be retrieved from the grid storage, and how easy is it to search for the data, including for the general public (in fact, for all those who already use the existing local archive today)?
PoC 1 Audit and certification on local data
Combining Scenarios 1 and 2, the following PoC outline may be implemented:
KIK-IRPA, one of the Belgian DCH organisations, already has a local preservation system for its data. It has described this preservation system in a “Best practices” document that is accepted throughout the organisation. However, one of its main concerns is maintaining the integrity of its data. There exist auditing and certification schemes for trustworthy repositories, see: “Trustworthy Repositories – Audit and Certification” http://www.digitalrepositoryauditandcertification.org/pub/Main/ReferenceInputDocuments/trac.pdf and "Risk-analysis for E-depots: DRAMBORA" http://www.repositoryaudit.eu/. Such an audit also amounts to a risk analysis of the chosen archiving method. However easy this may sound, real life shows that almost no one ever completes the whole procedure, hence no common “best practices” are available.
Carrying out an audit consistently requires the appropriate tools. In this scenario we want to use existing tools and document the auditing process. We will do this on the local data that is in the KIK-IRPA preservation scheme and, if possible, on data of other partners. Such an audit is in fact independent of where the data is stored, but it is certainly a “tool” that will be very valuable for preservation done on data stored with e-infrastructures or other storage service providers.
Once this is done, it would be useful to run the same procedure on preservation done on grid and cloud storage. This could be done in the PoC 2 projects, since experience with grid and cloud will by then be available in the project.
Suggested test data: KIK-IRPA will use their existing repository to do the audit
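One part of such an audit that can be done electronically is a fixity check: recording a checksum for every preserved file and later verifying that nothing has changed, disappeared or been added. The sketch below is only an illustration of that idea and is not taken from TRAC, DRAMBORA or the KIK-IRPA “Best practices” document; the function names, the manifest layout and the choice of SHA-256 are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map every file under root (by relative POSIX path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the current state of root with a previously stored manifest."""
    current = build_manifest(root)
    shared = manifest.keys() & current.keys()
    return {
        # Files recorded earlier that can no longer be found.
        "missing": sorted(set(manifest) - set(current)),
        # Files present now that were never recorded.
        "unexpected": sorted(set(current) - set(manifest)),
        # Files whose content no longer matches the recorded checksum.
        "changed": sorted(name for name in shared if manifest[name] != current[name]),
    }
```

A manifest built at ingest time could be stored separately from the data and re-verified at each audit; because the check only needs read access to the files, the same approach would in principle apply to copies held on grid or cloud storage.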
SUGGESTED TEST PROCEDURES
Tools
a. Investigate the audit and certification method described in “Trustworthy Repositories – Audit and Certification” http://www.digitalrepositoryauditandcertification.org/pub/Main/ReferenceInputDocuments/trac.pdf
b. Check the existing tools available at: "Risk-analysis for E-depots: DRAMBORA" http://www.repositoryaudit.eu/
c. Check with other partners if other audit tools are available
Participation in this PoC
The work to be carried out in this scenario is not to be underestimated. ICCU has already indicated its willingness to join this PoC. Other partners will be asked to join, directly via the project partners involved in WP5 or via the discussion forum. An item on this topic will be started to obtain comments and collaboration from the DCH community.
Execution
• Choice of tools
• Division of the work among partners
• Definition of the documentation method for the trial
• Description of the locally preserved data on which the audit is done
• Execution of the audit
• Preparation for an audit on data stored on grid or cloud
Timeline
To be discussed with the partners.
The report on the first Proof of Concept is due in Month 12, that is, end of October 2013.
End of May 2013: list of contributing partners is available
May-August: choice of standards, preparation of the procedure
September-15 October: execution of the tasks
15-31 October: preparation of the report
Threats
• Procedure is too long
• Not enough of the procedure can be done electronically
• Not enough partners are found to contribute
PoC 2 Test out download and access of DCH data on grid storage
Many DCH institutions use a local solution to store their data and/or to do long-term preservation. A new possibility is to store data with (existing) e-infrastructures, for example on the grid or in the cloud. Several questions arise:
- Do we look at:
  - “e-infrastructures for research”
  - Commercial e-infrastructures
  - Grid storage
  - Cloud storage
- Do we know and control where our data is located?
- Can our users use the data easily and efficiently (as if it were local)?
In Belgium we have a grid e-infrastructure and the possibility to use grid storage on that e-infrastructure. We are interested in testing the store and access facilities of the grid storage. In other words, we want to measure the access times to the grid storage when using an ordinary web interface.
The Italian partners have made the e-Cultural Science Gateway available for the project. This tool was developed for the Indicate project and is being modified to upload data to the grid without using the gateway storage as an intermediate step. The e-Cultural Science Gateway will be used in this case to test grid storage.
Suggested test data: Belgian DCH institutions will upload data to the grid
SUGGESTED TEST PROCEDURES
Tools
e-Cultural Science Gateway
Participation in this PoC
Partners have to be found to realise this PoC.
Possible scenarios concerning the use of eCSG
1. Use the eCSG installed in Catania with storage in Catania (this solution has the advantage of a working eCSG with possibility to upload data)
2. Use the eCSG installed in Catania with storage on BEgrid (this scenario requires adapting the eCSG and dCache, the storage management system in BEgrid)
3. Install the eCSG on BEgrid and use with BEgrid storage
Execution
• E1: Explore the possibility of having the eCSG work with dCache
• E2: Depending on the outcome, choose scenario 1 or 2
• E3: Put data on the grid storage
• E4: Define the access measurement tools
• E5: Define the user-friendliness measurement tool
• E6: Explore the technical requirements for installing the eCSG on BEgrid
• E7: Depending on the outcome of E6, install the eCSG
• E8: Repeat E3-E5 on the BEgrid eCSG
• E9: Division of the work among partners
• E10: Definition of the documentation method for the trial
Challenges
Several challenges lie ahead for this Proof of Concept:
- The eCSG fails to upload (a large quantity of) data
- The eCSG fails to access the data efficiently
- The need to belong to an identity federation may be a major drawback for using the eCSG beyond a trial
- It may prove impossible to install the eCSG on grid infrastructures other than the Italian grid
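As a starting point for step E4, the access measurement could be as simple as timing repeated downloads of the same object through the web interface and summarising the results. The sketch below is only an assumption about what such a tool might look like; the function names are hypothetical, the URL in the usage note is a placeholder, and nothing here is part of the eCSG.

```python
import statistics
import time
import urllib.request
from typing import Callable


def http_fetch(url: str, timeout: float = 60.0) -> bytes:
    """Download a resource over HTTP(S) and return its full content."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def measure_access(fetch: Callable[[], bytes], repeats: int = 5) -> dict[str, float]:
    """Call fetch() several times and summarise the elapsed wall-clock times."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        payload = fetch()
        durations.append(time.perf_counter() - start)
        size = len(payload)  # size of the most recent download, in bytes
    median = statistics.median(durations)
    return {
        "bytes": float(size),
        "min_s": min(durations),
        "median_s": median,
        "max_s": max(durations),
        # Throughput based on the median run; infinite if the timer saw no delay.
        "median_mb_per_s": (size / 1e6) / median if median > 0 else float("inf"),
    }
```

For example, `measure_access(lambda: http_fetch("https://example.org/object.tif"))` could be run once against the existing local archive and once against the grid storage for the same file, so that the two sets of figures are directly comparable. Passing the download step in as a function keeps the timing logic independent of how the storage is actually reached.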
Timeline
To be discussed with the partners.
The report on the first Proof of Concept is due in Month 12, that is, end of October 2013.
End of May 2013: list of contributing partners is available
May-August: choice of standards, preparation of the procedure
September-15 October: execution of the tasks
15-31 October: preparation of the report
Extra information that can be obtained during the tests
- Definition of the technical requirements for a user interface to access the grid storage
- Documentation of DCH data management possibilities on the grid
Extension in PoC 2
- Set up a similar test for using cloud storage