EGI-InSPIRE:Switzerland-QR10
Quarterly Report Number | NGI Name | Partner Name | Author |
---|---|---|---|
1. MEETINGS AND DISSEMINATION
1.1. CONFERENCES/WORKSHOPS ORGANISED
Date | Location | Title | Participants | Outcome (Short report & Indico URL) |
---|---|---|---|---|
September 2012 | Prague | EGI Technical Forum | UNIBE-LHEP (Gianfranco Sciacca), UZH (Sergio Maffioletti, Tyanko Alexiev), SWITCH (Alessandro Usai, Simon Leinen, Valery Tschopp) | Important know-how built, accounting problem discussed, OMB and OTAG |
1.2. OTHER CONFERENCES/WORKSHOPS ATTENDED
Date | Location | Title | Participants | Outcome (Short report & Indico URL) |
---|---|---|---|---|
October 2012 | Lugano (CH) | Swiss High Performance Computing Forum | UNIGE (Szymon Gadomski) and UNIBE-LHEP (Gianfranco Sciacca) | Talk: 'Disk Pool Manager Storage Systems at the Universities of Bern and Geneva', S. Gadomski (UNIGE-DPNC) and G. Sciacca (UNIBE-LHEP) |
1.3. PUBLICATIONS
Publication title | Journal / Proceedings title | Journal references (Volume number, Issue, Pages from - to) | Authors (1. 2. 3. Et al?) |
---|---|---|---|
2. ACTIVITY REPORT
2.1. Progress Summary
Progress on accounting and monitoring.
Transition to UMD releases almost complete.
UNIBE-ID certified (work on the accounting problem in progress).
Review/reshuffle of the tasks among the partners.
2.2. Main Achievements
Solution to the accounting problem of UNIBE finally in place. CSCS is still waiting for a definitive solution for accounting in a mixed CREAM CE + ARC CE environment.
ARC monitoring to be enabled on the production Nagios system. The ARC probes are NOT production quality, so some sites reserve the right to decide whether to accept monitoring of their ARC services (flagged as non-production in GOCDB).
Nagios ARC probes review: ongoing; the ARC gridftp probe was decommissioned by OMB at the request of NGI_CH.
Upgrade to UMD1 and UMD2 complete, apart from UNIGE: Geneva will soon upgrade the site BDII and DPM from gLite 3.2 to UMD (see GGUS ticket). No other gLite 3.1 or gLite 3.2 services remain in NGI_CH.
Security effort to be reviewed by NGI_CH.
Some manager tasks to be reassigned from SWITCH to SWING.
UNIBE-ID certified through the Nagios test system; it will be monitored by the Nagios production system as soon as the probes are enabled.
Note: ARC 2 BDII problem escalated with a GGUS ticket. In general, it has been remarked that ARC support is lacking and could be improved.
2.3. Issues and mitigation
CSCS
- Some storage problems at CSCS due to hardware failures; they have been tackled with emergency disks.
- CPU expansion at CSCS (the cluster now has 21800 HS06 of computing power).
- All compute nodes upgraded to UMD1.
- All VOBOX middleware services (gsissh and bdii) were removed from the VO-specific machines.
PSI
- For over a year, 1 TB disks have been failing in our old SUN X4540 systems; no data loss thanks to the resilient ZFS RAID6. We need to keep operating these systems for some more time.
- Introduction of a 3 GB RAM usage limit for jobs in the SGE configuration. This required switching from srmcp (Java) based tools to lcg-tools, because srmcp caused short-lived peaks in memory consumption high enough for SGE to kill the job.
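A minimal sketch of how the peak memory of a transfer tool could be compared against the 3 GB limit before submitting jobs. It measures peak resident memory only; the SGE limit may apply to a different quantity (for example virtual memory), so this is an approximation. The function name `peak_rss_bytes`, the constant `LIMIT_BYTES`, and the placeholder command are hypothetical and not taken from the PSI setup.

```python
import resource
import subprocess
import sys

# 3 GB per-job limit from the report; reading "GB" as GiB is an assumption.
LIMIT_BYTES = 3 * 1024**3


def peak_rss_bytes(cmd):
    """Run cmd, then return the peak RSS of the largest child waited for so far."""
    subprocess.run(cmd, check=True)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


if __name__ == "__main__":
    # Placeholder command; substitute the real transfer command to compare tools.
    cmd = [sys.executable, "-c", "x = bytearray(50 * 1024**2)"]
    peak = peak_rss_bytes(cmd)
    print(f"peak RSS: {peak / 1024**2:.0f} MiB, within limit: {peak < LIMIT_BYTES}")
```

Because `RUSAGE_CHILDREN` reports the maximum over all children already waited for, each tool should be measured in a fresh Python process to keep the comparison fair.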
UNIGE
- Issue: Space token UNIGE-DPNC_LOCALGROUPDISK got over 90% full a few times.
- Mitigation: Finding which replicas are no longer needed or are needed less.
- Methods: statistics of age and last access times (available from the DPM), lists of datasets to keep (maintained by analysis project leaders), evaluation of total data size per project based on those lists, negotiation in the group meetings.
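The methods above can be illustrated with a small sketch: total the data size per project, then list replicas that are not on a keep-list and have not been accessed recently. The record layout, dataset names, sizes, keep-list, and the 180-day threshold are all hypothetical; the real inputs would come from the DPM and from the lists maintained by the project leaders.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical replica records: (dataset, project, size_bytes, last_access).
REPLICAS = [
    ("data12.A", "higgs", 400 * 10**9, datetime(2012, 9, 20)),
    ("data11.B", "higgs", 250 * 10**9, datetime(2011, 12, 1)),
    ("mc11.C", "susy", 300 * 10**9, datetime(2012, 2, 15)),
    ("mc12.D", "susy", 150 * 10**9, datetime(2012, 10, 1)),
]

# Hypothetical keep-list, standing in for the lists kept by project leaders.
KEEP = {"data11.B"}


def size_by_project(replicas):
    """Sum replica sizes in bytes for each project."""
    totals = defaultdict(int)
    for _, project, size, _ in replicas:
        totals[project] += size
    return dict(totals)


def deletion_candidates(replicas, keep, now, max_idle_days=180):
    """Return replicas not on the keep-list and idle longer than max_idle_days,
    least recently accessed first."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = [r for r in replicas if r[0] not in keep and r[3] < cutoff]
    return sorted(stale, key=lambda r: r[3])


if __name__ == "__main__":
    now = datetime(2012, 10, 15)
    print(size_by_project(REPLICAS))
    for name, project, size, last in deletion_candidates(REPLICAS, KEEP, now):
        print(name, project, size, last.date())
```

The output is only a list of candidates; the final decision on what to delete would still go through the group meetings.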
UNIBE-LHEP
Second cluster with ARC2 installed; it will eventually replace the current (somewhat old) cluster. ARC support is problematic. *Accounting finally working*.
Issue Description | Mitigation Description |
---|---|