
Difference between revisions of "EGI-InSPIRE:Switzerland-QR7"

Latest revision as of 17:38, 22 January 2015




Quarterly Report Number | NGI Name | Partner Name | Author
PQ7 | NGI_CH | SWITCH | Simon Leinen


1. MEETINGS AND DISSEMINATION

1.1. CONFERENCES/WORKSHOPS ORGANISED

Date | Location | Title | Participants | Outcome (Short report & Indico URL)

1.2. OTHER CONFERENCES/WORKSHOPS ATTENDED

Date | Location | Title | Participants | Outcome (Short report & Indico URL)
10.11.2011 | Amsterdam | NIL kickoff meeting | 1 from Switzerland | https://www.egi.eu/indico/materialDisplay.py?materialId=minutes&confId=659
28.11.2011 | Bern | Swiss Distributed Computing Day 2011 | 57 | http://www.swing-grid.ch/event/516553-swiss-distributed-computing-day
24.1.2012 | Amsterdam | OMB | 2 from Switzerland | https://www.egi.eu/indico/conferenceDisplay.py?confId=618
24.-26.1.2012 | Amsterdam | Sustainability workshop | 1 from Switzerland | https://www.egi.eu/indico/conferenceDisplay.py?confId=709


1.3. PUBLICATIONS

Publication title | Journal / Proceedings title | Journal references (volume number, issue, pages from - to) | Authors (1., 2., 3., et al.?)
"ARC and gLite interoperability in ATLAS sites" | J. Phys.: Conf. Ser. (2011) | 331, 052026 | Haug, S.; Gadomski, S.; Filipcic, A.

2. ACTIVITY REPORT

2.1. Progress Summary

CSCS

  1. Preparing the move to the new Lugano building, which is almost ready. New hardware has been ordered, and the move is expected by mid-April.

PSI

  1. Received new storage hardware (SGI SI5500 system with 2 x 60 3 TB disks, ca. 270 TB of usable RAID6 space). Installation is in progress.
  2. Further virtualization of services (directory services and central logging).
  3. Expecting delivery of additional compute nodes.
  4. The new storage and compute nodes require a major restructuring of our network environment, which will be carried out in March.
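
The capacity figures quoted for the new PSI storage can be cross-checked with a little arithmetic. The Python sketch below computes the raw capacity from the disk counts given in the report and compares it with the quoted usable space. The report does not state the RAID6 group layout; the 6 data + 2 parity layout used for comparison is purely an assumption, chosen because it matches the quoted ratio, and all names in the code are illustrative.

```python
# Figures quoted in the report for the new PSI storage system.
DISK_SETS = 2            # "2 x 60" disks
DISKS_PER_SET = 60
DISK_SIZE_TB = 3
USABLE_TB = 270          # approximate ("ca.") value from the report


def raw_capacity_tb(disk_sets, disks_per_set, disk_size_tb):
    """Total capacity of all disks before any RAID overhead."""
    return disk_sets * disks_per_set * disk_size_tb


def raid6_usable_fraction(data_disks, parity_disks=2):
    """Fraction of a RAID6 group available for data (two parity disks per group)."""
    return data_disks / (data_disks + parity_disks)


raw_tb = raw_capacity_tb(DISK_SETS, DISKS_PER_SET, DISK_SIZE_TB)
reported_fraction = USABLE_TB / raw_tb

# ASSUMPTION: a 6+2 group layout; the report does not say which layout is used.
assumed_fraction = raid6_usable_fraction(data_disks=6)

print(f"raw capacity: {raw_tb} TB")
print(f"reported usable fraction: {reported_fraction:.2f}")
print(f"usable fraction for an assumed 6+2 RAID6 layout: {assumed_fraction:.2f}")
```

Hot spares, filesystem overhead and TB versus TiB accounting are ignored here, so the agreement with a 6+2 layout should not be read as a statement about the actual configuration.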

UZH

  1. The GC3Pie and AppPot frameworks have been adapted and used for a computational chemistry use case in collaboration with the University of Perugia. Outcomes are to be presented at the next EGI Community Forum.

UNIBE-LHEP

  1. Upgraded the CE middleware to ARC 11.05-1.
  2. Configured the CE to publish the Glue 1.2 schema and registered it with the site-bdii.
  3. Moved 20 TB of unused (formerly NFS) storage to the DPM SE for regional VOs and the ops/dteam VOs.
  4. Purchased a new virtualisation server for KVM. Waited for the certification of SLC6.2 on 31-Jan-2012. Installation and configuration will follow.

2.2. Main Achievements

CSCS

  1. Moved most production systems to KVM on SSDs. This has notably increased the performance and reliability of the Grid-related VMs.
  2. Installed the new Interlagos CPUs in 10 WNs, increasing the number of job slots per machine to 32.
  3. Deployed GPFS on the former Lustre hardware. This has notably increased the performance and stability of the scratch filesystem. GPFS appears to perform considerably better than Lustre for our current setup. Also, metadata is now stored exclusively on SSDs.
  4. Re-cabled part of the cluster switches and networks to prepare for the move.

UNIBE-LHEP

  1. Uninterrupted, generally stable operation
  2. Record elapsed WallTime for two months in a row (Dec 2011, Jan 2012)

2.3. Issues and mitigation

Issue Description | Mitigation Description
CSCS: ARC-CEs seem unable to work properly with the gLite BDII. | Waiting for a new middleware update.
UNIBE-LHEP: Lustre scratch for ARC under heavier than usual load due to new ATLAS workloads running at the site. This contributed to preventing all job slots from being filled. | Some Lustre and batch scheduler tuning; all slots can now be occupied again. Plan to attach more disks to Lustre.
UNIBE-LHEP: Investigation of degraded network rates for gridftp and FTS from the ND cloud underway, involving the SWITCH and UniBE NOCs and the ND ATLAS operation manager. |
UNIBE-LHEP: Discovered that running a storage pool for DPM on the head node causes the whole root filesystem to be exposed to the outside world. Configuration of pool storage on the head node is officially supported in gLite. | Deployed a new storage pool on a separate server, drained the pools on the DPM head node and reconfigured the service to exclude storage pools on the head node.
UNIBE-LHEP: Some files lost on the DPM SE. During the draining of the pools on the head node, files that had been copied by ARC clients were deleted rather than migrated to a different pool. Observed that all files copied by ARC clients are flagged as "volatile" in the DPM name space DB, whereas files copied by gLite clients are flagged as "permanent" by default. List of lost files passed to the affected (regional) VOs. | None. Raised the issue with the ARC communities.
UNIBE-LHEP: Ongoing: Central NGI_DE/NGI_CH NAGIOS monitoring still not offering probes for ARC services, so the ARC CE is still flagged as "not in production" in GOCDB. | None. Waiting for further production NAGIOS updates.
UNIBE-LHEP: Ongoing: Site still not reporting to the central EGI registry via APEL. No production tool exists yet. | None. New attempt to pursue the issue at national and EGI level.