NGI DE CH Operations Center:Operations Meeting:03022012



Introduction

Minutes of last meeting

Announcements

Meetings/conferences

NGI-DE/NGI-CH/D-Grid Workshop in April
Note: There is also a dCache Workshop in April, so the date should be chosen carefully.

The EGI Community Forum (http://go.egi.eu/cf12) will take place in Munich, 26-30 March 2012, in conjunction with the 2nd EMI Technical Conference. Abstract submission was open until 2/12/11.

Availability/reliability statistics

Last:
https://documents.egi.eu/public/ShowDocument?docid=959

Monitoring

Nagios Update 15

Staged rollout/updates

UMD
https://wiki.egi.eu/wiki/UMD-1:UMD-1.5.0

EMI

other topics

Gstat
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=1930
Sites with CRITICAL gstat status: wuppertalprod, Uni-Bonn, DESY-HH, SCAI, MaiGRID, LRZ-LMU

Round the sites

NGI-DE

BMRZ-FRANKFURT (Uni Frankfurt)
DESY-HH
DESY-ZN
FZJuelich
Goegrid
GSI
ITWM (Martin Braun)

ntr

KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)

- gLexec updated on the WNs
- role-based mapping for gLexec was requested by ATLAS
- WMS disk full: Problems with ngi-de-nagios portal

KIT (Uni Karlsruhe)
LRZ
MPI-K
MPPMU (Cesare Delle Fratte)

 + gftp crashed every few hours with an "OutOfMemoryError" (using both OpenJDK and Sun JDK). The issue was solved by upgrading the Java JDK 1.6 package.
 + Increased the number of movers in order to reduce pending transfers.
 + CREAM2 upgraded (glite-CREAM moved from 3.2.13-1 to 3.2.14-1.sl5, glite-SGE_utils.x86_64 3.2.3-1.sl5)
 + Security fix applied to the APEL service
 + LFC failures due to "Bad magic number": hanging GPFS connections caused LFC timeouts. The workaround was to change the CREAM configuration to decrease the GPFS load.

RWTH Aachen
SCAI
Uni Bonn
Uni Dortmund
Uni Dresden (Ralph Mueller Pfefferkorn)

For about two months we have had problems with our file system, especially with the central NFS file system. The NFS system becomes overloaded by hundreds of jobs with hundreds of files.
Paolo/CSCS: We had the same problems. It was fixed by changing the CREAM grubber, and we moved from Lustre to GPFS with SSD disks for the metadata and the inode table.

Uni Freiburg (Anton Gamel)

- problems with gsi ssh -> increased movers
- installed additional dCache servers

Uni Mainz-Maigrid
Uni Siegen
Uni Wuppertal

SwiNG

CSCS (Paolo)

- maintenance two days ago: firmware update of the disks, lost 4 disks/CMS pool (in contact with CMS)
- test CERNVMFS in preproduction

PSI
Switch

Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.

Status ROD

Any problematic tickets?
Handover of the ROD shift
ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table

LRZ from 02.2012. 2*2 Shifts

ROD Newsletter Nov. 2011
https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2011-2011.pdf

tickets were not mentioned within 10 days. Be aware of the ROD statistics.
Please pay attention to the Escalation Procedures
https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure#Escalation_for_operational_problem_at_site

AOB

If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.
