Difference between revisions of "NGI DE CH Operations Center:Operations Meeting:16032012"


Revision as of 09:58, 22 March 2012

Operations Meeting Main

Introduction

  • Minutes of last meeting

Announcements

Feb:
VoOps:
Av/Re= 97%
UNI BONN=69%

BDII: Av/Re=99.3%
 
  • Monitoring
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978
  • Staged rollout/updates
UMD:
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012
  • Survey
Usage and future maintenance of deployed software
Operations/Platform Deployment Survey

Round the sites

NGI-DE
  • BMRZ-FRANKFURT (Uni Frankfurt)
  • DESY-HH
  • DESY-ZN
  • FZJuelich
  • Goegrid
  • GSI
  • ITWM (Martin Braun)
 - 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.
 - 16/3/12 ntr
  • KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)
 - 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance
 - 16/3/12 The downtime was completed with a two-hour delay. During the downtime one central router was replaced by a new model, and we also did
some dCache updates, disk firmware and BIOS upgrades, and a tape TSM update. The complete cluster was reinstalled, changes were made to the central
power supply, and some LFC and 3D-DB were migrated to Oracle version 11g.
  • KIT (Uni Karlsruhe)
  • LRZ
  • MPI-K
  • MPPMU (Cesare Delle Fratte)
 - ntr
 - A few problems with CREAM and ATLAS jobs. The problems are solved now.
  • RWTH Aachen
  • SCAI (Andre Gemuend)
 - Problems with the SE: the DPM daemon died. We will update EMI-DPM.
 - ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the
DPM daemon.
  • Uni Bonn
  • Uni Dortmund
  • Uni Dresden (Ralph Mueller-Pfefferkorn)
 - 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release. dCache was updated to version 1.9.12, and the dCache update included the upgrade from SL4 to SL5. All seems to be fine now.
 - Problems with the EMI release: after the update the Nagios test sometimes fails with an error message along the lines of "job could not be submitted".
Comment by KIT: we also tested the EMI BDII in preproduction. The documentation did not mention
that it needs 3 GB of RAM.
  • Uni Freiburg
  • Uni Mainz-Maigrid
  • Uni Siegen
  • Uni Wuppertal
SwiNG
  • CSCS
  • PSI
  • Switch

Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.

Status ROD

Welcome to Daniel Waldmann (LRZ).

09	 27.02	 04.03	 Team5, LRZ	
10	 05.03	 11.03	 Team6, CSCS/NGI_CH	
11	 12.03	 18.03	 Team1, DESY	
12	 19.03	 25.03	 Team2, FhG
  • Nagios<-->Dashboard issue
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038

AOB

If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.