NGI DE CH Operations Center:Operations Meeting:14092012
- Minutes of last meeting
EGI: next week there is the Technical Forum in Prague
- Availability/reliability statistics
90%. Three sites did not hit the target: RWTH-Aachen 52%, UNI-SIEGEN-HEP 69%
Update 17 has been tested and seems to work fine. Next week we will update our production system. Update 17 includes sensors for Globus, UNICORE and EMI 2 WNs. There were some problems with monitoring WMSs, but these should be fixed now.
- Staged rollout/updates
DN Publishing
-----Original Message-----
From: Operations of NGI-DE, on behalf of Dimitri Nilsen
Sent: Tuesday, 31 July 2012 18:20
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE
Subject: Publishing User DNs
Dear Sites, according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting. Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.
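As a quick way to verify the setting requested in the DN publishing mail, a site could run a small check like the sketch below. It only assumes that publishGlobalUserName appears as an XML attribute somewhere in publisher-config.xml; the element names in the sample and the function name are hypothetical, and the real file layout on an APEL box may differ.

```python
import sys
import xml.etree.ElementTree as ET


def publishes_user_dns(config_xml):
    """Return True if any element carries publishGlobalUserName="yes".

    The check is deliberately layout-agnostic: it scans every element,
    because the exact element holding the attribute is an assumption here.
    """
    root = ET.fromstring(config_xml)
    for element in root.iter():
        value = element.get("publishGlobalUserName")
        if value is not None and value.strip().lower() == "yes":
            return True
    return False


# Hypothetical sample for illustration only; not a real APEL configuration.
SAMPLE = """<Config>
  <Processor publishGlobalUserName="yes"/>
</Config>"""

if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Read as bytes so an XML encoding declaration is handled by the parser.
        with open(sys.argv[1], "rb") as handle:
            content = handle.read()
    else:
        content = SAMPLE
    if publishes_user_dns(content):
        print("User DN publishing is enabled")
    else:
        print("User DN publishing is NOT enabled")
```

Run it with the path to the local publisher-config.xml as the only argument; without an argument it checks the built-in sample.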
Status of releases
Sites that support WLCG VOs should update to an EMI release by 1 October, at least to EMI 1. gLite releases should no longer be supported. We at KIT are currently updating our services (WMSs, sBDIIs, CREAMs). WNs will follow. Dimitri will send around a list of versions and deadlines. 1 October will be a little bit unrealistic.
Round the sites
- BMRZ-FRANKFURT (Uni Frankfurt)
- FZJuelich (Mathilda)
- ITWM (Martin)
- all WNs and CEs updated to SL6 and EMI-2/UMD-2
- one SE node, APEL node, site BDII still running gLite 3.2
- What is the status of this EGI broadcast?
- KIT (GridKa, FZK-LCG2)
- KIT (Uni Karlsruhe, Dimitri, Tobias)
- EMI migration: plans to update WNs to EMI 2. Any experience with emi-wn?
- MPPMU (Cesare)
- deployment of CVMFS
- RWTH Aachen
- Uni Bonn
- Uni Dortmund
- Uni Dresden
- Uni Freiburg (Anton)
- One reason for the low availability/reliability: SAM tests failed over several days, so the site was offline. This was caused by a monitoring problem. Aachen had the same problem. Now it is working again.
- The file system of one of our pools crashed. We lost 15 TB of data, which we were able to partially restore. Interesting: after this we had to put the site offline from time to time, because the data flow of the restore process was so high that jobs were blocked. Additional downtime was needed to restore the files.
- We need a downtime at the end of the month to update dCache to 1.9.12 and to install CernVM-FS.
- We did a TORQUE update on the CREAM CEs, but this TORQUE version blocked proxies, so we downgraded. For the old gLite and EMI versions there is still an old version of a TORQUE package in the repository. Recommendation from Dimitri: an email should be written to the rollout list. Next week in Prague Dimitri can also ask the people from EMI.
- Added some WNs.
- Migration to EMI 2 started, CREAM 3 in test phase.
- Uni Mainz-Maigrid
- Uni Siegen
- Uni Wuppertal
- CSCS (Paulo)
- increased compute capacity to 2200 cores
- preparing maintenance for next Tuesday to fix dCache pool nodes
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.
- Any problematic tickets?
- Handover of the ROD shift
- ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table
- Bad metrics for the ROD shifts over the last months. The problem was the handling of tickets in expired state. Please handle tickets more carefully to avoid such situations.
- Rotation table was updated.
- The next meeting will be in two weeks, after the Prague meeting, on 28 September.
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.