https://wiki.egi.eu/w/api.php?action=feedcontributions&user=Tkoenig&feedformat=atomEGIWiki - User contributions [en]2024-03-28T12:06:14ZUser contributionsMediaWiki 1.37.1https://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=41065NGI DE CH Operations Center:Operations Meeting:280920122012-10-02T07:42:07Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
Source of this statistic is https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf <br />
* Monitoring<br />
Production update to release 17.1.<br />
There are still some problems with the MyEGI web interface. Sites that receive information for their local monitoring system should <br />
have a look. The system itself works, and all nodes are tested.<br />
There were some problems with the monitoring of WMSs, but they should be fixed by now. Because of them, the monitoring system was unavailable for two <br />
hours. We will recompute the availability statistics.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012 site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services, will be contacted <br />
through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that will fail to retire unsupported gLite software by 01-11-2012, will be eligible for suspension and the problem <br />
will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
<br />
EMI2 WN problem<br />
=======<br />
Worker nodes should not be updated to EMI 2, because there are some problems with this release. This was not officially announced. EMI <br />
1 WN installations are not affected.<br />
<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Pavel Weber, Tobias Koenig)<br />
Most services have been updated to EMI 1/2, except the CREAM CEs and APEL. APEL still runs gLite 3.2. We plan to update next week and will be in contact <br />
via email.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare delle Fratte)<br />
No EMI service is running at RZG yet.<br />
The EMI CREAM installation is planned for no earlier than the second half of October.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
Still working on the EMI updates. Our SE has lower priority because we are migrating our data. Other services should be updated within <br />
the next two weeks.<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton Gamel via Email)<br />
After some problems at the beginning of the month (the torque 2.5.12 problem already mentioned) that cost a _lot_ of availability/reliability %, the site is doing well again.<br />
Started to deploy an xrootd cache machine (xrd.bfg.uni-freiburg.de).<br />
A user with lots of accesses to the EOS SRM caused a ban of some of our WNs at the CERN storage.<br />
Under investigation. State at the moment:<br />
The root ./Analysis binary requests files that are not in the input list. Reason and "default path" unclear.<br />
<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.<br />
<br />
Next meeting will be in two or three weeks</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=41061NGI DE CH Operations Center:Operations Meeting:280920122012-10-02T07:32:18Z<p>Tkoenig: </p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
* Monitoring<br />
Production update to release 17.1.<br />
There are still some problems with the MyEGI web interface. Sites that receive information for their local monitoring system should <br />
have a look. The system itself works, and all nodes are tested.<br />
There were some problems with the monitoring of WMSs, but they should be fixed by now. Because of them, the monitoring system was unavailable for two <br />
hours. We will recompute the availability statistics.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012 site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services, will be contacted <br />
through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that will fail to retire unsupported gLite software by 01-11-2012, will be eligible for suspension and the problem <br />
will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
<br />
EMI2 WN problem<br />
=======<br />
Worker nodes should not be updated to EMI 2, because there are some problems with this release. This was not officially announced. EMI <br />
1 WN installations are not affected.<br />
<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Pavel Weber, Tobias Koenig)<br />
Most services have been updated to EMI 1/2, except the CREAM CEs and APEL. APEL still runs gLite 3.2. We plan to update next week and will be in contact <br />
via email.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare delle Fratte)<br />
No EMI service is running at RZG yet.<br />
The EMI CREAM installation is planned for no earlier than the second half of October.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
Still working on the EMI updates. Our SE has lower priority because we are migrating our data. Other services should be updated within <br />
the next two weeks.<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton Gamel via Email)<br />
After some problems at the beginning of the month (the torque 2.5.12 problem already mentioned) that cost a _lot_ of availability/reliability %, the site is doing well again.<br />
Started to deploy an xrootd cache machine (xrd.bfg.uni-freiburg.de).<br />
A user with lots of accesses to the EOS SRM caused a ban of some of our WNs at the CERN storage.<br />
Under investigation. State at the moment:<br />
The root ./Analysis binary requests files that are not in the input list. Reason and "default path" unclear.<br />
<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.<br />
<br />
Next meeting will be in two or three weeks</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting&diff=40970NGI DE CH Operations Center:Operations Meeting2012-09-28T13:57:57Z<p>Tkoenig: /* Minutes */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center|NGI-DE/NGI-CH Operations Center]]<br />
<br />
=Next Meeting=<br />
<br />
=How to Connect=<br />
<br />
Connect Via Phone:<br />
<br />
The following DFN ISDN gateways are available:<br />
Germany - Berlin:+49-30-2541080<br />
Germany - Stuttgart:+49-711-6330190<br />
<br />
If requested, please enter the following conference number: 97922688<br />
<br />
<br />
Outdated info:<br />
<br />
''For the NGI-DE/-CH operations meeting we use the EVO online conference system. You can connect via the EVO web client or via a normal telephone by using one of the telephone bridges. You can find more general information about EVO at https://wiki.egi.eu/wiki/NGI_DE:EVO. The meeting access information and the exact date will be sent via email to our email list.''<br />
<br />
=Minutes=<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:01072011|01.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:22072011|22.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:05082011|05.08.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02092011|02.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:30092011|30.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:21102011|21.10.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:11112011|11.11.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02122011|02.12.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:03022012|03.02.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:16032012|16.03.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:06072012|06.07.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:14092012|14.09.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:28092012|28.09.2012]]<br />
<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:Template|Template]]</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting&diff=40969NGI DE CH Operations Center:Operations Meeting2012-09-28T13:57:22Z<p>Tkoenig: /* Next Meeting */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center|NGI-DE/NGI-CH Operations Center]]<br />
<br />
=Next Meeting=<br />
<br />
=How to Connect=<br />
<br />
Connect Via Phone:<br />
<br />
The following DFN ISDN gateways are available:<br />
Germany - Berlin:+49-30-2541080<br />
Germany - Stuttgart:+49-711-6330190<br />
<br />
If requested, please enter the following conference number: 97922688<br />
<br />
<br />
Outdated info:<br />
<br />
''For the NGI-DE/-CH operations meeting we use the EVO online conference system. You can connect via the EVO web client or via a normal telephone by using one of the telephone bridges. You can find more general information about EVO at https://wiki.egi.eu/wiki/NGI_DE:EVO. The meeting access information and the exact date will be sent via email to our email list.''<br />
<br />
=Minutes=<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:01072011|01.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:22072011|22.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:05082011|05.08.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02092011|02.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:30092011|30.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:21102011|21.10.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:11112011|11.11.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02122011|02.12.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:03022012|03.02.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:16032012|16.03.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:06072012|06.07.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:14092012|14.09.2012]]<br />
<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:Template|Template]]</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=40968NGI DE CH Operations Center:Operations Meeting:280920122012-09-28T13:56:54Z<p>Tkoenig: /* AOB */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
* Monitoring<br />
Production update to release 17.1.<br />
There are still some problems with the MyEGI web interface. Sites that receive information for their local monitoring system should <br />
have a look. The system itself works, and all nodes are tested.<br />
There were some problems with the monitoring of WMSs, but they should be fixed by now. Because of them, the monitoring system was unavailable for two <br />
hours. We will recompute the availability statistics.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012 site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services, will be contacted <br />
through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that will fail to retire unsupported gLite software by 01-11-2012, will be eligible for suspension and the problem <br />
will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
<br />
EMI2 WN problem<br />
=======<br />
Worker nodes should not be updated to EMI 2, because there are some problems with this release. This was not officially announced. EMI <br />
1 WN installations are not affected.<br />
<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Pavel Weber, Tobias Koenig)<br />
Most services have been updated to EMI 1/2, except the CREAM CEs and APEL. APEL still runs gLite 3.2. We plan to update next week and will be in contact <br />
via email.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare delle Fratte)<br />
No EMI service is running at RZG yet.<br />
The EMI CREAM installation is planned for no earlier than the second half of October.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
Still working on the EMI updates. Our SE has lower priority because we are migrating our data. Other services should be updated within <br />
the next two weeks.<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.<br />
<br />
Next meeting will be in two or three weeks</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=40967NGI DE CH Operations Center:Operations Meeting:280920122012-09-28T13:56:27Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
* Monitoring<br />
Production update to release 17.1.<br />
There are still some problems with the MyEGI web interface. Sites that receive information for their local monitoring system should <br />
have a look. The system itself works, and all nodes are tested.<br />
There were some problems with the monitoring of WMSs, but they should be fixed by now. Because of them, the monitoring system was unavailable for two <br />
hours. We will recompute the availability statistics.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012 site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services, will be contacted <br />
through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that will fail to retire unsupported gLite software by 01-11-2012, will be eligible for suspension and the problem <br />
will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
<br />
EMI2 WN problem<br />
=======<br />
Worker nodes should not be updated to EMI 2, because there are some problems with this release. This was not officially announced. EMI <br />
1 WN installations are not affected.<br />
<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Pavel Weber, Tobias Koenig)<br />
Most services have been updated to EMI 1/2, except the CREAM CEs and APEL. APEL still runs gLite 3.2. We plan to update next week and will be in contact <br />
via email.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare delle Fratte)<br />
No EMI service is running at RZG yet.<br />
The EMI CREAM installation is planned for no earlier than the second half of October.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
Still working on the EMI updates. Our SE has lower priority because we are migrating our data. Other services should be updated within <br />
the next two weeks.<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=40965NGI DE CH Operations Center:Operations Meeting:280920122012-09-28T13:47:39Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
* Monitoring<br />
Production update to release 17.1.<br />
There are still some problems with the MyEGI web interface. Sites that receive information for their local monitoring system should <br />
have a look. The system itself works, and all nodes are tested.<br />
There were some problems with the monitoring of WMSs, but they should be fixed by now. Because of them, the monitoring system was unavailable for two <br />
hours. We will recompute the availability statistics.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012 site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services, will be contacted <br />
through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that will fail to retire unsupported gLite software by 01-11-2012, will be eligible for suspension and the problem <br />
will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
<br />
EMI2 WN problem<br />
=======<br />
Worker nodes should not be updated to EMI 2, because there are some problems with this release. This was not officially announced. EMI <br />
1 WN installations are not affected.<br />
<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Pavel Weber, Tobias Koenig)<br />
Most services have been updated to EMI 1/2.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare delle Fratte)<br />
No EMI service is running at RZG yet.<br />
The EMI CREAM installation is planned for no earlier than the second half of October.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=40964NGI DE CH Operations Center:Operations Meeting:280920122012-09-28T13:42:39Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
* Monitoring<br />
Production update to release 17.1.<br />
There are still some problems with the MyEGI web interface. Sites that receive information for their local monitoring system should <br />
have a look. The system itself works, and all nodes are tested.<br />
There were some problems with the monitoring of WMSs, but they should be fixed by now. Because of them, the monitoring system was unavailable for two <br />
hours. We will recompute the availability statistics.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012 site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services, will be contacted <br />
through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that will fail to retire unsupported gLite software by 01-11-2012, will be eligible for suspension and the problem <br />
will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Pavel Weber, Tobias Koenig)<br />
most services to EMI1/2<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare delle Fratte)<br />
No EMI service running at RZG yet<br />
EMI CREAM installation planned for the 2nd half of October at the earliest<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=40963NGI DE CH Operations Center:Operations Meeting:280920122012-09-28T13:35:19Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
* Monitoring<br />
production update to release 17.1<br />
still some problems with myegi web interface.<br />
Some problems with monitoring of WMSs, but these should be fixed now.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012, site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services will be contacted <br />
through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that fail to retire unsupported gLite software by 01-11-2012 will be eligible for suspension, and the problem <br />
will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Pavel Weber, Tobias Koenig)<br />
most services to EMI1/2<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare delle Fratte)<br />
No EMI service running at RZG yet<br />
EMI CREAM installation planned for the 2nd half of October at the earliest<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=40962NGI DE CH Operations Center:Operations Meeting:280920122012-09-28T13:32:28Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
* Monitoring<br />
production update to release 17.1<br />
still some problems with myegi web interface.<br />
Some problems with monitoring of WMSs, but these should be fixed now.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012, site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services will be contacted <br />
through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that fail to retire unsupported gLite software by 01-11-2012 will be eligible for suspension, and the problem <br />
will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
most services to EMI1/2<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
No EMI service running at RZG yet<br />
EMI CREAM installation planned for the 2nd half of October at the earliest<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:28092012&diff=40961NGI DE CH Operations Center:Operations Meeting:280920122012-09-28T13:31:55Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
* Availability/reliability statistics<br />
90%<br />
Three sites did not hit the target:<br />
RWTH-Aachen 52%<br />
UNI-SIEGEN-HEP 69%<br />
* Monitoring<br />
production update to release 17.1<br />
still some problems with myegi web interface.<br />
Some problems with monitoring of WMSs, but these should be fixed now.<br />
* Staged rollout/updates<br />
<pre><br />
gLite 3.1<br />
======<br />
As already announced in [1] and in a number of other advisories, the gLite 3.1 distribution is now no longer supported <br />
(http://glite.cern.ch/R3.1/) and SL4 reached end of security support on 02/02/2012. Security patches for gLite 3.1 and SL4 are no <br />
longer available.<br />
<br />
Unsupported gLite 3.2 products<br />
====================<br />
The gLite 3.2 components currently out of security support are: APEL, ARGUS, BDII, Cluster, CREAM, dCache, LB, LSF utils, MPI utils, <br />
SCAS, SGE utils, Torque client/server/utils, VOMS [1].<br />
<br />
Decommissioning by 30-09-2012<br />
=====================<br />
gLite 3.1 products and *unsupported* gLite 3.2 software components have to be retired by 30-09-2012. Site managers can choose <br />
between the option of upgrading to a supported UMD release of the product [2] in consultation with the supported VOs, or the <br />
decommissioning of the service following the related procedure [3].<br />
<br />
Note well: this retirement calendar does not apply to gLite 3.2 products that are still supported. As already announced in [4], the <br />
support of glite 3.2 glite-UI, glite-WN, glite-GLEXEC_wn, glite-LFC_mysql/glite-LFC_oracle, glite-SE_dpm_disk/glite-SE_dpm_mysql <br />
was recently extended to 30/11/2012.<br />
<br />
Escalation<br />
=======<br />
Starting from 01-10-2012, site managers of Resource Centres found to be hosting unsupported gLite 3.1/3.2 services will be contacted through GGUS by the Central Grid Oversight team to request the retirement of the affected products.<br />
Resource Centres that fail to retire unsupported gLite software by 01-11-2012 will be eligible for suspension, and the problem will be escalated to EGI CSIRT for the enforcement of this suspension policy.<br />
</pre><br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
most services to EMI1/2<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
No EMI service running at RZG yet<br />
EMI CREAM installation planned for the 2nd half of October at the earliest<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting&diff=40735NGI DE CH Operations Center:Operations Meeting2012-09-20T11:59:45Z<p>Tkoenig: /* Minutes */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center|NGI-DE/NGI-CH Operations Center]]<br />
<br />
=Next Meeting=<br />
<br />
=How to Connect=<br />
<br />
Connect Via Phone:<br />
<br />
The following DFN ISDN gateways are available:<br />
Germany - Berlin:+49-30-2541080<br />
Germany - Stuttgart:+49-711-6330190<br />
<br />
If requested, please enter the following conference number: 97922688<br />
<br />
<br />
Outdated info:<br />
<br />
''For the NGI-DE/-CH operations meeting we use the EVO online conference system. There is the possibility to connect via EVO web client or via a normal telephone by using one of the telephone bridges. You can find more general information about EVO at https://wiki.egi.eu/wiki/NGI_DE:EVO The meeting access information and the exact date will be sent around via email to our Email list.''<br />
<br />
=Minutes=<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:01072011|01.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:22072011|22.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:05082011|05.08.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02092011|02.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:30092011|30.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:21102011|21.10.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:11112011|11.11.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02122011|02.12.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:03022012|03.02.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:16032012|16.03.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:06072012|06.07.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:14092012|14.09.2012]]<br />
<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:Template|Template]]</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting&diff=40734NGI DE CH Operations Center:Operations Meeting2012-09-20T11:58:24Z<p>Tkoenig: /* Minutes */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center|NGI-DE/NGI-CH Operations Center]]<br />
<br />
=Next Meeting=<br />
<br />
=How to Connect=<br />
<br />
Connect Via Phone:<br />
<br />
The following DFN ISDN gateways are available:<br />
Germany - Berlin:+49-30-2541080<br />
Germany - Stuttgart:+49-711-6330190<br />
<br />
If requested, please enter the following conference number: 97922688<br />
<br />
<br />
Outdated info:<br />
<br />
''For the NGI-DE/-CH operations meeting we use the EVO online conference system. There is the possibility to connect via EVO web client or via a normal telephone by using one of the telephone bridges. You can find more general information about EVO at https://wiki.egi.eu/wiki/NGI_DE:EVO The meeting access information and the exact date will be sent around via email to our Email list.''<br />
<br />
=Minutes=<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:01072011|01.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:22072011|22.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:05082011|05.08.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02092011|02.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:30092011|30.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:21102011|21.10.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:11112011|11.11.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02122011|02.12.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:03022012|03.02.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:16032012|16.03.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:14092012|14.09.2012]]<br />
<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:Template|Template]]</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting&diff=40733NGI DE CH Operations Center:Operations Meeting2012-09-20T11:58:13Z<p>Tkoenig: /* Minutes */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center|NGI-DE/NGI-CH Operations Center]]<br />
<br />
=Next Meeting=<br />
<br />
=How to Connect=<br />
<br />
Connect Via Phone:<br />
<br />
The following DFN ISDN gateways are available:<br />
Germany - Berlin:+49-30-2541080<br />
Germany - Stuttgart:+49-711-6330190<br />
<br />
If requested, please enter the following conference number: 97922688<br />
<br />
<br />
Outdated info:<br />
<br />
''For the NGI-DE/-CH operations meeting we use the EVO online conference system. There is the possibility to connect via EVO web client or via a normal telephone by using one of the telephone bridges. You can find more general information about EVO at https://wiki.egi.eu/wiki/NGI_DE:EVO The meeting access information and the exact date will be sent around via email to our Email list.''<br />
<br />
=Minutes=<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:01072011|01.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:22072011|22.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:05082011|05.08.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02092011|02.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:30092011|30.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:21102011|21.10.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:11112011|11.11.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02122011|02.12.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:03022012|03.02.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:16032012|16.03.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:14092012|14.09.2012]]<br />
<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:Template|Template]]</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting&diff=40732NGI DE CH Operations Center:Operations Meeting2012-09-20T11:57:15Z<p>Tkoenig: /* Next Meeting */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center|NGI-DE/NGI-CH Operations Center]]<br />
<br />
=Next Meeting=<br />
<br />
=How to Connect=<br />
<br />
Connect Via Phone:<br />
<br />
The following DFN ISDN gateways are available:<br />
Germany - Berlin:+49-30-2541080<br />
Germany - Stuttgart:+49-711-6330190<br />
<br />
If requested, please enter the following conference number: 97922688<br />
<br />
<br />
Outdated info:<br />
<br />
''For the NGI-DE/-CH operations meeting we use the EVO online conference system. There is the possibility to connect via EVO web client or via a normal telephone by using one of the telephone bridges. You can find more general information about EVO at https://wiki.egi.eu/wiki/NGI_DE:EVO The meeting access information and the exact date will be sent around via email to our Email list.''<br />
<br />
=Minutes=<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:01072011|01.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:22072011|22.07.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:05082011|05.08.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02092011|02.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:30092011|30.09.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:21102011|21.10.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:11112011|11.11.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:02122011|02.12.2011]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:03022012|03.02.2012]]<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:16032012|16.03.2012]]<br />
<br />
<br />
*[[NGI_DE_CH_Operations_Center:Operations_Meeting:Template|Template]]</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:14092012&diff=40731NGI DE CH Operations Center:Operations Meeting:140920122012-09-20T11:48:57Z<p>Tkoenig: /* AOB */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI: next week there is the Technical Forum in Prague <br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf<br />
<br />
90%<br />
<br />
Three sites did not hit the target:<br />
RWTH-Aachen 68%<br />
UNI-FREIBURG 59%<br />
UNI-SIEGEN-HEP 31%<br />
<br />
* Monitoring<br />
Update 17 tested; it seems to work fine. Next week we will update our production system. Update 17 includes sensors for Globus, UNICORE <br />
and EMI 2 WNs.<br />
<br />
* Staged rollout/updates<br />
DN Publishing<br />
-----Original Message-----<br />
From: Operations of NGI-DE [mailto:NGI-DE-OPERATIONS@LISTSERV.DFN.DE] On behalf of Dimitri Nilsen<br />
Sent: Tuesday, 31 July 2012 18:20<br />
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE<br />
Subject: Publishing User DNs<br />
Dear Sites,<br />
according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting.<br />
Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.<br />
<br />
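A site admin might check the setting mentioned above with a short script. This is only a minimal sketch: the element names in the sample (`Config`, `Processor`) are hypothetical placeholders, and the actual layout of publisher-config.xml is not described in these minutes. The script looks for the attribute `publishGlobalUserName` on any element, wherever it appears.

```python
import xml.etree.ElementTree as ET


def publishes_user_dns(xml_text: str) -> bool:
    """Return True if the attribute publishGlobalUserName is present
    on at least one element and is "yes" everywhere it appears."""
    root = ET.fromstring(xml_text)
    # root.iter() walks the root element and all of its descendants.
    values = [
        element.get("publishGlobalUserName")
        for element in root.iter()
        if element.get("publishGlobalUserName") is not None
    ]
    return bool(values) and all(v.strip().lower() == "yes" for v in values)


# Hypothetical fragment; the element names are assumptions for illustration.
SAMPLE = """<Config>
  <Processor publishGlobalUserName="yes"/>
</Config>"""

if __name__ == "__main__":
    print(publishes_user_dns(SAMPLE))
```

To check a real file, read its contents and pass the text to `publishes_user_dns`; the file's location on the APEL box is site-specific.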
Status of releases<br />
Sites that support WLCG VOs should update to an EMI release, at least EMI 1, by 1st October. gLite releases should no longer be <br />
supported. We at KIT are currently updating our services (WMSs, sBDIIs, CREAMs). WNs will follow. Dimitri will send around a list <br />
of versions and deadlines. The 1st October deadline is a little unrealistic.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin)<br />
- all WNs and CEs updated to SL6 and EMI-2/UMD-2<br />
- one SE node, APEL node, site BDII still running glite 3.2<br />
- What is the status of this [https://operations-portal.egi.eu/broadcast/archive/id/725 EGI broadcast]?<br />
* KIT (GridKa, FZK-LCG2)<br />
* KIT (Uni Karlsruhe, Dimitri, Tobias)<br />
EMI migration. Plans to update WNs to EMI 2. Any experience with emi-wn?<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare)<br />
- deployment of CVMFS<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton)<br />
- one reason for the low performance/availability/reliability: the SAM test failed over several days, so the site was offline. This was caused by a monitoring <br />
problem. Aachen had the same problem. Now it is working again.<br />
- the file system of one of our pools crashed; we lost 15 TB of data, which we were able to partially restore. Interesting: <br />
afterwards we had to put the site offline from time to time because the data flow of the restore process was so high that jobs were <br />
blocked. Additional downtime was needed to restore the files.<br />
- We need a downtime at the end of the month to update dCache to 1.9.12 and to install CernVM-FS. We also did a TORQUE update on the CREAMs, <br />
but this TORQUE version blocked proxies, so we downgraded. For the old gLite and EMI versions there is still an old version of a <br />
TORQUE package in the repository. Recommendation from Dimitri: an email should be written to the rollout list. Next week in Prague <br />
Dimitri can also ask the people from EMI. <br />
- added some WNs<br />
- Migration to EMI 2 started, CREAM 3 in test phase<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paulo)<br />
- increased capacity of compute to 2200 cores<br />
- preparing maintenance for next Tuesday to fix dCache pool nodes<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
* Bad metrics for ROD shifts over the last months. The problem was the handling of tickets in expired state. Please handle tickets more carefully to avoid such situations.<br />
* Rotation table was updated.<br />
<br />
==AOB==<br />
* Next meeting will be in two weeks after the Prague meeting on 28 September<br />
<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:14092012&diff=40730NGI DE CH Operations Center:Operations Meeting:140920122012-09-20T11:47:55Z<p>Tkoenig: /* Status ROD */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI: next week there is the Technical Forum in Prague <br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf<br />
<br />
90%<br />
<br />
Three sites did not hit the target:<br />
RWTH-Aachen 68%<br />
UNI-FREIBURG 59%<br />
UNI-SIEGEN-HEP 31%<br />
<br />
* Monitoring<br />
Update 17 tested; it seems to work fine. Next week we will update our production system. Update 17 includes sensors for Globus, UNICORE <br />
and EMI 2 WNs.<br />
<br />
* Staged rollout/updates<br />
DN Publishing<br />
-----Original Message-----<br />
From: Operations of NGI-DE [mailto:NGI-DE-OPERATIONS@LISTSERV.DFN.DE] On behalf of Dimitri Nilsen<br />
Sent: Tuesday, 31 July 2012 18:20<br />
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE<br />
Subject: Publishing User DNs<br />
Dear Sites,<br />
according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting.<br />
Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.<br />
<br />
Status of releases<br />
Sites that support WLCG VOs should update to an EMI release, at least EMI 1, by 1st October. gLite releases should no longer be <br />
supported. We at KIT are currently updating our services (WMSs, sBDIIs, CREAMs). WNs will follow. Dimitri will send around a list <br />
of versions and deadlines. The 1st October deadline is a little unrealistic.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin)<br />
- all WNs and CEs updated to SL6 and EMI-2/UMD-2<br />
- one SE node, APEL node, site BDII still running glite 3.2<br />
- What is the status of this [https://operations-portal.egi.eu/broadcast/archive/id/725 EGI broadcast]?<br />
* KIT (GridKa, FZK-LCG2)<br />
* KIT (Uni Karlsruhe, Dimitri, Tobias)<br />
EMI migration. Plans to update WNs to EMI 2. Any experience with emi-wn?<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare)<br />
- deployment of CVMFS<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton)<br />
- one reason for the low performance/availability/reliability: the SAM test failed over several days, so the site was offline. This was caused by a monitoring <br />
problem. Aachen had the same problem. Now it is working again.<br />
- the file system of one of our pools crashed; we lost 15 TB of data, which we were able to partially restore. Interesting: <br />
afterwards we had to put the site offline from time to time because the data flow of the restore process was so high that jobs were <br />
blocked. Additional downtime was needed to restore the files.<br />
- We need a downtime at the end of the month to update dCache to 1.9.12 and to install CernVM-FS. We also did a TORQUE update on the CREAMs, <br />
but this TORQUE version blocked proxies, so we downgraded. For the old gLite and EMI versions there is still an old version of a <br />
TORQUE package in the repository. Recommendation from Dimitri: an email should be written to the rollout list. Next week in Prague <br />
Dimitri can also ask the people from EMI. <br />
- added some WNs<br />
- Migration to EMI 2 started, CREAM 3 in test phase<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paulo)<br />
- increased capacity of compute to 2200 cores<br />
- preparing maintenance for next Tuesday to fix dCache pool nodes<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
* Bad metrics for ROD shifts over the last months. The problem was the handling of tickets in expired state. Please handle tickets more carefully to avoid such situations.<br />
* Rotation table was updated.<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:14092012&diff=40729NGI DE CH Operations Center:Operations Meeting:140920122012-09-20T11:45:53Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI: next week there is the Technical Forum in Prague <br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf<br />
<br />
90%<br />
<br />
Three sites did not hit the target:<br />
RWTH-Aachen 68%<br />
UNI-FREIBURG 59%<br />
UNI-SIEGEN-HEP 31%<br />
<br />
* Monitoring<br />
Update 17 tested; it seems to work fine. Next week we will update our production system. Update 17 includes sensors for Globus, UNICORE <br />
and EMI 2 WNs.<br />
<br />
* Staged rollout/updates<br />
DN Publishing<br />
-----Original Message-----<br />
From: Operations of NGI-DE [mailto:NGI-DE-OPERATIONS@LISTSERV.DFN.DE] On behalf of Dimitri Nilsen<br />
Sent: Tuesday, 31 July 2012 18:20<br />
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE<br />
Subject: Publishing User DNs<br />
Dear Sites,<br />
according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting.<br />
Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.<br />
<br />
Status of releases<br />
Sites that support WLCG VOs should update to an EMI release, at least EMI 1, by 1st October. gLite releases should no longer be <br />
supported. We at KIT are currently updating our services (WMSs, sBDIIs, CREAMs). WNs will follow. Dimitri will send around a list <br />
of versions and deadlines. The 1st October deadline is a little unrealistic.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin)<br />
- all WNs and CEs updated to SL6 and EMI-2/UMD-2<br />
- one SE node, APEL node, site BDII still running glite 3.2<br />
- What is the status of this [https://operations-portal.egi.eu/broadcast/archive/id/725 EGI broadcast]?<br />
* KIT (GridKa, FZK-LCG2)<br />
* KIT (Uni Karlsruhe, Dimitri, Tobias)<br />
 EMI migration. Plans to update WNs to EMI 2. Any experience with emi-wn?<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare)<br />
 - deployment of CVMFS<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton)<br />
 - One reason for the low availability/reliability: SAM tests failed for several days, so the site was offline. This was caused by a monitoring <br />
   problem. Aachen had the same problem. It is working again now.<br />
 - The file system of one of our pools crashed; we lost 15 TB of data, which we were able to restore partially. Interesting: <br />
   afterwards we had to take the site offline from time to time because the data flow of the restore process was so high that jobs were <br />
   blocked. Additional downtime was needed to restore the files.<br />
 - We need a downtime at the end of the month to update dCache to 1.9.12 and to install CERNVMFS. We also did a TORQUE update on the CREAMs, <br />
   but this TORQUE version blocked proxies, so we downgraded. For the old gLite and EMI versions there is still an old version of a <br />
   TORQUE package in the repository. Recommendation from Dimitri: an email should be written to the rollout list. Next week in Prague <br />
   Dimitri can also ask the people from EMI. <br />
 - Added some WNs.<br />
 - Migration to EMI 2 started; CREAM 3 is in the test phase.<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paulo)<br />
- increased capacity of compute to 2200 cores<br />
 - preparing maintenance for next Tuesday to fix dCache pool nodes<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:14092012&diff=40728NGI DE CH Operations Center:Operations Meeting:140920122012-09-20T11:41:05Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI: next week there is the Technical Forum in Prague <br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf<br />
<br />
90%<br />
<br />
Three sites did not hit the target:<br />
RWTH-Aachen 68%<br />
UNI-FREIBURG 59%<br />
UNI-SIEGEN-HEP 31%<br />
<br />
* Monitoring<br />
Update 17 has been tested and seems to work fine. Next week we will update our production system. Update 17 includes sensors for Globus, Unicore <br />
and EMI 2 WNs.<br />
<br />
* Staged rollout/updates<br />
DN Publishing<br />
-----Original Message-----<br />
From: Operations of NGI-DE [mailto:NGI-DE-OPERATIONS@LISTSERV.DFN.DE] On behalf of Dimitri Nilsen<br />
Sent: Tuesday, 31 July 2012 18:20<br />
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE<br />
Subject: Publishing User DNs<br />
Dear Sites,<br />
according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting.<br />
Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.<br />
<br />
Status of releases<br />
Sites that support WLCG VOs should update to an EMI release, at least EMI 1, by 1st October. gLite releases should no longer be <br />
supported. We at KIT are currently updating our services (WMSs, sBDIIs, CREAMs). WNs will follow. Dimitri will send around a list <br />
of versions and deadlines. The 1st October deadline is a little bit unrealistic.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin)<br />
- all WNs and CEs updated to SL6 and EMI-2/UMD-2<br />
- one SE node, APEL node, site BDII still running glite 3.2<br />
- What is the status of this [https://operations-portal.egi.eu/broadcast/archive/id/725 EGI broadcast]?<br />
* KIT (GridKa, FZK-LCG2)<br />
* KIT (Uni Karlsruhe, Dimitri, Tobias)<br />
 EMI migration. Plans to update WNs to EMI 2. Any experience with emi-wn?<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare)<br />
 - deployment of CVMFS<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton)<br />
 - One reason for the low availability/reliability: SAM tests failed for several days, so the site was offline. This was caused by a monitoring <br />
   problem. Aachen had the same problem. It is working again now.<br />
 - The file system of one of our pools crashed; we lost 15 TB of data, which we were able to restore partially. Interesting: <br />
   afterwards we had to take the site offline from time to time because the data flow of the restore process was so high that jobs were <br />
   blocked. Additional downtime was needed to restore the files.<br />
 - We need a downtime at the end of the month to update dCache to 1.9.12 and to install CERNVMFS. We also did a TORQUE update on the CREAMs, <br />
   but this TORQUE version blocked proxies, so we downgraded. For the old gLite and EMI versions there is still an old version of a <br />
   TORQUE package in the repository. Recommendation from Dimitri: an email should be written to the rollout list.<br />
 - Added some WNs.<br />
 - Migration to EMI 2 started; CREAM 3 is in the test phase.<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:14092012&diff=40727NGI DE CH Operations Center:Operations Meeting:140920122012-09-20T11:39:45Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI: next week there is the Technical Forum in Prague <br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf<br />
<br />
90%<br />
<br />
Three sites did not hit the target:<br />
RWTH-Aachen 68%<br />
UNI-FREIBURG 59%<br />
UNI-SIEGEN-HEP 31%<br />
<br />
* Monitoring<br />
Update 17 has been tested and seems to work fine. Next week we will update our production system. Update 17 includes sensors for Globus, Unicore <br />
and EMI 2 WNs.<br />
<br />
* Staged rollout/updates<br />
DN Publishing<br />
-----Original Message-----<br />
From: Operations of NGI-DE [mailto:NGI-DE-OPERATIONS@LISTSERV.DFN.DE] On behalf of Dimitri Nilsen<br />
Sent: Tuesday, 31 July 2012 18:20<br />
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE<br />
Subject: Publishing User DNs<br />
Dear Sites,<br />
according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting.<br />
Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.<br />
<br />
Status of releases<br />
Sites that support WLCG VOs should update to an EMI release, at least EMI 1, by 1st October. gLite releases should no longer be <br />
supported. We at KIT are currently updating our services (WMSs, sBDIIs, CREAMs). WNs will follow. Dimitri will send around a list <br />
of versions and deadlines. The 1st October deadline is a little bit unrealistic.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin)<br />
- all WNs and CEs updated to SL6 and EMI-2/UMD-2<br />
- one SE node, APEL node, site BDII still running glite 3.2<br />
- What is the status of this [https://operations-portal.egi.eu/broadcast/archive/id/725 EGI broadcast]?<br />
* KIT (GridKa, FZK-LCG2)<br />
* KIT (Uni Karlsruhe, Dimitri, Tobias)<br />
 EMI migration. Plans to update WNs to EMI 2. Any experience with emi-wn?<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare)<br />
 - deployment of CVMFS<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton)<br />
 - One reason for the low availability/reliability: SAM tests failed for several days, so the site was offline. This was caused by a monitoring <br />
   problem. Aachen had the same problem. It is working again now.<br />
 - The file system of one of our pools crashed; we lost 15 TB of data, which we were able to restore partially. Interesting: <br />
   afterwards we had to take the site offline from time to time because the data flow of the restore process was so high that jobs were <br />
   blocked. Additional downtime was needed to restore the files.<br />
 - We need a downtime at the end of the month to update dCache to 1.9.12 and to install CERNVMFS. We also did a TORQUE update on the CREAMs, <br />
   but this TORQUE version blocked proxies, so we downgraded. For the old gLite and EMI versions there is still an old version of a <br />
   TORQUE package in the repository. An email should be written to the rollout list.<br />
 - Added some WNs.<br />
 - Migration to EMI 2 started; CREAM 3 is in the test phase.<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:14092012&diff=40726NGI DE CH Operations Center:Operations Meeting:140920122012-09-20T11:15:12Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI: next week there is the Technical Forum in Prague <br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf<br />
<br />
90%<br />
<br />
Three sites did not hit the target:<br />
RWTH-Aachen 68%<br />
UNI-FREIBURG 59%<br />
UNI-SIEGEN-HEP 31%<br />
<br />
* Monitoring<br />
Update 17 has been tested and seems to work fine. Next week we will update our production system. Update 17 includes sensors for Globus, Unicore <br />
and EMI 2 WNs.<br />
<br />
* Staged rollout/updates<br />
DN Publishing<br />
-----Original Message-----<br />
From: Operations of NGI-DE [mailto:NGI-DE-OPERATIONS@LISTSERV.DFN.DE] On behalf of Dimitri Nilsen<br />
Sent: Tuesday, 31 July 2012 18:20<br />
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE<br />
Subject: Publishing User DNs<br />
Dear Sites,<br />
according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting.<br />
Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.<br />
<br />
Status of releases<br />
Sites that support WLCG VOs should update to an EMI release, at least EMI 1, by 1st October. gLite releases should no longer be <br />
supported. We at KIT are currently updating our services (WMSs, sBDIIs, CREAMs). WNs will follow. Dimitri will send around a list <br />
of versions and deadlines. The 1st October deadline is a little bit unrealistic.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
- all WNs and CEs updated to SL6 and EMI-2/UMD-2<br />
- one SE node, APEL node, site BDII still running glite 3.2<br />
- What is the status of this [https://operations-portal.egi.eu/broadcast/archive/id/725 EGI broadcast]?<br />
* KIT (GridKa, FZK-LCG2)<br />
* KIT (Uni Karlsruhe)<br />
 EMI migration. Plans to update WNs to EMI 2. Any experience with emi-wn?<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
 - deployment of CVMFS<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:14092012&diff=40725NGI DE CH Operations Center:Operations Meeting:140920122012-09-20T11:08:38Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI: next week Technical Forum Prague <br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf<br />
<br />
90%<br />
<br />
Three sites did not hit the target:<br />
RWTH-Aachen 68%<br />
UNI-FREIBURG 59%<br />
UNI-SIEGEN-HEP 31%<br />
<br />
* Monitoring<br />
Update 17 has been tested. Update 17 includes sensors for Globus, Unicore and EMI 2 WNs.<br />
<br />
* Staged rollout/updates<br />
DN Publishing<br />
-----Original Message-----<br />
From: Operations of NGI-DE [mailto:NGI-DE-OPERATIONS@LISTSERV.DFN.DE] On behalf of Dimitri Nilsen<br />
Sent: Tuesday, 31 July 2012 18:20<br />
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE<br />
Subject: Publishing User DNs<br />
Dear Sites,<br />
according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting.<br />
Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.<br />
<br />
Status of releases<br />
Sites that support WLCG VOs should update to an EMI release, at least EMI 1, by 1st October. gLite releases should no longer be <br />
supported. We at KIT are currently updating our services (WMSs, sBDIIs, CREAMs). WNs will follow. Dimitri will send around a list <br />
of versions. The 1st October deadline is unrealistic.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
- all WNs and CEs updated to SL6 and EMI-2/UMD-2<br />
- one SE node, APEL node, site BDII still running glite 3.2<br />
- What is the status of this [https://operations-portal.egi.eu/broadcast/archive/id/725 EGI broadcast]?<br />
* KIT (GridKa, FZK-LCG2)<br />
* KIT (Uni Karlsruhe)<br />
 EMI migration. Plans to update WNs to EMI 2. Any experience with emi-wn?<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
 - deployment of CVMFS<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:14092012&diff=40724NGI DE CH Operations Center:Operations Meeting:140920122012-09-20T11:04:45Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI: next week Technical Forum Prague <br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1332&version=2&filename=EGI_Aug2012.pdf<br />
<br />
90%<br />
<br />
Three sites did not hit the target:<br />
RWTH-Aachen 68%<br />
UNI-FREIBURG 59%<br />
UNI-SIEGEN-HEP 31%<br />
<br />
* Monitoring<br />
Update 17 has been tested. Update 17 includes sensors for Globus, Unicore and EMI 2 WNs.<br />
<br />
* Staged rollout/updates<br />
DN Publishing<br />
-----Original Message-----<br />
From: Operations of NGI-DE [mailto:NGI-DE-OPERATIONS@LISTSERV.DFN.DE] On behalf of Dimitri Nilsen<br />
Sent: Tuesday, 31 July 2012 18:20<br />
To: NGI-DE-OPERATIONS@LISTSERV.DFN.DE<br />
Subject: Publishing User DNs<br />
Dear Sites,<br />
according to the "Grid Policy on the Handling of User-Level Job Accounting Data", sites should publish User DNs for accounting.<br />
Please ensure you have publishGlobalUserName="yes" in publisher-config.xml on your APEL box.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
- all WNs and CEs updated to SL6 and EMI-2/UMD-2<br />
- one SE node, APEL node, site BDII still running glite 3.2<br />
- What is the status of this [https://operations-portal.egi.eu/broadcast/archive/id/725 EGI broadcast]?<br />
* KIT (GridKa, FZK-LCG2)<br />
* KIT (Uni Karlsruhe)<br />
 EMI migration. Plans to update WNs to EMI 2. Any experience with emi-wn?<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
 - deployment of CVMFS<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:06072012&diff=38317NGI DE CH Operations Center:Operations Meeting:060720122012-07-13T12:13:05Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI Tech.Forum 2012 in Prague<br />
http://tf2012.egi.eu/<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1251&version=2&filename=EGI_Jun2012.pdf<br />
Core services: 100%<br />
NGI_DE 95 % 95 %<br />
all sites hit a/r :)<br />
DESY reported that the region would have exceeded 95% if KIT had not published the wrong number of HepSpecs. <br />
<br />
* Monitoring<br />
waiting for update 17 for NGI-DE Nagios probes<br />
<br />
* Staged rollout/updates<br />
<br />
Middleware Baseline Versions for WLCG<br />
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions<br />
@all: We all should try to update to the listed versions. In our region a lot of sites are running older versions.<br />
<br />
Wrong value of site location in GOC-DB<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2386<br />
Dimitri from KIT sent an email around. Affected are DESY-Zeuthen, DGI-Referenceinstallation, KIT (already fixed), GSI, <br />
Maigrid, MPI-K, RWTH Aachen, Uni-Bonn, Uni-Dortmund<br />
<br />
Dimitri/KIT: Will send a survey about the baseline versions around before the Technical Forum in Prague<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH (Dmitri Ozerov)<br />
- running smoothly<br />
 - over the next two weeks we will increase the number of WNs<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- running smoothly<br />
- Update of one CE, SE, UI and WNs to SL6 x86_64 and EMI-2<br />
- Staged rollout of DPM_mysql, WN +TORQUE_client for UMD-2<br />
- SE (DPM_mysql) without problems, looking at Known Problems and GGUS for<br />
other products strongly recommended<br />
(e. g. GGUS #82746, #82899, #83398, #83548, #83692, ...)<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
 - EMI update ongoing<br />
- We plan to have some WNs connected to Grid Engine (batch system)<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- running smoothly<br />
- decommissioning a few old pools, migrating data to new pools<br />
 - both CREAMs (BDII service) stopped working; this is fixed now<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
 - problems with DECH VOMS, especially with a Java dependency. <br />
   I apologize for not sending notifications about registration problems. <br />
   @all: Please register again if someone is missing.<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton Gamel)<br />
- so far everything is ok<br />
 - updated TORQUE server to version 2.5.12. No new packages were available in the APEL repository, so we did it on our own, <br />
   and everything runs smoothly now<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paolo)<br />
- last Wednesday we had a maintenance:<br />
-- enabled CERNVMFS for ATLAS<br />
-- installed third CREAM for rolling update mechanism<br />
-- updated dCache to SL 5.7<br />
-- updated TORQUE to latest version 2.4.17<br />
- rest is fine <br />
* PSI<br />
* Switch<br />
 - on vacation until 1st of August; looking with Dimitri (KIT) for a deputy for the ROD shift.<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
  MPI-K is asking for help (GGUS Ticket 83822). Tests for CREAM CEs are failing. The cause is not known. The job finishes, but <br />
  without exit code 0. As a consequence, Nagios tries to restart the job.<br />
<br />
  Uni Bonn is suspended. It seemed to have a similar problem, but the cause there was a wrong setup of the infosystem.<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
Is there a problem with masking alarms? We have many tickets for the same problem. Conclusion: Tickets were not handled in the <br />
right way. Masking the alarm is more convenient.<br />
<br />
Performance:<br />
First number: alarms older than 72 hours<br />
Second number: tickets closed after their expiration date <br />
2012-01 NGI_DE 6 12<br />
2012-02 NGI_DE 2 1<br />
2012-03 NGI_DE 0 4<br />
2012-04 NGI_DE 2 14<br />
2012-05 NGI_DE 0 37<br />
2012-06 NGI_DE 0 8<br />
2012-07 NGI_DE 0 4<br />
<br />
Conclusion: Tickets counted in the second number should be monitored more closely.<br />
<br />
==AOB==<br />
Next meeting on 27.7. or 3.8.<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:06072012&diff=38316NGI DE CH Operations Center:Operations Meeting:060720122012-07-13T12:08:37Z<p>Tkoenig: /* AOB */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI Tech.Forum 2012 in Prague<br />
http://tf2012.egi.eu/<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1251&version=2&filename=EGI_Jun2012.pdf<br />
Core services: 100%<br />
NGI_DE 95 % 95 %<br />
all sites hit a/r :)<br />
DESY reported that the region would have exceeded 95% if KIT had not published the wrong number of HepSpecs. <br />
<br />
* Monitoring<br />
waiting for update 17 for NGI-DE Nagios probes<br />
<br />
* Staged rollout/updates<br />
<br />
Middleware Baseline Versions for WLCG<br />
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions<br />
@all: We all should try to update to the listed versions. In our region a lot of sites are running older versions.<br />
<br />
Wrong value of site location in GOC-DB<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2386<br />
Dimitri from KIT sent an email around. Affected are DESY-Zeuthen, DGI-Referenceinstallation, KIT (already fixed), GSI, <br />
Maigrid, MPI-K, RWTH Aachen, Uni-Bonn, Uni-Dortmund<br />
<br />
Dimitri/KIT: Will send a survey about the baseline versions around before the Technical Forum in Prague<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH (Dmitri Ozerov)<br />
- running smoothly<br />
 - over the next two weeks we will increase the number of WNs<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- running smoothly<br />
- Update of one CE, SE, UI and WNs to SL6 x86_64 and EMI-2<br />
- Staged rollout of DPM_mysql, WN +TORQUE_client for UMD-2<br />
- SE (DPM_mysql) without problems, looking at Known Problems and GGUS for<br />
other products strongly recommended<br />
(e. g. GGUS #82746, #82899, #83398, #83548, #83692, ...)<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
 - EMI update ongoing<br />
- We plan to have some WNs connected to Grid Engine (batch system)<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- running smoothly<br />
- decommissioning a few old pools, migrating data to new pools<br />
 - both CREAMs (BDII service) stopped working; this is fixed now<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
 - problems with DECH VOMS, especially with a Java dependency. <br />
   I apologize for not sending notifications about registration problems. <br />
   @all: Please register again if someone is missing.<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton Gamel)<br />
- so far everything is ok<br />
 - updated TORQUE server to version 2.5.12. No new packages were available in the APEL repository, so we did it on our own, <br />
   and everything runs smoothly now<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paolo)<br />
 - last Wednesday we had a maintenance:<br />
-- enabled CERNVMFS for ATLAS<br />
-- installed third CREAM for rolling update mechanism<br />
-- updated dCache to SL 5.7<br />
-- updated TORQUE to latest version 2.4.17<br />
- rest is fine <br />
* PSI<br />
* Switch<br />
 - on vacation until 1st of August; looking with Dimitri (KIT) for a deputy for the ROD shift.<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
  MPI-K is asking for help (GGUS Ticket 83822). Tests for CREAM CEs are failing. The cause is not known. The job finishes, but <br />
  without exit code 0. As a consequence, Nagios tries to restart the job.<br />
<br />
  Uni Bonn is suspended. It seemed to have a similar problem, but the cause there was a wrong setup of the infosystem.<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
Is there a problem with masking alarms? We have many tickets for the same problem. Conclusion: Tickets were not handled in the <br />
right way. Masking the alarm is more convenient.<br />
<br />
Performance:<br />
First number: alarms older than 72 hours<br />
Second number: tickets closed after their expiration date <br />
2012-01 NGI_DE 6 12<br />
2012-02 NGI_DE 2 1<br />
2012-03 NGI_DE 0 4<br />
2012-04 NGI_DE 2 14<br />
2012-05 NGI_DE 0 37<br />
2012-06 NGI_DE 0 8<br />
2012-07 NGI_DE 0 4<br />
<br />
Conclusion: Tickets counted in the second number should be monitored more closely.<br />
<br />
==AOB==<br />
Next meeting on 27.7. or 3.8.<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:06072012&diff=38315NGI DE CH Operations Center:Operations Meeting:060720122012-07-13T12:05:52Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI Tech.Forum 2012 in Prague<br />
http://tf2012.egi.eu/<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1251&version=2&filename=EGI_Jun2012.pdf<br />
Core services: 100%<br />
NGI_DE 95 % 95 %<br />
all sites hit a/r :)<br />
DESY reported that the region would have exceeded 95% if KIT had not published the wrong number of HepSpecs. <br />
<br />
* Monitoring<br />
waiting for update 17 for NGI-DE Nagios probes<br />
<br />
* Staged rollout/updates<br />
<br />
Middleware Baseline Versions for WLCG<br />
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions<br />
@all: We all should try to update to the listed versions. In our region a lot of sites are running older versions.<br />
<br />
Wrong value of site location in GOC-DB<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2386<br />
Dimitri from KIT sent an email around. Affected are DESY-Zeuthen, DGI-Referenceinstallation, KIT (already fixed), GSI, <br />
Maigrid, MPI-K, RWTH Aachen, Uni-Bonn, Uni-Dortmund<br />
<br />
Dimitri/KIT: Will circulate a survey about the baseline versions before the Technical Forum in Prague<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH (Dmitri Ozerov)<br />
- running smoothly<br />
- over the next two weeks we will increase the number of WNs<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- running smoothly<br />
- Update of one CE, SE, UI and WNs to SL6 x86_64 and EMI-2<br />
- Staged rollout of DPM_mysql, WN +TORQUE_client for UMD-2<br />
- SE (DPM_mysql) without problems; for other products, checking Known Problems and GGUS<br />
is strongly recommended<br />
(e. g. GGUS #82746, #82899, #83398, #83548, #83692, ...)<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- emi update ongoing<br />
- We plan to have some WNs connected to Grid Engine (batch system)<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- running smoothly<br />
- decommissioning a few old pools, migrating data to new pools<br />
- both CREAMs (BDII service) stopped working; this is fixed now<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- problems with DECH VOMS, especially with a Java dependency. <br />
I apologize for not sending notifications about registration problems <br />
@all: Please register again if your registration is missing<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton Gamel)<br />
- so far everything is ok<br />
- updated torque server to version 2.5.12. No new packages were available in the Apel repository, so we did it on our own, <br />
and everything runs smoothly now<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paolo)<br />
- last Wednesday we had a maintenance:<br />
-- enabled CERNVMFS for ATLAS<br />
-- installed third CREAM for rolling update mechanism<br />
-- updated dCache to SL 5.7<br />
-- updated TORQUE to latest version 2.4.17<br />
- rest is fine <br />
* PSI<br />
* Switch<br />
- will be on vacation until 1st of August; looking with Dimitri (KIT) for a deputy for the ROD shift.<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
MPI-K is asking for help (GGUS Ticket 83822). Tests for CREAM CEs are failing. The cause is not known. The job finishes, but <br />
not with exit code 0. As a consequence, Nagios tries to restart the job.<br />
<br />
Uni Bonn is suspended; it seemed to have a similar problem, but the cause was a wrong setup of the infosystem.<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
Is there a problem with masking alarms? We have many tickets for the same problem. Conclusion: Tickets were not handled in the <br />
right way. It is more convenient to mask the alarm.<br />
<br />
Performance:<br />
1.number: alarms over 72 hours<br />
2.number: ticket closed, expiration date expired <br />
2012-01 NGI_DE 6 12<br />
2012-02 NGI_DE 2 1<br />
2012-03 NGI_DE 0 4<br />
2012-04 NGI_DE 2 14<br />
2012-05 NGI_DE 0 37<br />
2012-06 NGI_DE 0 8<br />
2012-07 NGI_DE 0 4<br />
<br />
Conclusion: The tickets counted in the second number should be monitored more closely.<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:06072012&diff=38314NGI DE CH Operations Center:Operations Meeting:060720122012-07-13T12:03:56Z<p>Tkoenig: /* Status ROD */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI Technical Forum 2012 in Prague<br />
http://tf2012.egi.eu/<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1251&version=2&filename=EGI_Jun2012.pdf<br />
Core services: 100%<br />
NGI_DE 95 % 95 %<br />
All sites hit the A/R target :)<br />
DESY reported that the region would have had more than 95% if KIT had not published the wrong number of HepSpecs. <br />
<br />
* Monitoring<br />
waiting for update 17 for NGI-DE Nagios probes<br />
<br />
* Staged rollout/updates<br />
<br />
Middleware Baseline Versions for WLCG<br />
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions<br />
@all: We all should try to update to the listed versions. In our region a lot of sites are running older versions.<br />
<br />
Wrong value of site location in GOC-DB<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2386<br />
Dimitri from KIT sent an email around. Affected are DESY-Zeuthen, DGI-Referenceinstallation, KIT (already fixed), GSI, <br />
Maigrid, MPI-K, RWTH Aachen, Uni-Bonn, Uni-Dortmund<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH (Dmitri Ozerov)<br />
- running smoothly<br />
- over the next two weeks we will increase the number of WNs<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- running smoothly<br />
- Update of one CE, SE, UI and WNs to SL6 x86_64 and EMI-2<br />
- Staged rollout of DPM_mysql, WN +TORQUE_client for UMD-2<br />
- SE (DPM_mysql) without problems; for other products, checking Known Problems and GGUS<br />
is strongly recommended<br />
(e. g. GGUS #82746, #82899, #83398, #83548, #83692, ...)<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- emi update ongoing<br />
- We plan to have some WNs connected to Grid Engine (batch system)<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- running smoothly<br />
- decommissioning a few old pools, migrating data to new pools<br />
- both CREAMs (BDII service) stopped working; this is fixed now<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- problems with DECH VOMS, especially with a Java dependency. <br />
I apologize for not sending notifications about registration problems <br />
@all: Please register again if your registration is missing<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton Gamel)<br />
- so far everything is ok<br />
- updated torque server to version 2.5.12. No new packages were available in the Apel repository, so we did it on our own, <br />
and everything runs smoothly now<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paolo)<br />
- last Wednesday we had a maintenance:<br />
-- enabled CERNVMFS for ATLAS<br />
-- installed third CREAM for rolling update mechanism<br />
-- updated dCache to SL 5.7<br />
-- updated TORQUE to latest version 2.4.17<br />
- rest is fine <br />
* PSI<br />
* Switch<br />
- will be on vacation until 1st of August; looking with Dimitri (KIT) for a deputy for the ROD shift.<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
MPI-K is asking for help (GGUS Ticket 83822). Tests for CREAM CEs are failing. The cause is not known. The job finishes, but <br />
not with exit code 0. As a consequence, Nagios tries to restart the job.<br />
<br />
Uni Bonn is suspended; it seemed to have a similar problem, but the cause was a wrong setup of the infosystem.<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
Is there a problem with masking alarms? We have many tickets for the same problem. Conclusion: Tickets were not handled in the <br />
right way. It is more convenient to mask the alarm.<br />
<br />
Performance:<br />
1.number: alarms over 72 hours<br />
2.number: ticket closed, expiration date expired <br />
2012-01 NGI_DE 6 12<br />
2012-02 NGI_DE 2 1<br />
2012-03 NGI_DE 0 4<br />
2012-04 NGI_DE 2 14<br />
2012-05 NGI_DE 0 37<br />
2012-06 NGI_DE 0 8<br />
2012-07 NGI_DE 0 4<br />
<br />
Conclusion: The tickets counted in the second number should be monitored more closely.<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:06072012&diff=38313NGI DE CH Operations Center:Operations Meeting:060720122012-07-13T11:49:57Z<p>Tkoenig: </p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI Technical Forum 2012 in Prague<br />
http://tf2012.egi.eu/<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1251&version=2&filename=EGI_Jun2012.pdf<br />
Core services: 100%<br />
NGI_DE 95 % 95 %<br />
All sites hit the A/R target :)<br />
DESY reported that the region would have had more than 95% if KIT had not published the wrong number of HepSpecs. <br />
<br />
* Monitoring<br />
waiting for update 17 for NGI-DE Nagios probes<br />
<br />
* Staged rollout/updates<br />
<br />
Middleware Baseline Versions for WLCG<br />
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions<br />
@all: We all should try to update to the listed versions. In our region a lot of sites are running older versions.<br />
<br />
Wrong value of site location in GOC-DB<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2386<br />
Dimitri from KIT sent an email around. Affected are DESY-Zeuthen, DGI-Referenceinstallation, KIT (already fixed), GSI, <br />
Maigrid, MPI-K, RWTH Aachen, Uni-Bonn, Uni-Dortmund<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH (Dmitri Ozerov)<br />
- running smoothly<br />
- over the next two weeks we will increase the number of WNs<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- running smoothly<br />
- Update of one CE, SE, UI and WNs to SL6 x86_64 and EMI-2<br />
- Staged rollout of DPM_mysql, WN +TORQUE_client for UMD-2<br />
- SE (DPM_mysql) without problems; for other products, checking Known Problems and GGUS<br />
is strongly recommended<br />
(e. g. GGUS #82746, #82899, #83398, #83548, #83692, ...)<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- emi update ongoing<br />
- We plan to have some WNs connected to Grid Engine (batch system)<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- running smoothly<br />
- decommissioning a few old pools, migrating data to new pools<br />
- both CREAMs (BDII service) stopped working; this is fixed now<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- problems with DECH VOMS, especially with a Java dependency. <br />
I apologize for not sending notifications about registration problems <br />
@all: Please register again if your registration is missing<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg (Anton Gamel)<br />
- so far everything is ok<br />
- updated torque server to version 2.5.12. No new packages were available in the Apel repository, so we did it on our own, <br />
and everything runs smoothly now<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paolo)<br />
- last Wednesday we had a maintenance:<br />
-- enabled CERNVMFS for ATLAS<br />
-- installed third CREAM for rolling update mechanism<br />
-- updated dCache to SL 5.7<br />
-- updated TORQUE to latest version 2.4.17<br />
- rest is fine <br />
* PSI<br />
* Switch<br />
- will be on vacation until 1st of August; looking with Dimitri (KIT) for a deputy for the ROD shift.<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
MPI-K<br />
83822<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
mask alarms?<br />
<br />
Performance:<br />
2012-01 NGI_DE 6 12<br />
2012-02 NGI_DE 2 1<br />
2012-03 NGI_DE 0 4<br />
2012-04 NGI_DE 2 14<br />
2012-05 NGI_DE 0 37<br />
2012-06 NGI_DE 0 8<br />
2012-07 NGI_DE 0 4<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:06072012&diff=38306NGI DE CH Operations Center:Operations Meeting:060720122012-07-13T11:23:32Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
EGI Technical Forum 2012 in Prague<br />
http://tf2012.egi.eu/<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/RetrieveFile?docid=1251&version=2&filename=EGI_Jun2012.pdf<br />
Core services: 100%<br />
NGI_DE 95 % 95 %<br />
All sites hit the A/R target :)<br />
DESY reported that the region would have had more than 95% if KIT had not published the wrong number of HepSpecs. <br />
<br />
* Monitoring<br />
waiting for update 17 for NGI-DE Nagios probes<br />
<br />
* Staged rollout/updates<br />
<br />
Middleware Baseline Versions for WLCG<br />
https://twiki.cern.ch/twiki/bin/view/LCG/WLCGBaselineVersions<br />
@all: We all should try to update to the listed versions. In our region a lot of sites are running older versions.<br />
<br />
Wrong value of site location in GOC-DB<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2386<br />
Dimitri from KIT sent an email around. Affected are DESY-Zeuthen, DGI-Referenceinstallation, KIT (already fixed), GSI, <br />
Maigrid, MPI-K, RWTH Aachen, Uni-Bonn, Uni-Dortmund<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
- Update of one CE, SE, UI and WNs to SL6 x86_64 and EMI-2<br />
- Staged rollout of DPM_mysql, WN +TORQUE_client for UMD-2<br />
- SE (DPM_mysql) without problems; for other products, checking Known Problems and GGUS<br />
is strongly recommended<br />
(e. g. GGUS #82746, #82899, #83398, #83548, #83692, ...)<br />
* KIT (GridKa, FZK-LCG2)<br />
emi update ongoing<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
MPI-K<br />
83822<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
mask alarms?<br />
<br />
Performance:<br />
2012-01 NGI_DE 6 12<br />
2012-02 NGI_DE 2 1<br />
2012-03 NGI_DE 0 4<br />
2012-04 NGI_DE 2 14<br />
2012-05 NGI_DE 0 37<br />
2012-06 NGI_DE 0 8<br />
2012-07 NGI_DE 0 4<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:13042012&diff=35649NGI DE CH Operations Center:Operations Meeting:130420122012-04-23T08:13:52Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops-Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed]).<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/ShowDocument?docid=1091<br />
<br />
BDII: 92%, but see: https://ggus.eu/ws/ticket_info.php?ticket=81094 (caused by downtime of the NGI-DE Nagios; the values <br />
were not recalculated)<br />
<br />
NGI_DE: A:92 % R:96 %. First green month.<br />
<br />
* Monitoring<br />
ntr<br />
<br />
* Staged rollout/updates<br />
sites with CREAM CEs: enabling glexec in GOC-DB<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda Romberg)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
auger SoftwareManager role.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
ntr<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (via Email)<br />
All has been working fine until yesterday, when a network problem caused the GPFS scratch filesystem to die.<br />
We were unable to recover it and today we have rebuilt it from scratch:<br />
ARC is still not working, but all gLite/EGI services are up and running.<br />
This was an unscheduled downtime that will certainly affect A&R of this month.<br />
<br />
Next week the grid cluster at CSCS enters a scheduled downtime.<br />
We will move the hardware from the old building to the new datacentre<br />
and introduce major changes in the infrastructure: new WNs to replace<br />
old Sun Blades and new network design, we'll move from hybrid<br />
ethernet/infiniband to all-infiniband. The downtime should last no longer than 3 weeks.<br />
* PSI<br />
* Switch (Alessandro Ussai)<br />
- not much to report; we decommissioned our site at SWITCH in 2011<br />
- we only run a GIIS for the ARC sites in NGI_CH. As we no longer have resources, we do not attend the ops meeting regularly anymore, <br />
but we will still attend sporadically as part of the monitoring tasks when necessary<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Again tickets for ROD.<br />
<br />
* Any problematic tickets? Ticket from central COD about alarms that are older than 72 hours. Situation unclear: Who takes action? The ROD shifter checked the dashboard, but the alarm had disappeared. Strange behaviour of the dashboard. There was a thread on our email list. We/Dimitri/KIT will report this in the escalated tickets.<br />
* We handle our tickets (user tickets in the NGI-DE helpdesk) very leniently. We have to think about escalation procedures or an escalation table with expiration dates that depend on the priority of the ticket.<br />
* Handover of the ROD shift<br />
* ROD shift schedule was updated by Dimitri: https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:13042012&diff=35648NGI DE CH Operations Center:Operations Meeting:130420122012-04-23T08:13:35Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops-Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed]).<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/ShowDocument?docid=1091<br />
<br />
BDII: 92%, but see: https://ggus.eu/ws/ticket_info.php?ticket=81094 (caused by downtime of the NGI-DE Nagios; the values <br />
were not recalculated)<br />
<br />
NGI_DE: A:92 % R:96 %. First green month.<br />
<br />
* Monitoring<br />
ntr<br />
<br />
* Staged rollout/updates<br />
sites with CREAM CEs: enabling glexec in GOC-DB<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda Romberg)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
auger SoftwareManager role.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
ntr<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (via Email)<br />
All has been working fine until yesterday, when a network problem caused the GPFS scratch filesystem to die.<br />
We were unable to recover it and today we have rebuilt it from scratch:<br />
ARC is still not working, but all gLite/EGI services are up and running.<br />
This was an unscheduled downtime that will certainly affect A&R of this month.<br />
<br />
Next week the grid cluster at CSCS enters a scheduled downtime.<br />
We will move the hardware from the old building to the new datacentre<br />
and introduce major changes in the infrastructure: new WNs to replace<br />
old Sun Blades and new network design, we'll move from hybrid<br />
ethernet/infiniband to all-infiniband. The downtime should last no longer than 3 weeks.<br />
* PSI<br />
* Switch (Alessandro Ussai)<br />
- not much to report; we decommissioned our site at SWITCH in 2011<br />
- we only run a GIIS for the ARC sites in NGI_CH. As we no longer have resources, we do not attend the ops meeting regularly anymore, <br />
but we will still attend sporadically as part of the monitoring tasks when necessary<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Again tickets for ROD.<br />
<br />
* Any problematic tickets? Ticket from central COD about alarms that are older than 72 hours. Situation unclear: Who takes action? The ROD shifter checked the dashboard, but the alarm had disappeared. Strange behaviour of the dashboard. There was a thread on our email list. We/Dimitri/KIT will report this in the escalated tickets.<br />
* We handle our tickets (user tickets in the NGI-DE helpdesk) very leniently. We have to think about escalation procedures or an escalation table with expiration dates that depend on the priority of the ticket.<br />
* Handover of the ROD shift<br />
* ROD shift schedule was updated by Dimitri: https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:13042012&diff=35645NGI DE CH Operations Center:Operations Meeting:130420122012-04-23T07:56:22Z<p>Tkoenig: /* Status ROD */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops-Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed]).<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/ShowDocument?docid=1091<br />
<br />
BDII: 92%, but see: https://ggus.eu/ws/ticket_info.php?ticket=81094 (caused by downtime of the NGI-DE Nagios; the values were not recalculated)<br />
<br />
NGI_DE: A:92 % R:96 %. First green month.<br />
<br />
* Monitoring<br />
ntr<br />
<br />
* Staged rollout/updates<br />
sites with CREAM CEs: enabling glexec in GOC-DB<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda Romberg)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
auger SoftwareManager role.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
ntr<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (via Email)<br />
All has been working fine until yesterday, when a network problem caused the GPFS scratch filesystem to die.<br />
We were unable to recover it and today we have rebuilt it from scratch:<br />
ARC is still not working, but all gLite/EGI services are up and running.<br />
This was an unscheduled downtime that will certainly affect A&R of this month.<br />
<br />
Next week the grid cluster at CSCS enters a scheduled downtime.<br />
We will move the hardware from the old building to the new datacentre<br />
and introduce major changes in the infrastructure: new WNs to replace<br />
old Sun Blades and new network design, we'll move from hybrid<br />
ethernet/infiniband to all-infiniband. The downtime should last no longer than 3 weeks.<br />
* PSI<br />
* Switch (Alessandro Ussai)<br />
- not much to report; we decommissioned our site at SWITCH in 2011<br />
- we only run a GIIS for the ARC sites in NGI_CH. As we no longer have resources, we do not attend the ops meeting regularly anymore, <br />
but we will still attend sporadically as part of the monitoring tasks when necessary<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Again tickets for ROD.<br />
<br />
* Any problematic tickets? Ticket from central COD about alarms that are older than 72 hours. Situation unclear: Who takes action? The ROD shifter checked the dashboard, but the alarm had disappeared. Strange behaviour of the dashboard. There was a thread on our email list. We/Dimitri/KIT will report this in the escalated tickets.<br />
* We handle our tickets (user tickets in the NGI-DE helpdesk) very leniently. We have to think about escalation procedures or an escalation table with expiration dates that depend on the priority of the ticket.<br />
* Handover of the ROD shift<br />
* ROD shift schedule was updated by Dimitri: https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:13042012&diff=35643NGI DE CH Operations Center:Operations Meeting:130420122012-04-23T07:45:06Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops-Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed]).<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/ShowDocument?docid=1091<br />
<br />
BDII: 92%, but see: https://ggus.eu/ws/ticket_info.php?ticket=81094 (caused by downtime of the NGI-DE Nagios; the values were not recalculated)<br />
<br />
NGI_DE: A:92 % R:96 %. First green month.<br />
<br />
* Monitoring<br />
ntr<br />
<br />
* Staged rollout/updates<br />
sites with CREAM CEs: enabling glexec in GOC-DB<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda Romberg)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
auger SoftwareManager role.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
ntr<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (via Email)<br />
All has been working fine until yesterday, when a network problem caused the GPFS scratch filesystem to die.<br />
We were unable to recover it and today we have rebuilt it from scratch:<br />
ARC is still not working, but all gLite/EGI services are up and running.<br />
This was an unscheduled downtime that will certainly affect A&R of this month.<br />
<br />
Next week the grid cluster at CSCS enters a scheduled downtime.<br />
We will move the hardware from the old building to the new datacentre<br />
and introduce major changes in the infrastructure: new WNs to replace<br />
old Sun Blades and new network design, we'll move from hybrid<br />
ethernet/infiniband to all-infiniband. The downtime should last no longer than 3 weeks.<br />
* PSI<br />
* Switch (Alessandro Ussai)<br />
- not much to report; we decommissioned our site at SWITCH in 2011<br />
- we only run a GIIS for the ARC sites in NGI_CH. As we no longer have resources, we do not attend the ops meeting regularly anymore, <br />
but we will still attend sporadically as part of the monitoring tasks when necessary<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Again tickets for ROD.<br />
<br />
* Any problematic tickets? Ticket from central COD about alarms that are older than 72 hours. Situation unclear: Who takes action? The alarm disappeared.<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:13042012&diff=35642NGI DE CH Operations Center:Operations Meeting:130420122012-04-23T07:44:41Z<p>Tkoenig: /* Status ROD */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops-Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed]).<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/ShowDocument?docid=1091<br />
<br />
BDII: 92%, but see: https://ggus.eu/ws/ticket_info.php?ticket=81094 (caused by downtime of the NGI-DE Nagios; the values were not recalculated)<br />
<br />
NGI_DE: A:92 % R:96 %. First green month.<br />
<br />
* Monitoring<br />
ntr<br />
<br />
* Staged rollout/updates<br />
sites with CREAM CEs: enabling glexec in GOC-DB<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda Romberg)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
auger SoftwareManager role.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (via Email)<br />
Everything was working fine until yesterday, when a network problem caused the GPFS scratch filesystem to die.<br />
We were unable to recover it and today we have rebuilt it from scratch.<br />
ARC is still not working, but all gLite/EGI services are up and running.<br />
This was an unscheduled downtime that will certainly affect this month's A&R.<br />
<br />
Next week the grid cluster at CSCS enters a scheduled downtime.<br />
We will move the hardware from the old building to the new datacentre<br />
and introduce major changes to the infrastructure: new WNs to replace<br />
the old Sun Blades and a new network design, moving from hybrid<br />
Ethernet/InfiniBand to all-InfiniBand. The downtime should last no longer than 3 weeks.<br />
* PSI<br />
* Switch (Alessandro Ussai)<br />
- Not much to report: we decommissioned our site at SWITCH in 2011.<br />
- We only have a GIIS, which is run for the ARC sites in NGI_CH. This is why we no longer attend the ops meeting regularly, <br />
as we don't have the resources. We will still attend sporadically, as part of the monitoring tasks, when necessary.<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Again tickets for ROD.<br />
<br />
* Any problematic tickets? Ticket from central COD about alarms that are older than 72 hours. Situation unclear: who takes action? The alarm has disappeared.<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:13042012&diff=35641NGI DE CH Operations Center:Operations Meeting:130420122012-04-23T07:42:14Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed]).<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/ShowDocument?docid=1091<br />
<br />
BDII: 92%, but see https://ggus.eu/ws/ticket_info.php?ticket=81094 (caused by a downtime of the NGI-DE Nagios; the values were not recalculated).<br />
<br />
NGI_DE: A: 92%, R: 96%. First green month.<br />
<br />
* Monitoring<br />
ntr<br />
<br />
* Staged rollout/updates<br />
sites with CREAM CEs: enabling glexec in GOC-DB<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda Romberg)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
auger SoftwareManager role.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (via Email)<br />
Everything was working fine until yesterday, when a network problem caused the GPFS scratch filesystem to die.<br />
We were unable to recover it and today we have rebuilt it from scratch.<br />
ARC is still not working, but all gLite/EGI services are up and running.<br />
This was an unscheduled downtime that will certainly affect this month's A&R.<br />
<br />
Next week the grid cluster at CSCS enters a scheduled downtime.<br />
We will move the hardware from the old building to the new datacentre<br />
and introduce major changes to the infrastructure: new WNs to replace<br />
the old Sun Blades and a new network design, moving from hybrid<br />
Ethernet/InfiniBand to all-InfiniBand. The downtime should last no longer than 3 weeks.<br />
* PSI<br />
* Switch (Alessandro Ussai)<br />
- Not much to report: we decommissioned our site at SWITCH in 2011.<br />
- We only have a GIIS, which is run for the ARC sites in NGI_CH. This is why we no longer attend the ops meeting regularly, <br />
as we don't have the resources. We will still attend sporadically, as part of the monitoring tasks, when necessary.<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Again tickets for ROD.<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:13042012&diff=35640NGI DE CH Operations Center:Operations Meeting:130420122012-04-23T07:38:03Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed]).<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/ShowDocument?docid=1091<br />
<br />
BDII: 92%, but see https://ggus.eu/ws/ticket_info.php?ticket=81094 (caused by a downtime of the NGI-DE Nagios; the values were not recalculated).<br />
<br />
NGI_DE: A: 92%, R: 96%. First green month.<br />
<br />
* Monitoring<br />
ntr<br />
<br />
* Staged rollout/updates<br />
sites with CREAM CEs: enabling glexec in GOC-DB<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda Romberg)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
auger SoftwareManager role.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
Everything was working fine until yesterday, when a network problem caused the GPFS scratch filesystem to die.<br />
We were unable to recover it and today we have rebuilt it from scratch.<br />
ARC is still not working, but all gLite/EGI services are up and running.<br />
This was an unscheduled downtime that will certainly affect this month's A&R.<br />
<br />
Next week the grid cluster at CSCS enters a scheduled downtime.<br />
We will move the hardware from the old building to the new datacentre<br />
and introduce major changes to the infrastructure: new WNs to replace<br />
the old Sun Blades and a new network design, moving from hybrid<br />
Ethernet/InfiniBand to all-InfiniBand. The downtime should last no longer than 3 weeks.<br />
* PSI<br />
* Switch (Alessandro Ussai)<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Again tickets for ROD.<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:13042012&diff=35636NGI DE CH Operations Center:Operations Meeting:130420122012-04-23T07:21:20Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed]).<br />
<br />
* Availability/reliability statistics<br />
https://documents.egi.eu/public/ShowDocument?docid=1091<br />
<br />
BDII: 92%, but see https://ggus.eu/ws/ticket_info.php?ticket=81094<br />
<br />
NGI_DE: A: 92%, R: 96%. First green month.<br />
<br />
* Monitoring<br />
* Staged rollout/updates<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich (Mathilda Romberg)<br />
ntr<br />
* Goegrid<br />
* GSI<br />
* ITWM<br />
* KIT (GridKa, FZK-LCG2)<br />
auger SoftwareManager role.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
Everything was working fine until yesterday, when a network problem caused the GPFS scratch filesystem to die.<br />
We were unable to recover it and today we have rebuilt it from scratch.<br />
ARC is still not working, but all gLite/EGI services are up and running.<br />
This was an unscheduled downtime that will certainly affect this month's A&R.<br />
<br />
Next week the grid cluster at CSCS enters a scheduled downtime.<br />
We will move the hardware from the old building to the new datacentre<br />
and introduce major changes to the infrastructure: new WNs to replace<br />
the old Sun Blades and a new network design, moving from hybrid<br />
Ethernet/InfiniBand to all-InfiniBand. The downtime should last no longer than 3 weeks.<br />
* PSI<br />
* Switch (Alessandro Ussai)<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Again tickets for ROD.<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34649NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:32:21Z<p>Tkoenig: /* Status ROD */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN only has 69%<br />
<br />
 BDII: Av/Re = 99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
<br />
 High rate of unknown Nagios test results. We also contacted the sites MPI-K, UNI-Dresden, UNI-Karlsruhe and UNI-Siegen by email. Possible reason: <br />
connection problems.<br />
 <br />
 The NGI-DE Nagios was also down during our (KIT, FZK-LCG2) downtime last week. We will recalculate the availability numbers for <br />
March.<br />
<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
 We need your feedback on the surveys by 20 March, via email.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
 - 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
 - 16/3/12 The downtime was completed successfully, with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades and a tape TSM update; the complete cluster was reinstalled, changes were made to the central <br />
power supply, and some LFC and 3D-DB were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
 - A few problems with CREAM and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the related CREAM CE) instead of only one <br />
for the dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
 - 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release, and dCache was updated to version 1.9.12; <br />
the dCache update included the upgrade from SL4 to SL5. Everything seems to be fine now. <br />
 - Problems with the EMI release: after the update the Nagios test sometimes fails, with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the BDII in EMI in preproduction. The documentation did not mention <br />
that you need 3 GB of RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
 Welcome to Daniel Waldmann (LRZ).<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
* Nagios<-->Dashboard issue <br />
(Nagios sends information/notifications to the dashboard, but they are not displayed correctly there. <br />
As a consequence, tickets stay in the dashboard longer than the problem persists.)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now. See the Announcements section above.<br />
- Short telco meeting before the workshop, on 13 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34648NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:28:44Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN only has 69%<br />
<br />
 BDII: Av/Re = 99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
<br />
 High rate of unknown Nagios test results. We also contacted the sites MPI-K, UNI-Dresden, UNI-Karlsruhe and UNI-Siegen by email. Possible reason: <br />
connection problems.<br />
 <br />
 The NGI-DE Nagios was also down during our (KIT, FZK-LCG2) downtime last week. We will recalculate the availability numbers for <br />
March.<br />
<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
 We need your feedback on the surveys by 20 March, via email.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
 - 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
 - 16/3/12 The downtime was completed successfully, with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades and a tape TSM update; the complete cluster was reinstalled, changes were made to the central <br />
power supply, and some LFC and 3D-DB were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
 - A few problems with CREAM and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the related CREAM CE) instead of only one <br />
for the dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
 - 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release, and dCache was updated to version 1.9.12; <br />
the dCache update included the upgrade from SL4 to SL5. Everything seems to be fine now. <br />
 - Problems with the EMI release: after the update the Nagios test sometimes fails, with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the BDII in EMI in preproduction. The documentation did not mention <br />
that you need 3 GB of RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
 Welcome to Daniel Waldmann (LRZ).<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
* Nagios<-->Dashboard issue <br />
(Nagios sends information/notifications to the dashboard, but they are not displayed correctly there. As a consequence, tickets stay in the dashboard longer than the problem persists.)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now. See the Announcements section above.<br />
- Short telco meeting before the workshop, on 13 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34647NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:26:31Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN only has 69%<br />
<br />
 BDII: Av/Re = 99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
<br />
 High rate of unknown Nagios test results. We also contacted the sites MPI-K, UNI-Dresden, UNI-Karlsruhe and UNI-Siegen by email. Possible reason: <br />
connection problems.<br />
 <br />
 The NGI-DE Nagios was also down during our (KIT, FZK-LCG2) downtime last week. We will recalculate the availability numbers for <br />
March.<br />
<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
 We need your feedback on the surveys by 20 March, via email.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
 - 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
 - 16/3/12 The downtime was completed successfully, with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades and a tape TSM update; the complete cluster was reinstalled, changes were made to the central <br />
power supply, and some LFC and 3D-DB were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
 - A few problems with CREAM and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the related CREAM CE) instead of only one <br />
for the dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
 - 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release, and dCache was updated to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. Everything seems to be fine now. <br />
 - Problems with the EMI release: after the update the Nagios test sometimes fails, with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the BDII in EMI in preproduction. The documentation did not mention <br />
that you need 3 GB of RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
 Welcome to Daniel Waldmann (LRZ).<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
* Nagios<-->Dashboard issue <br />
(Nagios sends information/notifications to the dashboard, but they are not displayed correctly there. As a consequence, tickets stay in the dashboard longer than the problem persists.)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now. See the Announcements section above.<br />
- Short telco meeting before the workshop, on 13 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34646NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:23:50Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN only has 69%<br />
<br />
 BDII: Av/Re = 99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
<br />
 High rate of unknown Nagios test results. We also contacted the sites MPI-K, UNI-Dresden, UNI-Karlsruhe and UNI-Siegen by email. Possible reason: <br />
connection problems.<br />
 <br />
 The NGI-DE Nagios was also down during our (KIT, FZK-LCG2) downtime last week. We will recalculate the availability numbers for <br />
March.<br />
<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
 We need your feedback on the surveys by 20 March, via email.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
 - 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
 - 16/3/12 The downtime was completed successfully, with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades and a tape TSM update; the complete cluster was reinstalled, changes were made to the central <br />
power supply, and some LFC and 3D-DB were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
 - A few problems with CREAM and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
 - 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release, and dCache was updated to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. Everything seems to be fine now. <br />
 - Problems with the EMI release: after the update the Nagios test sometimes fails, with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the BDII in EMI in preproduction. The documentation did not mention <br />
that you need 3 GB of RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
 Welcome to Daniel Waldmann (LRZ).<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
* Nagios<-->Dashboard issue <br />
(Nagios sends information/notifications to the dashboard, but they are not displayed correctly there. As a consequence, tickets stay in the dashboard longer than the problem persists.)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now. See the Announcements section above.<br />
- Short telco meeting before the workshop, on 13 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34645NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:22:37Z<p>Tkoenig: /* Announcements */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN only has 69%<br />
<br />
BDII: Av/Re=99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
<br />
High rate of Nagios tests with unknown status. We also contacted the sites MPI-K, UNI-Dresden, UNI-Karlsruhe and UNI-Siegen by email. Possible reason: <br />
connection problems.<br />
<br />
The NGI-DE Nagios was also down during our (KIT, FZKA-LCG2) downtime last week. We will recalculate the availability numbers for March.<br />
<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
For the surveys we need your feedback by email until 20 March.<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
- 16/3/12 The downtime was completed with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades, a tape TSM update, a complete reinstallation of the cluster, and changes to the central <br />
power supply. Some LFC and 3D databases were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
- A few problems with the CREAM CEs and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
- 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release and dCache to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. All seems to be fine now. <br />
- Problems with the EMI release: after the update the Nagios test sometimes fails with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the EMI BDII in preproduction. The documentation did not mention <br />
that it needs 3GB RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Welcome to Daniel Waldmann (LRZ)<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
*Nagios<-->Dashboard issue <br />
(Nagios sends notifications to the dashboard, but they are not displayed correctly there. As a consequence, tickets remain in the dashboard longer than the underlying problem persists.)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now; see the Announcements section above.<br />
- Short telco before the workshop on 13 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34644NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:10:07Z<p>Tkoenig: /* AOB */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN=69%<br />
<br />
BDII: Av/Re=99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
- 16/3/12 The downtime was completed with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades, a tape TSM update, a complete reinstallation of the cluster, and changes to the central <br />
power supply. Some LFC and 3D databases were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
- A few problems with the CREAM CEs and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
- 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release and dCache to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. All seems to be fine now. <br />
- Problems with the EMI release: after the update the Nagios test sometimes fails with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the EMI BDII in preproduction. The documentation did not mention <br />
that it needs 3GB RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Welcome to Daniel Waldmann (LRZ)<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
*Nagios<-->Dashboard issue <br />
(Nagios sends notifications to the dashboard, but they are not displayed correctly there. As a consequence, tickets remain in the dashboard longer than the underlying problem persists.)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now; see the Announcements section above.<br />
- Short telco before the workshop on 13 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34643NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:08:45Z<p>Tkoenig: /* AOB */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN=69%<br />
<br />
BDII: Av/Re=99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
- 16/3/12 The downtime was completed with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades, a tape TSM update, a complete reinstallation of the cluster, and changes to the central <br />
power supply. Some LFC and 3D databases were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
- A few problems with the CREAM CEs and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
- 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release and dCache to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. All seems to be fine now. <br />
- Problems with the EMI release: after the update the Nagios test sometimes fails with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the EMI BDII in preproduction. The documentation did not mention <br />
that it needs 3GB RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Welcome to Daniel Waldmann (LRZ)<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
*Nagios<-->Dashboard issue <br />
(Nagios sends notifications to the dashboard, but they are not displayed correctly there. As a consequence, tickets remain in the dashboard longer than the underlying problem persists.)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now; see the Announcements section above.<br />
- Short telco before the workshop on 13 April<br />
- The next meeting will be on Wednesday, 18 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34642NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:07:08Z<p>Tkoenig: /* Status ROD */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN=69%<br />
<br />
BDII: Av/Re=99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
- 16/3/12 The downtime was completed with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades, a tape TSM update, a complete reinstallation of the cluster, and changes to the central <br />
power supply. Some LFC and 3D databases were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
- A few problems with the CREAM CEs and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
- 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release and dCache to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. All seems to be fine now. <br />
- Problems with the EMI release: after the update the Nagios test sometimes fails with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the EMI BDII in preproduction. The documentation did not mention <br />
that it needs 3GB RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Welcome to Daniel Waldmann (LRZ)<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
*Nagios<-->Dashboard issue <br />
(Nagios sends notifications to the dashboard, but they are not displayed correctly there. As a consequence, tickets remain in the dashboard longer than the underlying problem persists.)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now.<br />
- The next meeting will be on Wednesday, 18 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34641NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T09:04:43Z<p>Tkoenig: /* AOB */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN=69%<br />
<br />
BDII: Av/Re=99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
- 16/3/12 The downtime was completed with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades, a tape TSM update, a complete reinstallation of the cluster, and changes to the central <br />
power supply. Some LFC and 3D databases were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
- A few problems with the CREAM CEs and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
- 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release and dCache to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. All seems to be fine now. <br />
- Problems with the EMI release: after the update the Nagios test sometimes fails with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the EMI BDII in preproduction. The documentation did not mention <br />
that it needs 3GB RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Welcome to Daniel Waldmann (LRZ)<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
*Nagios<-->Dashboard issue<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
- Reminder: Operations workshop in April. We hope many sites will participate. You can register now.<br />
- The next meeting will be on Wednesday, 18 April<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34639NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T08:59:59Z<p>Tkoenig: /* Status ROD */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN=69%<br />
<br />
BDII: Av/Re=99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
- 16/3/12 The downtime was completed with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades, a tape TSM update, a complete reinstallation of the cluster, and changes to the central <br />
power supply. Some LFC and 3D databases were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
- A few problems with the CREAM CEs and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
- 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release and dCache to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. All seems to be fine now. <br />
- Problems with the EMI release: after the update the Nagios test sometimes fails with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the EMI BDII in preproduction. The documentation did not mention <br />
that it needs 3GB RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Welcome to Daniel Waldmann (LRZ)<br />
<br />
* Any problematic tickets? No<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
*Nagios<-->Dashboard issue<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34638NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T08:59:18Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN=69%<br />
<br />
BDII: Av/Re=99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
- 16/3/12 The downtime was completed with a two-hour delay. During the downtime one central router was replaced by a new model. We also did <br />
some dCache updates, disk firmware and BIOS upgrades, a tape TSM update, a complete reinstallation of the cluster, and changes to the central <br />
power supply. Some LFC and 3D databases were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
- A few problems with the CREAM CEs and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
dpm daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller Pfeeferkorn)<br />
- 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release and dCache to version 1.9.12; the dCache update included the upgrade from SL4 to SL5. All seems to be fine now. <br />
- Problems with the EMI release: after the update the Nagios test sometimes fails with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the EMI BDII in preproduction. The documentation did not mention <br />
that it needs 3GB RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
Welcome to Daniel Waldmann (LRZ)<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
*Nagios<-->Dashboard issue<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:16032012&diff=34637NGI DE CH Operations Center:Operations Meeting:160320122012-03-22T08:58:59Z<p>Tkoenig: /* Round the sites */</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
** EGI Community Forum https://www.egi.eu/indico/conferenceDisplay.py?confId=679<br />
** [https://www.egi.eu/indico/conferenceDisplay.py?confId=820 Ops-Workshop in April 2012] ([https://www.egi.eu/indico/conferenceTimeTable.py?confId=820#all.detailed preliminary agenda]).<br />
* Availability/reliability statistics<br />
Feb:<br />
VoOps:<br />
Av/Re= 97%<br />
UNI BONN=69%<br />
<br />
BDII: Av/Re=99.3%<br />
<br />
* Monitoring<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1969<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1978<br />
* Staged rollout/updates<br />
<br />
UMD:<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-14-16-03-2012<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates/-/asset_publisher/Ir6q/content/update-13-17-02-2012<br />
<br />
* Survey<br />
[[Operations_Surveys#Usage_and_future_maintenance_of_deployed_software|Usage and future maintenance of deployed software]]<br />
[[Operations/Platform Deployment Survey]]<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
- 09.03 -> ITWM will not be able to attend the phone conference on 9/3/12. There is nothing special to report.<br />
- 16/3/12 ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
- 9.03 -> Announcement: downtime 13.03 - 15.03, full site maintenance<br />
- 16/3/12 The downtime completed with a two-hour delay. During the downtime one central router was replaced by a new model, and we also <br />
applied some dCache updates, disk firmware and BIOS upgrades, and a tape TSM update. The complete cluster was reinstalled, changes were made to the central <br />
power supply, and some LFC and 3D-DB were migrated to Oracle version 11g.<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- ntr<br />
- a few problems with CREAM and ATLAS jobs. The problems are solved now.<br />
* RWTH Aachen<br />
* SCAI (Andre Gemuend)<br />
- Problems with SE: DPM daemon died. We will update EMI-DPM<br />
- ROD also filed two tickets (one concerning the DPM daemon problem and one concerning the CREAM CE) instead of only one for the <br />
DPM daemon: <br />
* Uni Bonn<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller-Pfefferkorn)<br />
- 16/3/12 Last week we had a two-day downtime. We updated CREAM and APEL to the EMI release; dCache was updated to version 1.9.12, and the dCache update included the upgrade from SL4 to SL5. All seems to be fine now. <br />
- Problems with EMI release: After the update the Nagios test sometimes fails with the error message "job could not be submitted" or <br />
something like that. Comment by KIT: We also tested the BDII in EMI in preproduction. The documentation did not mention <br />
that you need 3GB RAM.<br />
* Uni Freiburg<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
welcome LRZ Daniel Waldmann<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
09 27.02 04.03 Team5, LRZ <br />
10 05.03 11.03 Team6, CSCS/NGI_CH <br />
11 12.03 18.03 Team1, DESY <br />
12 19.03 25.03 Team2, FhG<br />
<br />
*Nagios<-->Dashboard issue<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2039<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=2038<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenighttps://wiki.egi.eu/w/index.php?title=NGI_DE_CH_Operations_Center:Operations_Meeting:02122011&diff=32807NGI DE CH Operations Center:Operations Meeting:021220112012-02-13T15:20:55Z<p>Tkoenig: Undo revision 32798 by Tkoenig (Talk)</p>
<hr />
<div>[[NGI_DE_CH_Operations_Center:Operations_Meeting|Operations Meeting Main]]<br />
<br />
==Introduction==<br />
<br />
* Minutes of last meeting<br />
<br />
==Announcements==<br />
<br />
* Meetings/conferences<br />
NGI-DE/NGI-CH/D-Grid Workshop in April<br />
Note: There is also a dCache Workshop in April. Date should be chosen carefully.<br />
<br />
The EGI Community Forum (http://go.egi.eu/cf12) will be in Munich 26-30th March 2012 and held in conjunction with the 2nd EMI <br />
Technical Conference. Abstract submission was open until 2/12/11.<br />
<br />
* Availability/reliability statistics<br />
<br />
Last:<br />
https://documents.egi.eu/public/ShowDocument?docid=959<br />
<br />
recomputation done<br />
https://helpdesk.ngi-de.eu/index.php?mode=ticket_info&ticket_id=1720<br />
* Monitoring<br />
https://tomtools.cern.ch/confluence/display/SAMDOC/Update-15<br />
* Staged rollout/updates<br />
<br />
;UMD<br />
https://wiki.egi.eu/wiki/UMD-1:UMD-1.5.0<br />
<br />
;EMI<br />
http://www.eu-emi.eu/emi-1-kebnekaise-updates<br />
<br />
;gLite3.1<br />
http://glite.web.cern.ch/glite/packages/R3.1/updates.asp<br />
<br />
;gLite3.2<br />
http://glite.web.cern.ch/glite/packages/R3.2/sl5_x86_64/updates.asp<br />
<br />
==other topics==<br />
EMI release / possible infosys errors (UNI-SIEGEN)<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=1722<br />
<br />
Gstat<br />
https://helpdesk.ngi-de.eu/?mode=ticket_info&ticket_id=1930<br />
<br />
Gstat<br />
Sites with CRITICAL gstat status: wuppertalprod Uni-Bonn DESY-HH SCAI MaiGRID LRZ-LMU<br />
<br />
==Round the sites==<br />
<br />
; NGI-DE<br />
* BMRZ-FRANKFURT (Uni Frankfurt)<br />
* DESY-HH<br />
We updated all our WNs to Torque 2.5.7-2 (glite-WN-version-3.2.12-1), and this works fine with the old Torque server (2.3.13-1). <br />
We did not update the server because of the memory problem in the new version.<br />
This week the dcache-cms instance was updated to 1.9.12-13.<br />
* DESY-ZN<br />
* FZJuelich<br />
* Goegrid<br />
* GSI<br />
* ITWM (Martin Braun)<br />
ntr<br />
* KIT (GridKa, FZK-LCG2, Dimitri Nilsen, Tobias Koenig)<br />
gLexec updated at WNs<br />
Role-based mapping for gLexec was requested by ATLAS<br />
WMS disk full: Problems with ngi-de-nagios portal<br />
* KIT (Uni Karlsruhe)<br />
* LRZ<br />
* MPI-K<br />
* MPPMU (Cesare Delle Fratte)<br />
- DOWNTIME 29/11 01/12 dcache upgrade from 1.9.5-28 to 1.9.12-13<br />
- problems with gridftp doors (solved by Java jdk update to latest packages)<br />
- dCache: number of movers was increased<br />
- updated one of the two CREAM<br />
- installed security fix on Apel box<br />
- strange lfc failures caused by gpfs partition problems<br />
* RWTH Aachen<br />
* SCAI<br />
* Uni Bonn<br />
services online<br />
* Uni Dortmund<br />
* Uni Dresden (Ralph Mueller-Pfefferkorn)<br />
- For about two months we have had problems with our file system, especially with the central NFS file system. The NFS system becomes <br />
overloaded: hundreds of jobs with hundreds of files.<br />
Paolo/CSCS: We had the same problems. They were fixed by changing the CREAM grubber, and we moved from Lustre to GPFS with SSD disks <br />
for the metadata and the inode table.<br />
* Uni Freiburg (Anton Gamel)<br />
- problems with gsi ssh -> increased movers<br />
- installed additional dCache servers<br />
* Uni Mainz-Maigrid<br />
* Uni Siegen<br />
* Uni Wuppertal<br />
; SwiNG<br />
* CSCS (Paolo)<br />
- maintenance two days ago: firmware update of the disks, lost 4 disks/CMS pool (in contact with CMS)<br />
- test CERNVMFS in preproduction<br />
* PSI<br />
* Switch<br />
<br />
Note: please update your entry at https://wiki.egi.eu/wiki/NGI_DE:Sites if needed.<br />
<br />
==Status ROD==<br />
<br />
* Any problematic tickets?<br />
* Handover of the ROD shift<br />
* ROD shift schedule https://wiki.egi.eu/wiki/NGI_DE_CH_Operations_Center:Operations_Teams#Shifts_rotation_table<br />
<br />
LRZ from 02.2012. 2*2 Shifts<br />
<br />
ROD Newsletter Nov. 2011<br />
https://documents.egi.eu/secure/RetrieveFile?docid=298&version=1&filename=ROD%20newsletter%2011-2011.pdf<br />
<br />
tickets were not mentioned within 10 days. Be aware of the ROD statistics.<br />
Please pay attention to the Escalation Procedures<br />
https://wiki.egi.eu/wiki/Operations:COD_Escalation_Procedure#Escalation_for_operational_problem_at_site<br />
<br />
==AOB==<br />
<br />
If you have additional topics to be discussed during the meeting, please submit them in advance via our email list.</div>Tkoenig