Difference between revisions of "EGI-InSPIRE:Czech Rep-QR3"

From EGIWiki
(2.1. Progress Summary)
 
(10 intermediate revisions by 2 users not shown)
Line 1: Line 1:
__NOTOC__
+
{{Template:EGI-Inspire menubar}}
 +
 
 +
{{Template:Inspire_reports_menubar}}
 +
{{TOC_right}}
 +
 
  
 
{| border="1" cellspacing="0" cellpadding="2"
 
{| border="1" cellspacing="0" cellpadding="2"
Line 20: Line 24:
  
 
==1. MEETINGS AND DISSEMINATION ==
 
==1. MEETINGS AND DISSEMINATION ==
=====GENERAL GUIDELINES FOR ALL EVENTS REPORTED IN THE FOLLOWING SECTIONS:=====
 
*please do not provide a list of participants, only give the number of people that attended
 
*for outcome, please list tangible agreements, decisions instead of listing program points or presentations you made. Otherwise put: “-“
 
*include your local events only if there was any EGI-related topic on the agenda
 
*provide an indico URL to your presentation (if available) or to the event itself.
 
**If your presentation is not available online, please send the slides to erika.swiderski@egi.eu.
 
Note: Complete the tables below by adding as many rows as needed.
 
Note: Complete the tables below by adding as many rows as needed.
 
  
 
===1.1. CONFERENCES/WORKSHOPS ORGANISED===
 
===1.1. CONFERENCES/WORKSHOPS ORGANISED===
Line 46: Line 42:
 
|-
 
|-
 
!scope="col"|Date||Location||Title||Participants||Outcome (Short report & Indico URL)  
 
!scope="col"|Date||Location||Title||Participants||Outcome (Short report & Indico URL)  
|-
 
|22.-24. November 2010||Prague||EMI All Hands Meeting||1||http://indico.cern.ch/conferenceDisplay.py?confId=108375
 
 
|-
 
|-
|24.-25. January 2011||Amsterdam||Operations Management Board||1||https://www.egi.eu/indico/conferenceDisplay.py?confId=153
+
|24.-25.1. 2011||Amsterdam||Operations Management Board||1||https://www.egi.eu/indico/conferenceDisplay.py?confId=153
 +
|-
 +
|25.1. 2011|| Amsterdam||OTAG ||1 ||https://www.egi.eu/indico/conferenceDisplay.py?confId=245
 +
|-
 +
|28.11. - 2.12 2010||CERN||ATLAS computing week||1||-
 
<!-- formatting text -->
 
<!-- formatting text -->
 
|-
 
|-
Line 73: Line 71:
 
===2.1. Progress Summary===
 
===2.1. Progress Summary===
  
* National instances of Nagios, Top BDII and Operations portal were provided, as well as set of services provided for VOCE and Auger VOs (WMS, LB, UI, MyProxy, VOMS).
+
* APEL accounting service was successfully deployed on CESNET cluster, complete migration to APEL will be finished in February.  
* APEL accounting service was successfully deployed on Cesnet cluster, complete migration to APEL will be finished in February.  
+
* Migration to new certification authority CESNET CA3 has started, most of services are already running with service certificate signed by new CA. 
* Migration to glite 3.2 was done on almost all services, the only missing services are LCG-CE (still missing in glite 3.2) and MON box.
+
* Migration to glite 3.2 was done on almost all services, the only missing services are LCG-CE (still missing in glite 3.2) and MON box (delayed due to certificate upgrade).
* Migration to new certification authority CESNET CA3 has started, most of services are already running with service certificate signed by new CA 
+
* Regular maintenance and upgrades of national instances of Nagios, Top BDII and Operations, as well as set of services provided for VOCE and Auger VOs (WMS, LB, UI, MyProxy, VOMS, LFC).
* We have been further maintaining the RequestTracker ticketing system (RT) and the
+
* Maintenance of the RequestTracker ticketing system (RT) and the interface between RT and the GGUS. NGI_CZ is also actively participating in the GGUS-RT Interface Task Force. In the last reporting period we had to deal with issues stemming from unrehearsed and impromptu interconnecting of different NGI ticketing systems and mailing lists and take measures to prevent these issues in the future.
interface between RT and the GGUS. NGI_CZ is also actively participating in the
+
* As the coordinator of the security monitoring activity of EGI CSIRT, we produced a draft of functions of security dashboard and initiated discussions with the portal developers and OTAG about enhancing the current operations dashboard accordingly. A final decision about the implementation has been than taken by OTAG and the work has started. A first prototype of the security dashboard is planned for cca May 2011.
GGUS-RT Interface Task Force. In the last reporting period we had to deal with
+
* Improvements to the EGI Pakiti service, which detects unpatched machines in the infrastructure (namely an alerting mechanism was added, which sents out notifications whenever a critical vulnerability appears).
issues stemming from unrehearsed and impromptu interconnecting of different NGI
+
* Regular activities in the EGI CSIRT, we acted a duty-contact for one week.
ticketing systems and mailing lists and take measures to prevent these issues in
+
* Regular support for VOCE users (plus yearly account renewal), Belle community in Czech Republic (experiments on prague_cesnet_lcg2 cluster), and support for Auger VO (data management issues, long-term jobs running over time limits)
the future.
+
* An experimental instance of the LB server has been deployed, along with required MSG_PUBLISH patches.
*As the coordinator of the security monitoring activity of EGI CSIRT, we produced a draft of functions of security dashboard and initiated discussions with the portal developers and OTAG about enhancing the current operations dashboard accordingly. A final decision about the implmentation has been than taken by OTAG and the work has started. A first prototype of the security dashboard is planned for cca May 2011.
 
*We've applied additional improvements to the EGI Pakiti service, which detects unpatched machines in the infrastructure (namely an alerting mechanism was added, which sents out notifications whenever a critical vulnerability appears).
 
* Regular activities in the EGI CSIRT, we acted a duty-contact for one week,
 
  
 
===2.2. Main Achievements===
 
===2.2. Main Achievements===
* gLite 3.2 migration almost done, MON box upgrade was delayed with CA change, the only missing service  
+
* gLite 3.2 migration almost done, MON box upgrade was delayed with CA change, the only missing service is LCG-CE
is LCG-CE
+
* Installation of new worker nodes adding about 4600 HEPSpec performance to cluster praguelcg2. Installation of new DPM disk servers that should provide about 1PB disk space. This space is not in production yet. The servers are still under performance and burn-in tests.
  
 
===2.3. Issues and mitigation===
 
===2.3. Issues and mitigation===
Line 98: Line 93:
 
!scope="col"| Mitigation Description
 
!scope="col"| Mitigation Description
 
|-
 
|-
|Bad availability/reliability numbers for  prague_cesnet_lcg2 cluster in December|| Problem was invoked by SAM org.sam.WN-Rep group, which is failing irregularly. Problem is detected only by SAM test, productions runs are not affected. We are investigating this issue, more in is in GGUS ticket https://gus.fzk.de/ws/ticket_info.php?ticket=66107
+
|Low availability/reliability numbers for  prague_cesnet_lcg2 cluster in December (73%)|| Problem was invoked by SAM org.sam.WN-Rep group, which is failing irregularly. Problem is detected only by SAM test, productions runs are not affected. We are investigating this issue, more information in is in GGUS ticket https://gus.fzk.de/ws/ticket_info.php?ticket=66107
 
|-
 
|-
 
|}
 
|}

Latest revision as of 14:10, 7 January 2015



Quarterly Report Number || NGI Name || Partner Name || Author
3 || NGI_CZ || Czech Republic || Miroslav Ruda


1. MEETINGS AND DISSEMINATION

1.1. CONFERENCES/WORKSHOPS ORGANISED

Date || Location || Title || Participants || Outcome (Short report & Indico URL)

1.2. OTHER CONFERENCES/WORKSHOPS ATTENDED

Date || Location || Title || Participants || Outcome (Short report & Indico URL)
24-25 January 2011 || Amsterdam || Operations Management Board || 1 || https://www.egi.eu/indico/conferenceDisplay.py?confId=153
25 January 2011 || Amsterdam || OTAG || 1 || https://www.egi.eu/indico/conferenceDisplay.py?confId=245
28 November - 2 December 2010 || CERN || ATLAS computing week || 1 || -


1.3. PUBLICATIONS

Publication title || Journal / Proceedings title || Journal references (volume number, issue, pages from - to) || Authors (1., 2., 3., et al?)

2. ACTIVITY REPORT

2.1. Progress Summary

  • The APEL accounting service was successfully deployed on the CESNET cluster; the complete migration to APEL will be finished in February.
  • Migration to the new certification authority CESNET CA3 has started; most services are already running with a service certificate signed by the new CA.
  • Migration to gLite 3.2 was done on almost all services; the only missing services are LCG-CE (still missing in gLite 3.2) and the MON box (delayed due to the certificate upgrade).
  • Regular maintenance and upgrades of the national instances of Nagios, Top BDII and the Operations portal, as well as the set of services provided for the VOCE and Auger VOs (WMS, LB, UI, MyProxy, VOMS, LFC).
  • Maintenance of the RequestTracker ticketing system (RT) and the interface between RT and GGUS. NGI_CZ is also actively participating in the GGUS-RT Interface Task Force. In the last reporting period we had to deal with issues stemming from the unrehearsed, impromptu interconnection of different NGI ticketing systems and mailing lists, and to take measures to prevent these issues in the future.
  • As the coordinator of the security monitoring activity of EGI CSIRT, we produced a draft of the functions of the security dashboard and initiated discussions with the portal developers and OTAG about enhancing the current operations dashboard accordingly. A final decision about the implementation has since been taken by OTAG and the work has started. A first prototype of the security dashboard is planned for around May 2011.
  • Improvements to the EGI Pakiti service, which detects unpatched machines in the infrastructure (namely, an alerting mechanism was added that sends out notifications whenever a critical vulnerability appears).
  • Regular activities in the EGI CSIRT; we acted as duty contact for one week.
  • Regular support for VOCE users (plus yearly account renewal), for the Belle community in the Czech Republic (experiments on the prague_cesnet_lcg2 cluster), and for the Auger VO (data management issues, long-term jobs running over time limits).
  • An experimental instance of the LB server has been deployed, along with required MSG_PUBLISH patches.

2.2. Main Achievements

  • The gLite 3.2 migration is almost done: the MON box upgrade was delayed by the CA change, and the only missing service is LCG-CE.
  • Installation of new worker nodes adding about 4600 HEPSpec of performance to the cluster praguelcg2. Installation of new DPM disk servers that should provide about 1 PB of disk space. This space is not in production yet; the servers are still undergoing performance and burn-in tests.

2.3. Issues and mitigation

Issue Description || Mitigation Description
Low availability/reliability numbers for the prague_cesnet_lcg2 cluster in December (73%) || The problem was caused by the SAM org.sam.WN-Rep group, which fails irregularly. It is detected only by the SAM test; production runs are not affected. We are investigating this issue; more information is in GGUS ticket https://gus.fzk.de/ws/ticket_info.php?ticket=66107