

Latest revision as of 20:46, 24 December 2014

Plans 2012 SA1.8

Assessment of progress, 2011

Core Grid Services

DTEAM VO Services

The migration of the DTEAM VO was finalized in January 2011. The DTEAM VO is served by 2 geographically distributed VOMS servers in Thessaloniki and Athens (voms.hellasgrid.gr and voms2.hellasgrid.gr). During the year, 7 NGI groups were created in the DTEAM VO (NGI_FI, NGI_NDGF, NGI_DE, NGI_IT, NGI_IE, NGI_UK, NGI_ZA) and 3 ROC Groups were decommissioned (ROC_Italy, SEE, dech).

EGI Catch All CA

During 2011 the EGI Catch All CA set up three new Registration Authorities: in Senegal, in Egypt, and for SixSq (a partner in StratusLab) in Switzerland. This brings the total number of RAs to 7.

Core Services for Site Certification

A TOP-BDII, a WMS and an LB service were installed as catch-all services for NGIs that do not operate their own services for the site certification process. In addition, a portal was built that syncs with GOCDB and lets NGI Managers add and remove uncertified sites from the catch-all TOP-BDII on demand.


Operations tool and availability computation

Propose Changes for Operations tools

An assessment of the operations tools was completed and the results were presented at the EGI Technical Conference in Lyon.

POEM_and_ACE_requirements

Data more readily available to NGIs

This has been provided by MyEGI. Improvements may be suggested as more experience is gained from its use.

Follow-up with developers for issues that affect accuracy

A high number of UNKNOWN statuses is reported by certain NGI Nagios instances and sites. This is still under investigation, but it appears to involve mostly NGI Nagios operations rather than developers. This is an ongoing activity and will be followed up by TSA1.7.

Operational Level Agreements (OLAs)

MSA 411

The milestone MSA 411 "Operational Level Agreements within the EGI Production Infrastructure" was achieved during 2011.

https://documents.egi.eu/document/524


Continue adaptations to the OLA between NGI and sites

The RC OLA has been finalized and is available at:

https://documents.egi.eu/document/31

Produce OLA between EGI and NGIs, as well as a Core services OLA

The RP OLA, which was started during 2011, partially covers this: the NGI responsibilities include the services that an NGI provides as core services. The work is ongoing, since more service thresholds should be included in this OLA as the tools evolve. The first release of the RP OLA was finalized in 2011 and the second release will follow in early 2012.

https://documents.egi.eu/document/463

In 2012 the EGI.eu OLA will cover the services offered by EGI.

Propose an OLA amendment procedure (Spring 2011)

This action was not completed, as the OLAs were not finalized. It is an action for 2012.

Evaluate the impact of increased availability suspension threshold

During 2011 TSA1.8 evaluated the impact of increasing the availability suspension threshold. The results of the evaluation were presented at the Technical Forum in Lyon:

https://www.egi.eu/indico/conferenceDisplay.py?confId=267


Reconvene with the OLA task force at least once per 2 months

A fixed two-month schedule was not really needed. Depending on the requirements, sometimes 2 meetings took place within 1 month, as the TF work has to go through the OMB for approval and any additional comments have to be addressed.

Availability/Reliability

TSA1.8 is responsible for the distribution of the monthly league tables and continues to add useful material to the wiki:

Availability_and_reliability_monthly_statistics

The investigation into whether advancements in the operational tools can simplify the procedure is an ongoing activity and will continue in 2012:

https://rt.egi.eu/guest/Ticket/Display.html?id=289

The investigation into the prime causes of site failures is ongoing. The first step is to determine the causes of the high percentage of UNKNOWN states in NGI Nagios (mentioned above under the accuracy issues) before going deeper into the sites. Site replies to COD tickets for the reports could start to be categorized in 2012. The initial results of the investigation show that the problems are mostly related to operations.

Plans for 2012

Core Grid Services

VO Services

The plan for 2012 is to finalize the decommissioning of the legacy ROC Groups in the DTEAM VO (ROC_Benelux, ROC_France, ROC_UKI). Currently the DTEAM VO services are provided using the VOMRS service. Another item is to investigate whether the new VOMS service provides all the needed functionality.

During 2012Q1 TSA1.8 will assess the need for, and the feasibility of, setting up a replicated service for the OPS VO at the GRNET VOMS Infrastructure, while the primary VO services for the OPS VO are provided by CERN.

EGI Catch All CA

Continue the support and operation of the EGI Catch All CA and the expansion of the RA Network as needed.

Core Services for Site Certification

Continue the support and operation of the Site Certification Core Services.

Operational Level Agreements (OLAs)

MSA 418

The milestone MSA 418 "Operational Level Agreements (OLAs) within the EGI production infrastructure" is planned for 2012Q1, with a deadline at the end of the first month of 2012Q2.

Produce OLA between EGI and NGIs, as well as a Core services OLA

The 2nd release of the RP OLA will be finalized in early 2012Q1. A new work item for 2012 is the EGI.eu OLA. A draft version will be ready in 2012Q2 and the final version is expected in 2012Q3.

OLA Task Force

The OLA Task Force as a fixed group has finished its work. For the upcoming work on the EGI.eu OLA, TSA1.8 will establish direct communication channels with the people that are operating services at the EGI.eu level.

Availability/Reliability

TSA1.8 will continue handling the validation and distribution of the monthly league tables for Resource Center and EGI.eu services, and maintaining the relevant wiki space:

Availability_and_reliability_monthly_statistics