
Difference between revisions of "Agenda-02-06-2014"

 
(30 intermediate revisions by 2 users not shown)
Line 17: Line 17:


Recent, or future planned, releases from the product teams:
* TO UPDATE
* [http://www.dcache.org/downloads/1.9/release-notes-2.6.shtml dCache v. 2.6.28]
* [https://github.com/CESNET/glite-lb/wiki/Glite-l&b-release-page#LB_4111 L&B v. 4.1.1]
* [https://issues.infn.it/jira/issues/?jql=project%20%3D%20CREAM%20AND%20fixVersion%20%3D%20%22CREAM%201.16.3%22%20AND%20resolution%20is%20not%20EMPTY CREAM v. 1.16.3]
* [https://issues.infn.it/jira/issues/?jql=project%20%3D%20WMS%20AND%20fixVersion%20%3D%20%223.6.5%20WMS%20Server%22%20AND%20resolution%20is%20not%20EMPTY WMS 3.6.5]
* [http://italiangrid.github.io/storm/2014/05/23/storm-v.1.11.4-released.html STORM v. 1.11.4]
 
 


== 1.2 UMD release  ==
Line 37: Line 43:
== 1.3 Staged rollout updates  ==


new:
* empty


old stuff:
old stuff (no early adopters):
* gridway v. 5.14.2
* globus-info-provider-service v. 0.2.1
* globus-default-security v.  5.2.4 ( 9 months)
* emi-cluster v. 2.0.1
* security-integration v. 3.0.0 (7 months)
* globus-rls v. 5.2.5
* emi-cluster v. 2.0.1 ( 7 months)
* mpi v. 1.5.3
* globus-rls v. 5.2.5 (3 months)
 
* mpi v. 1.5.3 ( 4 weeks )
=== UMD 3 EA ===


=== UMD 3 Campaign ===
* '''Some sites have outdated contact points for their early adopters (EA)''', so please check in the table that all contacts and products are still correct, and send me an email if you need to add or remove contacts (SSO account mandatory): ([https://www.egi.eu/earlyAdopters/table full site list])


* Contacting sites ([https://www.egi.eu/earlyAdopters/table full site list]) in order to update the products list for the UMD-3 products. '''Some sites have outdated contact points for their early adopters (EA)''', so please check in the table that all contacts are still correct, and send me an email if you need to add or remove contacts (SSO account mandatory)
=== '''New Products''' ===


* Sites that did not reply and are still tagged as EA for UMD1 / UMD2 products (updated Monday morning):
FTS3, squid and CVMFS will soon be included in UMD, and it is important to have some early adopters for these components. If anyone is interested, please contact me or Cristina to be included in the early adopter list.
** NGI_BG, BG01-IPP
** NGI_DE, FZK-LCG2
** NGI-DK, UNICPH-NBI
** NGI_FRANCE, GRIF
** NGI_FRANCE, IN2P3-CC
** NGI_HR, egee.srce.hr
** NGI_IBERGRID, UPV-GRyCAP
** NGI_NL, SARA
** NGI_SE, HPC2N
** NGI_UA, UA-KNU
** NGI_UK, UKI-LT2-RHUL
** NGI_UK, UKI-NORTHGRID-LANCS-HEP
** NGI_UK, UKI-NORTHGRID-MAN-HEP
** NGI_UK, UKI-SCOTGRID-ECDF
** NGI_UK, UKI-SOUTHGRID-CAM-HEP
** NGI_UK, UKI-SOUTHGRID-OX-HEP
** ROC_canada, CA-McGill-CLUMEQ-T2


== 1.4 Next releases  ==
Line 79: Line 71:
== 2.1 Report from DMSU  ==


TO UPDATE STATUS
=== ARGUS/WMS Certificate Chain Mixups ===
 
* Affecting several sites, where WMS is unable to make SSL connection to ARGUS.
* from Alessandro Paolini:
* In all probability this is a combination of two factors: <code>curl</code> from the SL6 distribution, which is built with NSS SSL rather than OpenSSL and, as such, does not really support proxy certificates, and a bug in Java, hopefully fixed as of Java 7 Update 60.
** The nagios probe '''eu.egi.sec.DPM-GLUE2-EMI-1''' should be modified because it tries to detect some information that the new version of DPM doesn't publish any more
* Related issues:
*** references: [https://ggus.eu/index.php?mode=ticket_info&ticket_id=104943 GGUS #104943], [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105143 GGUS #105143]
** https://ggus.eu/index.php?mode=ticket_info&ticket_id=101486
** https://ggus.eu/index.php?mode=ticket_info&ticket_id=101554
* This issue is already being investigated at '''3rd level''', but the PTs cannot decide who is responsible, and DMSU is overseeing.
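To check whether a given host is exposed to the NSS-built <code>curl</code> described above, one option is to inspect the first line of <code>curl --version</code>, which lists the linked libraries as <code>name/version</code> tokens. The following is a minimal Python sketch under that assumption; the function names and the sample version line in the usage note are illustrative and not taken from the tickets.

```python
import subprocess
from typing import Optional

# TLS libraries we look for in the `curl --version` banner.
KNOWN_BACKENDS = ("NSS", "OpenSSL", "GnuTLS")


def tls_backend(version_line: str) -> Optional[str]:
    """Return the TLS library named in a `curl --version` first line, if any."""
    for token in version_line.split():
        # Library tokens look like "NSS/3.14.0.0"; keep the part before "/".
        name = token.split("/", 1)[0]
        if name in KNOWN_BACKENDS:
            return name
    return None


def local_curl_backend() -> Optional[str]:
    """Run the local curl binary and report its TLS library (None if unknown)."""
    result = subprocess.run(
        ["curl", "--version"], capture_output=True, text=True, check=True
    )
    lines = result.stdout.splitlines()
    return tls_backend(lines[0]) if lines else None
```

For example, <code>tls_backend("curl 7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3")</code> returns <code>"NSS"</code>; a host reporting NSS would be a candidate for the behaviour described in this item.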
=== CREAM CLI/GridSite SegFaults at Long-Lived Proxies ===
* <code>glite-ce-job-submit</code> crashes if the user's proxy certificate has a lifetime exceeding 240 hours (10 days)
* Cause tracked down to GridSite, forwarded to the GridSite PT to fix
* Related issue:
** https://ggus.eu/?mode=ticket_info&ticket_id=104009
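Until GridSite is fixed, a submission wrapper could refuse proxies whose lifetime exceeds the 240-hour threshold mentioned above. The sketch below only encodes that threshold; the function name is hypothetical, and how the remaining lifetime is obtained (for instance from a proxy inspection tool) is left to the caller.

```python
from datetime import timedelta

# Threshold reported in this item: proxies longer than 240 hours (10 days)
# crash glite-ce-job-submit.
CREAM_CLI_LIFETIME_LIMIT = timedelta(hours=240)


def proxy_lifetime_is_risky(lifetime_seconds: int) -> bool:
    """True if a proxy with this lifetime is expected to trigger the crash."""
    return timedelta(seconds=lifetime_seconds) > CREAM_CLI_LIFETIME_LIMIT


if __name__ == "__main__":
    twelve_days = 12 * 24 * 3600
    if proxy_lifetime_is_risky(twelve_days):
        print("Proxy lifetime exceeds 240 hours; request a shorter proxy.")
```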


== 2.2 Migration of Central SAM services & reconfiguration of NGIs SAM instances  ==


TO UPDATE STATUS
*Central SAM services were migrated from CERN to the new consortium (GRNET, CNRS and SRCE). In order to enable a smooth transition we have agreed to start using new hostnames:
 
*Central SAM services are in the process of migrating from CERN to the new consortium (GRNET, CNRS and SRCE). In order to enable a smooth transition we have agreed to start using new hostnames:
** '''mon.egi.eu for grid-monitoring.cern.ch'''
** '''opsmon.egi.eu for ops-monitor.cern.ch'''
Line 97: Line 94:


*The following instances are not yet configured, and tickets have been opened to follow them up:
** cygrid-nagios.grid.ucy.ac.cy - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105123 NGI_CYGRID #105123]
** grid-nagios.ii.edu.mk - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105124 NGI_MARGI #105124]
** mon-ua.bitp.kiev.ua - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105125 NGI_UA #105125]
** nagios.egee.cesnet.cz - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105126 NGI_CZ #105126]
** nagios.ipp.acad.bg - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105127 NGI_BG #105127]
** ngi-de-nagios.gridka.de - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105128 NGI_DE #105128]
** node02-02.imi.renam.md - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105129 NGI_MD #105129]
** rnag1.grid.kiae.ru - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105130 ROC_Russia #105130]
** rocnagios.grid.sinica.edu.tw - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105131 ROC_Asia/Pacific #105131]
** sam.grid.am - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105132 NGI_ARMGRID #105132]
** wipp-srs.weizmann.ac.il - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105133 NGI_IL #105133]


Line 135: Line 123:
</ul>
</ul>


== 2.1 EMI-2 decommissioning  ==
== 2.3 EMI-2 decommissioning  ==
 
 
TO UPDATE STATUS


* Probes are running in midmon: [https://wiki.egi.eu/wiki/MW_SAM_tests#EMI-2_tests Documentation].
** All products but dCache are being retired as previously announced. '''dCache''' extended the support for the 2.2.x versions until July 2014.
* List of services failing '''EMI-2 test''':  
** '''Important Notes:'''
*** APEL is one of the UMD 2/EMI 2 services that is no longer supported, but there are still sites publishing accounting information using UMD 2/EMI 2 APEL clients - see the list available at: [http://goc-accounting.grid-support.ac.uk/consumer/ APEL consumer]
*** Tutorial on how to migrate APEL clients from EMI 2 to EMI 3 - available at '''[https://indico.egi.eu/indico/contributionDisplay.py?contribId=118&confId=1994 APEL @ EGI CF 2014]'''
* List of services failing '''EMI-2 tests''':  
** as of March 7th - [https://drive.google.com/file/d/0B7LpvREXG9c-WnBGSFV4VGJUNFE/edit?usp=sharing Download XLS file]
** as of March 3rd - [http://bit.ly/EMI2_NGI_07042014 EMI2_endpoints_NGI_07042014]
** as of April 24 - [http://goo.gl/vY6Mtm EMI2_endpoints_NGI_24042014]
** '''as of June 2''' - [TOADD]
** '''as of June 2''' - [https://indico.egi.eu/indico/materialDisplay.py?materialId=4&confId=2223 EMI2_endpoints_NGI_02062014]
*  
* Status - presentation [https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=slides&confId=2223 PPTX], [https://indico.egi.eu/indico/getFile.py/access?resId=1&materialId=slides&confId=2223 PPT], [https://indico.egi.eu/indico/getFile.py/access?resId=2&materialId=slides&confId=2223 PDF]
* Please report any errors ASAP
* Reports received:
** NGI_AEGIS
** NGI_UK
** NGI_NL (BEgrid-ULB-VUB)
* '''ALL SITES PROVIDING UMD 2/EMI 2 services MUST BE IN DOWNTIME'''
 
== 2.4 Obsoleted MW SAM tests  ==


== 2.3 Probes raising alarms since April ==
* Taking into account the OMB approval, the following probes were decommissioned starting on '''Wednesday May 28th''':
* none
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#EMI-1_tests UMD-1 tests]
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#gLite_3.2_tests gLite 3.2 tests]
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#Security_SAM_instance Security SAM instance (Classic SE/lcg-CE tests)]


= 3. AOB  =


==== 3.1 Next meeting  ====
== 3.1 EGI UMD usage survey ==
 
* [https://operations-portal.egi.eu/broadcast/archive/id/1151 BROADCAST] sent on 28.05.2014:
** [https://www.surveymonkey.com/s/MQ6G8BZ Survey] - please take the survey by '''June, 15th 2014'''


'''June, 2nd 2014'''  
== 3.2 Next meeting  ==
 
'''June, 16th 2014'''


= 4. Minutes  =
[TOADD Minutes 03.06.2014]




[[Category:Operations]]
 
[[Category:Grid_Operations_Meetings]]

Latest revision as of 17:30, 23 October 2014

Audio conference link Conference system is Adobe Connect, no password required.
Audio conference details Indico page



1. Middleware releases and staged rollout

1.1 News from URT

Recent, or future planned, releases from the product teams:


1.2 UMD release

Ready to be released :

  • wms v. 3.6.4
  • gsisshterm v. 2.1.0
  • arc v. 13.11.1
  • bdii-core v. 1.5.7
  • canl v. 2.2.2
  • cream-ui v. 1.15.3
  • gridsite v. 2.2.3
  • DPM / LFC 1.8.8
  • bouncycastle-mail v. 1.46.2
  • dcache-srm-client v. 2.2.22

1.3 Staged rollout updates

new:

  • empty

old stuff (no early adopters):

  • gridway v. 5.14.2
  • globus-info-provider-service v. 0.2.1
  • emi-cluster v. 2.0.1
  • globus-rls v. 5.2.5
  • mpi v. 1.5.3

UMD 3 EA

  • Some sites have outdated contact points for their early adopters (EA), so please check in the table that all contacts and products are still correct, and send me an email if you need to add or remove contacts (SSO account mandatory): (full site list)

New Products

FTS3, squid and CVMFS will soon be included in UMD, and it is important to have some early adopters for these components. If anyone is interested, please contact me or Cristina to be included in the early adopter list.

1.4 Next releases

  • Beginning of June
  • Middle/End of July
  • October

2. Operational issues

2.1 Report from DMSU

ARGUS/WMS Certificate Chain Mixups

CREAM CLI/GridSite SegFaults at Long-Lived Proxies

2.2 Migration of Central SAM services & reconfiguration of NGIs SAM instances

  • Central SAM services were migrated from CERN to the new consortium (GRNET, CNRS and SRCE). In order to enable a smooth transition we have agreed to start using new hostnames:
    • mon.egi.eu for grid-monitoring.cern.ch
    • opsmon.egi.eu for ops-monitor.cern.ch
  • CERN services will be operational until May 1st. Afterwards aliases will point to new instances.
  • If Regional & VO SAM instances are not reconfigured, they will stop working after the switch-off of the CERN instance.
  • The following instances are not yet configured, and tickets have been opened to follow them up:


  • Configuration advice:
    1. NGI and VO SAM Nagios instances
      • Create file /etc/voms2htpasswd-static.d/opsmon.conf with the following content:
      /C=HR/O=edu/OU=srce/CN=opsmon.egi.eu
    2. NGI SAM Nagios instances
      • Set the following two variables in YAIM:
      ATP_ROOT_URL="http://mon.egi.eu/atp"
      POEM_SYNC_URLS="http://mon.egi.eu/poem/api/0.1/json/"
    3. Rerun YAIM:
      /opt/glite/yaim/bin/yaim -c -s site-info.def -n NAGIOS -n SAM_NAGIOS

    If you prefer not to run YAIM skip the steps 2 & 3 and perform the following:

    1. NGI and VO SAM Nagios instances
      • Restart service voms-htpasswd:
      service voms-htpasswd restart
    2. NGI SAM Nagios instances
      • Modify parameter POEM_SYNC_NS_URLS in file /etc/poem/poem_sync.ini:
      POEM_SYNC_NS_URLS: http://mon.egi.eu/poem/api/0.1/json/
      • Modify parameter ATP_ROOT_URL in file ncg/ncg.conf:
      ATP_ROOT_URL=http://mon.egi.eu/atp
      The parameter is repeated several times; you need to modify it in all places.
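
    Because ATP_ROOT_URL occurs several times in ncg.conf, editing it by hand is easy to get wrong. The following is a minimal Python sketch of the manual step above, assuming each occurrence sits on its own line in the form "ATP_ROOT_URL = <url>" or "ATP_ROOT_URL=<url>". The function name is hypothetical and the sketch works on the file content as text, so back up the real file before applying anything like it.

```python
import re
from typing import Tuple

# New ATP endpoint given in the configuration advice.
NEW_ATP_ROOT_URL = "http://mon.egi.eu/atp"

# Match every ATP_ROOT_URL assignment, keeping indentation and spacing
# around "=" so only the value is replaced.
_ATP_LINE = re.compile(r"^([ \t]*ATP_ROOT_URL[ \t]*=[ \t]*).*$", re.MULTILINE)


def point_atp_to_new_host(conf_text: str) -> Tuple[str, int]:
    """Return the updated config text and the number of lines changed."""
    return _ATP_LINE.subn(lambda m: m.group(1) + NEW_ATP_ROOT_URL, conf_text)
```

    Calling point_atp_to_new_host on the file content returns the rewritten text together with a count, which can be compared with the number of occurrences you expect before writing the result back.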

2.3 EMI-2 decommissioning

  • Probes are running in midmon: Documentation.
    • All products but dCache are being retired as previously announced. dCache extended the support for the 2.2.x versions until July 2014.
    • Important Notes:
      • APEL is one of the UMD 2/EMI 2 services that is no longer supported, but there are still sites publishing accounting information using UMD 2/EMI 2 APEL clients - see the list available at: APEL consumer
      • Tutorial on how to migrate APEL clients from EMI 2 to EMI 3 - available at APEL @ EGI CF 2014
  • List of services failing EMI-2 tests:
  • Status - presentation PPTX, PPT, PDF
  • Reports received:
    • NGI_AEGIS
    • NGI_UK
    • NGI_NL (BEgrid-ULB-VUB)
  • ALL SITES PROVIDING UMD 2/EMI 2 services MUST BE IN DOWNTIME

2.4 Obsoleted MW SAM tests

3. AOB

3.1 EGI UMD usage survey

  • BROADCAST sent on 28.05.2014:
    • Survey - please take the survey by June, 15th 2014

3.2 Next meeting

June, 16th 2014

4. Minutes