
Difference between revisions of "Agenda-14-07-2014"

(Created page with "{| |- | [http://connect.ct.infn.it/egi-inspire-sa1/ Audio conference link] | ''Conference system is Adobe Connect, no password required.'' |- | [https://indico.egi.eu/indico/mat...")
 
 
(30 intermediate revisions by 4 users not shown)
Line 17: Line 17:


Recent, or future planned, releases from the product teams:
*  
* ARC - Release 13.11u2 version 4.2.0
** fixes the Nagios probes that broke in the previous release; also extends the emi-es schema to make it usable for ATLAS.
* dCache:
** [http://www.dcache.org/downloads/1.9/release-notes-2.6.shtml#dCache_2.6.28 server 2.6.28] - ready to be included in the UMD
** [http://www.dcache.org/downloads/1.9/release-notes-2.2.shtml srmclient 2.2.27]
*** fixed the export SRM_PATH bug in srmrmdir and the ArrayIndexOutOfBoundsException when processing multiple surls invoking srmget
* DPM:
**[https://svnweb.cern.ch/trac/lcgdm/blog/dpm-dsi-1.9.3-3 DPM-dsi - new version 1.9.3-3, already available in EPEL]
*** contains patches for proper EOF handling and for an FTS2 issue
*GFAL/lcg_util:
**[http://dmc.web.cern.ch/release/davix-031 DAVIX 0.3.1 was released in EPEL on 04/06/2014]
**[http://dmc.web.cern.ch/release/gfal2-255 GFAL2 2.5.5]
**[http://dmc.web.cern.ch/release/gfal2-python-15 GFAL 2 PYTHON 1.4.1 (currently in EPEL-test)]
**[http://dmc.web.cern.ch/release/gfalfs-15 gfalFS 1.5]
**[http://dmc.web.cern.ch/release/srm-ifce-1.19.0 SRM-IFCE 1.19.0]
*[https://github.com/CESNET/glite-lb/wiki/Glite-l&b-release-page#LB_4121 L&B - v. 4.1.2]
*[https://github.com/CESNET/gridsite/wiki/Gridsite-release-page#GridSite_2251 GridSite v. 2.2.5]
*[https://github.com/CESNET/canl-c/wiki/caNl-c-Release-Page#caNlc_2151 caNL-c v. 2.1.5]
* CREAM
** [http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-18-07-07-2014-v-3-9-0-1#CREAM_LSF_module_v_2_0_4 LSF module 2.0.4]
** [http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-14-03-03-2014-v-3-7-2-1#CREAM_TORQUE_v_2_1_3 TORQUE module v. 2.1.4]
** [http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-18-07-07-2014-v-3-9-0-1#CREAM_SLURM_v_1_0_2 SLURM module] 1.0.2
** [http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-18-07-07-2014-v-3-9-0-1#CREAM_GE_module_v_2_2_0 GE module 2.2.0]
** Future: [https://wiki.egi.eu/wiki/URT:Agenda-07-07-2014#CREAM BLAH 1.20.7 & CREAM 1.16.4]
* [https://issues.infn.it/jira/browse/STOR/fixforversion/11815 STORM v. 1.11.6]
* [https://wiki.egi.eu/wiki/URT:Agenda-07-07-2014#VOMS VOMS]
** [https://issues.infn.it/jira/browse/VOMS/fixforversion/11313 VOMS Admin server v. 3.3.0]
** [https://issues.infn.it/jira/browse/VOMS/fixforversion/11611 VOMS Clients v. 3.0.5, VOMS API Java v. 3.0.4]
** [https://issues.infn.it/jira/browse/VOMS/fixforversion/11816 VOMS C APIs, native clients and server v. 2.0.12]
* [https://wiki.egi.eu/wiki/URT:Agenda-07-07-2014#Globus Globus - Globus Toolkit version 6.0]


== 1.2 UMD release  ==
Line 24: Line 53:
'''UMD 3.7.0 released on 12.06.2014''' : http://repository.egi.eu/2014/06/12/release-umd-3-7-0/


Earlier revision:

* wms v. 3.6.4
* gsisshterm v. 2.1.0
* arc v. 13.11.1
* bdii-core v. 1.5.7
* canl v. 2.2.2
* cream-ui v. 1.15.3
* gridsite v. 2.2.3
* DPM / LFC 1.8.8
* bouncycastle-mail v. 1.46.2
* dcache-srm-client v. 2.2.22
* umd-release v. 3.0.1

== 1.3 Staged rollout updates  ==

new: 
* empty

old stuff (no early adopters):
* gridway v. 5.14.2
* globus-info-provider-service v. 0.2.1
* emi-cluster v. 2.0.1
* globus-rls v. 5.2.5
* mpi v. 1.5.3

Latest revision:

== 1.3 Staged rollout updates  ==

* In verification:
** gfal2 v. 2.5.5
* active:
** globus-info-provider-service v. 0.2.1
** cream v. 1.16.3
* Ready to be released:
** storm v. 1.11.4
** lb v. 11.1
** wms v. 3.6.5
** dcache v. 2.6.28


=== UMD 3 EA ===
Line 70: Line 90:
* Related issues:
** https://ggus.eu/index.php?mode=ticket_info&ticket_id=101486
** https://ggus.eu/index.php?mode=ticket_info&ticket_id=101554
* This issue is already being investigated at '''3rd level''', but the PTs cannot decide who is responsible, and DMSU is overseeing.
=== CREAM CLI/GridSite SegFaults at Long-Lived Proxies ===
* <code>glite-ce-job-submit</code> crashes if the user's proxy certificate has a lifetime exceeding 240 hours (10 days)
* Cause tracked down to GridSite, forwarded to the GridSite PT to fix
* 2 issues:
** one in GridSite for the segfault, forwarded to the GridSite PT to fix - '''UPDATE''' - fix provided with [https://github.com/CESNET/gridsite/wiki/Gridsite-release-page#GridSite_2251 GridSite v. 2.2.5 (issue #15)]
** one in caNL-c - '''UPDATE''' - fix provided with [https://github.com/CESNET/canl-c/wiki/caNl-c-Release-Page#caNlc_2151 caNL-c v. 2.1.5, issue #6]
* Related issues:
** https://ggus.eu/?mode=ticket_info&ticket_id=104009
** https://ggus.eu/?mode=ticket_info&ticket_id=105893
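
Until the fixed packages are installed, a site or user could check a proxy against the 240-hour limit before calling <code>glite-ce-job-submit</code>. The following is only a minimal sketch and not part of any gLite or VOMS tool: the function name <code>proxy_lifetime_ok</code> is hypothetical, and it is assumed that the lifetime is available in seconds, for example from <code>voms-proxy-info -timeleft</code> (which reports the remaining lifetime, an approximation of the total lifetime for a freshly created proxy).

```shell
#!/bin/sh
# Hypothetical helper (not part of gLite/VOMS): succeeds when the given
# lifetime in seconds does not exceed 240 hours (10 days).
MAX_SECONDS=$((240 * 3600))   # 240 h = 864000 s

proxy_lifetime_ok() {
    # $1: proxy lifetime in seconds
    [ "$1" -le "$MAX_SECONDS" ]
}

# Example usage (assumes voms-proxy-info is installed and a proxy exists):
#   proxy_lifetime_ok "$(voms-proxy-info -timeleft)" \
#       || echo "proxy lifetime exceeds 240 hours" >&2
```

The check is kept as a pure function on a number of seconds so that it can be reused with whatever tool reports the proxy lifetime at a given site.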
=== L&B & CREAM Update Issues - on EMI repositories ===
* <code>glite-ce-cream-api-java</code> update causes issues on CREAM for submission through WMS:
* '''UPDATE'''
* Issue:
** EMI 3 Update 18 solves some of the issues introduced with the previous update:
** https://ggus.eu/?mode=ticket_info&ticket_id=106134
*** [https://github.com/CESNET/glite-lb/wiki/Glite-l%26b-release-page#LB_4121 L&B v. 4.1.2]
*** '''Workaround for 106134 on CREAM''':
    <code>
    # yum downgrade glite-ce-cream-api-java
    # service gLite restart
    </code>
* <code>glite-lb-client</code> JAR file was moved to comply with EPEL packaging policy, which causes failures due to lack of coordination with CREAM
** https://ggus.eu/?mode=ticket_info&ticket_id=106123
** https://ggus.eu/?mode=ticket_info&ticket_id=106121
*** '''Workaround for 106121 & 106123 on CREAM''':
    <code>
    # rm -f /var/lib/tomcat6/webapps/ce-cream/WEB-INF/lib/glite-lb-client-java.jar
    # ln -s /usr/lib/java/glite-lb-client-java.jar /var/lib/tomcat6/webapps/ce-cream/WEB-INF/lib/glite-lb-client-java.jar
    # service gLite restart
    </code>
* New version of <code>glite-lb-common</code> creates problems for WMS:
** https://ggus.eu/?mode=ticket_info&ticket_id=106143
*** '''Workaround for 106143 on WMS''':
    <code>
    # yum downgrade glite-lb-common
    # service gLite restart
    </code>


== 2.2 Migration of Central SAM services & reconfiguration of NGIs SAM instances  ==
Line 119: Line 122:
*** APEL is one of the UMD 2/EMI 2 services that are no longer supported, but there are still sites (23 as of 15.06.2014) publishing accounting information using UMD 2/EMI 2 APEL clients - see the list available at: [http://goc-accounting.grid-support.ac.uk/consumer/ APEL consumer]
*** Tutorial on how to migrate APEL clients from EMI 2 to EMI 3 - available at '''[https://indico.egi.eu/indico/contributionDisplay.py?contribId=118&confId=1994 APEL @ EGI CF 2014]'''
    [[File:Emi2 accounting.jpg]]


* List of services failing '''EMI-2 tests''':  
Line 127: Line 133:
** '''as of June 16th''' - [https://indico.egi.eu/indico/materialDisplay.py?materialId=1&confId=2266 EMI2_endpoints_NGI_16062014]
* Status 16/06/2014 - presentation [https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=slides&confId=2266 PPTX], [https://indico.egi.eu/indico/getFile.py/access?resId=1&materialId=slides&confId=2266 PPT], [https://indico.egi.eu/indico/getFile.py/access?resId=2&materialId=slides&confId=2266 PDF]
* NGIs with UMD2/EMI2 services - 14.07.2014:
 
* Following up with COD - '''[https://ggus.eu/index.php?mode=ticket_info&ticket_id=106354 GGUS #106354]'''
* NGIs with UMD2/EMI2 services - TO BE UPDATED for 14.07.2014:
** AsiaPacific
*** TW-eScience, Taiwan-LCG2, IN-DAE-VECC-02, TW-EMI-PPS, IN-DAE-VECC-02, TOKYO-LCG2, TW-NCUHEP (in downtime)
Line 140: Line 148:
* '''ALL SITES PROVIDING UMD 2/EMI 2 services MUST BE IN DOWNTIME'''


== 2.4 Obsoleted MW SAM tests ==
== 2.4 SAM Nagios probes re-factoring ==


* Taking in account the OMB approval the following probes were decommissioned starting with '''Wednesday May 28th''':
* TF started the work regarding re-factoring of SAM Nagios probes, after discussion during [https://indico.egi.eu/indico/getFile.py/access?contribId=3&resId=0&materialId=slides&confId=2190 OMB/26.06.2014]
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#EMI-1_tests UMD-1 tests]
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#gLite_3.2_tests gLite 3.2 tests]
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#Security_SAM_instance Security SAM instance (Classic SE/lcg-CE tests)]


= 3. AOB  =
Line 152: Line 157:


* [https://operations-portal.egi.eu/broadcast/archive/id/1151 BROADCAST] sent on 28.05.2014:
** [https://www.surveymonkey.com/s/MQ6G8BZ Survey] - please take the survey by '''June, 15th 2014'''
** [https://www.surveymonkey.com/s/MQ6G8BZ Survey] - please take the survey - IT IS STILL OPEN
 
=== Partial results ===
 
* 78 answers submitted (between 1/5 and 1/4 of production sites)
* ~60% of the sites are using UMD repositories to install at least a subset of their services
** Of which, 92% (41) think that UMD is a useful service. 
* Most important features of UMD
** Unique repository containing most of the packages to install at site level
** Protection from untested updates in the community repositories
** Additional testing on top of developers certification
* Where to improve
** Increase the frequency of the UMD releases, reduce the time to release new updates in the repositories
** Test the products more, with more real users
 
=== Participate in the survey ===
* It can take from less than one minute to a few minutes (say, 10-15 minutes)
* We will use the feedback collected by the survey to evolve the UMD services


== 3.2 Next meeting  ==


'''30 June 2014'''
'''July 28, 2014''' - if enough participants available


= 4. Minutes  =
Line 162: Line 184:
* [https://indico.egi.eu/indico/materialDisplay.py?materialId=minutes&confId=2266 Minute 16.06.2014]


[[Category:Operations]]
[[Category:Grid_Operations_Meetings]]

Latest revision as of 17:30, 23 October 2014

Audio conference link Conference system is Adobe Connect, no password required.
Audio conference details Indico page



1. Middleware releases and staged rollout

1.1 News from URT

Recent, or future planned, releases from the product teams:

1.2 UMD release

UMD 3.7.0 released on 12.06.2014 : http://repository.egi.eu/2014/06/12/release-umd-3-7-0/

1.3 Staged rollout updates

  • In verification:
    • gfal2 v. 2.5.5
  • active:
    • globus-info-provider-service v. 0.2.1
    • cream v. 1.16.3


  • Ready to be released:
    • storm v. 1.11.4
    • lb v. 11.1
    • wms v. 3.6.5
    • dcache v. 2.6.28

UMD 3 EA

  • Some sites have outdated EA contact points, so please check in the table whether all contacts and products are still correct, and send me an email if you need to add or remove contacts (SSO account mandatory): (full site list)

New Products

FTS3, squid and CVMFS will soon be included in UMD, and it is important to have some early adopters for these components. If anyone is interested, please contact me or Cristina to be included in the early adopter list.

1.4 Next releases

  • Middle/End of July
  • October

2. Operational issues

2.1 Report from DMSU

ARGUS/WMS Certificate Chain Mixups

  • Affecting several sites, where WMS is unable to make SSL connection to ARGUS.
  • In all probability this is a combination of two things: curl from the SL6 distribution, which is built with NSS rather than OpenSSL and, as such, does not really support proxy certificates; and a bug in Java, hopefully fixed since Java 7 Update 60.
  • Related issues:
  • This issue is already being investigated at 3rd level, but the PTs cannot decide who is responsible, and DMSU is overseeing.

CREAM CLI/GridSite SegFaults at Long-Lived Proxies

L&B & CREAM Update Issues - on EMI repositories

  • UPDATE
    • EMI 3 Update 18 solves some of the issues introduced with the previous update:

2.2 Migration of Central SAM services & reconfiguration of NGIs SAM instances

Action closed:

  • Central SAM services were migrated from CERN to the new consortium (GRNET, CNRS and SRCE). To enable a smooth transition, we agreed to start using new hostnames:
    • mon.egi.eu for grid-monitoring.cern.ch
    • opsmon.egi.eu for ops-monitor.cern.ch
  • All Regional & VO SAM instances were reconfigured.

2.3 EMI-2 decommissioning

  • Probes are running in midmon: Documentation.
    • All products but dCache are being retired as previously announced. dCache extended the support for the 2.2.x versions until July 2014.
    • Important Notes:
      • APEL is one of the UMD 2/EMI 2 services that are no longer supported, but there are still sites (23 as of 15.06.2014) publishing accounting information using UMD 2/EMI 2 APEL clients - see the list available at: APEL consumer
      • Tutorial on how to migrate APEL clients from EMI 2 to EMI 3 - available at APEL @ EGI CF 2014
   Emi2 accounting.jpg


  • Following up with COD - GGUS #106354
  • NGIs with UMD2/EMI2 services - TO BE UPDATED for 14.07.2014:
    • AsiaPacific
      • TW-eScience, Taiwan-LCG2, IN-DAE-VECC-02, TW-EMI-PPS, IN-DAE-VECC-02, TOKYO-LCG2, TW-NCUHEP (in downtime)
    • NGI_DE
      • TUDresden-ZIH (in downtime), GoeGrid
    • NGI_FRANCE
      • GRIF (2 hosts in downtime), IN2P3-IRES
    • NGI_IBERGRID
      • UB-LCG2 (in downtime)
    • NGI_PL
      • ICM (in downtime)
  • ALL SITES PROVIDING UMD 2/EMI 2 services MUST BE IN DOWNTIME

2.4 SAM Nagios probes re-factoring

  • TF started the work regarding re-factoring of SAM Nagios probes, after discussion during OMB/26.06.2014

3. AOB

3.1 EGI UMD usage survey

  • BROADCAST sent on 28.05.2014:
    • Survey - please take the survey - IT IS STILL OPEN

Partial results

  • 78 answers submitted (between 1/5 and 1/4 of production sites)
  • ~60% of the sites are using UMD repositories to install at least a subset of their services
    • Of which, 92% (41) think that UMD is a useful service.
  • Most important features of UMD
    • Unique repository containing most of the packages to install at site level
    • Protection from untested updates in the community repositories
    • Additional testing on top of developers certification
  • Where to improve
    • Increase the frequency of the UMD releases, reduce the time to release new updates in the repositories
    • Test the products more, with more real users

Participate in the survey

  • It can take from less than one minute to a few minutes (say, 10-15 minutes)
  • We will use the feedback collected by the survey to evolve the UMD services

3.2 Next meeting

July 28, 2014 - if enough participants available

4. Minutes