Difference between revisions of "Agenda-12-01-2015"

 
(11 intermediate revisions by 2 users not shown)
Line 28: Line 28:
** http://toolkit.globus.org/toolkit/docs/6.0/
** http://toolkit.globus.org/toolkit/docs/6.0/
* STORM - released
* STORM - released
** [http://italiangrid.github.io/storm/2015/01/07/storm-v.1.11.5-released.html - STORM v. 1.11.5]
** [http://italiangrid.github.io/storm/2015/01/07/storm-v.1.11.5-released.html STORM v. 1.11.5]
* [https://wiki.egi.eu/wiki/Agenda-05-01-2015#UNICORE UNICORE v. 7.2.0]
* VOMS - released
* VOMS - released
** [http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-22-10-11-2014-v-3-13-0-1#VOMS_clients_v_3_0_5_native_2_0 VOMS C APIs, native clients and server v. 2.0.12, VOMS Clients v. 3.0.5, VOMS API Java v. 3.0.4]
** [http://www.eu-emi.eu/releases/emi-3-monte-bianco/updates/-/asset_publisher/5Na8/content/update-22-10-11-2014-v-3-13-0-1#VOMS_clients_v_3_0_5_native_2_0 VOMS C APIs, native clients and server v. 2.0.12, VOMS Clients v. 3.0.5, VOMS API Java v. 3.0.4]
Line 42: Line 43:
== 1.3 Staged rollout updates  ==
== 1.3 Staged rollout updates  ==


 
* voms-clients 3.0.5
* cream-torque v. 2.1.4
* voms-admin 2.0.12
* cream v. 1.16.4
* glexec-wn v. 1.3.0


=== In Verification ===
=== In Verification ===
* cream-ge v. 2.2.0
* cream-ge v. 2.3.1
* qcg-ntf v. 3.4.0 ('''SL6''')
* dpm v. 1.8.9
* qcg-comp v. 3.4.0 ('''SL6''')
* Globus 6.0.0 (Gram5, Default Security, MyProxy, GridFtp)
 


'''New Products'''
'''New Products'''
* squid v. 2.7.19
* squid v. 2.7.19
* fts3 v. 3.2.27
* fts3 v. 3.2.30


'''Ready to be released:'''
'''Ready to be released:'''
 
* cream-torque v. 2.1.4
* cream v. 1.16.4
* fts3 v.3.2.30
....
....


Line 70: Line 70:


== 1.4 Next releases  ==
== 1.4 Next releases  ==
* Mid of December
* End of Jan 2015
* 2015 Calendar is in planning phase


= 2. Operational issues  =
= 2. Operational issues  =
Line 82: Line 83:
** https://ggus.eu/index.php?mode=ticket_info&ticket_id=101486
** https://ggus.eu/index.php?mode=ticket_info&ticket_id=101486
* This issue is already being investigated at '''3rd level''' but PTs cannot decide who is responsible and DMSU is overseeing.
* This issue is already being investigated at '''3rd level''' but PTs cannot decide who is responsible and DMSU is overseeing.
* '''IMPORTANT''' - [http://indico.cern.ch/event/348018/ ARGUS Future and Support - meeting], [https://indico.egi.eu/indico/getFile.py/access?contribId=6&resId=1&materialId=slides&confId=2284 OMB summary]


== 2.2 EMI-2 decommissioning  ==
== 2.2 APEL multicore accounting ==


* UMD2 APEL clients - [http://bit.ly/apel_clients_umd2 GGUS: Sites using UMD2 APEL clients] - '''6 NGI didn't finish the update''':
As many EGI user communities are now exploiting multicore hardware it is important for them that accounting correctly reflects the usage made of cores and cpus. At the [https://indico.egi.eu/indico/conferenceDisplay.py?confId=2284 December Operations Management Board] (OMB) it was decided to ask all sites using the APEL client to configure it to publish the number of cores used by jobs.
** NGI_UA, ROC_LA, ROC_Canada, NGI_HR, NGI_DE, ROC_Asia/Pacific


* Following up with COD - '''[https://ggus.eu/index.php?mode=ticket_info&ticket_id=106354 GGUS #106354]''' - All reported services were upgraded
To enable multicore accounting, you will need to edit the configuration file for the apel parser. This is the software which parses blah logs and batch logs to produce accounting records. The configuration file is usually found at '''''/etc/apel/parser.cfg'''''. In the section labelled '''[batch]''', change:
** from time to time:
 
*** [https://midmon.egi.eu/nagios/cgi-bin/extinfo.cgi?type=2&host=foam.grid.kiae.ru&service=eu.egi.sec.WN-EMI-2-ops - foam.grid.kiae.ru WN]
<code>
*** [https://midmon.egi.eu/nagios/cgi-bin/extinfo.cgi?type=2&host=udo-ce06.grid.tu-dortmund.de&service=eu.egi.sec.WN-EMI-2-ops - udo-ce06.grid.tu-dortmund.de WN]
parallel = false
</code>


== 2.3 dCache 2.2.X decommissioning  ==
to
 
<code>
parallel = true
</code>
 
This will enable multicore reporting for all future accounting data.
Please note that this '''does not change historical data'''. Also note that republishing old data is not sufficient to show multicore information - the log files will need to be reparsed. If you '''wish to republish''' old data with multicore enabled, please '''open a GGUS ticket''' with the APEL team so that we can help you with the process.
 
If you use the SGE parser, please be aware that it only reports on the number of processors used in a job. It does not report the number of nodes. If you know how to get around this limitation, then please get in touch with the APEL team at apel-admins@stfc.ac.uk.
 
The multicore accounting data can currently be seen here: http://accounting-devel.egi.eu/show.php?
 
Drill down to your site and select the grouping "'''Show data for: Submitting Host'''" as a function of: "'''Number of Processors'''".
 
Values of 0 mean the parallel option was false when the data were published. The Submitting Host is a new feature in the accounting portal which lets a site see in more detail which CEs are publishing.
 
== 2.3 EMI-2 decommissioning  ==
 
* UMD2 APEL clients - [http://bit.ly/apel_clients_umd2 GGUS: Sites using UMD2 APEL clients] - '''3 NGI didn't finish the update''':
** NGI_UA, NGI_DE, ROC_Asia/Pacific
*** especially GOeGrid, MY-UPM-BIRUNI-01 are '''still using UMD2 client'''
 
* '''IMPORTANT''' -  switching-off UMD2/EMI2 APEL service - '''Friday 16th January'''
 
== 2.4 dCache 2.2.X decommissioning  ==


* presented in the OMB meeting, 18.09.2014 - https://wiki.egi.eu/wiki/OMB#9.2014.2F11
* presented in the OMB meeting, 18.09.2014 - https://wiki.egi.eu/wiki/OMB#9.2014.2F11
Line 108: Line 136:
     ggus: https://ggus.eu/index.php?mode=ticket_info&ticket_id=109904 - in progress (admins need more time - till the end of 2014)
     ggus: https://ggus.eu/index.php?mode=ticket_info&ticket_id=109904 - in progress (admins need more time - till the end of 2014)
     gocdb: https://goc.egi.eu/portal/index.php?Page_Type=Service&id=3072 - no service downtime
     gocdb: https://goc.egi.eu/portal/index.php?Page_Type=Service&id=3072 - no service downtime
     nagios: https://midmon.egi.eu/nagios/cgi-bin/status.cgi?host=se-goegrid.gwdg.de - still critical
     nagios: https://midmon.egi.eu/nagios/cgi-bin/status.cgi?host=se-goegrid.gwdg.de - OK - probably the ticket should be closed


     2. NGI_DE/UNI-FREIBURG  
     2. NGI_DE/UNI-FREIBURG  
     ggus: https://ggus.eu/index.php?mode=ticket_info&ticket_id=109905 - in progress (admins need more time - till December)
     ggus: https://ggus.eu/index.php?mode=ticket_info&ticket_id=109905 - in progress (admins need more time - till December)
     gocdb: https://goc.egi.eu/portal/index.php?Page_Type=Service&id=3071 - no service downtime
     gocdb: https://goc.egi.eu/portal/index.php?Page_Type=Service&id=3071 - no service downtime
     nagios: https://midmon.egi.eu/nagios/cgi-bin/status.cgi?host=se.bfg.uni-freiburg.de - still critical
     nagios: https://midmon.egi.eu/nagios/cgi-bin/status.cgi?host=se.bfg.uni-freiburg.de - OK - probably the ticket should be closed
 


</code>
</code>


== 2.4 Configuration of the new VOMS server for OPS in the infrastructure AND SAM  ==
== 2.5 Configuration of the new VOMS server for OPS in the infrastructure AND SAM  ==


* [http://bit.ly/sites_new_VOMS_ops GGUS: Sites need to configure the new VOMS server for ops and LHC VOs]:
* [http://bit.ly/sites_new_VOMS_ops GGUS: Sites need to configure the new VOMS server for ops and LHC VOs]:
** '''NGI_IL, NGI_RO'''
** Everything '''OK''' - action closed


== 2.5 SAM Nagios probes re-factoring  ==
== 2.6 SAM Nagios probes re-factoring  ==


* SAM Update 23
* SAM Update 23 released on 09.12.2014, together with UMD Update 10:
** Staged-Rollout to be started this week (latest tomorrow)- staged-rollout repository in preparation
** https://wiki.egi.eu/wiki/SAMUpdate23
*** Staged-Rollout volunteers:
** http://repository.egi.eu/2014/12/09/release-umd-3-10-0/
**** NGI_NDGF (Petter Urkedal)
**** NGI_FI (Ulf Tigerstedt)
**** NGI_UK (Kashif Mohammad)
**** NGI_IBERGRID (Esteban Freire)
** Documentation to be followed - [https://wiki.egi.eu/wiki/SAMUpdate23 - SAM Update 23 wiki]
** Major changes in SAM Update-23:  
*** Probes are moved to the UMD-3 repository. This decision was approved by the OMB in order to enable probe developers to update probes more frequently and independently from SAM releases.
*** Removal of the SAM GridMon (sam-gridmon) and its dependencies. SAM Update-23 supports only SAM Nagios (sam-nagios). In the future version SAM GridMon will be replaced with the ARGO engine.
*** Detailed list of all tickets can be found here: [https://github.com/ARGOeu/sam-probes/issues?q=is%3Aissue+milestone%3AUpdate-23].


== 2.6 MySQL 5.0 EOL  ==
== 2.7 MySQL 5.0 EOL  ==


* discussed during [https://wiki.egi.eu/wiki/URT:Agenda-29-09-2014#SL5_.26_MySQL_5.0_vs._MySQL_5.X.2C_x.3E.3D1 URT meeting, 29.09.2014]
* discussed during [https://wiki.egi.eu/wiki/URT:Agenda-29-09-2014#SL5_.26_MySQL_5.0_vs._MySQL_5.X.2C_x.3E.3D1 URT meeting, 29.09.2014]
Line 158: Line 178:
* Recommendation - site-admins must be made aware to '''avoid using MySQL v. 5.0''', where possible.
* Recommendation - site-admins must be made aware to '''avoid using MySQL v. 5.0''', where possible.


== 2.7 SL/SLC/CentOS 5 Support Lifetime  ==
== 2.8 SL/SLC/CentOS 5 Support Lifetime  ==
* [https://www.scientificlinux.org/ Scientific Linux Homepage]
* [https://www.scientificlinux.org/ Scientific Linux Homepage]
* [http://linux.web.cern.ch/linux/scientific5/ SLC5]
* [http://linux.web.cern.ch/linux/scientific5/ SLC5]

Latest revision as of 14:22, 12 January 2015

Audio conference link: Conference system is Adobe Connect, no password required.
Audio conference details: Indico page



1. Middleware releases and staged rollout

1.1 News from URT

Recent, or future planned, releases from the product teams:

1.2 UMD release

1.3 Staged rollout updates

  • voms-clients 3.0.5
  • voms-admin 2.0.12

In Verification

  • cream-ge v. 2.3.1
  • dpm v. 1.8.9
  • Globus 6.0.0 (Gram5, Default Security, MyProxy, GridFtp)

New Products

  • squid v. 2.7.19
  • fts3 v. 3.2.30

Ready to be released:

  • cream-torque v. 2.1.4
  • cream v. 1.16.4
  • fts3 v.3.2.30

....

UMD 3 EA

  • Some sites have outdated contact points for their EA adopters, so please check in the table whether all contacts and products are still correct, and send me an email if you need to add or remove contacts (SSO account mandatory): (full site list)

New Products

FTS3 and SQUID are to be included in UMD, and it is important to have some early adopters for these components. If anyone is interested, please contact me or Cristina to be included in the early adopter list.

1.4 Next releases

  • End of Jan 2015
  • 2015 Calendar is in planning phase

2. Operational issues

2.1 Report from DMSU

ARGUS/WMS Certificate Chain Mixups

  • Affecting several sites, where WMS is unable to make SSL connection to ARGUS.
  • In all probability this is a combination of two things: curl from the SL6 distribution, which is built with NSS SSL rather than OpenSSL and, as such, does not really support proxy certificates; and a bug in Java, hopefully fixed since Java 7 Update 60.
  • Related issues:
  • This issue is already being investigated at 3rd level but PTs cannot decide who is responsible and DMSU is overseeing.

2.2 APEL multicore accounting

As many EGI user communities are now exploiting multicore hardware, it is important for them that accounting correctly reflects the usage made of cores and CPUs. At the December Operations Management Board (OMB) it was decided to ask all sites using the APEL client to configure it to publish the number of cores used by jobs.

To enable multicore accounting, you will need to edit the configuration file for the APEL parser. This is the software which parses BLAH logs and batch logs to produce accounting records. The configuration file is usually found at /etc/apel/parser.cfg. In the section labelled [batch], change:

parallel = false

to

parallel = true

This will enable multicore reporting for all future accounting data. Please note that this does not change historical data. Also note that republishing old data is not sufficient to show multicore information - the log files will need to be reparsed. If you wish to republish old data with multicore enabled, please open a GGUS ticket with the APEL team so that we can help you with the process.
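
As an illustration only, the change described above could be checked or applied with a short script. This is a minimal sketch, not an APEL tool: it assumes parser.cfg can be read as a standard INI file, the function names and the sample text are hypothetical, and only the [batch] section and the parallel option come from the instructions above. Rewriting the file through configparser drops comments, so editing by hand remains the safer route on a production node.

```python
import configparser
import io


def is_parallel_enabled(cfg_text: str) -> bool:
    """Report whether [batch] has parallel set to a true value."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # A missing section or option is treated as "not enabled".
    return parser.getboolean("batch", "parallel", fallback=False)


def enable_parallel(cfg_text: str) -> str:
    """Return the configuration text with parallel = true in [batch]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section("batch"):
        raise ValueError("no [batch] section found")
    parser.set("batch", "parallel", "true")
    out = io.StringIO()
    parser.write(out)  # note: comments in the original text are not kept
    return out.getvalue()


# Hypothetical sample; a real parser.cfg contains more sections and options.
SAMPLE = "[batch]\nparallel = false\n"

if __name__ == "__main__":
    print(is_parallel_enabled(SAMPLE))
    print(enable_parallel(SAMPLE))
```

To use it on a real node, read the contents of /etc/apel/parser.cfg into a string, pass it to is_parallel_enabled, and keep a backup of the file before writing anything back.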

If you use the SGE parser, please be aware that it only reports on the number of processors used in a job. It does not report the number of nodes. If you know how to get around this limitation, then please get in touch with the APEL team at apel-admins@stfc.ac.uk.

The multicore accounting data can currently be seen here: http://accounting-devel.egi.eu/show.php?

Drill down to your site and select the grouping "Show data for: Submitting Host" as a function of: "Number of Processors".

Values of 0 mean the parallel option was false when the data were published. The Submitting Host is a new feature in the accounting portal which lets a site see in more detail which CEs are publishing.

2.3 EMI-2 decommissioning

  • UMD2 APEL clients - GGUS: Sites using UMD2 APEL clients - 3 NGIs have not finished the update:
    • NGI_UA, NGI_DE, ROC_Asia/Pacific
      • in particular, GOeGrid and MY-UPM-BIRUNI-01 are still using the UMD2 client
  • IMPORTANT - switching-off UMD2/EMI2 APEL service - Friday 16th January

2.4 dCache 2.2.X decommissioning

   1. NGI_DE/GoeGrid
   ggus: https://ggus.eu/index.php?mode=ticket_info&ticket_id=109904 - in progress (admins need more time - till the end of 2014)
   gocdb: https://goc.egi.eu/portal/index.php?Page_Type=Service&id=3072 - no service downtime
   nagios: https://midmon.egi.eu/nagios/cgi-bin/status.cgi?host=se-goegrid.gwdg.de - OK - probably the ticket should be closed 
   2. NGI_DE/UNI-FREIBURG 
   ggus: https://ggus.eu/index.php?mode=ticket_info&ticket_id=109905 - in progress (admins need more time - till December)
   gocdb: https://goc.egi.eu/portal/index.php?Page_Type=Service&id=3071 - no service downtime
   nagios: https://midmon.egi.eu/nagios/cgi-bin/status.cgi?host=se.bfg.uni-freiburg.de - OK - probably the ticket should be closed 


2.5 Configuration of the new VOMS server for OPS in the infrastructure AND SAM

2.6 SAM Nagios probes re-factoring

2.7 MySQL 5.0 EOL

  • discussed during URT meeting, 29.09.2014
  • MySQL versions available:
  • SL5:
    • mysql-*5.0.95* -> MySQL 5.0
    • mysql51-*5.1.70* -> MySQL 5.1
    • mysql55-*5.5.32* -> MySQL 5.5
  • SL6:
    • mysql-*5.1.71 -> MySQL 5.1
  • Middleware dependency on 'mysql-server':
    • VOMS - emi-voms-mysql - confirmed it should work with MySQL 5.1, as on SL6, but not tested
    • CREAM - emi-cream-ce - GGUS #106250
    • DPM - puppetlabs/mysql dependency on "mysql-server"
    • STORM - storm-backend-server - confirmed it should work with MySQL 5.1, as on SL6, but not tested
    • L&B - glite-lb-server - under investigation - will be checked by the SoftwareProvisioning team on the EGI verification testbed
    • WMS - emi-wms - under investigation - will be checked by the SoftwareProvisioning team on the EGI verification testbed
  • Recommendation - site-admins must be made aware to avoid using MySQL v. 5.0, where possible.

2.8 SL/SLC/CentOS 5 Support Lifetime

3. AOB

3.1 Monthly Availability/Reliability

  • [1] - will be followed up in the coming days

3.2 Work on a new Broadcast-usage procedure

3.3 Next meetings

  • Feb. 09, 2015

4. Minutes