
Difference between revisions of "Agenda-15-07-2014"

{|
|-
| [http://connect.ct.infn.it/egi-inspire-sa1/ Audio conference link]
| ''Conference system is Adobe Connect, no password required.''
|-
| [https://indico.egi.eu/indico/materialDisplay.py?materialId=0&confId=1382 Audio conference details]
| [https://indico.egi.eu/indico/conferenceDisplay.py?confId=2106 Indico page]
|}


<br>
{{TOC right}}
= 1. Middleware releases and staged rollout  =
== 1.1 News from URT  ==
Recent, or future planned, releases from the product teams:
*
== 1.2 UMD release  ==
'''UMD 3.7.0 released on 12.06.2014''' : http://repository.egi.eu/2014/06/12/release-umd-3-7-0/
* wms v. 3.6.4
* gsisshterm v. 2.1.0
* arc v. 13.11.1
* bdii-core v. 1.5.7
* canl v. 2.2.2
* cream-ui v. 1.15.3
* gridsite v. 2.2.3
* DPM / LFC 1.8.8
* bouncycastle-mail v. 1.46.2
* dcache-srm-client v. 2.2.22
* umd-release v. 3.0.1
== 1.3 Staged rollout updates  ==
new:
* empty
old stuff (no early adopters):
* gridway v. 5.14.2
* globus-info-provider-service v. 0.2.1
* emi-cluster v. 2.0.1
* globus-rls v. 5.2.5
* mpi v. 1.5.3
=== UMD 3 EA ===
* '''Some sites have outdated contact points for their early adopters.''' Please check in the table that all contacts and products are still correct, and email me if you need to add or remove contacts (an SSO account is mandatory): ([https://www.egi.eu/earlyAdopters/table full site list])
=== '''New Products''' ===
FTS3, squid and CVMFS will soon be included in UMD, and it is important to have early adopters for these components. If you are interested, please contact me or Cristina to be added to the early adopter list.
== 1.4 Next releases  ==
* Middle/End of July
* October
= 2. Operational issues  =
== 2.1 Report from DMSU  ==
=== ARGUS/WMS Certificate Chain Mixups ===
* Affecting several sites, where WMS is unable to make SSL connection to ARGUS.
* In all probability this is a combination of two factors: the <code>curl</code> shipped with the SL6 distribution is built with NSS rather than OpenSSL and, as such, does not really support proxy certificates; and there is a bug in Java, hopefully fixed as of Java 7 Update 60.
* Related issues:
** https://ggus.eu/index.php?mode=ticket_info&ticket_id=101486
** https://ggus.eu/index.php?mode=ticket_info&ticket_id=101554
* This issue is already being investigated at '''3rd level''', but the PTs cannot decide who is responsible, and DMSU is overseeing.
=== CREAM CLI/GridSite SegFaults at Long-Lived Proxies ===
* <code>glite-ce-job-submit</code> crashes if the user's proxy certificate has a lifetime exceeding 240 hours (10 days)
* Cause tracked down to GridSite, forwarded to the GridSite PT to fix
* Related issue:
** https://ggus.eu/?mode=ticket_info&ticket_id=104009
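
Until the GridSite fix is available, users may want to check their proxy before calling <code>glite-ce-job-submit</code>. The following is only a minimal sketch: <code>proxy_lifetime_ok</code> is a hypothetical helper, not part of any gLite/EMI package, and it assumes that the remaining time reported by <code>voms-proxy-info --timeleft</code> is a good enough stand-in for the proxy lifetime mentioned in the ticket.

```shell
#!/bin/sh
# Limit taken from the report above: 240 hours * 3600 s/hour = 864000 s.
MAX_PROXY_SECONDS=864000

# proxy_lifetime_ok SECONDS
# Hypothetical helper: returns 0 if SECONDS does not exceed the limit,
# non-zero otherwise.
proxy_lifetime_ok() {
    [ "$1" -le "$MAX_PROXY_SECONDS" ]
}

# Example use on a UI (left commented out so the file can be sourced safely).
# Assumes voms-proxy-info is installed and a proxy already exists:
#   left=$(voms-proxy-info --timeleft)
#   proxy_lifetime_ok "$left" || echo "proxy exceeds 240 h, create a shorter one" >&2
```

If the check fails, creating a proxy with a shorter validity before submitting should avoid the crash described in the ticket.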
=== L&B & CREAM Update Issues - on EMI repositories ===
* The <code>glite-ce-cream-api-java</code> update causes issues on CREAM for submission through WMS:
* Issue:
** https://ggus.eu/?mode=ticket_info&ticket_id=106134
*** '''Workaround for 106134 on CREAM''':
    <code>
    # yum downgrade glite-ce-cream-api-java
    # service gLite restart
    </code>
* The <code>glite-lb-client</code> JAR file was moved to comply with EPEL packaging policy, which causes failures due to lack of coordination with CREAM:
** https://ggus.eu/?mode=ticket_info&ticket_id=106123
** https://ggus.eu/?mode=ticket_info&ticket_id=106121
*** '''Workaround for 106121 & 106123 on CREAM''':
    <code>
    # rm -f /var/lib/tomcat6/webapps/ce-cream/WEB-INF/lib/glite-lb-client-java.jar
    # ln -s /usr/lib/java/glite-lb-client-java.jar /var/lib/tomcat6/webapps/ce-cream/WEB-INF/lib/glite-lb-client-java.jar
    # service gLite restart
    </code>
* The new version of <code>glite-lb-common</code> creates problems for WMS:
** https://ggus.eu/?mode=ticket_info&ticket_id=106143
*** '''Workaround for 106143 on WMS''':
    <code>
    # yum downgrade glite-lb-common
    # service gLite restart
    </code>
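
After applying the workaround for 106121 & 106123 above, it may be useful to confirm that the JAR inside the CREAM web application really is a symbolic link to the relocated file. A minimal sketch follows; <code>lb_client_link_ok</code> is a hypothetical helper name, and the paths in the usage comment are the ones from the workaround.

```shell
#!/bin/sh
# lb_client_link_ok LINK TARGET
# Hypothetical helper: succeeds only if LINK is a symbolic link whose
# stored destination is exactly TARGET.
lb_client_link_ok() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}

# Example use on a CREAM node (commented out so the file can be sourced safely):
#   lb_client_link_ok \
#       /var/lib/tomcat6/webapps/ce-cream/WEB-INF/lib/glite-lb-client-java.jar \
#       /usr/lib/java/glite-lb-client-java.jar \
#     && echo "symlink in place" || echo "symlink missing or wrong" >&2
```

The comparison is on the link's stored destination, so a relative link to the same file would be reported as wrong; that is a deliberate simplification of this sketch.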
== 2.2 Migration of Central SAM services & reconfiguration of NGIs SAM instances  ==
*Central SAM services were migrated from CERN to the new consortium (GRNET, CNRS and SRCE). To enable a smooth transition we have agreed to start using new hostnames:
** '''mon.egi.eu for grid-monitoring.cern.ch'''
** '''opsmon.egi.eu for ops-monitor.cern.ch'''
*CERN services will be operational until May 1st. Afterwards aliases will point to new instances.
*If '''Regional & VO SAM instances''' are not reconfigured, '''*they will stop working*''' after the CERN instance is switched off.
*The following instances are not yet configured, and tickets have been opened to follow them up:
** ngi-de-nagios.gridka.de - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105128 NGI_DE #105128] - '''ticket still open'''
** wipp-srs.weizmann.ac.il - [https://ggus.eu/index.php?mode=ticket_info&ticket_id=105133 NGI_IL #105133] - ticket still open - '''NGI_IL is unresponsive'''
<ul>
<li> Configuration advice:</li>
<ol><li> NGI and VO SAM Nagios instances</li>
* Create file /etc/voms2htpasswd-static.d/opsmon.conf with the following content:
/C=HR/O=edu/OU=srce/CN=opsmon.egi.eu
<li> NGI SAM Nagios instances</li>
* Set the following two variables in YAIM:
ATP_ROOT_URL="http://mon.egi.eu/atp"
POEM_SYNC_URLS="http://mon.egi.eu/poem/api/0.1/json/"
<li> Rerun YAIM</li>
/opt/glite/yaim/bin/yaim -c -s site-info.def -n NAGIOS -n SAM_NAGIOS
</ol>
If you prefer '''not to run YAIM''', skip steps 2 and 3 and perform the following:
<ol style="list-style-type:lower-latin"><li> NGI and VO SAM Nagios instances</li>
* Restart service voms-htpasswd:
service voms-htpasswd restart
<li> NGI SAM Nagios instances</li>
* Modify parameter POEM_SYNC_NS_URLS in file /etc/poem/poem_sync.ini:
POEM_SYNC_NS_URLS: http://mon.egi.eu/poem/api/0.1/json/
* Modify parameter ATP_ROOT_URL in file ncg/ncg.conf:
ATP_ROOT_URL=http://mon.egi.eu/atp
The parameter is repeated several times; you need to modify it in all places.
</ol>
</ul>
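
Because <code>ATP_ROOT_URL</code> appears several times in <code>ncg.conf</code>, the manual edit described above can be sketched as a single <code>sed</code> pass. This is only an illustration: both function names are hypothetical, the sketch assumes each occurrence sits on its own line in the form <code>ATP_ROOT_URL=...</code>, and the full path of <code>ncg/ncg.conf</code> must be supplied by the administrator.

```shell
#!/bin/sh
# set_atp_root_url FILE
# Hypothetical helper: rewrites every ATP_ROOT_URL line in FILE to the new
# central instance, keeping a backup copy as FILE.bak.
set_atp_root_url() {
    sed -i.bak \
        's|^\([[:space:]]*ATP_ROOT_URL[[:space:]]*=[[:space:]]*\).*$|\1http://mon.egi.eu/atp|' \
        "$1"
}

# count_cern_refs FILE
# Hypothetical helper: prints the number of lines in FILE that still
# mention one of the old CERN hostnames.
count_cern_refs() {
    grep -c -e 'grid-monitoring\.cern\.ch' -e 'ops-monitor\.cern\.ch' "$1"
}

# Example use (commented out; replace the placeholder with the real path):
#   set_atp_root_url /path/to/ncg/ncg.conf
#   count_cern_refs /path/to/ncg/ncg.conf
```

A non-zero count after the rewrite suggests that other parameters in the file still point at the CERN instances and need attention by hand.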
== 2.3 EMI-2 decommissioning  ==
* Probes are running in midmon: [https://wiki.egi.eu/wiki/MW_SAM_tests#EMI-2_tests Documentation].
** All products but dCache are being retired as previously announced. '''dCache''' extended the support for the 2.2.x versions until July 2014.
** '''Important Notes:'''
*** APEL is one of the UMD 2/EMI 2 services that are no longer supported, but there are still sites (23 as of 15.06.2014) publishing accounting information using UMD 2/EMI 2 APEL clients; see the list available at: [http://goc-accounting.grid-support.ac.uk/consumer/ APEL consumer]
*** Tutorial on how to migrate APEL clients from EMI 2 to EMI 3 - available at '''[https://indico.egi.eu/indico/contributionDisplay.py?contribId=118&confId=1994 APEL @ EGI CF 2014]'''
    [[File:No._Sites_per_NGI_-_using_UMD2_APEL_clients.jpg]]
* List of services failing '''EMI-2 tests''':
** as of March 7th - [https://drive.google.com/file/d/0B7LpvREXG9c-WnBGSFV4VGJUNFE/edit?usp=sharing Download XLS file]
** as of March 3rd - [http://bit.ly/EMI2_NGI_07042014 EMI2_endpoints_NGI_07042014]
** as of April 24 - [http://goo.gl/vY6Mtm EMI2_endpoints_NGI_24042014]
** as of June 2 - [https://indico.egi.eu/indico/materialDisplay.py?materialId=4&confId=2223 EMI2_endpoints_NGI_02062014]
** '''as of June 16th''' - [https://indico.egi.eu/indico/materialDisplay.py?materialId=1&confId=2266 EMI2_endpoints_NGI_16062014]
* Status 16/06/2014 - presentation [https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=slides&confId=2266 PPTX], [https://indico.egi.eu/indico/getFile.py/access?resId=1&materialId=slides&confId=2266 PPT], [https://indico.egi.eu/indico/getFile.py/access?resId=2&materialId=slides&confId=2266 PDF]
* NGIs with UMD2/EMI2 services:
** AsiaPacific
*** TW-eScience, Taiwan-LCG2, IN-DAE-VECC-02, TW-EMI-PPS, IN-DAE-VECC-02, TOKYO-LCG2, TW-NCUHEP (in downtime)
** NGI_DE
*** TUDresden-ZIH (in downtime), GoeGrid
** NGI_FRANCE
*** GRIF (2 hosts in downtime), IN2P3-IRES
** NGI_IBERGRID
*** UB-LCG2 (in downtime)
** NGI_PL
*** ICM (in downtime)
* '''ALL SITES PROVIDING UMD 2/EMI 2 services MUST BE IN DOWNTIME'''
== 2.4 Obsoleted MW SAM tests  ==
* Following OMB approval, these probes were decommissioned as of '''Wednesday May 28th''':
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#EMI-1_tests UMD-1 tests]
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#gLite_3.2_tests gLite 3.2 tests]
** [https://wiki.egi.eu/wiki/MW_Nagios_tests#Security_SAM_instance Security SAM instance (Classic SE/lcg-CE tests)]
= 3. AOB  =
== 3.1 EGI UMD usage survey ==
* [https://operations-portal.egi.eu/broadcast/archive/id/1151 BROADCAST] sent on 28.05.2014:
** [https://www.surveymonkey.com/s/MQ6G8BZ Survey] - please take the survey by '''June 15th, 2014'''
== 3.2 Next meeting  ==
'''30 June 2014'''
= 4. Minutes  =
* [https://indico.egi.eu/indico/materialDisplay.py?materialId=minutes&confId=2266 Minutes 16.06.2014]
[[Category:Operations]]

Latest revision as of 17:14, 13 July 2014