
Difference between revisions of "Agenda-01-06-2012"

(Created page with "{{Template:Op menubar}} = Detailed agenda: Grid Operations Meeting 16 April 2012 14h00 Amsterdam time = {| |- | [https://www.egi.eu/indico/materialDisplay.py?materialId=2&con...")
 
 
(24 intermediate revisions by 3 users not shown)
Line 1: Line 1:
{{Template:Op menubar}}  
{{Template:Op menubar}}  
[[Category:Grid Operations Meetings]]


= Detailed agenda: Grid Operations Meeting 16 April 2012 14h00 Amsterdam time  =
= Detailed agenda: Grid Operations Meeting 1 June 2012 14h00 Amsterdam time  ('''Friday''')=


{|
{|
|-
|-
| [https://www.egi.eu/indico/materialDisplay.py?materialId=2&confId=1038 EVO direct link]  
| [http://evo.caltech.edu/evoNext/koala.jnlp?meeting=MsMiMI2s2IDBD9989tDt9s EVO direct link]  
| Pwd: '''gridops'''<br>
| Pwd: '''gridops'''<br>
|-
|-
| [https://www.egi.eu/indico/materialDisplay.py?materialId=1&confId=1038 EVO details]  
| [https://indico.egi.eu/indico/getFile.py/access?resId=0&materialId=1&confId=1045 EVO details]  
| [https://www.egi.eu/indico/conferenceDisplay.py?confId=1038 Indico page]
| [https://indico.egi.eu/indico/conferenceDisplay.py?confId=1045 Indico page]
|}
|}


Line 17: Line 18:


=== 1.1 EMI release status ===
=== 1.1 EMI release status ===
''Cristina Aiftimiei (EMI)''
* EMI 1 Updates status: [https://twiki.cern.ch/twiki/bin/view/EMI/EmiEgiGOM#Status_01_06_2012 External wiki]
* EMI 2 Status: [https://twiki.cern.ch/twiki/bin/view/EMI/Emi2EgiGOM#Status_01_06_2012 External wiki]


=== 1.2 Staged Rollout  ===
=== 1.2 Staged Rollout  ===


EMI1 products under staged rollout (RT ticket number, product):
'''Towards UMD2'''


*3319 EMI.dcache.sl5.x86_64-1.9.15
I started the SW provisioning process this morning; previously we did some dry runs to test some new things SA2 has implemented. 
*3471 EMI.lfc_oracle.sl5.x86_64-1.8.2


EMI1 products under verification:  
Many of the products in SL5 and SL6 are now covered, but not yet all; please check: 


*3686 EMI.blah.sl5.x86_64-1.16.5 - affects CREAM
*https://www.egi.eu/earlyAdopters/teams
*3687 EMI.dpm.sl5.x86_64-1.8.3
*3688 EMI.lcg-util.sl5.x86_64-1.12.0 - affects UI and WN
*3689 EMI.proxyrenewal.sl5.x86_64-1.3.25 - affects MyProxy
*3690 EMI.lfc_mysql.sl5.x86_64-1.8.3
*3691 EMI.lfc_oracle.sl5.x86_64-1.8.3
*3692 EMI.wms.sl5.x86_64-3.3.5


IGE products in verification:  
The main changes affecting the EAs are:  


*3384 IGE.saga.sl5.x86_64-1.6.1 - Need EAs or sites interested
*There is now a single repository where EAs will fetch the packages: http://repository.egi.eu/sw/testing/umd/2/sl5/x86_64/ or http://repository.egi.eu/sw/testing/umd/2/sl6/x86_64/
*3670 IGE.globus-gsissh.sl5.x86_64-4.3.5
**The staged rollout wiki procedures ([[Staged-rollout-procedures]]) have been updated, except for the exact repository configuration, which will be done soon. 
*3671 IGE.gridsam.sl5.x86_64-2.3.1
*The staged rollout will be followed in the RT queue ''sw-rel''; the ''staged-rollout'' queue has been deprecated. 
*3672 IGE.gridway.sl5.x86_64-5.10.1 - Need EAs or sites interested
**This does not change much in the way you do things; notifications will simply come from tickets in this queue. 
*3673 IGE.gsisshterm.sl5.x86_64-1.3.3
**The verification is also done on those tickets, meaning that you will get extra-fast help in case of problems when replying to those tickets.
*3674 IGE.globus-gram5-ige.sl5.x86_64-5.2.0
*3675 IGE.gridsafe.sl5.x86_64-1.0.1 - Need EAs or sites interested
*3676 IGE.security-integration.sl5.x86_64-2.1.0
*3677 IGE.ogsadai.sl5.x86_64-4.2.1 - Need EAs or sites interested


EMI2:  
The complete list of products/tickets can be seen here:  


*Need EAs for SL6 all products
*[https://rt.egi.eu/rt/Search/Results.html?Query=Queue%20%3D%20%27sw-rel%27%20AND%20%28Status%20%3D%20%27new%27%20OR%20Status%20%3D%20%27open%27%20OR%20Status%20%3D%20%27accepted%27%20OR%20Status%20%3D%20%27developed%27%20OR%20Status%20%3D%20%27stalled%27%20OR%20Status%20%3D%20%27feedback%27%29 Complete list of RT tickets]
*Need EAs for Debian: ARC, Unicore, emi-ui and emi-wn
*Overall there are 36 products in SL5 and the same number in SL6.
*The products in Debian6 are ARC and several Unicore products. There was a problem with the Debian products, which SA2 is trying to solve; as soon as it is solved we will submit them.


== 2. Operational Issues  ==
== 2. Operational Issues  ==
=== 2.1 Proposal: EGI policy for the usage of TMPDIR variable ===
=== 2.1 Sites not publishing user DN in the usage record ===


Jobs need to know where the scratch space for temporary files is.
In May, more than 90 sites in the infrastructure were not publishing the UserDN.
* A worker node should provide scratch space as requested by the VO in the VO ID card, and the location must be known to the user's code
The possible reasons are:
* There should be a uniform behavior across the sites
# '''NGIs/Sites who do not want to publish the UserDN, for privacy reasons'''
 
#* This is a policy decision; it is not a topic for today
 
# '''Accounting data is published through aggregated records''' (''Those sites are not in the list below'')
'''Proposed policy'''
# '''APEL publisher is not properly configured'''
# In a grid job’s environment, the TMPDIR variable ''must'' always contain the path to the location that can be used by the job as a scratch area for temporary files
#: In the APEL publisher configuration file the directive ''publishGlobalUserName'' is by default set to ''NO''. To enable userDN publication it should be: ''<JoinProcessor publishGlobalUserName="yes">''
# If the TMPDIR variable is not set, the jobs can consider the current working directory as the assigned scratch area.
#: The YAIM directive that triggers this configuration is "''APEL_PUBLISH_USER_DN''", and it should be set to "''yes''"
 
#: If a site in the list below is misconfigured and wants to publish the UserDNs, the steps are:
 
#* Reconfigure the APEL publisher with the correct configuration (this will enable the UserDN publication in the usage records published from now on)
'''Corollaries'''
#* To re-publish the previous usage records with the UserDN, please open a GGUS ticket for the ''APEL'' support unit. This action will require central coordination.
* The scratch area should have enough free space to fulfill the VO ID card requirements (if it does not, users should open a GGUS  ticket VS the site)
# '''DN encryption affected by a problem with a java library'''
*The current version of the policy does not assume that the scratch areas are shared between worker nodes. This document will be refined to include also this information (the location of a shared area if needed by parallel jobs) in the job’s environment, or a new policy will be released.
#: As reported in this [https://ggus.eu/ws/ticket_info.php?ticket=82057 GGUS ticket]
*More generally, this policy does not assume that the scratch area has any specific feature other than the disk space.
#: The problem affects a few sites (<15), which will be contacted by the APEL staff to solve the issue
*If the TMPDIR variable contains an invalid path, this does not mean that the job’s current workdir can be considered the assigned scratch area. This is a configuration error and the users should open a GGUS ticket VS the affected site.
 
*'''[[List of sites not publishing userDN]]'''
 
** '''ACTION:''' on the NGIs, follow up with your sites in the list, and provide comments in the table.
[https://documents.egi.eu/document/1119 Policy document on document DB]
 
'''Provide feedback''' in the next two weeks, you can disseminate it among your site administrators (''specifying that it is a proposal!!'').
=== 2.2 Top-BDII availabilities for April ===
*[https://documents.egi.eu/public/RetrieveFile?docid=1117&version=2&filename=EGI-core_services_availabilities-per_NGI-Apr2012%20NGIs%20core%20services.pdf Top-BDIIs availability tables for April 2012]
* 4 NGIs well under the OLA target
** It was a good month (hopefully not just a ''lucky month'')
** The NGIs that failed to reach the target in April also failed in March. They need to implement some improvements in their top-bdii configuration.
 
=== 2.3 Topics for middleware related sessions at the Technical Forum ===
* TF is in September 2012, in Prague
* Technology providers (EMI, IGE) have provided their availability for workshop/training sessions, possible topics:
** New products available in EMI-2
** New features introduced in the latest releases
** '''..?''' Is there need for workshops on specific topics?


== 3 AOB  ==
== 3 AOB  ==
Line 92: Line 71:
=== 3.1 Next meetings  ===
=== 3.1 Next meetings  ===


[[Category:GridOpsMeeting]]
== Minutes ==
Minutes available here: [https://indico.egi.eu/indico/materialDisplay.py?materialId=minutes&confId=1045 Minutes on indico]

Latest revision as of 17:06, 29 November 2012


Detailed agenda: Grid Operations Meeting 1 June 2012 14h00 Amsterdam time (Friday)

EVO direct link Pwd: gridops
EVO details Indico page


1. Middleware releases and staged rollout

1.1 EMI release status

Cristina Aiftimiei (EMI)

1.2 Staged Rollout

Towards UMD2

I started the SW provisioning process this morning; previously we did some dry runs to test some new things SA2 has implemented.

Many of the products in SL5 and SL6 are now covered, but not yet all; please check:

  • https://www.egi.eu/earlyAdopters/teams

The main changes affecting the EAs are:

  • There is now a single repository where EAs will fetch the packages: http://repository.egi.eu/sw/testing/umd/2/sl5/x86_64/ or http://repository.egi.eu/sw/testing/umd/2/sl6/x86_64/
    • The staged rollout wiki procedures (Staged-rollout-procedures) have been updated, except for the exact repository configuration, which will be done soon.
  • The staged rollout will be followed in the RT queue sw-rel; the staged-rollout queue has been deprecated.
    • This does not change much in the way you do things; notifications will simply come from tickets in this queue.
    • The verification is also done on those tickets, meaning that you will get extra-fast help in case of problems when replying to those tickets.

The complete list of products/tickets can be seen here:

  • Complete list of RT tickets
  • Overall there are 36 products in SL5 and the same number in SL6.
  • The products in Debian6 are ARC and several Unicore products. There was a problem with the Debian products, which SA2 is trying to solve; as soon as it is solved we will submit them.
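
The "Complete list of RT tickets" link is an RT search on the sw-rel queue, restricted to tickets whose status is new, open, accepted, developed, stalled or feedback. As a rough sketch, a link of that shape could be assembled as follows. The function name `rt_search_url` and the constant names are illustrative only; the base URL, the queue and the status values are the ones used by the agenda link.

```python
from urllib.parse import quote

# Base URL, queue and statuses as they appear in the agenda's RT link.
RT_SEARCH_URL = "https://rt.egi.eu/rt/Search/Results.html"
ACTIVE_STATUSES = ["new", "open", "accepted", "developed", "stalled", "feedback"]


def rt_search_url(queue: str, statuses: list[str]) -> str:
    """Build an RT search link for one queue and a set of ticket statuses.

    Hypothetical helper, written for illustration only.
    """
    status_clause = " OR ".join(f"Status = '{status}'" for status in statuses)
    query = f"Queue = '{queue}' AND ({status_clause})"
    # safe="" percent-encodes spaces, quotes, "=" and parentheses as well.
    return f"{RT_SEARCH_URL}?Query={quote(query, safe='')}"


if __name__ == "__main__":
    print(rt_search_url("sw-rel", ACTIVE_STATUSES))
```

Dropping a status from the list narrows the search, for example to follow only tickets that are still open.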

2. Operational Issues

2.1 Sites not publishing user DN in the usage record

In May, more than 90 sites in the infrastructure were not publishing the UserDN. The possible reasons are:

  1. NGIs/Sites who do not want to publish the UserDN, for privacy reasons
    • This is a policy decision; it is not a topic for today
  2. Accounting data is published through aggregated records (Those sites are not in the list below)
  3. APEL publisher is not properly configured
    In the APEL publisher configuration file the directive publishGlobalUserName is by default set to NO. To enable userDN publication it should be: <JoinProcessor publishGlobalUserName="yes">
    The YAIM directive that triggers this configuration is "APEL_PUBLISH_USER_DN", and it should be set to "yes"
    If a site in the list below is misconfigured and wants to publish the UserDNs, the steps are:
    • Reconfigure the APEL publisher with the correct configuration (this will enable the UserDN publication in the usage records published from now on)
    • To re-publish the previous usage records with the UserDN, please open a GGUS ticket for the APEL support unit. This action will require central coordination.
  4. DN encryption affected by a problem with a java library
    As reported in this GGUS ticket
    The problem affects a few sites (<15), which will be contacted by the APEL staff to solve the issue
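
For reason 3, a site administrator might want to check both settings before reconfiguring. The following is a minimal sketch, not an official APEL or YAIM tool. It assumes that the publisher configuration is plain XML without namespaces and that the YAIM settings are shell-style KEY=value lines. The function names, the sample inputs and the `ApelConfiguration` root element are hypothetical; only `JoinProcessor`, `publishGlobalUserName` and `APEL_PUBLISH_USER_DN` come from the agenda.

```python
import xml.etree.ElementTree as ET


def user_dn_enabled_in_publisher(xml_text: str) -> bool:
    """Return True if any JoinProcessor element sets publishGlobalUserName to yes."""
    root = ET.fromstring(xml_text)
    # iter() walks the whole tree (root included), so the position of
    # JoinProcessor in the assumed document layout does not matter.
    for element in root.iter("JoinProcessor"):
        value = element.get("publishGlobalUserName", "no")
        # Case-insensitive comparison is a choice made here, not an APEL rule.
        if value.strip().lower() == "yes":
            return True
    return False


def user_dn_enabled_in_yaim(site_info_text: str) -> bool:
    """Return True if APEL_PUBLISH_USER_DN is set to yes in KEY=value text."""
    enabled = False
    for line in site_info_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "APEL_PUBLISH_USER_DN":
            # The last assignment wins, as it would in a shell-style file.
            enabled = value.strip().strip("\"'").lower() == "yes"
    return enabled


# Hypothetical sample inputs, not copied from a real site configuration.
SAMPLE_PUBLISHER_XML = """
<ApelConfiguration>
  <JoinProcessor publishGlobalUserName="yes"/>
</ApelConfiguration>
"""
SAMPLE_SITE_INFO = 'APEL_PUBLISH_USER_DN="yes"  # publish user DNs\n'

if __name__ == "__main__":
    print(user_dn_enabled_in_publisher(SAMPLE_PUBLISHER_XML))
    print(user_dn_enabled_in_yaim(SAMPLE_SITE_INFO))
```

A check like this only covers usage records published from now on; re-publishing earlier records still goes through the GGUS ticket described in the steps above.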

3 AOB

3.1 Next meetings

Minutes

Minutes available here: Minutes on indico