Detailed agenda: Grid Operations Meeting 16 May 2011

Evo link

Meeting password: gridops

1. Middleware releases and staged rollout

1.1. Update on the status of EMI-1 release (Cristina)

The EMI project is pleased to announce the availability of the EMI 1 (Kebnekaise) release.

This release features for the first time a complete and consolidated set of middleware components from ARC, dCache, gLite and UNICORE. The services, managed in the past by separate providers and now developed, built and tested in collaboration, follow well-established open-source practices and are distributed from a single reference repository. The reference platform for EMI 1 is Scientific Linux 5 64-bit.

Kebnekaise will be supported for 18 months, with 6 additional months of support for security issues.

For more details on the EMI 1 release and the middleware products composing it, please refer to the following links:

EMI 1 Release page [1]

EMI User Forums [2]

EMI Software Repository [3]

EMI Project Home Page [4]

1.2. Staged Rollout (Mario)

gLite 3.2 components:

  1. VOBOX has been tested at LIP and is now ready for production.
  2. L&B 2.1.21 is under staged rollout by IFIC; first feedback is OK.
  3. CREAM 1.6.6 and SGE_Utils are under staged rollout by LIP. A problem was seen after the latest glibc update.

glibc was updated from glibc-2.5-49 to glibc-2.5-58 before the CREAM and SGE_Utils update, and segmentation faults were then seen in the BLAH component. This may turn out to be a more general problem. Sites running CREAM should check at least the BUpdater<LRMS>.
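As a rough aid for sites that want to see whether a host has already moved past the glibc build that was in place before the problem appeared, the following Python sketch compares a package string against the two versions mentioned above. The function names are illustrative, and the assumption that any build newer than glibc-2.5-49 deserves a BLAH check is ours; the report only states that the segmentation faults were seen after the update to glibc-2.5-58.

```python
import re

# Versions taken from the report above; the exact affected range is not known.
LAST_BUILD_BEFORE_PROBLEM = "glibc-2.5-49"
BUILD_WHERE_PROBLEM_WAS_SEEN = "glibc-2.5-58"


def parse_glibc(package):
    """Turn a string such as 'glibc-2.5-58.x86_64' into (2, 5, 58)."""
    match = re.match(r"glibc-(\d+)\.(\d+)-(\d+)", package)
    if match is None:
        raise ValueError("not a recognised glibc package string: %r" % package)
    return tuple(int(part) for part in match.groups())


def needs_blah_check(installed):
    """Return True if the installed glibc is newer than the last build
    known to predate the problem (an assumption, see the text above)."""
    return parse_glibc(installed) > parse_glibc(LAST_BUILD_BEFORE_PROBLEM)


if __name__ == "__main__":
    for pkg in ("glibc-2.5-49", "glibc-2.5-58.x86_64"):
        print(pkg, "-> check BLAH:", needs_blah_check(pkg))
```

The package string would typically come from the output of `rpm -q glibc` on the CREAM host; the sketch deliberately leaves that step out so it can be run anywhere.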

1.4 Interoperability

Current ongoing points in the integration task forces:


Both UNICORE and Globus are very interested in the outcome of the EMI working group.

2. Operational Issues

2.1 Workdir and tmpdir for parallel jobs

Job workdir and tmpdir

New input from NGI_CH: the solutions currently in use at CSCS are the CREAM JobWrapper and ARC custom submission scripts. These solutions are obviously middleware-specific.

The custom TMPDIR should also be used to work around local disk throughput limits: with 16 or 32 cores the local disk can be stressed beyond its limit, whereas high-performance network file systems perform better. Experience from other site administrators on this performance topic is needed to consider solutions for this problem.

The customization of a job workdir should consider both the job type and the job's disk performance requirements. Can the second requirement be attached to the job submission? Is it really needed?
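To make the discussion more concrete, the selection logic could look roughly like the Python sketch below. It is not the CSCS JobWrapper or the ARC submission script: the paths, the `io_intensive` flag and the 16-core threshold are hypothetical placeholders, and the sketch assumes the disk requirement is somehow known at submission time, which is exactly the open question above.

```python
import os


def choose_tmpdir(cores, io_intensive,
                  local_scratch="/tmp",
                  shared_scratch="/scratch/shared",
                  core_threshold=16):
    """Pick a base directory for TMPDIR.

    cores          -- cores the job uses on one node
    io_intensive   -- hypothetical flag carrying the job's disk requirement
    core_threshold -- assumed point at which the local disk is overstressed
    """
    if io_intensive and cores >= core_threshold:
        # Many cores hammering one local disk: prefer the network file system.
        return shared_scratch
    return local_scratch


def job_environment(job_id, cores, io_intensive):
    """Return a copy of the environment with a per-job TMPDIR set."""
    base = choose_tmpdir(cores, io_intensive)
    env = dict(os.environ)
    env["TMPDIR"] = "%s/job_%s" % (base, job_id)
    return env


if __name__ == "__main__":
    print(job_environment("42", 32, True)["TMPDIR"])
    print(job_environment("43", 4, True)["TMPDIR"])
```

Keeping the decision in one small function makes it independent of the middleware: a CREAM or ARC wrapper would only need to supply the core count and the disk requirement.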

2.2 Batch system survey

As announced in the previous meeting, a basic but important survey about the distribution of batch systems has been prepared. The question will be:

Should this survey be managed by NGIs? Or should it be sent directly to site administrators through a broadcast and a mail to LCG-ROLLOUT?

2.3 Open Issues

3. AOB

3.2 Next meeting

Proposal: Monday 30 May, 14:00
