EGI IGTF Release Process
This page is in draft
About this distribution
This page describes the procedure for the announcement and propagation of a new release of the EGI Trust Anchor distribution. The EGI Policy on Approved Certification Authorities describes the set of trust anchors accepted by EGI and put forward for consideration by the NGIs to install as the 'agreed set' of Identity Management trust anchors. For the time being, only PKI trust anchors ('Certification Authorities') are included in this release.
The announcement and distribution of trust anchors within EGI and the NGIs should be done in a well defined order to ensure consistent deployment and to ensure that the monitoring of the NGIs and sites matches the currently recommended version of the trust anchors.
EGI Trust Anchor releases are usually done on the last Monday of the month (except for security or critical bug fixes), but only if such a release would be materially different from the deployed version. EGI will get a preview release in the previous week for staged rollout (SR).
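The regular release day can be worked out mechanically. The following is a minimal sketch, assuming GNU date (for the -d option); last_monday is a hypothetical helper name and not part of any EGI tooling.

```shell
# last_monday YEAR MONTH
# Print the last Monday of the given month as YYYY-MM-DD.
# Assumption: GNU date is available (the -d option is a GNU extension).
# The body runs in a subshell so that its variables do not leak.
last_monday() (
    year=$1
    # Accept both "9" and "09": strip one leading zero, then pad again.
    month=$(printf '%02d' "${2#0}")
    # Last day of the month: first day of the month, plus one month, minus one day.
    last_day=$(date -u -d "${year}-${month}-01 +1 month -1 day" +%Y-%m-%d)
    # ISO weekday of that day: 1 = Monday ... 7 = Sunday.
    dow=$(date -u -d "$last_day" +%u)
    # Step back to the Monday on or before the last day of the month.
    date -u -d "$last_day $((dow - 1)) days ago" +%Y-%m-%d
)
```

For example, `last_monday 2024 1` gives the candidate release day for January 2024. Whether a release actually happens on that day still depends on the conditions described above.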
Trust Anchor Policy and Transition Processes
For historical reasons most EGI sites use the 'lcg-CA' meta-package that reflects EGI as well as wLCG policy on approved CAs. This package is being deprecated: new sites can install the new 'ca-policy-egi-core' package (and ca-policy-lcg if desired). For sites that upgrade, changing the repository URL is enough: 'lcg-CA' will trigger installation of both the 'ca-policy-egi-core' package as well as the 'ca-policy-lcg' package.
Today both policies are the same, i.e., the same dependencies are included in both RPMs, and installing either one of them will trigger the installation of the same set of CAs. The EGI specific set of endorsed CAs is encoded in the 'ca-policy-egi-core' meta-package. However, it is not guaranteed that in the future the EGI and wLCG policies will stay aligned, so sites that relied on the 'lcg-CA' package distributed by EGEE to also comply with the wLCG policies on accepted CAs should keep this in mind. Also, the acceptable CA policy for your organisation, country or region may be different from the EGI one: for example, your organisation or NGI may have approved additional CAs to support training activities on your infrastructure, or have additional CAs for your country or your site! As a result, you may want to install additional CAs that are not on the EGI core list.
- IGTF/EUGridPMA Liaison: David Groep (firstname.lastname@example.org)
- Repository and SR Team via RT: Kostas Koumantaros, Mario David, Michel Drescher
- Monitoring/SAM Team (central monitoring configuration centre): Emir Imamagic <email@example.com>, Konstantin Skaburskas <firstname.lastname@example.org>
- NGI Nagios updates: via Monitoring team and SR to NGI operational contacts
- Regional follow-up: NGI contacts and all sites (via CIC portal)
- Security oversight and determination of grace period: EGI CSIRT/Mingchao Ma and Sven Gabriel
- David Groep (under EGI contract O-E-15/SA1) will build the EGI-specific trust anchor package, currently called 'ca-policy-egi-core' and dependencies, and make that available on a special site only intended for INTERNAL use by the deployment process (there is also a ca-policy-lcg package). This "special" distribution contains:
- all CAs accredited under the IGTF "classic", "mics", and "slcs" profiles (following the Policy on Accepted CAs)
- the "ca-policy-egi-core" as well as the legacy/compatibility packages "lcg-CA" and its corresponding "ca-policy-lcg" meta package
- any other EGI specific CAs or exception as decided by EGI (currently none)
- the patch RPM for the Apache mod_ssl bug (the "dummycas" RPM)
- the appropriate NRMS XML, meta-data, SAM/Nagios XML (release date set 2 days to the future), README.txt file, and repo headers, validated as per http://admin-repo.egi.eu/XMLvalidator/
- This distribution will be unit tested before being released, and the RPMs copied to the internal release web site under a specific URL named after the release revision (to allow quick roll-back for SAM <U11.2 monitoring probes):
- ./builddist-egi-multi.pl --srcurl=http://dist.eugridpma.info/distribution/igtf/current -v -f
- On egi-igtf the symlink 'egi/' is then set to the proper version before opening the sw-rel ticket.
- The internal EGI-IGTF liaison web site (not the production site) is:
- wget -r http://egi-igtf.ndpf.info/distribution/egi/
- or using rsync from rsync://egi-igtf.ndpf.info/egi-igtf/egi/
- David will submit an RT ticket to the sw-rel queue
- the RT URL is https://rt.egi.eu/rt/Ticket/Create.html?Queue=24.
- including the release.xml NRMS that gets produced as part of the repository at http://egi-igtf.ndpf.info/distribution/egi/current/meta/ca-policy-egi-core-VERSION-RELEASE.nrms (also called just release.xml).
- ticket subject will be standard: "CA update, version X.Y.Z-R", with a comment inside saying: "please, follow the EGI-IGTF release process at EGI_IGTF_Release_Process".
- The RT ticket that triggers the updates (which can be sent anytime) should systematically include a reminder not to start after mid-week, and preferably to start on a Monday.
- the EGI specific change log in the release file. This change log is intended for direct distribution to the EGI sites
- In RT, the monitoring team should be involved via a CC to email@example.com, and the security officers via firstname.lastname@example.org.
- this triggers the import of distribution to the EGI Repository "unverified" repository for starting the staged-rollout process via the RT link as explained in the NSRW New Software Release Workflow
- To ensure a smooth, parallel update of the repository and the monitoring/Nagios tests, non-urgent updates should be started before Wednesday noon.
- New ticket (with ticket ID) is announced by David Groep (email@example.com) (following on or forwarding the EUGridPMA-Announce list) to
- the EGI CSIRT via firstname.lastname@example.org
- the EGI MW unit via email@example.com, firstname.lastname@example.org (or later email@example.com)
- the monitoring teams to update the SAM CE probes via firstname.lastname@example.org, email@example.com
- Mail paste list: firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org, email@example.com
- Either the EGI CSIRT, the EGI Operational Security Coordinators, or the MW unit can declare this update as urgent.
- The RT ticket will trigger the import into unverified, after which the initial acceptance process will check compliance with the package requirements and upload the report.
- The Initial QA should be done within a few hours.
- This will bring the package to ".../sw/stagerollout/"
- At the transition to .../sw/stagerollout/
- SR quickly verifies that the package works in an at-most-one-day process, but does not close the ticket. The SR team now asks the monitoring team (Emir) whether there are any pre-U11.2 SAM/Nagios instances still in production. If so, the RT ticket gets assigned to the Monitoring team if SR is successful!
- Now the process WAITS for the official IGTF release, which will be announced through the announce@eugridpma mailing list. The new EGI trust anchor release is not to be progressed to production until the IGTF announcement is out (please!)
and you wait a bit more, and then ...
- Only for releases with SAM Nagios < U11.2: the monitoring team updates the old-style WN probes (CE-sft-caver) as described below, based on the data in the SR repository, to update the legacy CE tests if those are still in production use, as both changes are independent. In case the old probes are to be supported, the following steps should be done in quick succession, with a fast update to Nagios prepared and rolled out a few hours before the CA announcement (new-style probes use dynamic data, from the submission repo for <U11.1, and from production repo for >=U11.2, and do not need to be updated):
- updates the ca_dist.dat file using the new CA RPM list (see below!)
- updates the ca_dist.conf file with the absolute calendar update timings used in Nagios (see below as well), in accordance with the EGI-CSIRT or liaison recommendation (8 days for regular, or 1 day for urgent)
- builds and releases the updated org.sam probes via a very-quick SW process in parallel to the update. This will be the new sam release
- monitoring team informs the NGI Operations managers to update the Nagios instances via firstname.lastname@example.org
- The SR team can now progress to release:
- check the date. If the date in the SAM/Nagios "release.xml" file (<Date> entry) has already passed, ask Kostas to update this file in the repo on final release. Otherwise, just continue...
- moves release from staged roll-out to production (so not to UMD), triggering over-write (non-incremental) of the .../sw/production/cas/1/ directory.
- verifies that the update was non-incremental and that the whole contents of the ".../production/cas/1/" directory has indeed been replaced at http://repository.egi.eu/sw/production/cas/1/. If the repository has not properly updated itself, contact the repo team (Kostas Koumantaros at <kkoum@GRNET.GR>) to have it fixed: the URL http://repository.egi.eu/sw/production/cas/1/current/ MUST point to the new release!
- Announcements to be sent:
- the IGTF liaison (DavidG) updates the EGI_IGTF_Release wiki page with installation information to list the new version number.
- The SR team sends the announcement with changelog (from the RT ticket) to the NGI contacts and the sites through the Operations Portal (select "To ROC Managers", "To Production Site Admin", "Operation tools"), with a CC to the gLite EMT. The change log to be included in the mail is part of the RT ticket NSRW XML and can also be found at http://repository.egi.eu/sw/production/cas/1/current/meta/ as file ca-policy-egi-core-readme-X.Y.txt (with X.Y the version number).
- In case the update was marked critical, the EGI-CSIRT verifies that the repository links and release notes of the CA related web pages ( and EGI_IGTF_Release) have been updated and that the relevant broadcast of step 8 is achieved by looking at https://cic.egi.eu/index.php?section=vo&page=broadcast_archive (limit the dates to the last few days, and fill "Criterion 3" with "CA"). If a critical update was not released in time, it will (i) warn the technical and operational coordinators of EGI thereof (via email@example.com, firstname.lastname@example.org), and (ii) take appropriate action to ensure (as always) the security of the EGI infrastructure.
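The timing rules in the procedure above (start before Wednesday noon, 8 days of grace for regular and 1 day for urgent updates, and the check on the <Date> entry) can be sketched as small shell helpers. This is only an illustration under stated assumptions: GNU date is available, dates are written as YYYY-MM-DD (the actual format of the <Date> entry is not specified here), the start window is read as Monday through Wednesday noon, and the function names are hypothetical.

```shell
# Assumption throughout: GNU date (the -d option and the %-H format are GNU extensions).
# Each body runs in a subshell so that its variables do not leak.

# grace_deadline RELEASE_DATE [regular|urgent]
# Print the date on which the grace period ends: 8 days for regular, 1 day for urgent.
grace_deadline() (
    release_date=$1
    mode=${2:-regular}
    case $mode in
        regular) days=8 ;;
        urgent)  days=1 ;;
        *) echo "unknown mode: $mode" >&2; exit 2 ;;
    esac
    date -u -d "$release_date +$days days" +%Y-%m-%d
)

# date_has_passed DATE TODAY
# Succeed if DATE lies strictly before TODAY (both assumed to be YYYY-MM-DD).
date_has_passed() (
    [ "$(date -u -d "$1" +%s)" -lt "$(date -u -d "$2" +%s)" ]
)

# ok_to_start "YYYY-MM-DD HH:MM"
# Succeed if a non-urgent update started at that time falls between Monday
# and Wednesday noon (this reading of the window is an assumption).
ok_to_start() (
    dow=$(date -u -d "$1" +%u)    # 1 = Monday ... 7 = Sunday
    hour=$(date -u -d "$1" +%-H)  # hour without zero padding
    [ "$dow" -lt 3 ] || { [ "$dow" -eq 3 ] && [ "$hour" -lt 12 ]; }
)
```

A release manager could, for instance, call `ok_to_start "$(date -u '+%Y-%m-%d %H:%M')"` before opening a non-urgent ticket, and `grace_deadline` to work out the absolute date to put in the legacy probe configuration.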
Notes on the SAM tests
- SAM tests need to be updated, specifically the "lcg-sam-client-sensors" and "grid-monitoring-probes-org.sam" and these need to be installed in production:
Hi Emir,
I've found back the instructions for generating those magic Python dictionaries for the worker-node CA probes. You can do it in three steps:
- download the CE-sft-caver probe:
    wget http://svnweb.cern.ch/guest/sam/trunk/probes/src/wnjob/org.sam/probes/org.sam/sam/CE-sft-caver
- download the to-be-latest RPMs in a separate directory:
    mkdir input && cd input && \
    wget -q -nH -nd -c -r -l 1 \
    http://repository.egi.eu/sw/stagerollout/cas/1/1.XX-R/current/RPMS/
  or if you want to use the production one
    wget -q -nH -nd -c -r -l 1 \
    http://repository.egi.eu/sw/production/cas/1/current/RPMS/
- REMOVE the 'dummyca' RPM:
    rm dummy-ca-certs-20090630-1.noarch.rpm
- move back up (keeps things clean)
    cd ..
- generate the dictionaries
    ./CE-sft-caver -a input -T
  (or use the "-u" option for an urgent update of 1 day)
- do something useful with the generated files
- update the RT ticket and allow it to be closed (thus installing the stagerollout/ version into production/)
However, there is a fundamental problem with the probe: it uses OpenSSL to calculate the hashes, so the generated dictionaries work *either* with OpenSSL 0.x, OR with OpenSSL1, but NOT with both. That is a limitation of the probe itself. So, generate the ca_data.dat file on a machine that matches the target of the CE probes!
Anyway, the old-style probe has to go away, because it deals with 'removed' CAs in the wrong way. It will start complaining even if the CA is still technically OK ...
Long live the new probe!
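The hash mismatch mentioned in the note can be made visible for a single CA certificate. The sketch below is not part of the probe; it assumes an openssl binary of version 1.0.0 or later, whose x509 command offers -subject_hash (the current algorithm) and -subject_hash_old (the algorithm used before 1.0.0). The helper name ca_hashes and the file name in the usage line are hypothetical.

```shell
# ca_hashes CERT.pem
# Print the subject hash of a CA certificate in both flavours, to show why
# dictionaries generated with one OpenSSL generation do not match the other.
# Assumption: openssl 1.0.0 or later is installed (needed for -subject_hash_old).
# The body runs in a subshell so that its variables do not leak.
ca_hashes() (
    cert=$1
    # Hash as computed by OpenSSL 1.0.0 and later.
    new=$(openssl x509 -noout -subject_hash -in "$cert") || exit 1
    # Hash as computed by OpenSSL releases before 1.0.0.
    old=$(openssl x509 -noout -subject_hash_old -in "$cert") || exit 1
    printf 'old=%s new=%s\n' "$old" "$new"
)
```

Running `ca_hashes some-ca.pem` on a PEM-encoded certificate prints the two eight-digit hashes side by side; for a typical CA certificate the two values differ, which is why a dictionary generated on one side does not match the other.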