- Status: Pre-production
- Start Date: 17/12/2014
- End Date: -
- EGI.eu contact: Diego Scardaci / firstname.lastname@example.org
- External contact: Hassen Riahi / Hassen.Riahi@cern.ch, Laurence Field / Laurence.Field@cern.ch
The CMS community would like to take advantage of EGI Federated Cloud resources to absorb workload peaks in the CMS grid.
The Compact Muon Solenoid (CMS) is a general-purpose detector at the Large Hadron Collider (LHC). It is designed to investigate a wide range of physics, including the search for the Higgs boson, extra dimensions, and particles that could make up dark matter. Although it has the same scientific goals as the ATLAS experiment, it uses different technical solutions and a different magnet-system design. This use case foresees the usage of cloud infrastructure broker Vac/Vcycle developed by the University of Manchester. The CERN community has developed an OCCI connector for Vcycle to access the EGI Federated Cloud resources.
More information on Vac and Vcycle is given below.
Vac is a self-managing system that controls virtual machines running on hypervisors not managed by an IaaS system. It is an implementation of the vacuum model, in which a VM factory runs on each physical machine. Each factory independently decides to start a VM instance, or several instances on a multi-core node. The factory takes care of the VM contextualization based on the predetermined configuration for the VO. Currently, one instance is started per job and is automatically shut down when the job terminates and no further payloads are available. Information is exchanged between the host and the guest via a directory on the host that is mounted by the guest. One key piece of information shared is the exit status of the job: if the exit status is "No work available", the factory backs off from creating machines and tries again later. Because there is no central service, this approach avoids a single point of failure and is horizontally scalable. Factories may communicate with each other to achieve target shares for the specific Vac space at the site: each VM factory decides which VO's VMs to run based on site-wide target shares and on a peer-to-peer protocol in which the site's VM factories query each other to discover which VM types they are running, and therefore which VO's VMs should be started as nodes become available again. For sites where most of the resources are dedicated to a few VOs, this approach provides a straightforward solution that no longer depends on the full Grid or cloud machinery for these jobs.
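The factory's back-off behaviour can be sketched as follows. This is a minimal illustration, not Vac's actual implementation: the `NO_WORK` string, the class name, and the default back-off period are all assumptions made for the example.

```python
# Minimal sketch of a vacuum-model factory's back-off logic.
# NO_WORK, the class name, and the back-off period are illustrative
# assumptions, not taken from the real Vac code base.
NO_WORK = "No work available"

class VMFactory:
    def __init__(self, backoff_seconds=600.0):
        self.backoff_seconds = backoff_seconds
        self.backoff_until = 0.0  # timestamp before which no VM is started

    def record_exit(self, exit_status, now):
        # If the last VM reported no payloads, pause VM creation for a while.
        if exit_status == NO_WORK:
            self.backoff_until = now + self.backoff_seconds

    def may_start_vm(self, now):
        # VMs may only be started once any back-off period has expired.
        return now >= self.backoff_until

factory = VMFactory(backoff_seconds=600.0)
factory.record_exit("No work available", now=1000.0)
print(factory.may_start_vm(now=1100.0))  # still backing off -> False
print(factory.may_start_vm(now=1700.0))  # back-off expired -> True
```

In the real system the exit status would be read from the shared host directory mentioned above rather than passed in directly.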
Vcycle is an alternative implementation of the vacuum model that can be used in conjunction with IaaS providers. Whereas an instance of Vac resides on each physical host, a centralized Vcycle service uses the IaaS interface to manage the VM lifecycle following the same logic as Vac. It supervises the VMs, instantiating and shutting them down depending on the load coming from the experiment's central task queue. As with Vac, if the exit status is "No work available", the factory backs off from recreating machines and tries again later. Used this way, Vcycle can provide elastic capacity across the resource providers it has at its disposal.
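The centralized supervision loop can be sketched like this. The `IaaSClient` stand-in and its method names are assumptions made for the example; the real Vcycle talks to an OCCI (or other IaaS) endpoint rather than an in-memory object.

```python
# Hypothetical sketch of a Vcycle-style supervisor; the IaaSClient
# interface is an assumption, not the real Vcycle or OCCI API.
class IaaSClient:
    """Toy in-memory stand-in for an IaaS endpoint."""
    def __init__(self):
        self.vms = {}
        self._next_id = 0

    def create_vm(self):
        vm_id = self._next_id
        self._next_id += 1
        self.vms[vm_id] = "running"
        return vm_id

    def delete_vm(self, vm_id):
        del self.vms[vm_id]

def supervise(client, queued_jobs, max_vms):
    """Start one VM per queued job, capped by the provider's quota,
    mirroring the one-instance-per-job policy described above."""
    wanted = min(queued_jobs, max_vms)
    while len(client.vms) < wanted:
        client.create_vm()
    return len(client.vms)

client = IaaSClient()
print(supervise(client, queued_jobs=5, max_vms=3))  # quota caps VMs at 3
```

A real supervisor would also apply the back-off rule after a "No work available" exit and shut down idle VMs; this sketch only shows the scale-up decision.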