VT-CloudCaps:Questionnaire

This page should evolve into a questionnaire that maps the state of user groups using FedCloud and working with our mini-project. We have to create the questionnaire, fill it in with the information we already know, and only then approach users.

Feel free to edit, just first ideas!

Pool of topics and questions

Image preparation and management

How has the image been created, how is it managed, should we help with preparation?

  • How many images are used by your group (one, several with different functions)?
  • How did you create the image? From scratch, from a basic OS image, using a full OS installation, copy of a desktop, copy of an image prepared in VMware/VirtualBox on a desktop, group already provides one ...
  • Is it one partition with the system, or multiple partitions/a whole disk (with a dedicated place for data, empty space for other data, packages, user data)?
  • Is everything required for computation already installed in the image? Would it be interesting to install parts during VM start (contextualization, always the latest version of packages)? Does it install packages/software during or after boot? CVMFS?
  • Is the image prepared to run with KVM/Xen, and in which format (OVF)?
  • Do you rely on a specific Linux Kernel version?
  • How should new versions of the image be distributed and installed? No need, rarely, often via vmcatcher, other way. How do you intend to deal with security updates?
  • Is the image signed? Endorsed by some group? Verified by some RP?
  • What hardware requirements (resource demands) do your image and application have? RAM, disk, processor, cores.
  • What are the network requirements of your application? Do you require access to the running instance from outside? Which ports do you require to be open? Do you expect arbitrary access from within the instance to the outside world? What are your bandwidth expectations?

Workload management

How do you submit the actual work to the running instances? Should we care, help?

  • Do you use a form of pilot framework? BOINC? Other implementation of call-home?
  • Is the VM started by a workload system/application that immediately submits "jobs"?
  • Who does the scheduling? Do VMs run across several providers?
  • Do you do automatic scaling of your framework? Do you require vertical scaling, i.e. sizing up instances, or horizontal scaling, i.e. adding more instances as needed?
  • How long should a VM run (long computation, smaller jobs submitted inside the VM, ...)?
  • Can the VM be preempted or migrated?
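
As an illustration of the horizontal scaling mentioned above, the following minimal Python sketch computes how many instances a queue of pending jobs would call for. The function name desired_instances, the jobs_per_instance capacity and the instance limits are hypothetical and are not taken from any FedCloud tool.

```python
def desired_instances(queued_jobs: int, jobs_per_instance: int,
                      min_instances: int = 0, max_instances: int = 10) -> int:
    """Return the number of VMs needed for the queue, clamped to the limits."""
    if jobs_per_instance < 1:
        raise ValueError("jobs_per_instance must be at least 1")
    if queued_jobs < 0:
        raise ValueError("queued_jobs must not be negative")
    # Integer ceiling division: one extra VM for any partial batch of jobs.
    needed = (queued_jobs + jobs_per_instance - 1) // jobs_per_instance
    return max(min_instances, min(max_instances, needed))
```

Vertical scaling would instead keep the instance count fixed and change the size (RAM, cores) of each instance.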

AAI and contextualization

How do you intend to access running VMs? (Should we help, explain what's possible, push contextualization?)

  • Is there support for user contextualization? Already available/would be nice/not needed.
  • Does your system come with pre-installed SSH access: a fixed root password, an SSH public key, a group-accessible public key, another way to log in, remote desktop? Is there no need for root access, or is there a need for user contextualization (storing an SSH key in authorized_keys)?
  • How are running VMs managed: all started by one representative of the VO, an image/VM shared among a group of users, or a VM for just one user?
  • Does the VM contain credentials needed to access remote services/data? Could these be injected via contextualization?
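
User contextualization by storing an SSH key in authorized_keys could look roughly like the following Python sketch. The function name install_public_key and the way the key is delivered to the VM are assumptions made for illustration; in practice a contextualization tool such as cloud-init usually performs this step.

```python
import os
from pathlib import Path


def install_public_key(public_key: str, home: Path) -> bool:
    """Append public_key to <home>/.ssh/authorized_keys unless already present.

    Returns True when the key was added, False when it was already there.
    """
    key = public_key.strip()
    ssh_dir = home / ".ssh"
    ssh_dir.mkdir(mode=0o700, exist_ok=True)
    auth_file = ssh_dir / "authorized_keys"

    text = auth_file.read_text() if auth_file.exists() else ""
    if key in (line.strip() for line in text.splitlines()):
        return False

    # Start on a fresh line if the existing file lacks a trailing newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with auth_file.open("a") as handle:
        handle.write(prefix + key + "\n")
    # Keep the file private to its owner, as SSH servers commonly expect.
    os.chmod(auth_file, 0o600)
    return True
```

Calling the function twice with the same key leaves a single entry, so it is safe to run on every boot.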

Data, big data

In some cases, big data are analyzed or produced by cloud applications. There is usually room for improvement, help, new services...

  • Does your application work with large amounts of data? If yes, which type of access is needed (big shared network storage, virtual disk accessed by some VMs, object storage)? Do you only read or also write this category of data?
  • Is all of the data used by all VMs? Does every VM/job use only a small subset? Other patterns?
  • Do you require a Hadoop-like environment?
  • Are you already using some object storage like S3 or CDMI? A data service from EGI (GridFTP, SE, SRM)?
  • Is a large amount of data downloaded/produced during the VM lifetime?
  • Is there a need for higher-level control of data access?

What else should we know?

  • Is there a need for other services, such as a messaging system, integration with standard EGI services (data?), or an SQL database?

UseCase-specific questionnaires

OpenModeller

Candidate for block storage, object storage and possibly auto-scaling.

  • How did you create the image? (from scratch, basic installation, full installation etc.)

From scratch, installing the needed libraries and tools such as COMPSs, OpenModeller, rOCCI client, etc., and then we saved it.

  • Is everything required for computation already installed in the image? (software, tools etc.)

Yes, although we could have done it at VM creation time.

  • How will you distribute your image and its updates? (vmcaster/vmcatcher, automated using a different tool, by hand)

I uploaded the image to the EGI repository and then I endorsed it at CESNET. I don't know how the other providers published it.

  • What are your resource requirements? (CPU, memory, storage and network)

We never measured the minimum requirements for running openModeller, and they can vary considerably depending on the experiment. However, since the EGI Use Case is related to the BioVeL project, which is currently using a modelling service hosted at CRIA, I think BioVeL expects the new service instance in Europe to provide at least similar performance, if not better. The whole service is running on a single machine here:

Dell PowerEdge 1800 (2 processors Intel Xeon CPU 3.80GHz, 4GB Memory DDR2, 400MHz, 6 HDs SCSI of 146GB, 2 HDs SCSI of 300GB)



  • How do you submit work to running instances? (pilot framework or local workload)

We use the VENUS-C/COMPSs framework. An endpoint is provided to user communities and accessed through Taverna for BioVeL users, through a Virtual Research Environment in EUBrazil-OpenBio.

  • Does your application support horizontal (more instances) and vertical (more resources for a single instance) auto-scaling?

Horizontal

  • How do you access your virtual machines once they have been launched?

COMPSs accesses the VMs for execution.

  • Are you using contextualization? (how and where or why not)

COMPSs copies the SSH keys at the moment of VM creation.


  • What's the character of your data? (size, format, read-only vs. read-write)

The local set of environmental layers is ~32 GB. Please note this is just a limited set of the most popular files, so this number can easily increase over time. Environmental layers are one of the inputs for the modelling procedure, together with a set of species occurrence points. Both are only read by openModeller to generate, test or project models.

  • How are you accessing your data? (copied locally vs. accessed remotely)

Copied locally

  • How much space do you need for a single computation?

Results can either be small XML files generated by creating or testing models (few KB) or a raster file generated by projecting a model (from a few MB up to a few GB, depending on spatial extent, resolution and format). In the new service API there's a new operation that will accept multiple model creations, tests and projections in a single request, which makes this question even more complicated to answer.


  • Could environmental layers be stored in object storage?

Environmental layers are rasters that are usually stored as regular files, but they can also be stored in relational databases, as TerraLib or rasdaman do. Apparently Oracle Spatial can store rasters using object storage, but I have no experience with this.

  • How are you gathering results and what's their character? (size, format, sensitivity)

The service is asynchronous: clients send job requests and need to retrieve results later when the job is finished. Results are stored for a certain period of time configured by the sysadmin. They are all regular files. Here we keep them for a couple of weeks. It's hard to tell about the size because it depends on the number of requests in that period and on the type of requests. Right now the results in our server are only taking a few MB, but it would be a good idea to reserve some GB for this task. There's no security mechanism provided by the service. Results are retrieved by providing a ticket that is generated for the initial request (a random combination of numbers and characters).
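
A ticket of this kind, a random combination of numbers and characters, can be sketched in Python as follows. How the openModeller service actually generates its tickets is not stated here; the function name new_ticket, the default length of 20 and the alphabet are assumptions. The secrets module is used because the ticket is the only thing protecting the results.

```python
import secrets
import string

# Letters and digits only, so the ticket is safe to pass in URLs and XML.
TICKET_ALPHABET = string.ascii_letters + string.digits


def new_ticket(length: int = 20) -> str:
    """Return a random ticket of the given length made of letters and digits."""
    if length < 1:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(TICKET_ALPHABET) for _ in range(length))
```

A longer ticket makes guessing another client's results less likely, which matters when no other security mechanism is in place.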

  • Do you support or actively use any dynamic cloud-like environment? (which, how and why)


  • Are you exposing any services to the outside world? (i.e., listening on public interfaces)

Yes, there is an Extended Open Modeller Web Service exposed to the users and deployed outside EGI.

  • How are they protected from unauthorized use?

The endpoint is public at the moment (Renato, I don't know whether you plan anything regarding security), but access to the VENUS-C/COMPSs middleware is protected with X.509 certificates.

WeNMR

Candidate for auto-scaling.

  • How did you create the image? (from scratch, basic installation, full installation etc.)
  • Is everything required for computation already installed in the image? (software, tools, data, etc.)
  • How will you distribute your image and its updates? (vmcaster/vmcatcher, automated using a different tool, by hand)
  • What are your resource requirements? (CPU, memory, storage and network)


  • How do you submit work to running instances? (pilot framework or local workload)
  • Are you using contextualization? (how and where or why not)


  • What's the character of your data? (size, format, read-only vs. read-write)
  • Have you considered using object storage to access your data and store the results?
  • Are you dealing with sensitive data?
  • How are you accessing your data? (copied locally vs. accessed remotely)

BNCWeb

Candidate for SQL database.

  • How did you create the image? (from scratch, basic installation, full installation etc.)
  • Is everything required for computation already installed in the image? (software, tools, data, etc.)
  • How will you distribute your image and its updates? (vmcaster/vmcatcher, automated using a different tool, by hand)
  • What are your resource requirements? (CPU, memory, storage and network)


  • How are you accessing your data? (copied locally vs. accessed remotely)
  • Have you considered using a centralized SQL database to share and access your corpus data across multiple instances?
  • Does your application support horizontal (more instances) and vertical (more resources for a single instance) auto-scaling?


  • Are you using contextualization? (how and where or why not)
  • Are you exposing any services to the outside world? (i.e., listening on public interfaces)
  • How are they protected from unauthorized use?

PeachNote

Candidate for Messaging (currently using Amazon SQS), Database (Apache HBase), Auto-Scaling

  • How did you create the image? (from scratch, basic installation, full installation etc.)
  • Is everything required for computation already installed in the image? (software, tools, data, etc.)
  • How will you distribute your image and its updates? (vmcaster/vmcatcher, automated using a different tool, by hand)
  • What are your resource requirements? (CPU, memory, storage and network)


  • We have learned that your VM would need access to Amazon's SQS for job info, to an HBase cluster to retrieve and store data, and to the PeachNote server to regularly update the workflow code. Which hosts and ports do these services run on?
  • Does your application support horizontal (more instances) and vertical (more resources for a single instance) auto-scaling?


  • Are you using contextualization? (how and where or why not)
  • Are you exposing any services to the outside world? (i.e., listening on public interfaces)
  • How are they protected from unauthorized use?

WSPGRADE

Candidate for Auto-Scaling

  • How did you create the image? (from scratch, basic installation, full installation etc.)
  • Is everything required for computation already installed in the image? (software, tools, data, etc.)
  • How will you distribute your image and its updates? (vmcaster/vmcatcher, automated using a different tool, by hand)
  • What are your resource requirements? (CPU, memory, storage and network)


  • How are you accessing your data? (copied locally vs. accessed remotely)
  • Does your application support horizontal (more instances) and vertical (more resources for a single instance) auto-scaling?


  • Are you using contextualization? (how and where or why not)
  • Are you exposing any services to the outside world? (i.e., listening on public interfaces)
  • How are they protected from unauthorized use?

GaiaSpace

Candidate for Auto-Scaling, Object-Storage, Block-Storage

  • How did you create the image? (from scratch, basic installation, full installation etc.)
  • Is everything required for computation already installed in the image? (software, tools, data, etc.)
  • How will you distribute your image and its updates? (vmcaster/vmcatcher, automated using a different tool, by hand)
  • What are your resource requirements? (CPU, memory, storage and network)


  • How are you accessing your data? (copied locally vs. accessed remotely)
  • Does your application support horizontal (more instances) and vertical (more resources for a single instance) auto-scaling?


  • Are you using contextualization? (how and where or why not)
  • Are you exposing any services to the outside world? (i.e., listening on public interfaces)
  • How are they protected from unauthorized use?

DIRAC

Candidate for Auto-Scaling

  • How did you create the image? (from scratch, basic installation, full installation etc.)
  • Is everything required for computation already installed in the image? (software, tools, data, etc.)
  • How will you distribute your image and its updates? (vmcaster/vmcatcher, automated using a different tool, by hand)
  • What are your resource requirements? (CPU, memory, storage and network)


  • How are you accessing your data? (copied locally vs. accessed remotely)
  • Does your application support horizontal (more instances) and vertical (more resources for a single instance) auto-scaling?


  • Are you using contextualization? (how and where or why not)
  • Are you exposing any services to the outside world? (i.e., listening on public interfaces)
  • How are they protected from unauthorized use?

DCH

Candidate for <Capability>