Difference between revisions of "VT VAPOR:Progress Report"

Revision as of 11:26, 13 January 2014

Progress Report

VT Reporting guidelines

Reporting period: Dec. 9th 2013 to Jan. 13th 2014

Progress

Little happened during this long period because of the Christmas vacation (the main developer was on leave from Dec. 9th to Jan. 2nd).

Nevertheless, a first version of VAPOR was finalized and put online on Dec. 22nd 2013. It supports the biomed VO, and the biomed technical support team is now encouraged to use it. The web application is hosted at the Computing Center in Lyon (IN2P3), while the data collecting services are hosted on the production server at I3S in Sophia Antipolis.

Other tasks performed:

  • new release of the JobMonitor deployed. It fixes flaws and simplifies the configuration.
  • new release of the Lavoisier data integration service deployed. Comes with several important optimizations.
  • Completed the integration of the Running Ratio feature: initially it required the data to be located on the same server as the webapp; the data is now retrieved remotely.
  • Misc. other bug fixes and improvements.
  • Important rework of the VAPOR Install and Configuration guide.

Plans for next period

  • Investigate the excessive delay in computing the white list of CEs (2 to 3 minutes depending on parameters).
  • Code cleanliness: improve insufficient comments in the code, clean up unneeded files and properties.

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.

Reporting period: Nov. 26th to Dec. 9th

Progress

  • A major part of this period was spent setting up a production-ready set of VAPOR services: fine-tuning of the log configuration, configuration of log rotation, firewall configuration, and set-up of a backup procedure for the VAPOR VM.
  • Bug fixing on the scan of full storage elements (sort files of expired/suspended users)
  • Investigation and fixing of a problem of high CPU consumption of the JobMonitor tool
  • 2 days face-to-face meeting with Operations Portal team (2 and 3 of December), integration of VAPOR with the Operations Portal at the CC IN2P3 (Lyon, France)
    • Decision made on where the VAPOR data production services should be hosted, i.e. at I3S or at CC IN2P3
    • Installation of the VAPOR webapp on the web server of the CC IN2P3 and link from the Operations Portal. Now accessible for VO biomed only from: https://operations-portal.egi.eu/vapor?vo=biomed
    • Address web design and graphical homogeneity issues
    • Start migration to Lavoisier 2.1 data integration service (used as third party tool within VAPOR)

Plans for next period

Very little is expected to happen until the end of the month, as the main developer will be on leave from Dec. 9th to 31st.

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.

Reporting period: Nov. 12th to Nov. 25th 2013

Progress

  • Complete developments on the scan of storage elements (sort files of expired/suspended users)
  • In preparation for the 2-day meeting with the Operations Portal team (2 and 3 of December), configured all VAPOR services for the biomed VO on the dedicated VM deployed recently:
    • this helped fix issues in some of the tools to make them VO independent (some were initially developed for biomed and thus were specific for it): distinguish proxy certificate files, separate log files etc.
  • Refactoring of some configuration files to make them more intuitive, and update documentation appropriately.

Plans for next period

  • Complete the deployment of all current VAPOR services for the VO biomed.
  • 2 days meeting with Operations Portal team (2 and 3 of December), integration of VAPOR with the Operations Portal

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.

Reporting period: Oct. 30th to Nov. 11th 2013

Progress

  • Continued the developments of VO Data Management features: period mostly dedicated to the scan of Storage Elements based on the file catalog
    • continued the customization and integration of the tool used to scan storage elements filling up.
    • development of web pages to display the reports
  • Continue investigation of the use of GFAL2 python and GFAL FS.

Plans for next period

  • Still a few developments to do on the scan of storage elements (sort files of expired/suspended users)
  • In preparation of the 2 days meeting with Operations Portal team (2 and 3 of December), configure VAPOR services on the dedicated VM deployed recently.

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.

Reporting period: Oct. 16th to 29th 2013

Progress

  • Overall review of the consistency in terms of GUI, styles etc. of the features related to Resource status indicators and statistical reports.
  • Fixing of a few minor issues
  • Deployment of a virtual machine with SL64 and EMI3 UI planned to host monitoring services of VAPOR.
  • Developments: VO Data Management features:
    • Started customization and integration of a tool used to scan storage elements filling up.
    • On-going discussion on the use of GFAL2 python and GFAL FS, together with GFAL2 team and VAPOR's partner CNRS Creatis.
    • Start development of SE consistency checking (dark data, lost files) using the GFAL2 API.
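
The SE consistency check mentioned above can be illustrated with a minimal sketch. It assumes the usual meaning of the two terms: dark data are files present on a storage element but unknown to the file catalog, and lost files are catalog entries with no file on the storage element. The function name and the two input lists are hypothetical; the sketch does not call the GFAL2 API and only shows the comparison step once both listings have been obtained.

```python
def check_se_consistency(catalog_paths, storage_paths):
    """Compare file catalog entries against a storage element listing.

    Both arguments are iterables of file paths. They are hypothetical
    inputs: in practice they would come from a catalog dump and from a
    listing of the storage element.
    """
    catalog = set(catalog_paths)
    storage = set(storage_paths)
    return {
        # Present on the SE but unknown to the catalog.
        "dark_data": sorted(storage - catalog),
        # Registered in the catalog but missing on the SE.
        "lost_files": sorted(catalog - storage),
    }


report = check_se_consistency(
    catalog_paths=["/biomed/a.dat", "/biomed/b.dat"],
    storage_paths=["/biomed/b.dat", "/biomed/c.dat"],
)
print(report)
```

Using sets keeps the comparison linear in the number of files, which matters when a storage element holds a large number of entries.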

Plans for next period

  • Continue the developments of VO Data Management features.
  • Continue deployment of the virtual machine dedicated to host monitoring services of VAPOR.

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.

Reporting period: Oct. 2nd to 15th 2013

Progress

  • Developments: completed most of the remaining features related to Resource status indicators and statistical reports:
    • report of all resources supporting the VO,
    • report resources not in proper production,
    • report computing elements with the 444444 issue (number of jobs waiting or running is 444444),
    • report storage elements which publish negative space values.

in addition to look and feel and ergonomic improvements (tooltips, styles, use cookies to remember columns selected by user etc.).

  • Investigation about the GFAL2 API to address VO Data Management features.
  • Started writing the application architecture description.
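
As an illustration of the last two reports listed above (computing elements with the 444444 issue, storage elements publishing negative space values), here is a minimal sketch. The record layout and the field names (`id`, `waiting_jobs`, `running_jobs`, `free_space`, `used_space`) are assumptions made for the example, not the actual VAPOR data model.

```python
PLACEHOLDER_JOBS = 444444  # value described in the report as the "444444 issue"


def ces_with_444444(ces):
    """Return ids of computing elements whose waiting or running job count is 444444."""
    return [
        ce["id"]
        for ce in ces
        if PLACEHOLDER_JOBS in (ce["waiting_jobs"], ce["running_jobs"])
    ]


def ses_with_negative_space(ses):
    """Return ids of storage elements that publish a negative space value."""
    return [
        se["id"]
        for se in ses
        if se["free_space"] < 0 or se["used_space"] < 0
    ]


# Hypothetical sample records.
sample_ces = [
    {"id": "ce1.example.org", "waiting_jobs": 444444, "running_jobs": 12},
    {"id": "ce2.example.org", "waiting_jobs": 3, "running_jobs": 40},
]
sample_ses = [
    {"id": "se1.example.org", "free_space": -1, "used_space": 500},
    {"id": "se2.example.org", "free_space": 1000, "used_space": 200},
]

print(ces_with_444444(sample_ces))
print(ses_with_negative_space(sample_ses))
```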

Plans for next period

A few minor issues need to be fixed on pages described above, along with an overall review of the consistency in terms of GUI, styles etc. Developments: start development of the VO Data Management features: dealing with storage elements filling up.

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.

Reporting period: Sep. 18th to Oct. 1st 2013

Progress

  • Face to face meeting with the EGI Operation Portal team at the TF13 in Madrid:
    • Demonstration of VAPOR current status
    • Decision to organize a 2-days integration session by end of November/beginning of December to make VAPOR's current status publicly accessible.
  • Developments:
    • Added 2 new web pages to show (1) the current status of resources supporting the VO (CE, SE, WMS, VOMS, LFC), and (2) the list of resources whose status is currently not in production (consolidate BDII/GOCDB statuses)
    • Improvements in the Lavoisier VO-views to consolidate data from the BDII and GOCDB.

Plans for next period

  • Complete the new web pages of the current status of resources supporting the VO and the list of resources whose status is currently not in production.
  • Enrich the look & feel with graphical icons, for instance to signal tooltips
  • Start writing application architecture description

Problems we encounter, but can solve

The lack of experience of the main developer is being handled and things are improving. Continuous support is being provided.

Problems and issues we need external help with

None.

Reporting period: Sept. 4th to 17th 2013

Progress

  • Continue the customization of the JobMonitor tool by partner IPHC to support a multi-vo environment. Several phone conferences with the developer.
  • Global actions on the web application:
    • Continued to improve the ergonomics and look and feel of web pages: enrich help texts, add customizable tooltips.
    • Tests have revealed weaknesses in the application development: an important action has consisted in improving the safety of the application and adding much more log traces to be able to follow up on the application life.
  • VO Operations, reports of availability of resources:
    • Added some reports in the web pages to show the "running ratio" R(R/W)
    • Almost completed the Lavoisier VO-views to consolidate data from the BDII and GOCDB.
  • VO data management: continuation of the study on VO Data management, very rich discussion with developers of GFAL2 in GGUS ticket #97076
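
The "running ratio" R(R/W) is not defined in this report. The sketch below assumes it is the number of running jobs divided by the number of waiting jobs on a computing element; the function name and the handling of the zero-waiting case are illustrative choices, not the VAPOR implementation.

```python
def running_ratio(running, waiting):
    """Ratio of running to waiting jobs (assumed meaning of R(R/W)).

    Returns None when no job is waiting, since the ratio is undefined then.
    """
    if waiting == 0:
        return None
    return running / waiting


print(running_ratio(30, 10))
print(running_ratio(5, 0))
```

Returning None rather than a sentinel number leaves it to the report page to decide how to display a computing element with no waiting jobs.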

Plans for next period

  • Face to face meeting with the developers of the EGI Operation Portal to discuss future steps.
  • Add 2 new web pages to show (1) the current status of resources supporting the VO (CE, SE, WMS, VOMS, LFC), and (2) the list of resources whose status is currently not in production (consolidate BDII/GOCDB statuses)

Problems we encounter, but can solve

The lack of experience of the main developer is being handled and things are improving, although the code production and quality remain lower than expected. Action: heavy support is being provided.

Problems and issues we need external help with

None.

Reporting period: Aug. 21st - Sept. 3rd

Progress

  • Follow-up of the improvement of the ergonomics and look and feel of web pages.
  • VO Operations, reports of availability of resources:
    • completed the web page and code dedicated to the production of white list of CEs based on job monitoring reports.
    • Writing of VO-views to consolidate the data from the BDII and GOCDB, using the Lavoisier data integration service. Numerous interactions with the developers of the tool.
  • VO data management: continuation of the study on VO Data management (dark data in particular), discussions with sites admins (FR, UK) to refine the procedures
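
The production of a white list of CEs from job monitoring reports can be sketched as follows. The report does not give the selection rule, so the criteria used here (a minimum success rate over a minimum number of monitored jobs) and the parameter names `min_success_rate` and `min_jobs` are assumptions. The per-CE counters ok, ko and timed out follow the job monitor results mentioned elsewhere in this report.

```python
def white_list(job_counts, min_success_rate=0.8, min_jobs=5):
    """Select CEs whose monitored jobs succeed often enough.

    job_counts maps a CE name to a dict with the keys "ok", "ko" and
    "timed_out". Thresholds are hypothetical defaults.
    """
    selected = []
    for ce, counts in job_counts.items():
        total = counts["ok"] + counts["ko"] + counts["timed_out"]
        if total < min_jobs:
            continue  # too few jobs to judge this CE
        if counts["ok"] / total >= min_success_rate:
            selected.append(ce)
    return sorted(selected)


# Hypothetical monitoring results.
sample_counts = {
    "ce-a.example.org": {"ok": 9, "ko": 1, "timed_out": 0},
    "ce-b.example.org": {"ok": 3, "ko": 5, "timed_out": 2},
    "ce-c.example.org": {"ok": 2, "ko": 0, "timed_out": 0},
}
print(white_list(sample_counts))
```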

Plans for next period

  • VO Operations, reports of availability of resources: still some reports to add in the web pages to show the "running ratio" R(R/W)

Problems we encounter, but can solve

  • The lack of experience of the main developer is being handled and things are improving, although the code production remains slower than expected. Action: heavy support is being provided to him.
  • Complexity of the Lavoisier data integration service, but the development team is very helpful and reactive.

Problems and issues we need external help with

None.

Reporting period: Aug. 7th to 20th

Progress

  • Work with partner IPHC to upgrade tool JobMonitor to support a multi-vo environment. Several phone conferences.
  • Improvements of the ergonomics of web pages.
  • Improve the management of misc. errors.
  • VO Operations, reports of availability of resources:
    • development of new web pages to report running and waiting jobs, and ratio R(R/W).
  • VO data management:
    • testing of scripts to detect and clean up VO dark data
    • discussions held with sites admins (FR IPHC, UK QMUL) in order to refine the procedure to deal with dark data and lost files. Study of good practices from HEP VOs.

Plans for next period

  • VO Operations management features (resources availability): add new reports on ratio R(R/W), complete the production of the white list of CEs.
  • Continue study on VO Data management (dark data).

Problems we encounter, but can solve

Lack of experience of the main developer in terms of development good practices and application design. This has slowed down the activity but is being handled.

Problems and issues we need external help with

None.

Reporting period: July 24th - Aug. 6th

Progress

  • This period was used to make a strong focus on code quality including code review, cleaning up of code, and the first commit of the current version after code and environment cleaning up.
  • Documentation: write a document to describe the application environment and installation procedure.

Development:

  • VO Operations management features
    • resources availability: continue development of reports on running and waiting jobs, and ratio R(R/W).
    • VO data management: start development of scripts to detect and clean up VO dark data.
  • Focus on styles and look and feel of web pages

Plans for next period

Complete VO Operations management features (resources availability): reports on running and waiting jobs, and ratio R(R/W); white list of CEs. Continue work on VO Data management (dark data).

Problems we encounter, but can solve

Lack of experience of the main developer in terms of development good practices and application design. This has slowed down the activity but is being handled.

Problems and issues we need external help with

None.

Reporting period: Jun. 26th - Jul. 23rd

Progress

Low activity due to summer vacation period: 2 weeks vacation for Flavien Forestier (developer) and 3.5 weeks for Franck Michel (project manager) => 10 days work for Flavien and 2 days for Franck.

  • VO Operations management features (resources availability) :
    • continue development of web pages to view results of the Job Monitor tool of IPHC partner: chart for the history report, table view by computing element.
    • development of web page to view results of the CE monitor: view of number of running, waiting jobs and ratio R(R/W) (chart)
    • discuss the parameters of the white list of CEs with IPHC.
  • Deployment of the dev and test environment on a virtual machine.

Plans for next period

  • Refine existing pages with better ergonomics and presentation
  • Focus on styles and look and feel of web pages
  • Display a white list of CEs

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.

Reporting period: June 19th to 25th

Exceptionally this is a one-week report as I'll be on vacation starting end of this week.

Progress

  • Acquired skills in using the Twitter Bootstrap software, Ajax technologies, and JavaScript graphical libraries (Dygraph)
  • Continue the development of web pages for the VO Operations management features about resources availability: chart for the results of the job monitor of IPHC (number of jobs ok, ko, timed out).
  • Further install and configuration of the virtual machine deployed last week to host developments and tests of VAPOR.

Plans for next period

Very little will be done in the next period due to the summer vacations of Franck Michel (back 22 July) and Flavien Forestier (VAPOR developer, back 15 July).

Problems we encounter, but can solve

none.

Problems and issues we need external help with

none.

Reporting period: June 5th to 18th

Progress

  • Work with partner IPHC to upgrade tool JobMonitor to support a multi-vo environment. Several phone conferences.
  • 2 days face-to-face meeting in Lyon (France) between I3S and EGI Operations Portal and VO Operations Dashboard developers team: https://indico.egi.eu/indico/conferenceDisplay.py?confId=1721. Goal: technical discussions on the way to integrate VAPOR developments into the existing VO Operations Dashboard.
  • Also in Lyon, discussion with biomed LFC manager about Data Management procedures to set up.
  • Continued technical phone conferences with partner CNRS IPHC: the existing job monitor tool is being customized to be more general (initially dedicated to biomed).
  • Prototyping of first web pages for the VO Operations management features about resources availability, using both the job monitor of IPHC and the data integrator web service from EGI Operations Portal (Lavoisier).
  • Deployment of a virtual machine at I3S to host developments and tests of VAPOR.

Plans for next period

  • Keep on developing the VO Operations management features: continue integration of job monitor along with appropriate web pages in tabular and chart formats.
  • Start development of web pages to report the evolution of running and waiting jobs in the VO in graphical charts
  • Start using technologies such as Twitter Bootstrap and Ajax to make a good-looking, user friendly and reactive web interface.

Problems we encounter, but can solve

The work on the VO data management procedures has been started in Lyon with discussions with the LFC manager. Further work is postponed until later in the summer.

Problems and issues we need external help with

Reporting period: May 22nd - June 5th

Progress

  • Decision made on the priorities of the developments, following the discussion with partner VOs about the Functional specification of VAPOR features :
    1. VO Operations management > Report GOCDB and BDII status and Monitor resources availability.
    2. VO Operations management > VO Data Management procedures.
    3. Users database implementation.
  • Developments started on point 1: 2 conference calls held during the period with partner CNRS IPHC that develops a tool to monitor CEs.
  • 6 days (2x3 days) of training courses for Franck Michel during this period.

Plans for next period

  • 2 days face to face meeting with VO Operations Dashboard developers team in Lyon (France), to bootstrap developments within a common environment.
  • Will try to organise calls with VO AUGER, which showed interest in VAPOR.
  • Find volunteer among partners to start working on the possible procedures that can be envisaged about data management (priority 2).

Problems we encounter, but can solve

Difficulty to find someone among partners to start working on the VO data management procedures. Call to be organised with partners CNRS Creatis and CNRS IPHC.

Problems and issues we need external help with

None.

Reporting period: May 8th to 21st

Progress

  • 4 national holidays in the last two weeks explain a rather light advance in this period.
  • Work continued on the Functional specification of VAPOR features: restructuring, additions, more in-depth details, additional related material.
  • Conference held with France Grille VO to get their opinion on the features of VAPOR and the possibility that they use it in the future. Meetings list updated: https://indico.egi.eu/indico/categoryDisplay.py?categId=100
  • Conference with AMC can't be done for now due to constraints of AMC. Will be rescheduled in July.
  • Face to face meeting scheduled on 5 and 6 of June with VO Operations Dashboard developers team in Lyon (France): the idea is to bootstrap joint developments.
  • Fix and improve existing tools to be integrated into VAPOR about GOCDB and BDII status report.
  • Self training of Flavien Forestier continued regarding grid technologies, the Symfony2 framework and related development technologies.

Plans for next period

  • Will try to organise calls with 2 other VOs that showed interest in VAPOR: AUGER
  • Flavien Forestier to continue self training on development technologies, and acquire strong in-depth knowledge about functional features and technical solutions.
  • Contact members of the project to investigate possible technical solutions on data management.
  • Contact biomed members about the tool used to perform CE monitoring.

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.

Reporting period: Apr. 22nd - May 7th

Progress

Plans for next period

  • Meetings with partners to continue: call scheduled with France Grille VO. Call with AMC to be scheduled.
  • Flavien Forestier to start self training on the Symfony2 framework.

Problems we encounter, but can solve

None.

Problems and issues we need external help with

None.