
EGI-InSPIRE:SA1.5-QR7

From EGIWiki
Revision as of 17:15, 11 December 2012 by Krakow (talk | contribs)




1. Task Meetings

Date (dd/mm/yyyy) | Url Indico Agenda | Title | Outcome

2. Main Achievements

Repository: The production repository ran with no internal problems this quarter. There was one scheduled firewall outage of 45 minutes at RAL and a few very small network breaks, which prevented the service from receiving new data. All of this data would have been received the next time the affected clients tried to publish. Total availability was 99.79%.
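As a rough check on what the availability figure means in wall-clock terms, the sketch below converts an availability percentage into minutes of unavailability and back. The 92-day quarter length and the function names are assumptions made for illustration; the report does not state the measurement window or how availability was computed.

```python
# Minimal sketch: relate an availability percentage to minutes of downtime.
# QUARTER_DAYS is an assumption; the report does not give the exact period.
QUARTER_DAYS = 92


def downtime_minutes(pct, period_days):
    """Minutes of unavailability implied by an availability percentage."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - pct / 100)


def availability_pct(downtime_min, period_days):
    """Availability percentage implied by a number of minutes of downtime."""
    total_minutes = period_days * 24 * 60
    return 100 * (1 - downtime_min / total_minutes)


# Total unavailability implied by the reported 99.79% figure.
implied = downtime_minutes(99.79, QUARTER_DAYS)

# Availability if the 45-minute firewall outage had been the only interruption.
firewall_only = availability_pct(45, QUARTER_DAYS)

print(f"Implied downtime: {implied:.1f} minutes")
print(f"Availability with firewall outage only: {firewall_only:.3f}%")
```

Under the 92-day assumption, 99.79% corresponds to roughly 278 minutes of unavailability in total, of which the scheduled firewall outage accounts for 45.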

The support load was heavier than usual, mainly due to the rollout of the CREAM CE and configuration errors during setup. It would be good to have second-line support for APEL, as there is for the other middleware. The APEL Team should only be providing third-line support.

The planned production release of the new infrastructure slipped beyond the end of this quarter.

A test repository runs all the time to receive tests from other sites developing their software against SSM, the new STOMP and Python-based messaging layer, which runs on the production EGI Messaging Infrastructure. All of the other existing and new accounting services have tested using SSM except SGAS, where the developer has moved on and has not yet been replaced.

Portal: There were no operating problems. The VM was configured with 3 GB of memory (1 GB more than before) to avoid memory shortages during user decryption, caused by the ever-growing number of user records. The new ActiveMQ connector should be a definitive solution to this problem. The last release with the new codebase caused some early problems and regressions that were quickly fixed. After that, the number of user problems decreased sharply, and most tickets and mails are now feature requests.

Since the Portal runs on a VM, it can easily grow to accommodate more load, but currently the growth in data size seems reasonable.

3. Issues and Mitigation

Issue Description | Mitigation Description
SL4-5 migration requires all APEL servers to be down together for several days. | Downtime announced well in advance. Primed TPM to respond to sites who post tickets having missed the broadcast.

4. Plans for the next period

Start a production service receiving summaries from other accounting services over SSM and joining them with the old summary system. Once all systems have migrated, the old database will be migrated to a new one and the old clients piped into that. This second step is likely to be in Q8. After the first step we will be ready to receive prototype records from storage and cloud infrastructures.

Portal: Update the operating system on the production server to SL5, and to SL6 at some future date.