The wiki is deprecated and due to be decommissioned by the end of September 2022.
The content is being migrated to other platforms; new updates will be ignored and lost.
If needed, you can contact the EGI SDIS team at operations @ egi.eu.

Difference between revisions of "KEDB"

Line 16: Line 16:
  For these cases, the references are not provided in the Central database for Known Errors and no other central source of information, as aligning the original source and the central replication of information would lead to synchronisation issues and useless information maintenance overload. Instead, references to direct sources are provided below the table.

- {| width="100%" align="left" cellspacing="1" cellpadding="10" border="1"
+ {| width="100%" cellspacing="1" cellpadding="10" border="1" align="left"
  |+ Known Error DataBase
  |-
Line 22: Line 22:
  ! scope="col" | ID
  ! scope="col" | Title
  ! scope="col" | Services affected<br>
  ! scope="col" | Entities impacted
  ! scope="col" | Description
  ! scope="col" | References
  ! scope="col" | Mitigations
  ! scope="col" | Workaround (availability/description)
  |-
Line 32: Line 32:
  | 1
  | umd-release-3.0.1 rpm has stopped working, preventing new installation of the UMD3 middleware<br>
  | UMD 3.14.2<br>
  | All services based on UMD3
  | umd-release packaged, used for adding the UMD3 repository to an existing installation, is not working anymore
  | Ticket: https://ggus.eu/?mode=ticket_info&amp;ticket_id=122424
Line 49: Line 49:
  *detailed postmortem [https://twiki.cern.ch/twiki/pub/LCG/WLCGDailyMeetingsWeek161010/post-mortem-CNAF-CE-Problem-Sept-2016.pdf attached] at [https://twiki.cern.ch/twiki/bin/view/LCG/WLCGDailyMeetingsWeek161010 WLCG meeting agenda, reported by CNAF] <br>

- | <div>RCs should: </div>
+ | <div>In order to minimize impact of disruptions by upgrades, RCs should: </div>
  *use minimal installation of the OS in order to minimize conflicts<br>
- *not upgrade all the services at the same time<br>
+ *not upgrade all the services at the same time (if there are several instances running for the same service)<br>
- *not upgrade all the services automatically as the staged-rollout cannot contemplate all the possible configurations of the underlying OS/MW<br>
+ *not upgrade all the services automatically<br>
- <div><br></div>
+
  | remove dracut-fips packagebefore upgrading<br>
  |}

Revision as of 13:15, 17 October 2016

This page provides a central database for Known Errors: problems or issues whose underlying cause has already been identified. Known errors are shared on the EGI wiki for the following reasons:

  • known errors are tracked here for use by the HelpDesk team, especially 1st and 2nd line support, so that known issues can be referenced in new GGUS tickets reporting correlated incidents
  • the HelpDesk team can suggest new known errors, add them to the shared KEDB, and make use of them during shifts
  • users, VO members, and RC/OC operators can be referred to known errors as long as they are in use
  • workarounds can be referenced/reported together with each known error (the known error will be included in this page even if no workarounds are available yet)
  • an ID is provided so that the known error can be identified easily
  • a quick reference to an incident is always associated with a known error, so that readers can switch to a concrete example of an incident related to the known error
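
As an illustration only, the fields described in the list above can be modelled as a small record type. The sketch below is not part of the KEDB itself: the class name `KnownError`, its field names, and the helper `find_by_id` are assumptions made for this example, and the sample entry reuses the data of known error 1 from the table on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KnownError:
    """Hypothetical in-memory model of one KEDB entry (field names are assumptions)."""
    error_id: int                      # ID used to identify the known error easily
    title: str
    incident_reference: str            # concrete incident, e.g. a GGUS ticket URL
    workaround: Optional[str] = None   # optional: entries are listed even without one


def find_by_id(entries, error_id):
    """Return the entry with the given ID, or None if the ID is unknown."""
    for entry in entries:
        if entry.error_id == error_id:
            return entry
    return None


# Sample data taken from known error 1 in the table on this page.
kedb = [
    KnownError(
        error_id=1,
        title="umd-release-3.0.1 rpm has stopped working",
        incident_reference="https://ggus.eu/?mode=ticket_info&ticket_id=122424",
        workaround='yum can be invoked with "--nogpgcheck"',
    ),
]

entry = find_by_id(kedb, 1)
print(f"Known error {entry.error_id}: {entry.title} (see {entry.incident_reference})")
```

A support shifter could use such a lookup to quote the ID and the reference incident when answering a new ticket; an unknown ID simply yields `None`.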

Known errors and workarounds are also provided:

  • by Technology Providers in the Release Notes of their products
  • by EGI service providers in the documentation provided for their services
  • by the UMD team in the Release Notes of a specific software release

For these cases, references are not replicated in the central database for Known Errors or in any other central source of information, since keeping the original source and a central copy aligned would cause synchronisation issues and needless maintenance overhead. Instead, references to the direct sources are provided below the table.

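{| width="100%" cellspacing="1" cellpadding="10" border="1" align="left"
|+ Known Error DataBase
|-
! scope="col" | Creation date (YYYY/MM/DD)
! scope="col" | ID
! scope="col" | Title
! scope="col" | Services affected
! scope="col" | Entities impacted
! scope="col" | Description
! scope="col" | References
! scope="col" | Mitigations
! scope="col" | Workaround (availability/description)
|-
| 2016-06-29
| 1
| umd-release-3.0.1 rpm has stopped working, preventing new installation of the UMD3 middleware
| UMD 3.14.2
| All services based on UMD3
| The umd-release package, used for adding the UMD3 repository to an existing installation, is not working anymore
| Ticket: https://ggus.eu/?mode=ticket_info&ticket_id=122424
|
| yum can be invoked with the "--nogpgcheck" option
|-
| 2016-10-10
| 2
| canL upgrade of UMD 3.14.4 and UMD 4.2.1 can break proxy renewal on CREAM
| CREAM
| All VOs using the upgraded CREAM
| CREAM services can stop submitting jobs using proxy renewal
|
*detailed postmortem [https://twiki.cern.ch/twiki/pub/LCG/WLCGDailyMeetingsWeek161010/post-mortem-CNAF-CE-Problem-Sept-2016.pdf attached] at [https://twiki.cern.ch/twiki/bin/view/LCG/WLCGDailyMeetingsWeek161010 WLCG meeting agenda, reported by CNAF]
| In order to minimize the impact of disruptions caused by upgrades, RCs should:
*use a minimal installation of the OS in order to minimize conflicts
*not upgrade all the services at the same time (if there are several instances running for the same service)
*not upgrade all the services automatically
| remove the dracut-fips package before upgrading
|}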
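
The mitigation for known error 2 (do not upgrade all instances of a service at the same time) can be sketched as a simple planning step. The following is a minimal illustration under stated assumptions, not an EGI tool: the function name `plan_upgrade_waves`, the inventory layout, and the host names are all hypothetical. It only computes the order of upgrades; each wave is meant to be started and checked manually by an operator, in line with the advice not to upgrade automatically.

```python
def plan_upgrade_waves(instances_by_service):
    """Split hosts into upgrade waves.

    Wave k contains the k-th instance of every service that has one, so a
    service with several instances never has all of them upgraded in the
    same wave. A service with a single instance is upgraded in the first wave.
    """
    longest = max((len(hosts) for hosts in instances_by_service.values()), default=0)
    waves = []
    for k in range(longest):
        wave = [hosts[k] for hosts in instances_by_service.values() if k < len(hosts)]
        waves.append(wave)
    return waves


# Hypothetical inventory: two CREAM instances and one instance of another service.
inventory = {
    "CREAM": ["ce01.example.org", "ce02.example.org"],
    "service-b": ["b01.example.org"],
}

for number, wave in enumerate(plan_upgrade_waves(inventory), start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

With this inventory the second CREAM instance stays on the old version until the first wave has been verified, so a broken upgrade affects only part of the service.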