Difference between revisions of "APEL/MessageFormat"
Revision as of 16:13, 15 July 2011
APEL Message Format
This page describes a new message format for transferring data between the APEL clients and the server.
Terminology:
- A message is one file which is sent and received by the SSM. A message usually contains a number of records (e.g. 1000).
- A record corresponds to one row in the database. It contains a number of key-value pairs, as specified by the tables below.
- The header in each message tells the server which type of records are in that message. You need one header per message, so one header per file.
Job Records
A message is one file. It can contain multiple records. Different records must be separated by the end of record marker (%%).
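The layout described above (one header, then records closed by `%%`) can be read with a few lines of Python. This is only a minimal sketch, not the APEL server's parser: the function name `parse_message` is hypothetical, and it assumes each key-value pair sits on its own line in the form `Key: value`.

```python
def parse_message(text):
    """Split one message into (message_type, version, records)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty message")
    # The first non-blank line is the header, e.g. "APEL-individual-job-message: v0.1".
    message_type, _, version = lines[0].partition(":")
    records, current = [], {}
    for line in lines[1:]:
        if line == "%%":  # end-of-record marker
            if current:
                records.append(current)
            current = {}
            continue
        # Split on the first colon only: values such as SubmitHost contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a key-value pair: {line!r}")
        current[key.strip()] = value.strip()
    if current:
        # Assumption: a record without a closing %% is treated as an error.
        raise ValueError("last record is not terminated by %%")
    return message_type.strip(), version.strip(), records
```

All values are returned as strings; converting fields such as WallDuration to int is left to the caller, because the expected type depends on the record type named in the header.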
Description
Header: APEL-individual-job-message: v0.1
The header only appears once at the top of each message (that is once at the top of each file). It defines the type of record and the schema version.
Key | Value | Description | Mandatory |
---|---|---|---|
Site | String | GOCDB sitename | Yes |
SubmitHost | String | Head node where the job was submitted | Yes |
LocalJobID | String | Batch System Job ID | Yes |
LocalUserID | String | Local username | |
GlobalUserName | String | User's X509 DN | |
UserFQAN | String | User's VOMS attributes | |
WallDuration | int | Wallclock time for the job (seconds) | Yes |
CpuDuration | int | CPU time for the job (seconds) | Yes |
Processors | int | Number of processors | |
NodeCount | int | Number of nodes | |
StartTime | int | Start time of the job (epoch) | Yes |
EndTime | int | Stop time of the job (epoch) | Yes |
MemoryReal | int | Memory consumed by job (kbytes) | |
MemoryVirtual | int | Virtual memory consumed by job (kbytes) | |
ScalingFactorUnit | String | HepSpec, SpecInt or custom | Yes |
ScalingFactor | double | Value of either HepSpec, SpecInt or custom | Yes |
End of record: %%
Notes:
If the ScalingFactorUnit and ScalingFactor values are not available, they should be set to:
ScalingFactorUnit = 'custom'
ScalingFactor = 1
If GlobalUserName or UserFQAN is not published, the value for that field on the server will be set to 'None'.
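The defaults in these notes can be sketched on the client side as follows. The function names and the `MANDATORY_JOB_KEYS` tuple are hypothetical, derived only from the "Mandatory" column of the table above; `apply_server_defaults` merely illustrates the documented 'None' behaviour and is not the server's actual code.

```python
# Keys marked "Yes" in the Mandatory column of the job record table.
MANDATORY_JOB_KEYS = (
    "Site", "SubmitHost", "LocalJobID", "WallDuration", "CpuDuration",
    "StartTime", "EndTime", "ScalingFactorUnit", "ScalingFactor",
)


def complete_job_record(record):
    """Return a copy of record with the scaling-factor default applied."""
    completed = dict(record)
    # Without a known scaling factor, publish 'custom' and 1.
    if "ScalingFactorUnit" not in completed or "ScalingFactor" not in completed:
        completed["ScalingFactorUnit"] = "custom"
        completed["ScalingFactor"] = 1
    missing = [key for key in MANDATORY_JOB_KEYS if key not in completed]
    if missing:
        raise ValueError("missing mandatory keys: " + ", ".join(missing))
    return completed


def apply_server_defaults(record):
    """Illustrate the stored values when user details are not published."""
    stored = dict(record)
    for key in ("GlobalUserName", "UserFQAN"):
        stored.setdefault(key, "None")
    return stored
```

Both functions return copies so the caller's original record is left unchanged.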
Example Message
APEL-individual-job-message: v0.1
Site: RAL-LCG2
SubmitHost: ce01.ncg.ingrid.pt:2119/jobmanager-lcgsge-atlasgrid
LocalJobID: 31564872
LocalUserID: atlasprd019
GlobalUserName: /C=whatever/D=someDN
UserFQAN: /voname/Role=NULL/Capability=NULL
WallDuration: 234256
CpuDuration: 2345
Processors: 2
NodeCount: 2
StartTime: 1234567890
EndTime: 1234567899
MemoryReal: 1000
MemoryVirtual: 2000
ScalingFactorUnit: SpecInt2000
ScalingFactor: 1000
%%
...another job record...
%%
...
%%
Summary Job Records
Description
Header: APEL-summary-job-message: v0.1
The header only appears once at the top of each message. It defines the type of record and the schema version.
Key | Value | Description | Mandatory |
---|---|---|---|
Site | String | GOCDB sitename | Yes |
Month | int | Month of summary | Yes |
Year | int | Year of summary | Yes |
GlobalUserName | String | User's X509 DN | |
VO | String | User's VO | |
Group | String | User's VOMS group | |
Role | String | User's VOMS role | |
EarliestEndTime | int | End time of the first job to finish in the month (epoch) | |
LatestEndTime | int | End time of the last job to finish in the month (epoch) | |
WallDuration | int | Wall clock time for the job | Yes |
CpuDuration | int | CPU time for the job | Yes |
NormalisedCpuDuration | int | Normalised CPU time for the job | Yes |
NormalisedWallDuration | int | Normalised Wall clock time for the job | Yes |
NumberOfJobs | int | Total number of jobs | Yes |
End of record: %%
Notes:
If GlobalUserName, VO, Role or Group is not published, the value for that field on the server will be set to 'None'.
Each job record must be counted in only one summary record, to avoid duplication of data.
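One way to guarantee that no job is counted twice is to assign every job to exactly one grouping key and then aggregate per key. The sketch below is an illustration under several assumptions that the text above does not state: jobs arrive as dicts that already carry VO, Group and Role; a job belongs to the month of its EndTime in UTC; and EarliestEndTime and LatestEndTime are the smallest and largest EndTime in the group, as the key names suggest. The normalised durations are left out because their formula is not defined here. `summarise_jobs` is a hypothetical name.

```python
from collections import defaultdict
from datetime import datetime, timezone

OPTIONAL_KEYS = ("GlobalUserName", "VO", "Group", "Role")


def summarise_jobs(jobs):
    """Aggregate job dicts into summary dicts, one per grouping key."""
    groups = defaultdict(list)
    for job in jobs:
        # Assumption: the month is taken from EndTime, interpreted as UTC.
        end = datetime.fromtimestamp(job["EndTime"], tz=timezone.utc)
        key = (job["Site"], end.month, end.year) + tuple(
            job.get(name) for name in OPTIONAL_KEYS
        )
        groups[key].append(job)  # each job lands in exactly one group

    summaries = []
    for key, members in groups.items():
        summary = {"Site": key[0], "Month": key[1], "Year": key[2]}
        for name, value in zip(OPTIONAL_KEYS, key[3:]):
            if value is not None:  # unpublished fields are simply omitted
                summary[name] = value
        end_times = [job["EndTime"] for job in members]
        summary["EarliestEndTime"] = min(end_times)
        summary["LatestEndTime"] = max(end_times)
        summary["WallDuration"] = sum(job["WallDuration"] for job in members)
        summary["CpuDuration"] = sum(job["CpuDuration"] for job in members)
        summary["NumberOfJobs"] = len(members)
        summaries.append(summary)
    return summaries
```

Because every job is appended to a single group, the NumberOfJobs values across all returned summaries add up to the number of input jobs.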
Example Message
APEL-summary-job-message: v0.1
Site: RAL-LCG2
Month: 3
Year: 2010
GlobalUserName: /C=whatever/D=someDN
VO: atlas
Group: /atlas
Role: Role=production
EarliestEndTime: 1267527463
LatestEndTime: 1269773863
WallDuration: 23425
CpuDuration: 2345
NormalisedCpuDuration: 2500
NormalisedWallDuration: 244435
NumberOfJobs: 100
%%
...another summary job record...
%%
...
%%
Summary Sync Records
Summary sync records are used to create the apel-sync Nagios test. They give the central APEL server a way to know how many records each site is storing locally.
Description
Header: APEL-sync-message: v0.1
The header only appears once at the top of each message. It defines the type of record and the schema version.
Key | Value | Description | Mandatory |
---|---|---|---|
Site | String | GOCDB sitename | Yes |
NJobs | int | Total number of jobs for that month | Yes |
NDays | int | Number of days between earliest and latest job in month | Yes |
Month | int | Month | Yes |
Year | int | Year | Yes |
End of record: %%
Notes:
Each record indicates the number of jobs run on the site per month. This data is used to create the Nagios apel-sync test.
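A sync message could be assembled from per-month counts along these lines. This is a sketch only: `format_sync_message` is a hypothetical helper, it writes one key-value pair per line, and it takes NDays as an input because the exact counting convention is not spelled out here.

```python
SYNC_HEADER = "APEL-sync-message: v0.1"


def format_sync_message(site, monthly_counts):
    """Build a sync message from (year, month, njobs, ndays) tuples."""
    lines = [SYNC_HEADER]  # the header appears once, at the top
    for year, month, njobs, ndays in monthly_counts:
        lines.append(f"Site: {site}")
        lines.append(f"NJobs: {njobs}")
        lines.append(f"NDays: {ndays}")
        lines.append(f"Month: {month}")
        lines.append(f"Year: {year}")
        lines.append("%%")  # end-of-record marker
    return "\n".join(lines) + "\n"
```

For example, `format_sync_message("RAL-LCG2", [(2010, 1, 3479, 29)])` yields a message with a single record for January 2010.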
Example Message
APEL-sync-message: v0.1
Site: RAL-LCG2
NJobs: 3479
NDays: 29
Month: 1
Year: 2010
%%
...another sync record...
%%
...
%%