Application Accounting

From EGIWiki
Revision as of 16:40, 28 November 2012 by Perl

Motivation

To account for software usage, we want a sensor that provides the following data:

  • Which software/binary was called?
  • How long did the software run?
  • What was the return value?
  • If applicable: which signal forced the termination (SIGINT/SIGKILL/...)?

Repository

The software can be found on github.com/hperl/app-accounting.

Technique

To account for software usage, we need to intercept every call that starts a new process. Bash and most other shells use the libc function "execve()" to execute a program. Library functions can be overridden by naming another library in the LD_PRELOAD environment variable, because the dynamic linker then resolves symbols from that library before libc. The library built from "acc_preload.c" replaces "execve()" with a version that performs the following steps:

  1. Fork the process and let the child execute the actual program
  2. Record the start time
  3. Wait for the child using "waitpid()", i.e. wait until the actual program ends
  4. Record the end time
  5. Construct an accounting record
  6. Send the accounting record to the server
  7. Special case: if the server does not respond, send the process to the background and wait until the server responds again
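
Steps 1 to 5 can be sketched in C roughly as follows. This is a minimal illustration, not the code from "acc_preload.c": the struct acc_record layout and the function name acc_run are hypothetical, the use of a monotonic clock is an assumption about how the run time might be measured, and steps 6 and 7 (sending the record and handling an unresponsive server) are left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical record layout; the real acc_preload.c may differ. */
struct acc_record {
    char   path[256];     /* which binary was called */
    double runtime_sec;   /* how long it ran */
    int    exit_code;     /* return value, or -1 if a signal ended it */
    int    term_signal;   /* terminating signal, or 0 on a normal exit */
};

/* Run `path` in a child process and fill `rec`.
 * Returns 0 on success, -1 if fork() or waitpid() failed. */
int acc_run(const char *path, char *const argv[], char *const envp[],
            struct acc_record *rec)
{
    struct timespec start, end;
    int status;

    assert(path != NULL && rec != NULL);

    pid_t pid = fork();                        /* step 1: fork */
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, envp);              /* child runs the actual program */
        _exit(127);                            /* reached only if execve() failed */
    }

    clock_gettime(CLOCK_MONOTONIC, &start);    /* step 2: start time */
    while (waitpid(pid, &status, 0) < 0) {     /* step 3: wait for the child */
        if (errno != EINTR)
            return -1;                         /* retry only when interrupted */
    }
    clock_gettime(CLOCK_MONOTONIC, &end);      /* step 4: end time */

    /* step 5: construct the accounting record */
    snprintf(rec->path, sizeof rec->path, "%s", path);
    rec->runtime_sec = (double)(end.tv_sec - start.tv_sec)
                     + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
    rec->exit_code   = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    rec->term_signal = WIFSIGNALED(status) ? WTERMSIG(status) : 0;
    return 0;
}
```

Calling acc_run("/bin/sh", argv, envp, &rec) with argv set to {"sh", "-c", "exit 3", NULL} should leave rec.exit_code at 3 and rec.term_signal at 0. Inside a preloaded library the child would have to reach the original libc "execve()" rather than the replacement, and the parent would presumably exit with the child's status afterwards, since callers of "execve()" do not expect it to return on success.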

The accounting shell (acc_bash) is a short shell script that sets the LD_PRELOAD environment variable and starts bash:

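#!/bin/sh

# Preload the accounting library so that its execve() replaces libc's
# in bash and, through the inherited environment, in the programs it starts.
LD_PRELOAD=libacc_preload bash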

Design Decisions

Why client/server?

I assumed that the accounting shell (acc_bash) would run directly on the grid nodes. Mapping a process to a user would probably require additional information from the head nodes or the queueing node. Therefore the basic record is assembled on the grid node (client) and then sent to the server, which can add further information to the record and submit it to the accounting repository.

Why override a libc function?

It is a good tradeoff between working at a low level and keeping deployment easy. Directly intercepting the SYS_execve system call would have been even more elegant, but it would also require a kernel patch, and patching the kernel of every grid node is overkill. With the current method, deployment only requires the following:

  • Installation of a library (libacc_preload)
  • Installation of a binary (acc_bash)

The software is available as DEB and RPM packages in addition to a tarball.
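
For illustration, overriding a libc function through LD_PRELOAD usually follows the pattern sketched below. This is an assumption about the general technique, not an excerpt from "acc_preload.c": the helper name acc_real_execve and the log line are hypothetical, and the accounting logic itself is reduced to a placeholder. The library defines its own "execve()" with the libc signature and looks up the original through dlsym() with RTLD_NEXT.

```c
#define _GNU_SOURCE   /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <errno.h>
#include <stdio.h>

typedef int (*execve_fn)(const char *, char *const[], char *const[]);

/* Hypothetical helper: find the next definition of execve() in the
 * symbol search order after this object, which is normally libc's. */
execve_fn acc_real_execve(void)
{
    return (execve_fn)dlsym(RTLD_NEXT, "execve");
}

/* Same signature as libc's execve(). When this object comes first in the
 * search order (as a preloaded library does), calls resolve to it. */
int execve(const char *path, char *const argv[], char *const envp[])
{
    execve_fn real = acc_real_execve();
    if (real == NULL) {
        errno = ENOSYS;                          /* original not found */
        return -1;
    }
    fprintf(stderr, "acc: execve(%s)\n", path);  /* placeholder for accounting */
    return real(path, argv, envp);               /* hand over to libc */
}
```

Such a file could be built as a shared object with something like "gcc -shared -fPIC -o libacc_preload.so sketch.c -ldl" (the file names here are assumptions) and activated by putting the resulting file name into LD_PRELOAD. No kernel change is involved, which is the deployment advantage described above.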

Integration into APEL

To be discussed.