NGI CZ:DDM for auger

From EGIWiki
Revision as of 09:02, 24 May 2017 by Chudoba (talk | contribs) (Operational issues)

Distributed Data Management for VO auger

Files created on the grid are stored on a Storage Element (SE) and registered in the LFC. The production system also records whether the simulation job finished OK and whether the results are available. When files are consolidated to CC IN2P3, they are first copied to the SRM at Lyon and registered in the LFC under a different name, then copied to SRB and registered in SimDB. The system should be revisited in 2014.

Operational issues

  • 201705 - ELOG is available for Auger Distributed Computing Issues
  • 201607 - 16 files were declared lost at the praguelcg2 site. These files were found missing during a consistency check of the DPM DB against the content on the disk servers.
  • 201512 - Loss of 467 files reported at KIT. 286 files had another replica, 181 files were unique. The 181 unique files were deleted from the LFC.
  • 201407 - End of support at the Wuppertal site: all files stored there must be deleted

Bulk deletion tools

A Python script is used for bulk deletions in the LFC, typically when a site reports a loss of files or when a site is decommissioned. It takes as an argument either an SE name (if all files from a given SE should be deleted) or the name of a file containing the list of files to be deleted. It also supports a dry run, which deletes nothing and only reports what would be deleted. The script deletes almost 15 files per second (performance obtained from a deletion of 3M files from
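The interface described above can be sketched as follows. This is a minimal illustration and not the script itself: the option names -f, -s and -d are taken from the usage examples on this page (with -d assumed to be the dry-run switch), while build_parser, read_file_list, bulk_delete and delete_fn are hypothetical names. The actual LFC calls are left out and replaced by a callable supplied by the caller, and the "ERR ..." output format is an assumption chosen to fit the grep/sed example on this page.

```python
import argparse


def build_parser():
    """Command-line options modelled on the usage examples on this page."""
    parser = argparse.ArgumentParser(description="Bulk deletion sketch")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("-s", dest="se_name",
                       help="delete all files from this SE")
    group.add_argument("-f", dest="list_file",
                       help="file containing the list of files to delete")
    # Assumption: -d selects the dry run.
    parser.add_argument("-d", dest="dry_run", action="store_true",
                        help="only report what would be deleted")
    return parser


def read_file_list(path):
    """Return the non-empty lines of a list file (one entry per line is assumed)."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def bulk_delete(entries, delete_fn, dry_run=False):
    """Delete every entry with delete_fn; in a dry run only report.

    delete_fn is a hypothetical callable that removes one entry and raises
    an exception on failure. Returns the lists (deleted, failed).
    """
    deleted, failed = [], []
    for entry in entries:
        if dry_run:
            print("DRY %s" % entry)
            continue
        try:
            delete_fn(entry)
            deleted.append(entry)
        except Exception as exc:
            # A single failure (e.g. a wrong role) must not stop the whole run.
            print("ERR %s %s" % (exc, entry))
            failed.append(entry)
    return deleted, failed
```

Passing the deletion step in as a callable keeps the dry run and the error handling testable without access to an LFC server.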

Examples of usage:

nohup /usr/bin/time python /home/chudoba/lfc/ -f files_to_be_deleted.txt > lfc_deletion.`date +"%Y%m%d%H%M"`.log 2>&1 &
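The command is started with nohup in the background because large deletions run for a long time. A rough estimate from the figures quoted above (about 3M files at almost 15 files per second), assuming a constant rate; estimated_runtime_hours is a hypothetical helper used only for this calculation:

```python
def estimated_runtime_hours(n_files, files_per_second):
    """Wall-clock time in hours for n_files at a constant deletion rate."""
    return n_files / float(files_per_second) / 3600.0


# Figures quoted on this page: about 3M files at almost 15 files per second.
hours = estimated_runtime_hours(3000000, 15)
print("%.1f hours (%.1f days)" % (hours, hours / 24.0))
# prints "55.6 hours (2.3 days)"
```

Since the observed rate was slightly below 15 files per second, this is a lower bound on the real run time.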

The format of the file files_to_be_deleted.txt:


Note: I did a deletion with the auger production role. Several files were not deleted because they had been written with the sgm role, so the deletion script must be run a second time with the sgm role. Example of how to get a list of the files which were not deleted:

grep ERR lfc_deletion.201605231817.log > lfc1_deletion_list2
sed -i 's#ERR.*srm#srm#' lfc1_deletion_list2
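For reference, the same filtering can be sketched in Python. The function name failed_entries is hypothetical, and the log line in the usage example is invented for illustration; only the two facts used by the commands above are assumed, namely that failed deletions are logged on lines containing ERR and that the file entry starts with srm.

```python
import re


def failed_entries(log_lines):
    """Mimic `grep ERR` followed by `sed 's#ERR.*srm#srm#'`."""
    result = []
    for line in log_lines:
        if "ERR" not in line:
            continue
        # Greedy match, as in sed: drop everything from the first "ERR"
        # up to the last "srm" on the line, keeping that "srm".
        cleaned = re.sub(r"ERR.*srm", "srm", line, count=1)
        result.append(cleaned.rstrip("\n"))
    return result


# Invented log lines, for illustration only.
example = [
    "OK srm://se.example.org/path/file1\n",
    "ERR permission denied srm://se.example.org/path/file2\n",
]
print(failed_entries(example))
# prints ['srm://se.example.org/path/file2']
```

As with the sed expression, the match is greedy: if "srm" occurred again later in the line, everything up to that last occurrence would be removed.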

Note: I have not covered other sources of errors, because there were none.

Example of how to produce a list of files to be deleted based on the SE name:

/usr/bin/time python /home/chudoba/lfc/ -d -s > lfc1_deletion_wupp.`date +"%Y%m%d%H%M"`.log 2>&1

--Chudoba 08:51, 4 August 2016 (CEST)