DIRAC Data Management: consistency, integrity and coherence
Marianne Bargiotti, CERN

Marianne Bargiotti, CHEP07, Sep 2-7 2007, Victoria, BC

Outline
- DIRAC Data Management System (DMS)
- LHCb catalogues and physical storage
- DMS integrity checking
- Conclusions

DIRAC Data Management System
- The DIRAC project (Distributed Infrastructure with Remote Agent Control) is the LHCb workload and data management system.
  - The DIRAC architecture is based on services and agents (see A. Tsaregorodtsev, poster 189).
- The DIRAC Data Management System deals with three components:
  - File Catalogue: records where files are stored.
  - Bookkeeping Meta Data DB (BK): records what the files contain.
  - Storage Elements: the underlying Grid Storage Elements (SE) where files are stored.
- Consistency between these catalogues and the Storage Elements is fundamental for reliable data management (see A.C. Smith, poster 195).

LHCb File Catalogue
- LHCb choice: LCG File Catalogue (LFC).
  - It allows registering and retrieving the location of physical replicas in the grid infrastructure.
  - It stores file information (LFN, size, GUID) and replica information.
- The DIRAC WMS uses LFC information to decide where jobs can be scheduled.
  - It is therefore fundamental to avoid any inconsistency, both with the storages and with the related catalogues (BK Meta Data DB).
- Baseline choice for DIRAC: a central LFC.
  - One single master (R/W) and many read-only mirrors.
  - Coherence is ensured by the single write endpoint.

Registration of replicas
- GUID check: before registration in the LCG File Catalogue, at the beginning of the transfer phase, the existence of the GUID of the file to be transferred is checked.
  - This avoids GUID mismatch problems at registration.
- After a successful transfer, LFC registration of a file is divided into 2 atomic operations:
  - booking of the metadata fields, with the insertion of LFN, GUID and size in the dedicated table;
  - replica registration.

LHCb Bookkeeping Meta Data DB
- The Bookkeeping (BK) is the system that stores data provenance information.
- It contains information about jobs and files and their relations:
  - Job: application name, application version, application parameters, which files it has generated, etc.
  - File: size, event, filename, GUID, from which job it was generated, etc.
- The Bookkeeping DB is the main gateway for users to select the available data and datasets.
- All data visible to users are flagged as "has replica".
  - All data stored in the BK and flagged as having a replica must be correctly registered and available in the LFC.

Storage Elements
- DIRAC Storage Element client:
  - provides uniform access to Grid Storage Elements;
  - is implemented with plug-in modules for access protocols (SRM, GridFTP, bbftp, sftp and http are supported).
- SRM is the standard interface to grid storage.
- LHCb has 14 SRM endpoints:
  - disk and tape storage for each T1 site.
- SRM will allow browsing the storage namespace (since SRM v2.2).
- Functionalities are exposed to users through the GFAL library API.
  - The Python binding of the GFAL library is used to develop the DIRAC tools.

Data integrity checks
- Considering the high number of interactions among DM system components, integrity checking is part of the DIRAC Data Management System.

- Two ways of performing checks:
  - those running as agents within the DIRAC framework;
  - those launched by the Data Manager to address specific situations.
- The agent type of checks can be broken into two further distinct types:
  - those based solely on the information found on SE/LFC/BK;
  - those based on a priori knowledge of where files should exist according to the computing model (e.g. DSTs always present on disk at all T1s).

DMS integrity agents overview
- The complete suite for integrity checking includes an assortment of agents.

Data integrity checks & DM console
- The Data Management Console is the interface for the Data Manager.
  - The DM Console allows data integrity checks to be launched.
- The development of these tools has been driven by experience:
  - many catalogue operations (fixes):
    - bulk extraction of replica information;
    - deletion of replicas according to sites;
    - extraction of replicas through LFC directories;
    - change of a replica's SE name in the catalogue;
    - creation of bulk transfer/removal jobs.

BK-LFC consistency agent
- Main problem affecting the BK: many LFNs registered in the BK failed to be registered on the LFC.
  - Missing files in the LFC: users trying to select LFNs in the BK cannot find any replica in the LFC.
  - Possible causes: registration on the LFC failing because of a failed copy or a temporary lack of service.
- BK -> LFC: performs a massive check on productions:
  - checks BK dumps of different productions against the same directories on the LFC;
  - for each production:
    - checks the existence of each BK entry in the LFC;
    - checks file sizes;
    - reports missing or problematic files to the IntegrityDB.
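The BK -> LFC check described above can be pictured with a minimal sketch. It assumes that a BK dump and an LFC directory listing have already been reduced to plain {LFN: size} dictionaries; the function name and the problem labels are hypothetical and are not the DIRAC API.

```python
def check_bk_against_lfc(bk_dump, lfc_listing):
    """Compare a BK dump with an LFC listing, both given as {lfn: size}.

    Returns a list of (lfn, problem) tuples, standing in for the
    records an agent would report to the IntegrityDB.
    """
    problems = []
    for lfn, bk_size in sorted(bk_dump.items()):
        if lfn not in lfc_listing:
            # Entry known to the BK but never registered in the LFC.
            problems.append((lfn, "missing in LFC"))
        elif lfc_listing[lfn] != bk_size:
            # Entry registered, but the two catalogues disagree on size.
            problems.append((lfn, "size mismatch"))
    return problems
```

A real agent would loop over productions and write the returned records to the IntegrityDB instead of returning them.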

LFC pathologies
- Many different inconsistencies can arise in a complex computing model:
  - file metadata registered on the LFC but size information missing (set to 0);
  - missing replica field in the replica information table on the DB (bugs from an old DIRAC version, now fixed);
  - wrong SA path, e.g. srm:/gridka-dcache.fzk.de:8443/castor/cern.ch/grid/lhcb/production/dc06/v1-lumi2/00001354/digi/0000/00001354_00000027_9.digi registered for gridka-tape;
  - cern_castor: wrong information in the LHCb Configuration Service;
  - wrong protocol: sfn, rfio, bbftp;
  - blank spaces in the SURL path, carriage returns, presence of the port number in the SURL path.
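Some of these pathologies can be detected from the catalogue entry alone. The sketch below is only an illustration: it assumes that replicas are expected to be registered with the srm protocol and without a port number, and the function name, the labels and the host used in the usage example are hypothetical.

```python
import re

# Assumption for this sketch: replicas should be registered as srm SURLs.
EXPECTED_PROTOCOL = "srm"


def surl_pathologies(surl, size):
    """Return the list of pathologies found for one replica entry."""
    found = []
    if size == 0:
        found.append("zero size")
    # Blank spaces, tabs and carriage returns anywhere in the SURL.
    if re.search(r"\s", surl):
        found.append("whitespace in SURL")
    cleaned = surl.strip()
    # The protocol is whatever precedes the first colon.
    if cleaned.split(":", 1)[0] != EXPECTED_PROTOCOL:
        found.append("wrong protocol")
    # A colon followed by digits and then a slash (or the end) is a port.
    if re.search(r":\d+(/|$)", cleaned):
        found.append("port number in SURL")
    return found
```

For example, `surl_pathologies("srm://se.example.org:8443/lhcb/file.digi", 0)` reports both the zero size and the port number. The port test is a plain pattern match, so a path that itself contains a colon followed by digits would be flagged as well.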

LFC-SE consistency agent
- LFC replicas need perfect coherence with storage replicas in path, protocol and size:
  - replication issues: check whether the LFC replicas are really resident on the physical storages (check the existence and the size of files);
  - registration issues: the LFC-SE agent stores problematic files in the central IntegrityDB according to the different pathologies.

SE-LFC consistency agent
- Checks the SE contents against the LCG File Catalogue:
  - lists the contents of the SE;
  - checks the catalogue for the corresponding replicas;
  - an efficient storage interface for bulk metadata queries (directory listings) is missing: it is not possible to list the content of remote directories and get the associated metadata (lcg-ls).

Storage usage agent
- Using the registered replicas and their sizes on the LFC, this agent constructs an exhaustive picture of current LHCb storage usage:
  - works through a breakdown by directories;
  - loops over the LFC, extracting file sizes for the different storages;
  - stores the information in the central IntegrityDB;
  - produces a full picture of disk and tape occupancy on each storage;
  - provides an up-to-date picture of LHCb's usage of resources in almost real time.
- Foreseen development: using the LFC accounting interface to have a global picture per site.

Data integrity agent
- The integrity agent takes action on a wide range of pathologies stored by the agents in the IntegrityDB.
- Actions taken:
  - LFC-SE: in case of a missing replica on the LFC:
    - produce SURL paths starting from the LFN, according to the DIRAC Configuration System, for all the defined Storage Elements;
    - extensive search throughout all T1 SEs;
    - if the search is successful, registration of the missing replicas;
    - the same action is taken in case of zero-size files, wrong SA path, etc.
  - BK-LFC: if the file is not present on the LFC:
    - extensive search performed on all SEs;
    - if the file is not found anywhere: removal of the "has replica" flag, so that it is no longer visible to users;
    - if the file is found: update of the LFC with the missing file information extracted from the storages.
  - SE-LFC: files missing from the catalogue can be:
    - registered in the catalogue if the LFN is present;
    - deleted from the SE if the LFN is missing from the catalogue.

Prevention of inconsistencies
- Each operation that can fail is wrapped in an XML record as a request, which can be stored in a Request DB.
- Request DBs sit on one of the LHCb VO boxes, which ensures that these records will never be lost.
- These requests are executed by dedicated agents running on the VO boxes, and are retried as many times as needed until they succeed.
- Examples: file registration operation, data transfer operation, BK registration.
- Checks on file transfers are based on file size or checksum, etc.

Conclusions
- The integrity checks suite is an important part of the data management activity.
- Further development will be possible with SRM v2 (SE vs LFC agent).
- Most effort now goes into the prevention of inconsistencies (checksums, failover mechanisms).

Backup

DIRAC architecture
- DIRAC (Distributed Infrastructure with Remote Agent Control) is LHCb's grid project.
- The DIRAC architecture is split into three main component types:
  - Services: independent functionalities deployed and administered centrally on machines accessible by all other DIRAC components.
  - Resources: grid compute and storage resources at remote sites.
  - Agents: lightweight software components that request jobs from the central services for a specific purpose.
- The DIRAC Data Management System is made up of an assortment of these
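As an illustration of the agent pattern, the request mechanism from the "Prevention of inconsistencies" slide can be sketched as follows. This is a simplified picture under stated assumptions: the Request DB is modelled as an in-memory list, and the XML element and attribute names, like the function names, are invented for the example and are not the DIRAC request schema.

```python
import xml.etree.ElementTree as ET


def make_request(operation, lfn):
    """Wrap an operation in an XML record (hypothetical element/attribute names)."""
    record = ET.Element("request", operation=operation, lfn=lfn)
    return ET.tostring(record, encoding="unicode")


def run_agent_cycle(request_db, execute):
    """Try every stored request once; return the ones that must be retried.

    `request_db` is a list of XML records and `execute` is a callable
    (operation, lfn) that raises an exception when the operation fails.
    """
    still_pending = []
    for xml_record in request_db:
        record = ET.fromstring(xml_record)
        try:
            execute(record.get("operation"), record.get("lfn"))
        except Exception:
            # Keep the record so that the next cycle retries it.
            still_pending.append(xml_record)
    return still_pending
```

Calling `run_agent_cycle` repeatedly, each time on the list returned by the previous call, retries a failed request until it succeeds; a persistent store would take the place of the list in a real deployment.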
