北郵-數(shù)據(jù)庫系統(tǒng)原理(英文)-12-15_第1頁
北郵-數(shù)據(jù)庫系統(tǒng)原理(英文)-12-15_第2頁
北郵-數(shù)據(jù)庫系統(tǒng)原理(英文)-12-15_第3頁
北郵-數(shù)據(jù)庫系統(tǒng)原理(英文)-12-15_第4頁
北郵-數(shù)據(jù)庫系統(tǒng)原理(英文)-12-15_第5頁
已閱讀5頁,還剩33頁未讀, 繼續(xù)免費閱讀

下載本文檔

版權說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權,請進行舉報或認領

文檔簡介

1、PART 4DATA STORAGE AND QUERYChapter 12 Indexing and HashingDatabase System Concepts - Chapter 12 Indexing and Hashing -3Contents in This ChapternBasic concepts about and classification of indexingn ordered indices, hash indicesnProperties/types of ordered indicesn primary/clustering indices, seconda

2、ry/non-clustering indicesn dense indices, sparse indicesn single-level indices, multi-level indices (e.g. B+-tree, B-tree)nHash indicesn hash functionsn static hash, dynamic hashDatabase System Concepts - Chapter 12 Indexing and Hashing -4目目 標!標!n學會準確建立、管理索引,提高數(shù)據(jù)訪問速學會準確建立、管理索引,提高數(shù)據(jù)訪問速度度n 索引類型索引類型n s

3、elect、insert、update時的索引管理時的索引管理Database System Concepts - Chapter 12 Indexing and Hashing -512.1 Basic ConceptsnHow to locate records in DB file quickly?nIndexing (索引技術)nmechanisms used to speed up access to desired datane.g. for relation account(account-number, branch-name, balance) shown in Fig.12

4、.1, the index branch-name physical address of record (i.e.tuple) in DB file accountnSearch Keynattributes or a set of attributes used to look up the records in a filene.g. branch-name A-217A-110A-101A-215A-201A-218A-102A-222A-305Fig. 12.1 DB indexed file account and its index fileNote: the file acco

5、unt is logically a sequential file, but its records may be stored non-contiguously or non-ordered on the disklogical file accountphysical file account index file(account-number, branch-name, balance) indexed fileDatabase System Concepts - Chapter 12 Indexing and Hashing -712.1 Basic Concepts (cont.)

6、nIndexing nmapping from search-key to storage locations of the file records, i.e. search-key storage locations of the records in disks搜索鍵(search key)索引文件(index file)數(shù)據(jù)文件(主文件、被索引文件indexed file)s1s2s3.si.sj.sn散列函數(shù)Hs1s2s3.si.sj.sna3a2a1aiajaman-1anb3b2b1bibjbmbn-1bns3s2s1sisjsmsn-1snd3d2d1didjdmdn-1dn

7、r(R), R (A, B, S, . , , D)h(si)h(sn)h(s1)索引項index entrys1s2s3sism圖 12.0.1 索引技術(indexing)及其分類ordered indiceshash indices搜索鍵(search key)Database System Concepts - Chapter 12 Indexing and Hashing -9nTwo basic kinds of indices:nordered indices: the index file is used to store the index entries in which

8、the search key of the records and the address of the records are stored in sorted ordernhash indices: the “hash function” is used to map the the search key of the records to the address of the records nthe records are stored in the “buckets”, the number of the bucket is as the address of the records

9、 and is determined by the hash function12.1 Basic Concepts (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -10nDBS files with indexing mechanism include two parts :nindexed file, in which data records are storednindex file, in which index entries are included ne.g. Fig.12.1nThe ind

10、exed file can be organized asn sequential file n heap filen hash filen clustering file 12.2 Ordered IndicesDatabase System Concepts - Chapter 12 Indexing and Hashing -1112.1 Basic Concepts (cont.)nIndex filen set of index entries of the form nIndex files are typically much smaller than the original

11、file search-keylocationDatabase System Concepts - Chapter 12 Indexing and Hashing -12nOrdered index (in index file) nthe index entries in the index file are stored in some sorted order, for instance, in accordance with the order of the search key/* 索引項的排列順序與(被索引文件中)搜索鍵的排列順序一致n e.g. in Fig.12.1( ) ,

12、the index file is sorted by branch-name 12.2 Ordered Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -13nPrimary/clustering index (in index file)nconsidering a index file and its corresponding indexed file. nif the indexed file is a sequential file, and the search key in in

13、dex file also specifies the sequential order of the indexed filen/* 索引文件的搜索鍵所規(guī)定的順序與被索引的順序文件中的紀錄順序一致ne.g. in Fig.12.1 , (branch-name, address of records) defines the same orders of the records as that in sequential indexed file accountnnote: also called clustering index12.2-I Primary and Secondary In

14、dices Database System Concepts - Chapter 12 Indexing and Hashing -14nThe search key of a primary index is usually but not necessarily the primary keyn e.g. in Fig. 12.1, branch-name is not the primary key of account12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Ind

15、exing and Hashing -15n聚集索引 vs 主索引 在實際數(shù)據(jù)庫系統(tǒng)中,如SQL Server、DB2等,n聚集索引: 指索引文件的搜索鍵所規(guī)定的順序與被索引的順序文件中的紀錄順序一致n主索引: 指建立在主鍵(主屬性上)的索引n當一個表定義了主鍵后,DBMS會自動為該表在主鍵上建立聚集索引,該索引同時又是主索引12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -16n一個表上只能建立一個聚集索引,也只能有一個主索引。但可以建

16、立多個非聚集索引n如果一個表上沒有定義主鍵,則不會有主索引;但可以為該表建立一個聚集索引n主索引一定是聚集索引,但聚集索引不一定是主索引主索引一定是聚集索引,但聚集索引不一定是主索引12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -17n實驗:對比分析n 在沒有定義主鍵的關系表中插入數(shù)據(jù): 在關系表所在數(shù)據(jù)庫文件中,各個元組/行/記錄順序一般是無序,取決于數(shù)據(jù)插入順序heap文件n 在有主鍵的關系表中插入數(shù)據(jù): 數(shù)據(jù)庫文件是有序的,按照主

17、鍵順序排列12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -18nSecondary indexn an index whose search key specifies an order different from the sequential order of the file. nalso called non-clustering indexn e.g. Fig.12.5 , secondary index on balance

18、 field of accountnSecondary indices have to be dense indicesnIndex-sequential file (索引順序文件)nordered sequential file with a primary/clustering index on the search key n e.g. Fig.12.1 12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -19Fig.12.5 Sec

19、ondary Index on balance field of account12.2-I Primary and Secondary Indices (cont.)Database System Concepts - Chapter 12 Indexing and Hashing -20n Dense index n the index record in the index file appears for every search-key value in the indexed fileneach value of search-key in the indexed file cor

20、responds to an index entry in the index file n e.g. Fig.12.1 n In a dense primary index, the index record contains the search-key value and a pointer to the first record with that search-key valuen the rest of the records with the same search-key value would be stored sequentially after the first re

21、cordne.g. Fig.12.1, search-key value “Perryridge” corresponds to three records in the file12.2-II Dense and Sparse IndicesDatabase System Concepts - Chapter 12 Indexing and Hashing -21nSparse Indexn index file contains index entries for only some search-key values in the indexed filene.g. Fig.12.3 n

22、With respect to the sparse index, to locate a file record with search-key value Knfind index entry with largest search-key value Knsearch file sequentially starting at the record to which this index entry points12.2-II Dense and Sparse Indices (cont.)Database System Concepts - Chapter 12 Indexing an

23、d Hashing -22Fig.12.3 Sparse index for the file account index file indexed file12.2-II Dense and Sparse IndicesDatabase System Concepts - Chapter 12 Indexing and Hashing -23nThe indices in Fig.12.1, Fig.12.3, and Fig.12.5 are single-level indicesnThe index file may be very large, and cannot be entir

24、ely kept in memory nTo access the index file quickly, the index file is stored on disk as a sequential file and construct a sparse index on this sequential index filenouter index a sparse index of primary index fileninner index the primary index filenIf even outer index is too large to fit in main m

25、emory, yet another level of index can be created, and so on.12.2-III Single-level and Multi-level Indices Fig. 12.4 Two-level sparse indexindexed fileindex fileNote: similar to hierachical page tables or file indexs in OSDatabase System Concepts - Chapter 12 Indexing and Hashing -25nB+-tree indices

26、and B-tree indices are two types of efficient multi-level indices, and widely used in DBSnHow to build a B/B+ tree index for a DB file with sparse index.12.2-III Single-level and Multi-level IndicesFig. An Example of B+-tree index堆、二叉搜索樹?!Database System Concepts - Chapter 12 Indexing and Hashing -2

27、6nThe file records are stored in a set of bucketsn a bucket is a unit of storage containing one or more records (a bucket is typically a disk block).nHash function hn a function from the set of all search-key values K in a file to the set of all bucket addresses, i.e. the set of the addresses of fil

28、e recordsntypical hash functions perform computation on the internal binary representation of the search-key. nHash file organizationn obtaining the bucket of a record directly from its search-key value using a hash function12.5 Hashing Index FilesFig.12.21 Hash file organization of account file, wi

29、th branch-name as key Hash function:returns the sum of the binary representations of the characters modulo 10Database System Concepts - Chapter 12 Indexing and Hashing -28Handling of Bucket OverflowsnBucket overflow (溢出) can occur because of ninsufficient buckets nskew in distribution of records. Th

30、is can occur due to two reasons:nmultiple records have same search-key valuenchosen hash function produces non-uniform distribution of key valuesnAlthough the probability of bucket overflow can be reduced, it cannot be eliminated; it is handled by using overflow buckets.Database System Concepts - Ch

31、apter 12 Indexing and Hashing -29Handling of Bucket Overflows (cont.)nOverflow chaining the overflow buckets of a given bucket are chained together in a linked list.nAbove scheme is called closed hashing. Fig. 12.22 in an hash structureDatabase System Concepts - Chapter 12 Indexing and Hashing -30Ha

32、sh IndicesnHashing can be used not only for file organization, but also for index-structure creation. nA hash index organizes the search keys, with their associated record pointers, into a hash file structure.nStrictly speaking, hash indices are always secondary indices nif the file itself is organized using hashing, a separate primary hash index on it using the same search-key is unnecessary. nHowever, we use the term hash index to refer to both secondary index structures and hash organized files

溫馨提示

  • 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
  • 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
  • 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
  • 4. 未經(jīng)權益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
  • 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負責。
  • 6. 下載文件中如有侵權或不適當內(nèi)容,請與我們聯(lián)系,我們立即糾正。
  • 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。

評論

0/150

提交評論