IMPROVING THE EFFICIENCY OF BITMAP INDEXES FOR DATA WAREHOUSES

Graduation thesis, Computer Science and Technology

Keywords: Data Warehouse; Bitmap index; Compression; Data reorganization

Abstract

Advances in computer technology have enabled the development of new kinds of very large database applications such as data warehouses and scientific databases. Unlike traditional database systems, these data sets are characterized by high dimensionality and are mostly read-only and append-only.

A data warehouse is a large repository of information containing data collected from different data sources over a long period of time. The warehouse data are used for analytical purposes by the knowledge workers (executives, managers, analysts) of an organization to make better and faster decisions. Query performance in such an environment is a critical challenge, as most of the associated queries are complex and ad hoc in nature and require huge volumes of data to be processed. A promising way to speed up query performance in data warehouses is the bitmap index. Because bitwise logical operations are well supported by computer hardware, bitmaps can be combined easily and efficiently to answer queries. However, as the database grows in size, the bitmap index grows as well. It is therefore imperative to compress the index in order to improve query performance. One of the most notable bitmap index compression schemes is the word-aligned hybrid (WAH) code.

This thesis presents an efficient bitmap indexing technique that improves the performance of the word-aligned hybrid code for high-cardinality attributes in data warehousing applications. WAH compression is based on run-length encoding, and its performance depends on the presence of long runs of identical bits. For large data sets with high-cardinality attributes, in most cases, the bitmap index of the indexed attribute (column) would be sparse; hence the compression process would be ineffective. To make the bitmap compression efficient, it is therefore necessary to reorder the data warehouse tuples to achieve higher compression rates. To reorder the tuples of the base data, our approach simply sorts them on the indexed attribute before generating the bitmap index. Our strategy is therefore a preprocessing step applied before compression, solely to improve performance, without affecting the algorithms used for compression and query processing. Sorting on the indexed attribute ensures long runs of ones and zeros, which are desirable for any compression scheme based on run-length encoding, such as WAH. To further improve performance, we propose to bin and cluster the indexed attribute.

Our experiments, conducted on five data sets with table cardinalities varying from 100,000 to 2,000,000 records, showed that our strategy significantly improves the performance of the word-aligned hybrid code. We observed that the sizes of the bitmap indices decreased considerably for various attribute cardinalities. Moreover, we found that the execution time measured for both equality and range queries was substantially improved.
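
The effect described in the abstract can be illustrated with a small sketch. It is not the thesis implementation: the function names (build_bitmap_index, count_runs, wah_word_count, range_query) are hypothetical, the bitmaps are plain Python lists, and the word counter is a simplified WAH-style model that assumes 32-bit words with 31 payload bits, ignores the limit on fill length, and stores a trailing partial group as a literal word. Sorting only the column stands in for reordering whole tuples on the indexed attribute.

```python
import random
from itertools import groupby


def build_bitmap_index(column):
    """Build one bitmap (a list of 0/1) per distinct value of the column."""
    return {v: [1 if x == v else 0 for x in column] for v in sorted(set(column))}


def count_runs(bits):
    """Return the number of maximal runs of identical bits."""
    return sum(1 for _ in groupby(bits))


def wah_word_count(bits, word_bits=32):
    """Count the words a simplified WAH-style encoder would emit.

    Assumption: bits are cut into groups of word_bits - 1. A full group of
    identical bits is a fill, and consecutive fills of the same bit value
    share one word. Every other group costs one literal word.
    """
    group_size = word_bits - 1
    words = 0
    prev_fill = None  # bit value of the fill word currently being extended
    for start in range(0, len(bits), group_size):
        group = bits[start:start + group_size]
        if len(group) == group_size and len(set(group)) == 1:
            if group[0] != prev_fill:
                words += 1
                prev_fill = group[0]
        else:
            words += 1
            prev_fill = None
    return words


def index_size(column):
    """Total simplified-WAH word count over all bitmaps of the index."""
    return sum(wah_word_count(b) for b in build_bitmap_index(column).values())


def range_query(index, lo, hi):
    """Return row positions with lo <= value <= hi by OR-ing the bitmaps."""
    n = len(next(iter(index.values())))
    result = [0] * n
    for value, bits in index.items():
        if lo <= value <= hi:
            result = [a | b for a, b in zip(result, bits)]
    return [row for row, bit in enumerate(result) if bit]


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical data: 10,000 rows, attribute cardinality 100.
    column = [random.randrange(100) for _ in range(10_000)]
    print("words, original order:", index_size(column))
    print("words, sorted order:  ", index_size(sorted(column)))
```

In the sorted column each value occupies one contiguous block of rows, so every bitmap consists of at most three runs, and the simplified encoder needs only a handful of words per bitmap. An equality query is the special case range_query(index, v, v).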