Hive Basic Operations

In most cases the data files that Hive operates on already exist (they can also be imported from outside). Common web log files come in several formats, such as JSON.

Note: every database and table that Hive creates lives under some directory in HDFS.
For a database, the HDFS path is: /user/hive/warehouse/<database name>.db
For a table, the HDFS path is: /user/hive/warehouse/<database name>.db/<table name>
Hive ships with a database named default; if you create a table without creating a database first, the table is created in the default database.

hive> CREATE SCHEMA laserdb;
OK
Time taken: 0.… seconds
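The warehouse directory layout described above can be sketched with a small helper. This is plain Python for illustration, not a Hive API; `warehouse_path` is a hypothetical name:

```python
# Sketch of how Hive maps databases and tables to HDFS warehouse paths.
# warehouse_path() is a hypothetical helper, not part of Hive.
WAREHOUSE = "/user/hive/warehouse"

def warehouse_path(database, table=None):
    """Return the HDFS directory for a database, or for a table inside it."""
    # Tables in the built-in "default" database sit directly under the
    # warehouse root (the document's own transcripts show
    # /user/hive/warehouse/t_hive, with no .db component).
    if database == "default":
        base = WAREHOUSE
    else:
        base = f"{WAREHOUSE}/{database}.db"
    return f"{base}/{table}" if table else base

print(warehouse_path("laserdb"))            # /user/hive/warehouse/laserdb.db
print(warehouse_path("laserdb", "t_hive"))  # /user/hive/warehouse/laserdb.db/t_hive
print(warehouse_path("default", "t_hive"))  # /user/hive/warehouse/t_hive
```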
hive> SHOW DATABASES;
OK
default
laserdb
Time taken: 0.… seconds, Fetched: 2 row(s)
hive> use default;
OK
Time taken: 0.034 seconds
hive> show tables;
OK
user_movie
user_movie1
user_movie2
user_movie3
Time taken: 0.028 seconds, Fetched: 4 row(s)
hive>

1. Basic Hive operations

(1) Create a database:
CREATE SCHEMA <database name>;

(2) Create a table:
CREATE TABLE <table name> (<column name> <type>);
For example:
CREATE TABLE tuoguan_tbl (flied string);
The contents of a table are, in essence, some file in HDFS; that file has to be parsed into the table's row format.

(3) Create an ordinary table whose fields are separated by commas on each line:
create table web_log (id int, name string, address string) row format delimited fields terminated by ',';

(4) List the tables:
show tables;

(5) View the data in a table (a plain SELECT * is not converted into a MapReduce job):
select * from tuoguan_tbl;
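What ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' means can be sketched locally: Hive splits each line of the underlying file on the delimiter and maps the pieces onto the declared columns. The sample line below is invented for illustration; Hive itself does this parsing:

```python
# Local sketch of ROW FORMAT DELIMITED FIELDS TERMINATED BY ','.
# The sample line is a made-up row of a hypothetical web_log file.
line = "1,zhangsan,beijing"
columns = ("id", "name", "address")  # matches: id int, name string, address string

fields = line.split(",")             # the delimiter declared in the DDL
row = dict(zip(columns, fields))
row["id"] = int(row["id"])           # apply the declared column type

print(row)  # {'id': 1, 'name': 'zhangsan', 'address': 'beijing'}
```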
4、不需要轉(zhuǎn)換為 map reduceselect * from tuogua n_tbl;(6) 查看表結(jié)構(gòu)命令:Desc表名稱(chēng);(7) 舉例:在Linux文件系統(tǒng)/home/oracle 下有一個(gè)文件 t_hive.txt (文本以tab分隔)#查看數(shù)據(jù)文件的內(nèi)容(文本以tab分隔) vi t_hive.txt16236112134123117213712311123411234#創(chuàng)建新表FIELDS TERMINATEDINTO TABLE t_hivehive> CREATE TABLE t_hive (a int, b int, c int) ROWFORMAT DELIMITED
5、BY 't'OKTime taken: 0.489 seconds#導(dǎo)入數(shù)據(jù)t_hive.txt至U t_hive 表hive> LOAD DATA LOCAL INPATH 7home/cos/demo/t_hive.txt'OVERWRITECopying data from file:/home/cos/demo/t_hive.txtCopying file: file:/home/cos/demo/t_hive.txtLoading data to table default.t_hiveDeleted hdfs:/:9000/user/hive/ware
6、house/t_hiveOKTime taken: 0.397 seconds#查看表hive> show tables;OKt_hiveTime taken: 0.099 seconds#查看表數(shù)據(jù)hive> select * from t_hive;OK16236112134123117213712311123411234Time taken: 0.264 seconds#查看表結(jié)構(gòu)hive> desc t_hive;OKaintbintc intTime taken: 0.1 seconds#修改表,增加一個(gè)字段hive> ALTER TABLE t_hive A
7、DD COLUMNS (new_col String);OKTime taken: 0.186 secondshive> desc t_hive;OKa intb intc intnew_col stringTime taken: 0.086 seconds#重命名表名ALTER TABLE t_hive RENAME TO t_hadoop;OKTime taken: 0.45 secondshive> show tables;OKt_hadoopTime taken: 0.07 seconds#刪除表hive> DROP TABLE t_hadoop;OKTime tak
8、en: 0.767 seconds#查看表hive> show tables;OKTime taken: 0.064 seconds(8) 如果不想把HDFS里的文件進(jìn)行移動(dòng),則可以創(chuàng)建外部表:create external table web_log2 (id int, name string, address string ) Location /user/weblog/2. 將json格式的web日志文件user_movie.json導(dǎo)入到Hive的某個(gè)表中方法一:使用第三方j(luò)ar包(1)使用一個(gè)第三方的jar 包 json-serde-1.3.6-SNAPSHOT-jar-wit
9、h-dependencies.jar(老師給的),將其復(fù)制到HIVE_HOME/lib目錄下創(chuàng)建表user_moviecreate table user_movie(custid string, sno string, genreid string, movieid string) ROW FORMAT SERDE'org.openx.data.jsonserde.JsonSerDe STORED AS TEXTFILE;(3)將 json 文件 user_movie.json 導(dǎo)入到表 user_movie 中首先將json文件上傳到Linux文件系統(tǒng)/home/oracle目錄下面
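What the JsonSerDe in the DDL above does for each line can be sketched locally: it expects one JSON object per line, with keys matching the column names, and projects it onto the declared columns. The sample record below is invented for illustration:

```python
import json

# One line of a hypothetical user_movie.json file: a single JSON object
# whose keys match the table's columns (custid, sno, genreid, movieid).
line = '{"custid": "C001", "sno": "1", "genreid": "G07", "movieid": "M42"}'

columns = ("custid", "sno", "genreid", "movieid")
record = json.loads(line)                # what the SerDe does per line
row = tuple(record[c] for c in columns)  # project onto the declared columns

print(row)  # ('C001', '1', 'G07', 'M42')
```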
10、,然后在Linux命令下執(zhí)行如下命令:(本地的文件直接導(dǎo)入到HDFS相應(yīng)的目錄里)1-1 5f e 一空匕 * J us e±_EM-vi*. >/ua&i:/t_in£vieoracleioraele hedocd""|或者在hive命令中執(zhí)行:(從HDFS里直接導(dǎo)入數(shù)據(jù),這個(gè)會(huì)把HDFS里的文件移動(dòng)到 HIVE表的相應(yīng)目錄里)hive> Load data local inpath T/home/oracle/user_jk>vie- jsan1 into table u3er_Tnovie3; Loading data t
11、o table defau11*user_movie3Table defau.lt.uaer_mDvie3 stats: numFiles-lj totaISize*821779JOKTine taken: 0.245 eco-ads方法二:使用 hive 自帶的 jar 包 hive-hcatalog-core-1.2.1.jar需要把 HIVE_HOME/hcatalog/share/hcatalog/hive-hcatalog-core-1.2.1.jar 復(fù)制到 HIVE_HOME/lib,然后在hive命令下用下面方法建表。create table user_movie2(custi
12、d stri ng, sno stri ng, gen reid stri ng, movieid stri ng)ROWFORMAT SERDE)rg. apache.hive.hcatalog.data.Jso nSerDe' STORED AS TEXTFILE;其他都一樣。3. 數(shù)據(jù)導(dǎo)入還以剛才的t_hive為例。#創(chuàng)建表結(jié)構(gòu)hive> CREATE TABLE t_hive (a int, b int, c int) ROWFORMAT DELIMITED FIELDS TERMINATED BY 't'從操作本地文件系統(tǒng)加載數(shù)據(jù)(LOCAL)hive&
13、gt; LOAD DATA LOCAL INPATH 7home/cos/demo/t_hive.txt'OVERWRITE INTO TABLE t_hive ;Copying data from file:/home/cos/demo/t_hive.txtCopying file: file:/home/cos/demo/t_hive.txtLoading data to table default.t_hiveDeleted hdfs:/:9000/user/hive/warehouse/t_hiveOKTime taken: 0.612 seconds#在HDFS中查找剛剛導(dǎo)入
14、的數(shù)據(jù) hadoop fs -cat /user/hive/warehouse/t_hive/t_hive.txt16236112134123117213712311123411234從HDFS加載數(shù)據(jù)創(chuàng)建表t_hive2hive> CREATE TABLE t_hive2 (a int, b int, c int) ROWFORMAT DELIMITED FIELDSBY 't'TERMINATED#從HDFS加載數(shù)據(jù)hive> LOAD DATA INPATH 7user/hive/warehouse/t_hive/t_hive.txt' t_hive2
15、;Loading data to table default.t_hive2Deleted hdfs:/:9000/user/hive/warehouse/t_hive2OKTime taken: 0.325 secondsOVERWRITEINTO TABLE#查看數(shù)據(jù)hive> select * from t_hive2;OK16236112134123117213712311123411234Time taken: 0.287 seconds從其他表導(dǎo)入數(shù)據(jù)hive> INSERT OVERWRITE TABLE t_hive2 SELECT * FROM t_hive ;T
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201307131407_0002, Tracking URL = :50030/jobdetails.jsp?jobid=job_201307131407_0002
Kill Command = /home/cos/toolkit/hadoop-1.0.3/libexec/../bin/hadoop job -Dmapred.job.tracker=hdfs://:9001 -kill job_201307131407_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-07-16 10:32:41,979 Stage-1 map = 0%, reduce = 0%
2013-07-16 10:32:48,034 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.03 sec
2013-07-16 10:32:49,050 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.03 sec
2013-07-16 10:32:50,068 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.03 sec
2013-07-16 10:32:51,082 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.03 sec
2013-07-16 10:32:52,093 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.03 sec
2013-07-16 10:32:53,102 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.03 sec
2013-07-16 10:32:54,112 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.03 sec
MapReduce Total cumulative CPU time: 1 seconds 30 msec
Ended Job = job_201307131407_0002
Ended Job = -314818888, job is filtered out (removed at runtime).
Moving data to: hdfs://:9000/tmp/hive-cos/hive_2013-07-16_10-32-31_323_5732404975764014154/-ext-10000
Loading data to table default.t_hive2
Deleted hdfs://:9000/user/hive/warehouse/t_hive2
Table default.t_hive2 stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 56, raw_data_size: 0]
7 Rows loaded to t_hive2
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 1.03 sec   HDFS Read: 273 HDFS Write: 56 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 30 msec
OK
Time taken: 23.227 seconds

hive> select * from t_hive2;
OK
16	2	3
61	12	13
41	2	31
17	21	3
71	2	31
1	12	34
11	2	34
Time taken: 0.134 seconds
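The stats line above reports total_size 56 and "7 Rows loaded"; that matches the tab-delimited file these rows serialize to, which can be checked locally:

```python
# The 7 rows of t_hive, serialized the way the table stores them in HDFS:
# tab-delimited fields, one row per line.
rows = [
    (16, 2, 3), (61, 12, 13), (41, 2, 31), (17, 21, 3),
    (71, 2, 31), (1, 12, 34), (11, 2, 34),
]
data = "".join("\t".join(str(v) for v in row) + "\n" for row in rows)

print(len(rows))  # 7  -> "7 Rows loaded to t_hive2"
print(len(data))  # 56 -> "total_size: 56" / "HDFS Write: 56"
```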
Create a table and import data from another table:

# Drop the table
hive> DROP TABLE t_hive;

# Create a table, importing data from another table
hive> CREATE TABLE t_hive AS SELECT * FROM t_hive2;
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201307131407_0003, Tracking URL = :50030/jobdetails.jsp?jobid=job_201307131407_0003
Kill Command = /home/cos/toolkit/hadoop-1.0.3/libexec/../bin/hadoop job -Dmapred.job.tracker=hdfs://:9001 -kill job_201307131407_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-07-16 10:36:48,612 Stage-1 map = 0%, reduce = 0%
2013-07-16 10:36:54,648 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
2013-07-16 10:36:55,657 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
2013-07-16 10:36:56,666 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
2013-07-16 10:36:57,673 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
2013-07-16 10:36:58,683 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.13 sec
2013-07-16 10:36:59,691 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 1.13 sec
MapReduce Total cumulative CPU time: 1 seconds 130 msec
Ended Job = job_201307131407_0003
Ended Job = -670956236, job is filtered out (removed at runtime).
Moving data to: hdfs://:9000/tmp/hive-cos/hive_2013-07-16_10-36-39_986_1343249562812540343/-ext-10001
Moving data to: hdfs://:9000/user/hive/warehouse/t_hive
Table default.t_hive stats: [num_partitions: 0, num_files: 1, num_rows: 0, total_size: 56, raw_data_size: 0]
7 Rows loaded to hdfs://:9000/tmp/hive-cos/hive_2013-07-16_10-36-39_986_1343249562812540343/-ext-10000
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 1.13 sec   HDFS Read: 272 HDFS Write: 56 SUCCESS
Total MapReduce CPU Time Spent: 1 seconds 130 msec
OK
Time taken: 20.13 seconds

hive> select * from t_hive;
OK
16	2	3
61	12	13
41	2	31
17	21	3
71	2	31
1	12	34
11	2	34
Time taken: 0.109 seconds

Copy only the table structure, without importing the data:
hive> CREATE TABLE t_hive3 LIKE t_hive;
hive> select * from t_hive3;
OK
Time taken: 0.077 seconds

4. Exporting data

Copy from HDFS to another HDFS location:
hadoop fs -cp /user/hive/warehouse/t_hive /
hadoop fs -ls /t_hive
Found 1 items
-rw-r--r--   1 cos supergroup         56 2013-07-16 10:41 /t_hive/000000_0
hadoop fs -cat /t_hive/000000_0
16	2	3
61	12	13
41	2	31
17	21	3
71	2	31
1	12	34
11	2	34

Export through Hive to the local filesystem:
hive> INSERT OVERWRITE LOCAL DIRECTORY '/tmp/t_hive' SELECT * FROM t_hive;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201307131407_0005, Tracking URL = :50030/jobdetails.jsp?jobid=job_201307131407_0005
Kill Command = /home/cos/toolkit/hadoop-1.0.3/libexec/../bin/hadoop job -Dmapred.job.tracker=hdfs://:9001 -kill job_201307131407_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 0
2013-07-16 10:46:24,774 Stage-1 map = 0%, reduce = 0%
2013-07-16 10:46:30,823 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.87 sec
2013-07-16 10:46:31,833 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.87 sec
2013-07-16 10:46:32,844 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.87 sec
2013-07-16 10:46:33,856 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.87 sec
2013-07-16 10:46:34,865 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.87 sec
2013-07-16 10:46:35,873 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 0.87 sec
2013-07-16 10:46:36,884 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 0.87 sec
MapReduce Total cumulative CPU time: 870 msec
Ended Job = job_201307131407_0005
Copying data to local directory /tmp/t_hive
Copying data to local directory /tmp/t_hive
7 Rows loaded to /tmp/t_hive
MapReduce Jobs Launched:
Job 0: Map: 1   Cumulative CPU: 0.87 sec   HDFS Read: 271 HDFS Write: 56 SUCCESS
Total MapReduce CPU Time Spent: 870 msec
OK
Time taken: 23.369 seconds

# Check it on the local operating system
hive> ! cat /tmp/t_hive/000000_0;
16	2	3
61	12	13
41	2	31
17	21	3
71	2	31
1	12	34
11	2	34

5. Querying with HiveQL

Note: in the listings below, the map/reduce log output is omitted.

Ordinary query: sorting, column aliases, nested subqueries
hive> FROM (
    >   SELECT b, c as c2 FROM t_hive
    > ) t
    > SELECT t.b, t.c2
    > WHERE b > 2
    > LIMIT 2;
12	13
21	3

Join query: JOIN
hive> SELECT t1.a, t1.b, t2.a, t2.b
    > FROM t_hive t1 JOIN t_hive2 t2 ON t1.a = t2.a
    > WHERE t1.c > 10;
1	12	1	12
11	2	11	2
41	2	41	2
61	12	61	12
71	2	71	2

Aggregate query 1: count, avg
hive> SELECT count(*), avg(a) FROM t_hive;
7	31.142857142857142

Aggregate query 2: count, distinct
hive> SELECT count(DISTINCT b) FROM t_hive;
3

Aggregate query 3: GROUP BY, HAVING
# GROUP BY
hive> SELECT avg(a), b, sum(c) FROM t_hive GROUP BY b, c;
16.0	2	3
56.0	2	62
11.0	2	34
61.0	12	13
1.0	12	34
17.0	21	3

# HAVING
hive> SELECT avg(a), b, sum(c) FROM t_hive GROUP BY b, c HAVING sum(c) > 30;
56.0	2	62
11.0	2	34
1.0	12	34
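The aggregate results above can be cross-checked locally on the seven rows of t_hive; this plain-Python sketch stands in for what Hive computes:

```python
from collections import defaultdict

# The seven rows of t_hive as (a, b, c)
rows = [
    (16, 2, 3), (61, 12, 13), (41, 2, 31), (17, 21, 3),
    (71, 2, 31), (1, 12, 34), (11, 2, 34),
]

count = len(rows)                           # count(*)
avg_a = sum(a for a, _, _ in rows) / count  # avg(a)
distinct_b = len({b for _, b, _ in rows})   # count(DISTINCT b)
print(count, avg_a, distinct_b)  # 7 31.142857142857142 3

# avg(a), b, sum(c) GROUP BY b, c HAVING sum(c) > 30
groups = defaultdict(list)
for a, b, c in rows:
    groups[(b, c)].append(a)
having_rows = [
    (sum(a_vals) / len(a_vals), b, c * len(a_vals))
    for (b, c), a_vals in sorted(groups.items())
    if c * len(a_vals) > 30                 # sum(c) within each (b, c) group
]
print(having_rows)  # [(56.0, 2, 62), (11.0, 2, 34), (1.0, 12, 34)]
```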
6. Hive views

A Hive view is the same concept as a view in a relational database. Again using t_hive as the example:
hive> CREATE VIEW v_hive AS SELECT a, b FROM t_hive WHERE c > 30;
hive> select * from v_hive;
41	2
71	2
1	12
11	2

Drop a view:
hive> DROP VIEW IF EXISTS v_hive;
OK
Time taken: 0.495 seconds

7. Hive partitioned tables

Partitioned tables are a basic database concept, but with small data volumes we often never need them. Hive is OLAP data-warehouse software, and the data volumes involved are very large, so partitioned tables matter a great deal in this setting. Below we redefine a new table structure: t_hft.
38、一個(gè)數(shù)據(jù)表結(jié)構(gòu):t_hft創(chuàng)建數(shù)據(jù) vi /home/cos/demo/t_hft_20130627.csv000001,092023,9.76000002,091947,8.99000004,092002,9.79000005,091514,2.2000001,092008,9.70000001,092059,9.45 vi /home/cos/demo/t_hft_20130628.csv000001,092023,9.76000002,091947,8.99000004,092002,9.79000005,091514,2.2000001,092008,9.70000001,092059,9.45創(chuàng)建數(shù)據(jù)表DROP TABLE IF EXISTS t_hft;CREATE TABLE t_hft(SecurityID STRING,tradeTime STRING,PreClosePx DOUBLE)ROW FORMAT DELIMITED FIELDS TERMI