Hadoop Installation Process

VirtualBox and Ubuntu setup

- Installing 64-bit Ubuntu on 32-bit Windows XP: in the VirtualBox system settings, set the processor count to 1; setting it to 2 causes an error.
- Enable VT-x/AMD-V support in the BIOS: go to BIOS > Advanced BIOS Features > Virtualization, change the default Disabled to Enabled, save, and reboot.
- Load Ubuntu in VirtualBox.

Sharing a Windows folder with Ubuntu

- Install Guest Additions: VirtualBox menu Devices > Install Guest Additions.
- Create the mount point in Ubuntu:
    sudo mkdir /media/shared
- Set a root password and switch to the root account:
    sudo passwd root
    sudo -s
- Create the folder ubuntu1110_64sharefolder on the Windows E: drive.
- Mount that folder at /media/shared:
    sudo mount.vboxsf ubuntu1110_64sharefolder /media/shared
- To mount it automatically at boot, open fstab with `sudo gedit /etc/fstab` and append at the end:
    ubuntu1110_64sharefolder /media/shared vboxsf rw 0 0

Installing the JDK (jdk-7u3-linux-x64.tar.gz)

(Reference: /yang_hui1986527/article/details/6677450)

- Enter /media/shared and run:
    sudo mkdir /usr/lib/jvm
    sudo tar zxvf ./jdk-7u3-linux-x64.tar.gz -C /usr/lib/jvm
  (tar flags: z filters the archive through gzip, x extracts, v prints progress, f names the archive file.)
- Rename the extracted directory:
    cd /usr/lib/jvm
    sudo mv jdk1.7.0/ java-7-sun
- Install the vim package:
    apt-get install vim
- Edit the environment variables with `vim ~/.bashrc` and add:
    export JAVA_HOME=/usr/lib/jvm/java-7-sun
    export JRE_HOME=$JAVA_HOME/jre
    export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
    export PATH=$JAVA_HOME/bin:$PATH
  Save and quit with :wq, then make the changes take effect immediately:
    source ~/.bashrc
- Configure the default JDK version. Ubuntu may ship with a default JDK such as OpenJDK, so to make the JDK we just installed the default, run:
    sudo update-alternatives --install /usr/bin/java java /usr/lib/jvm/java-7-sun/bin/java 300
    sudo update-alternatives --install /usr/bin/javac javac /usr/lib/jvm/java-7-sun/bin/javac 300
    sudo update-alternatives --install /usr/bin/jar jar /usr/lib/jvm/java-7-sun/bin/jar 300
    sudo update-alternatives --config java
- Test with `java -version`.

Installing Hadoop 1.0.0

- Extract:
    mkdir /home/app
    sudo tar zxvf ./hadoop-1.0.0.tar.gz -C /home/app/
- Enter the Hadoop directory:
    cd /home/app/hadoop-1.0.0
- Edit the configuration file with `vi conf/hadoop-env.sh` to point JAVA_HOME at the JDK install path, then make it take effect:
    source conf/hadoop-env.sh
- Edit the Hadoop core configuration file core-site.xml, which sets the HDFS address and port (`vim conf/core-site.xml`):
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>
- Edit the HDFS configuration. The replication factor defaults to 3; since this is a single-machine install, change it to 1 (`vim conf/hdfs-site.xml`):
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
    </configuration>
- Edit the MapReduce configuration file, which sets the JobTracker address and port (`vim conf/mapred-site.xml`):
    <configuration>
      <property>
        <name>mapred.job.tracker</name>
        <value>localhost:9001</value>
      </property>
    </configuration>
- Before starting Hadoop, format the HDFS filesystem. Enter the Hadoop folder and run:
    bin/hadoop namenode -format
  (This must run before the first start; otherwise the web pages on ports 50060 and 50070 report errors.)
- Then start Hadoop. This command starts all the services:
    bin/start-all.sh
- Finally, verify that Hadoop installed successfully. Open a browser and visit:
    http://localhost:50030 (the MapReduce web page)
    http://localhost:50070 (the HDFS web page)
  If both pages load, the installation succeeded.

Passwordless SSH login to the local machine

- Generate a key pair and press Enter at the prompt:
    ssh-keygen -t rsa -P ""
- Enter the ~/.ssh/ directory and append id_rsa.pub to the authorized_keys authorization file (there is no authorized_keys file at first).
- Log in to localhost to test, then run the exit command; the full sequence is sketched below.
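The append command itself is elided above; a minimal sketch of the sequence, assuming the default key paths under ~/.ssh:

    cd ~/.ssh/
    cat id_rsa.pub >> authorized_keys   # creates authorized_keys on the first run
    ssh localhost                       # should now log in without a password
    exit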
Installing HBase

- Enter the shared folder and extract; afterwards hbase-0.92.1 appears under /home/app/:
    cd /media/shared/
    sudo tar zxvf ./hbase-0.92.1.tar.gz -C /home/app/
- Enter the hbase-0.92.1 directory and edit the configuration with `vim conf/hbase-env.sh`. The lines that must be set are:
    export JAVA_HOME=/usr/lib/jvm/java-7-sun
    export HBASE_CLASSPATH=/home/app/hbase-0.92.1/conf
    export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
    export HBASE_MANAGES_ZK=true
  Here JAVA_HOME is the Java install path and HBASE_CLASSPATH points at the configuration directory. Make the changes take effect:
    source conf/hbase-env.sh
- Edit the configuration file conf/hbase-site.xml for a pseudo-distributed setup (reference: /Linux/2012-03/56349.htm). Change its contents to:
    <configuration>
      <property>
        <name>hbase.rootdir</name>
        <value>hdfs://localhost:9000/hbase</value>
        <description>The directory shared by region servers.</description>
      </property>
      <!--
      <property>
        <name>hbase.master.port</name>
        <value>60000</value>
      </property>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.dataDir</name>
        <value>/home/Hadooptest/zookeeper-3.4.3/zookeeperdir/zookeeper-data</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
      </property>
      <property>
        <name>hbase.zookeeper.quorum</name>
        <value>zookeeper</value>
      </property>
      -->
    </configuration>
  (The commented-out properties are kept for a fully distributed setup.)
- Start Hadoop first, then HBase:
    /home/app/hadoop-1.0.0/bin/start-all.sh
    root@zhuwei-VirtualBox:/home/app/hbase-0.92.1/bin# ./stop-hbase.sh
    root@zhuwei-VirtualBox:/home/app/hbase-0.92.1/bin# ./start-hbase.sh
    root@zhuwei-VirtualBox:/home/app/hbase-0.92.1/bin# ./hbase shell

Installing zookeeper-3.4.3.tar.gz

- Extract:
    sudo tar zxvf ./zookeeper-3.4.3.tar.gz -C /home/app/
- Rename zoo_sample.cfg under the zookeeper-3.4.3/conf directory to zoo.cfg:
    sudo mv zoo_sample.cfg zoo.cfg
- Create a new data folder:
    root@zhuwei-VirtualBox:/home/app/zookeeper-3.4.3# sudo mkdir zookeeper_data
- In zoo.cfg, change the dataDir parameter to /home/app/zookeeper-3.4.3/zookeeper_data (a sketch of the edited file follows).
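For reference, a minimal zoo.cfg after this edit might look as follows; the tickTime, initLimit, syncLimit, and clientPort values are the zoo_sample.cfg defaults kept unchanged, and only dataDir is ours:

    # standalone ZooKeeper, defaults from zoo_sample.cfg
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/home/app/zookeeper-3.4.3/zookeeper_data
    clientPort=2181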
- Start the server and connect a client:
    bin/zkServer.sh start
    bin/zkCli.sh -server 127.0.0.1:2181
  The following is the information the terminal reported while connecting to ZooKeeper:
    root@zhuwei-VirtualBox:/home/app/zookeeper-3.4.3# bin/zkCli.sh -server 127.0.0.1:2181
    Connecting to 127.0.0.1:2181
    2012-04-27 12:17:50,875 [myid:] - INFO [main:Environment@98] - Client environment:zookeeper.version=3.4.3-1240972, built on 02/06/2012 10:48 GMT
    2012-04-27 12:17:50,889 [myid:] - INFO [main:Environment@98] - Client environment:host.name=zhuwei-VirtualBox
    2012-04-27 12:17:50,890 [myid:] - INFO [main:Environment@98] - Client environment:java.version=1.7.0_03
    2012-04-27 12:17:50,896 [myid:] - INFO [main:Environment@98] - Client environment:java.vendor=Oracle Corporation
    2012-04-27 12:17:50,896 [myid:] - INFO [main:Environment@98] - Client environment:java.home=/usr/lib/jvm/java-7-sun/jre
    2012-04-27 12:17:50,897 [myid:] - INFO [main:Environment@98] - Client environment:java.class.path=/home/app/zookeeper-3.4.3/bin/../build/classes:/home/app/zookeeper-3.4.3/bin/../build/lib/*.jar:/home/app/zookeeper-3.4.3/bin/../lib/slf4j-log4j12-1.6.1.jar:/home/app/zookeeper-3.4.3/bin/../lib/slf4j-api-1.6.1.jar:/home/app/zookeeper-3.4.3/bin/../lib/netty-3.2.2.Final.jar:/home/app/zookeeper-3.4.3/bin/../lib/log4j-1.2.15.jar:/home/app/zookeeper-3.4.3/bin/../lib/jline-0.9.94.jar:/home/app/zookeeper-3.4.3/bin/../zookeeper-3.4.3.jar:/home/app/zookeeper-3.4.3/bin/../src/java/lib/*.jar:/home/app/zookeeper-3.4.3/bin/../conf:.:/usr/lib/jvm/java-7-sun/lib:/usr/lib/jvm/java-7-sun/jre/lib
    2012-04-27 12:17:50,897 [myid:] - INFO [main:Environment@98] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
    2012-04-27 12:17:50,898 [myid:] - INFO [main:Environment@98] - Client environment:java.io.tmpdir=/tmp
    2012-04-27 12:17:50,898 [myid:] - INFO [main:Environment@98] - Client environment:java.compiler=<NA>
    2012-04-27 12:17:50,898 [myid:] - INFO [main:Environment@98] - Client environment:os.name=Linux
    2012-04-27 12:17:50,899 [myid:] - INFO [main:Environment@98] - Client environment:os.arch=amd64
    2012-04-27 12:17:50,899 [myid:] - INFO [main:Environment@98] - Client environment:os.version=3.0.0-12-generic
    2012-04-27 12:17:50,900 [myid:] - INFO [main:Environment@98] - Client environment:user.name=root
    2012-04-27 12:17:50,900 [myid:] - INFO [main:Environment@98] - Client environment:user.home=/root
    2012-04-27 12:17:50,901 [myid:] - INFO [main:Environment@98] - Client environment:user.dir=/home/app/zookeeper-3.4.3
    2012-04-27 12:17:50,903 [myid:] - INFO [main:ZooKeeper@433] - Initiating client connection, connectString=127.0.0.1:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@4ef5c3a6
    2012-04-27 12:17:50,937 [myid:] - INFO [main-SendThread():ClientCnxn$SendThread@933] - Opening socket connection to server /127.0.0.1:2181
    Welcome to ZooKeeper!
    2012-04-27 12:17:50,965 [myid:] - INFO [main-SendThread(localhost:2181):ZooKeeperSaslClient@125] - Client will not SASL-authenticate because the default JAAS configuration section 'Client' could not be found. If you are not using SASL, you may ignore this. On the other hand, if you expected SASL to work, please fix your JAAS configuration.
    JLine support is enabled
    2012-04-27 12:17:51,006 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@846] - Socket connection established to localhost/127.0.0.1:2181, initiating session
    [zk: 127.0.0.1:2181(CONNECTING) 0]
    2012-04-27 12:17:51,094 [myid:] - INFO [main-SendThread(localhost:2181):ClientCnxn$SendThread@1175] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x136f1fa35960000, negotiated timeout = 30000
    WATCHER::
    WatchedEvent state:SyncConnected type:None path:null

Hive installation (data warehouse)

- Extract:
    cd /media/shared/
    sudo tar xzvf ./hive-0.8.1.tar.gz -C /home/app/
- Edit the environment variables with `vim ~/.bashrc` (HIVE_HOME is the Hive install directory, i.e. the output of pwd inside it):
    export HIVE_HOME=/home/app/hive-0.8.1
    export PATH=$JAVA_HOME/bin:$HIVE_HOME/bin:$PATH
- Edit hive-env.sh.template (`vim hive-env.sh.template`):
    export HADOOP_HOME=/home/app/hadoop-1.0.0
- Copy hive-default.xml.template to hive-site.xml:
    cp hive-default.xml.template hive-site.xml
- Set the main environment variable by hand (use your own Hadoop install path):
    export HADOOP_HOME=/home/app/hadoop-1.0.0
- Create directories in HDFS to hold Hive data:
    $HADOOP_HOME/bin/hadoop fs -mkdir /tmp
    $HADOOP_HOME/bin/hadoop fs -mkdir /user/hive/warehouse
    $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
    $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
- To run, just run `bin/hive`:
    hive> SET mapred.job.tracker=localhost:50030;
    hive> SET -v;
    hive> show tables;
- Create a table (this one was created without specifying field and line delimiters, so below it is dropped and recreated):
    hive> create table log_stat(ip STRING, time STRING, http_request STRING, uri STRING, http STRING, status int, code STRING);
    OK
    Time taken: 9.893 seconds
    hive> drop table log_stat;
    OK
    Time taken: 2.172 seconds
- Recreate it, specifying '\t' as the field delimiter and '\n' as the line delimiter, stored as text:
    hive> create table log_stat(ip STRING, time STRING, http_request STRING, uri STRING, http STRING, status int, code STRING) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
- Load data (the local keyword is not needed if the data is already on HDFS):
    hive> load data local inpath '/home/app/pig-0.9.2/tutorial/scripts/load_result/part-m-00000' overwrite into table log_stat;
    hive> dfs -ls /user/hive/warehouse;
    Found 1 items
    drwxr-xr-x   - root supergroup          0 2012-06-21 10:19 /user/hive/warehouse/log_stat
    hive> dfs -ls /user/hive/warehouse/log_stat;
    Found 1 items
    -rw-r--r--   1 root supergroup       1593 2012-06-21 10:19 /user/hive/warehouse/log_stat/part-m-00000
- Note that select count(*) in Hive takes a long time and runs a MapReduce job.
- Export table data to a local file:
    hive> insert overwrite local directory '/media/shared/reg_3' select a.* from log_stat a;
- Import an HDFS file into a table:
    hive> load data inpath '/user/root/load_result/part-m-00000' overwrite into table tt;
    hive> load data inpath 'hdfs://localhost:9000/user/root/outlog/click_mini/20120628' into table click_mini;
- Bucketing:
    hive> set hive.enforce.bucketing=true;
    hive> set hive.enforce.bucketing;
  Creating buckets on an external table does not split the directory.
    hive> select * from ad_3rd tablesample(bucket 3 out of 3 on rand());
    hive> select count(*) from ad_3rd tablesample(bucket 2 out of 3 on access_date);   (this statement fails)
    hive> select count(*) from ad_3rd tablesample(bucket 1 out of 3 on access_date);   (this one works)

Installing MySQL

- Install the client and server and check status:
    apt-get install mysql-client-5.1 mysql-server-5.1
    root@zhuwei-VirtualBox:/etc/mysql# service mysql status
- Enter mysql:
    root@zhuwei-VirtualBox:/home# mysql -uroot -proot
- Create a user with a password, grant privileges, check the current user, and create the hive database:
    mysql> create user 'hive' identified by '123456';
    mysql> grant all privileges on *.* to 'hive'@'%' with grant option;
    mysql> select user();
    mysql> create database hive;
- Note: the version downloaded by `sudo apt-get install mysql-server` is rather old. To remove MySQL:
    sudo apt-get autoremove --purge mysql-server-5.1
    sudo apt-get remove mysql-server
    sudo apt-get autoremove mysql-server
    sudo apt-get remove mysql-common   (very important)
- Installing MySQL-5.5.23-1.linux2.6.x86_64.tar:
    sudo tar zxvf ./MySQL-5.5.23-1.linux2.6.x86_64.tar -C /home/app
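Before pointing Hive at this database, a quick sanity check that the new account can log in (a sketch; it assumes the grants above took effect and MySQL is listening on the default port):

    # should print the database list, including 'hive'
    mysql -uhive -p123456 -e "show databases;"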
Connecting Hive to MySQL

- Edit the conf/hive-site.xml configuration, replacing the Derby defaults with the MySQL values:
    <property>
      <name>javax.jdo.option.ConnectionURL</name>
      <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
      <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionDriverName</name>
      <value>com.mysql.jdbc.Driver</value>
      <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionUserName</name>
      <value>hive</value>
      <description>username to use against metastore database</description>
    </property>
    <property>
      <name>javax.jdo.option.ConnectionPassword</name>
      <value>123456</value>
      <description>password to use against metastore database</description>
    </property>
  (The replaced template defaults were jdbc:derby:;databaseName=metastore_db;create=true, org.apache.derby.jdbc.EmbeddedDriver, APP, and mine.)
- Copy the MySQL JDBC driver mysql-connector-java-5.1.19-bin.jar into Hive's lib directory.
- Verify; show tables returning OK proves success:
    root@zhuwei-VirtualBox:/home/app/hive-0.8.1# bin/hive
    hive> show tables;
    OK
    Time taken: 5.852 seconds
- View the Hive metadata that MySQL stores:
    mysql> use hive;
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A
    Database changed
    mysql> select * from TBLS;

Pig installation (Pig operates on HDFS files and manages MapReduce jobs)

- Extract:
    sudo tar zxvf ./pig-0.9.2.tar.gz -C /home/app/
- Edit the profile with `sudo vi /etc/profile`:
    export JAVA_HOME=/usr               (add this line, otherwise it will not succeed)
    export PIG_INSTALL=/home/app/pig-0.9.2
    export PATH=$PATH:$PIG_INSTALL/bin
    export PIG_HADOOP_VERSION=20
    export PIG_CLASSPATH=$HADOOP_INSTALL/conf    (for Pig's MapReduce mode)
  Then reload it:
    source /etc/profile
- Compile the java files under the tutorial's src directory:
    root@zhuwei-VirtualBox:/home/app/pig-0.9.2/tutorial/src/org/apache/pig/tutorial# javac -classpath /home/app/pig-0.9.2/pig-0.9.2.jar *.java
- Package the contents of the org folder under the install directory's tutorial into a jar, after which the sample Pig programs under the scripts directory can be run:
    root@zhuwei-VirtualBox:/home/app/pig-0.9.2/tutorial/src# jar -cvf tutorial.jar org
- Run a self-written script; the program accepts a load_path parameter naming the input data file:
    root@zhuwei-VirtualBox:/home/app/pig-0.9.2/tutorial/scripts# pig -x local -param load_path=outlog/ipad_ads_error test.pig
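test.pig itself is not reproduced in these notes. As a stand-in, a minimal local-mode script in the same spirit can be written and run like this; the field layout, tab delimiter, and file names here are assumptions, not the original script:

    cat > /tmp/mini.pig <<'EOF'
    -- load a tab-delimited log and count records per ip
    raw    = LOAD '$load_path' USING PigStorage('\t') AS (ip:chararray, uri:chararray);
    byip   = GROUP raw BY ip;
    counts = FOREACH byip GENERATE group AS ip, COUNT(raw) AS hits;
    DUMP counts;
    EOF
    pig -x local -param load_path=/tmp/access.tsv /tmp/mini.pig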
Linux Eclipse Hadoop configuration

Setting up the eclipse hadoop development environment means installing the hadoop plugin in Eclipse:
1. Copy <hadoop install dir>/contrib/eclipse-plugin/hadoop-0.20.2-eclipse-plugin.jar to <eclipse install dir>/plugins/.
2. Restart eclipse and configure the hadoop installation directory. If the plugin installed successfully, opening Window > Preferences shows a Hadoop Map/Reduce option, where you configure the Hadoop installation directory. Exit when the configuration is done.
3. Configure Map/Reduce Locations. Open Map/Reduce Locations via Window > Show View. Create a new Hadoop Location in that view: right-click > New Hadoop Location. In the dialog, configure a Location name (for example myubuntu) plus the Map/Reduce Master and the DFS Master. The Host and Port in them are the address and port you configured in mapred-site.xml and core-site.xml respectively, for example:
     Map/Reduce Master: localhost, 9001
     DFS Master: localhost, 9000
   Exit after configuring. Click DFS Locations > myubuntu: if it can show a folder (2), the configuration is correct; if it shows connection refused, check your configuration.
4. Create the project: File > New > Other > Map/Reduce Project. The project name is arbitrary, for example hadoop-test. Copy <hadoop install dir>/src/examples/org/apache/hadoop/examples/WordCount.java into the newly created project.

    root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0# bin/hadoop dfs -mkdir pig   (creates a pig folder on HDFS)

Hadoop hello world program

- Prepare two input files:
    root@zhuwei-VirtualBox:/home/app# cd hadoop-1.0.0/
    root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0# mkdir input
    root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0/input# vi test1.txt    (contents: Hello World Bye World)
    root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0/input# vi test2.txt    (contents: Hello Hadoop Goodbye Hadoop)
- cd to the Hadoop install directory and run the following commands:
    root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0# bin/hadoop fs -put input test1.txt    (uploads the input folder to the hadoop filesystem, naming the folder test1.txt)
    root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0# bin/hadoop dfs -rmr test1.txt    (deletes the test1.txt folder)
    root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0# bin/hadoop dfs -put input in    (uploads the input folder to the hadoop filesystem, naming the folder in)
    root@zhuwei-VirtualBox:/home/app/hadoop-1.0.0# bin/hadoop jar hadoop-examples-1.0.0.jar wordcount in out    (runs wordcount over the in folder, writing results to the out directory)
- Job output:
    Warning: $HADOOP_HOME is deprecated.
    12/06/14 10:09:22 INFO input.FileInputFormat: Total input paths to process : 2
    12/06/14 10:09:23 INFO mapred.JobClient: Running job: job_201206131736_0001
    12/06/14 10:09:24 INFO mapred.JobClient:  map 0% reduce 0%
    12/06/14 10:10:00 INFO mapred.JobClient:  map 100% reduce 0%
    12/06/14 10:10:30 INFO mapred.JobClient:  map 100% reduce 100%
    12/06/14 10:10:36 INFO mapred.JobClient: Job complete: job_201206131736_0001
    12/06/14 10:10:37 INFO mapred.JobClient: Counters: 29
    12/06/14 10:10:37 INFO mapred.JobClient:   Job Counters
    12/06/14 10:10:37 INFO mapred.JobClient:     Launched reduce tasks=1
    12/06/14 10:10:37 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=49686
    12/06/14 10:10:37 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
    12/06/14 10:10:37 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
    12/06/14 10:10:37 INFO mapred.JobClient:     Launched map tasks=2
    12/06/14 10:10:37 INFO mapred.JobClient:     Data-local map tasks=2
    12/06/14 10:10:37 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=27436
    12/06/14 10:10:37 INFO mapred.JobClient:   File Output Format Counters
    12/06/14 10:10:37 INFO mapred.JobClient:     Bytes Written=40
    12/06/14 10:10:37 INFO mapred.JobClient:   FileSystemCounters
    12/06/14 10:10:37 INFO mapred.JobClient:     FILE_BYTES_READ=78
    12/06/14 10:10:37 INFO mapred.JobClient:     HDFS_BYTES_READ=267
    12/06/14 10:10:37 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=64700
    12/06/14 10:10:37 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=40
    12/06/14 10:10:37 INFO mapred.JobClient:   File Input Format Counters
    12/06/14 10:10:37 INFO mapred.JobClient:     Bytes Read=49
    12/06/14 10:10:37 INFO mapred.JobClient:   Map-Reduce Framework
    12/06/14 10:10:37 INFO mapred.JobClient:     Map output materialized bytes=84
    12/06/14 10:10:37 INFO mapred.JobClient:     Map input records=2
    12/06/14 1
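The notes break off in the middle of the job counters. To finish the walkthrough, the usual way to inspect the result (a sketch; part-r-00000 is the standard output file name for this example job):

    bin/hadoop fs -cat out/part-r-00000
    # expected counts for the two test files above:
    #   Bye      1
    #   Goodbye  1
    #   Hadoop   2
    #   Hello    2
    #   World    2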
