
PROFESSIONAL NoSQL

INTRODUCTION

PART I: GETTING STARTED
CHAPTER 1: NoSQL: What It Is and Why You Need It
CHAPTER 2: Hello NoSQL: Getting Initial Hands-on Experience
CHAPTER 3: Interfacing and Interacting with NoSQL

PART II: LEARNING THE NoSQL BASICS
CHAPTER 4: Understanding the Storage Architecture
CHAPTER 5: Performing CRUD Operations
CHAPTER 6: Querying NoSQL Stores
CHAPTER 7: Modifying Data Stores and Managing Evolution
CHAPTER 8: Indexing and Ordering Data Sets
CHAPTER 9: Managing Transactions and Data Integrity

PART III: GAINING PROFICIENCY WITH NoSQL
CHAPTER 10: Using NoSQL in the Cloud
CHAPTER 11: Scalable Parallel Processing with MapReduce
CHAPTER 12: Analyzing Big Data with Hive
CHAPTER 13: Surveying Database Internals

PART IV: MASTERING NoSQL
CHAPTER 14: Choosing Among NoSQL Flavors
CHAPTER 15: Coexistence
CHAPTER 16: Performance Tuning
CHAPTER 17: Tools and Utilities

APPENDIX: Installation and Setup Instructions
INDEX

PROFESSIONAL NoSQL
Shashank Tiwari
John Wiley & Sons, Inc.

Professional NoSQL
Published by
John Wiley & Sons, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256

Copyright © 2011 by John Wiley & Sons, Inc., Indianapolis, Indiana
Published simultaneously in Canada

ISBN: 978-0-470-94224-6

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or website is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or website may provide or recommendations it may make. Further, readers should be aware that Internet websites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books.

Library of Congress Control Number: 2011930307

Trademarks: Wiley, the Wiley logo, Wrox, the Wrox logo, Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc., is not associated with any product or vendor mentioned in this book.

I would like to dedicate my work on this book to my parents, Mandakini and Suresh Tiwari.

Everything I do successfully, including writing this book, is a result of the immense support of my dear wife, Caren and my adorable sons, Ayaan and Ezra.

CREDITS

EXECUTIVE EDITOR
Robert Elliot

PRODUCTION MANAGER
Tim Tate

PROJECT EDITOR
Sydney Jones

VICE PRESIDENT AND EXECUTIVE GROUP PUBLISHER
Richard Swadley

TECHNICAL EDITORS
Stefan Edlich
Matt Ingenthron

VICE PRESIDENT AND EXECUTIVE PUBLISHER
Barry Pruett

PRODUCTION EDITOR
Daniel Scribner

ASSOCIATE PUBLISHER
Jim Minatel

COPY EDITOR
Kim Coffer

PROJECT COORDINATOR, COVER
Katherine Crocker

EDITORIAL DIRECTOR
Robyn B. Siesky

PROOFER
Scott Klemp, Word One

EDITORIAL MANAGER
Mary Beth Wakefield

INDEXER
Robert Swanson

FREELANCER EDITORIAL MANAGER
Rosemarie Graham

COVER DESIGNER
LeAndra Young

MARKETING MANAGER
Ashley Zurcher

COVER IMAGE
© René Mansi

ABOUT THE AUTHOR

SHASHANK TIWARI is an experienced software developer and technology entrepreneur with interests in the areas of high-performance applications, analytics, web applications, and mobile
platforms. He enjoys data visualization, statistical and machine learning, coffee, deserts and bike riding. He is the author of many technical articles and books and a speaker at many conferences worldwide.

Learn more about his company, Treasury of Ideas, read his blog, or follow him at tshanky. He lives with his wife and two sons in Palo Alto, California.

ABOUT THE TECHNICAL EDITORS

PROF. DR. STEFAN EDLICH is a senior lecturer at Beuth HS of Technology Berlin (U.APP.SC) with a focus on NoSQL, Software-Engineering and Cloud Computing. Besides many scientific papers and journal articles, he has been a regular speaker at conferences and IT events concerning enterprise, NoSQL, and ODBMS topics since 1993.

Furthermore, he is the author of twelve IT books written for Apress, O'Reilly, Spektrum/Elsevier, Hanser, and other publishers. He is a founding member of OODBMS.org e.V. and started the world's First International Conference on Object Databases (ICOODB.org) series. He runs the NoSQL Archive, organizes NoSQL events, and is constantly writing about NoSQL.

MATT INGENTHRON is an experienced web architect with a software development background. He has deep expertise in building, scaling and operating global-scale Java, Ruby on Rails and AMP web applications. Having been with Couchbase, Inc. since its inception, he has been a core developer on the Open Source Membase NoSQL project, a contributor to the Memcached project, and a leader for new developments in the Java spymemcached client. Matt's NoSQL experiences are widespread though, having experience with Hadoop, HBase and other parts of the NoSQL world.

ACKNOWLEDGMENTS

THIS BOOK REPRESENTS the efforts of many people, and I sincerely thank them for their contribution.

Thanks to the team at Wiley. You made the book possible!

Thanks to Matt and Stefan for the valuable inputs and the technical review.

Thanks to my wife and sons for encouraging and supporting me through the process of writing this book. Thanks to all the members of my family and friends who have always believed in me.

Thanks to all who have contributed directly or indirectly to this book and who I
may have missed unintentionally.

Shashank Tiwari

CONTENTS

INTRODUCTION

PART I: GETTING STARTED

CHAPTER 1: NOSQL: WHAT IT IS AND WHY YOU NEED IT
  Definition and Introduction
  Context and a Bit of History
  Big Data
  Scalability
  Definition and Introduction
  Sorted Ordered Column-Oriented Stores
  Key/Value Stores
  Document Databases
  Graph Databases
  Summary

CHAPTER 2: HELLO NOSQL: GETTING INITIAL HANDS-ON EXPERIENCE
  First Impressions: Examining Two Simple Examples
  A Simple Set of Persistent Preferences Data
  Storing Car Make and Model Data
  Working with Language Bindings
  MongoDB's Drivers
  A First Look at Thrift
  Summary

CHAPTER 3: INTERFACING AND INTERACTING WITH NOSQL
  If No SQL, Then What?
  Storing and Accessing Data
  Storing Data In and Accessing Data from MongoDB
  Querying MongoDB
  Storing Data In and Accessing Data from Redis
  Querying Redis
  Storing Data In and Accessing Data from HBase
  Querying HBase
  Storing Data In and Accessing Data from Apache Cassandra
  Querying Apache Cassandra
  Language Bindings for NoSQL Data Stores
  Being Agnostic with Thrift
  Language Bindings for Java
  Language Bindings for Python
  Language Bindings for Ruby
  Language Bindings for PHP
  Summary

PART II: LEARNING THE NOSQL BASICS

CHAPTER 4: UNDERSTANDING THE STORAGE ARCHITECTURE
  Working with Column-Oriented Databases
  Using Tables and Columns in Relational Databases
  Contrasting Column Databases with RDBMS
  Column Databases as Nested Maps of Key/Value Pairs
  Laying out the Webtable
  HBase Distributed Storage Architecture
  Document Store Internals
  Storing Data in Memory-Mapped Files
  Guidelines for Using Collections and Indexes in MongoDB
  MongoDB Reliability and Durability
  Horizontal Scaling
  Understanding Key/Value Stores in Memcached and Redis
  Under the Hood of Memcached
  Redis Internals
  Eventually Consistent Non-relational Databases
  Consistent Hashing
  Object Versioning
  Gossip-Based Membership and Hinted Handoff
  Summary

CHAPTER 5: PERFORMING CRUD OPERATIONS
  Creating Records
  Creating Records in a Document-Centric Database
  Using the Create Operation in Column-Oriented Databases
  Using the Create Operation in Key/Value Maps
  Accessing Data
  Accessing Documents from MongoDB
  Accessing Data from HBase
  Querying Redis
  Updating and Deleting Data
  Updating and Modifying Data in MongoDB, HBase, and Redis
  Limited Atomicity and Transactional Integrity
  Summary

CHAPTER 6: QUERYING NOSQL STORES
  Similarities Between SQL and MongoDB Query Features
  Loading the MovieLens Data
  MapReduce in MongoDB
  Accessing Data from Column-Oriented Databases Like HBase
  The Historical Daily Market Data
  Querying Redis Data Stores
  Summary

CHAPTER 7: MODIFYING DATA STORES AND MANAGING EVOLUTION
  Changing Document Databases
  Schema-less Flexibility
  Exporting and Importing Data from and into MongoDB
  Schema Evolution in Column-Oriented Databases
  HBase Data Import and Export
  Data Evolution in Key/Value Stores
  Summary

CHAPTER 8: INDEXING AND ORDERING DATA SETS
  Essential Concepts Behind a Database Index
  Indexing and Ordering in MongoDB
  Creating and Using Indexes in MongoDB
  Compound and Embedded Keys
  Creating Unique and Sparse Indexes
  Keyword-based Search and MultiKeys
  Indexing and Ordering in CouchDB
  The B-tree Index in CouchDB
  Indexing in Apache Cassandra
  Summary

CHAPTER 9: MANAGING TRANSACTIONS AND DATA INTEGRITY
  RDBMS and ACID
  Isolation Levels and Isolation Strategies
  Distributed ACID Systems
  Consistency
  Availability
  Partition Tolerance
  Upholding CAP
  Compromising on Availability
  Compromising on Partition Tolerance
  Compromising on Consistency
  Consistency Implementations in a Few NoSQL Products
  Distributed Consistency in MongoDB
  Eventual Consistency in CouchDB
  Eventual Consistency in Apache Cassandra
  Consistency in Membase
  Summary

PART III: GAINING PROFICIENCY WITH NOSQL

CHAPTER 10: USING NOSQL IN THE CLOUD
  Google App Engine Data Store
  GAE Python SDK: Installation, Setup, and Getting Started
  Essentials of Data Modeling for GAE in Python
  Queries and Indexes
  Allowed Filters and Result Ordering
  Tersely Exploring the Java App Engine SDK
  Amazon SimpleDB
  Getting Started with SimpleDB
  Using the REST API
  Accessing SimpleDB Using Java
  Using SimpleDB with Ruby and Python
  Summary

CHAPTER 11: SCALABLE PARALLEL PROCESSING WITH MAP REDUCE
  Understanding MapReduce
  Finding the Highest Stock Price for Each Stock
  Uploading Historical NYSE Market Data into CouchDB
  MapReduce with HBase
  MapReduce Possibilities and Apache Mahout
  Summary

CHAPTER 12: ANALYZING BIG DATA WITH HIVE
  Hive Basics
  Back to Movie Ratings
  Good Old SQL
  JOIN(s) in Hive QL
  Explain Plan
  Partitioned Table
  Summary

CHAPTER 13: SURVEYING DATABASE INTERNALS
  MongoDB Internals
  MongoDB Wire Protocol
  Inserting a Document
  Querying a Collection
  MongoDB Database Files
  Membase Architecture
  Hypertable Under the Hood
  Regular Expression Support
  Bloom Filter
  Apache Cassandra
  Peer-to-Peer Model
  Based on Gossip and Anti-entropy
  Fast Writes
  Hinted Handoff
  Berkeley DB
  Storage Configuration
  Summary

PART IV: MASTERING NOSQL

CHAPTER 14: CHOOSING AMONG NOSQL FLAVORS
  Comparing NoSQL Products
  Scalability
  Transactional Integrity and Consistency
  Data Modeling
  Querying Support
  Access and Interface Availability
  Benchmarking Performance
  50/50 Read and Update
  95/5 Read and Update
  Scans
  Scalability Test
  Hypertable Tests
  Contextual Comparison
  Summary

CHAPTER 15: COEXISTENCE
  Using MySQL as a NoSQL Solution
  Mostly Immutable Data Stores
  Polyglot Persistence at Facebook
  Data Warehousing and Business Intelligence
  Web Frameworks and NoSQL
  Using Rails with NoSQL
  Using Django with NoSQL
  Using Spring Data
  Migrating from RDBMS to NoSQL
  Summary

CHAPTER 16: PERFORMANCE TUNING
  Goals of Parallel Algorithms
  The Implications of Reducing Latency
  How to Increase Throughput
  Linear Scalability
  Influencing Equations
  Amdahl's Law
  Little's Law
  Message Cost Model
  Partitioning
  Scheduling in Heterogeneous Environments
  Additional Map-Reduce Tuning
  Communication Overheads
  Compression
  File Block Size
  Parallel Copying
  HBase Coprocessors
  Leveraging Bloom Filters
  Summary

CHAPTER 17: TOOLS AND UTILITIES
  RRDTool
  Nagios
  Scribe
  Flume
  Chukwa
  Pig
  Interfacing with Pig
  Pig Latin Basics
  Nodetool
  OpenTSDB
  Solandra
  Hummingbird and C5t
  GeoCouch
  Alchemy Database
  Webdis
  Summary

APPENDIX: INSTALLATION AND SETUP INSTRUCTIONS
  Installing and Setting Up Hadoop
  Installing Hadoop
  Configuring a Single-node Hadoop Setup
  Configuring a Pseudo-distributed Mode Setup
  Installing and Setting Up HBase
  Installing and Setting Up Hive
  Configuring Hive
  Overlaying Hadoop Configuration
  Installing and Setting Up Hypertable
  Making the Hypertable Distribution FHS-Compliant
  Configuring Hadoop with Hypertable
  Installing and Setting Up MongoDB
  Configuring MongoDB
  Installing and Configuring CouchDB
  Installing CouchDB from Source on Ubuntu 10.04
  Installing and Setting Up Redis
  Installing and Setting Up Cassandra
  Configuring Cassandra
  Configuring log4j for Cassandra
  Installing Cassandra from Source
  Installing and Setting Up Membase Server and Memcached
  Installing and Setting Up Nagios
  Downloading and Building Nagios
  Configuring Nagios
  Compiling and Installing Nagios Plugins
  Installing and Setting Up RRDtool
  Installing Handler Socket for MySQL

INDEX

INTRODUCTION

THE GROWTH OF USER-DRIVEN CONTENT has fueled a rapid increase in the volume and type of data that is generated, manipulated, analyzed, and archived. In addition, varied newer sets of sources, including sensors, Global Positioning Systems (GPS), automated trackers and monitoring systems, are generating a lot of data. These larger volumes of data sets, often termed big data, are imposing newer challenges and opportunities around storage, analysis, and archival.

In parallel to the fast data growth, data is also becoming increasingly semi-structured and sparse. This means the traditional data management techniques around upfront schema definition and relational references are also being questioned.

The quest to solve the problems related to large-volume and semi-structured data has led to the emergence of a class of newer types of database products. This new class of database products consists of column-oriented data stores, key/value pair databases, and document databases. Collectively, these are identified as NoSQL.

The products that fall under the NoSQL umbrella are quite varied, each with their unique sets of features and value propositions. Given this, it often becomes difficult to decide which product to use for the case at hand. This book prepares you to understand the entire NoSQL landscape. It provides the essential concepts that are the building blocks for many of the NoSQL products. Instead of covering a single product exhaustively, it provides a fair coverage of a number of different NoSQL products. The emphasis is often on breadth and underlying concepts rather than a full coverage of every product API. Because a number of NoSQL products are covered, a good bit of comparative analysis is also included.

If you are unsure where to start with NoSQL and how to learn to manage and analyze big data, then you will find this book to be a good i
