




版權說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權,請進行舉報或認領
文檔簡介
1、Wei.LiSenior Data ArchitectureAlpine SolutionGreenplum數(shù)據(jù)庫數(shù)據(jù)庫l 海量并行處理 (Massively Parallel Processing) DBMSl 基于 PostgreSQL 8.2 相同的客戶端功能 增加支持并行處理的技術 增加支持數(shù)據(jù)倉庫和BI的特性 外部表(external tables)/并行加載(parallel loading) 資源管理 查詢優(yōu)化器增強(query optimizer enhancements)S1002 Network ConfigurationGreenplum 基本體系架構(gòu)基本體系架構(gòu)客戶端接
2、口和程序客戶端接口和程序l psqll pgAdmin IIIl ODBC/Datadirectl JDBCl Perl DBIl Pythonl libpqlOLE DBMaster Hostl 訪問系統(tǒng)的入口l 數(shù)據(jù)庫偵聽進程 (postgres)l 處理所有用戶連接l 建立查詢計劃l 協(xié)調(diào)工作處理過程l 管理工具l 系統(tǒng)目錄表和元數(shù)據(jù)(數(shù)據(jù)字典)l 不存放任何用戶數(shù)據(jù)l 每段(Segment)存放一部分用戶數(shù)據(jù)l 一個系統(tǒng)可以有多段l 用戶不能直接存取訪問l 所有對段的訪問都經(jīng)過Masterl 數(shù)據(jù)庫監(jiān)聽進程(postgres)監(jiān)聽來自Master的連接Segmentl Greenplu
3、m數(shù)據(jù)庫之間的連接層l 進程間協(xié)調(diào)和管理l 基于千兆以太網(wǎng)架構(gòu)l 屬于系統(tǒng)內(nèi)部私網(wǎng)配置l 支持兩種協(xié)議:TCP or UDPInterconnectGreenplum 高可用性體系架構(gòu)高可用性體系架構(gòu)l Standby 節(jié)點用于當 Master 節(jié)點損壞時提供 Master 服務l Standby 實時與 Master 節(jié)點的 Catalog 和事務日志保持同步Master/Standby 鏡像保護鏡像保護l 每個Segment的數(shù)據(jù)冗余存放在另一個Segment上,數(shù)據(jù)實時同步l 當Primary Segment失敗時,Mirror Segment將自動提供服務l Primary Segme
4、nt恢復正常后,使用gprecoverseg F 同步數(shù)據(jù)。數(shù)據(jù)冗余數(shù)據(jù)冗余-Segment 鏡像保護鏡像保護l Hash分布 CREATE TABLE DISTRIBUTED BY (column ,) 同樣數(shù)值的內(nèi)容被分配到同一個Segment上l 循環(huán)分布 CREATE TABLE DISTRIBUTED RANDOMLY 具有同樣數(shù)值的行內(nèi)容并不一定在同一個Segment上表分布的策略表分布的策略-并行計算的基礎并行計算的基礎表分布的策略表分布的策略-并行計算的基礎并行計算的基礎查詢命令的執(zhí)行查詢命令的執(zhí)行l(wèi) 舉例說明:按卡號、客戶號、機構(gòu)的分布方式優(yōu)劣點分布存儲分布存儲查詢命令的執(zhí)行
5、查詢命令的執(zhí)行查詢命令的執(zhí)行查詢命令的執(zhí)行SQL查詢處理機制查詢處理機制SELECT customer, amount FROM sales JOIN customer USING (cust_id) WHERE date=04302008;并行并行查詢計劃查詢計劃表分區(qū)的概念表分區(qū)的概念l 將一張大表邏輯性地分成多個部分,如按照分區(qū)條件進行查詢,將減少數(shù)據(jù)的掃描范圍,提高系統(tǒng)性能。l 提高對于特定類型數(shù)據(jù)的查詢速度和性能l 也可以更方便數(shù)據(jù)庫的維護和更新l 兩種類型: Range分區(qū) (日期范圍或數(shù)字范圍)/如日期、價格等 List 分區(qū),例如地區(qū)、產(chǎn)品等l Greenplum中的表分區(qū)在使
6、用中具有總表的繼承性,并通過Check參數(shù)指定相應的子表l 分區(qū)的子表依然根據(jù)分布策略分布在各segment上l 分區(qū)是一種非常有用的優(yōu)化措施,例如一年的交易按交易日期分區(qū)后,查詢一天的交易性能將提高365倍!Segment 1ASegment 1BSegment 1CSegment 1DSegment 2ASegment 2BSegment 2CSegment 2DSegment 3ASegment 3BSegment 3CSegment 3DJan 2005Feb 2005Mar 2005Apr 2005May 2005Jun 2005Jul 2005Aug 2005Sep 2005Oct
7、 2005Nov 2005Dec 2005每個分區(qū)表的數(shù)據(jù)平均分布到各個節(jié)點表分區(qū)可減少數(shù)據(jù)的搜索范圍,提高查詢性能Data Distribution & PartitioningSegment 1ASegment 1BSegment 1CSegment 1DSegment 2ASegment 2BSegment 2CSegment 2DSegment 3ASegment 3BSegment 3CSegment 3DSegment 1ASegment 1BSegment 1CSegment 1DSegment 2ASegment 2BSegment 2CSegment 2DSegmen
8、t 3ASegment 3BSegment 3CSegment 3DSegment 1ASegment 1BSegment 1CSegment 1DSegment 2ASegment 2BSegment 2CSegment 2DSegment 3ASegment 3BSegment 3CSegment 3DSELECT COUNT(*) FROM orders WHERE order_date= Oct 20 2005 AND order_date Oct 27 2005VSHash DistributionHash Distribution+ Table PartitioningSegmen
9、t 1ASegment 1BSegment 1CSegment 1DSegment 2ASegment 2BSegment 2CSegment 2DSegment 3ASegment 3BSegment 3CSegment 3DFull Table Scan VS. Partition Pruning表分區(qū)示意圖表分區(qū)示意圖壓縮存儲壓縮存儲l 壓縮存儲 支持ZLIB和QUICKLZ方式的壓縮,壓縮比可到10:1 壓縮表只能是Append Only方式 壓縮數(shù)據(jù)不一定會帶來性能的下降,壓縮表將消耗CPU資源,而減少I/O資源占用l 語法CREATE TABLE foo (a int, b tex
10、t) WITH (appendonly=true, compresstype=zlib, compresslevel=5); 行列存儲行列存儲l Greenplum支持行或列存儲模式 列模式目前只支持Append Only 如果常用的查詢只取表中少量字段,則列模式效率更高,如查詢需要取表中的大量字段,行模式效率更高 語法: CREATE TABLE sales2 (LIKE sales) WITH (appendonly=true, orientation=column); 效率比較測試:測試1:需要去表中所有字段,此時行存儲更快。select * from dw_ods.s1_sema_sc
11、mcaccp_row where crdacct_nbr= 4033930000166380411;41秒select * from dw_ods.s1_sema_scmcaccp_col where crdacct_nbr= 4033930000166380411;116秒測試2:只取表中少量字段,列存儲更快select crdacct_status from dw_ods.s1_sema_scmcaccp_row where crdacct_nbr= 4033930000166380411;35秒select crdacct_status from dw_ods.s1_sema_scmca
12、ccp_col where crdacct_nbr= 4033930000166380411;3秒外部表加載外部表加載l 外部表的特征 Read-only 數(shù)據(jù)存放在數(shù)據(jù)庫外 可執(zhí)行SELECT, JOIN, SORT等命令,類似正規(guī)表的操作l 外部表的優(yōu)點 并行方式加載 ETL的靈活性 格式錯誤行的容錯處理 支持多種數(shù)據(jù)源l 兩種方式 External Tables: 基于文件 Web Tables: 基于URL或指令基于外部表的高速數(shù)據(jù)加載基于外部表的高速數(shù)據(jù)加載l 利用并行數(shù)據(jù)流引擎,Greenplum可以直接用SQL操作外部表l 數(shù)據(jù)加載完全并行Master主機Segment主機內(nèi)部互
13、聯(lián)網(wǎng)千兆以太網(wǎng)交換機gpfdistgpfdistSegment主機Segment主機Segment主機外部表文件外部表文件ETL服務器內(nèi)部網(wǎng)絡外部表加載的特征外部表加載的特征l 并行數(shù)據(jù)加載提供最好的性能l 能夠處理遠程存儲的文件l 采用HTTP協(xié)議l 200 MB/s data distribution rate per gpfdistl gpfdist文件分發(fā)守護進程啟動:gpfdist -d /var/load_files/expenses -p 8080 -l /home/gpadmin/log &l 外部表定義:CREATE EXTERNAL TABLE ext_expens
14、es ( name text, date date, amount float4, description text )LOCATION (gpfdist/etlhost:8081/*,gpfdist/etlhost:8082/*)FORMAT TEXT (DELIMITER |)ENCODING UTF-8LOG ERRORS INTO ext_expenses_loaderrors SEGMENT REJECT LIMIT 10000 ROWS ;lLoad good rows and catch poorly formatted rows, such as: rows with miss
15、ing or extra attributes rows with attributes of the wrong data type rows with invalid client encoding sequenceslDoes not apply to constraint errors: PRIMARY KEY, NOT NULL, CHECK or UNIQUE constraintslOptional error handling clause for external tables:LOG ERRORS INTO error_table SEGMENT REJECT LIMIT
16、count ROWS | PERCENT( PERCENT based on gp_reject_percent_threshold parameter )lExampleCREATE EXTERNAL TABLE ext_customer (id int, name text, sponsor text) LOCATION ( gpfdist:/filehost:8081/*.txt ) FORMAT TEXT ( DELIMITER | NULL ) LOG ERRORS INTO err_customer SEGMENT REJECT LIMIT 5 ROWS; 外部表加載異常處理外部表
17、加載異常處理l Data resides outside the databasel No database statistics for external table datal Not meant for frequent or ad-hoc accessl Can manually set rough statistics in pg_class: UPDATE pg_class SET reltuples=400000, relpages=400 WHERE relname=myexttable;外部表靜態(tài)統(tǒng)計優(yōu)化外部表靜態(tài)統(tǒng)計優(yōu)化l PostgreSQL commandSupport
18、 loading and unloadingOptimized for loading a large number of rowsLoads all rows in one command (not parallel)Loads data from a file or from standard inputSupports error handling as does external tablesl EXAMPLECOPY mytable FROM /data/myfile.csv WITH CSV HEADER; (文件生成在Master)COPY mytable FROM /data/
19、myfile.csv WITH CSV HEADER;(文件生成在本地)COPY country FROM /data/gpdb/country_data WITH DELIMITER | LOG ERRORS INTO err_country SEGMENT REJECT LIMIT 10 ROWS;COPY SQL 命令命令l Drop indexes and recreate after load Increase maintenance_work_mem parameter to speed up CREATE INDEX operationsl Run ANALYZE after l
20、oadl Run VACUUM after load errors、delete、upate。l Do not use ODBC INSERT to load large volumes of data數(shù)據(jù)加載性能優(yōu)化提示數(shù)據(jù)加載性能優(yōu)化提示l 限制查詢隊列的激活數(shù)l 防止系統(tǒng)過載(CPU, disk I/O, memory)資源負載管理資源負載管理(Workload Management)資源隊列的兩種典型管理方式資源隊列的兩種典型管理方式l Resource Queue Limits ACTIVE THRESHOLDEXAMPLE: CREATE RESOURCE QUEUE adhoc
21、ACTIVE THRESHOLD 10 IGNORE THRESHOLD 1000.0; COST THRESHOLDEXAMPLES: CREATE RESOURCE QUEUE batch1 COST THRESHOLD 1000000.0 NOOVERCOMMIT;CREATE RESOURCE QUEUE batch1 COST THRESHOLD 1e+6;Greenplum 性能監(jiān)控器性能監(jiān)控器l Highly interactive web-based performance monitoringl Real-time and historic views of:Resource
22、 utilizationQueries and query internalsDashboardGreenplum Performance MonitorSystem MetricsGreenplum Performance MonitorQuery MonitorGreenplum Performance MonitorBackups and Restoresl Parallel backups (gp_dump)l Parallel restores (gp_restore)l Automating dump/restores (gpcrondump, gpdbrestore)l Non-
23、parallel backups and restores(pg_dump/pg_restore/psql)備份與恢復備份與恢復l 用于在同構(gòu)環(huán)境間遷移數(shù)據(jù)結(jié)構(gòu)、數(shù)據(jù)、function備份恢復并行備份和恢復并行備份和恢復(gp_dump/gp_restore)l 用于在異構(gòu)環(huán)境間遷移數(shù)據(jù)結(jié)構(gòu)、數(shù)據(jù)、function串行備份和恢復串行備份和恢復(pg_dump/pg_restore)l Each active segment is dumped in parallell Dump files created in segment data directory by defaultl Suppor
24、ts compression (gzip)l Ensure sufficient disk space for dump filesl A backup set is identified by a timestamp keylExample:Back up a database: gp_dump gpdbBack up a database, and create dump files in a centralized location on all hosts: gp_dump -gp-d=/home/gpadmin/backups gpdbBack up a particular sch
25、ema only: gp_dump -n myschema mydatabaseBack up a single segment instance only (by noting the dbid of the segment instance): gp_dump -gp-s=i5 gpdbRunning a Parallel Backup (gp_dump)lOn the master host gp_catalog_1_ gp_cdatabase_1_ gp_dump_1_ gp_dump_status_1_ lOn the segment hosts gp_dump_0_gp_dump_
26、status_0_Dump Files Created by gp_dumpl Use gp_restore commandl Need timestamp key from gp_dumpl Make sure dump files are placed on correct segment hostsl Make sure database exists before restoringl Database-level server configuration settings are not restoredl Examples Restore an Greenplum database
27、 using backup files created by gp_dump:gp_restore -gp-k=2005103112453 -d gpdb Restore a single segment instance only (by noting the dbid of the segment instance):gp_restore -gp-k=2005103112453 -d gpdb -gp-s=i5Running a Parallel Restore (gp_restore)l gpcrondumpCalls to gp_dumpCan be called directly o
28、r can schedule using CRONSend email notificationsFlexible dump optionsCan copy configuration filesCan dump system catalogsCan dump global objectsCan include a post-dump scriptl gpdbrestoreRestores from gpcrondump filesCan restore from an archive host no need to pre-place dump files on segmentsSchedu
29、ling Routine Backupsl Green plum also supports pg_dump and pg_restore l Useful for migrating data to/from other DBMSl pg_dump creates a single dump fileCan be slow on very large databasesRun at low-usage timesSupports compressionCan dump data as INSERT or COPY commandsgp-syntax option includes DISTR
30、IBUTED BY statements in DDLNon-Parallel Backups and Restoresl Dump a database called mydb into a SQL-script file:pg_dump mydb db.sqll To reload such a script into a (freshly created) database named newdb:psql -d newdb -f db.sqll Dump a Greenplum database in tar file format and include distribution p
31、olicy information:pg_dump -Ft -gp-syntax mydb db.tarl To dump a database into a custom-format archive file:pg_dump -Fc mydb db.dumpl To reload an archive file into a (freshly created) database named newdb:pg_restore -d newdb db.dumpl To dump a single table named mytab:pg_dump -t mytab mydb db.sqll T
32、o specify an upper-case or mixed-case name in -t and related switches, you need to double-quote the name; else it will be folded to lower case. But double quotes are special to the shell, so in turn they must be quoted. Thus, to dump a single table with a mixed-case name, you need something like:pg_
33、dump -t MixedCaseName mydb mytab.sqlNon-Parallel Backups and Restores Example客戶端工具客戶端工具l pgAdmin3 圖形化管理和SQL執(zhí)行/分析/監(jiān)控工具l psql 行命令操作和管理工具pgAdmin3 for GPDBl pgAdmin3 is the leading graphical Open Source management, development and administration tool for PostgreSQL Greenplum has contributed extensive GP
34、DB-specific enhancements With GPDB 3.3, Greenplum ships and supports this toolpgAdmin3 for GPDBpgAdmin3 for GPDBl 監(jiān)控活動session ,同SQL:select * from pg_stat_activity;l 監(jiān)控鎖,從pg_lock中獲取信息l 可以停止正在運行的SQLPSQLl Connect through the masterl Connection information database name (-d | PGDATABASE) master host nam
35、e (-h | PGHOST) master port (-p | PGPORT) user name (-U | PGUSER)l First time connections template1 database default superuser account (gpadmin)Issuing SQL Statementsl Interactive modepsql mydatabasemydatabase=# SELECT * FROM foo; l Non-interactive mode (single command) psql mydatabase ac “SELECT *
36、FROM foo;”l Non-interactive mode (multiple commands) psql mydatabase af /home/lab1/sql/createdb.sqll (Use semi-colon (;) to denote end of a statement)Common PSQL Meta Commandsl ? (help on psql meta-commands) l h (help on SQL command syntax)l dt (show tables) l dtS (show system tables) l dg or du (sh
37、ow roles) l l (show databases)l c db_name (connect to this database) l q (quit psql) l ! (Enter into shell mode)l df (show function)l dn(show schema)l Set search_path=l timingpostgresql.confLocalLocalLocalLocall 參數(shù)參考Adminguidel 重要參數(shù):max_connection,share_buff,work_meml Local 變量的修改,如max_stack_depth需要修
38、改所有segment上的valueLocal, Global, and Master-Onlypostgresql.conf filel Located in master or segment instances data directoryl Used to set configuration parameters on the system levell Parameters that are using the default setting are commented out (#)l Requires a restart (or reload using gpstop -u) fo
39、r changes to take effectViewing Parameter Settingsl View a specific parameter setting Example: SHOW search_path;l View all parameter settings Example: SHOW ALL;l Set parameter Example: set search_path=public set client_encoding=gb18030 Configuring Host-Based Authenticationl 客戶端授權 是否允許從某個客戶端的連接 用戶是否可
40、以連接到所請求的數(shù)據(jù)庫l pg_hba.conf file 基于host address, database, and/or DB user account控制權限 位于master和segment實例的數(shù)據(jù)目錄中 系統(tǒng)初始化時進行default配置Default Master Host pg_hba.confl Local connections allowed for Greenplum superuserl Remote connections not allowedl EXAMPLE# TYPE DATABASE USER CIDR-ADDRESS METHOD local all g
41、padmin ident sameuser local all all ident sameuser host all gpadmin /0 trust 配置配置 pg_hba.confl EXAMPLE# TYPE DATABASE USER CIDR-ADDRESS METHOD localallall trust hostall all/32 trust hostall all:1/128 trust hostall all06/32 trust hostcarddwetl2/32md5 hostgpadmina
42、ll/24md5l gpstop -u 可與在不重啟數(shù)據(jù)庫方式下,讓設置生效SQL語法語法l 具體參考Gp sql language:數(shù)據(jù)類型數(shù)據(jù)類型l 常用數(shù)據(jù)類型 CHAR,VARCHAR,TEXT Smallint ,integer,bigint Numeric, real,double precision Timestamp,date,time Boolean Array 類型。如 integerl 其它數(shù)據(jù)類型請參考lAll system catalogs in pg_catalog schemalStandard PostgreSQL system cata
43、logs (pg_*)lGreenplum-specific catalogs: gp_configuration gp_distribution_policy gp_id gp_version_at_initdb pg_resqueue pg_exttable pg_tables pg_class pg_stat_activitylTo list all system catalog tables in psql: dtSlTo list all system views in psql: dvSl其它 catalog 參考 System Catalog Tables and Views函數(shù)
44、函數(shù)l 日期函數(shù)Extract(day|month|year。 From date);Select date + 1 day:interval,date+ 1 month:intervalSELECT date_part(day, TIMESTAMP 2001-02-16 20:38:40); Result: 16SELECT date_trunc(hour, TIMESTAMP 2001-02-16 20:38:40); Result: 2001-02-16 20:00:00 pg_sleep(seconds); 系統(tǒng)日期變量Current_dateCurrent_timeCurrent_t
45、imestampNow()Timeofday() 在 事務中發(fā)生變化,以上函數(shù)在事務中不變函數(shù)函數(shù)l 字符串處理函數(shù)Substr/length/lower/upper/trim/replace/positionrPad/lpadTo_char, | (字符串連接)substring(string from pattern/ * ,like,simillar to (模式匹配)l 其它雜類Case 。When/Coalescenullifgenerate_seriesIn/not in/exists/any/all各類函數(shù)參考:存儲過程存儲過程l Greenplum支持SQL/PYTHON/PE
46、RL/C語言構(gòu)建函數(shù),以下著重介紹SQL 存儲過程。 一個存儲過程就是一個事務,包括對子過程的調(diào)用都在一個事務內(nèi) 存儲過程結(jié)構(gòu):CREATE FUNCTION somefunc() RETURNS integer AS $DECLARE quantity integer := 30;BEGIN RETURN .;END;$ LANGUAGE plpgsql; 賦值給一個變量或行/記錄賦值用下面方法:identifier := expression例子:user_id := 20; 執(zhí)行一個沒有結(jié)果的查詢: PERFORM query;一個例子:PERFORM create_mv(cs_sess
47、ion_page_requests_mv, my_query);存儲過程請參考:存儲過程請參考:存儲過程存儲過程l動態(tài)SQLEXECUTE command-string INTO STRICT target;lSELECT INTOExample:SELECT ID INTO VAR_ID FROM TABLEAl獲取結(jié)果狀態(tài)GET DIAGNOSTICS variable = item , .;一個例子:GET DIAGNOSTICS integer_var = ROW_COUNT;lSQL返回變量SQLERRM, SQLSTATEl控制結(jié)構(gòu)IF . THEN . ELSEIF . THEN
48、 . ELSELOOP, EXIT, CONTINUE, WHILE, FOR l從函數(shù)返回有兩個命令可以用來從函數(shù)中返回數(shù)據(jù):RETURN 和 RETURN NEXT 。Syntax:RETURN expression;l設置回調(diào)EXEC SQL WHENEVER condition action; condition 可以是下列之一: SQLERROR,SQLWARNING,NOT FOUND存儲過程存儲過程l異常處理EXCEPTION WHEN unique_violation THEN - do nothing END;忽略錯誤:EXCEPTION WHEN OTHERS THEN R
49、AISE NOTICE an EXCEPTION is about to be raised; RAISE EXCEPTION NUM:%, DETAILS:%, SQLSTATE, SQLERRM;END;l錯誤和消息RAISE level format , expression , .;Level:Info:信息輸入Notice:信息提示Exception:產(chǎn)生一個例外,將退出存儲過程Example: RAISE NOTICE Calling cs_create_job(%), v_job_id;Performance Tuningl Introduction to performance
50、 tuningl Common performance problemsl Tracking down a performance probleml Query profiling (EXPLAIN, EXPLAIN ANALYZE)l Query tuningl 我的經(jīng)驗:性能調(diào)優(yōu)性能調(diào)優(yōu)lSet performance expectationsacceptable response times, queries per minute, etc.BenchmarkslKnow your baseline hardware performancethroughput / capacitylKn
51、ow your workloadheavy usage timesresource contentiondata contentionlFocus your optimizationsApproaching a Performance Tuning Initiative lHardware Issues / Failed SegmentslResource AllocationlContention Between Concurrent WorkloadslInaccurate Database StatisticslUneven Data DistributionlSQL Formulati
52、onlDatabase Design Common Causes of Performance IssueslDisk failureslHost failureslNetwork failureslOS not tuned for GreenplumlDisk Capacity 70% maximum recommended VACUUM after updates, deletes and loadslVACUUM configuration parameters max_fsm_relations = tables + indexes + system tables max_fsm_pa
53、ges = 16 * max_fsm_relations Hardware Issuesl Greenplum resource queueslimit active queries in the systemlimit the size of a query a particular user can runlPerform admin tasks at low usage timesData loading, ETLVACUUM, ANALYZEBackupslDesign applications to prevent lock conflictsConcurrent sessions
54、not updating the same data at the same timelResource-related Configuration Parameterswork_mem = 32MBmaintenance_work_mem = 64MBshared_buffers = 125MBResource Allocation and ContentionlDatabase statistics used by the query plannerlRun ANALYZE afterData loadsRestores from backupsChanges to schema (add
55、ing indexes, etc.)Inserts, Updates, or DeleteslCan configure statistics collection default_statistics_target = 25 gp_analyze_relative_error = .25 on specific table columnsALTER TABLE name ALTER column SET STATISTICS # Database Statistics (ANALYZE)lTable Distribution Key Considerations Even data dist
56、ributionLocal vs. distributed operationsEven query processing lChecking for data skewgpskew t schema.tableUnix system utilities (gpssh):gpssh f seg_host -sar 1 100lRebalancing a Table CREATE TABLE sales_temp (LIKE sales) DISTRIBUTED BY (date, total, customer); INSERT INTO sales_temp SELECT * FROM sa
57、les; DROP sales; ALTER TABLE sales_temp RENAME TO sales; Greenplum Data Distributionl Data Type Selectionsmallest size possible to fit data INTEGER, not NUMERIC(11,2) for integer fieldssame data type across tables for join columnsconsider hash of wide join keys, using BYTEA instead of CHAR(100) varc
58、har or text for character data lDenormalization (star schema)lTable Partitioning Database DesignlUse sparingly in Greenplum DatabaselTry workload without indexes firstlCan improve performance of OLTP type workloadslOther index considerations:Avoid on frequently updated columnsAvoid overlapping index
59、esUse bitmap indexes where applicable instead of B-treeDrop indexes for loadsConsider a clustered indexlConfiguring Index Usage:enable_indexscan = on | offDatabase Design - IndexeslGeneral ConsiderationsKnow your dataMinimize returned rowsAvoid unnecessary columns/tables in result setAvoid large sor
60、ts if possibleMatch data types in joinslGreenplum-specific ConsiderationsJoin on common distribution key columns when possibleConsider data distribution policy and query predicatesSQL FormulationlSystem Catalog Tables and Viewspg_stat_activitypg_locks / pg_classlDatabase LogsLocated in master (and segment)
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預覽,若沒有圖紙預覽就沒有圖紙。
- 4. 未經(jīng)權益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負責。
- 6. 下載文件中如有侵權或不適當內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準確性、安全性和完整性, 同時也不承擔用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 會場設備租賃合同范本
- 醫(yī)美針劑合同范本
- 創(chuàng)業(yè)課題申報書怎么寫好
- 廠房帶看合同范例
- 午休托管班合同范本
- 廠房排氣安裝合同范本
- 代加工燈具合同范本
- 包辦入學合同范本
- 單位委托印刷合同范本
- 推動農(nóng)村充電基礎設施發(fā)展計劃
- 2025年復工復產(chǎn)培訓考核試卷及答案
- 2025年01月中國疾控中心信息中心公開招聘1人筆試歷年典型考題(歷年真題考點)解題思路附帶答案詳解
- 北京市豐臺區(qū)2024-2025學年高二上學期期末英語試題
- 2025年高三第二學期物理備課組教學工作計劃
- 丁香園:2024年12月全球新藥月度報告-數(shù)據(jù)篇
- 生產(chǎn)與運作管理-第5版 課件全套 陳志祥 第1-14章 生產(chǎn)系統(tǒng)與生產(chǎn)運作管理概述 -豐田生產(chǎn)方式與精益生產(chǎn)
- 2025年湖南理工職業(yè)技術學院高職單招職業(yè)技能測試近5年??及鎱⒖碱}庫含答案解析
- 2024年西安航空職業(yè)技術學院高職單招職業(yè)適應性測試歷年參考題庫含答案解析
- 臨平區(qū)九年級上學期期末考試語文試題(PDF版含答案)
- 2024年港作拖輪項目可行性研究報告
- 課題申報書:“四新”建設與創(chuàng)新創(chuàng)業(yè)人才培養(yǎng)基本范式研究
評論
0/150
提交評論