f314b3a8a8890b58f5ad41d708c5a833.ppt
- Количество слайдов: 51
<Insert Picture Here> Oracle Database 11 g for Data Warehousing Presenter’s Name Presenter’s Title
Agenda • Technology • Monitoring • Information Life-cycle Management (ILM) • Oracle Optimized Warehouse Initiative • Market <Insert Picture Here>
<Insert Picture Here> Technology
Parallel Execution select c. cust_last_name , sum(s. amount_sold) from customers c, sales s where c. cust_id = s. cust_id group by c. cust_last_name ; Data on Disk Parallel Servers scan join aggregate Scanners Joiners Aggregators Coordinator
Partitioning – Benefits D OR S S ER D S ER R DE OR OR N JA A US E OP R EU B FE JA N B FE Large Table Partition Composite Partition Difficult to Manage Divide and Conquer Better Performance Easier to Manage More flexibility to match business needs Improve Performance Transparent to applications
Partitioning in Oracle Database 11 g Interval Partitioning • Partitions are created automatically as data arrives Y R TO S N VE IN ER RD S ER D O B OR R E NF JA J AN B FE MA R P RA N JA F A BM E
Partitioning in Oracle Database 11 g Complete Composite Partitioning • • Range – range List – list List – hash List – range S OR R DE >5 000 RS 00 >50 E RD O B FE J RANGE-RANGE Order Date by Order Value ld Go O 0100 00 50 AN RS DE R SA U EU P RO E LIST-RANGE Region by Order Value r ilve S PE A US RO EU LIST-LIST Region by Customer Type
Partitioning in Oracle Database 11 g Reference Partitioning • Inherit partitioning strategy ck Sto lds Ho e Lin s Item k toc s S ld Ho S O R DE R ackrs B e Ord ick s P t Lis e Lin s Item S ER D OR ck Ba ers Ord Partition ORDERS by Date k Pic ts s Li N JA ck Sto lds Ho e Lin s m Ite OR ck Ba ers Ord S R DE k Pic ts Lis R MA e Lin s Item ck Sto lds Ho RS DE OR k Pic ts Lis ck Ba ers Ord B FE ck Sto lds Ho e Lin s Item D OR ck Ba ers Ord S ER k Pic ts s Li R AP
Partitioning in Oracle Database 11 g Virtual Column-Based Partitioning ORDERS ORDER_ID -----9834 -US-14 8300 -EU-97 3886 -EU-02 2566 -US-94 3699 -US-63 ORDER_DATE CUSTOMER_ID. . . ----------- -12 -JAN-2007 65920 14 -FEB-2007 39654 16 -JAN-2007 4529 19 -JAN-2007 15327 02 -FEB-2007 18733 REGION AS (SUBSTR(ORDER_ID, 6, 2)) -----US EU EU US US S requires no storage • Partition by ORDER_DATE, REGION • REGION R DE OR N JA B FE A US E OP R EU
Compression • Tables and indexes can be compressed • Can be specified on a per-partition basis • Typical compression ratio 3: 1 • Requires more CPU to load data • Decompression hardly costs resources • Compress for all DML operations • Less data on disk • Requires less time to read • Completely transparent Up To 3 X Compression
SQL Query Result Cache • Store query results in cache • Repetitive executions can use cached result • Data Warehouse queries • Long-running, IO-intensive • Expensive computations • Return few rows • Excellent opportunity for SQL Query Result Cache ---------------------------------| Id | Operation | Name | ---------------------------------| 0 | SELECT STATEMENT | | | 1 | RESULT CACHE | fz 6 cm 4 jbpcwh 48 wcyk 60 m 7 qypu | | 2 | SORT GROUP BY ROLLUP | | |* 3 | HASH JOIN | | etc.
SQL Query Result Cache Opportunity • Retail customer data (~50 GB) • Concurrent users submitting queries randomly • Executive dashboard with 12 heavy analytical queries • Cache results only at in-line view level • 12 queries run in random, different order – 4 queries cached • Measure average, total response time for all users # Users No cache Cache Improvement 2 186 s 141 s 24% 4 267 s 201 s 25% 8 447 s 334 s 25%
Other Performance Features Transparent to Your Application • Materialized Views • Transparent rewrites of expensive queries • Including rewrites on remote objects • Incremental automatic refresh • Bitmap Indexes • Optimal storage • Ideal for star look-a-like schemas • SQL Access Advisor – based on workload • Materialized view advice • Index advice • Partition advice
Bring Algorithms to the Data Not Data to the Algorithms • Analytic computations done in the database SQL analytics OLAP • • SQL Analytics OLAP Data Mining Statistics • Scalability • Security Data Mining Statistics • Backup & Recovery • Simplicity
Native Support for Pivot and Unpivot SALESREP Q 1 Q 2 Q 3 Q 4 ----- -----100 230 240 260 300 101 200 220 250 260 102 260 280 265 310 SALESREP -----100 100 101 101 102 102 QU REVENUE -- -----Q 1 230 Q 2 240 Q 3 260 Q 4 300 Q 1 200 Q 2 220 Q 3 250 Q 4 260 Q 1 260 Q 2 280 Q 3 265 Q 4 310
Native Support for Pivot and Unpivot QUARTERLY_SALESREP Q 1 Q 2 Q 3 Q 4 ----- -----100 230 240 260 300 101 200 220 250 260 102 260 280 265 310 SALESREP -----100 100 101 101 102 102 QU REVENUE -- -----Q 1 230 Q 2 240 Q 3 260 Q 4 300 Q 1 200 Q 2 220 Q 3 250 Q 4 260 Q 1 260 Q 2 280 Q 3 265 Q 4 310 select * from quarterly_sales unpivot include nulls (revenue for quarter in (q 1, q 2, q 3, q 4)) order by salesrep, quarter ;
Native Support for Pivot and Unpivot SALES_BY_QUARTER SALESREP 'Q 1' 'Q 2' 'Q 3' 'Q 4' ----- -----100 230 240 260 300 101 200 220 250 260 102 260 280 265 310 SALESREP -----100 100 101 101 102 QU REVENUE -- -----Q 1 230 Q 2 240 Q 3 160 Q 4 90 Q 3 100 Q 4 140 Q 4 70 Q 1 200 Q 2 220 Q 3 250 Q 4 260 Q 1 260 select * from sales_by_quarter pivot (sum(revenue) for quarter in ('Q 1', 'Q 2', 'Q 3', 'Q 4')) order by salesrep ;
Transform Data Where Data Resides In-database ETL technology Extract Load Transform Insert Data Pump Transportable Tablespaces Partition Exchange Loading Change Data Capture Distributed Queries SQL*Loader External Tables Table Functions Multi-Table Insert MERGE DML error logging
Asynchronous Change Data Capture PMOPs Log files Time-based subscription windows Oracle Database 11 g Change Data Read-consistent subscription Log Miner and Streams DW Tables Transform SQL, PL/SQL, Java • Capture changes from [redo | archive] logs • No changes to source applications • Minimal performance impact on source applications OLTP DB • Store changes in change tables • Provide (bulk) SQL interface to change data
RAC – Scale Incrementally W 300% o r 200% k l 100% o d 3 6 9 12 15 Months 18 21 24
Automatic Storage Management • Storage pool for database files • Load-balanced across disks • Capacity on demand • Add/remove storage on-line • Automatic IO load balancing • Fault tolerant, high performance • Automatically mirrors and stripes • Low cost • No IO tuning required • No volume manager or file system needed
Mixed Workloads report Budget table update accurate report • Concurrent small data loads and queries • Looks like. . . OLTP • Oracle's read consistency • Readers never block writers Rollback • Writers never block readers Segment Before • Queries are always consistent and auditable Image • No deadlocks • Introduced in Oracle V 4 (1982) – major improvements in V 6 (1988)
Database Resource Manager • Protect the system pro-actively • Maximum number of concurrent operations • Priority-dependent maximum Degree Of Parallelism (DOP) Sales Analysis High Priority Ad Hoc Reports Medium Priority ETL Jobs Low Priority 20 users (DOP 10) 200 users (DOP 4)
Oracle Database Security Authorize Access Control Protect stored data Audit Authenticate Protect data in transit Marketing Finance Sales Identity Management
Feature Usage for Large-Scale Data Warehouses Partitioning, parallelism, and compression are the foundation for large-scale data warehousing Source: TB Club Report: A survey of 30 multi-TB Oracle DW’s – data July 2006
<Insert Picture Here> Monitoring
I/O Monitoring Database Control
I/O Monitoring Database Control
Parallel Execution Monitoring Database Control
Near Real-Time SQL Monitoring Coming in Grid Control
Parallel SQL Monitoring Coming in Grid Control
<Insert Picture Here> Information Life-cycle Management (ILM)
Information Lifecycle Management “The policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost effective IT infrastructure from the time information is conceived through its final disposition. ” Storage Networking Industry Association (SNIA) Data Management Forum Active Data Less Active Data Historical Data
Information Lifecycle Management Orders Q 1 Orders Active High Performance Storage Tier Q 2 Orders Q 3 Orders Less Active Low Cost Storage Tier Q 4 Orders Older Orders Historical Online Archive Storage Tier
Traditional Storage Approach All data resides on a single storage tier High Performance Storage Tier = $72 per Gb Active All data on active = $972, 000!
Partitioning is the Foundation for ILM Partition data onto appropriate storage tier High Performance Storage Tier = $72 per Gb Active Low cost Storage Tier = $14 per Gb Less Active Read only Storage Tier = $7 per Gb Historical
Partitioning is the Foundation for ILM Move data onto appropriate storage tier High Performance Storage Tier = $72 per Gb 5% Active Low cost Storage Tier = $14 per Gb 35% Less Active Read only Storage Tier = $7 per Gb 60% Historical
Partitioning is the Foundation for ILM Reduce storage costs accordingly High Performance Storage Tier = $72 per Gb 5% Active $49, 800 Low cost Storage Tier = $14 per Gb 35% Less Active $67, 700 Read only Storage Tier = $7 per Gb 60% Historical $58, 000
Introduce Compression Reduce storage costs across all tiers 5% Active $49, 800 35% Less Active $67, 700 60% Historical $58, 000 Lets use compression factor of 3 $16, 600 $22, 600 $19, 400
Cost Savings by Storage Tier
<Insert Picture Here> Oracle Optimized Warehouse Initiative
Oracle Optimized Warehouse Initiative Goals for Oracle data warehouse solutions: • Provide superior system performance • Provide a superior customer experience
Full Range of DW Solution Options Custom • Flexibility for the most demanding data warehouse • Benefits: High performance Unlimited scalability Completely customizable Industry-leading database and hardware Optimized Warehouse Reference Configuration • Documented best-practice configurations for data warehousing • Benefits: High performance Simple to scale; modular building blocks Industry-leading database and hardware Available today with HP, IBM, Sun, EMC/Dell • Database Options • Management Packs • Scalable systems preinstalled and preconfigured: ready to run out-of-the-box • Database Options • Management Packs High performance Simple to buy Fast to implement Easy to maintain Competitively priced • Partitioning • RAC Pre-configured, Pre-installed, Validated Flexibility
<Insert Picture Here> Market
Data Warehouse Market Oracle is the Data Warehousing DBMS Market Leader Source: IDC, 2006 - Worldwide Data Warehousing Tools 2005 Vendor Shares
Leading Scalability Wintercorp VLDB Survey 1998 Survey Sears HCIA Wal-Mart Tele Danmark Citicorp MCI NDC Health Sprint Ford Acxiom Teradata Informix Teradata DB 2 Informix Oracle Teradata Oracle 2003 Survey 4. 63 4. 50 4. 42 2. 84 2. 47 1. 88 1. 85 1. 30 1. 20 1. 13 France Telecom AT&T SBC Anonymous Amazon. com Kmart Claria HIRA Fed. Ex Vodafone Gmbh Source: http: //www. wintercorp. com Oracle Proprietary Teradata DB 2 Oracle Teradata Oracle Sybase IQ Teradata 2005 Survey 29. 23 26. 27 24. 81 16. 19 13. 00 12. 59 12. 10 11. 94 9. 98 9. 91 Yahoo! AT&T KT-IT Group AT&T LGR - Cingular Amazon. com Anonymous UPSS Amazon. com Nielsen Media Oracle 100. 39 Daytona 93. 88 DB 2 49. 40 Daytona 26. 71 Oracle 25. 20 Oracle 24. 77 DB 2 19. 65 Microsoft 19. 47 Oracle 18. 56 Sybase IQ 17. 69
Oracle DW 10+TB Customers (3/2006) Various Platforms and Architectures • • • Acxiom Allstate Amazon Cellcom Century. Tel Chase Choicepoint Claria Experian KTF Cingular 16 TB 15 TB 61 TB 14 TB 10 TB 30 TB 14 TB 38 TB 14 TB 25 TB HP Sun (RAC) HP HP IBM (RAC) Sun Sun HP HP • • • Mastercard NASDAQ Nex. Tel NYSE Group Reliance Ltd Starwood TIM (Italy) Turkcell UBS AG UPS Yahoo! 20 TB 35 TB 28 TB 15 TB 13 TB 12 TB 14 TB 15 TB 10 TB 130 TB Hundreds of Terabyte+ DW Customers! IBM (RAC) Sun HP HP (RAC) Sun HP Fujitsu
Summary • Technology • Monitoring • Information Life-cycle Management (ILM) • Oracle Optimized Warehouse Initiative • Market <Insert Picture Here>
For More Information http: //search. oracle. com BI & Data Warehousing or oracle. com
f314b3a8a8890b58f5ad41d708c5a833.ppt