
Efficient Management of Time Series in P2P
Application to technical analysis and the study of mobile objects

G. Gardarin, B. Nguyen, L. Yeh, K. Zeitouni, B. Butnaru, I. Sandu-Popa
Laboratoire PRiSM, Université de Versailles Saint-Quentin
BDA'09, Namur

Gardarin et al. -- BDA'09

Motivation
- Technical analysis (economy)
  - Determine buy/sell operations based on time series calculations
  - Empirical parameter tuning: many simulations to be run
  - Very large time series (quotes every 15 seconds over thousands of items, with years of data)
  - Objective: delegate computing power and caching to a P2P network
- Mobile objects
  - Compute aggregate queries over time series of sensor data (see paper for more details)

Caution: Time Series vs. Data Streams
- Time series: persistent data queried on demand using complex queries. Historical data. Size is an issue. Current commercial performance: under 1 M per second.
- Data stream: transient data queried by simple continuous queries (event detection). Real-time oriented.

Contributions
- Extensible TS model with functional operators, compatible with XQuery 1.1
- Efficient P2P techniques for XQuery 1.1 with special TS management
Current XQ 1.1 engines have very poor performance.

Outline
- Introduction
- Time Series Model
- P2P TS computing
- Experiments
- Conclusion

Time Series Model

ROSeS Model: an infinite vector

TS entry: Date (ISO), Value (xs:double)

Example series: PX1-Close (CAC)

Date         Value
2007-01-05   5517.35
2007-01-08   5518.59
…            …
2009-03-06   2534.45

TS define a (metric) vector space: TS3 = TS1 + s*TS2

Granularity
- A choice of semantics has to be made in order to perform scale changes.
- Adopted semantics:
  - Entry date = interval start (included)
  - Next entry date = interval end (excluded)
- The beginning of the day needs to be defined with millisecond precision.

Null values
- Unknown value (?)
- Undefined value (!)

Date         Value
2007-01-05   5517.35
2007-01-06   !
2007-01-08   5518.59
2007-01-09   ?
2009-03-06   2534.45

Assume value ! for all dates preceding the first one.
Managing the "end" of a TS requires a ! value.
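The interval semantics and the two null markers can be illustrated with a small lookup routine. This is only a sketch: the names `value_at`, `UNDEFINED` and `UNKNOWN` are hypothetical, and dates are kept as ISO strings, which sort lexicographically in chronological order.

```python
from bisect import bisect_right

UNDEFINED = "!"  # no value exists (e.g. before the first entry)
UNKNOWN = "?"    # a value exists but is not known

def value_at(dates, values, when):
    """Return the value of the entry whose half-open interval
    [entry date, next entry date) contains `when`."""
    i = bisect_right(dates, when) - 1  # last entry with date <= when
    if i < 0:
        return UNDEFINED  # dates preceding the first one are undefined
    return values[i]

dates = ["2007-01-05", "2007-01-06", "2007-01-08", "2007-01-09", "2009-03-06"]
values = [5517.35, UNDEFINED, 5518.59, UNKNOWN, 2534.45]

print(value_at(dates, values, "2007-01-01"))  # before the series starts
print(value_at(dates, values, "2007-01-07"))  # inside the undefined interval
print(value_at(dates, values, "2007-01-08"))  # interval start is included
```

In this sketch the last entry stays valid for every later date, which is why closing a series requires appending a final entry whose value is `!`.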

Relational-like operators: Filter, Map

Union and Intersection

K-ary joins

JOIN_fun(S1, ..., Sk) = { [t, m] | [t, val1] in S1 and … and [t, valk] in Sk and m = fun(val1, …, valk) }

Caution: the behavior for null values must be defined.
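A minimal sketch of the k-ary join follows, with each series represented as a dict from ISO date to value. The null policy used here (any null input gives a null output, with `None` standing for a null) is an assumption made for illustration; the slide leaves that behavior open.

```python
def join(fun, *series):
    """K-ary join: keep the timestamps present in every series and
    combine the aligned values with `fun`."""
    if not series:
        return {}
    common = set(series[0])
    for s in series[1:]:
        common &= set(s)
    result = {}
    for t in sorted(common):
        vals = [s[t] for s in series]
        # Assumed null policy: a null input propagates to the output.
        result[t] = None if any(v is None for v in vals) else fun(*vals)
    return result

s1 = {"2007-01-05": 1.0, "2007-01-08": 2.0, "2007-01-09": None}
s2 = {"2007-01-05": 10.0, "2007-01-08": 20.0, "2007-01-09": 30.0, "2007-01-10": 40.0}

# Linear combination TS3 = TS1 + s*TS2 with s = 2, written as a join.
ts3 = join(lambda a, b: a + 2 * b, s1, s2)
print(ts3)
```

The date 2007-01-10 is dropped because it appears in only one input, and 2007-01-09 yields a null because one of its inputs is null.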

Some window functions
- Moving Average
- Relative Strength Index (RSI)
- Moving Average Convergence/Divergence (MACD)
In general: "constant" or linear complexity in w, linear in t.
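As an illustration of the "constant in w" case, a simple moving average can be maintained with a running sum, so that each output costs one addition and one subtraction whatever the window size. This is a sketch, not the implementation used in the prototype; the function name `mavg` and the choice to emit values only for full windows are assumptions.

```python
def mavg(values, w):
    """Simple moving average over the last w values.
    Emits one output per full window; cost per step does not depend on w."""
    if w <= 0:
        raise ValueError("window size must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v                  # add the entry entering the window
        if i >= w:
            running -= values[i - w]  # drop the entry leaving the window
        if i >= w - 1:
            out.append(running / w)
    return out

print(mavg([1, 2, 3, 4, 5], 3))  # [2.0, 3.0, 4.0]
```

A naive version that re-sums each window would instead be linear in w per output.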

Some buy/sell rules

BUY = SEL_{>0}(XAVG_9(MAVG_12(S) - MAVG_26(S)))
SELL = SEL_{>1.1}(MAVG_26(S) / MAVG_12(S))
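The two rules can be sketched as compositions of small operators on lists of (t, value) pairs. Several points are assumptions rather than definitions taken from the slides: XAVG is read as an exponential average with smoothing factor 2/(n+1), SEL as a selection on values, and the subtraction and division as joins on equal timestamps.

```python
def mavg(ts, w):
    """Simple moving average; ts is a list of (t, value) pairs sorted by t."""
    out, running = [], 0.0
    for i, (t, v) in enumerate(ts):
        running += v
        if i >= w:
            running -= ts[i - w][1]
        if i >= w - 1:
            out.append((t, running / w))
    return out

def xavg(ts, n):
    """Exponential average, assumed smoothing factor 2 / (n + 1)."""
    alpha = 2.0 / (n + 1)
    out, prev = [], None
    for t, v in ts:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append((t, prev))
    return out

def join(fun, a, b):
    """Binary join on equal timestamps."""
    b_map = dict(b)
    return [(t, fun(v, b_map[t])) for t, v in a if t in b_map]

def sel(ts, pred):
    """Keep the entries whose value satisfies pred."""
    return [(t, v) for t, v in ts if pred(v)]

def buy(s):
    diff = join(lambda x, y: x - y, mavg(s, 12), mavg(s, 26))
    return sel(xavg(diff, 9), lambda v: v > 0)

def sell(s):
    ratio = join(lambda x, y: x / y, mavg(s, 26), mavg(s, 12))
    return sel(ratio, lambda v: v > 1.1)

rising = [(i, float(i + 1)) for i in range(40)]
falling = [(i, float(100 - 2 * i)) for i in range(40)]
print(len(buy(rising)), len(sell(rising)))
print(len(buy(falling)), len(sell(falling)))
```

On the steadily rising toy series only BUY signals appear, and on the falling one only SELL signals, starting at the first timestamp where the 26-entry window is full.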

TS/XML: a practical exchange format
Benefit from the expressive power of XQ to write rules!
- TS schema
- XQ 1.1 mavg implementation (naïve)
Caution: limited maths functions

Preliminary results (mavg)

N        W     Our Java   System X
1000     10    <1         16
1000     50    <1         45
1000     100   <1         91
2000     10    <1         28
2000     50    <1         90
2000     100   <1         178
4000     10    <1         53
4000     50    <1         179
4000     100   <1         357
16000    10    4          212
16000    50    4          765
16000    100   4          1404
100000   10    25         1914
100000   50    25         5026
100000   100   25         9251
500000   10    128        9862
500000   50    130        28259
500000         129        49347

- Xeon X5450 @ 3.00 GHz
- Java 6, 1 GB heap

Figure: MACD (Java)

XQ Problems
- Significant overhead from XML type checking and structure (limit to xs:double)
- Limited maths functions
- TS are in fact manipulated in let clauses
Enhance our XQ processor with non-XQ functions on XML-TS data that respect our schema.

P2P TS Computing

How can we achieve scalability?
- Observation:
  - Many runs of a given user share intermediate results
  - Many users share intermediate results
- Divide computation cost by n
- Divide disk read/write time by n
- Divide memory usage by n

TS Distribution: horizontal partitioning
A TS of N entries is split into slices of N/K entries, stored in a CHORD DHT.
Caution: overlap between slices is necessary.
Caution: this choice limits the window size.
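One way to see why overlap is needed and why it bounds the window size: if every slice also carries the w - 1 entries that precede it, a window function of size up to w can be evaluated on each slice independently. The sketch below assumes that particular overlap scheme; the function names and the ceiling-sized slices are hypothetical.

```python
def partition(values, k, w):
    """Split values into at most k slices of ceil(n / k) entries.
    Each slice is prefixed with the w - 1 entries preceding it (the overlap),
    so windows of size <= w never cross a slice boundary."""
    n = len(values)
    size = -(-n // k)  # ceiling division
    slices = []
    for start in range(0, n, size):
        lo = max(0, start - (w - 1))
        slices.append(values[lo:start + size])
    return slices

def mavg(values, w):
    """Naive moving average, one output per full window."""
    return [sum(values[i - w + 1:i + 1]) / w for i in range(w - 1, len(values))]

values = list(range(1, 11))
slices = partition(values, 3, 3)

# Computing per slice and concatenating gives the same result as one global pass.
per_slice = [x for s in slices for x in mavg(s, 3)]
print(per_slice == mavg(values, 3))  # True
```

With this layout, a query whose window is larger than the overlap chosen at partitioning time could no longer be answered from a single slice.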

DHT
- Two sorts of "key/value" pairs
  - Key: TSName -> list of slice IDs (numbered)
  - Key: TSName+SliceID -> peer containing the slice
- Connect/disconnect is managed by the DHT
- Computation algorithm
  - P1 wants to compute Q1
  - P1 gets the location of all TS slices needed
  - Ship query to peers
  - Compute query on peer (if possible)
  - Transfer results to P1
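The two kinds of DHT entries and the computation steps can be mimicked with an in-memory toy. Everything here is a stand-in: `ToyDHT` is a plain dict with no routing or churn handling, `Peer`, `publish` and `compute` are hypothetical names, and the round-robin placement of slices is an arbitrary choice.

```python
class ToyDHT:
    """In-memory stand-in for the DHT: key/value storage only."""
    def __init__(self):
        self.table = {}
    def put(self, key, value):
        self.table[key] = value
    def get(self, key):
        return self.table[key]

class Peer:
    def __init__(self, name):
        self.name = name
        self.slices = {}  # (ts_name, slice_id) -> list of values
    def run(self, ts_name, slice_id, query):
        # The query is evaluated where the slice is stored.
        return query(self.slices[(ts_name, slice_id)])

def publish(dht, peers, ts_name, slices):
    ids = list(range(len(slices)))
    dht.put(ts_name, ids)                 # TSName -> list of slice IDs
    for sid, data in zip(ids, slices):
        peer = peers[sid % len(peers)]    # arbitrary round-robin placement
        peer.slices[(ts_name, sid)] = data
        dht.put((ts_name, sid), peer)     # TSName+SliceID -> peer

def compute(dht, ts_name, query):
    """Run on the requesting peer: locate slices, ship the query, collect results."""
    results = []
    for sid in dht.get(ts_name):
        peer = dht.get((ts_name, sid))
        results.append(peer.run(ts_name, sid, query))
    return results

dht = ToyDHT()
peers = [Peer("P2"), Peer("P3")]
publish(dht, peers, "CAC40", [[1.0, 2.0], [3.0, 4.0], [5.0]])
print(compute(dht, "CAC40", max))  # [2.0, 4.0, 5.0]
```

The "if possible" case, where a peer cannot evaluate the query and the slice has to be shipped instead, is not modelled in this sketch.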

(Naïve) Caching

JOIN(MAVG(CAC40, 10), SCALE(RSI(CAC40, 14), 100), SUM) [7, 8, 9]

Limitation: equivalent expressions
Open issue: how to choose the peer?
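A naive cache of this kind can be sketched as a table keyed by the literal expression text plus a slice identifier. Reading `[7, 8, 9]` as slice IDs is an assumption, and the class and method names are hypothetical. The sketch also shows the stated limitation: two equivalent expressions written differently do not share an entry.

```python
class NaiveCache:
    """Cache of intermediate results keyed by (expression text, slice ID)."""
    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, expr, slice_id, compute):
        key = (expr, slice_id)
        if key in self.store:
            self.hits += 1
        else:
            self.misses += 1
            self.store[key] = compute()
        return self.store[key]

cache = NaiveCache()
for sid in [7, 8, 9]:
    cache.get_or_compute("MAVG(CAC40,10)", sid, lambda: "result")  # three misses
for sid in [7, 8, 9]:
    cache.get_or_compute("MAVG(CAC40,10)", sid, lambda: "result")  # three hits

# Same computation, different spelling: the textual key does not match.
cache.get_or_compute("MAVG(CAC40, 10)", 7, lambda: "result")
print(cache.hits, cache.misses)  # 3 4
```

Rewriting expressions into a canonical form before building the key would be one way to make equivalent expressions collide.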

Experiments

XQ2P Prototype
- 98% XQuery 1.0 compliant database (Java)
- XQ 1.1 window functionalities
- Optimized external TS functions
- P2P storage and computing
http://cassiopee.prism.uvsq.fr/XQ2P/

P2PTester infrastructure

Experimental evaluation using P2PTester (4 machines)

P     TINDEX   TR       TP     TNET   TQ    TP2P
8     7.1      56.8     4473   <1     400   4930
16    6.3      100.8    2176   <1     400   2677
32    7.4      236.8    1106   <1     400   1743
64    8.4      537.6    580    <1     400   1518
128   8.9      1139.2   286    <1     400   1825
256   9.7      2483.2   140    <1     400   3023

Relative gain simulation (no caching)

Conclusion
Already efficient, still lots to do…

Current / Future Work
- TS granularity operators
- "Enhanced" caching (canonical form transformation)
- Date-based join optimization
- TS distance computation, top-k
- XQ 1.1 window operator optimization
- Other XQ2P improvements

Thank you!