b5481c62f4d79072d99b58c7878da9ca.ppt
- Number of slides: 35
Scalable Trigger Processing
Discussion of the publication by Eric N. Hanson et al., Int. Conf. on Data Engineering, 1999
CS 561
Motivation
- Triggers are popular for:
  - Integrity constraint checking
  - Alerting, logging, etc.
- Commercial database systems:
  - Limited triggering capabilities
  - 1 trigger per update type on a table, or at best 100
- But: current technology doesn't scale well
- And Internet and web-based applications may need millions of triggers
An Example Trigger
- Example "stock ticker notification":
  - Stock holding: 100*IBM
  - Query: inform an agent whenever the value of the stock holding crosses $10,000

    create trigger stock-watch
    from quotes q
    on update(q.price)
    when q.name = 'IBM' and 100*q.price > 10,000
    do raise event ThresholdCrossed(100*q.price)

- Note: we may need 1,000 or millions of such triggers
- A web interface may allow users to create such triggers
What Next?
- Problem description
- TriggerMan system architecture
- Predicate index
- Trigger processing
Problem Definition
- Given: relational DB, trigger statements, data stream
- Find: triggers corresponding to each stream item
- Objective: scalable trigger processing system
- Assumptions:
  - Number of distinct structures of trigger expressions is relatively small
  - All trigger expression structures are small enough to fit in main memory
The Problem, once more
- Requires millions of triggers (on huge data)
- Steps for trigger processing:
  - Event monitoring
  - Condition evaluation
  - Executing triggered action
- Response time for database operations is critical!
Related Work
- ECA model (not scalable)
- Indexing range predicates, marking-based [Hans96b, Ston90] (large memory, complicated storage)
- Parallel processing [Gupt89, Hell98]
- AI [Forg82, Mira87] (smaller rule set)
Overall Driving Idea
- If a large number of triggers are created, then many have the same format.
- Triggers share the same expression signature, except that different parameters are substituted.
- Group predicates from trigger conditions into equivalence classes based on their expression signatures.
- Store them in efficient main-memory data structures.
TriggerMan System
Components
- TriggerMan DataBlade (lives inside Informix)
- Data sources
  - Local/remote tables/streams; must capture updates and transmit them to TriggerMan (place in a queue)
- TriggerMan client applications
  - Create/drop triggers, etc.
- TriggerMan driver
  - Periodically invokes the TmanTest() function to perform condition testing and action execution
- TriggerMan console
  - Direct user interaction interface for trigger creation, system shutdown, etc.
TriggerMan Syntax
- Trigger syntax:

    create trigger <triggerName> [in setName]
      [optionalFlags]
      from fromList
      [on eventSpec]
      [when condition]
      [group by attributeList]
      [having groupCondition]
      do action
Example: Salary Increases
- Update Fred's salary when Bob's salary is updated

    create trigger updateFred
    from emp
    on update(emp.salary)
    when emp.name = 'Bob'
    do execSQL 'update emp set salary=:NEW.emp.salary where emp.name=''Fred'''
Example: Real Estate Database
- "If a new house is added in a neighborhood that salesperson Iris represents, then notify her"
- Schema:

    House(hno, address, price, nno, spno)
    Salesperson(spno, name, phone)
    Represents(spno, nno)
    Neighborhood(nno, name, location)

- Trigger:

    create trigger IrisHouseAlert
    on insert to house
    from salesperson s, house h, represents r
    when s.name = 'Iris' and s.spno = r.spno and r.nno = h.nno
    do raise event NewHouseInIrisNeighborhood(h.hno, h.address)
Trigger Condition Structure
- Parts of a trigger condition:
  - FROM: data source (emp)
  - ON: event (update)
  - WHEN: boolean expression
- Expression signature: emp.name = CONSTANT
- An expression signature consists of:
  - Data source ID
  - Operation code, e.g. insert, delete, etc.
  - Generalized expression (parameterized)
Condition structure (contd)
- Steps to obtain the canonical representation of the WHEN clause:
  - Translate the expression to CNF
  - Group the conjuncts by the data sources they refer to
- A selection predicate will be of the form (C11 OR C12 OR ...) AND ... AND (Ck1 OR ...), where each Cij refers to the same tuple variable.
- Each conjunct refers to zero, one, or more data sources
- Group conjuncts by the set of sources they refer to:
  - If one data source, then selection predicate
  - If two data sources, then JOIN predicate
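The grouping step above can be sketched in a few lines of Python. This is only an illustration, not TriggerMan code: the `(text, tuple variables)` representation of a conjunct and the names `group_conjuncts` and `classify` are assumptions, and the conjuncts are taken from the Iris real estate trigger.

```python
from collections import defaultdict

# Hypothetical representation: each CNF conjunct is a pair of
# (condition text, set of tuple variables it references).
CONJUNCTS = [
    ("s.name = 'Iris'", {"s"}),
    ("s.spno = r.spno", {"s", "r"}),
    ("r.nno = h.nno", {"r", "h"}),
]


def group_conjuncts(conjuncts):
    """Group conjuncts by the set of data sources they refer to."""
    groups = defaultdict(list)
    for text, sources in conjuncts:
        groups[frozenset(sources)].append(text)
    return groups


def classify(sources):
    """One source gives a selection predicate, two give a join predicate."""
    if len(sources) == 1:
        return "selection"
    if len(sources) == 2:
        return "join"
    return "other"  # zero or more than two sources; not covered on the slide


if __name__ == "__main__":
    for sources, texts in group_conjuncts(CONJUNCTS).items():
        print(sorted(sources), classify(sources), texts)
```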
Triggers for stock ticker notification

    create trigger T1 from stock
    when stock.ticker = 'GOOG' and stock.value < 500
    do notify_person(P1)

    create trigger T2 from stock
    when stock.ticker = 'MSFT' and stock.value < 30
    do notify_person(P2)

    create trigger T3 from stock
    when stock.ticker = 'ORCL' and stock.value < 20
    do notify_person(P3)

    create trigger T4 from stock
    when stock.ticker = 'GOOG'
    do notify_person(P4)
Expression Signature
- Idea: common structures in the conditions of triggers
- T1: stock.ticker = 'GOOG' and stock.value < 500
- T2: stock.ticker = 'MSFT' and stock.value < 30
- T3: stock.ticker = 'ORCL' and stock.value < 20
  - Expression signature E1: stock.ticker = const1 and stock.value < const2
- T4: stock.ticker = 'GOOG'
  - Expression signature E2: stock.ticker = const3
- An expression signature defines the equivalence class of all instantiations of the expression with different constants
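A rough way to see how T1 to T3 collapse into one signature is to replace every constant with a placeholder. The sketch below does this with a simplified regular expression; the function name `signature` is an assumption, and placeholders are numbered per signature (so T4 yields const1 rather than the slide's const3).

```python
import re

# Simplified constant matcher: single-quoted strings or bare numbers.
_CONST = re.compile(r"'[^']*'|\b\d+(?:\.\d+)?\b")


def signature(condition):
    """Replace each constant by a numbered placeholder.

    Returns (signature text, list of extracted constants as strings).
    """
    constants = []

    def repl(match):
        constants.append(match.group(0).strip("'"))
        return f"const{len(constants)}"

    return _CONST.sub(repl, condition), constants


if __name__ == "__main__":
    for cond in (
        "stock.ticker = 'GOOG' and stock.value < 500",
        "stock.ticker = 'MSFT' and stock.value < 30",
        "stock.ticker = 'GOOG'",
    ):
        print(signature(cond))
```

Conditions with equal signature text fall into the same equivalence class; only the extracted constants differ.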
What to do now
- There are only a few distinct expression signatures, so build data structures to represent them explicitly (in memory)
- Create constant tables that store all the different constants, and link them to their expression signature
Main Structures
- A-Treat network
  - Network for trigger condition testing
  - For a trigger to fire, all conditions must be true
- Expression signature
  - Common structure in a trigger
  - E1: stock.ticker = const1 and stock.value < const2
- Constant tables
  - Constants for each expression signature
A-Treat Network to represent a trigger
- For each trigger condition: stock.ticker = const1 and stock.value < const2
- [Figure: Root with alpha-node predicates stock.ticker = const1 (Node 1) and stock.value < const2 (Node 2)]
Condition Testing
- The A-Treat network is a discrimination network for trigger condition testing.
- For a predicate to be satisfied, all its conjuncts should be true.
- This is checked using the A-Treat network.
A-Treat network (Hanson 1992)

    define rule SalesClerk
    if emp.sal > 30,000
    and emp.dno = dept.dno
    and dept.name = "sales"
    and emp.jno = job.jno
    and job.title = "clerk"
    then Action
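To make the SalesClerk rule concrete, the sketch below evaluates it in two stages: selection predicates first (the alpha-node filters), then the join conditions over the filtered tuples. It is a naive nested-loop illustration, not the A-Treat algorithm itself; the function name and the sample rows are invented for the example.

```python
def sales_clerk_matches(emps, depts, jobs):
    """Return (emp, dept, job) combinations satisfying the SalesClerk rule."""
    # Selection predicates: one filter per data source.
    alpha_emp = [e for e in emps if e["sal"] > 30000]
    alpha_dept = [d for d in depts if d["name"] == "sales"]
    alpha_job = [j for j in jobs if j["title"] == "clerk"]
    # Join predicates: emp.dno = dept.dno and emp.jno = job.jno.
    return [
        (e, d, j)
        for e in alpha_emp
        for d in alpha_dept
        if e["dno"] == d["dno"]
        for j in alpha_job
        if e["jno"] == j["jno"]
    ]


# Hypothetical sample data, for illustration only.
EMPS = [
    {"name": "Ann", "sal": 35000, "dno": 1, "jno": 10},
    {"name": "Bo", "sal": 25000, "dno": 1, "jno": 10},
    {"name": "Cy", "sal": 40000, "dno": 2, "jno": 10},
]
DEPTS = [{"dno": 1, "name": "sales"}, {"dno": 2, "name": "toys"}]
JOBS = [{"jno": 10, "title": "clerk"}]

if __name__ == "__main__":
    print([e["name"] for e, _d, _j in sales_clerk_matches(EMPS, DEPTS, JOBS)])
```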
Expression Signature Table

    ExID  Data Source  Signature Description  Constant Table  Number of Constants  Constant Organization
    E1    stock        …                      const_e1        2                    Main memory
    E2    stock        …                      const_e2        1                    Main memory

- E1: stock.ticker = const1 and stock.value < const2
- E2: stock.ticker = const3
Constant Tables
- Tables of constants in trigger conditions

    const_e1
    ExID  Trigger ID  Constant1  Constant2  Next Node
    E1    T1          GOOG       500        Node 2
    E1    T2          MSFT       30         Node 2
    E1    T3          ORCL       20         Node 2   (Rest)

    const_e2
    ExID  Trigger ID  Constant1  Next Node
    E2    T4          GOOG       Null

- T1: stock.ticker = 'GOOG' and stock.value < 500
- T2: stock.ticker = 'MSFT' and stock.value < 30
- T3: stock.ticker = 'ORCL' and stock.value < 20   (Rest)
- T4: stock.ticker = 'GOOG'
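One way to picture a constant table held in main memory is as a hash index on the first constant, with the second constant checked per matching row. The sketch below uses the const_e1 rows from the slide; the tuple layout and the names `build_index` and `probe` are assumptions made for the example.

```python
from collections import defaultdict

# const_e1 rows: (trigger ID, constant 1, constant 2, next node).
CONST_E1 = [
    ("T1", "GOOG", 500, "Node 2"),
    ("T2", "MSFT", 30, "Node 2"),
    ("T3", "ORCL", 20, "Node 2"),
]


def build_index(rows):
    """Index rows on constant 1 so an equality probe is one hash lookup."""
    index = defaultdict(list)
    for row in rows:
        index[row[1]].append(row)
    return index


def probe(index, ticker, value):
    """Trigger IDs whose instantiation of E1 is satisfied by (ticker, value)."""
    return [tid for tid, _c1, c2, _node in index.get(ticker, []) if value < c2]


if __name__ == "__main__":
    idx = build_index(CONST_E1)
    print(probe(idx, "GOOG", 495))
```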
Tables
- Primary tables:
  - trigger_set(tsID, name, comments, creation_date, isEnabled)
  - trigger(triggerID, tsID, name, comments, trigger_text, creation_date, isEnabled, …)
- Trigger cache in main memory for recently accessed triggers
Predicate Index
- Tables:
  - expression_signature(sigID, dataSrcID, signatureDesc, constTableName, constantSetSize, constantSetOrganization)
  - const_tableN(exprID, triggerID, nextNetworkNode, const1, …, constK, restOfPredicate)
- The root of the predicate index is linked to the data source predicate indices
- Each data source contains an expression signature list
- Each expression signature links to its constant table
- Index expressions on the most selective conjunct (test the rest on the fly)
Predicate Index
- [Figure: predicate index, with the root reached via hash(src-ID)]
- Goal: given an update, identify all predicates that match it.
Processing Trigger Definition
- Parse the trigger and validate it
- Convert the when clause to conjunctive normal form
- Group the conjuncts by the distinct sets of tuple variables they refer to
- Form a trigger condition graph, that is, an undirected graph with a node for each tuple variable and an edge for each join predicate
- Build the A-Treat network
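The trigger condition graph step can be illustrated with a small adjacency-set structure. This is a minimal sketch using the tuple variables and join predicates of the Iris real estate trigger; the function name and the triple format for join predicates are assumptions.

```python
def condition_graph(tuple_vars, join_predicates):
    """Undirected graph: a node per tuple variable, an edge per join predicate."""
    graph = {var: set() for var in tuple_vars}
    for left, right, _text in join_predicates:
        graph[left].add(right)
        graph[right].add(left)
    return graph


# From: salesperson s, house h, represents r
TUPLE_VARS = ["s", "h", "r"]
JOIN_PREDICATES = [
    ("s", "r", "s.spno = r.spno"),
    ("r", "h", "r.nno = h.nno"),
]

if __name__ == "__main__":
    for var, neighbors in condition_graph(TUPLE_VARS, JOIN_PREDICATES).items():
        print(var, sorted(neighbors))
```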
Processing trigger definition (2)
- For each selection predicate:
  - If a predicate with the same signature has not been seen before:
    - Add the signature of the predicate to the list
    - Add the signature to the expression_signature table
    - If the signature has a constant placeholder in it, create a constant table for the signature
    - Add the constants
  - Else, if the predicate has constants, add a row to the constant table for the expression
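The registration logic above might look roughly like this in memory. The class `PredicateCatalog`, its dictionaries, and the generated table names are hypothetical stand-ins for the expression_signature table and the constant tables, not the system's actual schema handling.

```python
class PredicateCatalog:
    """In-memory stand-in for expression_signature and the constant tables."""

    def __init__(self):
        self.expression_signature = {}  # signature text -> constant table name
        self.const_tables = {}          # constant table name -> list of rows

    def add_predicate(self, trigger_id, signature, constants):
        """Register one selection predicate of a newly created trigger."""
        if signature not in self.expression_signature:
            # First time this signature is seen: record it.
            name = f"const_table{len(self.expression_signature) + 1}"
            self.expression_signature[signature] = name
            if constants:
                # The signature has placeholders, so it gets a constant table.
                self.const_tables[name] = []
        if constants:
            name = self.expression_signature[signature]
            self.const_tables[name].append((trigger_id, *constants))
        return self.expression_signature[signature]


if __name__ == "__main__":
    catalog = PredicateCatalog()
    e1 = "stock.ticker = const1 and stock.value < const2"
    catalog.add_predicate("T1", e1, ("GOOG", 500))
    catalog.add_predicate("T2", e1, ("MSFT", 30))
    catalog.add_predicate("T4", "stock.ticker = const3", ("GOOG",))
    print(catalog.const_tables)
```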
Alternate Organizations
- Storage for the expression signature's equivalence class:
  - Main memory lists
  - Main memory index
  - Non-indexed database table
  - Indexed database table
- Trade-off between these organizations: efficiency vs. scalability
- For each expression signature, choose a structure depending on the number of triggers.
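As a sketch of what "choose a structure depending on the number of triggers" could mean, the function below picks one of the four organizations from the class size and a memory flag. The decision rule, the `fits_in_memory` flag, and the `small_class_limit` threshold are all assumptions for illustration; the slide does not give concrete criteria.

```python
def choose_organization(num_triggers, fits_in_memory, small_class_limit=32):
    """Pick a storage organization for one expression signature's class.

    The threshold is hypothetical: small classes are scanned as lists or
    plain tables, larger ones get an index.
    """
    small = num_triggers <= small_class_limit
    if fits_in_memory:
        return "main memory list" if small else "main memory index"
    return "non-indexed database table" if small else "indexed database table"


if __name__ == "__main__":
    print(choose_organization(10, fits_in_memory=True))
    print(choose_organization(5_000_000, fits_in_memory=False))
```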
Processing update descriptors
- On getting an update descriptor (token) of the form (data source ID, operation code, old/new tuple):
  - Locate the data source predicate index from the root of the predicate index.
  - For each expression signature, find the constants matching the token using the index.
  - Check additional predicate clauses against the token.
  - When all predicate clauses of a trigger have matched, pin the trigger in main memory.
  - Bring in the A-Treat network representing that trigger to process the remaining part of the trigger, such as joins.
  - If the trigger condition is satisfied, execute the action.
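The selection-predicate part of these steps can be sketched as a lookup through a nested in-memory index. The layout of `PREDICATE_INDEX` and the name `match_token` are assumptions; the operation code, trigger pinning, and the A-Treat join phase are left out to keep the example short.

```python
import operator

# Hypothetical predicate index: data source ID -> expression signatures.
# Each signature is indexed on an equality conjunct; the rest of the
# predicate is stored per row and tested on the fly.
PREDICATE_INDEX = {
    "stock": [
        {   # E1: stock.ticker = const1 and stock.value < const2
            "indexed_attr": "ticker",
            "rows": {
                "GOOG": [("T1", ("value", operator.lt, 500))],
                "MSFT": [("T2", ("value", operator.lt, 30))],
                "ORCL": [("T3", ("value", operator.lt, 20))],
            },
        },
        {   # E2: stock.ticker = const3
            "indexed_attr": "ticker",
            "rows": {"GOOG": [("T4", None)]},
        },
    ],
}


def match_token(source_id, new_tuple):
    """Return IDs of triggers whose selection predicate matches the token."""
    matched = []
    for sig in PREDICATE_INDEX.get(source_id, []):
        key = new_tuple[sig["indexed_attr"]]
        for trigger_id, rest in sig["rows"].get(key, []):
            if rest is None:
                matched.append(trigger_id)
                continue
            attr, op, const = rest
            if op(new_tuple[attr], const):
                matched.append(trigger_id)
    return matched


if __name__ == "__main__":
    print(match_token("stock", {"ticker": "GOOG", "value": 495}))
```

With the token stock(ticker=GOOG, value=495), both the E1 instantiation for T1 and the E2 instantiation for T4 match.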
Processing an Update
- Token: stock(ticker=GOOG, value=495)
- [Figure: Root links to the stock predicate index (signatures E1 and E2) and to other source predicate indices; E1 uses an index on stock.ticker = const1 over const_e1, E2 links to const_e2]
- E1: stock.ticker = const1 and stock.value < const2

    const_e1
    Trigger ID  Constant1  Constant2  Next Node
    T1          GOOG       500        Node 2
    T2          MSFT       30         Node 2
    T3          ORCL       20         Node 2
Concurrency
- Better scalability even on a single processor
Concurrency
- Identified elements that can be parallelized:
  - Token-level: multiple tokens processed in parallel
  - Condition-level: multiple selection conditions tested concurrently
  - Rule-action-level: multiple rule actions fired at the same time
  - Data-level: set of data values in the network processed in parallel
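Token-level concurrency can be illustrated with a thread pool that tests each token independently. This only shows the structure: `check_conditions` is a hypothetical stand-in for real condition testing, and in CPython threads do not give CPU parallelism for pure-Python work.

```python
from concurrent.futures import ThreadPoolExecutor


def check_conditions(token):
    """Hypothetical stand-in for testing one token against T1's condition."""
    ticker, value = token
    return ticker == "GOOG" and value < 500


def process_tokens(tokens, workers=4):
    """Test tokens concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(check_conditions, tokens))


if __name__ == "__main__":
    print(process_tokens([("GOOG", 495), ("GOOG", 510), ("MSFT", 25)]))
```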
Conclusion: Overall Key Points
- If a large number of triggers are created, many of them have almost the same format
- Group triggers with the same structure together into expression signature equivalence classes
- The number of distinct signatures is small enough to fit into main memory (index)
- Develop a selection predicate index structure
- Architecture to build a scalable trigger system