fe8f9e57471492c8894aca9ff1955f13.ppt
- Number of slides: 20
Management of Uncertainty in Publish/Subscribe Systems • Haifeng Liu • Department of Computer Science, University of Toronto
Publish/Subscribe Model [Figure: stock markets (TSX, NYSE, NASDAQ) act as publishers and send publications such as HON = 24, MSFT = 27, IBM = 84, JNJ = 58, INTC = 19 into a broker network; a subscriber registers the subscriptions IBM > 85, ORCL < 10, JNJ > 60 and receives notifications. Remaining figure labels: AMG, ORCL =, 8, 12, N=5]
Applications Enabled by Publish/Subscribe • Selective information dissemination • Information filtering on the Internet • Location-based services • Workflow management • Intra-enterprise process automation • Logistics and supply chain management • Enterprise application integration • Network monitoring and (distributed) system management
Types of Uncertainties • Lack of information – Buy a cheap car • Imprecision – Sensor data: temperature 15~20ºC – Location: location at time t (x, y), location at time t+1 (x’, y’) • Semantics – Synonyms: vehicle vs. automobile – Class taxonomy: CD player vs. electronics – Different expressions: 5 years of experience vs. graduated in 2001 • Problem: manage uncertainty, imprecision and semantics in publish/subscribe systems
Agenda • Distributed Publish/Subscribe Model and Content-based Routing • Uncertainties in Publish/Subscribe • Research Challenges • Approximate P/S Model • Graph-structured Model • Current Status • Research Plan
Publish/Subscribe Messages • Advertisement (ad) – Publication patterns used by publishers to announce the set of publications they are going to publish – E.g. { (stock, any), (price, any) } • Subscription (sub) – User interest specification – E.g. (stock = “yahoo”) & (price ≤ $35) • Publication (pub) – Information, data, event – E.g. { (stock, “yahoo”), (price, $32.79) }
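The three message types can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the system's actual API: `Predicate`, `OPS` and `matches` are hypothetical names introduced here, a subscription is assumed to be a conjunction of predicates, and a publication is assumed to be a flat attribute/value map.

```python
import operator
from dataclasses import dataclass

# Hypothetical operator table; the slide itself only shows "=" and "≤".
OPS = {"=": operator.eq, "<=": operator.le, ">=": operator.ge,
       "<": operator.lt, ">": operator.gt}

@dataclass(frozen=True)
class Predicate:
    attr: str      # attribute name, e.g. "price"
    op: str        # key into OPS
    value: object  # constant to compare against

def matches(subscription, publication):
    """True if every predicate of the subscription holds for the publication."""
    return all(
        p.attr in publication and OPS[p.op](publication[p.attr], p.value)
        for p in subscription
    )

# The examples from the slide.
ad = {"stock": "any", "price": "any"}        # advertisement: no value constraints
sub = [Predicate("stock", "=", "yahoo"), Predicate("price", "<=", 35.0)]
pub = {"stock": "yahoo", "price": 32.79}

print(matches(sub, pub))
```

A publication that lacks one of the subscribed attributes is treated as a non-match here; that is a design choice of this sketch, not something the slide specifies.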
Content-based Routing: Advertisement [Figure: distributed overlay broker network] *Adopted from SIENA, Gryphon, REBECA and Hermes
Content-based Routing: Subscribing (subscription) [Figure: distributed overlay broker network] *Adopted from SIENA, Gryphon, REBECA and Hermes
Content-based Routing: Publishing (publication) [Figure: distributed overlay broker network] *Adopted from SIENA, Gryphon, REBECA and Hermes
Subscription Forwarding I: Covering optimization • P: {(car, Honda), (price, $20K)} • S1: (car = Honda) & (price ≤ $30K) • S2: (car = Honda) & (price ≤ $25K) • S1 covers S2 [Figure: distributed overlay broker network with S1 and S2] *Adopted from SIENA, Gryphon, REBECA and Hermes
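A covering test for the example above can be sketched as follows. The representation is an assumption made for illustration: a subscription is a dict from attribute to `(operator, value)`, only the operators `=` and `<=` are handled, and `implies` and `covers` are hypothetical helper names.

```python
def implies(p2, p1):
    """True if every value satisfying predicate p2 also satisfies predicate p1."""
    op2, v2 = p2
    op1, v1 = p1
    if op1 == "=":
        return op2 == "=" and v2 == v1
    if op1 == "<=":
        # "x = v2" or "x <= v2" implies "x <= v1" exactly when v2 <= v1.
        return op2 in ("=", "<=") and v2 <= v1
    raise ValueError(f"unsupported operator: {op1}")

def covers(s1, s2):
    """s1 covers s2 if every publication matching s2 also matches s1."""
    return all(attr in s2 and implies(s2[attr], p1) for attr, p1 in s1.items())

s1 = {"car": ("=", "Honda"), "price": ("<=", 30_000)}
s2 = {"car": ("=", "Honda"), "price": ("<=", 25_000)}

print(covers(s1, s2), covers(s2, s1))
```

Under these assumptions `covers(s1, s2)` holds and `covers(s2, s1)` does not, which agrees with the slide's statement that S1 covers S2.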
Subscription Forwarding II: Merging optimization • S1: (car = Honda) & (price ≤ $30K) • S2: (car = Toyota) & (price ≤ $25K) • P: {(car, Honda), (price, $20K)} • S’: (car = any) & (price ≤ $30K) [Figure: distributed overlay broker network in which S1 and S2 are merged into S’] *Adopted from SIENA, Gryphon, REBECA and Hermes
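The merge shown on this slide can be sketched as follows. This is an illustrative sketch, not the authors' algorithm: subscriptions are assumed to be dicts from attribute to `(operator, value)` with only `=` and `<=` supported, `("any", None)` stands for the slide's "any", and `merge` and `matches` are hypothetical names.

```python
def matches(sub, pub):
    """Conjunctive match; an ("any", None) predicate only requires the attribute."""
    for attr, (op, value) in sub.items():
        if attr not in pub:
            return False
        if op == "=" and pub[attr] != value:
            return False
        if op == "<=" and not pub[attr] <= value:
            return False
    return True

def merge(s1, s2):
    """Build one subscription that matches everything s1 or s2 matches."""
    merged = {}
    # Attributes constrained by only one side are dropped (left unconstrained).
    for attr in s1.keys() & s2.keys():
        (op1, v1), (op2, v2) = s1[attr], s2[attr]
        if op1 == op2 == "<=":
            merged[attr] = ("<=", max(v1, v2))   # keep the looser bound
        elif op1 == op2 == "=" and v1 == v2:
            merged[attr] = ("=", v1)
        else:
            merged[attr] = ("any", None)         # generalize to "any"
    return merged

s1 = {"car": ("=", "Honda"), "price": ("<=", 30_000)}
s2 = {"car": ("=", "Toyota"), "price": ("<=", 25_000)}
s_merged = merge(s1, s2)
print(s_merged)
```

The merged subscription is more general than the union of S1 and S2: a publication such as `{"car": "Ford", "price": 20_000}` matches it although it matches neither original, so brokers closer to the subscribers still have to filter against S1 and S2.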
Publish/Subscribe Router • Forwarding of advertisements – Via flooding • Forwarding of subscriptions – Forward along reverse ad path – Matching of ad and sub (intersecting) – Optimizations: covering/merging of subs • Forwarding of publications – Forward along reverse sub path – Matching of sub and pub
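The three forwarding rules can be illustrated with a toy in-memory broker. Everything here is an assumption made for the sketch: the `Broker` class and its method names are hypothetical, the overlay is assumed to be acyclic, an advertisement is reduced to the set of attributes it announces (any value allowed), and covering/merging are left out.

```python
def matches(sub, pub):
    """Conjunctive match for "=" and "<=" predicates."""
    for attr, (op, value) in sub.items():
        if attr not in pub:
            return False
        if op == "=" and pub[attr] != value:
            return False
        if op == "<=" and not pub[attr] <= value:
            return False
    return True

class Broker:
    def __init__(self, name):
        self.name = name
        self.neighbors = []   # adjacent brokers in the (acyclic) overlay
        self.ad_table = []    # (advertised attributes, hop the ad came from)
        self.sub_table = []   # (subscription, hop the sub came from)
        self.delivered = []   # publications handed to local subscribers

    def advertise(self, attrs, came_from=None):
        """Flood the advertisement; None marks a locally attached client."""
        self.ad_table.append((attrs, came_from))
        for n in self.neighbors:
            if n is not came_from:
                n.advertise(attrs, self)

    def subscribe(self, sub, came_from=None):
        """Forward the subscription along the reverse path of intersecting ads."""
        self.sub_table.append((sub, came_from))
        hops = []
        for attrs, ad_hop in self.ad_table:
            intersects = set(sub) <= attrs
            if ad_hop is not None and ad_hop is not came_from \
                    and intersects and ad_hop not in hops:
                hops.append(ad_hop)
        for hop in hops:
            hop.subscribe(sub, self)

    def publish(self, pub, came_from=None):
        """Forward the publication along the reverse path of matching subs."""
        hops = []
        for sub, hop in self.sub_table:
            if hop is not None and hop is came_from:
                continue  # never send a publication back where it came from
            if matches(sub, pub) and hop not in hops:
                hops.append(hop)
        for hop in hops:
            if hop is None:
                self.delivered.append(pub)
            else:
                hop.publish(pub, self)

def link(a, b):
    a.neighbors.append(b)
    b.neighbors.append(a)

# Line topology B1 - B2 - B3: publisher at B1, subscriber at B3.
b1, b2, b3 = Broker("B1"), Broker("B2"), Broker("B3")
link(b1, b2)
link(b2, b3)
b1.advertise(frozenset({"stock", "price"}))
b3.subscribe({"stock": ("=", "yahoo"), "price": ("<=", 35.0)})
b1.publish({"stock": "yahoo", "price": 32.79})
b1.publish({"stock": "yahoo", "price": 40.00})
print(b3.delivered)
```

In this run only the first publication reaches the subscriber at B3; the second one is dropped at B1 because no stored subscription matches it.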
Uncertainties in Distributed Publish/Subscribe System • Messages (representation: modeling) – Uncertain subscription – Uncertain publication • Relations (computation: matching, covering, merging) – Between sub and pub – Between sub and sub • Result (aggregation: ranking) – Return top K matches
Research Challenges • Develop a publish/subscribe model to express uncertainties/semantics in publications and subscriptions • Model approximate matching and semantic matching • Model approximate covering/merging and semantic covering/merging • Scale to a large number of subscribers and a high publishing rate
Approximate Matching Model • Model – Sub: fuzzy set – Pub: possibility distribution • Matching – Possibility measure – Necessity measure • Ranking – “min” or “product” for conjunction – “max” or “plus” for disjunction
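A small numeric sketch may help make the two measures concrete. It assumes the usual definitions from possibility theory, since the slide does not spell them out: possibility is the maximum over x of min(μ(x), π(x)) and necessity is the minimum over x of max(μ(x), 1 − π(x)), where μ is the subscription's membership function and π the publication's possibility distribution. The functions `warm` and `reading`, the integer domain, and the reading of "plus" as a bounded sum are all assumptions of this example.

```python
import math

def possibility(mu, pi, domain):
    """Degree to which the publication possibly satisfies the subscription."""
    return max(min(mu(x), pi(x)) for x in domain)

def necessity(mu, pi, domain):
    """Degree to which the publication necessarily satisfies the subscription."""
    return min(max(mu(x), 1.0 - pi(x)) for x in domain)

def warm(t):
    """Hypothetical fuzzy subscription "temperature is warm": 0 at 10ºC, 1 from 20ºC."""
    return min(1.0, max(0.0, (t - 10) / 10))

def reading(t):
    """Imprecise sensor publication: any temperature in 15~20ºC is fully possible."""
    return 1.0 if 15 <= t <= 20 else 0.0

def conjunction(degrees, mode="min"):
    return min(degrees) if mode == "min" else math.prod(degrees)

def disjunction(degrees, mode="max"):
    # "plus" is read as a bounded sum here; the slide does not define it.
    return max(degrees) if mode == "max" else min(1.0, sum(degrees))

domain = range(0, 41)  # temperatures in whole degrees Celsius
poss = possibility(warm, reading, domain)
nec = necessity(warm, reading, domain)
print(poss, nec)
```

For this input the possibility is 1.0 (a reading of 20ºC is fully warm) and the necessity is 0.5 (the least warm possible reading, 15ºC, has membership 0.5). Per-predicate degrees like these would then be combined with `conjunction` or `disjunction` to rank matches.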
Graph-structured Model • Model – Pub: directed graph – Sub: directed graph pattern – Semantic: ontology • Matching – Pattern graph maps to data graph if the topology (structure) of the two graphs matches and all variable constraints (literal and ontology) are satisfied • Ranking [Figure: publication graph for PAPER 17 with AUTHOR “Arno Jacobsen”, CONFERENCE SIGMOD, YEAR “2001”, LOCATION “California”; further figure labels: Academic Publication, Proceedings, Report, WWW, VLDB, Jacobsen’s Publications]
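The matching rule can be sketched with a small backtracking matcher. This is an illustration under several assumptions, not the authors' algorithm: publications are lists of (subject, edge label, object) triples, pattern variables start with `?`, constraints are plain Python predicates, and the `TAXONOMY` below is a guess at how the class labels in the figure relate to each other.

```python
# Hypothetical class taxonomy (child -> parent); the hierarchy is assumed.
TAXONOMY = {
    "SIGMOD": "Proceedings",
    "VLDB": "Proceedings",
    "Proceedings": "Academic Publication",
    "Report": "Academic Publication",
}

def is_a(cls, ancestor, taxonomy=TAXONOMY):
    """True if cls equals ancestor or is one of its descendants."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = taxonomy.get(cls)
    return False

def unify(term, value, bindings, constraints):
    """Bind a variable (or compare a constant) and check its constraint."""
    if not term.startswith("?"):
        return term == value
    if term in bindings:
        return bindings[term] == value
    if term in constraints and not constraints[term](value):
        return False
    bindings[term] = value
    return True

def match(pattern, graph, constraints, bindings=None):
    """Return variable bindings mapping the pattern onto the graph, or None."""
    bindings = {} if bindings is None else bindings
    if not pattern:
        return bindings
    (s, label, o), rest = pattern[0], pattern[1:]
    for gs, glabel, go in graph:
        if glabel != label:
            continue
        trial = dict(bindings)  # copy so a failed branch leaves no bindings behind
        if unify(s, gs, trial, constraints) and unify(o, go, trial, constraints):
            result = match(rest, graph, constraints, trial)
            if result is not None:
                return result
    return None

# Publication graph from the slide's figure.
graph = [
    ("PAPER17", "AUTHOR", "Arno Jacobsen"),
    ("PAPER17", "CONFERENCE", "SIGMOD"),
    ("PAPER17", "YEAR", "2001"),
    ("PAPER17", "LOCATION", "California"),
]
# Subscription: papers by Arno Jacobsen published in any proceedings since 2000.
pattern = [("?p", "AUTHOR", "?a"), ("?p", "CONFERENCE", "?c"), ("?p", "YEAR", "?y")]
constraints = {
    "?a": lambda v: v == "Arno Jacobsen",         # literal constraint
    "?c": lambda v: is_a(v, "Proceedings"),       # ontology constraint
    "?y": lambda v: int(v) >= 2000,               # literal constraint
}
print(match(pattern, graph, constraints))
```

The shared variable `?p` is what enforces the topology: all three pattern edges must leave the same node. Ranking is not covered by this sketch.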
Current Status • Work to date – Developed an approximate p/s model to express uncertainties and an efficient algorithm for approximate matching – Developed covering and merging optimizations for approximate content-based routing – Developed a graph-based p/s architecture applied to the dissemination of RDF metadata (including RSS) – Developed two novel algorithms (covering and merging) for creating a distributed content-based routing network for graph-structured data
Comments from Previous Meeting • Probability model • Qualitative similarity measure • Validate our results – Real data set – Interactive evaluation
Research Plan I • Membership Function Mining – Get a real data set – “Learn” the membership function • Clustering: K-means, DBSCAN • Regression: neural network • Semantic Matching and Routing Computation – Matching on ontology – Covering on ontology – Merging on ontology
Research Plan II • Design an experiment to validate the mining results • Design a method to combine the possibility and necessity measures for ranking • Push thresholds down the matching plan to increase the efficiency of the matching algorithm • Use probabilities as an alternative way to model uncertainties and imprecision