fa294cd891aada2d70edfb84b2886fc3.ppt
- Number of slides: 46
IBM Research
Sara Bouchenak – INRIA, France
Alan Cox – Rice University, Houston
Steven Dropsho – EPFL, Lausanne
Sumit Mittal – IBM Research, India
Willy Zwaenepoel – EPFL, Lausanne
© 2006 IBM Corporation
Dynamic Web Caching
Dynamic Web Content
[Figure: three-tier architecture with a cache. The client sends an HTTP request over the Internet to the web server (web tier), which passes it to the application server (business tier); the application server sends SQL requests to the database server (database tier) and receives SQL results; the HTTP response returns to the client.]
Motivation for Caching
§ Dynamic content represents a large portion of web requests
§ Examples: stock quotes, bidding and buying status on an auction site, best-sellers on a bookstore
§ Generating it places a huge burden on application servers
Dynamic Web Content
§ Dynamic content is not easy to cache
– Consistency must be ensured: cached entries are invalidated when updates occur
• Write requests can modify entries used by read requests
– Caching logic is inserted at different points in the application
• Entry and exit of requests, access to the underlying database
• Correlation between requests and their database accesses
→ Most solutions rely on “manually” understanding complex application logic
Our Contributions
§ Design of a cache, “AutoWebCache”, that
• Ensures consistency of cached documents
• Inserts caching logic transparently to the application
– Makes use of aspect-oriented programming
§ Analysis of the cache
• Transparency of injecting caching logic
• Improvement in response time for test-bed applications
Dynamic Web Caching – Solution Approach
[Figure: the request path (client, Internet, web server, application server, database server) with AutoWebCache attached to the application server. The caching logic performs cache checks, cache inserts and invalidations against a web page cache, using request info and database accesses.]
§ Consistency: correlation between read and write requests
§ Transparency: capture information flow
Outline
§ Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache structure
§ Aspectizing Web Caching
– Insertion of caching logic transparently
§ Evaluation
– Analysis of effectiveness, transparency
§ Conclusion
Maintaining Cache Consistency – Read Requests
§ Responses to read-only requests are cached
§ Read SQL queries are recorded with the cache entry
Index: URI (readHandlerName + readHandlerArgs) | Cached web page | Associated read queries
URI1 | WebPage1 | { ReadQuery11, ReadQuery12, … }
URI2 | WebPage2 | { ReadQuery21, ReadQuery22, … }
… | … | …
Maintaining Cache Consistency – Write Requests
§ Results of write requests are not cached
§ Write SQL queries are recorded
§ Write SQL queries are intersected with the read queries of cached pages
§ A cached page is invalidated if the intersection is non-empty
[Figure: write sets (WS) and read sets (RS) of queries; the case without overlap is labeled “No Invalidation”]
Invalidating Cache Entries
Index: URI (readHandlerName + readHandlerArgs) | Cached web page | Associated read queries
URI1 | WebPage1 | { ReadQuery11, ReadQuery12, … }   (Remove)
URI2 | WebPage2 | { ReadQuery21, ReadQuery22, … }
URI3 | WebPage3 | { ReadQuery31, ReadQuery32, … }
URIn | … | …
[Figure: an incoming write query is matched against the associated read queries, and the affected entry is removed]
Query Analysis Engine
§ Determines the intersection between SQL queries
§ Three levels of granularity for intersection
– Column based
– Value based
– Extra query based
§ Balances precision against complexity
Column Based Intersection
Invalidate if Column_Read = Column_Updated
Table T:
a | b | c
5 | 8 | 7
1 | 10 | 9
Read: SELECT T.a FROM T WHERE T.b = 8
Write: UPDATE T SET T.c = 7 WHERE T.b = 10 → Ok
Write: UPDATE T SET T.a = 12 WHERE T.b = 10 → Invalidate
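The column-based rule can be sketched in plain Java as follows. This is a minimal illustration, not the AutoWebCache implementation: the class names `ColumnFootprint` and `ColumnBasedIntersection` are hypothetical, the queries are assumed to be already parsed into a table name and a set of column names, and it is an assumption of this sketch that the read's footprint may include predicate columns as well as selected columns.

```java
import java.util.Set;

/** Hypothetical, pre-parsed footprint of one SQL statement on a single table. */
final class ColumnFootprint {
    final String table;
    final Set<String> columns;

    ColumnFootprint(String table, Set<String> columns) {
        this.table = table;
        this.columns = columns;
    }
}

final class ColumnBasedIntersection {
    /** Returns true when the write touches a column that the cached read depends on. */
    static boolean invalidates(ColumnFootprint read, ColumnFootprint write) {
        if (!read.table.equals(write.table)) {
            return false; // different tables never intersect in this sketch
        }
        for (String column : write.columns) {
            if (read.columns.contains(column)) {
                return true;
            }
        }
        return false;
    }
}
```

With the read footprint {a, b} on table T, an update of column c does not invalidate, while an update of column a does, which matches the two cases on the slide.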
Value Based Intersection
Invalidate if Rows_Read = Rows_Updated
Table T:
a | b | c
5 | 8 | 7
1 | 10 | 9
Read: SELECT T.a FROM T WHERE T.b = 8
Write: UPDATE T SET T.a = 7 WHERE T.b = 10 → Ok (would invalidate with column-based)
Write: UPDATE T SET T.a = 12 WHERE T.b = 8 → Invalidate
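A rough Java sketch of the value-based test is given here. It rests on simplifying assumptions that are not from the slides: both statements target the same table, WHERE clauses are conjunctions of equality predicates represented as column-to-value maps, and the class name `ValueBasedIntersection` is hypothetical. When the predicates cannot be compared, the sketch falls back to invalidating.

```java
import java.util.Map;
import java.util.Set;

final class ValueBasedIntersection {
    /**
     * Returns false only when the write provably touches no column or no row
     * that the cached read depends on; otherwise returns true (invalidate).
     */
    static boolean invalidates(Set<String> readColumns, Map<String, ?> readWhere,
                               Set<String> writtenColumns, Map<String, ?> writeWhere) {
        // Step 1: column-level test.
        boolean columnsOverlap = false;
        for (String column : writtenColumns) {
            if (readColumns.contains(column)) {
                columnsOverlap = true;
                break;
            }
        }
        if (!columnsOverlap) {
            return false;
        }
        // Step 2: rows are disjoint if both predicates pin the same column
        // to different values.
        for (String column : writeWhere.keySet()) {
            Object readValue = readWhere.get(column);
            Object writeValue = writeWhere.get(column);
            if (readValue != null && !readValue.equals(writeValue)) {
                return false;
            }
        }
        // Step 3: not provably disjoint, so invalidate conservatively.
        return true;
    }
}
```

For the read with T.b = 8, a write restricted to T.b = 10 is judged safe, a write restricted to T.b = 8 invalidates, and a write restricted on a different column (for example T.c = 9) invalidates conservatively.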
Extra Query Based Intersection
Generate an extra query to find missing values
Table T:
a | b | c
5 | 8 | 7
1 | 10 | 9
Read: SELECT T.a FROM T WHERE T.b = 8
Write: UPDATE T SET T.a = 3 WHERE T.c = 9 (would invalidate with value-based)
Extra query: SELECT T.b FROM T WHERE T.c = 9 → Ok
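The extra-query step can be illustrated with the sketch below. The class `ExtraQueryIntersection` and both method names are hypothetical, the read is assumed to have a single equality predicate, and the result of the extra query is passed in as a list so that the example runs without a database. Issuing the extra query before the update is applied is an assumption of this sketch.

```java
import java.util.List;

final class ExtraQueryIntersection {
    /** Builds the extra query: the read's predicate column, restricted by the write's WHERE clause. */
    static String extraQuery(String table, String readPredicateColumn, String writeWhereClause) {
        return "SELECT " + table + "." + readPredicateColumn
                + " FROM " + table + " WHERE " + writeWhereClause;
    }

    /** Invalidate only if some row touched by the write also satisfies the read's equality predicate. */
    static boolean invalidates(Object readPredicateValue, List<?> extraQueryResult) {
        return extraQueryResult.contains(readPredicateValue);
    }
}
```

For the slide's example, `extraQuery("T", "b", "T.c = 9")` yields `SELECT T.b FROM T WHERE T.c = 9`. On the example table that query returns the single value 10, which differs from the read's value 8, so the cached page is kept.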
Outline
§ Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache structure
§ Aspectizing Web Caching
– Insertion of caching logic transparently
§ Evaluation
– Analysis of effectiveness, transparency
§ Conclusion
Dynamic Web Caching – Solution Approach
[Figure: the request path (client, Internet, web server, application server, database server) with AutoWebCache attached to the application server. The caching logic performs cache checks, cache inserts and invalidations against a web page cache, using request info and database accesses.]
§ Transparency: capture information flow
Aspect-Oriented Programming (AOP)
§ Modularizes cross-cutting concerns as aspects
– Logging, billing, exception handling
§ Works on three principles
– (1) Pointcuts: capture the execution points of interest
• Method calls, exception points, read/write accesses
– (2) Advice: determine what to do at these pointcuts
• Encodes cross-cutting logic (before / after / around)
– (3) Weaving: bind pointcuts and advice together
• AspectJ compiler for Java
Insertion of Caching Logic
[Figure: the original web application, the weaving rules and the caching library are fed into aspect weaving (AspectJ), which produces the cache-enabled version of the web application]
Aspectizing Read Requests
Original code of a read-only request handler (captured as the servlet's main method), with the woven caching logic:
    // Capturing request entry: cache check
    String cachedDoc = Cache.get(uri, inputInfo);
    if (cachedDoc != null) return cachedDoc;   // Cache hit
    // Execute SQL queries …
    SQL query 1   // Capturing SQL queries: collect SQL query info (dependency info)
    SQL query 2
    …
    // Generate a web document
    webDoc = …
    // Capturing request exit: cache insert
    Cache.add(webDoc, uri, inputInfo, dependencyInfo);   // Cache miss
    // Return the web document …
Aspectizing Write Requests
Original code of a write request handler (captured as the servlet's main method), with the woven caching logic:
    // Execute SQL queries …
    SQL query 1   // Capturing SQL queries: collect SQL query info (invalidation info)
    SQL query 2
    …
    // Capturing request exit: cache invalidation
    // Cache consistency
    Cache.remove(invalidationInfo);
    // Return
Capturing Servlet’s main Method
    // Pointcut for Servlets’ main method
    pointcut servletMainMethodExecution(...) :
        execution(void HttpServlet+.doGet(HttpServletRequest, HttpServletResponse)) ||
        execution(void HttpServlet+.doPost(HttpServletRequest, HttpServletResponse));
§ Pointcut captures entry and exit points of web request handlers
§ Cache checks and inserts for read requests
§ Invalidations for update requests
Weaving Rules for Cache Checks and Inserts
    // Advice for read-only requests
    around(...) : servletMainMethodExecution(...) {
        // Pre-processing: cache check
        String cachedDoc;
        cachedDoc = ... call Cache.get of AutoWebCache
        if (cachedDoc != null) {
            ... return cachedDoc
        }
        // Normal execution of the request
        proceed(...);
        // Post-processing: cache insert
        ... call Cache.add of AutoWebCache
    }
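The control flow of the around advice (check, proceed, insert) can be mimicked in plain Java without AspectJ. In the sketch below, the class `ReadThroughPageCache` and its `handle` method are hypothetical, the `Supplier` plays the role of `proceed(...)`, and dependency tracking is left out. The sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

final class ReadThroughPageCache {
    private final Map<String, String> pages = new HashMap<>();

    /** Wraps a read-only request handler the way the around advice wraps doGet/doPost. */
    String handle(String uri, Supplier<String> proceed) {
        // Pre-processing: cache check
        String cachedDoc = pages.get(uri);
        if (cachedDoc != null) {
            return cachedDoc; // cache hit, the handler is not executed
        }
        // Normal execution of the request
        String webDoc = proceed.get();
        // Post-processing: cache insert
        pages.put(uri, webDoc);
        return webDoc;
    }
}
```

Calling `handle` twice with the same URI runs the supplied handler only once; the second call is served from the map.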
Weaving Rules for Cache Invalidations
    // Advice for write requests
    after(...) : servletMainMethodExecution(...) {
        // Cache invalidation
        ... call Cache.remove of AutoWebCache
    }
Weaving Rules for Collecting Consistency Information
    // Pointcut for SQL query calls
    pointcut sqlQueryCall() :
        call(ResultSet PreparedStatement.executeQuery()) ||
        call(int PreparedStatement.executeUpdate());
    // Advice for SQL query calls
    after() : sqlQueryCall() {
        ... collect consistency info ...
    }
§ After each SQL query, note
– Query template
– Query instance values
Transparency of AutoWebCache
§ Ability to capture information flow
– Entry and exit points of request handlers
• e.g. doGet(), doPost() APIs for Java Servlets
– Modification to underlying data sets
• e.g. JDBC calls for SQL requests
– Multiple sources of dynamic behavior
• Currently handles dynamic behavior from SQL queries
• Needs standard interfaces for all sources
Hidden State Problem
    // Request execution
    …
    Number number = getRandom();
    Image img = getImage(number);
    displayImage(img);
    …
§ The request does not contain all the information needed to create the response
§ Occurs when the application uses random numbers, timers, etc.
§ Subsequent requests result in different responses
§ It is the developer's duty to declare such requests non-cacheable
Use of Application Semantics
§ Aspect-oriented programming relies on code syntax
– Cannot capture semantic concepts
§ In the TPC-W application
– Best Seller requests allow dirty reads for 30 sec
– Conforms to specification clauses 3.1.4.1 and 6.3.3.1
§ Application semantics can be used to improve performance
– Best Seller cache entry time-out set to 30 sec
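A time-out on a cache entry, as used for the Best Seller page, might look like the following sketch. The class `TimedPageCache` is hypothetical and not part of AutoWebCache; the clock is injected as a `LongSupplier` (an assumption made here so that expiry can be demonstrated without waiting), and in real use `System::currentTimeMillis` could be passed instead.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

final class TimedPageCache {
    private static final class Entry {
        final String page;
        final long expiresAtMillis;

        Entry(String page, long expiresAtMillis) {
            this.page = page;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final LongSupplier clockMillis;

    TimedPageCache(LongSupplier clockMillis) {
        this.clockMillis = clockMillis;
    }

    /** Caches a page that may be served, possibly stale, for ttlMillis. */
    void add(String uri, String page, long ttlMillis) {
        entries.put(uri, new Entry(page, clockMillis.getAsLong() + ttlMillis));
    }

    /** Returns the cached page, or null if absent or timed out. */
    String get(String uri) {
        Entry entry = entries.get(uri);
        if (entry == null) {
            return null;
        }
        if (clockMillis.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(uri); // time-out reached: drop the entry
            return null;
        }
        return entry.page;
    }
}
```

With a 30 000 ms time-out, the entry is still served just before 30 seconds have elapsed and is dropped once the 30 seconds are reached, regardless of intervening writes.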
Outline
§ Design of AutoWebCache
– Maintaining cache consistency
• Determine relationship between reads and updates
– Cache structure
§ Aspectizing Web Caching
– Insertion of caching logic transparently
§ Evaluation
– Analysis of effectiveness
§ Conclusion
Evaluation Environment
§ RUBiS
– Auction site based on eBay
– Browsing items, bidding, leaving comments, etc.
– Large number of requests that can be satisfied quickly
§ TPC-W
– Models an on-line bookstore
– Listing new products, best-sellers, shopping cart, etc.
– Small number of requests that are database intensive
§ Client Emulator
– Client browser emulator generates requests
– Average think time and session time conform to the TPC-W v1.8 specification
– Cache warmed for 15 min, statistics gathered over 30 min
Response Time for RUBiS – Bidding Mix
[Figure: response time (ms) versus number of clients (0 to 1000); series: No cache, AutoWebCache]
Relative Benefits for Different Requests in RUBiS
[Figure: percent of requests per request type, split into hits and misses]
Response Time for TPC-W – Shopping Mix
[Figure: response time (ms) versus number of clients (50 to 400); series: No cache, AutoWebCache, Optimization for Semantics]
Relative Benefits for Different Requests in TPC-W
[Figure: percent of requests per request type, split into hits, hits based on app. semantics, and misses]
Implementation of AutoWebCache
Web application (TPCW): 46 Java classes, 12 K lines of Java code
AOP-based caching, caching library: 13 Java classes, 4.6 K lines of Java code
AOP-based caching, weaving rules: 1 AspectJ file, 150 lines of AspectJ code
(The extracted table also contains a code size of 5.8 K whose row label is not recoverable.)
Conclusion
§ AutoWebCache, a cache that
• Ensures consistency of cached documents
– Query analysis
• Inserts caching logic transparently to the application
– Makes use of aspect-oriented programming
§ Transparency of AutoWebCache
• Well-defined, standard interfaces for information flow
• Presence of hidden states
• Use of application semantics
Questions / Comments / Suggestions!
Thank You!!
SQL Query Structure
SELECT T.a FROM T WHERE T.b = 10
UPDATE T SET T.c WHERE 20 < T.d < 35
[Annotations: column(s) selected, column(s) updated, table concerned, predicate condition]
Response Time for RUBiS – Bidding Mix
[Figure: response time (ms) versus number of clients (0 to 1000); series: No cache, AC extra query, AC column based, Hand-coded, AC value based]
Response Time for TPC-W – Shopping Mix
[Figure: response time (ms) versus number of clients (0 to 450); series: No cache, AC extra query, AC column based, Hand-coded, AC value based]
Cache Structure in AutoWebCache
First index: URI (readHandlerName + readHandlerArgs) → cached web page
URI1 → WebPage1
URI2 → WebPage2
…
Second index: SQL string → <value vector, URI> pairs
ReadQueryTemplate1 → <instance values 1a, URI1>, <instance values 1b, URI41>, <instance values 1c, URI57>
ReadQueryTemplate2 → <instance values 2a, URI7>
ReadQueryTemplate3 → <instance values 3a, URI12>
…
Remove: if a write query invalidates ReadQueryTemplate1 with instance values 1a, the page cached under URI1 is removed
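The two-index structure can be approximated with nested maps, as in the sketch below. The class `TwoIndexCache` and its method names are hypothetical. The sketch assumes that the query analysis has already decided which (template, instance values) pair a write invalidates, and it keeps a set of URIs per value vector so that several pages may share one query instance.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TwoIndexCache {
    // First index: URI -> cached web page
    private final Map<String, String> pagesByUri = new HashMap<>();
    // Second index: read query template -> (instance values -> URIs of dependent pages)
    private final Map<String, Map<List<Object>, Set<String>>> urisByQuery = new HashMap<>();

    void put(String uri, String page) {
        pagesByUri.put(uri, page);
    }

    /** Records that the page at uri was built from this read query instance. */
    void addDependency(String uri, String queryTemplate, List<Object> instanceValues) {
        urisByQuery.computeIfAbsent(queryTemplate, t -> new HashMap<>())
                   .computeIfAbsent(instanceValues, v -> new HashSet<>())
                   .add(uri);
    }

    String get(String uri) {
        return pagesByUri.get(uri);
    }

    /** Removes every page that depends on the given read query instance. */
    void invalidate(String queryTemplate, List<Object> instanceValues) {
        Map<List<Object>, Set<String>> instances = urisByQuery.get(queryTemplate);
        if (instances == null) {
            return;
        }
        Set<String> uris = instances.remove(instanceValues);
        if (uris == null) {
            return;
        }
        for (String uri : uris) {
            pagesByUri.remove(uri);
        }
    }
}
```

One simplification to note: when a page is removed, its dependencies under other query templates are left in the second index. They are harmless here because removing an absent URI is a no-op, but a fuller implementation would likely clean them up.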
Evaluation
§ Analysis of AutoWebCache
– Effect on performance of applications
– Relation of application semantics to cache efficiency
– Relative benefit of caching on different read-only requests
– Usefulness of AOP techniques in implementing the caching system
Breakdown of Response Times for Requests in RUBiS
[Figure: response time (ms, 0 to 350) per request type; series: Overall avg. response time, Extra time for a miss (on top of overall response time)]
Breakdown of Response Times for Requests in TPC-W
[Figure: response time (ms, 0 to 350) per request type; series: Overall avg. response time, Extra time for a miss (on top of overall response time)]
Key Aspect-Oriented Programming Concepts
§ “Join points” identify executable points in the system
– Method calls, read and write accesses, invocations
§ “Pointcuts” allow capturing of various join points
§ “Advice” specifies actions to be performed at pointcuts
– Before or after the execution of a pointcut
– Encodes the cross-cutting logic
Conclusion
§ Dynamic content is not easy to cache
– Ensure consistency, invalidate cached entries as a result of updates
→ AutoWebCache: query analysis
– Caching logic inserted at different points in the application
• Entry and exit of requests, access to underlying database
– Most solutions rely on understanding complex application logic
→ AutoWebCache: transparent insertion of caching logic using AOP
§ Transparency affected by
• Well-defined, standard interfaces for information flow
• Presence of hidden states
• Use of application semantics
Dynamic Web Caching versus Query Caching
§ The two are complementary
§ Web caching is useful when the application server is the bottleneck
§ Documents can be cached nearer to the client and distributed
§ Web page caching can make use of application semantics (Best Seller for TPC-W)