Large-scale Incremental Processing Using Distributed Transactions and Notifications
Written by Daniel Peng and Frank Dabek
Presented by Michael Over

Abstract
• Task: updating an index of the web as documents are crawled
  • Requires continuously transforming a large repository of existing documents as new documents arrive
• One example of a class of data processing tasks that transform a large repository of data via small, independent mutations

Abstract
• These tasks lie in a gap between the capabilities of existing infrastructure:
  • Databases – cannot meet the storage/throughput requirements
  • MapReduce – must create large batches for efficiency
• Percolator
  • A system for incrementally processing updates to a large data set
  • Deployed to create the Google web search index
  • Now processes the same number of documents per day, but has reduced the average age of documents in Google search results by 50%

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Task
• Task: build an index of the web that can be used to answer search queries
• Approach:
  • Crawl every page on the web and process them
  • Maintain a set of invariants – same content, link inversion
  • Could be done using a series of MapReduce operations

Challenge
• Challenge: update the index after recrawling some small portion of the web
• Could we run MapReduce over just the recrawled pages?
  • No – there are links between the new pages and the rest of the web
• Could we run MapReduce over the entire repository?
  • Yes – this is how Google's web search index was produced prior to this work
• What are some effects of this?

Challenge
• What about a DBMS?
  • Cannot handle the sheer volume of data
• What about distributed storage systems like Bigtable?
  • Scalable, but does not provide tools to maintain data invariants in the face of concurrent updates
• Ideally, the data processing system for the task of maintaining the web search index would be optimized for incremental processing and able to maintain invariants

Percolator
• Provides the user with random access to a multi-petabyte repository
  • Process documents individually
  • Many concurrent threads
• ACID-compliant transactions
• Observers – invoked when a user-specified column changes
• Designed specifically for incremental processing

Percolator
• Google uses Percolator to prepare web pages for inclusion in the live web search index
• Can now process documents as they are crawled
  • Reduces the average document processing latency by a factor of 100
  • Reduces the average age of a document appearing in a search result by nearly 50%

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Design
• Two main abstractions for performing incremental processing at large scale:
  • ACID-compliant transactions over a random-access repository
  • Observers – a way to organize an incremental computation
• A Percolator system consists of three binaries:
  • A Percolator worker
  • A Bigtable tablet server
  • A GFS chunkserver

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Bigtable Overview
• Percolator is built on top of the Bigtable distributed storage system
• Bigtable is a multi-dimensional sorted map
  • Keys: (row, column, timestamp) tuples
• Provides lookup and update operations on each row
• Row transactions enable atomic read-modify-write operations on individual rows
• Runs reliably on a large number of unreliable machines, handling petabytes of data
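To make the data model concrete, here is a toy, in-memory sketch of a sorted map keyed by (row, column, timestamp). The `Update` and `Lookup` helpers and the `std::map` are illustrative assumptions, not Bigtable's actual API:

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <tuple>

// Toy model of Bigtable's data model: one big sorted map keyed by
// (row, column, timestamp). Real Bigtable shards this map into tablets
// spread across tablet servers; this only illustrates the semantics.
using Key = std::tuple<std::string, std::string, uint64_t>;
std::map<Key, std::string> table;

void Update(const std::string& row, const std::string& col,
            uint64_t ts, const std::string& value) {
  table[{row, col, ts}] = value;  // versions coexist under distinct timestamps
}

// Read the newest version of (row, col) at or below `ts` -- the access
// pattern a timestamp-based snapshot read needs.
std::optional<std::string> Lookup(const std::string& row,
                                  const std::string& col, uint64_t ts) {
  auto it = table.upper_bound(Key{row, col, ts});
  if (it == table.begin()) return std::nullopt;
  --it;  // last entry with key <= (row, col, ts)
  if (std::get<0>(it->first) != row || std::get<1>(it->first) != col)
    return std::nullopt;  // nearest entry belongs to a different cell
  return it->second;
}
```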

Bigtable Overview
• A running Bigtable consists of a collection of tablet servers
• Each tablet server is responsible for serving several tablets
• Percolator maintains the gist of Bigtable's interface
  • Percolator's API closely resembles Bigtable's
• Challenge: provide the additional features of multi-row transactions and the observer framework

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Transactions
• Percolator provides cross-row, cross-table transactions with ACID snapshot-isolation semantics
• Stores multiple versions of each data item using Bigtable's timestamp dimension
• Provides snapshot isolation, which protects against write-write conflicts
• Percolator must explicitly maintain locks (Bigtable itself offers only single-row transactions)
• Example of a transaction involving bank accounts (next slide)

Transactions
• Example: bank-account transfer of $4 from Bob ($10 → $6) to Joe ($2 → $6), with each cell's versions shown as "timestamp: value":

Key: Bob
  bal:data  → 8: ; 7: $6 ; 6: ; 5: $10
  bal:lock  → 8: ; 7: "I am Primary" ; 6: ; 5:
  bal:write → 8: data @ 7 ; 7: ; 6: data @ 5 ; 5:

Key: Joe
  bal:data  → 8: ; 7: $6 ; 6: ; 5: $2
  bal:lock  → 8: ; 7: "Primary @ Bob.bal" ; 6: ; 5:
  bal:write → 8: data @ 7 ; 7: ; 6: data @ 5 ; 5:
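A minimal sketch of the transfer behind this table, assuming a Percolator-style Get/Set/Commit transaction interface like the pseudocode in the paper. The in-memory `store` and the always-succeeding `Commit` are toy assumptions; the real system reads at a start timestamp, buffers writes, and commits via two-phase commit over Bigtable, aborting on write-write conflicts:

```cpp
#include <map>
#include <string>
#include <utility>

// Toy backing store standing in for Bigtable; keys are (row, column).
std::map<std::pair<std::string, std::string>, int> store = {
    {{"Bob", "bal"}, 10}, {{"Joe", "bal"}, 2}};

struct Transaction {
  std::map<std::pair<std::string, std::string>, int> writes;
  int Get(const std::string& row, const std::string& col) {
    return store[{row, col}];  // reads go to the committed store
  }
  void Set(const std::string& row, const std::string& col, int v) {
    writes[{row, col}] = v;  // buffered until Commit()
  }
  bool Commit() {
    // Real Percolator would abort here on a write-write conflict;
    // this toy version applies the buffered writes and always succeeds.
    for (const auto& [cell, v] : writes) store[cell] = v;
    return true;
  }
};

// Transfer `amount` from Bob to Joe, matching the table above
// ($10 - $4 = $6 and $2 + $4 = $6 at timestamp 7).
bool Transfer(int amount) {
  Transaction t;
  t.Set("Bob", "bal", t.Get("Bob", "bal") - amount);
  t.Set("Joe", "bal", t.Get("Joe", "bal") + amount);
  return t.Commit();
}
```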

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Timestamps
• The timestamp oracle is a server that hands out timestamps in strictly increasing order
• Every transaction requires contacting the timestamp oracle twice, so this server must scale well
• For failure recovery, the timestamp oracle needs to write the highest allocated timestamp to disk before responding to a request
  • For efficiency, it batches these writes and "pre-allocates" a whole block of timestamps
• How many timestamps do you think Google's timestamp oracle serves per second from one machine?
  • Answer: 2,000,000 (2 million) per second
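A sketch of the pre-allocation idea; `PersistHighWater`, the block size, and the class shape are illustrative assumptions standing in for the oracle's synchronous log write, not Google's implementation:

```cpp
#include <cstdint>
#include <mutex>

// Stand-in for a synchronous write of the high-water mark to stable
// storage (fsync a log record, etc.); assumed for this sketch.
void PersistHighWater(uint64_t limit) {}

class TimestampOracle {
 public:
  // Strictly increasing timestamps; most calls are served from memory.
  // (A real oracle would also batch many concurrent RPCs behind each
  // disk write.)
  uint64_t GetTimestamp() {
    std::lock_guard<std::mutex> guard(mu_);
    if (next_ >= persisted_limit_) {
      // Pre-allocate a whole block: after a crash, restart from the
      // persisted limit, skipping (never reusing) unissued timestamps.
      persisted_limit_ += kBlockSize;
      PersistHighWater(persisted_limit_);  // must hit disk before replying
    }
    return next_++;
  }

 private:
  static constexpr uint64_t kBlockSize = 1'000'000;
  uint64_t next_ = 0;
  uint64_t persisted_limit_ = 0;  // highest timestamp safe to hand out
  std::mutex mu_;
};
```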

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Notifications
• Transactions let the user mutate the table while maintaining invariants, but users also need a way to trigger and run those transactions
• In Percolator, the user writes "observers" to be triggered by changes to the table
• Percolator invokes the observer's function after data is written to one of the columns registered for that observer
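A rough sketch of what writing an observer might look like. The `Observer` base class, the `Process` signature, and the commented-out registration call are assumptions about the shape of the API (the paper presents observers in C++-style pseudocode), not Percolator's actual interface:

```cpp
#include <string>

struct Transaction;  // Percolator-style transaction, as sketched earlier

// Assumed observer interface: Percolator calls Process() with a fresh
// transaction after an observed column changes.
struct Observer {
  virtual ~Observer() = default;
  virtual void Process(Transaction* t, const std::string& row) = 0;
};

struct DocumentProcessor : Observer {
  void Process(Transaction* t, const std::string& row) override {
    // Parse the newly crawled document stored at `row`; write derived
    // columns (links, duplicates, ...). Those writes may themselves
    // trigger downstream observers, chaining the computation.
  }
};

// Hypothetical registration: run DocumentProcessor whenever the
// "raw:document" column of any row changes.
// RegisterObserver("raw:document", new DocumentProcessor);
```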

Notifications
• Percolator applications are structured as a series of observers
• Notifications are similar to database triggers or events in active databases, but unlike triggers they cannot be used to maintain data invariants
• Percolator needs to efficiently find dirty cells whose observers need to be run (a toy sketch follows)
• To do so, it maintains a special "notify" Bigtable column, containing an entry for each dirty cell
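A toy sketch of the scan over the notify column; the `notify_column` set and the observer map are assumptions for illustration. Real Percolator stores the notify column in a separate Bigtable locality group so scans touch only dirty cells, runs many distributed worker threads, and clears each notification so at most one observer transaction runs per change:

```cpp
#include <functional>
#include <map>
#include <set>
#include <string>
#include <utility>

// Toy model of the notify mechanism: one entry per dirty cell.
using Cell = std::pair<std::string, std::string>;  // (row, column)
std::set<Cell> notify_column;
std::map<std::string, std::function<void(const std::string&)>> observers;

// One pass of a worker's scan loop: run the observer registered for
// each dirty cell, then erase the entry so the work is not redone.
void ScanOnce() {
  for (auto it = notify_column.begin(); it != notify_column.end();) {
    const auto& [row, column] = *it;
    if (auto obs = observers.find(column); obs != observers.end()) {
      obs->second(row);  // would run inside its own Percolator transaction
    }
    it = notify_column.erase(it);
  }
}
```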

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Evaluation
• Percolator lies somewhere in the performance space between MapReduce and DBMSs
• Converting from MapReduce: Percolator was built to create Google's large "base" index, a task previously done by MapReduce
• In the MapReduce-based system, several billion documents were crawled each day and fed through a chain of 100 MapReduces, resulting in an index which answered user queries

Evaluation
• Using MapReduce, each document spent 2-3 days being indexed before it could be returned as a search result
• Percolator crawls the same number of documents, but each document is sent through Percolator as it is crawled
• The immediate advantage is a reduction in latency: the median document moves through over 100x faster than with MapReduce

Evaluation
• Percolator freed Google from needing to process the entire repository each time documents were indexed
• As a result, they can increase the size of the repository (and have: it is now 3x its previous size)
• Percolator is easier to operate – there are fewer moving parts: just tablet servers, Percolator workers, and chunkservers

Evaluation
• Question: how do you think Percolator performs in comparison to MapReduce if:
  • 1% of the repository needs to be updated per hour?
  • 30% of the repository needs to be updated per hour?
  • 60% of the repository needs to be updated per hour?
  • 90% of the repository needs to be updated per hour?

Evaluation
[Figure: comparison of Percolator and MapReduce performance as the fraction of the repository updated per hour varies]

Evaluation
• Comparing Percolator versus "raw" Bigtable
• Percolator introduces overhead relative to Bigtable: roughly a factor-of-four overhead on writes, corresponding to four round trips from the Percolator worker:
  1. To the timestamp server, for a start timestamp
  2. To Bigtable, for the tentative write
  3. To the timestamp server, for a commit timestamp
  4. To Bigtable, for the commit

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Related Work
• Batch processing systems like MapReduce are well suited for efficiently transforming or analyzing an entire repository
• DBMSs satisfy many of the requirements of an incremental system but do not scale like Percolator
• Bigtable is a scalable, distributed, and fault-tolerant storage system, but is not designed to be a data transformation system
• CloudTPS builds an ACID-compliant datastore on top of distributed storage, but is intended to be a backend for a website (stronger focus on latency and partition tolerance than Percolator)

Outline
• Introduction
• Design
  • Bigtable
  • Transactions
  • Timestamps
  • Notifications
• Evaluation
• Related Work
• Conclusion and Future Work

Conclusion and Future Work
• Percolator has been deployed to produce Google's web search index since April 2010
• Its goals were to reduce the latency of indexing a single document with an acceptable increase in resource usage
• Scaling the architecture costs a very significant 30-fold overhead compared to traditional database architectures
  • How much of this is fundamental to distributed storage systems, and how much could be optimized away?