
Distributed Systems Topic 10: Cloud Computing Systems Dr. Michael R. Lyu Computer Science & Engineering Department The Chinese University of Hong Kong © Chinese University, CSE Dept. Distributed Systems / 10 - 1

Outline
1. Cloud Computing Introduction
2. Virtualization Concept
3. The Hadoop Distributed File System (HDFS)
4. MapReduce Computing Paradigm
5. Summary

1 What Is Cloud Computing?
• Cloud computing has recently become a popular buzzword
  – More and more companies claim that their products are based on cloud computing
• So, what is cloud computing?
  – A new name for an old technology, created by marketing people?
  – A new technology focusing on higher-speed computing?
  – A new programming paradigm?

1 What Is Cloud Computing?
• Let us understand cloud computing through its economic drivers
  – Consider an example of hosting an Internet service
    » e.g., a website
[Figure: CUHK Co. Ltd. and its server]

1.1 Hosting Computing Systems
[Figure: CUHK Co. Ltd. and its server]
• A single server is enough for a small business
• Only a small budget is needed

1.1 Hosting Computing Systems
[Figure: CUHK Co. Ltd. and its servers]
• More servers are needed if you have more clients
• Providing the service costs more manpower and money

1.1 Hosting Computing Systems
[Figure: CUHK Co. Ltd. and its data center]
• A huge data center is needed if you have a huge number of clients
• Providing the service costs a huge amount of money


1.1 Hosting Computing Systems
• The number of concurrent users is dynamic
  – You need to provision for the peak number
    » Otherwise, users get a bad experience!
  – So you have to pay for the servers whether or not you are using them
    » The money spent on buying the hardware
    » The salaries of the maintainers
    » The electricity (including air conditioning)
    » Other maintenance costs, e.g., repairs, upgrades, and renting a room for the servers

1.2 Cloud Computing Motivations
• How can hosting computing systems be made more economical?
[Figure: CUHK Co. Ltd. and its server, with the thought: "I wish this cloud could really host my system, so that I could provide a scalable, on-demand service"]

1.2 Cloud Computing Introduction
• Hosting computing systems in a scalable, on-demand way
  – To be economical!
• Cloud computing
  – An emerging computing concept that approaches this goal
  – Based on many distributed system techniques
    » Service-oriented system architecture
    » Virtualization technology
    » Distributed computing paradigm

1.3 Pay As You Go
• The core of cloud computing is outsourcing computing and storage
  – To another company
  – To a dedicated unit in the same organization
• Pay as you go!
  – Pay for exactly what you have used
  – Cost down!
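To make the pay-as-you-go argument concrete, the sketch below compares paying for peak capacity all day with paying only for the server-hours actually used. The hourly demand figures and the price are hypothetical, and the sketch assumes the same unit price under both models, which real providers do not guarantee.

```python
# Hypothetical number of servers needed in each hour of one day (not from the slides).
hourly_demand = [2, 2, 1, 1, 1, 2, 4, 8, 12, 14, 15, 16,
                 16, 15, 14, 12, 10, 12, 14, 10, 6, 4, 3, 2]
PRICE_PER_SERVER_HOUR = 1.0  # assumed unit price, identical for both models


def peak_provisioned_cost(demand, price):
    """Own enough servers for the peak hour and pay for all of them every hour."""
    return max(demand) * len(demand) * price


def pay_as_you_go_cost(demand, price):
    """Pay only for the server-hours that were actually used."""
    return sum(demand) * price


peak = peak_provisioned_cost(hourly_demand, PRICE_PER_SERVER_HOUR)
used = pay_as_you_go_cost(hourly_demand, PRICE_PER_SERVER_HOUR)
print(f"peak-provisioned: {peak:.0f}, pay-as-you-go: {used:.0f}")
```

The gap between the two totals is the idle capacity that a peak-provisioned system pays for but does not use.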

1.3 Outsourcing
Electronics industry vs. cloud computing:
• Who outsources: brand vendor vs. service vendor (providing services for end users)
• What is outsourced: manufacturing vs. computing and storage (to a cloud computing provider)
• Why demand varies: different manufacturing requirements in low season and high season vs. different service requirements at different times
• What scales with demand: # of product lines and # of workers vs. # of hardware machines and # of maintainers
• Goal in both cases: cost down!

1.4 Cloud Computing System Architecture
Layers of a computing system:
• Application: user software, API
• Platform: OS, middleware
• Infrastructure: hardware

1.4 Cloud Computing System Architecture
• Three cloud computing stacks
  – SaaS: Software as a Service
  – PaaS: Platform as a Service
  – IaaS: Infrastructure as a Service
[Figure: for each stack, which of the application, platform/OS, and infrastructure layers are the cloud user's own and which are provided by the cloud computing provider]

1.5 Cloud Computing Industry
• Amazon Elastic Compute Cloud (EC2)
• Microsoft's Windows Azure Platform
• Google App Engine
• Other small startups: Heroku & Engine Yard
• And more (and more)…

1.5 Cloud Computing Industry
• Amazon Elastic Compute Cloud (EC2)
  – Xen-based virtual computing environment
    » A user can run Linux-based applications
  – IaaS: a user can control nearly the entire software stack, from the kernel upwards
  – Provides a low level of virtualization
    » Raw CPU cycles, block-device storage, IP-level connectivity

1.5 Cloud Computing Industry
• Microsoft's Windows Azure Platform
  – PaaS: uses the Windows Azure Hypervisor as the infrastructure and the .NET framework as the application container
  – Supports general-purpose computing
  – Users can choose the language, but cannot control the underlying operating system or runtime

1.5 Cloud Computing Industry
• Google App Engine
  – Supports APIs for the Google Datastore, Google Accounts, URL fetch, image manipulation, email services, etc.
  – PaaS: application domain-specific platform
    » Not suitable for general-purpose computing

1.5 Cloud Computing Industry
• Other small startups: Heroku & Engine Yard
  – Based on Ruby on Rails
  – PaaS
• And more (and more)…

2 Virtualization Concept
• Virtualization refers to the act of creating a virtual (rather than actual) version of
  – a computer hardware platform
  – an operating system (OS)
  – a storage device
  – a database server
  – computer network resources
• We can virtualize physical servers to support all applications running on them
• Virtualization is the key to cloud computing

2.1 The Traditional Server Concept
[Figure: four dedicated servers, each with its own OS (Windows or Linux): a Web server (IIS), an application server (Glassfish), a database server (MySQL), and an e-mail server (Exchange)]

2.1 The Traditional Server Concept
• System administrators often talk about servers as a whole unit
  – Including the hardware, the OS, the storage, and the applications
• Servers are often referred to by their functions
  – e.g., the Web servers, the SQL servers
  – If the SQL servers are overloaded, the administrator must add a new one

2.1 If Something Goes Wrong…
[Figure: the same four dedicated servers (Web server with IIS, application server, database server with MySQL, e-mail server with Exchange), with the application server marked DOWN!]

2.1 The Traditional Server Concept
• If a server fails
  – Unless there are redundant servers, the whole service is down
• The administrators can implement clusters of servers
  – This makes the service more fault tolerant
  – However, even clusters are not scalable
  – And resource utilization is typically low

2.2 The Virtual Server Concept
• A Virtual Machine Monitor (VMM) layer sits between the guest OS and the hardware

2.2 The Virtual Server Concept
• Virtual servers encapsulate the service software away from the hardware
  – A virtual server can be hosted on one or more hardware machines
  – One hardware machine can host more than one virtual server
[Figure labels: Server 1 (guest OS), Server 2 (guest OS), Clustering, Service Console, VMM (Virtual Machine Monitor; intercepts hardware requests), x86 architecture]

2.2 Virtualized vs. Traditional
[Figure: traditional stack (apps on a single operating system on the hardware) vs. virtualized stack (apps, each on its own OS, on a hypervisor on the hardware)]

2.2 The Virtual Server Concept
• A flexible way to host computing systems
  – Virtual servers are just files in storage
    » They can be migrated easily from one machine to another
  – Hardware machines can be added and removed conveniently
  – Administrators can adjust the amount of resources allocated to each virtual server
    » To accommodate the service requirements

2.3 Virtualization Technique Support
• Software implementations
  – Commercial software
    » e.g., VMware
  – Open-source software
    » e.g., Xen, VirtualBox
• Hardware support
  – Intel VT (Intel Virtualization Technology)
  – AMD-V (AMD Virtualization)
• Virtualization is a well-established technique

3 Large-scale Computing in Cloud
• Large-scale computing for data mining problems on commodity hardware
• Challenges:
  – How do you distribute computation?
  – How can we make it easy to write distributed programs?
  – How can you handle machine failures?
    » One server may stay up for 3 years (about 1,000 days)
    » If you have 1,000 servers, expect to lose one per day
    » It is estimated that Google had 2 million machines in 2013, so about 2,000 machines fail every day!
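The failure estimate above is simple arithmetic, sketched below under the assumption that failures are independent and spread evenly over time. The function name is illustrative only.

```python
def expected_failures_per_day(num_servers, mean_uptime_days=1000):
    """Expected daily failures if each server fails about once per mean_uptime_days.

    Assumes independent failures spread evenly over time (a simplification).
    """
    return num_servers / mean_uptime_days


for n in (1, 1_000, 2_000_000):
    print(f"{n} servers: {expected_failures_per_day(n)} failures/day")
```

With the slide's figures, 1,000 servers give one expected failure per day and 2 million servers give about 2,000.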

3.1 Idea and Solution
• Issue: copying data over a network takes time
• Idea:
  – Bring computation close to the data
  – Store files multiple times for reliability
• MapReduce addresses these problems
  – Google's computational/data manipulation model
  – Elegant way to work with big data
  – Storage infrastructure: file system
    » Google: GFS; Hadoop: HDFS
  – Programming model
    » MapReduce

Relationship
[Figure: relationship among Hadoop projects: HBase, Hive, Mahout, Pig, MapReduce, and ZooKeeper, together with Hadoop HDFS, on top of the hardware and OS]

3.3 The Hadoop Distributed File System (HDFS)
• With hundreds of machines at hand, failure is the norm rather than the exception
• Traditional file storage systems cannot cope with the scale and failures faced by large clusters
• The Hadoop Distributed File System (HDFS) is a natural solution to this problem
  – Distributed file system
  – Provides a global file namespace
  – Replication to ensure data recovery

3.3 The Hadoop Distributed File System (HDFS)
• An HDFS instance may consist of thousands of server machines, each storing part of the file system's data.
• Since there is a huge number of components, and each component has a non-trivial probability of failure, some component is always non-functional.
• Detection of faults and quick, automatic recovery from them is a core architectural goal of HDFS.

3.3 Data Characteristics
• Streaming data access
• Batch processing rather than interactive user access
• Write-once-read-many: a file, once created, written, and closed, need not be changed
  – This assumption simplifies coherency

3.4 HDFS Architecture
• Master/slave architecture
  – Master: NameNode
  – Slave: DataNode
• HDFS exposes a file system namespace (managed by the NameNode) and allows user data to be stored in files.
• A file is split into one or more blocks, and these blocks are stored in a set of DataNodes.

HDFS Architecture
[Figure: a client sends metadata ops to the NameNode, which holds the metadata (name, replicas, ...), e.g., (/home/foo/data, 6, ...); clients read and write blocks on the DataNodes via block ops, and blocks are replicated across DataNodes in Rack 1 and Rack 2]

3.5 File System Namespace
• The NameNode maintains the file system.
  – Hierarchical file system with directories and files.
  – Create, remove, rename, etc.
  – Any change to the file system metadata is recorded by the NameNode.
• An application can specify the number of replicas of a file that are needed: the replication factor of the file. This information is stored in the NameNode.

3.5 Data Replication
• HDFS is designed to store very large files across machines in a large cluster.
  – Each file is a sequence of blocks.
  – All blocks in the file except the last are of the same size.
• Blocks are replicated for fault tolerance.
• Block size and the number of replicas are configurable per file.
• The NameNode receives a Heartbeat and a BlockReport from each DataNode in the cluster.
• A BlockReport lists all the blocks on a DataNode.
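The block layout described above can be sketched as follows. The block-splitting rule follows the slide (all blocks but the last have the same size). The round-robin replica placement is only a stand-in chosen for illustration, and the function names are hypothetical, not part of any HDFS API.

```python
def split_into_blocks(file_size, block_size):
    """Return the block sizes of a file; only the last block may be smaller."""
    if file_size == 0:
        return []
    num_blocks = -(-file_size // block_size)  # ceiling division on integers
    sizes = [block_size] * (num_blocks - 1)
    sizes.append(file_size - block_size * (num_blocks - 1))
    return sizes


def place_replicas(num_blocks, datanodes, replication):
    """Assign every block to `replication` distinct DataNodes.

    Round-robin placement is an assumption made for this sketch only.
    """
    if replication > len(datanodes):
        raise ValueError("replication factor exceeds the number of DataNodes")
    return {
        block: [datanodes[(block + i) % len(datanodes)] for i in range(replication)]
        for block in range(num_blocks)
    }


sizes = split_into_blocks(file_size=300, block_size=128)
placement = place_replicas(len(sizes), ["dn1", "dn2", "dn3", "dn4"], replication=3)
print(sizes, placement)
```

Because the replication factor is a per-file setting, `place_replicas` takes it as a parameter rather than as a global constant.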

3.6 Replica Selection
• Replica selection for read operations: HDFS tries to minimize bandwidth consumption and latency.
• If there is a replica on the reader node, then that replica is preferred.
• An HDFS cluster may span multiple data centers: a replica in the local data center is preferred over a remote one.
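The preference order above can be sketched as a simple distance ranking. The `Location` type and the distance values 0, 1, and 2 are assumptions made for this illustration and are not taken from the HDFS code base.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    datacenter: str
    node: str


def distance(reader, replica):
    """0 = same node, 1 = same data center, 2 = remote data center (assumed scale)."""
    if reader == replica:
        return 0
    if reader.datacenter == replica.datacenter:
        return 1
    return 2


def select_replica(reader, replicas):
    """Pick the replica closest to the reader; ties keep the first one listed."""
    return min(replicas, key=lambda replica: distance(reader, replica))


reader = Location("dc1", "node7")
replicas = [Location("dc2", "node1"), Location("dc1", "node3")]
print(select_replica(reader, replicas))
```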

3.6 Safemode Startup
• Each DataNode checks in with a Heartbeat and a BlockReport.
• The NameNode verifies that each block has an acceptable number of replicas.
• After a configurable percentage of safely replicated blocks has checked in with the NameNode, the NameNode exits Safemode.
• It then makes the list of blocks that need to be replicated.
• The NameNode then proceeds to replicate these blocks to other DataNodes.
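The Safemode exit rule can be sketched as a threshold check over the replica counts gathered from BlockReports. The data layout (a dictionary from block ID to reported replica count), the function names, and the threshold values are assumptions made for illustration.

```python
def safely_replicated_fraction(reported, min_replicas):
    """Fraction of known blocks with at least `min_replicas` reported replicas."""
    if not reported:
        return 1.0
    safe = sum(1 for count in reported.values() if count >= min_replicas)
    return safe / len(reported)


def can_exit_safemode(reported, min_replicas, threshold):
    """True once the configurable fraction of blocks is safely replicated."""
    return safely_replicated_fraction(reported, min_replicas) >= threshold


def blocks_needing_replication(reported, target_replication):
    """Blocks that still have fewer replicas than the target, in sorted order."""
    return sorted(b for b, count in reported.items() if count < target_replication)


# Hypothetical replica counts gathered from DataNode BlockReports.
reported = {"b1": 3, "b2": 3, "b3": 1, "b4": 0}
print(can_exit_safemode(reported, min_replicas=1, threshold=0.7))
print(blocks_needing_replication(reported, target_replication=3))
```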

3.7 Filesystem Metadata
• The HDFS namespace is stored by the NameNode.
• The NameNode uses a transaction log called the EditLog to record every change that occurs to the filesystem metadata.
  – For example, creating a new file
  – Changing the replication factor of a file
  – The EditLog is stored in the NameNode's local filesystem

3.7 NameNode
• Keeps an image of the entire file system namespace.
• When the NameNode starts up
  – It gets the FsImage and the EditLog.
  – It updates the FsImage with the EditLog information.
  – It stores a copy of the FsImage as a checkpoint.
• In case of a crash
  – The last checkpoint is recovered.
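The startup sequence above amounts to replaying a log on top of a snapshot. The sketch below models the FsImage as a dictionary and the EditLog as a list of tuples. Both representations and the three operation names are assumptions made for illustration, not the real on-disk formats.

```python
def apply_edit(namespace, edit):
    """Apply one logged change to the in-memory namespace."""
    op = edit[0]
    if op == "create":
        _, path, replication = edit
        namespace[path] = {"replication": replication}
    elif op == "set_replication":
        _, path, replication = edit
        namespace[path]["replication"] = replication
    elif op == "delete":
        _, path = edit
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown edit operation: {op}")


def checkpoint(fsimage, editlog):
    """Replay the EditLog on a copy of the FsImage and return the new image."""
    namespace = {path: dict(meta) for path, meta in fsimage.items()}
    for edit in editlog:
        apply_edit(namespace, edit)
    return namespace


fsimage = {"/home/foo/data": {"replication": 3}}
editlog = [
    ("create", "/home/foo/log", 2),
    ("set_replication", "/home/foo/data", 6),
    ("delete", "/home/foo/log"),
]
new_image = checkpoint(fsimage, editlog)
print(new_image)
```

Copying the image before replaying keeps the previous checkpoint intact, which is what makes recovery to the last checkpoint possible after a crash.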

3.7 DataNode
• A DataNode stores data in files in its local file system.
  – Each block of HDFS is a separate file.
  – These files are placed in different directories.
  – The creation of new directories is determined by heuristics.
• When the file system starts up:
  – The DataNode generates a BlockReport.
  – It sends this report to the NameNode.

HDFS Summary
• Reliable distributed file system
• Data kept in "chunks" spread across machines
• Each chunk is replicated on different machines and racks
  – Seamless recovery from disk or machine failure
[Figure: chunks (C0, C1, C2, ..., D0, D1, ...) replicated across several machines]

4 MapReduce
• Warm-up task
  – We have a huge text document
  – Count the number of times each distinct word appears in the file
• Sample application
  – Analyze web server logs to find popular URLs

4.1 Task: Word Count
• Using the Unix tool chain, we can count the occurrences of words:
  – words(doc.txt) | sort | uniq -c
    » where words takes a file and outputs the words in it, one per line
• This way of counting captures the essence of MapReduce
  – Mapper (done by words)
  – Group by keys and sort (done by sort)
  – Reducer (done by uniq)
  – Hadoop handles the partitioning and parallelization

4.1 MapReduce: Overview
• Sequentially read a lot of data
• Map:
  – Extract something you care about
• Group by key: sort and shuffle
• Reduce:
  – Aggregate, summarize, filter, or transform
• Write the result

4.1 MapReduce
• Input: a set of key-value pairs
• The programmer must specify two methods:
  – Map(k, v) -> <k', v'>
    » Takes a key-value pair and outputs a set of key-value pairs
    » There is one Map call for every (k, v) pair
  – Reduce(k', <v'>) -> <k', v''>
    » All values v' with the same key k' are reduced together and processed in v' order
    » There is one Reduce function call per unique key k'
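The two-method contract above can be sketched as a tiny single-machine driver. This is only an in-memory illustration of the programming model, not how Hadoop or Google's system is implemented. The name `run_mapreduce`, the log-line format, and the URL-counting example (borrowed from the sample application mentioned earlier) are assumptions.

```python
from collections import defaultdict


def run_mapreduce(inputs, map_fn, reduce_fn):
    """Run map, group-by-key, and reduce sequentially in one process."""
    groups = defaultdict(list)
    for key, value in inputs:                 # one Map call per (k, v) pair
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    output = []
    for out_key in sorted(groups):            # one Reduce call per unique key k'
        output.extend(reduce_fn(out_key, sorted(groups[out_key])))
    return output


def map_log_line(line_number, line):
    """Assumed log format: '<client> <url>'. Emit (url, 1) for each request."""
    _client, url = line.split()
    yield url, 1


def reduce_hits(url, counts):
    """Sum the hit counts collected for one URL."""
    yield url, sum(counts)


logs = [
    (0, "10.0.0.1 /index.html"),
    (1, "10.0.0.2 /about.html"),
    (2, "10.0.0.3 /index.html"),
]
print(run_mapreduce(logs, map_log_line, reduce_hits))
```

Both user functions are generators, so a single call can emit zero, one, or many key-value pairs, matching the "outputs a set of key-value pairs" wording.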

4.1 MapReduce: Word Count Example
• Now the single document becomes a large corpus of documents

4.2 MapReduce: Word Count Example

// key: document name; value: text of the document
Map(key, value):
    for each word w in value:
        emit(w, 1)

// key: a word; values: an iterator over counts
Reduce(key, values):
    result = 0
    for each count v in values:
        result += v
    emit(key, result)

In-class Practice
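A runnable counterpart to the word count pseudocode might look like the sketch below. It runs in a single process and uses sort plus `itertools.groupby` for the group-by-key step, mirroring `sort | uniq -c`. The function names and the two-document corpus are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def map_words(doc_name, text):
    """Emit (word, 1) for every word in the document."""
    for word in text.split():
        yield word, 1


def reduce_counts(word, counts):
    """Add up all counts emitted for one word."""
    result = 0
    for count in counts:
        result += count
    yield word, result


def word_count(corpus):
    """corpus maps document names to their text; returns {word: total count}."""
    pairs = [pair for name, text in corpus.items() for pair in map_words(name, text)]
    pairs.sort(key=itemgetter(0))             # group by key: sort first
    totals = {}
    for word, group in groupby(pairs, key=itemgetter(0)):
        for key, total in reduce_counts(word, (count for _, count in group)):
            totals[key] = total
    return totals


corpus = {"a.txt": "the cat sat", "b.txt": "the dog sat down"}
print(word_count(corpus))
```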

4.3 When To Use MapReduce?
• You have a cluster of computers
• You are working with a large dataset
• You are working with independent data (or data assumed to be independent)
• The data processing can be cast into Map and Reduce operations

4.3 MapReduce Implementations
• MapReduce by Google Inc.
  – Central to Google's search engine
    » Generates Google's index of the World Wide Web
• Open-source implementations
  – Hadoop
    » Yahoo! Search Webmap is a Hadoop application that produces data used in every Yahoo! Web search query
  – And many others
    » Disco, Skynet, …

4.4 MapReduce: Environment
• The MapReduce environment takes care of:
  – Partitioning the input data
  – Scheduling the program's execution across a set of machines
  – Performing the group-by-key step
  – Handling machine failures
  – Managing the required inter-machine communication

4.4 MapReduce
• Map: reads the input and produces a set of key-value pairs
• Group by key: collects all pairs with the same key
• Reduce: collects all values belonging to the key and outputs the result

4.4 MapReduce
• Move computation to the data
  – Bring computation directly to the data!
  – DataNodes also serve as compute servers
[Figure: chunks (C0, C1, C2, ..., D0, D1, ...) spread across machines, each running a DataNode and a TaskTracker]

4.5 Data Flow
• Input and final output are stored on a distributed file system (FS):
  – The scheduler tries to schedule map tasks "close" to the physical storage location of the input data
• Intermediate results are stored on the local FS of the Map and Reduce workers
• Output is often the input to another MapReduce task

4.5 Coordination: Master
• The master node takes care of coordination:
  – Task status: idle, in-progress, or completed
  – Idle tasks get scheduled as workers become available
  – When a map task completes, it sends the master the locations and sizes of its R intermediate files, one for each reducer
  – The master pushes this information to the reducers
• The master pings workers periodically to detect failures

4.6 Dealing with Failures
• Map worker failure
  – Map tasks completed or in progress at the worker are reset to idle
  – Reduce workers are notified when a task is rescheduled on another worker
• Reduce worker failure
  – Only in-progress tasks are reset to idle
  – The reduce task is restarted
• Master failure
  – The MapReduce task is aborted and the client is notified
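The reset rules for worker failures can be sketched as a small state update over the master's task table. The `Task` record and the function name are hypothetical. Completed map tasks are reset as well because, as noted under Data Flow, intermediate results live on the failed worker's local file system.

```python
from dataclasses import dataclass
from typing import Optional

IDLE, IN_PROGRESS, COMPLETED = "idle", "in-progress", "completed"


@dataclass
class Task:
    kind: str                      # "map" or "reduce"
    state: str = IDLE
    worker: Optional[str] = None


def handle_worker_failure(tasks, failed_worker):
    """Reset the affected tasks to idle and return them for rescheduling."""
    reset = []
    for task in tasks:
        if task.worker != failed_worker:
            continue
        # Completed map output sits on the failed worker's local disk, so it is lost.
        lost_map_output = task.kind == "map" and task.state == COMPLETED
        if task.state == IN_PROGRESS or lost_map_output:
            task.state = IDLE
            task.worker = None
            reset.append(task)
    return reset


tasks = [
    Task("map", COMPLETED, "w1"),
    Task("map", IN_PROGRESS, "w1"),
    Task("reduce", COMPLETED, "w1"),
    Task("reduce", IN_PROGRESS, "w1"),
    Task("map", COMPLETED, "w2"),
]
rescheduled = handle_worker_failure(tasks, "w1")
print([(t.kind, t.state) for t in tasks])
```

A completed reduce task keeps its state because, per the Data Flow slide, final output is stored on the distributed file system rather than on the worker.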

4.6 How Many Map and Reduce Jobs?
• M map tasks, R reduce tasks
• Rule of thumb:
  – Make M much larger than the number of nodes in the cluster
  – One chunk per map task is common
  – This improves dynamic load balancing and speeds up recovery from worker failures
• Usually R is smaller than M
  – Output is spread across R files
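One way the output ends up spread across R files is to assign every intermediate key to one of R reducers. The slides do not say which partitioning function is used, so the sketch below assumes a hash-mod-R partitioner and uses CRC32 so that the assignment is stable across Python runs. The function names are illustrative.

```python
import zlib


def partition(key, num_reducers):
    """Map a string key to a reducer index in [0, num_reducers)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def spread(pairs, num_reducers):
    """Split intermediate (key, value) pairs into one bucket per reducer."""
    buckets = [[] for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)].append((key, value))
    return buckets


pairs = [("the", 1), ("cat", 1), ("the", 1), ("sat", 1), ("dog", 1), ("the", 1)]
buckets = spread(pairs, num_reducers=3)
print(buckets)
```

Every pair with the same key lands in the same bucket, which is what lets a single Reduce call see all the values for its key.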

5 Summary
• Cloud computing can provide elastic Internet-based services
  – Hosts computing systems in an on-demand, scalable manner
  – To reduce cost
• Cloud computing systems are typically implemented as service-based systems: IaaS, PaaS, SaaS
• Virtualization technology is key to implementing cloud computing systems
• The Hadoop Distributed File System handles failures systematically
• MapReduce is an emerging computing paradigm that can be implemented with cloud computing systems