ac79de69b04e02cfa600650079cfe6da.ppt
- Number of slides: 39
p2pWeb
The p2pweb Project
Low-cost peer-to-peer solutions for high-availability web hosting
19 May 2005
Seminar « Peer-To-Peer : Concept, Tools and Applications », Ecole d’ingénieurs de Genève

Agenda
1. The project goals
2. Web hosting solutions and architecture
3. The p2pweb solution
4. Project constraints and key technologies
5. Related projects
6. The project components
   1. Global server load balancing system
   2. Distributed set of web servers
   3. Monitoring system
   4. Node architecture and hardware
7. Conclusion

The project goals
To explore and implement low-cost solutions for high-availability web hosting: “Do More with Less”
Our targets are:
• small or medium structures (associations, NGOs, etc.)
• with limited resources (money, IT people)
• with important web hosting needs (bandwidth available)
  – rich and complex web site
  – medium to high web traffic
  – high availability and visibility needs
This may fit the needs of many projects in Least Developed Countries very well: TeleCentres networks, rural organisations, universities, cultural centres, public libraries, community multimedia centres, health networks, etc.

Examples of hosted web sites
Afromix.org (personal web site)
A portal of African and Caribbean cultures since 1993. A complex web site using multiple technologies:
• in-house Perl Content Management System (CMS)
• an extended discographic database (1,600 artists, more than 50 styles from all of Africa and the French West Indies)
• multilingual (French, English, Spanish) site running on a Java application server (Tomcat)
• about 25,000 files, 400,000 pages/month, 2 million hits/month, 60,000 unique visitors/month
Mediaport.net (community web site)
One of the first French web pioneers, first developed at INA:
• mostly static content (nearly 10,000 files)
• multilingual (French, English) site running on a PHP CMS (ezpublish)
• it is the main p2pweb test platform and it will evolve into an open web hosting solution for artistic and cultural web projects (an editorial committee is being formed)

The web hosting market
• Free web hosting
  – Very limited
    • static HTML or small PHP site (limited computing resources)
    • can’t use your own domain name
• Professional web hosting
  – A broad range of services
    • private virtual server
    • dedicated server
    • colocation
  – But the price is quite high
    • 100-200 €/month for one dedicated server
    • and maintenance can be complex

Centralized architecture
Server in one location: the server and its Internet link are both single points of failure (SPOF)

Centralized architecture (cont.)
High availability architecture
– Datacenter hosting
– BGP routing
– Hardware load balancing
– SAN storage
Diagram labels: Multi-homing with BGP routing; Load balancers; Reverse proxy / cache / SSL accelerators; Load balancers; Web servers; Application servers; Database cluster; SAN storage
In theory, no SPOF
• but very complex architecture
• very high cost

CDN architecture
Content Delivery Network
• Service delivered by companies like Akamai, Speedera, and others.
• Edge servers provide caching and data replication for fast delivery to clients worldwide.
• A solution for very high traffic web sites.
• A very expensive solution.

Alternative web hosting
• Community-based web hosting
  – Initiatives from various associations: ouvaton.coop, globenet.net, autre.net, altern.net, ...
  – Most of the time, people pool their money and knowledge to buy and administer one or two dedicated servers.
• Home server
  – We now have sufficient bandwidth (ADSL), computing power (PCs) and good software (Apache, Linux, ...)
  – We lack reliability!

First idea: a big home server

Second idea (a better one)
Lots of people (family, friends, co-workers, ...) already have:
• ADSL Internet access or another permanent high-speed connection
• One or more PCs (with a lot of unused disk space)
So, what about sharing those resources to build a more powerful and resilient network of web servers?

Hosting: the p2pweb way
Diagram labels: ADSL ISP 1, ADSL ISP 2, ADSL ISP 3
Each member of the p2pweb network shares a portion of their Internet bandwidth (most of the time an ADSL line) and hosts a small server.
The result is a powerful network whose capacity is the sum of the bandwidth and computing resources of all the members.

A peer-to-peer solution
• In a way, it is a return to the fundamental principles of the Internet:
  – a cooperative solution (network of servers)
  – a distributed solution (no central control)
  – a fault-tolerant solution (resilience)
• But with all the power of existing Internet and open source technologies:
  – consumer computers and Internet access
  – overlay network and services over the Internet
  – It is a peer-to-peer solution!

The project constraints
• Unreliable components
  – Node failure is not the exception, it is the rule.
  – Internet link failure, power outage, server crash, ...
• Automatic operation
  – Murphy’s law: servers will always crash when there is nobody to fix the problem (at night, when you are on vacation, ...)
• Pragmatic approach
  – Build from existing components
  – Simple and efficient solutions are preferred

Key technologies
Mass-market products are now available at low cost!
• ADSL lines
  – 1 Mb/s up, 15 Mb/s down for 30 €/month (free.fr)
• ADSL router / firewall / Ethernet or Wifi
  – D-LINK, NetGear, LINKSYS from 75 to 150 €
• Small servers
  – PC barebones (Asus, Biostar, Shuttle, ...): from 300 to 500 €
  – Mac mini (Apple): 499 €
• Open source software
  – BSD, Linux, Apache, Tomcat, etc.

Related projects
YouServ (IBM)
• http://www.almaden.ibm.com/cs/people/bayardo/userv/
• YouServ is software that forms a webserving "grid" by allowing its users to pool their desktop computing resources to create one large, virtual webspace.
• An intranet project, more oriented towards desktop file sharing. Unfortunately not open source.
Vergenet (Simon Horman)
• http://www.vergenet.net/
• Vergenet has servers located in Sydney, Amsterdam, London, Tokyo and Indiana. These servers all run Linux and a variant of Super Sparrow to load balance traffic between them.
• Super Sparrow enables users to load balance traffic between geographically separated points of presence by finding the site network-wise closest to clients. This is done by accessing BGP routing information (but it requires direct access to a BGP router).

Related projects (cont.)
Coral (New York University)
• http://www.coralcdn.org/
• Coral is a peer-to-peer content distribution network, made up of a world-wide network of web proxies and name servers.
• Publishing through Coral is as simple as appending a short string to the hostname of objects' URLs; a peer-to-peer DNS layer transparently redirects browsers to participating caching proxies. A URL like www.myserver.com/some/path.html becomes www.myserver.com.nyud.net:8090/some/path.html
• Coral runs on top of the PlanetLab network (a grid computing research network: http://www.planet-lab.org/)
Globule (Vrije Universiteit Amsterdam)
• http://www.globule.org/
• Globule is a module for the Apache web server that allows a given server to replicate its documents to other Globule servers. Clients are automatically redirected to one of the available replicas.
• The project provides both content replication and HTTP- or DNS-based redirection mechanisms.

P2PWeb - Project components
• A global server load balancing system
  – Two main functions
    • Load balance the traffic across the web servers
    • Provide failover = only send traffic to live web servers
• A distributed set of web servers
  – And a set of tools to:
    • Publish content on the servers
    • Keep all servers in sync (replication mechanism)
• Monitoring services

Global server load balancing
• Load balancing
  – achieved using Round Robin DNS
    • a simple system, with well-known limits (http://www.tenereillo.com/GSLBPageOfShame.htm)
• Failover
  – achieved by coupling a monitoring system (NAGIOS) with the DNS
    • DNS entries have a short TTL (time to live)
    • NAGIOS monitors each web server
    • When a server changes state (for example to DOWN), a special handler is called that updates the DNS entry and reloads the DNS
    • The failed server is no longer announced by the DNS
To have a fully redundant system, we use 3 independent DNS servers (all primary), each running its own NAGIOS instance.

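The handler step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the project's actual handler: the function names `should_update_dns` and `update_zone` are hypothetical, the zone is treated as plain text with one A record per line, and the steps a real handler would also need (increasing the zone serial, reloading the name server) are left out.

```python
def should_update_dns(state: str, state_type: str) -> bool:
    """React only to confirmed (HARD) state changes, not to SOFT retries."""
    return state_type == "HARD" and state in ("CRITICAL", "OK")


def update_zone(zone_text: str, ip: str, state: str) -> str:
    """Comment out the A record for `ip` when it is down, restore it when it is up.

    Assumption: each record is on its own line and ends with "A <ip>".
    """
    out = []
    for line in zone_text.splitlines():
        # Strip an existing comment marker so the operation is idempotent.
        bare = line.lstrip("; ").rstrip()
        fields = bare.split()
        if len(fields) >= 2 and fields[-2:] == ["A", ip]:
            line = "; " + bare if state == "CRITICAL" else bare
        out.append(line)
    return "\n".join(out)
```

Commenting the record out instead of deleting it keeps the zone file readable and makes recovery a simple reversal when the server comes back.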
GSLB: failover illustrated
Initial DNS entries: all servers are up
    www 300 IN A 82.66.103.28
    www 300 IN A 195.101.152.113
    www 300 IN A 82.232.203.167
    www 300 IN A 66.35.250.210
Server 195.101.152.113 fails. In the syslog trace, we can see:
    22:46 nagios: SERVICE ALERT: ns1;HTTP-P2PWEB;CRITICAL;SOFT;1;Connection refused by host
    22:23:47 nagios: SERVICE ALERT: ns1;HTTP-P2PWEB;CRITICAL;SOFT;2;Connection refused by host
    22:24:46 nagios: SERVICE ALERT: ns1;HTTP-P2PWEB;CRITICAL;HARD;3;Connection refused by host
After 3 unsuccessful tries, a notification is sent by email to the admin:
    22:24:46 nagios: SERVICE NOTIFICATION: nagios;ns1;HTTP-P2PWEB;CRITICAL;notify-by-email;Connection refused by host
The specific handler is called:
    22:24:47 nagios: SERVICE EVENT HANDLER: ns1;HTTP-P2PWEB;CRITICAL;HARD;3;http_p2pweb_handler
And the DNS is reloaded:
    22:24:47 named[17379]: master/p2pweb.net.zone:1: no TTL specified; using SOA MINTTL instead
And now we can verify that the DNS entries are:
    www 300 IN A 82.66.103.28
    ; www 300 IN A 195.101.152.113
    www 300 IN A 82.232.203.167
    www 300 IN A 66.35.250.210
Failover time is: 2 or 3 minutes (NAGIOS detection) + DNS max TTL (here 5 minutes) = less than 10 minutes

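The failover bound above is simple arithmetic, shown here as a small worked example. The 60-second check interval is an assumption inferred from the log timestamps, which are about one minute apart; the 3 attempts and the 300-second TTL come from the slide. The function name is hypothetical.

```python
def worst_case_failover_seconds(check_interval_s: int,
                                max_attempts: int,
                                dns_ttl_s: int) -> int:
    """Upper bound on the time clients may still be sent to a dead server.

    Detection takes up to `max_attempts` checks; after the zone is updated,
    resolvers may keep the old answer cached for up to one TTL.
    """
    detection = check_interval_s * max_attempts
    return detection + dns_ttl_s


minutes = worst_case_failover_seconds(60, 3, 300) / 60
print(minutes)  # 8.0, which is consistent with "less than 10 minutes"
```

The TTL dominates the total, so lowering it shortens failover at the cost of more DNS queries.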
GSLB: next steps
Improvements:
– Better service provisioning (a manual process for now)
– Better support for “long downtime”
  • When a server is down for a long period of time and then recovers, its content may be outdated
  • We must not announce it again until it has re-synchronized itself
– Proximity load balancing
  • The goal is to load balance traffic between geographically distributed servers by finding the site network-wise closest to clients.
  • A technology used in the CDN (Content Delivery Network) world
We can use part of the Globule project, as Globule supports DNS redirection based on an 'AS-path length' policy (used in BGP routing), which tries to redirect clients to a server close to them. This BGP information can be collected through routeviews.org (no direct access to a BGP router needed).

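A minimal sketch of the 'AS-path length' selection idea follows. It is not Globule's implementation: the function `pick_server`, the shape of the `path_len` table and the AS numbers (taken from the private-use range) are all hypothetical, and the table is assumed to have been built beforehand from collected BGP data.

```python
from typing import Dict, Iterable, Optional


def pick_server(client_asn: int,
                path_len: Dict[str, Dict[int, int]],
                alive: Iterable[str]) -> Optional[str]:
    """Return the live server with the shortest AS path to the client's AS.

    path_len maps server IP -> {client AS number -> AS-path length}.
    Falls back to any live server when no path data is known.
    """
    alive = sorted(alive)
    candidates = [ip for ip in alive if client_asn in path_len.get(ip, {})]
    if not candidates:
        return alive[0] if alive else None
    # Ties are broken by IP string so the result is deterministic.
    return min(candidates, key=lambda ip: (path_len[ip][client_asn], ip))


# Hypothetical data: two live servers, one client AS.
table = {
    "82.66.103.28": {64512: 4},
    "66.35.250.210": {64512: 2},
}
print(pick_server(64512, table, ["82.66.103.28", "66.35.250.210"]))
```

Only servers currently marked alive are considered, so proximity selection stays compatible with the failover mechanism.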
Web server content management
Diagram labels: ADSL ISP 1, ADSL ISP 2, ADSL ISP 3
We have a set of web servers and we need tools to:
– Publish content on all servers
– Keep them in sync (content replication)
Two main replication strategies:
• primary backup: one master server from which the replicas are built
• active replication: when a change is made on any replica, that replica propagates it to all the others

Static content replication
Diagram labels: one Master and three Replicas, behind ADSL ISP 1, ADSL ISP 2 and ADSL ISP 3
One server plays the master’s role:
– Content is published first on the master (for example via FTP)
– Then the content is either pushed to or pulled by the replicas
The easiest way is to use rsync (rsync.samba.org):
• Content can be pulled via anonymous rsync from the master
• Content can be pushed via rsync over ssh (using a private/public key pair for security)

Content replication: rsync
rsync is a file transfer program for Unix systems. It provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one of the ends of the link beforehand.
Anonymous rsync server (pull mode)
• Runs as a standalone daemon or can be launched by inetd
• Advanced security options (read-only, chroot, IP access list)
• Use: run from crontab on each mirror
    rsync -a master.mydomain.com::www/ /data/www/
Rsync over SSH (push mode)
• Needs ssh access on each mirror
• And an exchange of ssh cryptographic keys for unattended operation
• Use: run on demand or from crontab on the master
    rsync -a /data/www/ user@mirror.mydomain.com:/data/www/
Useful options
    --compress        compress file data during the transfer
    --bwlimit=KBPS    limit I/O bandwidth; KBytes per second

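For unattended operation, the two modes can be wrapped in a small script. The sketch below only builds the command lines; the helper names are hypothetical, and the host names are the placeholder ones from the slide. It relies on standard rsync conventions: a double colon after the host addresses a module on an rsync daemon, a single colon goes through a remote shell, and `-e ssh` selects ssh explicitly.

```python
from typing import List, Optional


def _common_options(compress: bool, bwlimit_kbps: Optional[int]) -> List[str]:
    opts = ["-a"]  # archive mode: recurse and preserve attributes
    if compress:
        opts.append("--compress")
    if bwlimit_kbps is not None:
        opts.append(f"--bwlimit={bwlimit_kbps}")
    return opts


def rsync_pull_cmd(master: str, module: str, dest: str,
                   compress: bool = False,
                   bwlimit_kbps: Optional[int] = None) -> List[str]:
    """Command run on a mirror: pull from the master's anonymous rsync daemon."""
    return (["rsync"] + _common_options(compress, bwlimit_kbps)
            + [f"{master}::{module}/", dest])


def rsync_push_cmd(src: str, user: str, mirror: str, dest: str,
                   compress: bool = False,
                   bwlimit_kbps: Optional[int] = None) -> List[str]:
    """Command run on the master: push to one mirror over ssh."""
    return (["rsync"] + _common_options(compress, bwlimit_kbps)
            + ["-e", "ssh", src, f"{user}@{mirror}:{dest}"])


print(" ".join(rsync_pull_cmd("master.mydomain.com", "www", "/data/www/")))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from a cron-driven script; on an ADSL uplink, setting `bwlimit_kbps` keeps replication from starving the web traffic.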
Content distribution: satellite
For a large number of geographically distributed mirrors, datacasting over satellite can be an interesting solution.
• Technology used by some CDN vendors
  – Skycache, cidera, Skystream.com, panamsat.com
• Now available at lower cost from worldspace.fr (SatPost solution)

Use of CMS
Nowadays most webmasters use CMS (Content Management System) tools for publishing
– A lot of open source and commercial tools
  • Spip, mambo, typo3, phpnuke, ... (PHP)
  • Bricolage, metadot, slashcode, ... (Perl)
  • Cofax, opencms, magnolia, jahia, ... (Java)
  • Plone, cps, zwook, ... (Python)
• But none of them has direct support for a distributed architecture
• Most use a database as a back-end store
• Distributed database transactions and replication are a hard problem

CMS: a pragmatic solution
Diagram labels: webmaster; CMS back office; html export; Master: static html files; three Replicas behind ADSL ISP 1, ADSL ISP 2 and ADSL ISP 3
The webmaster publishes using the CMS as usual
– The content is exported as static html files
– Then distributed to the replicas using rsync
Constraint: the CMS must support export with “static-like URLs”, either directly or through URL rewriting
    /article/sport/2005/4/13/football.html (good)
    /article.php?id_category=3&id_article=25 (bad for mirroring)

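The “static-like URL” constraint can be checked mechanically before an export is mirrored. The sketch below is one possible reading of the rule, with hypothetical helper names: a URL is considered mirror-friendly when it has no query string and ends in `.html`, and the path layout mirrors the "good" example above.

```python
from urllib.parse import urlsplit


def is_mirror_friendly(url: str) -> bool:
    """True when the URL maps directly onto a static file path."""
    parts = urlsplit(url)
    return parts.query == "" and parts.path.endswith(".html")


def static_path(category: str, year: int, month: int, day: int, slug: str) -> str:
    """Build a static-like path from article metadata (layout is an assumption)."""
    return f"/article/{category}/{year}/{month}/{day}/{slug}.html"


print(is_mirror_friendly("/article/sport/2005/4/13/football.html"))
print(is_mirror_friendly("/article.php?id_category=3&id_article=25"))
```

A URL with a query string cannot be stored as a plain file on a replica, which is why the second form is bad for mirroring.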
CMS: distributed architecture (1)
Diagram labels: Senegal, Mali, Ivory Coast, Burkina Faso; ADSL ISP 1, ADSL ISP 2, ADSL ISP 3; XML content exchange
Example: a non-governmental organization is active in 4 countries and wants to provide a global web presence. The same global web design and tools are used on all servers.
• Local publishing: each local webmaster publishes news about their country using the CMS on the local server
• Content exchange using web services: each local web server “collects” (pulls) new articles from the other servers using RSS (Really Simple Syndication) web services
• Global web presence: global content is (re)constructed on each server (from the data of all the others) and served on the Internet
Such a solution may be constructed by hacking/customizing an existing CMS.

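The “collect and reconstruct” step can be sketched as a merge of the RSS feeds pulled from the other servers. This is an illustration only: fetching over HTTP is left out, the function names are hypothetical, and every item is assumed to carry a `guid` (or `link`) and a `pubDate` with an explicit time zone offset.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime
from typing import Dict, Iterable, List


def parse_items(rss_xml: str) -> List[dict]:
    """Extract guid, title and publication date from one RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            "guid": item.findtext("guid") or item.findtext("link"),
            "title": item.findtext("title", default=""),
            "date": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return items


def merge_feeds(feeds: Iterable[str]) -> List[dict]:
    """Merge several feeds, dropping duplicates by guid, newest first."""
    seen: Dict[str, dict] = {}
    for xml_text in feeds:
        for it in parse_items(xml_text):
            seen.setdefault(it["guid"], it)
    return sorted(seen.values(), key=lambda it: it["date"], reverse=True)
```

Deduplicating by `guid` matters because each server pulls from all the others, so the same article can arrive through several feeds.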
CMS: distributed architecture (2)
CMS + Message-oriented middleware (MOM)
A MOM is a client/server infrastructure that increases the interoperability, portability and flexibility of an application by allowing the application to be distributed over multiple heterogeneous platforms.
Through the use of a queue system, a MOM can provide asynchronous, reliable data exchange.
MOM is typically asynchronous and peer-to-peer and supports
– Point-to-point communication
– Publish and subscribe communication
There is a standardized interface in Java: the JMS (Java Message Service) API
Various open source implementations exist in the Java world:
• ActiveMQ (activemq.codehaus.org)
• OpenJMS (openjms.sourceforge.net)
• Joram (joram.objectweb.org)
• MantaRay (mantamq.org)
No CMS uses it now (as far as I know), but it may be a very good solution.

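To make the publish and subscribe model concrete, here is a toy in-memory broker. It is not JMS and not any of the products listed above; the class `TinyBroker` and its methods are hypothetical, and real middleware adds persistence, acknowledgements and network transport. The point is only that each subscriber has its own queue, so a node that is temporarily offline finds its messages waiting.

```python
from collections import defaultdict, deque
from typing import Any, Dict, Optional


class TinyBroker:
    """Toy publish/subscribe broker with one queue per subscriber."""

    def __init__(self) -> None:
        # topic -> {subscriber name -> queue of pending messages}
        self._queues: Dict[str, Dict[str, deque]] = defaultdict(dict)

    def subscribe(self, topic: str, subscriber: str) -> None:
        self._queues[topic].setdefault(subscriber, deque())

    def publish(self, topic: str, message: Any) -> None:
        # Every current subscriber gets its own copy of the message.
        for queue in self._queues[topic].values():
            queue.append(message)

    def receive(self, topic: str, subscriber: str) -> Optional[Any]:
        """Return the oldest pending message, or None when there is none."""
        queue = self._queues[topic].get(subscriber)
        return queue.popleft() if queue else None


broker = TinyBroker()
broker.subscribe("articles", "mali")
broker.subscribe("articles", "senegal")
broker.publish("articles", "new article from burkina")
print(broker.receive("articles", "mali"))
```

Here `senegal` can call `receive` much later and still get the article, which is the asynchronous behaviour that suits unreliable nodes.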
Performance monitoring
We collaborate with the webperf.org project
– WebPerf is a system for measuring the response time of specified URLs from multiple locations on the Internet.
– The project is founded on the premise that there are a lot of other companies who also require such a monitoring service. If the other companies are willing to monitor our URLs, we will monitor theirs (a free co-peering arrangement).
Some Perl scripts installed on the local node collect data from other web sites; the data are then pushed to a central repository for further analysis.
A web interface allows members to display various statistics: a view of one’s web site as seen from all over the world.

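The analysis step at the central repository can be pictured as a per-URL summary of the response times reported by the different locations. This sketch is not WebPerf's code (the project uses Perl scripts); the record layout `(url, location, seconds)` and the function name are assumptions made for illustration.

```python
from statistics import median
from typing import Dict, Iterable, List, Tuple


def summarize(samples: Iterable[Tuple[str, str, float]]) -> Dict[str, dict]:
    """Group response times by URL and compute simple statistics."""
    by_url: Dict[str, List[float]] = {}
    for url, _location, seconds in samples:
        by_url.setdefault(url, []).append(seconds)
    return {
        url: {"n": len(times), "median": median(times), "worst": max(times)}
        for url, times in by_url.items()
    }


# Hypothetical measurements of one URL from three locations.
report = summarize([
    ("http://www.p2pweb.net/", "geneva", 0.4),
    ("http://www.p2pweb.net/", "paris", 0.6),
    ("http://www.p2pweb.net/", "dakar", 2.0),
])
print(report["http://www.p2pweb.net/"])
```

The median is less sensitive than the mean to one slow location, while the worst value shows which region is poorly served.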
Webperf.org: sample graph (1)

Webperf.org: sample graph (2)

Webperf.org: sample graph (3)

Node architecture and security
Security
• Mandatory
  – Hardware router/firewall with NAT capabilities
  – Internal private network using RFC 1918 IP addresses (192.168.x.y)
• No incoming traffic from the outside other than what is required, controlled via redirects on the firewall
  – http (port 80)
  – ssh (port 22, optional)
Diagram labels: Internet; ADSL or cable modem; Ethernet link; Ethernet router/firewall; optional Wifi access point; p2pweb traffic; private Ethernet LAN; web server

Node hardware (example)
Runs on the corner of a desk
• An Ethernet and Wifi switch: connects other computers (not shown here)
• A web and application server: Mac mini (Apple) running Apache 2 and Tomcat
• A firewall: embedded PC (www.pcengines.ch) running pf (packet filter) on OpenBSD from a compact flash card
No noise, and low electric power consumption (about 50 W)

Conclusion
• It can be done (at low cost)
• It runs, with good results (service uptime measured by siteuptime.com)
  – www.p2pweb.net, hosted by the p2pweb network
      Monitored since: 9/23/2004
      Outages: 40
      Total uptime: 99.560%
      Downtime/year: 38.5 hours
  – www.afromix.org, hosted on a single node
      Monitored since: 9/23/2004
      Outages: 37
      Total uptime: 97.634%
      Downtime/year: 207.3 hours
• Still a lot of improvements to make
  – Not yet an easy-to-use solution: node administration still requires good Unix knowledge
• Most important: a new way to design web applications

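The downtime figures follow directly from the uptime percentages, as this short check shows. It assumes a 365-day year (8,760 hours); the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, assuming a non-leap year


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into expected hours of downtime per year."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


print(round(downtime_hours_per_year(99.560), 1))  # 38.5  (p2pweb network)
print(round(downtime_hours_per_year(97.634), 1))  # 207.3 (single node)
```

On these measurements, the distributed network had roughly one fifth of the single node's yearly downtime, even though it recorded slightly more individual outages.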
The future
What we can provide right now
• P2pweb.net: a global load balancing solution for any distributed web project
  – Just provide the servers’ IP addresses and a health check URL
• Mediaport.net: a community web hosting solution
  – We can host various web projects
We are looking for partnerships in the following domains:
• Packaging an easy, ready-to-use solution for deploying web mirrors (industrializing the solution)
  – a dedicated Linux or BSD distribution with preinstalled packages
  – an “all in one” solution: Java CMS + MOM in one webapp
• Helping to deploy such a solution in Least Developed Countries
The P2PWeb solution is a very good fit for Least Developed Countries with weak bandwidth and low connectivity.

Contacts
P2pweb is a SourceForge project (BSD license): www.p2pweb.net or mediaport.sourceforge.net
Contacts:
• about the project: fgaillard@w3architect.com
• if you want to be hosted on mediaport.net: fabrice.gaillard@mediaport.net, pierre.genillon@mediaport.net

Questions
Thank you
• Questions?