Jigsaw: W3C’s Java Web Server. CHEN Ge, CSIS, HKU. March 9, 2000.

What is Jigsaw? • Jigsaw is a Web server developed by W3C. The source code discussed here is version 2.0.4. • Jigsaw is written in pure Java, and its documentation claims “Jigsaw will run on any platform that supports Java, with no changes”.

Jigsaw’s Internal Design • Basic Concepts in Jigsaw – Resource – Frame – Filter – Indexer

Jigsaw’s Internal Design – A Resource is a full Java object, containing only information that the raw resource (a file, a directory, ...) can provide (e.g., for a file, the size, last modification date, ...).

Jigsaw’s Internal Design – A Frame is a full Java object, containing all the information needed to serve this Resource using a specific protocol (e.g., HTTPFrame for HTTP). Jigsaw attaches Frames to Resources to handle protocol-related activities.

Jigsaw’s Internal Design – A Filter is a full Java object, associated with a Frame, that can modify the Request and/or the Reply. For example, authentication is handled by a special filter.

Jigsaw’s Internal Design – An Indexer is also a full Java object, which tries to create and set up resources automatically. Resources can be created depending on their name or their extension. Once a resource has been created, the Indexer is also in charge of attaching the right frames to it, such as the HTTP frame, the filters, and so on.
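The relationship between the four concepts can be illustrated with a small sketch. Every type below (SketchResource, SketchFrame, SketchFilter, SketchIndexer) is hypothetical and heavily simplified; none of them is a real Jigsaw class, and the reply format is invented for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified types. These are NOT Jigsaw's real classes.

// A resource holds only what the raw file can tell us.
class SketchResource {
    final String name;
    final long size;
    final List<SketchFrame> frames = new ArrayList<>();

    SketchResource(String name, long size) {
        this.name = name;
        this.size = size;
    }
}

// A filter may rewrite the reply produced by a frame.
interface SketchFilter {
    String outgoing(String reply);
}

// A frame knows how to serve a resource over one protocol.
class SketchFrame {
    final String protocol;
    final List<SketchFilter> filters = new ArrayList<>();

    SketchFrame(String protocol) {
        this.protocol = protocol;
    }

    String serve(SketchResource r) {
        String reply = protocol + " 200 " + r.name + " (" + r.size + " bytes)";
        for (SketchFilter f : filters) {
            reply = f.outgoing(reply);
        }
        return reply;
    }
}

// An indexer creates the resource and attaches frames and filters,
// here chosen from the file extension.
class SketchIndexer {
    SketchResource index(String fileName, long size) {
        SketchResource r = new SketchResource(fileName, size);
        SketchFrame frame = new SketchFrame("HTTP");
        if (fileName.endsWith(".html")) {
            frame.filters.add(reply -> reply + " [text/html]");
        }
        r.frames.add(frame);
        return r;
    }
}
```

The point of the split is that the resource itself stays protocol-agnostic: protocol behaviour lives in the frame, and cross-cutting behaviour lives in filters attached to that frame.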

Jigsaw’s Internal Design • A sample FileResource:

Jigsaw’s Internal Design • The inheritance tree of Jigsaw:

Jigsaw’s Internal Design • The ResourceStoreManager – Jigsaw uses a ResourceStoreManager to manage all the resources used at runtime by the server. – The ResourceStoreManager’s implementation uses a simple cache policy to make resource references more efficient.

Jigsaw’s Internal Design • The ResourceStoreManager – It seems that, if we start multiple Jigsaw servers on one machine, all the servers will share the same ResourceStoreManager.

How Jigsaw Works? • Jigsaw uses Java threads extensively. All the major classes of Jigsaw run as Java threads. • Jigsaw separates document serving into two processing stages – Indexing Stage – Serving Stage

How Jigsaw Works? • Separating document serving into two stages makes resource lookup and resource sharing more efficient.

How Jigsaw Works? • The following slides use a sample HTTP request to explain how Jigsaw works and to look at some of its important classes.

How Jigsaw Works? • When Jigsaw starts up – An instance of the class httpd is created.

How Jigsaw Works? [Diagram: Initialize() on httpd, shown with its Indexer, manager, and root.]

How Jigsaw Works? • After initialization, httpd calls initializeServerSocket() to create the server socket and the SocketClientFactory, which is a pool of SocketClients. [Diagram: after initializeServerSocket(), httpd also holds the factory and the server port.]

How Jigsaw Works? • After creating the server socket successfully, httpd creates a thread, assigns itself to the thread, and runs as a thread: this.thread = new Thread(this); …; this.thread.start();
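For httpd to really run as its own thread, the thread has to be started with Thread.start(); calling run() directly just executes the method on the calling thread. The sketch below shows the difference with a hypothetical ThreadStartDemo class (not part of Jigsaw) that records the name of the thread its run() method executed on.

```java
// Hypothetical demo class, not part of Jigsaw.
class ThreadStartDemo implements Runnable {
    volatile String ranOn;

    @Override
    public void run() {
        // Record which thread actually executes this method.
        ranOn = Thread.currentThread().getName();
    }

    // start() schedules run() on the new thread.
    static String viaStart() {
        ThreadStartDemo demo = new ThreadStartDemo();
        Thread t = new Thread(demo, "httpd-sketch");
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.ranOn;
    }

    // run() is an ordinary method call: it executes on the caller's thread.
    static String viaRun() {
        ThreadStartDemo demo = new ThreadStartDemo();
        Thread t = new Thread(demo, "httpd-sketch");
        t.run();
        return demo.ranOn;
    }
}
```

viaStart() reports the new thread's name, while viaRun() reports whichever thread made the call, so only the first form frees the caller to do other work.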

How Jigsaw Works?
    public void run() {
        …
        while ((!finishing) && (socket != null)) {
            Socket ns = null;
            try {
                ns = socket.accept();
                ns.setTcpNoDelay(true);
            } catch (IOException e) {
                …
            }
            if ((socket != null) && (ns != null) && (factory != null))
                factory.handleConnection(ns);
        }
        // Our socket has been closed, perform associated cleanup.
        cleanup(restarting);
    }

How Jigsaw Works? • When there’s an incoming connection request, the httpd thread uses the client pool to handle it. [Diagram: a request for /archives/index.html arrives and httpd calls handleConnection() on the SocketClientFactory.]

How Jigsaw Works? • SocketClientFactory maintains a pool of SocketClients. It either finds a free SocketClient thread in the pool, if one is available, or kills some old connections to obtain a free SocketClient, provided the load has not exceeded the maximum load.

How Jigsaw Works? • When the factory finds a free SocketClient, it binds the incoming socket to the SocketClient, and the SocketClient starts a thread to perform the request. • Because all the SocketClients run as threads, once a connection is bound to a SocketClient the server can keep listening on the network for new incoming requests.

How Jigsaw Works? • The SocketClient then processes the incoming request and calls httpd’s perform(Request) method to look up the necessary resources. The perform method returns an object containing all the resources needed to reply to the request. The SocketClient uses this object to send back the result.
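The lookup-then-serve flow can be sketched as two separate methods. TwoStageServer, its register/lookup methods, and the string-based reply are all assumptions made for illustration; Jigsaw's real perform(Request) works on its own request and reply types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical two-stage server, not Jigsaw's real httpd class.
class TwoStageServer {
    // Path -> resource body. A stand-in for the indexed resource tree.
    private final Map<String, String> indexed = new HashMap<>();

    void register(String path, String body) {
        indexed.put(path, body);
    }

    // Stage 1 (lookup): resolve the requested path to a resource, or null.
    String lookup(String path) {
        return indexed.get(path);
    }

    // Stage 2 (serving): build the reply from the resource found in stage 1.
    String perform(String path) {
        String resource = lookup(path);
        if (resource == null) {
            return "404 Not Found";
        }
        return "200 OK\n" + resource;
    }
}
```

Keeping lookup separate from serving means the result of stage 1 can be cached and shared between requests, while stage 2 only deals with producing the reply.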

How Jigsaw Works? • After the reply has been sent, the SocketClient closes the connection and returns itself to the free SocketClient pool in the SocketClientFactory.
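The acquire/release cycle of such a pool can be sketched as follows. ClientPool and PooledClient are hypothetical names, and the sketch simplifies one thing on purpose: when the pool is exhausted it returns null, whereas the slides describe Jigsaw killing old connections to free a client.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a pooled client handler.
class PooledClient {
    final int id;
    int served = 0;

    PooledClient(int id) {
        this.id = id;
    }
}

// Hypothetical pool, not Jigsaw's SocketClientFactory.
class ClientPool {
    private final Deque<PooledClient> free = new ArrayDeque<>();
    private final int maxClients;
    private int created = 0;

    ClientPool(int maxClients) {
        this.maxClients = maxClients;
    }

    // Reuse a free client if possible, create one while below the limit,
    // otherwise give up (simplification: no eviction of old connections).
    synchronized PooledClient acquire() {
        if (!free.isEmpty()) {
            return free.pop();
        }
        if (created < maxClients) {
            return new PooledClient(created++);
        }
        return null;
    }

    // Called once the reply has been sent: the client goes back to the pool.
    synchronized void release(PooledClient client) {
        client.served++;
        free.push(client);
    }

    synchronized int createdCount() {
        return created;
    }
}
```

Because released clients are reused, the number of handler objects stays bounded by the configured maximum no matter how many requests arrive.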

Some Notices on Jigsaw • Jigsaw uses a thread pool (cache) to handle incoming requests, instead of creating a new process or thread for each one. • Jigsaw caches requested resources in a resource hashtable keyed by resource id, in order to reduce file system access.

Some Notices on Jigsaw • When httpd looks up the requested resource and the request is read-only, it only increments the lock count (reference count) of that object, so one resource can be accessed simultaneously by multiple requests.
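A minimal sketch of a resource table keyed by id with a per-entry lock count, assuming a hypothetical RefCountedCache class. The loadFromDisk method is a placeholder that fabricates content and counts how often it is called; it is not how Jigsaw loads resources.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache, not Jigsaw's ResourceStoreManager.
class RefCountedCache {
    private static final class Entry {
        final String content;
        int lockCount;

        Entry(String content) {
            this.content = content;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private int loads = 0;

    // Read-only access: load on first use, then just bump the lock count.
    synchronized String lock(String id) {
        Entry e = entries.get(id);
        if (e == null) {
            e = new Entry(loadFromDisk(id));
            entries.put(id, e);
        }
        e.lockCount++;
        return e.content;
    }

    // A reader is done with the resource.
    synchronized void unlock(String id) {
        Entry e = entries.get(id);
        if (e != null && e.lockCount > 0) {
            e.lockCount--;
        }
    }

    synchronized int lockCount(String id) {
        Entry e = entries.get(id);
        return e == null ? 0 : e.lockCount;
    }

    synchronized int loadCount() {
        return loads;
    }

    // Placeholder for real file system access.
    private String loadFromDisk(String id) {
        loads++;
        return "contents of " + id;
    }
}
```

Two concurrent readers of the same id share one in-memory entry: the file system is touched once and the lock count records how many requests are still using the entry.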

Some Notices on Jigsaw • Jigsaw also uses an LRU algorithm to discard long-unused resources from memory, and kills long-idle connections and SocketClients to decrease the server’s workload.
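LRU eviction is easy to illustrate in Java with java.util.LinkedHashMap in access order. The LruResourceCache class below is a generic sketch of the technique under that assumption, not the policy Jigsaw actually implements.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache built on LinkedHashMap's access-order mode.
class LruResourceCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruResourceCache(int capacity) {
        // accessOrder = true: iteration order goes from least to most
        // recently accessed entry.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true drops
    // the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of two, inserting a and b, reading a, and then inserting c evicts b, because b is the entry that has gone unused the longest.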

Some Notices on Jigsaw • Jigsaw does not provide a direct way to configure it to run cooperatively on multiple machines. But its internal design makes it possible to replace or add classes to gain new functions.

End of Jigsaw

High-performance Web Servers in other institutes • JAWS: a Web server from the CS Dept., Washington University – JAWS’s findings: • “Factoring out I/O, the primary determinant to server performance is the concurrency strategy” • “For single CPU machines, single-threaded solutions are acceptable and perform well. However, they do not scale for multi-processor platforms.”

High-performance Web Servers in other institutes – JAWS’s findings: • Process-based concurrency implementations perform reasonably well when the network is the bottleneck. However, on high-speed networks like ATM, the cost of spawning a new process per request is relatively high. • Multi-threaded designs appear to be the choice of the top Web server performers. The cost of spawning a thread is much cheaper than that of a process.
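Two of the threaded strategies can be contrasted in a small sketch: spawning one thread per request, and reusing a fixed pool of threads. The ConcurrencyStrategies class is hypothetical and only counts how many distinct worker threads end up handling the requests; it makes no performance measurement and is not taken from JAWS.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical comparison of two concurrency strategies.
class ConcurrencyStrategies {

    // Thread-per-request: a fresh thread is spawned for every request.
    static int threadPerRequest(int requests) {
        Set<String> workers = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[requests];
        for (int i = 0; i < requests; i++) {
            threads[i] = new Thread(
                () -> workers.add(Thread.currentThread().getName()));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return workers.size();
    }

    // Thread pool: a fixed set of threads is reused across requests.
    static int threadPool(int requests, int poolSize) {
        Set<String> workers = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> workers.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return workers.size();
    }
}
```

The first method uses as many threads as there are requests, while the second never uses more than poolSize, which is the property a pooled server relies on to bound its spawning cost.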

High-performance Web Servers in other institutes – JAWS’ Framework Overview:

High-performance Web Servers in other institutes – “... the key to developing high performance Web systems is through a design which is flexible enough to accommodate different strategies for dealing with server load and is configurable from a high level specification describing the characteristics of the machine and the expected use load of the server.” – More on JAWS: • http://www.cs.wustl.edu/~jxh/research/research.html

High-performance Web Servers in other institutes • Scalable Web Server Architecture from Lucent & UT Austin

High-performance Web Servers in other institutes • Scalable Web Server Architecture from Lucent & UT Austin – A redirection server is used – Data is distributed among the servers – Problems occur when documents are constantly moved

High-performance Web Servers in other institutes • Web Server Clusters in UCSB – Master/Slave Architecture

High-performance Web Servers in other institutes • Web Server Clusters in UCSB – Masters handle static requests. – Dynamic content requests may be processed locally at masters, or redirected to a slave node or another master. – Slaves may be either dedicated or non-dedicated.
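The routing decision at a master can be sketched as below. The slide does not say how a master chooses between local processing and redirection, so everything here is an assumption: the MasterNode class, the rule that paths under /cgi-bin/ are dynamic, and the fixed threshold on locally handled dynamic requests are all invented for illustration. Redirection to another master is left out.

```java
// Hypothetical master node. The routing policy is an assumption,
// not the UCSB design.
class MasterNode {
    enum Target { LOCAL_MASTER, SLAVE }

    // Assumed limit on dynamic requests a master will process itself.
    private final int maxLocalDynamic;
    private int localDynamic = 0;

    MasterNode(int maxLocalDynamic) {
        this.maxLocalDynamic = maxLocalDynamic;
    }

    Target route(String path) {
        // Assumption: only paths under /cgi-bin/ are dynamic content.
        boolean dynamic = path.startsWith("/cgi-bin/");
        if (!dynamic) {
            return Target.LOCAL_MASTER; // masters handle static requests
        }
        if (localDynamic < maxLocalDynamic) {
            localDynamic++;
            return Target.LOCAL_MASTER; // processed locally at the master
        }
        return Target.SLAVE;            // redirected to a slave node
    }
}
```

Static requests always stay on the master; dynamic ones spill over to slaves once the master has taken on its assumed share.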