WebTorrent: a BitTorrent Extension for High Availability Servers

Gary Sivek, Steven Sivek, Jonathan Wolfe, Michael Zhivich
gsivek@mit.edu, ssivek@mit.edu, jwolfe@mit.edu, mzhivich@mit.edu

May 6, 2004

Abstract

Achieving high content availability is one of the most important goals of a webserver system. In order to achieve high availability in the traditional client-server setting, the server must have the bandwidth and the hardware needed to handle any peak load that might occur. However, this is a very costly and rarely practical solution, especially for most non-commercial servers subjected to the Slashdotting effect. We propose a WebTorrent system, based on BitTorrent, that will leverage the resources of the clients to help the server make the content more available. Such a system will help alleviate the load on the server and reduce client download times.

1 Vision

1.1 WebTorrent Overview

In this paper we describe WebTorrent, our system to improve delivery of web sites in high demand. It leverages the BitTorrent infrastructure in order to provide faster download speeds for clients requesting very popular web sites and to alleviate the heavy load on these web servers. The system consists of plugins for both the client and the server; in our implementation, we use Mozilla as the client browser of choice and Apache as the server of choice. The success of the system relies on the willingness of clients to maintain small caches of bundled web sites for other clients to retrieve using BitTorrent. However, the overhead associated with the added infrastructure should not be large enough to reduce server performance during periods of low load.

1.2 Design Choice

We considered all three approaches to improving the flow of web traffic: client-side software only, server-side software only, and coordinated client-side and server-side software. The first two approaches have the advantage of being easy to deploy, since they require work either on the part of a single server administrator or a handful of web
surfers, but not both. However, server-side-only approaches tend to involve increasing hardware or bandwidth, or otherwise spending money, and none of these options are desirable or immediately available to the average home broadband server. Client-only approaches tend to involve distributed caching, which has legal implications, makes it harder for a server administrator to change the content reliably, and requires extra work for the client to determine where the cached content exists. We feel that implementing software on both ends is the most elegant solution, allowing easy administration on the server end and no more than a one-time install on the part of the willing clients. Moreover, as our tests demonstrate, performance benefits are seen even if a small fraction of clients use WebTorrent-enabled browsers; thus, it is to the advantage of both the server administrator and the clients to use our system.

1.3 Key Challenges for WebTorrent

The BitTorrent infrastructure is currently designed to handle large files; a single transfer chunk is 256KB. The majority of web site requests involve HTML files and images that are much smaller than this minimum BitTorrent chunk. Because of the new proposed use of BitTorrent, the WebTorrent system must appropriately adapt the BitTorrent protocol to be useful for small requests while maintaining performance acceptable for browsing the web. We propose a scheme for bundling appropriate HTML files and images together in order to reduce the combined overhead of invoking BitTorrent for many small files. In addition, we configure BitTorrent to use smaller chunks of 16KB when transferring webpage bundles. While doing so adds to the total overhead, this strategy ensures a reasonable number of chunks in a bundle, such that BitTorrent benefits are not lost.

Backwards compatibility must be maintained, such that clients without the WebTorrent plugin can still be served by a WebTorrent-enabled server. In order to achieve this goal, WebTorrent-enabled servers simply continue
to serve content in traditional fashion for such clients. In fact, WebTorrent strives to reduce the load even further by making use of clients who volunteer to also act as backup servers. Clients that do not support WebTorrent can then be redirected to those alternative servers, within reason, to shed more load.

2 Motivation

2.1 The Slashdot Effect

Our motivation for this system is the common Slashdot effect. This effect occurs when a server that cannot handle large amounts of traffic is bombarded with visitors because it is hosting a website that has been linked from an immensely popular site like Slashdot [11]. Our goal is to create a system that uses BitTorrent to deliver such high-demand web sites, both to increase performance for clients (who sometimes cannot retrieve a copy of the desired web site at any speed) and to reduce load on such Slashdotted servers to keep them from shutting down.

Slashdot itself should not solve this problem on its own by caching such web content. Sites that generate revenue from advertisements would prefer that clients load their site directly rather than a cached copy from Slashdot. More importantly, Slashdot would then have to ensure that it does not hold a stale copy of the site in its cache [12].

2.2 BitTorrent and Its Limitations

BitTorrent works by decentralizing the download process for clients. Instead of fetching an entire large and popular file (Linux kernel source, for example) from a single server, BitTorrent clients download just a small torrent file, which contains information about the tracker and the pieces of the desired file. The clients then retrieve a list of peers from the tracker, which acts as a coordinator and keeps information about which pieces of the file each peer has. Once the client has a list of peers, it can exchange file pieces with these peers without communicating with the original server. The client also keeps the tracker informed about the pieces that it has already downloaded and retrieves a new list of peers after some time
interval. Thus, BitTorrent spreads out the load among clients, effectively taking the load away from the original webserver and providing clients with a potentially faster way to download that large file.

While BitTorrent succeeds in its goal of distributing the server load among the clients, it suffers from several robustness problems. First, all clients download the torrent information file from the original server, called the directory server. While the torrent file is small, it is possible that the