Penn CIS 400 - Combining Coding and Block Schemes for P2P Transmission

Unformatted text preview:

Combining Coding and Block Schemes for P2PTransmissionBen [email protected] [email protected]. Zach [email protected]. Sanjeev [email protected] 14, 2006AbstractWe introduce a new method for the distribution of large files in apeer to peer network based upon a combination of rateless co des, networkco ding, and blo ck transmission. Each of these schemes can contribute ina different way towards the collaborative transfer of a file. We believea combination of the above schemes will improve the download rate, aseach scheme creates a unique bottleneck at some point during the trans-mission. By using randomization and rateless codes to generate encodingsof information during a file transfer, the receiver will always receive usefulinformation. As peers containing large amounts of information drop outof the network, there is generally a problem gaining information aboutcertain rare pieces of a file. Due to the fact that we use randomized en-co dings, we attempt to eliminate this problem by ensuring that there isinformation to be gained about every piece of a file.We develop a protocol, perform analysis on its properties, and imple-ment it such that we can run tests in a real network environment. Thenetwork coding and rateless codes will ensure that the file transfers willprogress even if people frequently leave or join the network during trans-mission. By using encodings we attain a far more robust network in whichrare pieces fail to exist. We modify the protocol of an existing p2p clientin order to perform these measurements.1 Current ApproachesIn recent years peer-to-peer file transfer has gained immense popularity. Fromprograms such as Napster which were originally intended to share music files, toprograms such as Kaz aa and Gnutella which were more varied in content, usershave willingly transfered increasingly large files to one another. Only recentlyhave these collaborative methods of file transfer started to compete with thesimple server-client method for many types of files, such as Linux CD images.1Today the uses for large scale file transfers over networks are more prevalentthan ever, and range from s oftware patch distribution to video on demand. Inp2p environments, files were originally transfered from one user to another, butnow schemes have evolved in which many users with the sam e file, or pieces of it,can all actively aid in the transmission to another user, such that his downloadspeed bec omes the only bottleneck.1.1 BitTorrent: A Block Transfer SchemeRecently, the system known as BitTorrent [1] has emerged as one of the mostcommonly used p2p programs on the internet. BitTorrent has many advantages,such as the fact that the central server does not need to communicate with theend downloaders at all after the original handshaking process. Also, BitTorrentuses as tit-for-tat trading mechanism which specifies that the downloader cannot get access to new pieces of the file he is downloading unless he is alsocontributing to the network as an uploader. This virtually eliminates freeriding.Another advantage is that even if the initial server goes offline, the file transfersto other users can still continue as long as one user w hich has the entire filecontinues to act as a “seed” to the system. BitTorrent has become so popular,in fact, that it made up over 50% of p2p traffic on the internet in June 2004 [2].Although BitTorrent works extremely well in practice, it is possible to dobetter. The problem with BitTorrent is that it utilizes block transfer, meaningthat it breaks the file into k blocks, and transfers these individually. Thus, thereare only k useful transmissions in this network. As users receive these blocksthey may then transfer them on to other users per their request. Often, a setof blocks can become exceedingly rare in the network, especially when the peergroup lacks seeds. These rare blocks then become the bottleneck causing theend of a file transfer to take an unnecessarily long time. BitTorrent attempts tohandle this with a “rarest first” block transfer policy. Furthermore, BitTorrentalso has an “endgame” mode built in to the software, specifically designed toaid in the download of the last few pieces. According to Cohen, this is neededbecause of the tendency for the last few blocks to be downloaded from a “singlehosed modem line”The rarest first policy is hindered by the fact that the rareness of blockschanges over time. We introduce a method of removing the idea of a rare block.By incorporating coding techniques, there can be many more than k blocks, ofwhich only k are required. This helps reduce the need for both the rarest firstpolicy and the explicit endgame transfer mode.1.2 Avalanche: A Random Coding Based SchemeA good solution to this problem is to introduce a randomized encoding schemeand transfer encoded pieces of data based upon the original file rather thantransferring the original blocks themselves. In this approach, the original datais still broken down into blocks. A rateless encoding is then used to generate2as many encoding symbols as are required (a nearly infinite amount are avail-able). A common scheme which is used to encode packets is to simply take theexclusive-or of any number of the original blocks of data. These XORed encod-ing symbols can then be thought of as a system of linear equations, of whichthe user only needs to receive enough encoding symbols to solve the system inorder to recover the original file. A very powerful, and efficient scheme is theDigital Fountain [3] approach, in which it is proven that with high probability,a downloader only needs to receive k ∗ (1 + δ) encoding symbols to decode theoriginal data of size k, where δ is small.Taking Digital Fountain one step further, a new system has been proposedcalled Avalanche [4] which appears to be the first use of rateless codes in ap2p setting in which a downloader receives data from many different users onthe network. Avalanche uses the Digital Fountain approach combined withnetwork coding techniques to achieve a significant improvement over BitTorrentin simulation. The fact that users can now receive encoded information, fromwhich they can find out a little bit about multiple blocks at once, leads to muchfaster file transfers. Unfortunately, the network coding techniques which areused in the Avalanche system can lead to redundant data being sent to thedownloader time and time again. After a while, the


View Full Document

Penn CIS 400 - Combining Coding and Block Schemes for P2P Transmission

Download Combining Coding and Block Schemes for P2P Transmission
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Combining Coding and Block Schemes for P2P Transmission and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Combining Coding and Block Schemes for P2P Transmission 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?