Optimal Scheduling of Peer-to-Peer File Dissemination

Jochen Mundinger, Richard R. Weber, Gideon Weiss
DOI: https://doi.org/10.48550/arXiv.cs/0606110
2006-06-30
Abstract: Peer-to-peer (P2P) overlay networks such as BitTorrent and Avalanche are increasingly used for disseminating potentially large files from a server to many end users via the Internet. The key idea is to divide the file into many equally-sized parts and then let users download each part (or, for network coding based systems such as Avalanche, linear combinations of the parts) either from the server or from another user who has already downloaded it. However, their performance evaluation has typically been limited to comparing one system relative to another and typically been realized by means of simulation and measurements. In contrast, we provide an analytic performance analysis that is based on a new uplink-sharing version of the well-known broadcasting problem. Assuming equal upload capacities, we show that the minimal time to disseminate the file is the same as for the simultaneous send/receive version of the broadcasting problem. For general upload capacities, we provide a mixed integer linear program (MILP) solution and a complementary fluid limit solution. We thus provide a lower bound which can be used as a performance benchmark for any P2P file dissemination system. We also investigate the performance of a decentralized strategy, providing evidence that the performance of necessarily decentralized P2P file dissemination systems should be close to this bound and therefore that it is useful in practice.
Networking and Internet Architecture, Data Structures and Algorithms, Optimization and Control
What problem does this paper attempt to address?
The paper asks how to schedule transfers in a peer-to-peer (P2P) file distribution system so as to minimize the time needed to distribute a file from the server to all end users. Specifically, the authors aim to provide a theoretical lower bound that can serve as a performance benchmark for any P2P file distribution system, and to explore how well decentralized strategies perform against it.

### Background and Problem Description of the Paper

In P2P overlay networks such as BitTorrent and Avalanche, file distribution usually works as follows:

- The file is divided into many equal-sized parts.
- Users download each part either from the server or from another user who has already downloaded it. In network-coding-based systems such as Avalanche, users can also download linear combinations of the parts.

Most previous performance evaluations of these systems were limited to relative comparisons of one system against another and relied mainly on simulations and measurements. This paper instead provides an analytic treatment based on a new version of the broadcasting problem, the uplink-sharing model, and so gives a theoretical basis for assessing P2P file distribution systems.

### Main Contributions

1. **Theoretical Analysis**: The authors introduce a new uplink-sharing model and prove that, when all users have the same upload capacity, the minimum distribution time is the same as in the simultaneous send/receive version of the broadcasting problem.
2. **Solutions in the General Case**: For differing upload capacities, the authors provide a mixed integer linear programming (MILP) solution and a complementary fluid limit solution.
3. **Performance of Decentralized Strategies**: The authors study a decentralized randomized strategy and give evidence that, even under this simple strategy, the expected distribution time is close to the minimum achievable under centralized control. This suggests that the bound is a useful benchmark for systems that are necessarily decentralized.

### Mathematical Formulas

For equal upload capacities, with the file size and each upload capacity normalized to 1, the minimum distribution time \(T^*\) is:

\[ T^* = 1 + \frac{\lfloor \log_2 N \rfloor}{M} \]

where \(M\) is the number of file parts and \(N\) is the number of end users. Equivalently, the schedule takes \(M + \lfloor \log_2 N \rfloor\) rounds of length \(1/M\) each, which is the round count of the simultaneous send/receive broadcasting problem.

### Conclusion

The paper provides a theoretical lower bound on distribution time and gives evidence that decentralized strategies can perform close to it, which makes the bound relevant for designing efficient and practical P2P file distribution systems.
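
As a small illustration of the equal-capacity result, the sketch below evaluates the minimum distribution time in the form 1 + ⌊log2 N⌋/M, that is, M + ⌊log2 N⌋ rounds of length 1/M. It assumes the file size and every upload capacity are normalized to 1. The function names (`min_rounds`, `min_makespan`, `doubling_rounds`) are hypothetical and do not come from the paper. The last function is a sanity check for the single-part case only: it counts rounds when every node that holds the file uploads it to one new user per round.

```python
def min_rounds(num_parts: int, num_users: int) -> int:
    """Round count M + floor(log2 N) for M parts and N end users."""
    if num_parts < 1 or num_users < 1:
        raise ValueError("num_parts and num_users must be at least 1")
    # For a positive integer n, n.bit_length() - 1 equals floor(log2 n).
    return num_parts + (num_users.bit_length() - 1)


def min_makespan(num_parts: int, num_users: int) -> float:
    """Minimum distribution time, assuming unit file size and unit
    upload capacity, so that each round lasts 1/M."""
    return min_rounds(num_parts, num_users) / num_parts


def doubling_rounds(num_users: int) -> int:
    """Single-part case (M = 1): count rounds until the server plus
    N users all hold the file, when the number of holders doubles
    in every round."""
    holders = 1  # initially only the server holds the file
    rounds = 0
    while holders < num_users + 1:
        holders *= 2
        rounds += 1
    return rounds


if __name__ == "__main__":
    for m, n in [(1, 1), (1, 1000), (4, 8), (100, 1000)]:
        print(f"M={m:>3}, N={n:>4}: T* = {min_makespan(m, n):.2f}")
```

With many parts the bound approaches 1, the time the server needs to upload the file once, which is why splitting the file helps: for example, `min_makespan(100, 1000)` is 1.09, against 10.0 for a single part.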