tpNFS: Efficient Support of Small Files Processing over pNFS

Bo Wang, Jinlei Jiang, Guangwen Yang
DOI: https://doi.org/10.1109/ipdpsw.2013.36
2013-01-01
Abstract: Large-scale data-intensive applications that consume and produce terabytes or even petabytes of data place an ever-increasing demand on I/O bandwidth. To meet this demand, the NFSv4 architects designed parallel NFS (pNFS), an NFS extension that allows clients to read and write data from and to multiple data servers in parallel. Although pNFS supports large-file processing efficiently, we found that it is deficient in processing small files. Unfortunately, small files dominate in a large number of applications in scientific computing environments. To address this problem, this paper presents tpNFS, an extension to pNFS that adds a transport driver to the pNFS metadata servers so that file data, whether from small or large files, is striped more evenly across multiple data servers. Our experiments, which boot DomU clients from tpNFS and manipulate a large number of files, show that tpNFS outperforms pNFS for small-file processing, especially when many clients read and write concurrently. For large-file processing, tpNFS introduces almost no overhead compared with pNFS.