Dynamic Binary Parallelization
Jing Yang
2011-01-01
Abstract: A large and important base of existing software is being left behind by emerging microprocessor architectures. Recently, fundamental issues in microprocessor technologies have led designers to increase the number of cores on a chip instead of increasing its single-threaded performance. Many-core designs with 4 to 8 cores are ubiquitous, and trends suggest that core counts will continue to grow for the foreseeable future [29, 36]. Unfortunately, most existing software is designed for single-core processors, and is therefore unable to fully exploit the increased processing power offered by many-core processors. This existing software base represents years and sometimes decades of investment. One solution to the problem is program parallelization; however, state-of-the-art parallelization technologies are not always practical for existing software. Many existing techniques require source code to be rewritten using parallel languages [15, 33] or libraries [35, 88], but this is often impractical due to cost: efforts to analyze, fix, and test existing software due to the Y2K bug alone were estimated to have cost about $20 billion in the 1990s [78], and rewriting code to find opportunities for parallelism would be a much larger task. Alternatively, automatic parallelization techniques do not require code to be rewritten, but they typically do require access to the source code for analysis. In many cases, all or some of the source code and development tool chain may be lost or, in the case of third-party software, never available. Furthermore, software systems often involve components written in different programming languages, which makes cross-module parallelization difficult, if not impossible.
Some parallelization techniques do not require source code and analyze the binary executable directly [19, 89, 96], but even these techniques are applied statically and so cannot parallelize across dynamically linked executables and libraries, which are not known until run time and can change or be upgraded. To address these problems, the proposed research will answer the question: