Lightweight Workflow Engine Based on Hadoop and OSGi

Shengmei Luo, Lixia Liu, Juan Yang, Di Zhang
DOI: https://doi.org/10.1109/ICBNMT.2013.6823954
2013-01-01
Abstract: The growth of data used by data-intensive computations has outpaced the growth in the power of a single processor. In this paper, we propose a high-performance, scalable workflow engine: a lightweight parallel and distributed computing platform based on Hadoop and OSGi. The workflow engine provides flexible and scalable OSGi-based interfaces, which users can implement to define data processing functions and to extend functionality in the Hadoop ecosystem. The engine consists of invocation interfaces, a scheduler engine, an optimizer, and a fault-tolerance manager. Given a workflow process, the engine analyzes the data dependencies among its nodes, then dispatches the nodes to MapReduce clusters based on the current status of the system. This paper also describes the design and strategy in detail, covering the implementation process, key points, and the relationships among the participants. Our experiment demonstrates that the workflow engine can significantly improve the performance of workflow execution.
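The dependency-analysis step mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the algorithm the engine uses, so the code below assumes a standard Kahn-style topological ordering that groups workflow nodes into "waves" whose members have no unmet dependencies and could therefore be dispatched together. The function name `schedule_waves`, the dictionary input format, and the node names are hypothetical, and dispatching to a MapReduce cluster based on system status is not modeled.

```python
from collections import defaultdict


def schedule_waves(dependencies):
    """Group workflow nodes into waves of mutually independent nodes.

    `dependencies` maps each node name to the names of the nodes it
    depends on (a hypothetical input format, not taken from the paper).
    """
    # Collect every node, including those that appear only as a dependency.
    nodes = set(dependencies)
    for deps in dependencies.values():
        nodes.update(deps)

    # Unmet dependencies per node, and the reverse edges (who waits on whom).
    remaining = {n: set(dependencies.get(n, ())) for n in nodes}
    dependents = defaultdict(set)
    for node, deps in remaining.items():
        for dep in deps:
            dependents[dep].add(node)

    # Nodes with no dependencies form the first wave.
    ready = sorted(n for n, deps in remaining.items() if not deps)
    waves = []
    scheduled = 0
    while ready:
        waves.append(ready)
        scheduled += len(ready)
        next_ready = []
        for node in ready:
            for waiting in dependents[node]:
                remaining[waiting].discard(node)
                if not remaining[waiting]:
                    next_ready.append(waiting)
        ready = sorted(next_ready)

    # Any node never scheduled is part of a dependency cycle.
    if scheduled != len(nodes):
        raise ValueError("workflow contains a dependency cycle")
    return waves


# Hypothetical workflow: 'stats' and 'index' both need 'clean' and can run together.
workflow = {
    "clean": ["load"],
    "stats": ["clean"],
    "index": ["clean"],
    "report": ["stats", "index"],
}
waves = schedule_waves(workflow)
```

In this sketch each wave is a candidate batch for parallel submission; a real scheduler would additionally weigh cluster load before dispatching, as the abstract indicates.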