A method for optimizing the execution of a text data conversion script

Jiang Dawei, Chen Ke, Wei Jiarong, Shou Lidan, Chen Gang, Hu Tianlei, Wu Sai
2018-01-01
Abstract: The invention discloses a method for optimizing the execution of text data conversion scripts. For text data conversion scripts executed in a distributed network environment, each script is parsed and an execution plan tree is generated. Tuple-based multisets serve as the data model for the text data, and a conversion script consists of data operations that modify and transform the structure and content of these multisets. A corresponding execution optimization method is applied depending on the scenario in which the script is executed, and a program is then generated from the optimized execution plan and run, so that data on a big data platform can be transformed and processed efficiently. The method can be applied to the processing of massive text data in the data preparation stage. By applying execution optimizations designed for text data conversion scripts, it reduces the time and space cost of script execution and improves the efficiency of the data preparation stage.
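The abstract does not disclose the concrete optimization rules, so the following is only an illustrative sketch under stated assumptions, not the invention's method. It models a tuple-based multiset with Python's `collections.Counter`, represents a linear execution plan as a chain of hypothetical `PlanNode` objects, and applies one well-known rewrite (fusing adjacent map operators) as a stand-in for a scenario-specific optimization. The names `PlanNode`, `execute`, `fuse_maps`, and `depth`, and the sample rows, are all assumptions introduced for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PlanNode:
    """One operator in a linear execution plan (hypothetical structure)."""
    kind: str                       # "scan", "map", or "filter"
    fn: Optional[Callable] = None   # row function (map) or predicate (filter)
    child: Optional["PlanNode"] = None
    data: Optional[Counter] = None  # source multiset, used by "scan" only


def execute(node: PlanNode) -> Counter:
    """Evaluate a plan bottom-up, building one multiset per operator."""
    if node.kind == "scan":
        return Counter(node.data)
    rows = execute(node.child)
    out = Counter()
    for row, count in rows.items():
        if node.kind == "map":
            out[node.fn(row)] += count   # duplicates are preserved via counts
        elif node.kind == "filter" and node.fn(row):
            out[row] += count
    return out


def fuse_maps(node: PlanNode) -> PlanNode:
    """Assumed example rewrite: merge adjacent map operators into one pass."""
    if node.kind == "scan":
        return node
    child = fuse_maps(node.child)
    if node.kind == "map" and child.kind == "map":
        outer, inner = node.fn, child.fn
        return PlanNode("map", fn=lambda row: outer(inner(row)), child=child.child)
    return PlanNode(node.kind, fn=node.fn, child=child)


def depth(node: Optional[PlanNode]) -> int:
    """Number of operators in the plan chain."""
    return 0 if node is None else 1 + depth(node.child)


# Made-up sample data: (name, age) tuples with multiplicities.
source = Counter({("  Alice ", "30"): 2, ("bob", "25"): 1})

plan = PlanNode(
    "filter", fn=lambda r: int(r[1]) >= 30,
    child=PlanNode(
        "map", fn=lambda r: (r[0].lower(), r[1]),
        child=PlanNode(
            "map", fn=lambda r: (r[0].strip(), r[1]),
            child=PlanNode("scan", data=source),
        ),
    ),
)

optimized = fuse_maps(plan)
result = execute(optimized)
```

In this sketch the optimized plan yields the same multiset as the original while materializing one fewer intermediate result, which is the kind of time and space saving the abstract describes in general terms.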