Improving pix2code based Bi-directional LSTM

Yanbin Liu, Qidi Hu, Kunxian Shu
DOI: https://doi.org/10.1109/auteee.2018.8720784
2018-11-01
Abstract: Pix2code is a deep-learning framework that transforms a graphical user interface screenshot created by a designer into computer code with 77% accuracy. The architecture is based on a CNN and an LSTM. LSTM has been broadly applied to language modeling in natural language processing, where it is both general and effective at capturing long-term dependencies. However, a standard LSTM predicts in time order and ignores future context, and looking only at the preceding words is sometimes not enough. Computer code expresses relative spatial relationships, so the model must not only recognize individual tokens but also understand the structure of the whole sequence. To address this problem, the pix2code model is optimized with a Bidirectional LSTM, which gives the output layer complete past and future context for each point in the input sequence. The model’s accuracy on the test set improves significantly, reaching 85%.
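The idea of giving each sequence position both past and future context can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes scalar inputs and a scalar LSTM cell, and the weight names (`wi`, `ui`, `bi`, and so on) and function names are hypothetical, chosen only for illustration. One pass reads the sequence left to right, a second pass reads it right to left, and the two hidden states are paired at each position.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_pass(xs, w):
    """Run a scalar LSTM over xs and return the hidden state at every step.

    w is a dict of hypothetical scalar weights: w* for the input,
    u* for the previous hidden state, b* for the bias of each gate.
    """
    h, c = 0.0, 0.0
    outputs = []
    for x in xs:
        i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])    # input gate
        f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])    # forget gate
        o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])    # output gate
        g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])  # candidate cell value
        c = f * c + i * g          # update the cell state
        h = o * math.tanh(c)       # expose the gated hidden state
        outputs.append(h)
    return outputs


def bidirectional_lstm(xs, w_fwd, w_bwd):
    """Pair a left-to-right pass with a right-to-left pass at each position."""
    fwd = lstm_pass(xs, w_fwd)
    # Reverse the input, run the backward cell, then restore the original order
    # so that bwd[t] summarizes positions t..end.
    bwd = lstm_pass(xs[::-1], w_bwd)[::-1]
    return list(zip(fwd, bwd))
```

In this sketch, the forward component at position 0 depends only on the first input, while the backward component at position 0 depends on every input. Changing the last element of the sequence therefore changes what the model sees at the first position, which a unidirectional LSTM cannot provide.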