Content Classification Tasks with Data Preprocessing Manifestations

Mamoona Anam, Dr. Kantilal P. Rane, Ali Alenezi, Ruby Mishra, Dr. Swaminathan Ramamurthy, Ferdin Joe John Joseph
DOI: https://doi.org/10.14704/web/v19i1/web19094
2022-01-20
Webology
Abstract: Data efficiency is a major hurdle for deep reinforcement learning. We address this challenge by pretraining an encoder on unlabeled data, which is then finetuned on a small amount of task-specific data. We combine latent dynamics modelling with unsupervised goal-conditioned RL to encourage representations that capture diverse aspects of the underlying MDP. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly outperforms previous work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. When paired with larger models and more diverse, task-aligned observational data, our approach shows great promise, nearing human-level performance and data efficiency on Atari in the best-case scenario.