Evaluating large language models trained on code. arXiv 2021
Mark Chen, Jerry Tworek, Heewoo Jun, Q Yuan, HPdO Pinto, J Kaplan, H Edwards, Y Burda, N Joseph, G Brockman, A Ray, R Puri, G Krueger, M Petrov, H Khlaaf, G Sastry, P Mishkin, B Chan, S Gray, N Ryder, M Pavlov, A Power, L Kaiser, M Bavarian, C Winter, P Tillet, FP Such, D Cummings, M Plappert, F Chantzis, E Barnes, A Herbert-Voss, WH Guss, A Nichol, A Paino, N Tezak, J Tang, I Babuschkin, S Balaji, S Jain, W Saunders, C Hesse, AN Carr, J Leike, J Achiam, V Misra, E Morikawa, A Radford, M Knight, M Brundage, M Murati, K Mayer, P Welinder, B McGrew, D Amodei, S McCandlish, I Sutskever, W Zaremba
2021-01-01
Abstract: