Abstract: Categorical variables have no intrinsic ordering, and researchers often adopt a fixed-effect (FE) approach in empirical analysis. However, this approach has two significant limitations: it overlooks the textual labels associated with the categorical variables, and it produces unstable results when a category contains only a few observations. In this paper, we propose a novel method that uses recent advances in large language models (LLMs) to recover overlooked information in categorical variables. We apply this method to investigate labor market mismatch. Specifically, we task LLMs with simulating the role of a human resources specialist to assess the suitability of an applicant with specific characteristics for a given job. Our main findings can be summarized in three parts. First, using comprehensive administrative data from an online job posting platform, we show that our new match quality measure is positively correlated with several traditional measures in the literature, and we highlight the LLM's capability to provide additional information conditional on the traditional measures. Second, we demonstrate the broad applicability of the new method with survey data that contain significantly less information than the administrative data, which makes it impossible to compute most of the traditional match quality measures. Using easily accessible survey data, our LLM measure successfully replicates most of the salient patterns observed in the hard-to-access administrative dataset. Third, we investigate the gender gap in match quality and explore whether gender stereotypes exist in the hiring process. We simulate an audit study, examining whether revealing gender information to LLMs influences their assessment. We show that when gender information is disclosed to GPT, the model deems females better suited for traditionally female-dominated roles.

Institutional subscribers to the NBER working paper series and residents of developing countries may download this paper without additional charge at www.nber.org.

Recovering Overlooked Information in Categorical Variables with LLMs: An Application to Labor Market Mismatch
