Exploring the Capabilities of Large Language Models for Generating Diverse Design Solutions

Kevin Ma, Daniele Grandi, Christopher McComb, Kosa Goucher-Lambert
2024-05-02
Abstract: Access to large amounts of diverse design solutions can support designers during the early stage of the design process. In this paper, we explore the efficacy of large language models (LLMs) in producing diverse design solutions, investigating the level of impact that parameter tuning and various prompt engineering techniques can have on the diversity of LLM-generated design solutions. Specifically, LLMs are used to generate a total of 4,000 design solutions across five distinct design topics, eight combinations of parameters, and eight different types of prompt engineering techniques, comparing each combination of parameter and prompt engineering method across four different diversity metrics. LLM-generated solutions are compared against 100 human-crowdsourced solutions in each design topic using the same set of diversity metrics. Results indicate that human-generated solutions consistently have greater diversity scores across all design topics. Using a post hoc logistic regression analysis, we investigate whether these differences primarily exist at the semantic level. Results show that there is a divide in some design topics between humans and LLM-generated solutions, while others have no clear divide. Taken together, these results contribute to the understanding of LLMs' capabilities in generating a large volume of diverse design solutions and offer insights for future research that leverages LLMs to generate diverse design solutions for a broad range of design tasks (e.g., inspirational stimuli).
Human-Computer Interaction, Artificial Intelligence
What problem does this paper attempt to address?
### The Problem the Paper Attempts to Solve

This paper explores the capability of large language models (LLMs) to generate diverse design solutions. Specifically, the researchers examine whether adjusting LLM parameters and using different prompt engineering techniques can increase the diversity of the generated solutions. The main research questions are:

1. **The impact of parameter tuning on the diversity of design solutions**: How do different parameter settings affect the diversity of design solutions generated by LLMs?
2. **The impact of prompt engineering techniques on the diversity of design solutions**: How do different prompt engineering methods affect the diversity of design solutions generated by LLMs?

### Research Background

In the early stages of the design process, diverse inspirational stimuli are crucial for sparking creativity and innovation. Past research has primarily obtained large numbers of design examples through crowdsourcing, but this method has limitations: for example, many crowd workers may lack professional design skills, which can yield solutions that are neither practical nor feasible. In recent years, large language models (such as GPT-4) have shown great potential for generating high-quality design solutions. However, the solutions generated by LLMs are often less diverse than those generated by humans, which is a significant challenge in design tasks that require novelty and diversity.

### Research Methods

1. **Data Collection**:
   - **Crowdsourced Design Solutions**: Collected 100 design solutions for each design problem from the Amazon Mechanical Turk platform.
   - **LLM-Generated Design Solutions**: Generated 4,000 design solutions using GPT-4, covering five different design topics, eight parameter combinations, and eight prompt engineering techniques.
2. **Parameter Tuning**:
   - Adjusted the temperature and Top-P parameters, setting them at low, medium, and high levels to explore their impact on the diversity of generated solutions.
3. **Prompt Engineering Techniques**:
   - **Zero-Shot Prompting**: Directly asked the LLM to generate design solutions.
   - **Few-Shot Prompting**: Provided some examples before generation to guide the LLM toward higher-quality solutions.
   - **Critique Prompting**: Asked the LLM to elaborate on and explain its generated solutions in detail to increase the detail and diversity of the solutions.
   - **Adjective and Phrase Enhancement**: Included adjectives like "novel," "unique," and "creative," or phrases like "you are a design expert," in the prompts to encourage the LLM to generate more diverse solutions.
4. **Diversity Evaluation**:
   - Used several computational metrics (Determinantal Point Process, nearest generated sample distance, convex hull volume, and average distance to centroid) to quantify the diversity of the generated design solutions.

### Main Findings

- **Human-Generated Solutions Typically Have Higher Diversity Scores**: Across all design topics, human-generated solutions consistently had higher diversity scores than LLM-generated solutions.
- **Impact of Parameter Tuning and Prompt Engineering Techniques**: Certain parameter settings and prompt engineering techniques can significantly enhance the diversity of LLM-generated solutions, but the effects vary by design topic.
- **Semantic Differences**: Logistic regression analysis showed that in some design topics there are significant semantic differences between human and LLM-generated solutions, while in other topics there are no significant differences.

### Conclusion

This study provides insight into the potential of LLMs for generating diverse design solutions and offers a reference point for future research on using LLMs for this purpose. Although LLMs still need improvement in some aspects, with appropriate parameter tuning and prompt engineering techniques they can become an important auxiliary tool in the design process.
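
### Illustrative Sketch of Two Diversity Metrics

To make the Diversity Evaluation step more concrete, the following is a minimal sketch of two of the listed metrics: average distance to centroid and a nearest-neighbor distance. It assumes each design solution has already been mapped to a numeric vector (for example, a text embedding), which this summary does not specify. The function names, the toy 2-D points, and the exact definition of the nearest-neighbor metric are illustrative assumptions, not the paper's implementation. The Determinantal Point Process and convex hull volume metrics are omitted here.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def mean_distance_to_centroid(points: List[Vector]) -> float:
    """Average Euclidean distance from each point to the centroid.

    Larger values mean the solutions are spread further from their center.
    """
    c = centroid(points)
    return sum(math.dist(p, c) for p in points) / len(points)


def mean_nearest_neighbor_distance(points: List[Vector]) -> float:
    """Average, over all points, of the distance to the closest other point.

    This is one plausible reading of a "nearest sample distance" metric
    (an assumption); small values indicate near-duplicate solutions.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


# Hypothetical 2-D "embeddings": a tightly clustered set and a spread-out set.
tight = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

for name, pts in [("tight", tight), ("spread", spread)]:
    print(name,
          round(mean_distance_to_centroid(pts), 4),
          round(mean_nearest_neighbor_distance(pts), 4))
```

On these toy inputs both metrics are larger for the spread-out set, which is the sense in which a higher score indicates a more diverse set of solutions. Real embeddings are high-dimensional, so a numerical library would normally replace the pure-Python loops.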