Abstract: Access to large amounts of diverse design solutions can support designers during the early stage of the design process. In this paper, we explore the efficacy of large language models (LLMs) in producing diverse design solutions, investigating the level of impact that parameter tuning and various prompt engineering techniques can have on the diversity of LLM-generated design solutions. Specifically, LLMs are used to generate a total of 4,000 design solutions across five distinct design topics, eight combinations of parameters, and eight different types of prompt engineering techniques, comparing each combination of parameter and prompt engineering method across four different diversity metrics. LLM-generated solutions are compared against 100 human-crowdsourced solutions in each design topic using the same set of diversity metrics. Results indicate that human-generated solutions consistently have greater diversity scores across all design topics. Using a post hoc logistic regression analysis, we investigate whether these differences primarily exist at the semantic level. Results show that there is a divide in some design topics between human- and LLM-generated solutions, while others have no clear divide. Taken together, these results contribute to the understanding of LLMs' capabilities in generating a large volume of diverse design solutions and offer insights for future research that leverages LLMs to generate diverse design solutions for a broad range of design tasks (e.g., inspirational stimuli).

What problem does this paper attempt to address?

### The Problem the Paper Attempts to Solve

This paper aims to explore the capability of large language models (LLMs) in generating diverse design solutions. Specifically, the researchers hope to enhance the diversity of generated design solutions by adjusting LLM parameters and using different prompt engineering techniques. The main research questions of the paper include:

1. **The impact of parameter tuning on the diversity of design solutions**: How do different parameter settings affect the diversity of design solutions generated by LLMs?
2. **The impact of prompt engineering techniques on the diversity of design solutions**: How do different prompt engineering methods affect the diversity of design solutions generated by LLMs?

### Research Background

In the early stages of the design process, diverse inspirational stimuli are crucial for sparking creativity and innovation. Past research has primarily obtained a large number of design examples through crowdsourcing, but this method has some limitations, such as most crowdsourcing workers potentially lacking professional skills, resulting in generated design solutions that are neither practical nor feasible. In recent years, with the development of large language models (such as GPT-4), these models have shown great potential in generating high-quality design solutions. However, the solutions generated by LLMs are often less diverse than those generated by humans, which is a significant challenge in design tasks that require novelty and diversity.

### Research Methods

1. **Data Collection**:
   - **Crowdsourced Design Solutions**: Collected 100 design solutions for each design problem from the Amazon Mechanical Turk platform.
   - **LLM-Generated Design Solutions**: Generated 4,000 design solutions using GPT-4, covering five different design themes, eight parameter combinations, and eight prompt engineering techniques.
2. **Parameter Tuning**:
   - Adjusted the temperature and Top-P parameters, setting them at low, medium, and high levels to explore their impact on the diversity of generated solutions.
3. **Prompt Engineering Techniques**:
   - **Zero-Shot Prompting**: Directly asked the LLM to generate design solutions.
   - **Few-Shot Prompting**: Provided some examples before generation to guide the LLM in generating higher quality solutions.
   - **Critique Prompting**: Asked the LLM to elaborate and explain its generated solutions in detail to increase the detail and diversity of the solutions.
   - **Adjective and Phrase Enhancement**: Included adjectives like "novel," "unique," "creative," or phrases like "you are a design expert" in the prompts to inspire the LLM to generate more diverse solutions.
4. **Diversity Evaluation**:
   - Used various computational metrics (such as Determinantal Point Process, nearest generated sample distance, convex hull volume, and average distance to centroid) to quantify the diversity of the generated design solutions.

### Main Findings

- **Human-Generated Solutions Typically Have Higher Diversity Scores**: Across all design themes, human-generated solutions consistently had higher diversity scores than LLM-generated solutions.
- **Impact of Parameter Tuning and Prompt Engineering Techniques**: Certain parameter settings and prompt engineering techniques can significantly enhance the diversity of LLM-generated solutions, but the effects vary by design theme.
- **Semantic Differences**: Through logistic regression analysis, it was found that in some design themes, there are significant semantic differences between human and LLM-generated solutions, while in other themes, there are no significant differences.

### Conclusion

This study provides important insights into the potential of LLMs in generating diverse design solutions and offers valuable references for future research on utilizing LLMs to generate diverse design solutions. Although LLMs still need improvement in some aspects, with appropriate parameter tuning and prompt engineering techniques, LLMs can become an important auxiliary tool in the design process.
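To make the diversity metrics named in the summary more concrete, the following is a minimal Python sketch of three of them: average distance to centroid, nearest-sample distance, and a Determinantal Point Process (DPP) style score. It is an illustration under stated assumptions, not the paper's implementation. It assumes each design solution has already been mapped to a numeric embedding vector (the summary does not say how), uses Euclidean distance, and builds the DPP kernel from an RBF similarity with a hypothetical `bandwidth` parameter. All function names are invented for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_distance_to_centroid(points):
    """Average distance from each point to the centroid of the set."""
    dim = len(points[0])
    centroid = [sum(p[i] for p in points) / len(points) for i in range(dim)]
    return sum(euclidean(p, centroid) for p in points) / len(points)


def mean_nearest_neighbor_distance(points):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(euclidean(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def dpp_log_det(points, bandwidth=1.0):
    """Log-determinant of an RBF similarity kernel over the set.

    Assumption: similarity L_ij = exp(-d_ij^2 / (2 * bandwidth^2)).
    Near-duplicate points make the kernel nearly singular, so a higher
    value indicates a more diverse set. Returns -inf for a singular kernel.
    """
    n = len(points)
    kernel = [
        [math.exp(-euclidean(p, q) ** 2 / (2 * bandwidth ** 2)) for q in points]
        for p in points
    ]
    log_det = 0.0
    # Gaussian elimination with partial pivoting; the kernel is positive
    # semidefinite, so only the magnitudes of the pivots are needed.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(kernel[r][col]))
        if abs(kernel[pivot][col]) < 1e-12:
            return float("-inf")
        kernel[col], kernel[pivot] = kernel[pivot], kernel[col]
        log_det += math.log(abs(kernel[col][col]))
        for row in range(col + 1, n):
            factor = kernel[row][col] / kernel[col][col]
            for k in range(col, n):
                kernel[row][k] -= factor * kernel[col][k]
    return log_det


# Toy 2-D "embeddings": a tightly clustered set and a spread-out set.
tight = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
spread = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

for name, pts in (("tight", tight), ("spread", spread)):
    print(
        name,
        mean_distance_to_centroid(pts),
        mean_nearest_neighbor_distance(pts),
        dpp_log_det(pts),
    )
```

On these toy sets, all three scores come out higher for `spread` than for `tight`, which is the direction a diversity metric should move. Convex hull volume is omitted here because it needs a computational-geometry routine and is only practical in low-dimensional embedding spaces.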

Exploring the Capabilities of Large Language Models for Generating Diverse Design Solutions
