Release of Pre-Trained Models for the Japanese Language

Kei Sawada, Tianyu Zhao, Makoto Shing, Kentaro Mitsui, Akio Kaga, Yukiya Hono, Toshiaki Wakatsuki, Koh Mitsuda

2024-04-02

Abstract: AI democratization aims to create a world in which the average person can utilize AI techniques. To achieve this goal, numerous research institutes have attempted to make their results accessible to the public. In particular, large pre-trained models trained on large-scale data have shown unprecedented potential, and their release has had a significant impact. However, most of the released models specialize in the English language, and thus AI democratization in non-English-speaking communities is lagging significantly. To reduce this gap in AI access, we released Generative Pre-trained Transformer (GPT), Contrastive Language and Image Pre-training (CLIP), Stable Diffusion, and Hidden-unit Bidirectional Encoder Representations from Transformers (HuBERT) models pre-trained in Japanese. With these models, users can freely interface with AI that aligns with Japanese cultural values and ensures the identity of Japanese culture, thus enhancing the democratization of AI. Additionally, experiments showed that pre-trained models specialized for Japanese can efficiently achieve high performance in Japanese tasks.

Computation and Language, Artificial Intelligence, Computer Vision and Pattern Recognition, Machine Learning, Audio and Speech Processing

What problem does this paper attempt to address?

The paper mainly addresses the following issues:

1. **Promoting AI Democratization**: By releasing pre-trained models optimized for Japanese, it lowers the barriers for non-English speakers to access and utilize advanced AI technologies.
2. **Bridging the Language Gap**: Most existing high-performance pre-trained models focus on English, leading to significant lag in AI resource access for non-English communities. The paper addresses this issue by releasing a series of pre-trained models specifically for Japanese.
3. **Cultural Adaptability**: Ensuring that the released models reflect Japanese cultural values and maintain the characteristics of Japanese culture, thereby enhancing the inclusivity of AI democratization.

Specifically, the paper releases the following types of Japanese pre-trained models:

- **Language Model (GPT)**: A Japanese language model based on the Generative Pre-trained Transformer (GPT) architecture, used for text generation tasks.
- **Language-Image Model (CLIP)**: A model that connects visual concepts with natural language, used for tasks such as zero-shot image classification.
- **Stable Diffusion Model**: A model used to generate high-quality images based on text prompts.
- **Speech Model (HuBERT)**: A self-supervised speech representation learning model used for automatic speech recognition tasks.

Through experimental validation, these Japanese-specific pre-trained models have been shown to efficiently achieve high performance in handling Japanese-related tasks. Additionally, the stable diffusion model has demonstrated its ability to process Japanese inputs and produce outputs that align with Japanese cultural characteristics.
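As an illustration of the zero-shot image classification mentioned for the CLIP model, the sketch below shows the general idea: an image embedding is compared with the embeddings of candidate text labels, and the similarities are turned into probabilities. This is not the released model or its API. The embedding vectors are invented toy values, the function names are hypothetical, and the temperature of 0.07 is an assumed setting; a real system would obtain the embeddings from the image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def zero_shot_classify(image_embedding, label_embeddings, temperature=0.07):
    """Return a probability for each label given one image embedding.

    `temperature` is an assumed value used only for this illustration.
    """
    labels = list(label_embeddings)
    logits = [
        cosine(image_embedding, label_embeddings[label]) / temperature
        for label in labels
    ]
    return dict(zip(labels, softmax(logits)))


# Toy embeddings (invented for illustration, not produced by any model).
image_embedding = [0.9, 0.1, 0.2]
label_embeddings = {
    "犬の写真": [1.0, 0.0, 0.1],    # "a photo of a dog"
    "猫の写真": [0.1, 1.0, 0.0],    # "a photo of a cat"
    "寿司の写真": [0.0, 0.2, 1.0],  # "a photo of sushi"
}

probs = zero_shot_classify(image_embedding, label_embeddings)
best = max(probs, key=probs.get)
print(best)
```

Because the labels are plain text prompts, new classes can be added by embedding new Japanese phrases, without retraining a classifier.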

