Abstract: Artificial Intelligence (AI) has steadily improved across a wide range of tasks. However, the development and deployment of AI are almost entirely controlled by a few powerful organizations that are racing to create Artificial General Intelligence (AGI). These centralized entities make decisions with little public oversight, shaping the future of humanity, often with unforeseen consequences. In this paper, we propose OML, which stands for Open, Monetizable, and Loyal AI, an approach designed to democratize AI development. OML is realized through an interdisciplinary framework spanning AI, blockchain, and cryptography. We present several ideas for constructing OML using technologies such as Trusted Execution Environments (TEEs), traditional cryptographic primitives such as fully homomorphic encryption and functional encryption, obfuscation, and AI-native solutions rooted in the sample complexity and intrinsic hardness of AI tasks. A key innovation of our work is the introduction of a new scientific field: AI-native cryptography. Unlike conventional cryptography, which focuses on discrete data and binary security guarantees, AI-native cryptography exploits the continuous nature of AI data representations and their low-dimensional manifolds, focusing on improving approximate performance. One core idea is to transform AI attack methods, such as data poisoning, into security tools. This novel approach serves as a foundation for OML 1.0, which uses model fingerprinting to protect the integrity and ownership of AI models. The spirit of OML is to establish a decentralized, open, and transparent platform for AI development, enabling the community to contribute to, monetize, and take ownership of AI models. By decentralizing control and ensuring transparency through blockchain technology, OML prevents the concentration of power and provides accountability in AI development that has not been possible before.

2 OLMo 2 Furious

OLMo: Accelerating the Science of Language Models

OLMoE: Open Mixture-of-Experts Language Models

Code Llama: Open Foundation Models for Code

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models

Dolma: An open corpus of three trillion tokens for language model pretraining research

OpenELM: An Efficient Language Model Family with Open Training and Inference Framework

PaLM 2 Technical Report

Llama 2: Open Foundation and Fine-Tuned Chat Models

OmniBench: Towards The Future of Universal Omni-Language Models

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

LLM360: Towards Fully Transparent Open-Source LLMs

OML: Open, Monetizable, and Loyal AI

PolyLM: An Open Source Polyglot Large Language Model

MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series

The Llama 3 Herd of Models

Fully Open Source Moxin-7B Technical Report