Pre-Deployment Information Sharing: A Zoning Taxonomy for Precursory Capabilities

Matteo Pistillo, Charlotte Stix
2024-11-18
Abstract: High-impact and potentially dangerous capabilities can and should be broken down into early warning shots long before reaching red lines. Each of these early warning shots should correspond to a precursory capability. Each precursory capability sits on a spectrum indicating its proximity to a final high-impact capability, corresponding to a red line. To meaningfully detect and track capability progress, we propose a taxonomy of dangerous capability zones (a zoning taxonomy) tied to a staggered information exchange framework that enables relevant bodies to take action accordingly. In the Frontier AI Safety Commitments, signatories commit to sharing more detailed information with trusted actors, including an appointed body, as appropriate (Commitment VII). Building on our zoning taxonomy, this paper makes four recommendations for specifying information sharing as detailed in Commitment VII. (1) Precursory capabilities should be shared as soon as they become known through internal evaluations before deployment. (2) AI Safety Institutes (AISIs) should be the trusted actors appointed to receive and coordinate information on precursory components. (3) AISIs should establish adequate information protection infrastructure and guarantee increased information security as precursory capabilities move through the zones and towards red lines, including, if necessary, by classifying the information on precursory capabilities or marking it as controlled. (4) High-impact capability progress in one geographical region may translate to risk in other regions and necessitates more comprehensive risk assessment internationally. As such, AISIs should exchange information on precursory capabilities with other AISIs, relying on the existing frameworks on international classified exchanges and applying lessons learned from other regulated high-risk sectors.
Computers and Society, Artificial Intelligence
What problem does this paper attempt to address?
The paper addresses the insufficiency and ambiguity of current pre-deployment information-sharing mechanisms for high-impact capabilities. Specifically, the authors point out the following key issues:
1. **Unclear information-sharing content**: Existing legal and voluntary frameworks do not specify in detail what information developers need to share during the pre-deployment stage. For example, mandatory disclosure requirements (such as Executive Order 14110 in the United States and the AI Act in the EU) and voluntary commitments (such as the Voluntary AI Commitments and the Canadian Voluntary Code of Conduct) all fail to define clearly which "dangerous capabilities" should be shared.
2. **Unclear information-sharing timeline**: Neither the mandatory nor the voluntary frameworks clearly stipulate when information should be shared. As a result, information is often disclosed only after the model has been deployed, by which time the best opportunity for risk assessment and mitigation has passed.
3. **Information shared too late**: Even where a framework does specify a time point for sharing, that point usually falls within a short period (such as two weeks) after a high-impact capability has been discovered, which is still too late to deal with potential risks effectively.
4. **Lack of uniformity and coordination**: AI development companies differ in their preparedness and in how they define the thresholds that trigger attention. This leads to differences in internal policies (such as Responsible Scaling Policies, RSPs) and in system card content, including in threat modeling and information sharing.
To solve these problems, the paper proposes a taxonomy of "dangerous capability zones" and makes four specific recommendations for specifying information sharing under Commitment VII of the Frontier AI Safety Commitments (FAISC):
1. **Link early disclosure to early detection**: Precursory capabilities should be shared as soon as they become known through internal evaluations, before deployment.
2. **Designate AI Safety Institutes (AISIs) as information receivers and coordinators**: AISIs should be the trusted actors appointed to receive and coordinate information on precursory components.
3. **Ensure information security in proportion to risk**: AISIs should establish adequate information protection infrastructure and increase information security as precursory capabilities move through the zones toward red lines, if necessary by classifying the information or marking it as controlled.
4. **Exchange information between AISIs internationally**: AISIs should exchange information on precursory capabilities with other AISIs, relying on existing frameworks for international classified exchanges and applying lessons learned from other regulated high-risk sectors.

Through these recommendations, the paper aims to promote earlier, more transparent, and more actionable information sharing on high-impact capabilities, so that the potential risks of AI technology can be better managed and mitigated.