Putting New AI Lab Commitments in Context
This article was originally published on the GovAI blog.
On July 21, in response to emerging risks from AI, the Biden administration
announced
a set of voluntary commitments from seven leading AI companies: the established tech giants Amazon, Google, Meta, and Microsoft and the AI labs OpenAI, Anthropic, and Inflection.
In addition to bringing together these major players, the announcement is notable for explicitly targeting frontier models: general-purpose models that the full text of the commitments defines as being “overall more powerful than the current industry frontier.” While the White House has previously made announcements on AI – for example, VP Harris’s meeting with leading lab CEOs in May 2023 – this is one of its most explicit calls for ways to manage these systems.
Below, we summarize the most significant takeaways from the announcement and comment on some notable omissions – for instance, what the announcement does not say about open sourcing models or about principles for model release decisions. While potentially valuable, it remains to be seen whether the commitments will be a building block for, or a blocker to, regulation of AI, including frontier models.
Putting safety first
The voluntary commitments identify safety, security, and trust as top priorities, calling them “three principles that must be fundamental to the future of AI.” The emphasis on safety and security foregrounds the national security implications of frontier models, which often sit alongside other regulatory concerns such as privacy and fairness in documents like the National Institute of Standards and Technology AI Risk Management Framework (NIST AI RMF).
- On safety, the commitments explicitly identify cybersecurity and
biosecurity as priority areas and recommend use of internal and external
red-teaming to anticipate these risks. Senior US cybersecurity
officials have voiced
concern about how malicious actors could use future AI models to plan
cyberattacks or interfere with elections, and in Congress, two
Senators have proposed bipartisan legislation to examine whether advanced AI
systems could facilitate the development of bioweapons and novel pathogens.
- On security, the commitments recognize model weights – the core
intellectual property behind AI systems – as particularly important to
protect. Insider threats are one concern that the commitments
identify. But leading US officials like National Security Agency head Paul
Nakasone and cyberspace ambassador Nathaniel
Fick have also warned that adversaries, such as China, may try to steal
leading AI companies’ models to get ahead.
According to White House advisor Anne Neuberger, the US government has already conducted cybersecurity briefings for leading AI labs to pre-empt these threats. The emphasis on frontier AI model protection in the White House voluntary commitments suggests that AI labs may be open to collaborating further with US agencies, such as the Cybersecurity and Infrastructure Security Agency.
Information sharing and transparency
Another theme running through the announcement is the commitment to more
information sharing between companies and more transparency to the
public. Companies promised to share among themselves best practices for
safety as well as findings on how malicious users could circumvent AI system
safeguards. Companies also promised to publicly release more details on the
capabilities and limitations of their models. The White House’s endorsement of
this information sharing may help to allay concerns researchers have previously
raised about antitrust law potentially limiting cooperation on AI safety and
security, and open the door for greater technical collaboration in the
future.
- Some of the companies have already launched a new industry body to
share best practices, lending weight to the voluntary commitments. Five
days after the White House announcement, Anthropic, Google, Microsoft, and
OpenAI launched the Frontier Model Forum,
“an industry body focused on ensuring safe and responsible development of
frontier AI models.” Among other things, the forum aims to “enable independent,
standardized evaluations of capabilities and safety,” and to identify best
practices for responsible development and deployment.
- However, the new forum is missing three of the seven companies that
agreed to the voluntary commitments – Amazon, Meta, and Inflection – and it is
unclear whether they will join in the future. Nonetheless, these three could
plausibly share information on a more limited or ad hoc basis. How the new forum
will interact with other multilateral and multi-stakeholder initiatives like the
G7 Hiroshima process or the Partnership on AI will also be something to
watch.
- The companies committed to developing technical mechanisms to identify AI-generated audio or visual content, but (apparently) not text. Although the White House announcement refers broadly to helping users “know when content is AI generated,” the detailed statement covers only audio and visual content. From a national security perspective, this means that AI-generated text-based disinformation campaigns could continue to be a concern. While there are technical barriers to watermarking AI-generated text, it is unclear whether these barriers, or political ones, were behind the decision not to discuss watermarking text.
Open sourcing and deployment decisions
Among the most notable omissions from the announcement was the lack of detail on how companies will ultimately decide whether to open source or otherwise deploy their models. On these questions, companies differ substantially in approach; for example, while Meta has chosen to open source some of its most advanced models (i.e., allow users to freely download and modify them), most of the other companies have been more reluctant to open source their models and have sometimes cited concerns about open-source models enabling misuse. Unsurprisingly, the companies have not arrived at a consensus in the announcement.
- For the seven companies, open sourcing remains an open
question. Though the commitments say that AI labs will release AI model
weights “only when intended,” the announcement provides no details on how
decisions around intentional model weight release should be made. This
choice involves a trade-off between openness and security. Advocates of open
sourcing argue that it facilitates accountability and helps crowdsource
safety, while advocates of structured access raise concerns, including about
misuse by malicious actors. (There are also business incentives on both
sides.)
- The commitments also do not explicitly say how results from red-teaming will inform decisions around model deployment. While it is natural to assume that these risk assessments will ultimately inform the decision to deploy or not, the commitments are not explicit about formal processes – for example, whether senior stakeholders must be briefed with red-team results when making a go/no-go decision or whether external experts will be able to red-team versions of the model immediately pre-deployment (as opposed to earlier versions that may change further during the training process).
Conclusion
The voluntary commitments may be an important step toward ensuring that frontier AI models remain safe, secure, and trustworthy. However, they also raise a number of questions and leave many details to be decided. It also remains unclear how forceful voluntary lab commitments will ultimately be without new legislation to back them up.
Acknowledgements
This is a blog post from Rethink Priorities, a think tank dedicated to informing decisions made by high-impact organizations and funders across various cause areas. The authors are Shaun Ee and Joe O’Brien. Thanks to Tim Fist, Tony Barrett, and reviewers at GovAI for feedback and advice.
If you are interested in RP’s work, please visit our research database and subscribe to our newsletter.