OpenAI Strengthens Safety Measures with New Preparedness Team

OpenAI has introduced a dedicated team focused on safety assessments for its models. The move follows the dismissal and subsequent reappointment of CEO Sam Altman amid concerns raised by Chief Scientist Ilya Sutskever about the company's rapid commercialization without adequate safeguards against potential risks.

The newly formed Preparedness team will play a crucial role in evaluating and stress-testing OpenAI's foundation models. Reports generated by this team will be shared with both OpenAI's leadership and the expanded board of directors, which gained significant decision-making authority following the November reshuffle.

OpenAI's leadership will assess the reports to determine whether to proceed with a system's development or deployment, while the board now has the authority to reverse those decisions. This organizational shift comes in the wake of the firing-and-rehiring episode, with the board set to expand to nine members, including a non-voting observer seat held by Microsoft.

In a blog post, OpenAI emphasized the importance of the Preparedness team's technical work in informing decisions about safe model development and deployment. The company reiterates that its fiduciary duty is to humanity, with a commitment to conducting the research needed to ensure the safety of artificial general intelligence (AGI).

OpenAI's newly introduced Preparedness Framework aims to leverage lessons from deployments to mitigate emerging risks. Under this framework, the Preparedness Team will conduct regular safety drills, ensuring swift responses to any issues that may arise. Independent third parties will also be engaged to conduct audits, enhancing the transparency and reliability of OpenAI's safety measures.

Crucially, OpenAI has adopted a continuous evaluation approach for its models, re-running safety assessments at each doubling of effective compute during training runs. The tests will cover areas such as cybersecurity, persuasion, model autonomy, and potential misuse for creating chemical, biological, or nuclear threats.
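
As a rough illustration of how such a compute-based trigger might work, the sketch below checks for new effective-compute doublings during a training run and kicks off the listed test categories at each one. The function and category names are hypothetical placeholders for illustration, not OpenAI's actual tooling.

```python
import math

# Illustrative risk categories named in the announcement; the evaluation
# calls themselves are placeholders.
RISK_CATEGORIES = ["cybersecurity", "persuasion", "model autonomy", "CBRN"]

def doublings(effective_compute: float, baseline: float) -> int:
    """How many times effective compute has doubled relative to the baseline."""
    return int(math.log2(effective_compute / baseline))

def maybe_run_safety_evals(effective_compute: float,
                           baseline: float,
                           last_evaluated_doubling: int) -> int:
    """Run the evaluation suite whenever a new doubling milestone is reached.

    Returns the current doubling count so the caller can track when the
    next round of evaluations is due.
    """
    current = doublings(effective_compute, baseline)
    if current > last_evaluated_doubling:
        for category in RISK_CATEGORIES:
            # Placeholder: a real evaluation would score the checkpoint
            # against category-specific test suites and report the results.
            print(f"Doubling {current}: evaluating {category}")
    return current

# Example: a checkpoint at 8x the baseline compute triggers the third round.
last = maybe_run_safety_evals(effective_compute=8.0, baseline=1.0,
                              last_evaluated_doubling=2)
```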

Models will be categorized into four safety risk levels based on evaluation results: low, medium, high, and critical. Only models scoring 'medium' or lower will be deemed suitable for deployment, and only those scoring 'high' or lower may undergo further development. Models classified as 'high' or 'critical' will be subject to additional safety measures, an approach that echoes the tiered risk classification in the EU AI Act.
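
To make those gating rules concrete, here is a minimal sketch assuming a simple ordinal encoding of the four risk levels; the class and function names are illustrative rather than OpenAI's implementation.

```python
from enum import IntEnum

class RiskLevel(IntEnum):
    """The four post-evaluation risk levels, ordered from least to most severe."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

def may_deploy(level: RiskLevel) -> bool:
    # Only models scoring 'medium' or lower are eligible for deployment.
    return level <= RiskLevel.MEDIUM

def may_develop_further(level: RiskLevel) -> bool:
    # Models scoring 'high' or lower may continue to be developed.
    return level <= RiskLevel.HIGH

def needs_extra_safeguards(level: RiskLevel) -> bool:
    # 'High' and 'critical' classifications trigger additional safety measures.
    return level >= RiskLevel.HIGH

# Example: a 'high'-risk model may be developed further but not deployed.
assert may_develop_further(RiskLevel.HIGH) and not may_deploy(RiskLevel.HIGH)
```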