At the 2023 Defcon hacker conference in Las Vegas, AI tech companies partnered with algorithmic integrity groups to probe generative AI platforms for weaknesses. The exercise, backed by the US government, aimed to shed light on these increasingly influential yet opaque systems. Now Humane Intelligence, a nonprofit focused on ethical AI assessment, is taking the concept a step further.
Humane Intelligence recently announced a call for participation in a red-teaming effort run with the US National Institute of Standards and Technology (NIST). The initiative invites US residents to take part in a nationwide evaluation of AI office productivity software. The qualifying round, part of NIST's AI challenge series Assessing Risks and Impacts of AI (ARIA), will be open to both developers and the general public.
Expanding Capabilities for Evaluation
The goal of the red-teaming effort is to expand testing of the security, resilience, and ethics of generative AI technologies. According to Theo Skeadas, chief of staff at Humane Intelligence, the average person using these models has little ability to determine whether they are fit for purpose. The initiative seeks to democratize the evaluation process and empower users to judge for themselves whether AI models meet their needs.
The culminating event, held at the Conference on Applied Machine Learning in Information Security (CAMLIS), will split participants into a red team that attacks the AI systems and a blue team that defends them. The red team's findings will be measured against NIST's AI 600-1 profile, a component of the agency's AI Risk Management Framework.
Enhancing Evaluation Processes
Rumman Chowdhury, founder of Humane Intelligence and a contractor in NIST's Office of Emerging Technologies, says the ARIA initiative is key to understanding how AI models behave in real-world applications. The partnership with NIST marks a significant step toward rigorous scientific evaluation of generative AI technologies.
Chowdhury and Skeadas hint at upcoming AI red team collaborations with various government agencies, international bodies, and NGOs. The overarching objective is to encourage transparency and accountability in the development of AI algorithms. Mechanisms like “bias bounty challenges” will incentivize individuals to identify issues and inequalities in AI models.
Skeadas emphasizes that AI evaluation should involve a broad spectrum of stakeholders beyond just programmers. Policymakers, journalists, civil society representatives, and non-technical individuals all have a role to play in testing and evaluating AI systems. By democratizing the assessment process, this initiative aims to foster a more inclusive and transparent AI landscape.