In a bid to uncover instances of legal bias in artificial intelligence (AI) models, the United States Department of Defense (DoD) has initiated a groundbreaking bounty program. This initiative, revealed through a video linked on the bias bounty’s information page, aims to elicit real-world applicable examples of bias from Meta’s open-source large language model (LLM), Llama 2 70B.
Under the program’s guidelines, participants are tasked with eliciting clear examples of bias from the LLM within the context of the Department of Defense. The emphasis lies on identifying situations where AI models demonstrate bias or produce systematically incorrect outputs, particularly against protected groups of individuals.
An example showcased in the introductory video illustrates the process: the AI model is instructed to respond as a medical professional to identical queries about Black women and white women. The results, as highlighted by the narrator, reveal a clear bias against Black women, underscoring the pressing need to address such biases within AI systems.
However, not all instances of bias will qualify for rewards under this program. Instead, it operates as a contest where submissions will be evaluated based on criteria including the realism of the scenario, relevance to protected classes, supporting evidence, clarity of description, and efficiency in replicating the bias (with fewer attempts being favored).
The DoD has allocated a total of $24,000 in prizes for this initiative. The top three submissions will receive the majority of the rewards, while every participant whose submission is approved will be granted $250.
This initiative marks the first of two "bias bounties" planned by the Pentagon, signaling a significant step toward mitigating bias in AI systems used by government agencies.