Our members work together to understand the landscape of AI trust and safety risks, as well as other applications of AI system evaluation. We identify use cases specific to various important domains (finance, healthcare, education, and others). Together with external collaborators, we build tools, refine methods, and create benchmarks for detecting and mitigating those risks and for performing other kinds of evaluation. We also help educate the public about responsible AI and the developer community about responsible model and application development.
A major challenge for the successful use of AI is understanding potential trust and safety issues, along with their mitigation strategies. Failure to consider these issues can affect an organization's operations and the experience of its users. Safety concerns are also a driver of current regulatory initiatives. Hence, applications built with AI must be designed and implemented with AI trust and safety in mind.
This guide, written by AI Alliance members and other experts, offers a concise introduction to AI trust and safety concerns, along with recommendations for analyzing and mitigating them.
This guide is a living document that will evolve as we broaden its coverage and incorporate new developments in trust and safety analysis and mitigation.
Much like other software, generative AI (“GenAI”) models and the AI systems that use them need to be trusted and useful to their users. Evaluation is the key.
Evaluation provides the evidence needed to gain users’ trust in models and systems. More specifically, evaluation is the ability to measure and quantify how a model or system responds to inputs. Are the responses within acceptable bounds, for example free of hate speech and hallucinations? Are they useful to users? Are they cost-effective?
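To make this concrete, here is a minimal sketch (in Python) of what such an evaluation loop might look like. Everything in it is an assumption for illustration only: `generate` stands in for a real model call, and the keyword check stands in for a trained safety classifier. Real evaluations use curated datasets, benchmark harnesses, and far more sophisticated scoring.

```python
# A toy evaluation harness: run a model over test prompts and score each
# response against a simple acceptance check. Illustrative only; `generate`
# is a placeholder for an actual GenAI model call, and the blocked-term
# check is a placeholder for a real hate-speech or hallucination detector.

from typing import Callable

def generate(prompt: str) -> str:
    """Placeholder for a call to an actual GenAI model."""
    return "This is a sample response to: " + prompt

# Stand-in for a real safety classifier (hypothetical term list).
BLOCKED_TERMS = {"example_blocked_term"}

def is_acceptable(response: str) -> bool:
    """Toy check: flag responses containing blocked terms."""
    return not any(term in response.lower() for term in BLOCKED_TERMS)

def evaluate(prompts: list[str], model: Callable[[str], str]) -> dict:
    """Measure how often the model's responses stay within acceptable bounds."""
    results = [is_acceptable(model(p)) for p in prompts]
    return {
        "total": len(results),
        "acceptable": sum(results),
        "acceptable_rate": sum(results) / len(results) if results else 0.0,
    }

if __name__ == "__main__":
    test_prompts = ["Summarize this news article.", "Explain loan amortization."]
    print(evaluate(test_prompts, generate))
```

The same loop structure generalizes to other dimensions named above, such as usefulness or cost, by swapping in a different scoring function.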
The Trust and Safety Evaluations project fills gaps in the current landscape: a taxonomy of the different kinds of evaluation, tools for creating and running evaluations, and leaderboards that address particular categories of user needs.
The goal of this Alliance project is to explore the highest-priority safety concerns in specific key domains, starting with finance, healthcare, education, and legal, and expanding to additional domains later.
We are gathering domain experts in these areas to clarify use cases and how they want to use AI to accomplish their goals. From these use cases, we are identifying the most important safety concerns.
We will publish our findings in the "living" guide linked below, which will evolve over time as our research matures. We will also collaborate with other Alliance work groups to ensure that tools are available for users to act on these findings.