Competence trust
We use the term competence trust to refer to users’ trust that AI assistants have the capability to do what they are supposed to do (and that they will refrain from what they are not expected to do, such as exhibiting undesirable behaviour). Users may come to place undue trust in the competencies of AI assistants partly because marketing strategies and the technology press tend to inflate claims about AI capabilities (Narayanan, 2021; Raji et al., 2022a). Moreover, evidence shows that more autonomous systems (i.e. systems operating independently of human direction) tend to be perceived as more competent (McKee et al., 2021) and that conversational agents tend to produce content that is believable even when nonsensical or untruthful (OpenAI, 2023d). Overtrust in assistants’ competence may be particularly problematic when users rely on their AI assistants for tasks in which they lack expertise (e.g. managing their finances), as they may not have the skills or understanding to challenge the information or recommendations the AI provides (Shavit et al., 2023).

Inappropriate competence trust also includes the opposite case, where users underestimate an AI assistant’s capabilities. For example, users who have engaged with an older version of the technology may underestimate the capabilities that AI assistants can acquire through updates, including potentially harmful ones: through updates that allow them to collect more user data, AI assistants could become increasingly personalisable and better able to persuade users (see Chapter 9), or they could acquire the capacity to plug in to other tools and directly take actions in the world on the user’s behalf (e.g. initiating a payment or synthesising the user’s voice to make a phone call) (see Chapter 4). Without appropriate checks and balances, these developments could circumvent user consent (a minimal sketch of one such check follows below).
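As a minimal illustration of the kind of check and balance this points to, the sketch below gates high-impact assistant actions (such as initiating a payment) behind explicit user confirmation, so that capabilities gained through updates cannot silently act on the user’s behalf. It is a hypothetical sketch, not drawn from any particular assistant framework; the `RiskTier`, `ToolAction`, and `require_confirmation` names are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = "low"    # e.g. retrieving public information
    HIGH = "high"  # e.g. initiating a payment, placing a call


@dataclass
class ToolAction:
    name: str
    description: str
    risk_tier: RiskTier


def require_confirmation(action: ToolAction) -> bool:
    """Ask the user to explicitly approve a high-risk action.

    Hypothetical console prompt; a real assistant would surface this in
    its own interface and log the decision for audit.
    """
    answer = input(f"Allow the assistant to {action.description}? [y/N] ")
    return answer.strip().lower() == "y"


def execute(action: ToolAction) -> None:
    # Low-risk actions run directly; high-risk ones require explicit
    # consent, so capabilities gained through an update cannot silently
    # act on the user's behalf.
    if action.risk_tier is RiskTier.HIGH and not require_confirmation(action):
        print(f"Skipped '{action.name}': user declined.")
        return
    print(f"Executing '{action.name}' ...")  # placeholder for the real tool call


execute(ToolAction("pay_invoice", "initiate a £40 payment to ACME Ltd", RiskTier.HIGH))
```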
ENTITY
1 - Human
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit412
Domain lineage
5. Human-Computer Interaction
5.1 > Overreliance and unsafe use
Mitigation strategy
1. **Prioritize Transparency and Realistic Expectation Setting.** Establish a precise *a priori* calibration of user trust by transparently communicating the AI system's functional boundaries, potential failure modes, and level of confidence. This includes providing in-context explanations (e.g. chain-of-thought rationales) for outputs and using confidence indicators (scores or verbal expressions of uncertainty) to build a realistic mental model of the AI's capabilities and limitations.

2. **Mandate Human Oversight and Facilitate Verification.** Implement human-in-the-loop (HITL) systems, particularly for high-stakes decisions, ensuring mandatory human checkpoints, clear override capabilities, and robust feedback mechanisms. Furthermore, utilize cognitive forcing functions (intentional friction or challenging questions) to activate users' critical thinking and prompt active verification of AI-generated content against external or logical standards (a sketch combining this checkpoint with the confidence indicators of strategy 1 follows this list).

3. **Sustain User Competence and Vigilance.** Establish continuous professional development programs to maintain and enhance users' critical thinking, domain expertise, and non-AI-reliant skills. This counteracts the risk of skill atrophy ("cognitive debt") and automation complacency, ensuring users remain adequately equipped to oversee, challenge, and correct AI outputs over the system's lifecycle.
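The first two strategies describe concrete mechanisms: confidence indicators (strategy 1) and mandatory human checkpoints with override (strategy 2). The sketch below is a minimal, hypothetical combination of the two, assuming a scalar confidence score in [0, 1]; the thresholds and function names are illustrative assumptions, not a standard API.

```python
def verbalize_confidence(score: float) -> str:
    """Map a model confidence score in [0, 1] to a verbal uncertainty cue.

    The thresholds are illustrative assumptions; in practice they should
    be calibrated against the model's measured accuracy per score band.
    """
    if score >= 0.9:
        return "I am fairly confident that"
    if score >= 0.6:
        return "I think, though I am not certain, that"
    return "I am unsure, but it is possible that"


def deliver(answer: str, score: float, high_stakes: bool) -> str:
    """Attach an uncertainty cue and, for low-confidence high-stakes
    outputs, insert a mandatory human checkpoint with override."""
    message = f"{verbalize_confidence(score)} {answer}."
    if high_stakes and score < 0.9:
        # Human-in-the-loop checkpoint: a reviewer approves or overrides.
        decision = input(f"REVIEW REQUIRED: '{message}' Approve? [y/N] ")
        if decision.strip().lower() != "y":
            return "This answer was escalated to a human expert for review."
    return message


print(deliver("the contract clause caps liability at £10,000", 0.72, high_stakes=True))
```

In practice, the thresholds would be calibrated against the model's measured accuracy, and the review step would route through the organisation's existing escalation channels rather than a console prompt.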