Limited Causal Reasoning
Causal reasoning makes inferences about the relationships between events or states of the world, mostly by identifying cause-effect relationships.
ENTITY
2 - AI
INTENT
2 - Unintentional
TIMING
2 - Post-deployment
Risk ID
mit500
Domain lineage
7. AI System Safety, Failures, & Limitations
7.3 > Lack of capability or robustness
Mitigation strategy
1. Implement supervised fine-tuning (SFT) frameworks that mandate explicit causal structure modeling, such as training the Large Language Model (LLM) to construct and reason over variable-level Directed Acyclic Graphs (DAGs). This methodology instills a more formal, structural understanding of cause-and-effect relationships, which is essential to move beyond Level-1 associative pattern recognition and mitigate logically inconsistent outputs.
2. Augment the LLM's causal reasoning capabilities by integrating them with established, data-driven causal inference algorithms and domain-specific knowledge bases via Retrieval-Augmented Generation (RAG). This strategy leverages the LLM as a proxy for human domain knowledge, providing necessary background context to traditional methods and grounding the LLM's inferences with verifiable statistical evidence from observational data.
3. Prioritize the development and continuous application of robust, counterfactual-rich causal reasoning benchmarks (e.g., CausalProbe-2024) that contain fresh, unseen corpora. This rigorous evaluation methodology is critical for assessing the model's capacity for genuine, higher-level causal reasoning and for identifying and mitigating issues such as prompt dependency and inconsistent responses.
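As an illustration of the variable-level DAG mentioned in strategy 1, the following is a minimal sketch of how a causal graph might be represented and checked before a model reasons over it. The variable names (`genotype`, `smoking`, `tar`, `cancer`) and the helper functions `is_acyclic` and `ancestors` are hypothetical and chosen only for this example; they do not come from any specific SFT framework.

```python
from graphlib import TopologicalSorter, CycleError

def is_acyclic(parents):
    """Return True if the parent map describes a valid DAG (no cycles)."""
    try:
        # static_order() raises CycleError when the graph contains a cycle.
        tuple(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

def ancestors(parents, node):
    """Collect every variable with a directed path into `node`."""
    seen = set()
    stack = list(parents.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen

# Hypothetical toy graph: each key maps a variable to its direct causes.
dag = {
    "genotype": set(),
    "smoking": {"genotype"},
    "tar": {"smoking"},
    "cancer": {"tar", "genotype"},
}

print(is_acyclic(dag))                   # structural validity check
print(sorted(ancestors(dag, "cancer")))  # candidate causes of the outcome
```

A structure like this gives a training or evaluation pipeline something concrete to verify: a model-proposed graph can be rejected if it contains a cycle, and a claimed cause can be checked against the ancestors of the outcome.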
ADDITIONAL EVIDENCE
Although GPT-4 can be quite accurate at inferring necessary causes, its accuracy at inferring sufficient causes is much lower. The authors conjecture that this is because inferring the sufficient causes of an event requires the LLM to answer a large set of counterfactual questions. Specifically, the LLM needs to consider all possible counterfactual scenarios in which every event other than the outcome and the candidate sufficient cause is removed or replaced.
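The asymmetry described above can be sketched with a toy Boolean model. This is a simplified illustration under assumed definitions, not the procedure used in the cited evaluation: necessity is treated as a single "but-for" counterfactual, while sufficiency is treated as the outcome holding under every combination of the other events. The function names and the `fire`/`alarm` scenarios are hypothetical.

```python
from itertools import product

def is_necessary(outcome, actual, candidate):
    """But-for test: one counterfactual, with only the candidate removed."""
    counterfactual = {**actual, candidate: False}
    return outcome(actual) and not outcome(counterfactual)

def is_sufficient(outcome, events, candidate):
    """Check the outcome in every world where the candidate is present.

    This enumerates 2 ** (len(events) - 1) counterfactual scenarios,
    which is why the check grows expensive as events are added.
    """
    others = [e for e in events if e != candidate]
    for values in product([False, True], repeat=len(others)):
        world = dict(zip(others, values))
        world[candidate] = True
        if not outcome(world):
            return False
    return True

# Hypothetical scenario 1: fire needs both a lit match and oxygen.
fire = lambda w: w["match"] and w["oxygen"]
print(is_necessary(fire, {"match": True, "oxygen": True}, "match"))
print(is_sufficient(fire, ["match", "oxygen"], "match"))

# Hypothetical scenario 2: an alarm triggered by smoke or a button.
alarm = lambda w: w["smoke"] or w["button"]
print(is_sufficient(alarm, ["smoke", "button"], "smoke"))
```

The necessity check evaluates one alternative world, whereas the sufficiency check must cover every setting of the remaining events, which mirrors the conjectured reason for the lower accuracy on sufficient-cause inference.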