
Overhead Attacks

Overhead attacks [146] are also known as energy-latency attacks: an adversary designs carefully crafted sponge examples to maximize the energy consumption of an AI system. Overhead attacks can therefore also threaten platforms integrated with LLMs.

Source: MIT AI Risk Repository (mit48)

ENTITY

1 - Human

INTENT

1 - Intentional

TIMING

3 - Other

Risk ID

mit48

Domain lineage

2. Privacy & Security


2.2 > AI system security vulnerabilities and attacks

Mitigation strategy

1. **Implement Resource Cut-off Thresholds**. Establish a predetermined maximum limit on the total computational resources (e.g., energy consumed or execution time) permitted for a single model inference. Queries that exceed this threshold should be immediately terminated or routed to a fallback mechanism to prevent resource exhaustion and mitigate the denial-of-service (DoS) effect inherent to energy-latency attacks.

2. **Apply Request Rate Limiting and Traffic Segmentation**. Institute granular rate limiting on external and internal API requests to the LLM platform to restrict the volume of inputs from any single source within a set time frame. Additionally, leverage network segmentation to isolate the LLM inference environment, limiting an adversary's ability to sustain an overhead attack and containing potential lateral movement.

3. **Deploy Enhanced Observability and Behavioral Analytics**. Utilize specialized logging and monitoring stacks (such as SIEM/EDR) to detect and flag anomalous runtime behaviors, specifically focusing on non-linear increases in latency or energy consumption relative to input size (e.g., sponge-example characteristics). This real-time behavioral-analytics approach is critical for the early identification of energy-latency attack patterns that aim to subtly increase operational costs or degrade service quality.
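The three mitigations above can be sketched together in code. The following is a minimal, illustrative Python example, not a production implementation: `guarded_infer`, `TokenBucket`, `MAX_SECONDS`, and the `infer` callable are hypothetical names introduced here, and the anomaly check is a deliberately simple latency-per-character heuristic standing in for a full behavioral-analytics pipeline.

```python
import time
from collections import defaultdict

MAX_SECONDS = 2.0  # illustrative per-inference resource cut-off threshold


class TokenBucket:
    """Per-source rate limiter: refills `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = capacity, time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def guarded_infer(source_id, prompt, infer, buckets, history):
    """Run `infer` under (1) rate limiting, (2) a wall-clock cut-off, and
    (3) a simple latency-per-character anomaly check for sponge-like inputs."""
    # Mitigation 2: reject sources that exceed their request budget.
    if not buckets[source_id].allow():
        return {"status": "rate_limited"}

    start = time.monotonic()
    result = infer(prompt)  # in production, enforce a hard timeout around this call
    elapsed = time.monotonic() - start

    # Mitigation 1: terminate responses that blew past the resource budget.
    if elapsed > MAX_SECONDS:
        return {"status": "terminated", "elapsed": elapsed}

    # Mitigation 3: flag queries whose latency per input character far exceeds
    # the running baseline -- sponge examples inflate exactly this ratio.
    ratio = elapsed / max(len(prompt), 1)
    baseline = sum(history) / len(history) if history else ratio
    history.append(ratio)
    flagged = ratio > 10 * baseline
    return {"status": "ok", "result": result, "elapsed": elapsed, "flagged": flagged}
```

A caller would hold one bucket per client, e.g. `buckets = defaultdict(lambda: TokenBucket(rate=5, capacity=5))`; with that configuration a burst of six back-to-back requests from one source sees the sixth rejected as `rate_limited`, while flagged-but-successful responses can be forwarded to a SIEM for correlation.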