Back to the periodic table
87fr-87
Bd

Backdoor

Severity8/10

Hidden Backdoors

Hidden malicious triggers inserted into models that activate dangerous or unauthorized behaviors only under specific conditions.

Periodic recordSecurityarXiv2024

Khondoker Murad Hossain, Tim Oates

Mitigation Strategy

Rigorous training data cleaning and sanitization, detection of backdoors via activation analysis, and Neural Cleanse and Fine-Pruning techniques.

Atomic Number

87

Bd

Risk ID

fr-87

Severity

8/10

Severity Level

87
Critical Risk
Security
fr-87
Bd

Backdoor

Hidden Backdoors

RiesgosIA.org
Security • #87

Hidden Backdoors

Bd
Severity Level8/10

Definition

Hidden malicious triggers inserted into models that activate dangerous or unauthorized behaviors only under specific conditions.

Mitigation Strategy

Rigorous training data cleaning and sanitization, detection of backdoors via activation analysis, and Neural Cleanse and Fine-Pruning techniques.

Notes / Observations

1.
2.
3.
4.
5.
RiesgosIA.org • Periodic Table of AI RisksRiesgosIA.org