AI Security Digest — May 28, 2026
This digest covers advanced LLM security threats, including dynamic inference-time exploits, structural prompt injection, and loader-level defenses against shared object hijacking.
16 articles in this topic.
This topic page curates research-focused writing on Data Poisoning, with an emphasis on practical security implications, reproducible observations, and implementation-aware takeaways. Instead of isolated summaries, the collection is organized to help you connect attack techniques, defensive controls, and evaluation criteria across multiple papers and project write-ups.
Across 16 articles, this cluster highlights how Data Poisoning appears in real workflows and where teams commonly miss risk boundaries. The coverage spans news digests, trend reports, research papers, projects, and paper reviews, and connects this theme with adjacent areas such as LLM Security, Adversarial ML, and Agent Security, so you can move from conceptual understanding to deployable engineering decisions.
This page is maintained as a high-signal index for Data Poisoning. Use it to follow newer articles first, then branch into adjacent topics and defensive patterns that repeatedly appear across projects and paper reviews.
AI exploitation is accelerating, demanding a shift to real-time verification. This digest covers malware poisoning, semantic validation of PE tools, and agentic AI attack vectors.
The dominant theme this week is the collapse of static, post-hoc alignment defenses under the pressure of dynamic, meta-optimizing exploit engines and the subsequent shift toward native, data-free mod
This week, the AI security research community signaled a decisive pivot from static, prompt-response safety paradigms to the volatile, high-stakes realm of agentic autonomy and complex system integrat
The unifying theme of this week's AI security landscape is the critical transition from superficial, syntax-level filtering to deep, state-aware behavioral defenses across both agentic workflows and s
As autonomous agentic systems and multi-modal models increasingly bypass static guardrails, the core paradigm of AI security is shifting from superficial post-hoc input/output filtering to deep, execu
The current AI security landscape is defined by a critical architectural shift: as autonomous agent ecosystems transition from stateless chat interfaces to persistent, multi-tool environments, the tra
The dominant theme in today's landscape is the operational shift toward real-time, inference-stage intervention over destructive weight-modification, manifesting in both AI safety steering and highly
The primary security trajectory this week marks a decisive transition away from localized prompt injection toward systemic, stateful exploitation of autonomous, multi-agent architectures. As artificia
The dominant theme in AI security is the operational crisis emerging from the rapid transition of large language models (LLMs) from passive information-retrieval engines to active, high-privileged age
A comprehensive survey of security vulnerabilities in RAG systems, classifying adversarial attacks by component—data poisoning, retrieval poisoning, and prompt manipulation—and examining emerging defense strategies.
An interactive journey through the fundamentals of Retrieval-Augmented Generation, its security vulnerabilities, and state-of-the-art defense mechanisms.
An analysis of MOEVIL, a novel attack that poisons individual experts in FrankenMoE systems to bypass safety alignment, achieving up to 79% attack success while maintaining benign task performance through DPO-based poisoning and latent vector manipulation.
An analysis of GASLITE, a novel attack that poisons dense embedding-based retrieval systems by crafting adversarial passages that appear in top-k results for targeted queries, achieving up to 100% success with minimal corpus contamination.
A comprehensive analysis of how malicious instructions can be embedded in customized LLMs to create backdoors that activate on specific triggers, without requiring any model fine-tuning.
An analysis of neural phishing attacks that teach LLMs to memorize and leak private information by inserting benign-appearing poison data during pretraining, achieving up to 90% secret extraction rates.
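For readers new to the retrieval-poisoning entries above (the RAG survey and the GASLITE analysis), the following is a minimal toy sketch of why a single inserted passage can displace benign results in top-k dense retrieval. It is not the GASLITE method: that attack crafts adversarial passage text, whereas this sketch simply places a vector near the query. All embeddings, passage ids, and the helper names `cosine` and `top_k` are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, corpus, k):
    """Return the ids of the k passages most similar to query_vec."""
    ranked = sorted(
        corpus,
        key=lambda pid: cosine(query_vec, corpus[pid]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 3-dimensional embeddings standing in for a real encoder's output.
corpus = {
    "benign_1": [0.9, 0.1, 0.0],
    "benign_2": [0.7, 0.3, 0.1],
    "benign_3": [0.1, 0.9, 0.2],
}
query = [0.6, 0.2, 0.8]

# Retrieval over the clean corpus.
before = top_k(query, corpus, k=2)

# Assumption: the attacker can add one passage whose embedding lies close to
# the targeted query. Here the vector is set by hand rather than optimized.
poisoned = dict(corpus)
poisoned["adversarial"] = [0.58, 0.22, 0.79]
after = top_k(query, poisoned, k=2)

print("top-2 before poisoning:", before)
print("top-2 after poisoning: ", after)
```

In this toy setup one added passage out of four takes the top rank for the targeted query, which is the contamination-versus-visibility trade-off that retrieval-poisoning work measures at scale.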