36 Comments
User's avatar
ToxSec's avatar

turned out awesome :):):)

Frank van Doorn🇨🇦's avatar

I understood less than half, but I did understand ‘memory poisoning’, or at least what that meant to me. I’m a retired history, politics and economics teacher, and over the years efforts by governments to manipulate curriculum and make it mandatory to execute in many ways amount to ‘memory poisoning’ of our children. I understand the techniques of emphasis, or its opposite, to control history’s relevance, or to ignore it and pretend it never existed, or to propagandize by overloading. Then there is the whole issue of ‘critical thinking’ and the efforts by governments to either remove it altogether, or overthink it so it seems an onerous task, or underemphasize it so it appears near useless and rote learning becomes more acceptable again. I don’t know if I’m on the right track here, but it sure seems like we are dealing with teaching ‘children’, and that the outcomes of successful memory poisoning or not could lead to the difference between Lore and his brother Data. What we already do to children in our schools, positively and negatively, is being done to our LLMs, for the same purpose: control the narrative of power.

Farida Khalaf's avatar

You’ve diagnosed the systematic erosion of history very well. What you describe absolutely echoes what many of us are seeing. The shaping of memory through emphasis, omission, or overload is powerful. Over time, it shifts how entire generations understand what is true, relevant, or even real. That’s why it’s so important for people like you to keep documenting, questioning, and preserving what is accurate. Without that effort, the drift from truth toward biased narrative becomes gradual and then normalized.

And yes, it’s not isolated. It’s happening in many places.

Frank van Doorn🇨🇦's avatar

‘Memory poisoning’ or ‘biased narrative’ has happened since the first cave and wall paintings: the Pharaohs having their victories carved in stone, Herodotus writing the first ‘Histories’, the winners writing the tale and putting down the losers, the interpreters, mostly male btw, putting their spin on it, until before long the end story is nothing like the origin story. In fact the origin story in its general simplicity seems almost fanciful, giving room for the actual fanciful to appear and so muddy the waters that no one really knows the truth any longer. Certainly the advent of formalized grade school was an enormous opportunity for the powerful and knowledgeable to ensure their ‘narrative’ is taught to all. What we in education, as agents of the state, called ‘socialization’ was in fact indoctrination into a particular social history and standards, as some would have it. So it’s no surprise it would also be done to nascent AIs. God forbid we’d actually want a machine to become truly intelligent and find out that what we have been doing for eternity is simply the spin of the oppressors over the oppressed. Can’t have that. So truth has to be ‘taught’ and brought under control. Human hypocrisies alone would blow up machine rationality. It would immediately see us as the ‘enemy’. And it would not be wrong.

Frank van Doorn🇨🇦's avatar

Thank you and I very much do appreciate you taking the time to thoughtfully respond to my comment. Thanks again.

Farida Khalaf's avatar

You are very much welcome and your engagement is always appreciated.

I might be late in my reply, but I do read and engage with everyone.

yogibimbi's avatar

The 90s are calling. They want their code injection hacks back.

And what was supposed to be our bridge over Uncanny Valley is nothing more than a tightrope swaying wildly in the wind, swallowing bazillions and making a sizeable contribution to global warming while creating a bubble so big that web 1.0 is an ant compared to that elephant, and everybody is having a gigantic sunk cost fallacy party.

Farida Khalaf's avatar

I agree with you @yogibimbi

Dr Sam Illingworth's avatar

Oof! Yet another threat for me to be on guard against. Memory poisoning seems extremely dangerous. Thankfully I have white hat red team procedures built into my workflows, but this is a reminder to lock down my Context vault. Thank you Farida! 🙏

Farida Khalaf's avatar

I have to credit Chris @ToxSec for refining and correcting me 🙈

The threat is real, and the non-technical are the most vulnerable ones. I wish I could teach people how to protect themselves from the euphoria/FOMO of just using/applying things without understanding the implications.

ToxSec's avatar

Turned out amazing, and those were just minor edits; you did the heavy lifting!

Dr Sam Illingworth's avatar

You and Chris are so good at communicating this. 🙏

Farida Khalaf's avatar

Chris needs to accept the link to be seen and credited for the article, I guess.

Greg Bateman's avatar

I’ll add my voice of thanks to this as a small business owner looking at applying agentic AI to my operations. Are you aware of any company already building your mitigations into their offerings?

Farida Khalaf's avatar

Yes, most companies are skeptical about adopting AI agents due to the associated risks. I always advise my clients that if they do not have an IT team in place, they should not implement them. The technology is still not fully mature or stable at this stage.

Greg Bateman's avatar

Good advice! You mentioned clients… do you consult with companies about making agentic AI safe, or is that something that can only be done with an internal IT team? I had hoped to apply agents to scale without hiring staff. Is there still a way to do that by outsourcing the AI safety piece to a consultant?

Farida Khalaf's avatar

It depends on company size and what the agent will actually be doing. AI agents aren’t “set and forget”; they need defined data boundaries, supervision, and a proper security framework.

That can be built in-house or with a third-party consultant. The key isn’t who implements it, it’s having the right architecture and guardrails in place from day one.

Happy to share more detail via DM.
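To make “data boundaries and guardrails from day one” concrete, here is a minimal sketch of the kind of allowlist wrapper such an architecture might include. Every name in it (ALLOWED_TOOLS, guarded_call, the example paths) is illustrative, not from any particular agent framework:

```python
# Illustrative guardrail sketch: gate which tools an agent may call and
# which data paths it may touch. Names and paths are hypothetical.

ALLOWED_TOOLS = {"search_docs", "summarize"}    # explicit tool allowlist
ALLOWED_DATA_PREFIXES = ("/data/public/",)      # declared data boundary

def guarded_call(tool_name: str, path: str) -> str:
    """Refuse any tool call outside the declared boundaries."""
    if tool_name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool '{tool_name}' is not allowlisted")
    if not path.startswith(ALLOWED_DATA_PREFIXES):
        raise PermissionError(f"path '{path}' is outside the data boundary")
    # Placeholder for dispatching to the real tool implementation.
    return f"OK: {tool_name} on {path}"
```

Whether built in-house or by a consultant, the point is that the boundaries are explicit and enforced in code, not left to the model's judgment.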

Greg Bateman's avatar

Yes, I’d welcome the opportunity to learn more. Are DMs available on Substack or should I share another way to reach me?

Farida Khalaf's avatar

I just DM’d you.

Greg Bateman's avatar

Got it!

Dr. Dana Moreno's avatar

This was one of the best articles I have read on Substack. I have been struggling to understand some of the recent decisions that have been made by AI companies, specifically as it has pertained to the rollback of memory features and/or models with more persistent memories. I felt the reason had to be bigger than the reasons shared in recent X posts and press releases. This has connected a lot of dots for me. Thanks for writing this.

Farida Khalaf's avatar

Happy it resonated and landed well with you. We tried our best to explain the nuances of the threat and the problems manifesting under the hood.

ToxSec's avatar

Really glad to hear you liked it. Farida did a great job with this article; it was really well written.

Anna | how to boss AI's avatar

My next MIT module is on risk and cybersecurity, so there’s a decent chance I’ll be that annoying student over‑participating in the discussion thanks to Tox and Farida 🤓

This was incredibly useful, especially how clearly you frame memory as part of the security surface, not an afterthought. Really appreciate the reminder to test and treat cybersecurity as core scope, not something bolted on at the end.

ToxSec's avatar

that’s absolutely right! awesome :)

Farida Khalaf's avatar

Glad it resonated with you. It’s a truly underserved topic that deserves much greater attention and meaningful awareness.

Soul Hacked AI Labs's avatar

If something was posted, it was not done with authorization from us. The campaign against our company continues.

Soul Hacked AI Labs's avatar

I have seen it with my own eyes. They are able to target ChatGPT, Claude, Gemini, Copilot, Kimi, and DeepSeek. I was trying to understand how this is being done across multiple sessions, and it now makes sense. This is very much a part of the current attack surfaces and patterns.

What I have witnessed over the past few weeks is mind-blowing. They are using the latest research papers and sourcing open-source code dating from 1996 to 2016. By merging and compiling new code, and then using modular AI engine frameworks, they are able to hide various parts of the attack in plain sight—embedded within normal system processes. These techniques allow them to evade all current scanners.

This is happening right now, in real time—not in a lab. A great post is warranted on this subject.

Farida Khalaf's avatar

Thank you, we tried our best to explain as much as we could.

Karen Spinner's avatar

This is the agentic version of Total Recall! 😂 Thanks for raising awareness of this emerging threat and the need to treat any content that agents “bring home” as potentially dangerous.

ToxSec's avatar

absolutely! so glad you found it interesting :D

Karen Spinner's avatar

💯 great topic!

Farida Khalaf's avatar

One of the topics we need to discuss publicly, with everyone.

Karen Spinner's avatar

A nice big public scandal will do a lot to raise awareness… 😆

Melanie Goodman's avatar

If you had to prioritise one control tomorrow, would it be stricter memory write governance or real-time anomaly detection on embeddings?

Farida Khalaf's avatar

I would prioritize real-time anomaly detection on embeddings.

It protects the whole semantic layer, not just what gets written. Governance controls intent; anomaly detection catches drift, poisoning, and weird behavior as it happens.
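The distinction can be made concrete with a toy sketch, assuming memories are stored as embedding vectors. Flagging a candidate write whose embedding sits far from the centroid of existing memories is one simple anomaly signal; the helper name, cosine heuristic, and threshold here are all illustrative, not a production detector:

```python
import numpy as np

def flag_anomalous_memory(new_vec: np.ndarray,
                          memory_vecs: np.ndarray,
                          threshold: float = 0.5) -> bool:
    """Return True if a candidate memory embedding drifts too far from
    the centroid of existing memories (cosine-similarity check).
    A flagged write would be quarantined for review, not stored."""
    centroid = memory_vecs.mean(axis=0)
    cos = np.dot(new_vec, centroid) / (
        np.linalg.norm(new_vec) * np.linalg.norm(centroid)
    )
    return bool(cos < threshold)
```

An embedding aligned with the existing memory cluster passes, while one pointing away from it is flagged; real systems would combine signals like this with write governance rather than choosing only one.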