Good morning HN! I have been toying with this idea for a while and finally have a working prototype.

This project lets you encode secret messages into ordinary-looking text by using arithmetic coding with a probability model derived from an LLM. The message is first encrypted, and the ciphertext is then "decompressed" by the arithmetic coder as if it were compressed data. The result looks just like randomly sampled output from the LLM, except that the specific choices of tokens actually encode your secret message.
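To make the idea concrete, here is a minimal Python sketch of hiding bits in token choices. It is not the prototype's code and rests on several assumptions: `toy_model` is a made-up stand-in for the LLM, exact fractions replace the fixed-precision arithmetic a practical coder would use, the bit string is assumed to be already encrypted, and both sides are assumed to know its length. The names `hide` and `reveal` are hypothetical.

```python
from fractions import Fraction

VOCAB = ["the", "cat", "dog", "sat", "ran", "home", "."]

def toy_model(context):
    """Hypothetical stand-in for an LLM: next-token probabilities that
    depend only on the previous token. Both sides must share it exactly."""
    seed = len(context[-1]) if context else 0
    weights = [1 + (seed + 3 * i) % 5 for i in range(len(VOCAB))]
    total = sum(weights)
    return [(tok, Fraction(w, total)) for tok, w in zip(VOCAB, weights)]

def hide(bits, model=toy_model):
    """Arithmetic-decode a non-empty bit string into a token sequence."""
    n, k = len(bits), int(bits, 2)
    # Bucket of [0, 1) that corresponds to these n bits, and its midpoint.
    bucket_lo, bucket_hi = Fraction(k, 2**n), Fraction(k + 1, 2**n)
    x = Fraction(2 * k + 1, 2 ** (n + 1))
    lo, hi = Fraction(0), Fraction(1)
    tokens = []
    # Emit tokens until the interval pins down all n bits.
    while not (bucket_lo <= lo and hi <= bucket_hi):
        width, cum = hi - lo, lo
        for tok, p in model(tokens):
            nxt = cum + p * width
            if cum <= x < nxt:  # this token's slice contains the message
                tokens.append(tok)
                lo, hi = cum, nxt
                break
            cum = nxt
    return tokens

def reveal(tokens, n_bits, model=toy_model):
    """Replay the model over the tokens and read the bits back out."""
    lo, hi = Fraction(0), Fraction(1)
    context = []
    for tok in tokens:
        width, cum = hi - lo, lo
        for cand, p in model(context):
            nxt = cum + p * width
            if cand == tok:
                lo, hi = cum, nxt
                break
            cum = nxt
        context.append(tok)
    return format(int(lo * 2**n_bits), f"0{n_bits}b")

if __name__ == "__main__":
    secret = "1011001110001111"
    cover = hide(secret)
    print(" ".join(cover))
    print(reveal(cover, len(secret)) == secret)
```

Each token narrows an interval in proportion to its probability, so likely tokens are chosen more often and carry fewer bits. With a real LLM in place of `toy_model`, the cover text would read like a sample from that model.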

Because the message is protected with authenticated encryption, only someone who knows the key can confirm that a message is present. To everyone else, the text appears almost indistinguishable from typical LLM output.
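The receiver's check can be sketched as follows: decode a candidate text back into bytes, then try to authenticate it. If the tag does not verify, the text is treated as ordinary output. This is a standard-library-only toy (an HMAC-SHA256 keystream plus a truncated HMAC tag) written purely for illustration. It is an assumption, not the scheme the prototype uses, and a real implementation should rely on a vetted AEAD such as AES-GCM or ChaCha20-Poly1305. The names `seal` and `open_if_present` are hypothetical.

```python
import hashlib
import hmac
import os

NONCE_LEN = 12
TAG_LEN = 16

def _keystream(key, nonce, length):
    """Toy keystream: HMAC-SHA256 in counter mode (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = b"stream" + nonce + counter.to_bytes(4, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]

def _tag(key, nonce, body):
    """Truncated HMAC over nonce and ciphertext (encrypt-then-MAC)."""
    return hmac.new(key, b"tag" + nonce + body, hashlib.sha256).digest()[:TAG_LEN]

def seal(key, plaintext):
    """Return nonce + ciphertext + tag, which should look like random bytes."""
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    return nonce + body + _tag(key, nonce, body)

def open_if_present(key, blob):
    """Return the plaintext, or None if no valid message is present."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        return None
    nonce = blob[:NONCE_LEN]
    body = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(key, nonce, body)):
        return None  # wrong key, or just ordinary text
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

if __name__ == "__main__":
    key = os.urandom(32)
    blob = seal(key, b"meet at noon")
    print(open_if_present(key, blob))
    print(open_if_present(os.urandom(32), blob))
```

Without the key, the bytes recovered from a cover text fail the tag check just as bytes recovered from any unrelated text would, so the check itself gives an observer no way to confirm that a message exists.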

This technique allows you to hide your secret messages in a public channel without other users even knowing that a secret conversation is taking place. For example, the prototype is instructed to output text which could plausibly look like a tweet on Twitter. This allows you to post your secret messages in an accessible public space without being discovered. This could be beneficial for users who don't want to draw suspicion by using encrypted messaging, for example when trying to avoid state-sponsored spying.

On the other hand, a potentially nefarious use of this technology could be the sharing of botnet command-and-control messages. With a technique like this, such messages could be posted in a public space and widely distributed, with no effective way to detect or block them.

This project is at an early stage and any feedback or contributions are welcome!

Thanks for reading, Shawn

Show HN: Steganographically encode messages with LLMs and Arithmetic Coding