Now I'm just imagining a video game with characters each having their own fine tune applied on top for their dialog. I'm guessing you could use some relatively small models. In each case you would be feeding all the context to the model (player name, current relevant quests, summary of previous interactions, etc). Though maybe fine tuning/training isn't even needed and a good enough prompt will work (Not sure what all they used for this [0]). I'm excited for the first AAA game that tries this. Anyone that has played an RPG-style game knows that after a few times going into a city (or a couple play-throughs) the dialog feels repetitive. I love the idea of Skyrim but with better dialog. You could either run the models on the user's computer or maybe just run it on the backend so you can block certain generations (wrong/misleading/"unsafe") and just ship updated dialog lists to the client occasionally.
Counterpoint: NPCs repeating their dialogue serves as an implicit indicator that you've exhausted their content and it's time to move on. If they gain the ability to make vapid smalltalk forever then you'll forever be second guessing whether you're wasting your time on them.
(also spare a thought for the poor QA testers who would be given the Sisyphean task of making sure an LLM dialogue system always stays in character and doesn't hallucinate non-existent or outdated content/lore/mechanics)
That is an issue for the mostly transactional NPCs that make up the majority of NPCs in RPGs. But consider the case of the companion NPC.
If I travel with Batu the Hun and meet Zaya the Hun Slayer I want to be able to ask Batu if I should kill Zaya on the spot or should entertain his offer. That kind of stuff is extremely valuable both for the connection between player and companion and to provide an in-world perspective on the events you witness and the actions you take. But it's also extremely time-intensive to script. It's also very low stakes, it is essentially small-talk. And with some careful design you can limit it to short exchanges with AI-provided dialogue choices and have it distinguishable from scripted dialogue that advances the story
I think there certainly are other, better, more natural ways this could be achieved.
For example, if you're instructing an LLM to portray a character, instead of repeating dialogue like a broken record when they run out of relevant things to say... instruct them to act like their character would.
They might wonder out loud if there's anything else you want to know, or literally let you know that you're acting weird and awkward, etc.
Pair w/ a journaling system so that you can review their dialogue without talking to them and asking the same thing 50 times. Etc.
> also spare a thought for the poor QA testers
This doesn't seem entirely unsolvable given strict enough system prompts.
re: QA, besides a strict prompt, I'd imagine it would be hard for AI responses to go truly off the rails if the player's input is limited to "press A to talk" or pick one of 3 dialog options.
Great point, although fixed player input options might sort of defeat the benefit of using LLM to achieve a more organic dialogue flow?
Maybe there could be a hybrid system - choose from suggested responses, or type your own.
I also have vague thoughts of a dialogue system that rewards true role-playing, as in rewarding you for saying things aligned with what your character might feasibly say. (Like a more freeform version of the dialogue/RP/reward system in Disco Elysium)
At that point you're back to a pre-baked script.
The implicit indicator is sort of bad, though. I mean, it is a very gamey, immersion breaking thing. We’re just used to it.
Realistically NPCs should probably respond with increasing urgency if you forget their quest, and eventually give up on you.
You'll also ask yourself whether any NPC tells you anything of relevance. If there is no intention behind the words why would it be interesting to talk to them in the first place.
As I'm imagining it the NPC LLMs would be trained exclusively on the in-game lore as well as given system prompts to shape what they can and cannot say at any given moment.
something like
---
"You are Bob the Farmer. You grow rutabagas in the Kingdom of Foo. You are a cautious, fearful man. Several years ago your home was pillaged by King Foo and your family was taken. [blah blah blah several more paragraphs of biographical information]
Your primary motivation is to get your family back and keep the farm going so that you don't starve.
Last week you saw a mysterious group of figures in the woods who appeared to be headless. This is bothering you, along with the stress of your missing family. You wish a band of strong warriors could investigate, especially if they have a necromancer among them.
You may discuss any of the general world knowledge in background_lore.txt
You know nothing about the following topics: [[insert list of plot points that haven't happened yet or are unknown to him]] and will become confused, fearful, and slightly belligerent when asked about them."
---
You could of course update the system prompts for each character as the events of the game progress.
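For what it's worth, here is a minimal sketch of keeping that prompt as data instead of a hand-edited blob, so the list of unknown topics can shrink as the plot moves. The `Dossier` class, its field names, and `reveal_topic` are all made up for illustration; nothing here is tied to a particular engine or model API.

```python
from dataclasses import dataclass, field


@dataclass
class Dossier:
    """Hypothetical per-character record; field names are illustrative."""
    name: str
    biography: str
    motivation: str
    current_concerns: list[str] = field(default_factory=list)
    unknown_topics: list[str] = field(default_factory=list)


def build_system_prompt(d: Dossier) -> str:
    """Assemble the system prompt text from the dossier fields."""
    lines = [
        f"You are {d.name}. {d.biography}",
        f"Your primary motivation is {d.motivation}.",
    ]
    for concern in d.current_concerns:
        lines.append(f"Current concern: {concern}")
    if d.unknown_topics:
        topics = "; ".join(d.unknown_topics)
        lines.append(
            "You know nothing about the following topics and become "
            f"confused when asked about them: {topics}."
        )
    return "\n".join(lines)


def reveal_topic(d: Dossier, topic: str) -> None:
    """Call when a plot event makes a topic known to this character."""
    if topic in d.unknown_topics:
        d.unknown_topics.remove(topic)
```

The game would call `reveal_topic` from its quest triggers and rebuild the prompt before the next conversation, so the writer only edits the dossier once.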
It would be a lot of work to keep these system prompts updated and relevant, for every character, as game events progress, but I think some of this could be handled by some kind of inheritance system.
Suppose "Bob" lives in "Town A", which is in "Kingdom B." You could probably define a lot of things at the Town/Kingdom level. Like suppose "Kingdom B" is plagued by orcs, but "Town A" is kind of a citadel that is safe against orcs. "Bob"'s system prompt could inherit a lot of concerns and knowledge from "Town A" and "Kingdom B"... the system would not have to be strictly hierarchical either.
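A rough sketch of that inheritance idea, assuming each kingdom, town, and character is a `Scope` that lists its own facts plus any number of parents (so it does not have to be a strict tree). `Scope` and `collect_facts` are hypothetical names, not from any existing tool.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """A kingdom, town, or character that contributes prompt facts."""
    name: str
    facts: list[str] = field(default_factory=list)
    parents: list["Scope"] = field(default_factory=list)


def collect_facts(scope: Scope) -> list[str]:
    """Gather facts from all ancestors first, then the scope itself.

    Broad scopes come first so narrower ones can qualify them. Each
    scope is visited once even if reachable through several parents.
    """
    seen: set[str] = set()
    ordered: list[str] = []

    def visit(s: Scope) -> None:
        if s.name in seen:
            return
        seen.add(s.name)
        for parent in s.parents:
            visit(parent)
        ordered.extend(s.facts)

    visit(scope)
    return ordered


kingdom_b = Scope("Kingdom B", ["Kingdom B is plagued by orcs."])
town_a = Scope(
    "Town A",
    ["Town A is a citadel that is safe against orcs."],
    parents=[kingdom_b],
)
bob = Scope("Bob", ["You live in Town A."], parents=[town_a])

system_prompt = "\n".join(collect_facts(bob))
```

Updating the single "plagued by orcs" line on `kingdom_b` would then flow into every character underneath it the next time their prompt is built.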
This is where emergent behaviors within a game's world building become very interesting. Perhaps asking the right questions leads to a quest line not previously discovered or triggers social actions in support of/against the player.
Not every NPC would have something deeper to offer, much like not everyone in our world would have something that would pique my interest (in a general sense -- I'm sure I could learn something from anyone I spoke with), but it would also make me interested in conversations with NPCs at a deeper level than I currently engage with.
Most times I just talk to obviously unimportant NPCs so that I can read about the setting and feel more immersed in the fiction. It also stems from old RPGs like the original Pokemon where sometimes you had to talk to a random NPC in town to learn how to progress past an obstacle.
> If there is no intention behind the words why would it be interesting to talk to them in the first place.
But of course there is a story behind them.
I think there's a really interesting opportunity for a synthesis of the classic NPC dialog menu and a fully freeform LLM character.
Namely, the dialog would still be fixed, where there's a pre-defined flow of the conversation and fixed set of facts to deliver to the player. But the LLM could generate variations on it each time so it's never exactly the same twice. And it could add more character so the NPC gets frustrated if you keep asking it over and over. Or, it tries to dumb it down for you. Or, it gets sick of you and just tells you point blank: Look, you need to know XYZ. That's all I have for you.
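Something like this could be sketched as follows, assuming the script node owns the fixed facts and only the tone instruction changes with how often the player has asked. `ScriptNode`, the `TONES` ladder, and the keyword check are illustrative guesses, not a tested design.

```python
from dataclasses import dataclass


@dataclass
class ScriptNode:
    """One fixed beat of a conversation: the facts never change."""
    required_facts: list[str]
    times_asked: int = 0


# Hypothetical tone ladder; the number of steps is arbitrary.
TONES = [
    "Deliver the facts naturally, in character.",
    "The player has asked before. Sound mildly impatient.",
    "Explain the facts again in simpler words.",
    "You are fed up. State the facts bluntly and end the conversation.",
]


def build_instruction(node: ScriptNode) -> str:
    """Return the instruction for this telling and count the ask."""
    tone = TONES[min(node.times_asked, len(TONES) - 1)]
    node.times_asked += 1
    facts = "\n".join(f"- {fact}" for fact in node.required_facts)
    return f"{tone}\nYou must convey every one of these facts:\n{facts}"


def covers_facts(reply: str, keywords: list[str]) -> bool:
    """Crude check that a generated reply still mentions each keyword."""
    lowered = reply.lower()
    return all(keyword.lower() in lowered for keyword in keywords)
```

If `covers_facts` fails on a generated line, the game could fall back to the hand-written version of the text, so the required facts always reach the player.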
Or if it's important pre-scripted text you could put a different colored border around it or include an option like, "What was that thing about the thing that you said" as a permanent option to allow the player to re-trigger the script if needed.
I like this idea a lot. Alternatively, perhaps a dialogue journaling system that records the important bits for you and can be reviewed at any time, instead of badgering the character in-game to repeat things.
No, this is smart, because then the journal could also be the thing that records all of the AI's made up on the spot sayings.
If you indexed that with the unique character descriptor then the LLM can be programmed to re-read the relevant data other iterations of itself made and keep the characters somewhat consistent.
Maybe allow it other information like:
"How many in-game days since last conversation" (past some random number of days, the NPC would forget the player character)
"What major events have happened in the world, if any"
"What events have happened in this area"
"Generic gossip related to this character"
"What other NPCs are doing at this time"
All of these combined into summary paragraphs would go a long way to influence what topics would be on the characters mind while also not requiring that every NPC has a perfect knowledge of all events or that the game designers have the script everything out.
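A minimal sketch of that journal plus context summary, assuming lines are stored per unique NPC id and the forgetting threshold is passed in (it could be drawn randomly per NPC). `Journal`, `build_context`, and the parameter names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Journal:
    """Stores what each NPC has said, keyed by a unique character id."""
    entries: dict = field(default_factory=lambda: defaultdict(list))

    def record(self, npc_id: str, day: int, line: str) -> None:
        self.entries[npc_id].append((day, line))

    def last_day(self, npc_id: str) -> Optional[int]:
        lines = self.entries.get(npc_id)
        return lines[-1][0] if lines else None


def build_context(
    journal: Journal,
    npc_id: str,
    today: int,
    forget_after_days: int,
    world_events: list[str],
    local_events: list[str],
) -> str:
    """Summarize what this NPC should have in mind right now."""
    parts = []
    last = journal.last_day(npc_id)
    if last is None:
        parts.append("You have never met the player.")
    elif today - last > forget_after_days:
        parts.append("You met the player long ago and no longer remember them.")
    else:
        parts.append(f"Days since last conversation: {today - last}.")
        past = [line for _, line in journal.entries[npc_id]]
        parts.append("Things you said before: " + " | ".join(past))
    if world_events:
        parts.append("World events: " + "; ".join(world_events))
    if local_events:
        parts.append("Local events: " + "; ".join(local_events))
    return "\n".join(parts)
```

The same `Journal` could back the player-facing review screen, since it already holds every generated line with its in-game day.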
Could also cause some interesting emergent gameplay, arising naturally as the game world evolves over time while the characters play in it, more so if it's an MMO.
Could also be fun to do things like advance the game clock 1,000 years and see what happens to the npcs.
What if the world is fully destructible? What if the NPCs can build new houses, expand cities, engage in trade? What if NPCs can die of natural causes, get married, have children that are genetic descendants of their parents? What if NPCs undertook quests and occasionally advanced the plot all on their own?
What if you were just one person among hundreds of thousands in the game world?
I'd love to play something like that.
_Very_ good point. I had not fully considered that, same deal with conversation trees vs free-form entry/response.
It’s a really good point. One thing which comes to mind is the way some games distinguish between UI blocking dialog and background color, which could be a great place to start: imagine walking through a city like Baldur’s Gate only it has actual thousands of people who are saying different things when you walk by, and some of those are based on things your party has done recently with specific details about appearance, gear, and actions which would be too hard to do with traditional dialog approaches (e.g. kids talking about a battle and who they thought was best like real kids talk about sports, a good priest wondering what a paladin was doing spotted talking to a notorious thief, etc.). Something like that could add color and immersion without affecting gameplay or wasting anyone’s time, and you could extend it to things like vendors (“saw you put that axe to good use…” or “were you wearing these boots when you freed those slaves? I bet my brother will want to buy them!”) to flesh out the approach before using it for load-bearing purposes.
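One way the topic selection for that kind of background chatter might work, sketched under the assumption that the game logs party actions with tags and each bystander has a set of interests. `PartyEvent`, the tags, and the prompt wording are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartyEvent:
    """Something the party did, tagged so bystanders can react to it."""
    day: int
    description: str
    tags: frozenset


def pick_bark_topic(
    events: list[PartyEvent],
    npc_interests: frozenset,
    today: int,
    max_age_days: int = 7,
) -> Optional[PartyEvent]:
    """Return the most recent fresh event that overlaps the NPC's interests."""
    candidates = [
        e for e in events
        if today - e.day <= max_age_days and e.tags & npc_interests
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda e: e.day)


def bark_prompt(npc_role: str, event: Optional[PartyEvent]) -> str:
    """Build a one-line ambient prompt; never blocks or advances the plot."""
    if event is None:
        return f"You are a {npc_role}. Mutter one short line about your day."
    return (
        f"You are a {npc_role}. In one short line, remark aloud on this "
        f"rumor: {event.description}"
    )
```

Because these lines carry no quest information, a bad generation costs little, which fits the idea of trying the approach here before using it for anything load-bearing.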
I really like that idea, the "passive" dialog can use the LLM but main dialog is a little more "on the rails". IIRC Skyrim has "background" dialog change subtly throughout the game as you progress but after 1-2 times hearing it, it feels repetitive. Using LLMs to keep that "fresh" would be interesting.
> a good priest wondering what a paladin was doing spotted talking to a notorious thief, etc.)
Love this and the other examples you gave.
There's a thing called "prefix tuning" which is basically like a prompt but in a latent space: i.e. a prompt which consists of vectors (either key and value vectors, or just input embedding vectors - like custom tokens).
Unlike regular prompts you can optimize them exactly the way you'd do fine-tuning, i.e. if you have examples you can tune your latent prompt to match them as close as possible. I guess the benefit is that you can match style more closely and it would be less artificial. Also they can be rather compact.
Another option is to keep a bunch of LoRA adapters which can be dynamically selected. They can also be very compact.
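To show why the adapters are compact and cheap to swap, here is a toy pure-Python sketch of the LoRA update on a single weight matrix: the shared weights `w` stay fixed and each character only stores two small matrices. This is just the arithmetic, not the API of any real fine-tuning library; `LoraAdapter`, `forward`, and the "batu" key are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


def matvec(m: list[list[float]], v: list[float]) -> list[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


@dataclass
class LoraAdapter:
    """Low-rank update: effective weights are w + (alpha / r) * b @ a."""
    a: list[list[float]]  # r x k down-projection
    b: list[list[float]]  # d x r up-projection
    alpha: float

    @property
    def rank(self) -> int:
        return len(self.a)


def forward(
    w: list[list[float]],
    x: list[float],
    adapter: Optional[LoraAdapter] = None,
) -> list[float]:
    """Apply the shared weights, plus the character's adapter if given."""
    y = matvec(w, x)
    if adapter is None:
        return y
    scale = adapter.alpha / adapter.rank
    delta = matvec(adapter.b, matvec(adapter.a, x))
    return [yi + scale * di for yi, di in zip(y, delta)]


# Shared base weights (identity here) and one adapter per character.
w = [[1.0, 0.0], [0.0, 1.0]]
adapters = {"batu": LoraAdapter(a=[[1.0, 0.0]], b=[[0.0], [1.0]], alpha=2.0)}

x = [1.0, 2.0]
base_out = forward(w, x)                     # [1.0, 2.0]
batu_out = forward(w, x, adapters["batu"])   # [1.0, 4.0]
```

Selecting a character is then a dictionary lookup. For a d x k layer, a rank-r adapter stores r * (d + k) numbers instead of d * k.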
If you play Fortnite right now (until Friday June 7th), you can speak in realtime to Darth Vader: he replies in his voice and in character, and he knows who’s playing (the name of the character skin). The technology is here, and used in production of major games. It’ll be a big tide sooner than people expect.
The idea of using "smaller LLMs" to control the agency of RPG characters has been a pretty common one ever since AI Dungeon back in 2019. The hardest aspect of it would be locking the AI down to a well-defined character dossier so that it is hard to jailbreak them, and also to limit knowledge leakage of things they don't know, etc.
It would also lend itself very well towards interactive fiction, point-and-click adventure games, etc.
I've been thinking really hard about this for a while, though I don't have any game development experience.
Especially if you pair it with a capability like the voice interface of ChatGPT which I find very impressive in terms of intonation and realism.
It would not need to cut humans out of the loop. You would have humans writing the prompts, recording voices, etc. (I assume the synthetic voices used by ChatGPT are based at some level on recordings of humans. Correct me if I'm wrong.)
Here's a demo of "The Matrix" from 2 years ago: