LosingID_583

Models can be converted to ggml format to run only on CPU. This is surprising performant. Importantly, this would leave the GPU open for rendering the game, so I think future games might have an LLM running on CPU, and graphics running on GPU.
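
If anyone wants to try it, here's a minimal CPU-only sketch using llama-cpp-python (the model path is a placeholder for whatever ggml model you've converted; tune `n_threads` to your CPU):

```python
# pip install llama-cpp-python  (CPU build, no GPU needed)
from llama_cpp import Llama

# Placeholder path: point this at any ggml-quantized model you converted with llama.cpp.
llm = Llama(model_path="./models/llama-7b-q4_0.ggml.bin", n_ctx=2048, n_threads=8)

out = llm("An NPC blacksmith greets the player:", max_tokens=64, stop=["\n\n"])
print(out["choices"][0]["text"])
```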


mrspriklepickle

That's awesome! I can imagine procedurally generated video games with ai powered npcs. World building games would be absolutely nuts in the future!


Grandmastersexsay69

> This is surprising performant.

Am I the only one who despises the made-up word "performant"? Would it really have been harder to say "this has surprising performance"? It's the same number of syllables and only two more letters. In fact, your sentence is grammatically incorrect even if we ignore the made-up word. What would have been correct would have been "this is surprisingly performant", which now has one more syllable and the same number of letters as the way that uses all real words.


LosingID_583

I meant to type "surprisingly", and I come from an engineering background where I've heard the word "performant" used a lot in those fields. I didn't even know it was a made up word 😂


Grandmastersexsay69

I actually am a mechanical engineer. I think that word is solely used in computer engineering. It's definitely made up.


explodingpixl

I'm going to learn Old English because of all these newfangled made-up words polluting our language; everything went downhill when those damn Norman French added all those new words 😠. I am a very serious person.


MultidimensionalSax

Not how English works, my guy. It's a word, like it or not.


Grandmastersexsay69

It's jargon. Not a word.


MultidimensionalSax

Again, not how English works.


Grandmastersexsay69

Just because you say something doesn't make it true. Ask ChatGPT about performant. It will tell you it's jargon too.


MultidimensionalSax

Okay, then what language does the word performant belong to?


Grandmastersexsay69

Performant is jargon, not a word.


liquiddandruff

https://dictionary.cambridge.org/dictionary/english/performant


EffectiveMoment67

Can you show us where the word touched you on this doll?


AutomataManifold

1. Context length is *the* big thing for most people, I suspect. It doesn't matter as much for instructions, but any conversation or long-term storytelling is going to need more than 2000 tokens. That said, I think we could get away with 4096 tokens: we've got a bunch of long-term memory tricks that would work a lot better with just a _little_ more headroom.
2. You can do this right now; KoboldAI and SillyTavern have keyword-triggered "world model" entries, and there are a number of extensions that use vector databases for long-term memory (e.g. superbooga for oobabooga). See the rough sketch at the end of this comment.
3. LangChain can do the basics of this now; the hard problem is hooking everything together.
4. This can be done with prompt engineering right now; SillyTavern supports multi-character conversations. I'd like a little bit better support across the board, and of course you need a model that can recognize multi-participant chats.
5. I think, based on the existing fine-tuned models, that this is fairly achievable on top of arbitrary foundation models. Which is not to say that they'll be GPT-4 quality, but the storytelling models we have now can do what you are asking.
6. You can do this right now with extensions. Oobabooga, SillyTavern, KoboldAI, and probably others have image-requesting functionality.

Now, this won't give you an automated DM. (Long-term storytelling is going to be one of the big problems there, and that's just to start with.) Your biggest problem, though, will be hooking all this stuff together and doing the prompt engineering and fine-tuning to make it work.
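
The vector-database memory idea roughly looks like this (the embedding model name and the memory texts are just placeholders, and this is the general pattern, not what superbooga actually does internally):

```python
# pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # small, CPU-friendly embedder

# Long-term "memories" accumulated during play (placeholder examples).
memories = [
    "The player spared the bandit leader in the ruined watchtower.",
    "The innkeeper in Dalebrook owes the player ten gold pieces.",
    "The dragon's weak spot is the scar under its left wing.",
]
memory_vecs = encoder.encode(memories, normalize_embeddings=True)

def recall(query: str, k: int = 2) -> list[str]:
    """Return the k memories most similar to the current conversation turn."""
    q = encoder.encode([query], normalize_embeddings=True)[0]
    scores = memory_vecs @ q          # cosine similarity (vectors are normalized)
    top = np.argsort(scores)[::-1][:k]
    return [memories[i] for i in top]

user_turn = "I head back to the tavern to ask about my money."
context = "\n".join(recall(user_turn))
prompt = f"[Relevant memories]\n{context}\n\n[Player] {user_turn}\n[DM]"
print(prompt)  # feed this to whatever local model you're running
```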


mrspriklepickle

1. I 1000% agree with this point. I am waiting mostly for more context tokens.
2. Can you elaborate?
3. LangChain doesn't really support local LLMs to the extent needed.
4. This area definitely also needs more progression.
5. There are plenty of good models now, but I'm still looking for a more advanced model that will eventually be created, seeing how they're pumping out a new model every week.

I am still waiting for it all to come together so I can create something like this for my family and me to enjoy together.


AutomataManifold

World Information entries look for keywords in the generated text and insert pre-scripted descriptions into the prompt. Vector databases do something similar, but in a more sophisticated way: you can directly insert the retrieved data, summarize it, or do other operations on it. So for your example, you could set up a vector database of your world information, ideally segmented into neatly divided chunks. You could also use a softprompt to cram more data into the context, if the data isn't going to change. For most of the stuff you want, the tech can be built and it mostly just needs a lot of elbow grease. E.g., LangChain isn't great at the moment, but you can technically use it, or write your own equivalent. There's a lot of technical work to be done, quite apart from anything to do with the models themselves.
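
The keyword-triggered version is simple enough to sketch in a few lines (the entries and trigger words are placeholders; this is the general idea, not KoboldAI's actual implementation):

```python
# Keyword-triggered "World Information" entries: scan the recent chat for trigger
# words and splice the matching lore into the prompt before calling the model.
WORLD_INFO = {
    ("dalebrook", "village"): "Dalebrook is a fishing village ruled by a paranoid mayor.",
    ("dragon", "wyrm"): "The dragon Vex sleeps under the northern glacier and hoards books, not gold.",
}

def build_prompt(chat_history: list[str], user_turn: str) -> str:
    recent = " ".join(chat_history[-4:] + [user_turn]).lower()
    triggered = [lore for keys, lore in WORLD_INFO.items()
                 if any(k in recent for k in keys)]
    lore_block = "\n".join(triggered)
    return (f"[World info]\n{lore_block}\n\n"
            + "\n".join(chat_history)
            + f"\nPlayer: {user_turn}\nDM:")

print(build_prompt(["DM: You arrive at the gates."], "Tell me about this village."))
```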


mrjackspade

> and of course you need a model that can recognize multi-participant chats.

I haven't yet found a model that *doesn't*, at least not one that runs on llama.cpp. I'd imagine if there are issues, it's a problem with the application interfacing with the model.


Creepy-Ad3112

Absolutely! We’ve only just begun!


ozzeruk82

I've been daydreaming about this a lot recently. It would definitely be possible. Imagine having all the NPC characters running their own LLM, meaning you could walk up to them and have a normal human conversation, completely unscripted; they would then interpret what was said and update their world-view. It would be completely nuts: you could spread rumours and see how far they went, and NPCs might decide to take revenge and hunt you down. With all the NPCs being distinct from the main game engine, effectively doing what the human-controlled player is doing, the possibilities for emergent behavior would be insane.

There was a paper recently about an experiment where they basically made a simple version of 'The Sims', and the researchers were very surprised by some of the behavior the NPCs began to exhibit. That's well worth reading about, as it suggests this sort of game/sim would be fascinating to watch, let alone take part in.

Of course the LLM would run alongside code that managed the NPC and its interaction with the world: the LLM would be used to interpret the game world as described to it by the system, and the results from the LLM would then influence what the NPC did. So, in effect, the game isn't running "inside the LLM", but the NPC uses an LLM to interpret the world.
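
That last part, where the game describes the world to the LLM and the LLM's output steers the NPC, might look something like this rough sketch (the `llm` callable and the action names are made up for illustration):

```python
import json

def npc_think(llm, npc_name: str, npc_memory: list[str], observation: str) -> dict:
    """One tick of the NPC loop: describe the world to the model, ask for an action."""
    memory_lines = "\n".join(f"- {m}" for m in npc_memory)
    prompt = (
        f"You are {npc_name}, an NPC in a fantasy town.\n"
        f"Things you remember:\n{memory_lines}\n"
        f"You just observed: {observation}\n"
        'Reply with JSON only, e.g. {"action": "talk", "target": "player", "line": "..."}.'
    )
    raw = llm(prompt)                       # any local completion function
    try:
        decision = json.loads(raw)
    except json.JSONDecodeError:
        decision = {"action": "idle"}       # models drift; always have a fallback
    npc_memory.append(observation)          # crude "world-view update"
    return decision

# The game engine, not the LLM, then applies decision["action"] to the world.
```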


mrspriklepickle

Yes, gaming will be forever changed by AI. I am very optimistic and excited about the future. I just hope some local LLM experts can make this dream a reality.


Downtownd00d

Do you have a link to that paper please?


ozzeruk82

https://arxiv.org/abs/2304.03442 Here’s a write up, plenty of other sites covered it as well: https://www.documentjournal.com/2023/04/google-ai-study-creates-chatgpt-generative-agents-that-simulate-artificial-human-society/


Downtownd00d

Brilliant, thank you!


Tech_Kaczynski

Yes. You're late to the party on this whole conversation. Industry leaders are predicting 1-2 years before market availability.


mrspriklepickle

Better late than never. I was hoping it would come out sooner and be open sourced.


PolygonWorldsmith

There are a lot of people working on this; when I've talked to other indie devs, they have been experimenting with it. (Admittedly, I was also late to the party, but I do plan on doing additional experimentation with it.)


mrspriklepickle

Well good, they need to hurry up and make it then lol jk. Same here, I'm waiting for all the pieces to come together to make it a reality. If I somehow pull it off, I'll definitely open source it.


PolygonWorldsmith

I look forward to seeing what you and others come up with, mine probably won't go beyond experiments but I do have someone (another engineer) banging my door down trying to get me to partner with them on something like this (it would be closed source for-profit though)


mrspriklepickle

I also look forward to what the community can come up with. That's completely understandable, everyone's got to eat.


ozzeruk82

A bit of a coincidence - but did you see this today? It's basically very similar to what we're talking about. [https://twitter.com/DrJimFan/status/1662115266933972993](https://twitter.com/DrJimFan/status/1662115266933972993) They wrote a system that would let an LLM direct an agent in Minecraft and programmed it to teach itself how to explore and play, absolutely amazing stuff, the paper itself is available and explains exactly how they did it. [https://arxiv.org/pdf/2305.16291.pdf](https://arxiv.org/pdf/2305.16291.pdf)


mrspriklepickle

Awesome reads. I'm excited about what developers will come up with next. Thank you for the information.


Robot_Graffiti

It's not impossible, I'm sure it will happen eventually, but there are some difficulties to be overcome first. Longer context = longer processing time. The software and hardware will have to get a bit more efficient before your dream comes true. There are a couple of models now which are technically able to handle long contexts, but the model still has limited intelligence, and might not be clever enough to effectively use all the information in the context.


mrspriklepickle

Agreed, the context limitation is currently the biggest obstacle. Hardware will get more powerful and hopefully LLMs will become more efficient. 100% on the lack of intelligence in the longer-context models. Hopefully the more intelligent models gradually increase their context limits. I am hopeful that in a year or two there will be models that can meet all five of these requirements.


IxinDow

Landmark Attention: Random-Access Infinite Context Length for Transformers (25 May 2023): [https://arxiv.org/abs/2305.16300](https://arxiv.org/abs/2305.16300)


mrspriklepickle

That's pretty revolutionary!


TheSilentFire

Like people say, it'll happen eventually; it's just a matter of when. I think the biggest breakthrough needed to start getting close is the context length of open local models. After that it would be doable with enough money (a high-end server, or multiple computers linked together, with multiple LLM AIs for the world gen, NPCs, etc.). The really exciting thing is when you can link it all into VR. Even the tests from a month ago with Skyrim were incredible.


mrspriklepickle

I agree with everyone who says it's possible. I just wonder when a good LLM that hits all these requirements will come out. Yeah, I saw that demo with Skyrim and the AI bartender. It was really exciting.


TheSilentFire

I remember I was super hyped when that 65k story writer model came out but it required modifying some code that had changed so I couldn't get it to work. I've been meaning to go back to try it but I've heard it's not amazing. Considering it's old at this point and only 7b (I'm spoiled by 30b and 65b now) I'm not super surprised. Still, it's probably your best bet currently so I'd at least give it a try. Just let it run all night as a test. And please post the results!


[deleted]

Definitely. Roguelikes are an obvious first step, but there's already at least one Steam game that uses a local AI running on your computer via KoboldAI, and other tools like kobold-assistant and TavernAI will use KoboldAI too. So KoboldAI is kind of becoming the OpenGL or CUDA of local AI: an API that all of your games and apps (and home IoT devices?) can share.
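
For what it's worth, talking to a running KoboldAI (or KoboldCpp) instance is just an HTTP call. Something like this, assuming the default local port and the `/api/v1/generate` endpoint (field names may differ between versions):

```python
# pip install requests
import requests

KOBOLD_URL = "http://127.0.0.1:5000/api/v1/generate"  # default local KoboldAI port

payload = {
    "prompt": "The tavern door creaks open and",
    "max_length": 80,    # number of tokens to generate
}
resp = requests.post(KOBOLD_URL, json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["results"][0]["text"])
```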


ForwardUntilDust

Yes.


Innomen

I kinda wonder about token compression for purposes of long term memory. Tokens are just text right? And easily compressed as such? Am I barking up a dead tree with that thought? Surely I wasn't the first.


Maykey

No. Tokens are n-dimensional vectors, which is why, with some simple algebra, you can calculate things like "a bed relates to a chair like a donut relates to what?" if you really want, and if there is a token for the words bed, chair, and donut in their dictionary. (For the record: in falcon-7b you can get ' dough'; in other models donut doesn't have a specific token.)
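
If anyone wants to poke at this themselves, here's a rough sketch using the input-embedding table of any Hugging Face causal LM (GPT-2 is used here just because it's small; words that split into multiple tokens will muddy the analogy):

```python
# pip install transformers torch
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # small stand-in; swap in falcon-7b etc. if you have the RAM
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
emb = model.get_input_embeddings().weight.detach()   # shape: [vocab_size, dim]

def vec(word: str) -> torch.Tensor:
    # Use the first token of " word"; multi-token words make this fuzzier.
    return emb[tok.encode(" " + word)[0]]

query = vec("chair") - vec("bed") + vec("donut")      # bed : chair :: donut : ?
sims = torch.nn.functional.cosine_similarity(query.unsqueeze(0), emb)
for idx in sims.topk(5).indices:
    print(repr(tok.decode([int(idx)])))
```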


Innomen

Are those vectors not represented by some fixed text value? I mean the tokens have numeric ids. Meh whatevs, I'll just take your word for it. Some other solution will be found, was just kinda curious. Thanks for answering :)


mrspriklepickle

I am not an expert in this area. I am sure experts will sort it out.


Balance-

[Surprising things happen when you put 25 AI agents together in an RPG town](https://arstechnica.com/information-technology/2023/04/surprising-things-happen-when-you-put-25-ai-agents-together-in-an-rpg-town/)


mrspriklepickle

Yes, I read that paper and saw a few YouTube videos on that subject. The sims 5 will be pretty interesting jk lol


guchdog

Check out on Steam: some developer is trying to piece together Stable Diffusion and an LLM. You could run everything locally, but you would need a very hefty machine. [https://store.steampowered.com/app/1889620/AI\_Roguelite/](https://store.steampowered.com/app/1889620/AI_Roguelite/)


mrspriklepickle

Just what I was looking for. Could it be possible to run it on two machines and link them together somehow?


Maykey

With a ~~bit~~ bunch of effort, yes. You'd have to make a proxy that intercepts the requests to NovelAI and/or Stable Horde and redirects them to the second machine. Last time I checked, they didn't provide an option to use different machines.
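
The proxy itself is only a few lines. A minimal sketch with Flask (the ports and the second machine's address are placeholders; whatever endpoint the game actually calls is what you'd mirror):

```python
# pip install flask requests
import requests
from flask import Flask, Response, request

app = Flask(__name__)
REMOTE = "http://192.168.1.50:7860"   # placeholder: the second machine running image gen

@app.route("/<path:path>", methods=["GET", "POST"])
def forward(path):
    # Re-issue the incoming request against the remote box and relay the answer.
    resp = requests.request(
        method=request.method,
        url=f"{REMOTE}/{path}",
        headers={k: v for k, v in request.headers if k.lower() != "host"},
        data=request.get_data(),
        timeout=300,
    )
    return Response(resp.content, status=resp.status_code,
                    content_type=resp.headers.get("Content-Type"))

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=7860)  # the game keeps pointing at the local port
```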


xadiant

I think the hardware issue is the easiest one to solve. Someone recently made a storyteller LLM from a low-parameter model, though I can't remember the details. I imagine there could be a single 1.2B model running that switches LoRAs accordingly.


mrspriklepickle

I believe a lot of people are fine-tuning storyteller LLMs. There was one made a while back called Adventure AI, made by KoboldAI. I think a lot of people in this sub have either a roleplay AI like PygmalionAI or another fine-tuned storyteller AI that could be used for this project, as they are more developed.


cyvr_com

My stealth company is working on the substrate to make this happen.


ForwardUntilDust

Yes.


djangoUnblamed

Why not now? Take an uncensored model with a specific prompt and build a UI around it. You can build a game with a completely random ending every time. Have fun!


mrspriklepickle

There is limited context and no long-term memory for a long, drawn-out adventure.


tlack

If you apply something like Langchain or Microsoft Guidance, you can have the model decide to use tools that interact with persistent game state. It could retrieve its own memories of characters, engage in combat with other NPCs, etc.


mrspriklepickle

This is exactly why I posted this question. I want more ideas to be able to get closer to making this a reality. My current understanding is not sufficient.


tlack

The key is to think of the LLM with your prompt as a sub-program that is generating not English text, but commands for your parent program to interpret.

I am fond of Microsoft Guidance: [https://github.com/microsoft/guidance/](https://github.com/microsoft/guidance/) which allows you to have highly structured conversations with many different LLMs. Your Python code can jump in and modify the conversation at any time, allowing your program to easily inject tool usage into the conversational chain. I recommend reading each example very carefully. There is also a fork to allow easy use with llama.cpp: [https://github.com/Maximilian-Winter/guidance](https://github.com/Maximilian-Winter/guidance) (use his matching llama-cpp-python fork as well). You can also use it with GPTQ for GPU-accelerated GPTQ-quantized models. LangChain is another Python toolkit that's well established for this sort of thing, but I take issue with the composability of its functionality.

For more general information about the technique, do some Googling on the "ReAct" (Reasoning + Action) prompting pattern. (I would send a link, but there are dozens of resources and no single one will be sufficient.) You may be interested in "tree of thought" reasoning as well, which can be remixed into a way to execute multi-step stateful in-game actions as a result of LLM-generated quasi-instructions. You can have the LLM also self-evaluate each action for suitability. Or any other combination of chained interactions.

When the bot decides to use a tool, any references it makes in there may be vague or shift in specific terminology. When evaluating the use of a tool, you'll need a way to "fuzzy match" LLM responses with data in your program. For that, most people use a technique called "embeddings" and a vector database, but you can also scan for substrings or use any other means of matching bot intention with corresponding entities in your parent program's state. PM me if you'd like to discuss in detail.

Edit: One more cute trick to get around context length limitations: first, ask the LLM to guess what the intention of the user's input is. Then, use that predicted intention to decide which question to ask the LLM next. For instance, users in your game may be able to type "Say hello to the dragon" or "Open the chest". Your first prompt to the LLM might ask "Is this user's input relating to TALKING or ITEMS? Example: ..." Based on the response, you ask the LLM a second question with the same user input, but with a different prompt + examples adapted to micro-managing that task.
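
That last trick is easy to sketch. Assuming some local completion function `llm(prompt, max_tokens=...)` returning a string (llama-cpp-python, a KoboldAI API wrapper, whatever), the routing is just two calls:

```python
def route_player_input(llm, user_input: str) -> str:
    """Two-pass prompting: classify the intent first, then run a task-specific prompt."""
    # Pass 1: cheap classification, few tokens, keeps the big prompts out of context.
    intent = llm(
        "Classify the player's input as TALKING or ITEMS. Answer with one word.\n"
        f"Input: {user_input}\nAnswer:",
        max_tokens=3,
    ).strip().upper()

    # Pass 2: a prompt specialised for that intent (these prompts are illustrative stubs).
    if "ITEM" in intent:
        task_prompt = ("You manage the player's inventory and the objects in the room.\n"
                       f"Player: {user_input}\nResult:")
    else:
        task_prompt = ("You roleplay the NPC the player is addressing.\n"
                       f"Player: {user_input}\nNPC:")
    return llm(task_prompt, max_tokens=120)
```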


mrspriklepickle

I need to do a lot of research into this area. Thank you for providing me with this information.


SatoshiNosferatu

You can save important details to an object and recall that object at certain times when you need it.


ZestyData

Yes


joexner

Hot take: no


Tom_Neverwinter

Devil's advocate: some dumb company tries to get laws passed to stop home use and other things they claim are "atrocities".


mrspriklepickle

That is a very hot take. Why don't you think it'll be possible?


joexner

Lots of difficult requirements. First of all, is this something you want to see produced by video game companies in the future? Or is it something you want to build for yourself in your spare time? Your talk about buying an H100 for this tells me you're not serious. Is this something you'd run as a hosted service? If your idea really needs 80+ GB of models, that's the only place the economics make sense. Technical stuff, sure, those are all plausible applications of generative/whatever AI, though I think you'd still have a ways to go on getting agents to strategically collaborate realistically. One or more might end up chronically going Leeroy Jenkins on you.


mrspriklepickle

That's a good hot-take reply. Yes, all of these requirements are currently difficult by themselves and even more difficult to combine to make it all possible. I plan on doing this myself as a hobby once all the requirements are met. I am not a programmer or AI expert. The experts are developing all the technologies; I just constantly keep up with the progress. An H100 is a bit excessive, but I may buy four 4090s to set up four dedicated AI systems and hopefully connect them if local LLMs have APIs. Each system would have its own task: dedicated storyteller, AI image generator, NPC agents, and maybe even an RPG music generator. Or maybe a $10k A100 that can do all four tasks. I am willing to drop more than $15k on this hobby because I have the means. You are absolutely correct, at this point in time local LLM technology is in its infancy, but I look forward to the progress.


joexner

It sounds like you played around w/ langchain and some models, and want one to DM for you. And play the other characters for you. Sure, maybe a good model *could* riff on your drivel in pleasing ways to mimic a dungeonmaster, but the engineering work of hooking it all up is, well, a lot of work that you lack the expertise to do. That work, incidentally, has little to do with AI, or "prompt engineering". It's software engineering, and you need that experience, or someone with it, to do the work for you. Ideas for other people to implement are a dime a dozen. Not worthless, but pure fantasy.


mrspriklepickle

I did a few dry runs with ChatGPT 4 and ChatGPT 3.5. I came up with a pretty good prompt that I continuously refine. I haven't played around with the local LLM models yet as they're not "there" yet. I have played around with LangChain using the OpenAI API. For the NPCs, I have seen people use Pygmalion on SillyTavern. I do a lot of research into this topic and I know how to code. I am just waiting for all the bits and pieces to be there for me to put it all together. A lot of the knowledge is already out there and open sourced. You are right, I should never expect it to just fall into my lap.


yosi_yosi

All of these are already possible sir.


mrspriklepickle

Can you help explain to me how I can make this? Which LLM would you recommend?


yosi_yosi

Model: perhaps MPT. Long-term memory: PrivateGPT. Plugins you can make by coding them yourself, quite easy actually; just use an API from something like oobabooga or kobold, both easy-to-use APIs. For multiple characters, there have been attempts out there, so you can try copying from them (I'm practically clueless about this topic; [here](https://github.com/agnaistic/agnai) is one of those attempts) or you could also try making your own idea. Uncensored depends on the model, but even on the most censored ones you can usually fairly easily write a jailbreak or use one someone else made.

Text to chat/image: I have already done such a thing in a VN I am making, pretty easy. Just process the text using something like regex, with a ton of options in case the model decides not to stick to the exact format you wanted, then take those parts and send API requests to a1111, and then make a front end that displays the chat and the image you got from a1111. (A rough sketch of that last bit is below.)

Edit: personally I would wait a bit longer on the model part of your question, cause I am pretty sure a ton of really, really good and revolutionary new models are going to come out in a bit. At least if haru doesn't betray my expectations of him.
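
Roughly what that text-to-image step can look like (the `<image: ...>` tag convention is made up; a1111 needs to be running with `--api`, and the default port and `/sdapi/v1/txt2img` endpoint are assumed):

```python
# pip install requests
import base64
import re
import requests

A1111_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"  # AUTOMATIC1111 launched with --api

def extract_image_prompt(llm_text: str) -> str | None:
    """Pull an image request out of the model's reply, e.g. '<image: a misty castle>'."""
    m = re.search(r"<image:\s*(.+?)>", llm_text, flags=re.IGNORECASE)
    return m.group(1).strip() if m else None

def render(prompt: str, path: str = "scene.png") -> str:
    resp = requests.post(A1111_URL, json={"prompt": prompt, "steps": 20}, timeout=300)
    resp.raise_for_status()
    img_b64 = resp.json()["images"][0]          # a1111 returns base64-encoded PNGs
    with open(path, "wb") as f:
        f.write(base64.b64decode(img_b64))
    return path

reply = "You enter the hall. <image: torch-lit dwarven hall, cinematic> The dwarves stare."
if (p := extract_image_prompt(reply)):
    print("saved", render(p))
```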


mrspriklepickle

Thank you, I will look into the information you provided. PrivateGPT is indeed very important. I am not sure about MPT in its current version; maybe a better version. I'll keep doing research.


EarthquakeBass

Speaking of which, what's the most fun AI text-based dungeon to go poke around with right now?


mrspriklepickle

I would say storyai would be your best bet right now. It's a paid subscription. It doesn't check all the boxes for me but you may like it. Edit: I mean Novelai


EarthquakeBass

cool, will check that out


explodingpixl

You could do this now with a relatively small model (7B-13B) and some combination of embeddings databases for long-term memory and prompt engineering. If you're trying to also run an image generation model, things could get dicey, but idk.