How to get started? A number of questions ....

Solvena@lemmy.world · 1 year ago

JackGreenEarth@lemm.ee · 1 year ago

You should download Llama 2 from Meta, as that is the best open source LLM right now. It comes in 7B, 13B, and 70B sizes, each of which also has a chat version. You’ll need a good computer to run them, but if you’re already running Stable Diffusion you should be fine.

I think Llama 2 has a Python API, so you should be able to use its output as the prompt for SD, as long as SD also has a Python API.

Llama 2 actually has a smaller context length than ChatGPT (it will remember less of the conversation), but you can work around that: use a separate prompt to summarise the conversation, then another one to find the parts of it that are relevant to your actual prompt, and finally include only that selected part of the conversation history in your prompt.
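To make that workaround concrete, here is a minimal sketch of the three steps. It assumes you already have some function that sends a prompt to the model and returns its text; `generate`, `chunk_turns`, `build_prompt` and the prompt wording are all made up for illustration and are not part of any Llama 2 API. Lengths are counted in characters to keep it simple; a real setup would count tokens.

```python
from typing import Callable, List


def chunk_turns(turns: List[str], max_chars: int) -> List[str]:
    """Group consecutive turns into chunks of at most about max_chars characters.

    A single turn longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for turn in turns:
        if current and len(current) + len(turn) + 1 > max_chars:
            chunks.append(current)
            current = turn
        else:
            current = f"{current}\n{turn}" if current else turn
    if current:
        chunks.append(current)
    return chunks


def build_prompt(
    turns: List[str],
    question: str,
    generate: Callable[[str], str],  # hypothetical: prompt in, model text out
    max_chars: int = 1500,
) -> str:
    # Step 1: summarise the conversation, one chunk at a time so each
    # summarisation prompt stays small.
    summaries = [
        generate(f"Summarise this conversation excerpt:\n{chunk}")
        for chunk in chunk_turns(turns, max_chars)
    ]
    # Step 2: ask the model which parts matter for the new question.
    notes = "\n".join(summaries)
    relevant = generate(
        "From these notes, repeat only the parts relevant to the question.\n"
        f"Notes:\n{notes}\nQuestion: {question}"
    )
    # Step 3: build the final prompt from the selected history only.
    return f"Relevant conversation history:\n{relevant}\n\nUser: {question}\nAssistant:"
```

You would then pass the returned string to the model as the actual prompt. The trade-off is extra model calls per message, so it is slower than simply truncating old history.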

Solvena@lemmy.world · 1 year ago

I have a decent CPU and a GPU with 12 GB VRAM, which should let me run at least the 7B, from what I have seen in the sticky post.

Besides downloading the model, what kind of UI should I start with? Are there any good tutorials around that you are aware of?

ffhein@lemmy.world · 1 year ago

If you’re using llama.cpp, it can split the work between GPU and CPU, which allows you to run larger models if you sacrifice a little bit of speed. I also have 12 GB VRAM and I’m mostly playing around with llama-2-13b-chat. llama.cpp is more of a library than a program, but it does come with a simple terminal program to test things out. However, many GUI/web programs use llama.cpp, so I expect them to be able to do the same.
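For a rough idea of how the split works out: llama.cpp lets you choose how many of the model's layers go on the GPU (the `-ngl` / `--n-gpu-layers` option), and the rest run on the CPU. Below is a back-of-the-envelope sketch for picking that number. `estimate_gpu_layers` is a made-up helper, the even per-layer split is a simplification, and the 2 GB reserve for context and scratch buffers is a guess, so treat the result as a starting point to tune, not a guarantee.

```python
def estimate_gpu_layers(
    model_file_gb: float,     # size of the model file on disk
    n_layers: int,            # number of transformer layers in the model
    vram_gb: float,           # total VRAM on the card
    reserve_gb: float = 2.0,  # assumed headroom for context and buffers
) -> int:
    """Roughly estimate how many layers fit in VRAM.

    Assumes the file size is spread evenly across layers, which is only
    approximately true.
    """
    per_layer_gb = model_file_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb / per_layer_gb))


# Example with an assumed 14 GB model file and 40 layers on a 12 GB card.
layers = estimate_gpu_layers(model_file_gb=14.0, n_layers=40, vram_gb=12.0)
print(f"Try offloading about {layers} layers to the GPU")
```

If the model then fails to load or runs out of memory, lower the number; if there is VRAM to spare, raise it.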

As for GUI programs, I’ve seen GPT4All, Kobold and SillyTavern, but I never got any of them to run in Docker with GPU acceleration.