this post was submitted on 08 Jun 2023

Everyone is so thrilled with llama.cpp, but I want to do GPU-accelerated text generation and interactive writing. What's the state of the art here? Will KoboldAI download LLaMA for me now?

[email protected] 2 points 1 year ago

There's a bit more setup involved, but I would look into https://github.com/oobabooga/text-generation-webui

[email protected] 1 points 1 year ago* (last edited 1 year ago)

Hi, I'm happy to see you're willing to give LLaMA a try! If you want GPU-accelerated processing, what you can do depends on your OS and hardware. If you have an Nvidia card, you can use cuBLAS; instructions are here: https://github.com/ggerganov/llama.cpp#cublas. I don't have experience with other cards, but I'll try to help if issues arise!
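For what it's worth, here is a rough sketch of what launching a cuBLAS-enabled build with some layers offloaded to the GPU might look like. It only assembles the command line, it doesn't run anything. The binary name `./main`, the model path, and the layer count are all placeholders I made up for illustration, and the `--n-gpu-layers` flag is the one llama.cpp documented around the time of this thread, so check the linked README in case it has changed.

```python
import shlex


def build_llama_cmd(model_path, prompt, gpu_layers, binary="./main"):
    """Assemble an argument list for a llama.cpp binary built with cuBLAS.

    All defaults here are illustrative assumptions, not values from the thread.
    """
    if gpu_layers < 0:
        raise ValueError("gpu_layers must be non-negative")
    return [
        binary,
        "-m", model_path,                     # path to the model file (placeholder)
        "-p", prompt,                         # prompt text to start generation from
        "--n-gpu-layers", str(gpu_layers),    # number of layers to offload to the GPU
    ]


# Hypothetical paths and values, only to show the resulting command line.
cmd = build_llama_cmd("models/llama-13b.bin", "Once upon a time", 32)
print(shlex.join(cmd))
```

How many layers fit depends on your VRAM, so you'd probably start low and raise the number until you run out of memory.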

Also, for more ease of use, try text-generation-webui (https://github.com/oobabooga/text-generation-webui). Well, ease of use until you want GPU acceleration, because then you'll need to look at https://github.com/oobabooga/text-generation-webui/blob/main/docs/llama.cpp-models.md#gpu-acceleration to get it working with LLaMA.

33B and 65B models seem to be the best for storytelling and writing.