I'm slightly shocked, I admit. Yesterday I chatted for a good 40 minutes with a locally running #LLaMa #LLM bot - and the experience was pretty much flawless.
With the GPTQ-for-LLaMa project, the GPU requirement for running the 13B model drops to ~9 GB of VRAM.
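As a rough sanity check on that number (my own back-of-the-envelope sketch, not something from the project itself): 13B weights at 4 bits is about 6.5 GB, and the rest is runtime overhead.

```python
# Back-of-the-envelope VRAM estimate for a 4-bit quantized 13B model.
# The overhead figure is an assumption for illustration; the real amount
# depends on context length, batch size and the CUDA runtime.

params = 13e9            # 13B parameters
bits_per_weight = 4      # GPTQ 4-bit quantization
weights_gb = params * bits_per_weight / 8 / 1e9   # ~6.5 GB for the weights

overhead_gb = 2.5        # activations, KV cache, CUDA context (assumed)
print(f"~{weights_gb + overhead_gb:.1f} GB VRAM")  # ~9.0 GB
```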
The text-generation-webui project has been a proper chatbot interface for a while already, and it has now been updated to support the above project and model.
This is "state of the art" level human language #AI running locally on a pretty much normal modern workstation.
Note that this is _not_ the same as the CPU-only llama.cpp project. That one is much too slow for such an experience, and after comparing the two I also have some doubts about the quality of its inference.