Troed Sångberg

I'm slightly shocked, I admit. Yesterday I chatted for a good 40 minutes with a locally running bot - and the experience was pretty much flawless.

With the GPTQ-for-LLaMa project, the GPU requirement for running the 13B model drops to ~9 GB of VRAM.
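For a rough sense of where that number comes from, here's a back-of-the-envelope estimate, assuming 4-bit GPTQ weights plus an assumed ~2.5 GB of runtime overhead (the overhead figure is my guess, not a measured value):

```python
# Rough VRAM estimate for a 4-bit quantized 13B model.
# Actual usage also depends on context length, activation buffers,
# and framework overhead, so treat this as illustrative only.
params = 13e9                    # parameters in the 13B model
bits_per_weight = 4              # GPTQ 4-bit quantization
weights_gb = params * bits_per_weight / 8 / 1e9   # ~6.5 GB of weights
overhead_gb = 2.5                # assumed activations / KV cache / CUDA overhead
print(f"~{weights_gb + overhead_gb:.1f} GB VRAM")  # ~9.0 GB, matching the post
```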

The text-generation-webui project has long been a proper chatbot interface, and it has now been updated to support the above project and model.
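As a sketch of what launching it looks like: something along these lines, run from a text-generation-webui checkout. The model folder name here is hypothetical, and the exact flags may differ between versions, so check the project's README:

```python
# Hypothetical launch of text-generation-webui with a GPTQ-quantized model.
# "llama-13b-4bit" is a placeholder model directory; --wbits/--groupsize
# assume the GPTQ-for-LLaMa integration described above.
import subprocess

subprocess.run(
    [
        "python", "server.py",
        "--model", "llama-13b-4bit",  # placeholder: your quantized model folder
        "--wbits", "4",               # 4-bit GPTQ weights
        "--groupsize", "128",         # a common GPTQ group size
        "--chat",                     # chatbot-style interface
    ],
    check=True,
)
```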

This is "state of the art" level human language running locally on a pretty much normal modern workstation.

Note that this is _not_ the same as the CPU-only llama.cpp project. That one is far too slow for this kind of experience, and after comparing the two I have some doubts about its inference quality.