I have an RTX 3060 with 12GB of VRAM. For simpler questions like "how do I change the modified date of a file in Linux", I use Qwen 14B Q4_K_M, which fits entirely in VRAM. If 14B doesn't answer correctly, I switch to Qwen 32B Q3_K_S, which is slower because it has to spill into system RAM. I haven't yet tried the 30B-A3B, which I hear is faster and close to the 32B in quality. BTW, I run these models with llama.cpp.
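For the record, that "modified date" question has a one-line answer with `touch`; a minimal sketch (the filename and date here are just placeholders):

```shell
# Create a demo file, then set its modification time to a specific date.
touch demo.txt
touch -m -d "2020-01-01 12:00:00" demo.txt

# Verify: stat prints the file's modification timestamp.
stat -c '%y' demo.txt
```

`-m` restricts the change to the modification time (leaving the access time alone), and `-d` accepts a human-readable date string.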
For image generation, Flux and Qwen Image work with ComfyUI. I also use Nunchaku, which improves speed considerably.
I've been making a point'n'click game recently, generating the art locally with Flux.1 Dev and Flux.1 Kontext on a Mac Mini M1 with 8GB of RAM. It isn't quick (20m+ per image), but once I had the settings dialled in for the style I want, it works really well.
Very neat use! Do you have anything public currently? Curious to see how they look. Or if you can't share at the moment, what's the art style you're going for?
I don't know because I have 36GB memory on Apple Silicon and mostly use models that require around 32GB, but I will say that people underestimate the abilities of ~7b models for many tasks.