It all depends on the size of the model you are running: if it cannot fit in GPU memory, data has to go back and forth between the host (CPU memory, or even disk) and the GPU, which is extremely slow. This is why some people are running LLMs on Macs: they can have a large amount of memory shared between the GPU and CPU, making it viable to fit some larger models entirely in memory.
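To put rough numbers on it: the weights alone take about (parameter count × bits per parameter ÷ 8) bytes, before counting the KV cache and activations. A minimal back-of-the-envelope sketch, where the model size and quantization levels are just illustrative assumptions and not tied to any specific setup:

```python
def model_memory_gib(n_params_billion: float, bits_per_param: float) -> float:
    """Approximate memory needed for the weights only,
    ignoring KV cache and activation overhead."""
    bytes_total = n_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / (1024 ** 3)

def fits(n_params_billion: float, bits_per_param: float, mem_gib: float) -> bool:
    return model_memory_gib(n_params_billion, bits_per_param) <= mem_gib

# Example: a 70B-parameter model at 4-bit quantization needs roughly 33 GiB
# for the weights alone, so it spills out of a 24 GB card but fits in the
# larger unified memory you can get on a Mac.
for mem in (24, 64, 128):
    state = "fits in memory" if fits(70, 4, mem) else "spills to CPU/disk"
    print(f"{mem} GB: {state}")
```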
Thanks for sharing, I find it hard to discover new “lemmy spaces” on here.
mlflexer@lemm.ee to Linux@lemmy.ml • Is there a downside to sticking to iptables over ufw? · English · 111 · 17 days ago
I thought nftables were replacing iptables?
mlflexer@lemm.ee to Mechanical Keyboards@programming.dev • GitHub - jonboh/ulp-dactyl: A Dactyl mounting Ultra Low Profile switches :D · English · 3 · 17 days ago
Looks awesome, how do the ULP switches compare to chocs or MX?
mlflexer@lemm.ee to Free and Open Source Software@beehaw.org • mwmbl - the user-curated search engine · English · 6 · 20 days ago
What is the quality like? Is it better than others because of the community ranking? And have you noticed any downsides to it, like speed or having a hard time with niche queries?
Matrix? I think you can set up text channels and also do voice/video/screen sharing in the channels as well if you’re using Element, though I haven’t been able to convince my friends to jump ship yet, so I don’t know how it compares to Discord.
mlflexer@lemm.ee to Programming@programming.dev • Ubuntu 25.10 Looks To Make Use Of Rust Coreutils & Other Rust System Components · English · 9 · 27 days ago
They don’t seem to have a 100% pass rate on the tests, but I might be missing something?
Would love to take the jump, but I think I’ll wait until they pass all tests
Oh, I thought you could get 128 GB of RAM or more, but I can see it does not make sense with the <24 GB… sorry for spreading misinformation. I guess in this case a GPU with the same amount of RAM would probably be better.