Experienced engineer examines comments in a legacy codebase

yacgta@infosec.pub · 1 year ago

Specifically on what LLM to use, I’ve been meaning to try Starcoder, but can’t vouch for how good it is. In general I’ve found Vicuna-13B pretty good at generating code.

As for general recommendations, I’d say the main determinant will be if you can afford the hardware requirements to locally host - I presume you’re familiar with the fact that you’ll (usually) need roughly 2x the number of parameters in VRAM (e.g. 7B parameters means 14GB of VRAM). Techniques like quantization to 8-bits halve the requirement, with the more extreme 4-bit quantization halving them again (at the expense of generation quality).

And if you don’t have enough VRAM, there’s always llama.cpp - I think that list of supported models is outdated, and it supports way more than those.

On the “what software to use for self-hosting” I’ve quite liked FastChat, they even have a way to run an OpenAI API compatible server, which will be useful if your tools expect OpenAI.

Hope this is helpful!

yacgta@infosec.pub · 1 year ago

I have a few of these but I forget where they came from, curious if anyone here knows

yacgta@infosec.pub · 1 year ago

Experienced engineer examines comments in a legacy codebase

yacgta@infosec.pub · 1 year ago

Thank you for sharing!

yacgta@infosec.pub · 1 year ago

I mean, this is Google we’re talking about…