• micheal65536@lemmy.micheal65536.duckdns.org
    9 months ago

    There are only a few popular LLMs, or a few more if you count variants such as “uncensored” versions. Most of the others either perform poorly or differ little from the popular ones.

    I think the difference likely comes down to two reasons:

    • LLMs require more effort in curating the dataset for training. Whereas a Stable Diffusion model can be trained by grabbing a bunch of pictures of a particular subject or style and throwing them in a directory, an LLM requires careful gathering and reformatting of text. If you want an LLM to write dialog for a particular character, for example, you would need to try to find or write a lot of existing dialog for that character, which is generally harder than just searching for images on the internet.

    • LLMs are already more versatile. For example, most of the popular LLMs will already write dialog for a particular character (or at least attempt to) when given just a description of the character and possibly a short snippet of sample dialog. Fine-tuning doesn’t give any significant improvement in that regard. If you want the LLM to write in a specific style, such as Old English, it is usually sufficient to instruct it to do so and perhaps prime the conversation with a sentence or two written in that style.
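
    To make the second point concrete, here is a rough sketch of what "a description plus a short snippet of sample dialog" can look like as a prompt. The function name, the prompt wording, and the example character are all made up for illustration. They don't come from any particular model or library, and you would pass the resulting string to whatever LLM you happen to be running.

```python
def build_character_prompt(description, sample_lines, style_note=None):
    """Assemble a plain-text prompt that primes a model to write in character.

    Hypothetical helper: the layout below is just one plausible format,
    not something any specific model requires.
    """
    parts = [f"Character description: {description}"]
    if style_note:
        # Optional style instruction, e.g. an archaic register.
        parts.append(f"Writing style: {style_note}")
    if sample_lines:
        # A sentence or two of sample dialog to prime the style.
        parts.append("Sample dialog:")
        parts.extend(f'- "{line}"' for line in sample_lines)
    parts.append("Continue the conversation, writing only this character's lines.")
    return "\n".join(parts)


# Illustrative character, invented for this example.
prompt = build_character_prompt(
    description="A weary ship's captain who distrusts the sea",
    sample_lines=["Storm's coming. It always is."],
    style_note="archaic, formal English",
)
print(prompt)
```

    The whole "customization" here is a few lines of text, which is why there is less incentive to train and share a separate model for each character or style.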