LLaVA-Plus - Large Language and Vision Assistants that Plug and Learn to Use Skills

llava-vl.github.io

Posted by Even_Adder@lemmy.dbzer0.com to Free Open-Source Artificial Intelligence@lemmy.world · English · 2 years ago

Abstract

LLaVA-Plus is a general-purpose multimodal assistant that expands the capabilities of large multimodal models. It maintains a skill repository of pre-trained vision and vision-language models and can activate relevant tools based on users’ inputs to fulfill real-world tasks. LLaVA-Plus is trained on multimodal instruction-following data to acquire the ability to use tools, covering visual understanding, generation, external knowledge retrieval, and compositions of these skills. Empirical results show that LLaVA-Plus outperforms LLaVA in existing capabilities and exhibits new ones. It is distinct in that the image query is directly grounded and actively engaged throughout the entire human-AI interaction session, significantly improving tool-use performance and enabling new scenarios.
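To make the "skill repository" idea in the abstract a bit more concrete, here is a minimal Python sketch of a registry that maps skill names to callables and dispatches one based on the user's instruction. Everything in it is an assumption for illustration: `Skill`, `SkillRepository`, `plan_skill`, `answer`, and the stub tools are hypothetical names, not the LLaVA-Plus codebase's API, and the keyword matcher is only a stand-in for the trained model, which learns tool choice from instruction-following data.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Skill:
    """One entry in the (hypothetical) skill repository."""
    name: str
    description: str
    run: Callable[[str, str], str]  # (image_path, instruction) -> tool output


class SkillRepository:
    """Holds the available skills and runs one by name."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[skill.name] = skill

    def names(self) -> List[str]:
        return sorted(self._skills)

    def call(self, name: str, image_path: str, instruction: str) -> str:
        if name not in self._skills:
            raise KeyError(f"unknown skill: {name}")
        return self._skills[name].run(image_path, instruction)


def plan_skill(instruction: str) -> str:
    """Toy keyword planner (assumption): stands in for the model's tool choice."""
    text = instruction.lower()
    if "detect" in text or "find" in text:
        return "detection"
    if "caption" in text or "describe" in text:
        return "captioning"
    return "none"


def answer(repo: SkillRepository, image_path: str, instruction: str) -> str:
    """Pick a skill, run it on the image, and fold the output into a reply."""
    skill_name = plan_skill(instruction)
    if skill_name == "none":
        return f"[no tool] answering directly: {instruction}"
    tool_output = repo.call(skill_name, image_path, instruction)
    return f"[{skill_name}] {tool_output}"


def build_demo_repo() -> SkillRepository:
    """Register two stub skills; real ones would wrap pre-trained models."""
    repo = SkillRepository()
    repo.register(Skill("detection", "locate objects in the image",
                        lambda img, ins: f"stub boxes for {img}"))
    repo.register(Skill("captioning", "describe the image",
                        lambda img, ins: f"stub caption for {img}"))
    return repo


if __name__ == "__main__":
    demo = build_demo_repo()
    print(answer(demo, "cat.jpg", "Detect the cat in this picture"))
    print(answer(demo, "cat.jpg", "What is 2 + 2?"))
```

Note that the image path is passed to every tool call, which loosely mirrors the abstract's point that the image query stays engaged throughout the session rather than being consumed once up front.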

Paper: https://arxiv.org/abs/2311.05437

Code: https://github.com/LLaVA-VL/LLaVA-Plus-Codebase

Demo: https://llavaplus.ngrok.io/

Dataset: https://huggingface.co/datasets/LLaVA-VL/llava-plus-data

Model: https://llava-vl.github.io/llava-plus/
