Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs (study log day 6, 240416)

The model builds on the PaLI-3 architecture and pre-training recipe, which consists of two backbones: ViT-2B and the text encoder-decoder UL2-3B.

Pre-training: Chart2Table Mixture

This stage is run with the ViT unfrozen.

Pre-training uses a mixture of several chart-to-table datasets.
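A minimal sketch of what one chart-to-table training example might look like. These notes do not record the prompt text or the table serialization, so the separators, the prompt string, and the helper names (`linearize_table`, `make_chart2table_example`) are all assumptions made for illustration.

```python
def linearize_table(header, rows, col_sep=" | ", row_sep="\n"):
    """Flatten a table into a single target string.

    The separators are assumed here; the notes do not specify a format.
    """
    lines = [col_sep.join(str(cell) for cell in header)]
    for row in rows:
        lines.append(col_sep.join(str(cell) for cell in row))
    return row_sep.join(lines)


def make_chart2table_example(image_path, header, rows):
    """Pair a chart image with its linearized table as the text target."""
    return {
        "image": image_path,
        # Hypothetical prompt; the actual wording is not given in the notes.
        "prompt": "Generate the underlying data table for the chart.",
        "target": linearize_table(header, rows),
    }


example = make_chart2table_example(
    "chart_0001.png",  # hypothetical file name
    ["Year", "Sales"],
    [[2020, 10], [2021, 15]],
)
print(example["target"])
```

Examples from several chart-to-table datasets built this way could then be mixed into one pre-training set.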

Fine-tuning: Multi-task Loss

There are two ways of incorporating the rationales available in the extended dataset.

1. Single-Task setup

The target is changed from the answer alone to the rationale followed by the answer.

2. Multi-Task setup

The answer and the rationale are treated as independent tasks.
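The two setups above can be sketched as different ways of turning one (question, rationale, answer) triple into training examples. The prompt prefixes, the "Answer:" delimiter, and the weighted-sum loss are assumptions for illustration; these notes do not record the exact prompts or how the per-task losses are weighted.

```python
def single_task_example(question, rationale, answer):
    """Single-task: one example whose target is the rationale followed by the answer."""
    return {
        "prompt": f"Explain and answer: {question}",  # hypothetical prefix
        "target": f"{rationale} Answer: {answer}",     # hypothetical delimiter
    }


def multi_task_examples(question, rationale, answer):
    """Multi-task: two independent examples, one per task, told apart by the prompt."""
    return [
        {"task": "answer", "prompt": f"Answer: {question}", "target": answer},
        {"task": "rationale", "prompt": f"Explain: {question}", "target": rationale},
    ]


def combined_loss(answer_loss, rationale_loss, answer_weight=0.5):
    """Weighted sum of the two task losses (the weighting scheme is an assumption)."""
    return answer_weight * answer_loss + (1.0 - answer_weight) * rationale_loss


triple = ("Which year had higher sales?", "Sales were 10 in 2020 and 15 in 2021.", "2021")
print(single_task_example(*triple)["target"])
for ex in multi_task_examples(*triple):
    print(ex["task"], "->", ex["target"])
```

In the single-task form the model must always generate the rationale before the answer, while in the multi-task form the answer can be requested on its own at inference time.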

Result

Single-task vs. Multi-task

Compared with the human dataset, the QA pairs in the augmented dataset are somewhat more monotonous.

Learning with augmented dataset
