PG-Video-LLaVA

Published: 2024-04-22

git clone https://github.com/mbzuai-oryx/Video-LLaVA.git
cd Video-LLaVA
conda create --name=pg_video_llava python=3.10
conda activate pg_video_llava

pip install "triton>=2.0.0"
nvcc -V
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
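
Once the conda install finishes, it can help to confirm which PyTorch build was actually picked up. The following is a minimal sketch, not part of the original setup notes; the helper name `describe_torch_env` is hypothetical, and it only uses `torch.__version__`, `torch.version.cuda`, and `torch.cuda.is_available()`.

```python
import importlib.util


def describe_torch_env():
    """Return a small report on the PyTorch install, or a note if it is missing."""
    # Hypothetical helper for illustration; not part of the PG-Video-LLaVA repo.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch

    return {
        "installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_env().items():
        print(f"{key}: {value}")
```

With the conda command above, the report would be expected to show a 2.0.1 build targeting CUDA 11.7. If `cuda_available` is false, comparing the driver and toolkit versions against `nvcc -V` is a reasonable next step.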

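Contents of requirements.txt (the torch, torchaudio, and torchvision entries are commented out, since PyTorch was already installed with conda above):

#torch #==2.1.0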
#torchaudio #==2.1.0
#torchvision #==0.16.0
tqdm==4.65.0
git+https://github.com/openai/CLIP.git
numpy==1.24.3
Pillow==9.5.0
decord==0.6.0
gradio==3.23.0
markdown2==2.4.8
einops==0.6.1
requests==2.30.0
sentencepiece==0.1.99
protobuf==4.23.2
accelerate==0.20.3
tokenizers>=0.13.3
pydantic==1.10.7
git+https://github.com/m-bain/whisperx.git
git+https://github.com/shehanmunasinghe/whisper-at.git@patch-1#subdirectory=package/whisper-at
git+https://github.com/xinyu1205/recognize-anything.git
#transformers@git+https://github.com/huggingface/transformers.git@cae78c46
openai==0.28.0
scenedetect[opencv-headless]

pip install transformers -U
pip install "transformers[torch]"
pip install -r requirements.txt

Installing transformers[torch] also resolves the error raised by accelerate.
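
Because `pip install transformers -U` and the pinned requirements can pull versions in different directions, it may be useful to compare the exact `==` pins against what is installed. This is a rough sketch under stated assumptions: `parse_pins` and `find_mismatches` are hypothetical helper names, only `name==version` lines are checked, and `git+` URLs, range specifiers, and commented-out entries are skipped.

```python
from importlib import metadata
from pathlib import Path


def parse_pins(lines):
    """Collect name -> version for lines of the form 'name==version'."""
    pins = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("git+") or "==" not in line:
            continue  # skip blanks, VCS URLs, and unpinned or range specs
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return name -> (pinned, installed or None) for pins that do not match."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    req = Path("requirements.txt")  # assumes the script runs from the repo root
    if req.exists():
        found = find_mismatches(parse_pins(req.read_text().splitlines()))
        for name, (wanted, installed) in found.items():
            print(f"{name}: pinned {wanted}, installed {installed}")
```

A mismatch is not necessarily a problem; it only points to packages worth a second look if an import fails later.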

Download PG-Video-LLaVA Weights

Set up DEVA as described in its repository:

git clone https://github.com/hkchengrex/Tracking-Anything-with-DEVA.git
cd Tracking-Anything-with-DEVA
pip install -e .

Set up Grounded-Segment-Anything as described in its repository:

cd ../
git clone https://github.com/hkchengrex/Grounded-Segment-Anything.git
cd Grounded-Segment-Anything
python -m pip install -e segment_anything
python -m pip install -e GroundingDINO


parser.add_argument("--model-name", type=str, default='weights/llava/llava-v1.5-7b')
parser.add_argument("--projection_path", type=str, default='weights/llava/projection/mm_projector_7b_1.5_336px.bin')
parser.add_argument("--use_asr", action='store_true', default=False, help='Whether to use audio transcripts or not')
parser.add_argument("--conv_mode", type=str, required=False, default='pg-video-llava')
parser.add_argument("--with_grounding", action='store_true', required=False, help='Run with grounding module')
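
To see how these options behave when parsed, here is a small self-contained sketch that rebuilds only the five arguments quoted above. The `build_parser` wrapper is hypothetical and the rest of the script is omitted; the point is that `--model-name` is read back as `args.model_name`, and the two `store_true` flags stay `False` unless passed.

```python
import argparse


def build_parser():
    """Rebuild only the five options quoted above (illustrative wrapper)."""
    parser = argparse.ArgumentParser(description="Sketch of the quoted options")
    parser.add_argument("--model-name", type=str,
                        default="weights/llava/llava-v1.5-7b")
    parser.add_argument("--projection_path", type=str,
                        default="weights/llava/projection/mm_projector_7b_1.5_336px.bin")
    parser.add_argument("--use_asr", action="store_true",
                        help="Whether to use audio transcripts or not")
    parser.add_argument("--conv_mode", type=str, default="pg-video-llava")
    parser.add_argument("--with_grounding", action="store_true",
                        help="Run with grounding module")
    return parser


# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--use_asr"])
print(args.model_name)                    # the dash in --model-name becomes an underscore
print(args.use_asr, args.with_grounding)  # True False
```

Since the defaults point at paths under `weights/llava/`, placing the downloaded weights there would let the script run without the two path flags.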

python video_chatgpt/chat.py \
    --model-name  <path_to_LLaVA-7B-1.5_weights> \
    --projection_path <path_to_projector_weights_for_LLaVA-7B-1.5> \
    --use_asr \
    --conv_mode pg-video-llava
