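Start with the official repository: raoyongming/DenseCLIP: [CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting (github.com)
However, the environment setup described there may not fit current systems well; I spent a long time on it and could not get it working.
I then tried a different version of the repository (note that this version is missing some content).
Training this model takes quite a while, so set aside enough time for it.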
**[create your virtual environment]
conda create -n dense python=3.8
conda activate dense
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install -U openmim
mim install mmcv-full
pip install mmsegmentation==0.30.0
pip install timm
pip install regex
pip install ftfy
pip install fvcore
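Since the PyTorch build has to match the CUDA version, a quick check after installation can save a failed training run later. The following is only a minimal sketch: `split_build_tag` is a hypothetical helper written for this note, not part of PyTorch, while `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()` are standard PyTorch attributes.

```python
# Sketch: confirm the installed PyTorch build carries the expected CUDA tag.
# split_build_tag is a hypothetical helper, not a PyTorch function.

def split_build_tag(version):
    """Split '1.13.1+cu117' into ('1.13.1', 'cu117'); the tag is '' if absent."""
    base, _, tag = version.partition("+")
    return base, tag


if __name__ == "__main__":
    import torch  # imported here so the helper above works without PyTorch

    base, tag = split_build_tag(torch.__version__)
    print("PyTorch version:", base)
    print("Build tag:", tag or "(none)")
    print("CUDA version PyTorch was built with:", torch.version.cuda)
    print("CUDA device available:", torch.cuda.is_available())
```

With the install command above, the build tag would be expected to read `cu117`. If no CUDA device is reported as available, the driver or the installed wheel is the first thing to check.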
**[download pretrain weight]
cd pretrained
bash download_clip_models.sh
cd ..
**[dataset download]
Download the ADE20K dataset from data.csail.mit.edu/places/ADEchallenge/ADEChallengeData2016.zip
After extraction, the directory structure should look like this:
├──YOUR_LOCAL_PATH
│ ├── ade
│ │ ├── ADEChallengeData2016
│ │ │ ├── annotations
│ │ │ │ ├── training
│ │ │ │ ├── validation
│ │ │ ├── images
│ │ │ │ ├── training
│ │ │ │ ├── validation
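To catch a misplaced folder before starting a long training run, the layout above can be verified with a few lines of Python. This is a minimal sketch; `missing_ade_dirs` and `EXPECTED_SUBDIRS` are hypothetical names introduced here, and the script only checks that the directories exist, not that their contents are complete.

```python
from pathlib import Path

# Sub-directories expected under ADEChallengeData2016, per the tree above.
EXPECTED_SUBDIRS = [
    "annotations/training",
    "annotations/validation",
    "images/training",
    "images/validation",
]


def missing_ade_dirs(local_path):
    """Return the expected directories that do not exist under local_path."""
    root = Path(local_path) / "ade" / "ADEChallengeData2016"
    return [str(root / sub) for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    import sys

    # Pass YOUR_LOCAL_PATH as the first argument; defaults to the current directory.
    local_path = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_ade_dirs(local_path)
    if missing:
        print("Missing directories:")
        for path in missing:
            print("  " + path)
    else:
        print("Dataset layout looks complete.")
```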
**[quick implementation]
Change the data_root in configs/_base_/datasets/ade20k_clip.py so that it points to your local copy of the dataset.
bash dist_train.sh configs/denseclip_fpn_res50_512x512_80k.py 1
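For reference, the data_root edit mentioned above might look like the fragment below. This is an assumption rather than a copy of the repository's file: it presumes the config follows the usual mmsegmentation ADE20K convention, where `data_root` points at the `ADEChallengeData2016` folder, and `YOUR_LOCAL_PATH` is a placeholder to replace with your own path.

```python
# In configs/_base_/datasets/ade20k_clip.py (sketch; the path is a placeholder).
# Assumption: data_root points at the ADEChallengeData2016 folder, as in
# mmsegmentation's stock ADE20K dataset config.
data_root = 'YOUR_LOCAL_PATH/ade/ADEChallengeData2016'
```

In mmsegmentation-style launch scripts, the trailing argument of dist_train.sh is the number of GPUs, so the `1` in the command above would correspond to single-GPU training.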
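Run the commands above to set up the environment, and make sure the PyTorch version is compatible with your CUDA version. You need to prepare the dataset yourself; searching for the dataset by name and downloading it is enough.
Once training starts, all that remains is to wait for the long run to finish.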
zhaozh10/DenseCLIP (github.com)