1. Project repository:
IDEA-Research/Grounded-Segment-Anything: Grounded-SAM: Marrying Grounding-DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything (github.com)
2. Download the code
Option 1: download the ZIP archive
Option 2: git clone (strongly recommended)
git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git
3. Create a virtual environment
conda create -n label python=3.8
Then create a working folder and, inside it, run the git clone command from Option 2 of step 2 to download the code.
4. Open an IDE
Pick whichever IDE you are comfortable with for reading and debugging the code; this walkthrough uses PyCharm.
4.1 Configure the virtual environment
Follow the steps below and click OK.
Then remember to activate the virtual environment in PyCharm's terminal (conda activate label).
4.2 Install PyTorch
Official site: PyTorch (https://pytorch.org)
Choose the PyTorch build that matches your CUDA version; mine is CUDA 11.8.
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
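Before continuing, it is worth confirming that the CUDA-enabled build was actually installed. A minimal check, assuming the label environment is active:
import torch

# The version string should end in +cu118 for the CUDA 11.8 wheel
print(torch.__version__)
# Should print True on a machine with a working CUDA driver
print(torch.cuda.is_available())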
4.3 Install the dependencies
cd Grounded-Segment-Anything
python -m pip install -e segment_anything
Before running the next command, make sure PyTorch is already installed, otherwise the build will fail with an error.
pip install --no-build-isolation -e GroundingDINO
git clone https://github.com/xinyu1205/recognize-anything.git
pip install -r ./recognize-anything/requirements.txt
pip install -e ./recognize-anything/
pip install opencv-python pycocotools matplotlib onnxruntime onnx ipykernel
cd Grounded-Segment-Anything
git submodule init
git submodule update
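Before moving on, a quick import check confirms that the three editable installs actually registered. A minimal sanity check, run inside the Grounded-Segment-Anything directory with the label environment active:
import segment_anything   # installed with `pip install -e segment_anything`
import groundingdino      # installed with `pip install --no-build-isolation -e GroundingDINO`
import ram                # installed with `pip install -e ./recognize-anything/`

print("segment_anything, groundingdino and ram all import correctly")
If import ram fails here, see the fix in section 4.4 below.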
You also need to download four model checkpoints:
https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth
https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
https://huggingface.co/spaces/xinyu1205/Tag2Text/resolve/main/ram_swin_large_14m.pth
https://huggingface.co/spaces/xinyu1205/Tag2Text/resolve/main/tag2text_swin_14m.pth
Paste each link into your browser and the download starts automatically, or use the small script below.
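If downloading through the browser is inconvenient, the same four files can be fetched with a short standard-library script (the URLs and filenames are exactly the ones listed above); run it from the Grounded-Segment-Anything directory so the checkpoints land in the repository root:
import urllib.request

# (URL, local filename) pairs for the four checkpoints listed above
checkpoints = [
    ("https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth",
     "groundingdino_swint_ogc.pth"),
    ("https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth",
     "sam_vit_h_4b8939.pth"),
    ("https://huggingface.co/spaces/xinyu1205/Tag2Text/resolve/main/ram_swin_large_14m.pth",
     "ram_swin_large_14m.pth"),
    ("https://huggingface.co/spaces/xinyu1205/Tag2Text/resolve/main/tag2text_swin_14m.pth",
     "tag2text_swin_14m.pth"),
]

for url, filename in checkpoints:
    print(f"downloading {filename} ...")
    urllib.request.urlretrieve(url, filename)  # the files range from hundreds of MB to a few GB, so this takes a while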
Install a few more packages:
pip install litellm
pip install nltk
pip install --upgrade transformers
4.4 Troubleshooting
During this process you may run into the following error:
(label) D:\Desktop\text\Grounded-Segment-Anything>python automatic_label_ram_demo.py
Traceback (most recent call last):
File "automatic_label_ram_demo.py", line 28, in <module>
from ram.models import ram
ModuleNotFoundError: No module named 'ram'
Solution:
import sys
import os
# Build the path to the 'recognize-anything' directory (next to this script)
recognize_anything_dir = os.path.join(os.path.dirname(__file__), 'recognize-anything')
# Add the 'recognize-anything' directory to the Python interpreter's search path
sys.path.append(recognize_anything_dir)
# Now the ram module can be imported
from ram.models import ram
This way the Python interpreter adds the recognize-anything directory to its search path, so your program can import the ram module correctly. Make sure this snippet sits at the top of your automatic_label_ram_demo.py file.
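A slightly more defensive variant of the same fix (purely optional) resolves an absolute path and prepends it, so the local recognize-anything checkout takes precedence over any other installed copy:
import os
import sys

# Absolute path to the recognize-anything checkout next to this script
recognize_anything_dir = os.path.abspath(
    os.path.join(os.path.dirname(__file__), "recognize-anything")
)

# Prepend rather than append so this copy wins on the search path
if recognize_anything_dir not in sys.path:
    sys.path.insert(0, recognize_anything_dir)

from ram.models import ram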
The command below is the one the official docs give. I consider it a trap: the backslash line continuations (and export) are bash syntax and do not work in the Windows command prompt, so the whole command has to be entered as a single line.
export CUDA_VISIBLE_DEVICES=0
python automatic_label_ram_demo.py \
--config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py \
--ram_checkpoint ram_swin_large_14m.pth \
--grounded_checkpoint groundingdino_swint_ogc.pth \
--sam_checkpoint sam_vit_h_4b8939.pth \
--input_image assets/demo9.jpg \
--output_dir "outputs" \
--box_threshold 0.25 \
--text_threshold 0.2 \
--iou_threshold 0.5 \
--device "cuda"
Correct examples (both the --key=value and the --key value argument styles work):
python automatic_label_ram_demo.py --config=GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py --ram_checkpoint=ram_swin_large_14m.pth --grounded_checkpoint=groundingdino_swint_ogc.pth --sam_checkpoint=sam_vit_h_4b8939.pth --input_image="D:\Desktop\text\Grounded-Segment-Anything\bird.jpg" --output_dir="outputs" --box_threshold=0.25 --text_threshold=0.2 --iou_threshold=0.5 --device="cuda"
python automatic_label_ram_demo.py --config GroundingDINO/groundingdino/config/GroundingDINO_SwinT_OGC.py --ram_checkpoint ram_swin_large_14m.pth --grounded_checkpoint groundingdino_swint_ogc.pth --sam_checkpoint sam_vit_h_4b8939.pth --input_image assets/demo9.jpg --output_dir "outputs" --box_threshold 0.25 --text_threshold 0.2 --iou_threshold 0.5 --device "cuda"