Deploying the Grounded-SAM Demo

1 Environment Setup

2 Installing the Grounded-SAM Demo

3 Running the Demo

3.1 Running the Gradio APP

3.2 Using the Gradio APP

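1 Environment Setup

SAM recommends CUDA 11.3 or later; this guide uses CUDA 11.4.

SAM is developed with PyTorch, so a Python environment is required. This guide uses conda, which makes it easier to install the dependencies. Python 3.8 or later is required; Python 3.8 is used here. The steps to set up Python 3.8 with conda are as follows.

(1) Install Anaconda

Download Anaconda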

wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-5.3.1-Linux-x86_64.sh

Install Anaconda

bash Anaconda3-5.3.1-Linux-x86_64.sh

(2) Create a Python 3.8 virtual environment

conda create -n SAM python=3.8

(3) Activate the Python 3.8 virtual environment

source activate SAM
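
Once the environment is active, a quick check can confirm that the interpreter meets the Python 3.8 requirement. The following is a minimal sketch and is not part of the Grounded-SAM repository; the function name meets_python_requirement is made up for illustration. The PyTorch lookup is optional, since PyTorch may not be installed at this point.

```python
import importlib.util
import sys


def meets_python_requirement(version_info=sys.version_info, minimum=(3, 8)):
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    status = "OK" if meets_python_requirement() else "too old (need 3.8+)"
    print("Python", sys.version.split()[0], status)

    # PyTorch is optional here: it may not be installed yet.
    if importlib.util.find_spec("torch") is not None:
        import torch

        # torch.version.cuda is the CUDA version PyTorch was built against
        # (None for CPU-only builds).
        print("PyTorch CUDA build:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
    else:
        print("PyTorch is not installed in this environment yet.")
```

Running the script inside the SAM environment should report a Python version of 3.8 or later.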

2 Installing the Grounded-SAM Demo

(1) Download Grounded-SAM

git clone https://github.com/IDEA-Research/Grounded-Segment-Anything.git

(2) Install Grounded-SAM

cd Grounded-Segment-Anything

1> Install Segment Anything:

python -m pip install -e segment_anything

2> Install Grounding DINO:

python -m pip install -e GroundingDINO

3> Install diffusers:

pip install --upgrade diffusers[torch]

4> Install osx:

git submodule update --init --recursive
cd grounded-sam-osx && bash install.sh
cd ..

5> Install Tag2Text:

git submodule update --init --recursive
cd Tag2Text && pip install -r requirements.txt
cd ..

6> Install other optional components:

pip install opencv-python pycocotools matplotlib onnxruntime onnx ipykernel
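
After the installation steps, it can be useful to confirm that the packages are visible to the interpreter before moving on. The sketch below is illustrative only: the helper missing_modules and the list EXPECTED_MODULES are not part of the repository, and the import names in the list are assumptions about how the installed packages are exposed (for example, opencv-python is imported as cv2).

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from `module_names` that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed top-level import names for the packages installed above.
EXPECTED_MODULES = [
    "segment_anything", "groundingdino", "diffusers",
    "cv2", "pycocotools", "matplotlib", "onnxruntime", "onnx",
]

if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

The check only looks each module up and does not import it, so it will not catch a package that is present but fails at import time (for example, because of a CUDA mismatch).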

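3 Running the Demo

Grounded-SAM provides the following demos, which combine one or more large models to deliver more powerful visual processing.

(1) GroundingDINO: detect everything with a text prompt.

(2) GroundingDINO + Segment-Anything: detect and segment everything with a text prompt.

(3) GroundingDINO + Segment-Anything + Stable-Diffusion: detect, segment, and generate anything with a text prompt.

(4) Grounded-SAM + Stable-Diffusion Gradio APP: a web service for detecting, segmenting, and generating anything, with both text-prompt and fully automatic modes.

(5) Grounded-SAM + Tag2Text: an automatic labeling system with strong image tagging.

(6) Grounded-SAM + BLIP: an automatic labeling system.

(7) Whisper + Grounded-SAM: detect and segment everything with speech.

(8) Grounded-SAM + Visual ChatGPT: automatically label and generate everything with a ChatBot.

(9) Grounded-SAM + OSX: text to 3D whole-body mesh recovery; detects any person and reconstructs a 3D human mesh.

(10) Interactive fashion-editing playground: click to segment and edit.

(11) Interactive face-editing playground: click to edit a face.

The rest of this guide uses the Gradio APP as an example to walk through setting up a demo.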

3.1 Running the Gradio APP

(1) Install gradio

pip install gradio

(2) Upgrade transformers

pip install --upgrade transformers

(3) Install openai

pip install openai

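(4) Download the models

Two large models are used here: sam_vit_h_4b8939.pth, the SAM model that segments everything, and groundingdino_swint_ogc.pth, the Grounding DINO model that detects everything. Download both into the root of the Grounded-Segment-Anything repository.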

cd Grounded-Segment-Anything
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
wget https://github.com/IDEA-Research/GroundingDINO/releases/download/v0.1.0-alpha/groundingdino_swint_ogc.pth

(5) Run the Gradio APP

python gradio_app.py
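
If the app fails at startup, one thing worth ruling out is a missing or incomplete checkpoint from the model download step. The sketch below assumes both files were saved in the current directory; find_missing_checkpoints is a hypothetical helper, not something the repository provides. It only checks that each file exists and is non-empty, and does not verify file integrity.

```python
from pathlib import Path

# Checkpoint file names from the model download step.
CHECKPOINTS = ["sam_vit_h_4b8939.pth", "groundingdino_swint_ogc.pth"]


def find_missing_checkpoints(directory=".", names=CHECKPOINTS):
    """Return the checkpoint names that are absent or empty in `directory`."""
    base = Path(directory)
    missing = []
    for name in names:
        path = base / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_checkpoints()
    if missing:
        print("Missing or empty checkpoints:", ", ".join(missing))
    else:
        print("Both checkpoints are present.")
```

Run it from the Grounded-Segment-Anything directory, or pass a different directory as the first argument to find_missing_checkpoints.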

3.2 Using the Gradio APP

Once started, the Gradio APP runs as a web service that can be accessed directly from a browser. The default port is 7589.
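
Before opening the browser, a simple TCP probe can tell whether the service is listening. This is a minimal sketch under the assumption that the app runs on the same machine and on the default port mentioned above; port_is_open is an illustrative helper, and the host and port should be adjusted for a remote server or a custom port.

```python
import socket


def port_is_open(host="127.0.0.1", port=7589, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("Gradio APP reachable:", port_is_open())
```

A True result only shows that something accepts connections on that port; it does not confirm that the process behind it is the Gradio APP.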

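The Gradio APP provides the following six task_type modes:

(1) scribble: segmentation through Segment Anything and mouse-click interaction (click the object with the mouse; no prompt is required).

(2) automask: segmentation of the whole image in one pass with Segment Anything (no prompt is required).

(3) det: detection through Grounding DINO and text interaction (a text prompt is required).

(4) seg: text-driven detection plus segmentation, combining Grounding DINO and Segment Anything (a text prompt is required).

(5) inpainting: text-driven replacement of the target object, combining Grounding DINO, Segment Anything, and Stable Diffusion (a text prompt and an inpaint prompt are required).

(6) automatic: non-interactive detection plus segmentation, combining BLIP, Grounding DINO, and Segment Anything (no prompt is required).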
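
The prompt requirements in the list above can be summarized as a small lookup table. The sketch below is purely illustrative and does not reflect how gradio_app.py is implemented: the task_type keys come from the list, while the table TASK_PROMPTS, the prompt labels, and the function required_prompts are made-up names.

```python
# Prompts each task_type needs, as described in the list above.
# The prompt labels are illustrative, not identifiers from gradio_app.py.
TASK_PROMPTS = {
    "scribble": (),    # mouse clicks on the image, no prompt
    "automask": (),    # whole-image segmentation, no prompt
    "det": ("text prompt",),
    "seg": ("text prompt",),
    "inpainting": ("text prompt", "inpaint prompt"),
    "automatic": (),   # non-interactive, no prompt
}


def required_prompts(task_type):
    """Return the tuple of prompts that `task_type` needs."""
    try:
        return TASK_PROMPTS[task_type]
    except KeyError:
        valid = ", ".join(TASK_PROMPTS)
        raise ValueError(
            f"unknown task_type {task_type!r}; expected one of: {valid}"
        ) from None


if __name__ == "__main__":
    for name in TASK_PROMPTS:
        prompts = required_prompts(name)
        print(f"{name}: {', '.join(prompts) if prompts else 'no prompt'}")
```

For example, required_prompts("inpainting") returns both prompts, while an unrecognized mode raises a ValueError that lists the six valid values.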
