Medical Report Generation with Llama 3.2-Vision

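This post records a worked example of using a large model to generate medical reports. It covers running the model locally only.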

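Prerequisites

Python is installed.
At least 8 GB of GPU memory (tested successfully on an 8 GB device; other configurations are untested, so try them yourself).
Ollama is installed (Ollama is a platform for running models, including multimodal ones, locally).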

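Method 1: Calling the model directly through Ollama

Step 1: Install Ollama

To install Ollama, follow these steps.

Download Ollama: go to the official Ollama website (https://ollama.com/download) and download the installer for your operating system.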

Follow the installer prompts to complete the installation.
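
To confirm that the installation succeeded, one option is a small check from Python. This is a minimal sketch, assuming the installer placed an executable named `ollama` on your PATH and that `ollama --version` prints a version string; `find_ollama` and `ollama_version` are hypothetical helper names, not part of Ollama.

```python
import shutil
import subprocess
from typing import Optional


def find_ollama(name: str = "ollama") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(name)


def ollama_version(executable: str) -> str:
    """Run `<executable> --version` and return whatever it prints."""
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Only looks the executable up; it does not start Ollama.
path = find_ollama()
print("ollama found at:", path if path else "not found (check the installation or PATH)")
```

If a path is printed, calling `ollama_version(path)` should return the installed version string. If nothing is found, opening a new terminal after installation is worth trying before reinstalling.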

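Step 2: Install and use the Llama 3.2-Vision model

Using it directly from the command line:

The command has the following structure, where the last argument is a quoted prompt that includes the path to the image: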

ollama run model_name task_filepath

For example:

ollama run llama3.2-vision "describe this image(only including Impression and Findings): D:\Medical-Report-Generation\IUdata\NLMCXR_Frontal\CXR1_1_IM-0001-4001.png"
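
The same one-shot command can also be assembled and launched from a script, which avoids shell quoting problems with Windows paths. The sketch below is illustrative only: `build_command` and `run_command` are hypothetical helper names, and it assumes that `ollama run` writes the model's answer to standard output when it is not attached to an interactive terminal.

```python
import subprocess
from typing import List


def build_command(model: str, prompt: str, image_path: str) -> List[str]:
    """Assemble `ollama run <model> "<prompt>: <image_path>"` as an argument list."""
    return ["ollama", "run", model, f"{prompt}: {image_path}"]


def run_command(command: List[str]) -> str:
    """Run the command and return its standard output as text."""
    result = subprocess.run(
        command, capture_output=True, text=True, encoding="utf-8", check=True
    )
    return result.stdout.strip()


# Building the argument list does not start Ollama; run_command() does.
command = build_command(
    "llama3.2-vision",
    "describe this image(only including Impression and Findings)",
    r"D:\Medical-Report-Generation\IUdata\NLMCXR_Frontal\CXR1_1_IM-0001-4001.png",
)
print(command)
```

Passing the arguments as a list means the prompt reaches Ollama as a single argument, exactly as the quoted string does in the command above. Calling `run_command(command)` requires Ollama to be installed and the model to be available.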


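Using it interactively, like any other large model:

Run the command below in a terminal to download the Llama 3.2-Vision model (on first use) and start it: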

ollama run llama3.2-vision

Then type your prompts interactively, as you would with any other chat model.

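Method 2: Calling the model from a Python script

Step 1: Set up the Python environment

Creating and managing the environment with Anaconda is recommended. See an introductory tutorial for the details.

Step 2: Install the Ollama Python library

Run the following command: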

pip install ollama
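
A quick way to check that the package landed in the environment you are actually using is to ask Python whether it can locate it. This is a small sketch using only the standard library; `is_installed` is a hypothetical helper name.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be located by the current interpreter."""
    return importlib.util.find_spec(package) is not None


print("ollama importable:", is_installed("ollama"))
```

If this prints `False` right after a successful `pip install`, the usual cause is that `pip` and `python` belong to different environments.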


Step 3: Call the model through Ollama

In this approach Ollama is used as an ordinary Python library, so you call it like any other package. Sample code:

import ollama

image_path = r"D:\Medical-Report-Generation\CXR1_1_IM-0001-4001.png"  # Replace with your image path

# Use Ollama to analyze the image with Llama 3.2-Vision
response = ollama.chat(
    model="llama3.2-vision",  # Replace with a different model if needed
    messages=[{
      "role": "user",
      "content": "describe this image(only including Impression and Findings):",
      "images": [image_path]
    }],
)

# Extract the model's response about the image
cleaned_text = response['message']['content'].strip()
print(f"Model Response: {cleaned_text}")
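
Because the prompt asks for Impression and Findings only, it can be convenient to split the returned text into those two sections. The sketch below is one possible approach, not something the model guarantees: it assumes the response puts each heading at the start of a line (optionally wrapped in Markdown markers such as `**` or `#`), and `split_sections` is a hypothetical helper name. The `sample` string is made up for illustration and is not real model output.

```python
import re
from typing import Dict, List

# Matches a line that starts with "Findings" or "Impression", optionally wrapped
# in Markdown markers, followed either by a colon and text or by nothing at all.
_HEADING = re.compile(
    r"^\s*[*#\s]*(findings|impression)\b[*\s]*(?::[*\s]*(.*))?$",
    re.IGNORECASE,
)


def split_sections(text: str) -> Dict[str, str]:
    """Group the lines of a response under the headings 'findings' and 'impression'."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in text.splitlines():
        match = _HEADING.match(line)
        if match:
            current = match.group(1).lower()
            sections.setdefault(current, [])
            rest = (match.group(2) or "").strip()
            if rest:
                sections[current].append(rest)
        elif current and line.strip():
            # Non-empty line that belongs to the most recent heading.
            sections[current].append(line.strip())
    return {name: " ".join(parts) for name, parts in sections.items()}


# Made-up sample text, used only to exercise the parser.
sample = (
    "**Findings:** The lungs are clear.\n"
    "No pleural effusion.\n"
    "\n"
    "**Impression:** No acute disease."
)
print(split_sections(sample))
```

With the script above, `split_sections(cleaned_text)` would return a dictionary keyed by whichever of the two headings the model actually produced. A section the model left out is simply absent from the result.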