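Vitis AI Advanced Topics (pybind11)

1. Introduction

2. Code Analysis

2.1 Introduction to pybind11

2.2 The writefile Magic Command

2.3 Quick Compilation and Loading

2.3.1 Basic Syntax

2.3.2 Locating the Module

2.3.3 Intermediate Files

2.4 Compiler and Linker Flags

3. pybind11 Syntax

3.1 Basic Example

3.1.1 example.cpp

3.1.2 Compilation

3.1.3 Keyword Arguments

3.1.4 Help Output

3.2 Binding Classes

3.2.1 example.cpp

3.2.2 Binding Lambda Functions

3.2.3 Help Output

3.3 Binding the vart::Runner Class

4. Summary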

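1. Introduction

This article covers the following topics:

Introduction to pybind11
Quick compilation and loading in Jupyter Lab
Using compiler and linker flags
Basic pybind11 usage
Binding classes with pybind11
A walkthrough of the Python binding source code for the vart::Runner class

2. Code Analysis

2.1 Introduction to pybind11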

from pynq.lib import pybind11

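pybind11 is a lightweight, non-intrusive library for creating bindings between C++ and Python: Python code can call C++ code and, conversely, C++ code can call Python code. It lets developers implement compute-intensive parts in C++ while keeping the ease of use and flexibility of Python.

The main features of pybind11 include:

Ease of use: it offers a concise way to define Python bindings. Exposing C++ functionality to Python usually takes only a small amount of code.
No extra dependencies: pybind11 is a header-only library, so there is no library to build or link beforehand; including the relevant headers is enough.
Automatic type conversion: it converts between C++ and Python types automatically, including STL containers (such as std::vector and std::map, via the pybind11/stl.h header) and smart pointers.
Extensibility: pybind11 supports advanced C++ features such as inheritance, virtual functions, and overloaded functions.
Minimal runtime overhead: thanks to its efficient design, bindings created with pybind11 add little runtime overhead.

In PYNQ (Python Productivity for Zynq) or similar environments, pybind11 can connect FPGA or other hardware-acceleration functionality to a Python interface, enabling efficient data processing and hardware control.

2.2 The writefile Magic Command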

%%writefile common.h
#define HELLO "Hello world!\r\n"

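%%writefile is a cell magic available in Jupyter Notebook. It writes the contents of the cell to the specified file, which makes it convenient to create scripts or other files without leaving the notebook. The cell above creates common.h, which the C++ code in the following sections includes.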
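
Outside Jupyter, the effect of the %%writefile cell can be approximated in plain Python. The following is a minimal sketch of that idea, not how the magic itself is implemented; the helper name write_header is hypothetical, and a temporary directory is used so that nothing is written next to the notebook.

```python
from pathlib import Path
import tempfile


def write_header(directory: Path) -> Path:
    """Write the same content as the %%writefile cell to common.h (hypothetical helper)."""
    header = directory / "common.h"
    # Backslashes are doubled so that the file contains the C escape sequences themselves
    header.write_text('#define HELLO "Hello world!\\r\\n"\n')
    return header


# A temporary directory keeps the sketch from touching the notebook folder
with tempfile.TemporaryDirectory() as tmp:
    path = write_header(Path(tmp))
    content = path.read_text()
    print(content, end="")
```

In a notebook the magic is the more convenient choice; the plain-Python form is mainly useful when the same file has to be generated from a script.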

2.3 Quick Compilation and Loading

2.3.1 Basic Syntax

%%pybind11 myModule;
#include <iostream>
#include "common.h"

void hello() {std::cout << HELLO << std::endl;}

int add(int a, int b) {
    return a + b;
}

int sub(int a, int b) {
    return a - b;
}

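Most standard Jupyter environments have no built-in support for compiling and linking C++ code directly, especially code that depends on a library such as pybind11. Normally you would compile the code separately on your system and then import the resulting module in Python. The %%pybind11 cell magic, made available by the pynq.lib import in section 2.1, does this from within the notebook: the argument after the magic (myModule here) is the name of the module to build.

2.3.2 Locating the Module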

import myModule
import inspect
import os

module_location = os.path.dirname(inspect.getfile(myModule))
print(module_location)
---
/root/jupyter_notebooks/dong/pybind

!ls -l
---
common.h
myModule.cpython-310-aarch64-linux-gnu.so
pybind.ipynb

2.3.3 Intermediate Files

-rw-r--r-- 1 root root    535 Mar 11 01:34 myModule.cpp
-rw-r--r-- 1 root root    158 Mar 11 01:34 myModule.hpp
-rw-r--r-- 1 root root    179 Mar 11 01:34 temp.c

Running the %%pybind11 cell generates the three intermediate files listed above. Of these, myModule.cpp is the source file that is actually compiled and built:

#include <pybind11/pybind11.h>
#include <iostream>
#include "common.h"

namespace py = pybind11;

void hello() {
    std::cout << HELLO << std::endl;
}

int add(int a, int b) {
    return a + b;
}

int sub(int a, int b) {
    return a - b;
}

int main() {
    return 0;
}

PYBIND11_MODULE(myModule, m) {
    m.doc() = "Pybind11 module myModule";
    m.def("hello", &hello, "A function with name hello");
    m.def("add", &add, py::arg("a"), py::arg("b"), "A function with name add");
    m.def("sub", &sub, py::arg("a"), py::arg("b"), "A function with name sub");
}

2.4 Compiler and Linker Flags

cflags = "-O2 -fno-inline -std=c++17 -L/usr/lib -Wl,-rpath=/usr/lib -I/usr/include/opencv4/opencv -I/usr/include/opencv4 " + \
         "-I/usr/include/python3.10 -I/usr/include/python3.10 -fPIC -shared "
ldflags = "-lvart-runner -lopencv_videoio -lopencv_imgcodecs " + \
    "-lopencv_highgui -lopencv_imgproc -lopencv_core -lglog " + \
    "-lxir -lunilog -lpthread -L/usr/lib/python3.10/config-3.10-aarch64-linux-gnu -L/usr/lib -lpython3.10 "

%%pybind11 myModule;{cflags};{ldflags}
#include <iostream>
#include "common.h"

cflags

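-O2: optimize the code for execution speed.
-fno-inline: do not expand functions inline.
-std=c++17: use the C++17 standard.
-L/usr/lib: add /usr/lib to the library search path.
-Wl,-rpath=/usr/lib: pass -rpath=/usr/lib to the linker, which sets the runtime library search path.
-I/usr/include/opencv4/opencv, -I/usr/include/opencv4: add the OpenCV header directories to the include path.
-I/usr/include/python3.10: add the Python 3.10 header directory to the include path (it appears twice in cflags, which is harmless).
-fPIC: generate position-independent code.
-shared: produce a shared library.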

ldflags

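-lvart-runner: link the vart-runner library.
-lopencv_videoio: link the OpenCV video I/O library.
-lopencv_imgcodecs: link the OpenCV image codec library.
-lopencv_highgui: link the OpenCV high-level GUI library.
-lopencv_imgproc: link the OpenCV image processing library.
-lopencv_core: link the OpenCV core library.
-lglog: link the Google logging library (glog).
-lxir: link the XIR library.
-lunilog: link the UniLog library.
-lpthread: link the POSIX threads library.
-L/usr/lib/python3.10/config-3.10-aarch64-linux-gnu: add the Python 3.10 configuration directory to the library search path.
-L/usr/lib: add /usr/lib to the library search path.
-lpython3.10: link the Python 3.10 library.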
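
When flag strings grow this long, it can help to group them by role before handing them to the magic. The sketch below is one possible way to do that with the standard shlex module; the helper name group_flags is hypothetical, and the flag strings are shortened copies of the ones above rather than the full set.

```python
import shlex

# Shortened copies of the flag strings from the cell above
cflags = "-O2 -std=c++17 -I/usr/include/opencv4 -I/usr/include/python3.10 -fPIC -shared"
ldflags = "-lvart-runner -lopencv_core -lxir -L/usr/lib -lpython3.10"


def group_flags(flags: str) -> dict:
    """Group GCC-style flags by their role (hypothetical helper)."""
    groups = {"include_dirs": [], "library_dirs": [], "libraries": [], "other": []}
    for token in shlex.split(flags):
        if token.startswith("-I"):
            groups["include_dirs"].append(token[2:])
        elif token.startswith("-L"):
            groups["library_dirs"].append(token[2:])
        elif token.startswith("-l"):
            groups["libraries"].append(token[2:])
        else:
            groups["other"].append(token)
    return groups


compile_groups = group_flags(cflags)
link_groups = group_flags(ldflags)
print(compile_groups["include_dirs"])
print(link_groups["libraries"])
```

Grouping the flags this way makes a missing include directory or library easier to spot than scanning one long string.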

3. pybind11 Syntax

3.1 Basic Example

3.1.1 example.cpp

#include <pybind11/pybind11.h>

int add(int a, int b) {
    return a + b;
}

PYBIND11_MODULE(example, m) {
    m.doc() = "pybind11 example by Dong"; // optional module docstring

    m.def("add", &add, "Dong's add function");
}

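The PYBIND11_MODULE() macro creates a function that is called when an import statement is issued from Python.

First macro argument: example, the module name; it must not be enclosed in quotes.
Second macro argument: m, a variable of type py::module_, which is the main interface for creating bindings.
Method 1: module_::doc(), the module docstring.
Method 2: module_::def(), which generates the binding code that exposes the add() function to Python.

3.1.2 Compilation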

g++ -O3 -Wall -shared -std=c++11 -fPIC  \
    $(python3 -m pybind11 --includes)   \
    example.cpp                         \
    -o example$(python3-config --extension-suffix)

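The command does the following:

-O3 enables aggressive compiler optimization.
-Wall enables most commonly used warnings.
-shared produces a shared library.
-std=c++11 selects the C++11 standard.
-fPIC generates position-independent code.
$(python3 -m pybind11 --includes) inserts the compiler flags that pybind11 requires (for example, the header search paths).
example.cpp is the source file to compile.
-o example$(python3-config --extension-suffix) sets the output file name. The extension comes from python3-config --extension-suffix and is normally the shared-library suffix for the specific Python version.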
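
The same command line can also be assembled from Python, which shows where the extension suffix comes from. This is a rough sketch under stated assumptions: the helper name build_command is hypothetical, the command is only printed and never executed, and the pybind11 include directory that python3 -m pybind11 --includes would add is deliberately left out.

```python
import shlex
import sysconfig


def build_command(source: str, module: str) -> list:
    """Assemble a g++ command line similar to the one above (hypothetical helper)."""
    # EXT_SUFFIX is the value that python3-config --extension-suffix reports
    suffix = sysconfig.get_config_var("EXT_SUFFIX")
    # Directory containing Python.h; the pybind11 include path is not added here
    python_include = sysconfig.get_paths()["include"]
    return [
        "g++", "-O3", "-Wall", "-shared", "-std=c++11", "-fPIC",
        f"-I{python_include}",
        source,
        "-o", f"{module}{suffix}",
    ]


command = build_command("example.cpp", "example")
print(shlex.join(command))
```

The printed output file name depends on the interpreter that runs the sketch, which is why the module files in this article carry tags such as cpython-310-aarch64-linux-gnu.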

3.1.3 Keyword Arguments

namespace py = pybind11;

m.def("add", &add, "A function which adds two numbers",
      py::arg("i"), py::arg("j"));

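arg is one of several special tag classes that can be used to pass metadata into module_::def(). With this modified binding code, the function can be called with keyword arguments, which is more readable, especially for functions that take many parameters: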

example.add(i=1, j=2)

3.1.4 Help Output

help(example)
---
Help on module example:

NAME
    example - pybind11 example by Dong

FUNCTIONS
    add(...) method of builtins.PyCapsule instance
        add(a: int, b: int) -> int
        
        Dong's add function

FILE
    /workspace/dong/pybind_tt/example.cpython-37m-x86_64-linux-gnu.so

3.2 Binding Classes

3.2.1 example.cpp

#include <pybind11/pybind11.h>
#include <string>  // std::string is used by Pet

namespace py = pybind11;

struct Pet {
    Pet(const std::string &name) : name(name) { }
    void setName(const std::string &name_) { name = name_; }
    const std::string &getName() const { return name; }

    std::string name;
};

PYBIND11_MODULE(example, m) {
    m.doc() = "pybind11 example by Dong"; // optional module docstring
    py::class_<Pet>(m, "Pet")
        .def(py::init<const std::string &>())
        .def("setName", &Pet::setName, "Set the name of my pet")
        .def("getName", &Pet::getName, "Get the name of my pet");
}

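Explanation:

py::class_<Pet>(m, "Pet"): creates a pybind11 class binding that exposes the C++ Pet type to Python as a new class named Pet.
py::init<const std::string &>(): adds a constructor to the Python Pet class that takes a std::string argument.

The compilation procedure is the same as for the basic example: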

g++ -O3 -Wall -shared -std=c++11 -fPIC $(python3 -m pybind11 --includes) example.cpp -o example$(python3-config --extension-suffix)

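3.2.2 Binding Lambda Functions

A lambda is a small, special-purpose function. Defining it inline, directly inside the class binding expression, simplifies the code structure and makes it clearer.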

py::class_<Pet>(m, "Pet")
    .def(py::init<const std::string &>())
    .def("setName", &Pet::setName)
    .def("getName", &Pet::getName)
    .def("__repr__",
        [](const Pet &a) {
            return "<example.Pet named '" + a.name + "'>";
        }
    );

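When print(p) is called, the lambda defined by .def("__repr__", ...) is invoked.

Lambda syntax:

1). Definition: [](const Pet &a) {...}

[] is the capture list, which specifies which outer variables may be used inside the lambda body. It is empty in this example because no outer variables need to be captured.
const Pet &a is the parameter list: the function takes a const reference to a Pet as its argument.

2). Function body: { return "<example.Pet named '" + a.name + "'>"; }

The body consists of a single return statement that builds and returns a string.
The string is the fixed text <example.Pet named '...'> with the pet's name inserted between the quotes.
name is a member of the Pet object.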
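
To make the Python-side behaviour concrete without compiling anything, the sketch below uses a pure-Python stand-in for Pet. It is not the compiled example module; it only mirrors what the bound __repr__ lambda returns, and the pet name "Molly" is an arbitrary value chosen for illustration.

```python
class Pet:
    """Pure-Python stand-in for the bound C++ Pet class (not the compiled module)."""

    def __init__(self, name: str):
        self.name = name

    def __repr__(self) -> str:
        # Mirrors the C++ lambda: fixed text plus the pet's name
        return "<example.Pet named '" + self.name + "'>"


p = Pet("Molly")
print(p)  # print() falls back to __repr__ because no __str__ is defined
```

With the real binding, the same print(p) call would reach the C++ lambda through pybind11 instead of a Python method.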

3.2.3 Help Output

Help on module example:

NAME
    example - pybind11 example by Dong

CLASSES
    pybind11_builtins.pybind11_object(builtins.object)
        Pet
    
    class Pet(pybind11_builtins.pybind11_object)
     |  Method resolution order:
     |      Pet
     |      pybind11_builtins.pybind11_object
     |      builtins.object
     |  
     |  Methods defined here:
     |  
     |  __init__(...)
     |      __init__(self: example.Pet, arg0: str) -> None
     |  
     |  getName(...)
     |      getName(self: example.Pet) -> str
     |      
     |      Get the name of my pet
     |  
     |  setName(...)
     |      setName(self: example.Pet, arg0: str) -> None
     |      
     |      Set the name of my pet
     |  
     |  ----------------------------------------------------------------------
     |  Static methods inherited from pybind11_builtins.pybind11_object:
     |  
     |  __new__(*args, **kwargs) from pybind11_builtins.pybind11_type
     |      Create and return a new object.  See help(type) for accurate signature.

FILE
    /workspace/dong/pybind_tt/example.cpython-37m-x86_64-linux-gnu.so

3.3 Binding the vart::Runner Class

<Vitis-AI-2.5>/src/Vitis-AI-Runtime/VART/vart/runner/python/runner_py_module.cpp

PYBIND11_MODULE(MODULE_NAME, m) {
  m.doc() = "vart::Runner inferace";  // optional module docstring
  ...
  ...
  py::class_<vart::Runner>(m, "Runner")
  .def_static("create_runner",
              py::overload_cast<const xir::Subgraph*,
              const std::string&>(&vart::Runner::create_runner),
              py::arg("subgraph"),
              py::arg("mode") = "")
  .def("get_input_tensors",  &vart::Runner::get_input_tensors,  py::return_value_policy::reference)
  .def("get_output_tensors", &vart::Runner::get_output_tensors, py::return_value_policy::reference)
  .def("execute_async", []( vart::Runner* self,
                            std::vector<py::buffer> inputs,
                            std::vector<py::buffer> outputs,
                            bool enable_dynamic_array ) {
        // NOTE: it is important to initialize cpu_inputs and
        // cpu_outputs with GIL protection. the_map is the global
        // variable alike.
        auto cpu_inputs  = array_to_tensor_buffer(inputs , self->get_input_tensors() , enable_dynamic_array);
        auto cpu_outputs = array_to_tensor_buffer(outputs, self->get_output_tensors(), enable_dynamic_array);
        auto ret = make_pair(uint32_t(0), int32_t(0));
        if (1) {
          py::gil_scoped_release release;
          ret = self->execute_async(cpu_inputs, cpu_outputs);
        }
        // obtain the GIL again.
        if (ret.first >= 0) {
          for (auto t : cpu_inputs) {
            static_cast<CpuFlatTensorBuffer*>(t)->save_to_map(self, ret.first);
          }
          for (auto t : cpu_outputs) {
            static_cast<CpuFlatTensorBuffer*>(t)->save_to_map(self, ret.first);
          }
        } else {
          destroy(cpu_inputs);
          destroy(cpu_outputs);
        }
        return ret;
      },
      py::arg("inputs"),
      py::arg("outputs"),
      py::arg("enable_dynamic_array") = false)

  .def("wait", [](  vart::Runner* self,
                    std::pair<uint32_t, int> job_id) {
         auto ret = self->wait(job_id.first, -1);
         auto the_map = get_store();
         // copy instead of reference, it is important, do not use
         // reference here, the decontructor will clean up the mess.
         auto v = (*the_map)[self][(int)job_id.first];
         for (auto t : v) {
           delete t;
         }
         return ret;
       })
  .def("__repr__", [](const vart::Runner* self) {
    std::ostringstream str;
    str << "vart::Runner@" << (void*)self;
    
    return str.str();
  });

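Detailed explanation:

py::class_<vart::Runner>(m, "Runner") creates a Python class named Runner that maps to the C++ vart::Runner class.
.def_static("create_runner", ...) binds the static create_runner method, which creates a Runner instance from an xir::Subgraph and an optional string argument mode. py::overload_cast selects the overload with this parameter list.
.def("get_input_tensors", ...) and .def("get_output_tensors", ...) bind the methods that return the input and output tensors. py::return_value_policy::reference means that Python does not take ownership of the returned objects.
.def("execute_async", ...) binds the asynchronous execution method, which takes input and output buffers and starts the computation asynchronously. py::gil_scoped_release releases the GIL (global interpreter lock) around the C++ call so that other Python threads are not blocked while it runs.
.def("wait", ...) binds the wait method, which waits for the specified job to finish and then deletes the tensor buffers saved for that job.
.def("__repr__", ...) binds __repr__, which returns a string representation of the object.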
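
From the Python side, these bindings are typically used as a submit-then-wait pair. The sketch below illustrates that calling pattern only: FakeRunner is a hypothetical stand-in that imitates the bound method signatures so the example runs without VART or a DPU, and its "computation" is a plain copy from input to output.

```python
class FakeRunner:
    """Hypothetical stand-in that mimics the bound method signatures; it is not VART."""

    def __init__(self):
        self._next_job = 0
        self.finished = []

    def execute_async(self, inputs, outputs, enable_dynamic_array=False):
        # The binding returns a (job id, status) pair; here the "work" is a copy
        for src, dst in zip(inputs, outputs):
            dst[:] = src
        job = (self._next_job, 0)
        self._next_job += 1
        return job

    def wait(self, job_id):
        # The real binding blocks until the job completes, then frees its buffers
        self.finished.append(job_id[0])
        return 0


def run_once(runner, inputs, outputs):
    """Submit one job and block until it finishes."""
    job_id = runner.execute_async(inputs, outputs)
    return runner.wait(job_id)


runner = FakeRunner()
src = [bytearray(b"\x01\x02\x03")]
dst = [bytearray(3)]
status = run_once(runner, src, dst)
print(status, bytes(dst[0]))
```

With a real runner, the input and output lists would hold objects that support the buffer protocol, since the binding accepts std::vector<py::buffer>.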

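4. Summary

pybind11 is a lightweight, non-intrusive library that makes interaction between C++ and Python simple and efficient. It supports automatic type conversion and advanced C++ features and adds little runtime overhead, so developers can keep the ease of use of Python while taking advantage of the performance of C++.

This article covered the core features and usage of pybind11: compiling and loading C++ code quickly in Jupyter Notebook through magic commands, binding modules and classes with pybind11, and applying these features to efficient data processing and hardware control in practice. We hope it helps readers understand and apply pybind11 and achieve better performance and flexibility in their own projects.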