- For ViT deployment scenarios, the project is implemented in C++ on top of the ggml inference framework, with support for low-bit quantization: 4-bit, 5-bit, and 8-bit. Supported deployment platforms include general-purpose CPUs, AMD CPUs, and so on. (A short sketch of the block-quantization idea follows below.)
- Project details: see the project's README.md.
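To make the low-bit claim concrete, here is a minimal NumPy sketch of ggml-style q4_0 block quantization: weights are split into blocks of 32, and each block stores one scale plus a 4-bit integer per weight. This is an illustration of the scheme only, not vit.cpp's actual implementation (the real code lives in ggml's C sources and packs two 4-bit values per byte with an fp16 scale).

```python
import numpy as np

def q4_0_quantize(w: np.ndarray):
    """Illustrative q4_0: blocks of 32 weights, one scale per block,
    each weight stored as a 4-bit signed integer in [-8, 7]."""
    blocks = w.reshape(-1, 32)
    # pick the signed value with the largest magnitude in each block,
    # and choose the scale so that this value maps exactly to -8
    idx = np.argmax(np.abs(blocks), axis=1)
    d = blocks[np.arange(len(blocks)), idx] / -8.0   # per-block scale
    safe = np.where(d == 0.0, 1.0, d)[:, None]       # avoid divide-by-zero
    q = np.clip(np.round(blocks / safe), -8, 7).astype(np.int8)
    return d.astype(np.float16), q

def q4_0_dequantize(d, q):
    return (q.astype(np.float32) * d.astype(np.float32)[:, None]).ravel()

w = np.random.randn(64).astype(np.float32)
d, q = q4_0_quantize(w)
print("max abs round-trip error:", np.abs(w - q4_0_dequantize(d, q)).max())
```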
- (1) Model conversion: convert the PyTorch weights to GGUF
```sh
# install torch and timm
pip install torch timm

# list the available models if needed; note that not all models are supported
python convert-pth-to-ggml.py --list

# convert the weights to GGUF: ViT-Tiny with a patch size of 16 and an image
# size of 384, pre-trained on ImageNet-21k and fine-tuned on ImageNet-1k
python convert-pth-to-ggml.py --model_name vit_tiny_patch16_384.augreg_in21k_ft_in1k --ftype 1
```
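Before converting, it can be useful to confirm that the checkpoint loads in timm and to inspect the tensors the converter will serialize. A minimal sketch, assuming torch and timm are installed (the exact tensor-name mapping performed by convert-pth-to-ggml.py is not reproduced here):

```python
import timm

# download the pretrained checkpoint that the converter reads
model = timm.create_model(
    "vit_tiny_patch16_384.augreg_in21k_ft_in1k", pretrained=True
).eval()

# print name -> shape for every weight tensor; these are (roughly)
# the tensors that end up in the GGUF file
for name, tensor in model.state_dict().items():
    print(f"{name:60s} {tuple(tensor.shape)}")
```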
```sh
# build ggml and vit
mkdir build && cd build
cmake .. && make -j4

# run inference on a sample image
./bin/vit -t 4 -m ../ggml-model-f16.gguf -i ../assets/tench.jpg
```
```
usage: ./bin/vit [options]

options:
  -h, --help               show this help message and exit
  -s SEED, --seed SEED     RNG seed (default: -1)
  -t N, --threads N        number of threads to use during computation (default: 4)
  -m FNAME, --model FNAME  model path (default: ../ggml-model-f16.bin)
  -i FNAME, --inp FNAME    input file (default: ../assets/tench.jpg)
  -k N, --topk N           top k classes to print (default: 5)
  -e FLOAT, --epsilon      epsilon (default: 0.000001)
```
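To sanity-check the C++ output, the same checkpoint can be run through timm in Python and the top-5 classes compared. A minimal sketch using timm's own preprocessing pipeline (the image path is assumed to match the repo layout above):

```python
import timm
import torch
from PIL import Image
from timm.data import resolve_data_config, create_transform

model = timm.create_model(
    "vit_tiny_patch16_384.augreg_in21k_ft_in1k", pretrained=True
).eval()

# build the resize/normalize pipeline this checkpoint was trained with
config = resolve_data_config({}, model=model)
transform = create_transform(**config)

img = Image.open("assets/tench.jpg").convert("RGB")
with torch.no_grad():
    probs = model(transform(img).unsqueeze(0)).softmax(dim=-1)

# print the top-5 classes; these should agree with ./bin/vit's output
values, indices = probs.topk(5)
for p, i in zip(values[0], indices[0]):
    print(f"class {i.item():4d}  prob {p.item():.4f}")
```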
- (2) Model quantization: quantize the GGUF model to one of the low-bit types

```
usage: ./bin/quantize /path/to/ggml-model-f32.gguf /path/to/ggml-model-quantized.gguf type
  type = 2 - q4_0
  type = 3 - q4_1
  type = 6 - q5_0
  type = 7 - q5_1
  type = 8 - q8_0
```
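The trade-off among these types can be made concrete from ggml's block layouts (32 weights per block; each block stores an fp16 scale d, the *_1 types add an fp16 offset m, and the q5 types pack one extra high bit per weight). The figures below are derived from those layouts and are size estimates, not measurements:

```python
# effective bits per weight for ggml block formats (32 weights per block)
BLOCK = 32
formats = {
    # name: (payload bits per weight, overhead bits per block)
    "q4_0": (4, 16),   # fp16 scale d
    "q4_1": (4, 32),   # fp16 scale d + fp16 offset m
    "q5_0": (5, 16),   # fp16 d; the 5th bit is packed separately
    "q5_1": (5, 32),   # fp16 d + fp16 m
    "q8_0": (8, 16),   # fp16 d
    "f16":  (16, 0),   # unquantized baseline
}
for name, (bits, overhead) in formats.items():
    bpw = bits + overhead / BLOCK
    print(f"{name}: {bpw:.2f} bits/weight "
          f"(~{bpw / 16:.0%} of the f16 model size)")
```

For example, q4_0 works out to 4.5 bits per weight, shrinking the weights to roughly 28% of their f16 size.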
- Download link: https://download.csdn.net/download/weixin_42405819/89100807