Computing Precision and Recall (PR) for Object Detection and Semantic Segmentation

Requirements

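  1. In day-to-day project work, precision and recall at a specific threshold matter more than mAP.
  2. The GT for this project contains polygon annotations in addition to bounding boxes, so IoU is computed between masks rather than between boxes.

The code in this post computes PR between MMDetection predictions and COCO-format ground truth. The procedure is:
  • Save the predictions to a JSON file.
  • Parse the predictions and the GT.
  • Look up the predictions and GT of each image by image_id.
  • Compute the mask-based IoU matrix between the predictions and the GT.
  • Derive tp, fp and num_gt from the IoU matrix.
  • Iterate over all images, accumulate tp, fp and num_gt, and compute precision and recall from the totals.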
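
For reference, the totals accumulated in the last step combine as written below. The symbols TP_i, FP_i and N_i^gt are notation introduced here for the per-image counts of image i; they are not names used in the code.

```latex
\[
\text{precision} = \frac{\sum_i \mathrm{TP}_i}{\sum_i \mathrm{TP}_i + \sum_i \mathrm{FP}_i},
\qquad
\text{recall} = \frac{\sum_i \mathrm{TP}_i}{\sum_i N^{\mathrm{gt}}_i}
\]
```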

Implementation

Saving the predictions

In MMDetection, a model is usually evaluated with a command like this:

bash tools/dist_test.sh configs/aaaa/gaotie_cascade_rcnn_r50_fpn_1x.py work_dirs/gaotie_cascade_rcnn_r50_fpn_1x/epoch_20.pth 8 --eval bbox

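This prints the usual mAP summary.
For the PR computation, the predictions need to be saved somewhere along the way. One place to do so is around line 438 of mmdet/datasets/coco.py (the exact line depends on the MMDetection version), by adding the following code: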

 try:
     import shutil
     cocoDt = cocoGt.loadRes(result_files[metric])
     # Keep a copy of the temporary result file for the PR script.
     shutil.copyfile(result_files[metric], "results.bbox.json")

This produces results.bbox.json, which contains the model's predictions.
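
As an illustration of what such a file holds, here is a minimal sketch that parses two made-up records. The field names follow the COCO detection-result format that the parsing code later in this post relies on (image_id, bbox as [x, y, width, height], score); the values and the helper name xywh_to_xyxy are invented for the example.

```python
import json

# Two hypothetical records in COCO detection-result format.
sample = '''[
  {"image_id": 1, "category_id": 1, "bbox": [10.0, 20.0, 30.0, 40.0], "score": 0.91},
  {"image_id": 1, "category_id": 1, "bbox": [5.0, 5.0, 10.0, 10.0], "score": 0.12}
]'''


def xywh_to_xyxy(bbox):
    """Convert COCO [x, y, width, height] to corner format [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


records = json.loads(sample)
corners = [xywh_to_xyxy(r["bbox"]) for r in records]
print(corners)
```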

Loading the GT

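The annotations come in two forms: rectangles and polygons. When the COCO-format GT file is built, a rectangle's four corners are written to the segmentation field as a polygon, and a polygon's bounding rectangle is written to the bbox field.
The GT is then loaded with the following script: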

def construct_gt_results(gt_json_path):

    results = dict()
    bbox_results = dict()
    cocoGt = COCO(annotation_file=gt_json_path)
    # cat_ids = cocoGt.getCatIds()
    img_ids = cocoGt.getImgIds()
    for id in img_ids:
        anno_ids = cocoGt.getAnnIds(imgIds=[id])
        
        annotations = cocoGt.loadAnns(ids=anno_ids)
        for info in annotations:
            img_id = info["image_id"]
            if img_id not in results:
                results[img_id] = list()
                bbox_results[img_id] = list()
            bbox = info["bbox"]
            x1, y1, x2, y2 = bbox[0], bbox[1], bbox[0] + bbox[2], bbox[1] + bbox[3]
            # results[img_id].append([x1, y1, x2, y2])
            # mask = _poly2mask(info["segmentation"], img_h=1544, img_w=2064)
            results[img_id].append(info["segmentation"])
            bbox_results[img_id].append([x1, y1, x2, y2])
    return results, img_ids, cocoGt, bbox_results

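The function takes the path of the GT JSON file and returns the per-image segmentations, the image_ids, the COCO object, and the per-image boxes (used later for visualization).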
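
The two conversions used when building the GT file (rectangle to polygon, polygon to bounding rectangle) can be sketched as follows. The helper names box_to_polygon and polygon_to_xywh are hypothetical and are not part of the original script; polygons are flat [x1, y1, x2, y2, ...] lists as in COCO.

```python
def box_to_polygon(x1, y1, x2, y2):
    """Return the rectangle's four corners as one COCO-style flat polygon."""
    return [[x1, y1, x2, y1, x2, y2, x1, y2]]


def polygon_to_xywh(polygon):
    """Return the [x, y, width, height] bounding rectangle of a flat polygon."""
    xs, ys = polygon[0::2], polygon[1::2]
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


rect_as_polygon = box_to_polygon(1, 2, 4, 6)
triangle_bbox = polygon_to_xywh([0, 0, 4, 0, 2, 3])
```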

Parsing the predictions

The model's predictions are all bounding boxes. As with the GT, the four corners of each box are used as a polygon segmentation. The parsing script is:

def construct_det_results(det_json_path):

    results = dict()
    bbox_results = dict()
    scores  = dict()
    with open(det_json_path) as f:
        json_data = json.load(f)
    for info in json_data:
        img_id = info["image_id"]
        if img_id not in results:
            results[img_id] = list()
            scores[img_id] = list()
            bbox_results[img_id] = list()
        bbox = info["bbox"]
        x1, y1, x2, y2 = bbox[0], bbox[1], bbox[0] + bbox[2], bbox[1] + bbox[3]
        segm = [[x1, y1, x2, y1, x2, y2, x1, y2]]
        # mask = _poly2mask(segm, img_h=1544, img_w=2064)
        score = info["score"]
        # results[img_id].append([x1, y1, x2, y2, score])
        results[img_id].append(segm)
        bbox_results[img_id].append([x1, y1, x2, y2])
        scores[img_id].append(score)
    return results, scores, bbox_results

The function takes the path of the prediction JSON file and returns the per-image segmentations, scores, and boxes.

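Computing the TP and FP of a single image by image_id

This step consists of:

  1. Filter the predicted boxes by a confidence threshold.
  2. Convert all polygons to masks for the IoU computation.
  3. Compute tp and fp.
  4. Visualize the fp and fn cases.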
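
The first step, confidence filtering, can be sketched on its own. This is a simplified stand-in for the filtering loop in this post, with the hypothetical name filter_by_score; like that loop, it keeps scores strictly above the threshold.

```python
import numpy as np


def filter_by_score(items, scores, conf_thr):
    """Keep the items whose score is strictly above conf_thr."""
    scores = np.asarray(scores, dtype=np.float32)
    keep = np.flatnonzero(scores > conf_thr)
    return [items[i] for i in keep], scores[keep]


kept, kept_scores = filter_by_score(["a", "b", "c"], [0.9, 0.3, 0.6], conf_thr=0.5)
```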

Converting polygons to masks

    if img_id in det_results:
        # for dt in det_results[img_id]:
        for idx, score in enumerate(det_scores[img_id]):
            # score = dt[-1]
            if score > conf_thrs:
                mask = _poly2mask(det_results[img_id][idx], img_h=1544, img_w=2064)
                det_bboxes.append(mask)
                det_thrs_scores.append(score)
                plot_det_bboxes.append(det_tmp_bboxes[img_id][idx])
    if img_id in gt_results:     
        for segm in gt_results[img_id]:
            mask = _poly2mask(segm, img_h=1544, img_w=2064)   
            gt_bboxes.append(mask)
        plot_gt_bboxes = gt_tmp_bboxes[img_id]

The _poly2mask function converts a polygon to a mask. Its implementation is:

def _poly2mask(mask_ann, img_h, img_w):
    """Private function to convert masks represented with polygon to
    bitmaps.

    Args:
        mask_ann (list | dict): Polygon mask annotation input.
        img_h (int): The height of output mask.
        img_w (int): The width of output mask.

    Returns:
        numpy.ndarray: The decode bitmap mask of shape (img_h, img_w).
    """

    if isinstance(mask_ann, list):
        # polygon -- a single object might consist of multiple parts
        # we merge all parts into one mask rle code
        rles = maskUtils.frPyObjects(mask_ann, img_h, img_w)
        rle = maskUtils.merge(rles)
    elif isinstance(mask_ann['counts'], list):
        # uncompressed RLE
        rle = maskUtils.frPyObjects(mask_ann, img_h, img_w)
    else:
        # rle
        rle = mask_ann
    mask = maskUtils.decode(rle)
    return mask
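
Since every prediction here is an axis-aligned box, its mask could also be filled directly with NumPy slicing. The sketch below is an alternative to _poly2mask for that case only, not what the original script does; box_to_mask is a hypothetical helper, and its pixel boundaries may differ slightly from the pycocotools rasterization because it simply rounds the coordinates.

```python
import numpy as np


def box_to_mask(x1, y1, x2, y2, img_h, img_w):
    """Rasterize an axis-aligned box into a (img_h, img_w) uint8 mask."""
    mask = np.zeros((img_h, img_w), dtype=np.uint8)
    # Round to pixel indices and clip to the image; the end index is exclusive.
    c1, c2 = np.clip(np.round([x1, x2]).astype(int), 0, img_w)
    r1, r2 = np.clip(np.round([y1, y2]).astype(int), 0, img_h)
    mask[r1:r2, c1:c2] = 1
    return mask


mask = box_to_mask(2, 1, 5, 4, img_h=6, img_w=8)
```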

Computing the TP and FP of a single image

This is done by the tpfp_default function:

def tpfp_default(det_bboxes,
                 gt_bboxes,
                 gt_bboxes_ignore=None,
                 det_thrs_scores=None,
                 iou_thr=0.5,
                 area_ranges=None):
    """Check if detected bboxes are true positive or false positive.

    Args:
        det_bboxes (ndarray): Detected masks of this image, of shape
            (m, H, W).
        gt_bboxes (ndarray): GT masks of this image, of shape (n, H, W).
        gt_bboxes_ignore (ndarray): Ignored gts of this image. Pass an
            empty array; None is not handled.
        det_thrs_scores (ndarray): Scores of the detections, shape (m,).
        iou_thr (float): IoU threshold to be considered as matched.
            Default: 0.5.
        area_ranges (list[tuple] | None): Range of bbox areas to be evaluated,
            in the format [(min1, max1), (min2, max2), ...]. The area code
            assumes boxes, so keep None when passing masks. Default: None.

    Returns:
        tuple[np.ndarray]: (tp, fp) whose elements are 0 and 1. The shape of
            each array is (num_scales, m).
    """
    # an indicator of ignored gts
    gt_ignore_inds = np.concatenate(
        (np.zeros(gt_bboxes.shape[0], dtype=bool),   # np.bool was removed in NumPy 1.24
         np.ones(gt_bboxes_ignore.shape[0], dtype=bool)))
    # stack gt_bboxes and gt_bboxes_ignore for convenience
    # gt_bboxes = np.vstack((gt_bboxes, gt_bboxes_ignore))

    num_dets = det_bboxes.shape[0]
    num_gts = gt_bboxes.shape[0]
    if area_ranges is None:
        area_ranges = [(None, None)]
    num_scales = len(area_ranges)
    # tp and fp are of shape (num_scales, num_dets), each row is tp or fp of
    # a certain scale
    tp = np.zeros((num_scales, num_dets), dtype=np.float32)
    fp = np.zeros((num_scales, num_dets), dtype=np.float32)

    # if there is no gt bboxes in this image, then all det bboxes
    # within area range are false positives
    if gt_bboxes.shape[0] == 0:
        if area_ranges == [(None, None)]:
            fp[...] = 1
        else:
            det_areas = (det_bboxes[:, 2] - det_bboxes[:, 0] + 1) * (
                det_bboxes[:, 3] - det_bboxes[:, 1] + 1)
            for i, (min_area, max_area) in enumerate(area_ranges):
                fp[i, (det_areas >= min_area) & (det_areas < max_area)] = 1
        return tp, fp

    # ious = bbox_overlaps(det_bboxes, gt_bboxes)
    # ious = mask_overlaps(det_bboxes, gt_bboxes)
    ious = mask_wraper(det_bboxes, gt_bboxes)
    # for each det, the max iou with all gts
    ious_max = ious.max(axis=1)
    # for each det, which gt overlaps most with it
    ious_argmax = ious.argmax(axis=1)
    # sort all dets in descending order by scores
    # sort_inds = np.argsort(-det_bboxes[:, -1])
    sort_inds = np.argsort(-det_thrs_scores)
    for k, (min_area, max_area) in enumerate(area_ranges):
        gt_covered = np.zeros(num_gts, dtype=bool)
        # if no area range is specified, gt_area_ignore is all False
        if min_area is None:
            gt_area_ignore = np.zeros_like(gt_ignore_inds, dtype=bool)
        else:
            gt_areas = (gt_bboxes[:, 2] - gt_bboxes[:, 0] + 1) * (
                gt_bboxes[:, 3] - gt_bboxes[:, 1] + 1)
            gt_area_ignore = (gt_areas < min_area) | (gt_areas >= max_area)
        for i in sort_inds:
            if ious_max[i] >= iou_thr:
                matched_gt = ious_argmax[i]     # index of the best-matching GT
                if not (gt_ignore_inds[matched_gt]
                        or gt_area_ignore[matched_gt]):
                    if not gt_covered[matched_gt]:
                        gt_covered[matched_gt] = True   # mark this GT as taken
                        tp[k, i] = 1            
                    else:
                        fp[k, i] = 1
                # otherwise ignore this detected bbox, tp = 0, fp = 0
            elif min_area is None:
                fp[k, i] = 1
            else:
                bbox = det_bboxes[i, :4]
                area = (bbox[2] - bbox[0] + 1) * (bbox[3] - bbox[1] + 1)
                if area >= min_area and area < max_area:
                    fp[k, i] = 1
    return tp, fp

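The function first computes the IoU matrix between the predictions and the GT, then walks through the predictions in descending order of confidence and assigns each one to a GT, which yields the tp and fp results.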
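
To make the assignment rule concrete, here is a stripped-down sketch of the same greedy matching on a precomputed IoU matrix, without the ignore flags and area ranges. The name match_detections and the toy numbers are illustrative only.

```python
import numpy as np


def match_detections(ious, scores, iou_thr=0.5):
    """Greedy TP/FP assignment for one image.

    ious has shape (num_dets, num_gts); scores has shape (num_dets,).
    """
    num_dets, num_gts = ious.shape
    tp = np.zeros(num_dets, dtype=np.int64)
    fp = np.zeros(num_dets, dtype=np.int64)
    if num_gts == 0:
        fp[:] = 1                          # nothing to match against
        return tp, fp
    gt_covered = np.zeros(num_gts, dtype=bool)
    best_gt = ious.argmax(axis=1)          # best-overlapping GT per detection
    best_iou = ious.max(axis=1)
    for i in np.argsort(-scores):          # highest score first
        j = best_gt[i]
        if best_iou[i] >= iou_thr and not gt_covered[j]:
            gt_covered[j] = True           # the first match claims the GT
            tp[i] = 1
        else:
            fp[i] = 1                      # low IoU or duplicate match
    return tp, fp


ious = np.array([[0.8, 0.1], [0.7, 0.0], [0.2, 0.1]])
scores = np.array([0.6, 0.9, 0.8])
tp, fp = match_detections(ious, scores)
```

In this toy case detection 1 has the highest score and claims GT 0, so detection 0, which overlaps the same GT even more strongly, is counted as a duplicate FP.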

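Computing mask IoU

The definition of IoU is the same as for boxes. For a predicted mask A and a GT mask B, counting pixels:
$$\mathrm{IoU}(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
The mask-based implementation is straightforward: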

def mask_overlaps(bboxes1, bboxes2, mode='iou'):

    assert mode in ['iou', 'iof']

    bboxes1 = bboxes1.astype(np.bool_)
    bboxes2 = bboxes2.astype(np.bool_)
    
    intersection = np.logical_and(bboxes1, bboxes2)
    union = np.logical_or(bboxes1, bboxes2)

    intersection_area = np.sum(intersection)
    union_area = np.sum(union)

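    # Two empty masks have an empty union; report 0 instead of dividing by zero.
    iou = intersection_area / union_area if union_area > 0 else 0.0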
    return iou

The IoU matrix between the predictions and the GT is computed as follows:

def mask_wraper(bboxes1, bboxes2, mode='iou'):
    rows = bboxes1.shape[0]     # det (tpfp_default passes the detections first)
    cols = bboxes2.shape[0]     # gt
    ious = np.zeros((rows, cols), dtype=np.float32)
    if rows * cols == 0:
        return ious
    for i in range(rows):
        for j in range(cols):
            iou = mask_overlaps(bboxes1[i], bboxes2[j])
            ious[i, j] = iou
    return ious
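
The double loop above calls mask_overlaps once per pair. As a possible alternative, not used in the original script, the whole matrix can be computed with one matrix product over flattened masks. The function name mask_iou_matrix is hypothetical. The sketch trades memory for speed, since every mask is held as a float32 row.

```python
import numpy as np


def mask_iou_matrix(det_masks, gt_masks):
    """Pairwise IoU between (m, H, W) and (n, H, W) binary masks -> (m, n)."""
    m, n = det_masks.shape[0], gt_masks.shape[0]
    if m == 0 or n == 0:
        return np.zeros((m, n), dtype=np.float32)
    det = (det_masks.reshape(m, -1) > 0).astype(np.float32)
    gt = (gt_masks.reshape(n, -1) > 0).astype(np.float32)
    inter = det @ gt.T                     # (m, n) overlapping pixel counts
    union = det.sum(axis=1)[:, None] + gt.sum(axis=1)[None, :] - inter
    # Empty-vs-empty pairs have union 0; report IoU 0 there instead of dividing.
    return np.where(union > 0, inter / np.maximum(union, 1), 0.0).astype(np.float32)


a = np.zeros((1, 4, 4), dtype=np.uint8)
a[0, :2, :2] = 1                           # 4 pixels
b = np.zeros((2, 4, 4), dtype=np.uint8)
b[0, :2, 1:3] = 1                          # 4 pixels, 2 shared with a[0]
iou = mask_iou_matrix(a, b)                # b[1] stays empty
```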

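With the steps above, the tp and fp results of a single image are available.

Visualizing FP and FN cases

To analyze the model's bad cases, the FP and FN results can be visualized. Here, for every image with a problem, all of its predicted boxes and GT boxes are drawn.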

    if VIS and (fp > 0 or tp < gt):
        img_data, path = draw_bbox(img_id=img_id, cocoGt=cocoGt, det_bboxes=plot_det_bboxes, gt_bboxes=plot_gt_bboxes)
        if fp > 0:
            save_dir = os.path.join(VIS_ROOT, "tmp/FP/")
            os.makedirs(save_dir, exist_ok=True)
            cv2.imwrite(os.path.join(save_dir, os.path.basename(path)+".jpg"), img_data, [int(cv2.IMWRITE_JPEG_QUALITY), 30])
        if tp < gt:
            save_dir = os.path.join(VIS_ROOT, "tmp/FN/")
            os.makedirs(save_dir, exist_ok=True)
            cv2.imwrite(os.path.join(save_dir, os.path.basename(path)+".jpg"), img_data,
                        [int(cv2.IMWRITE_JPEG_QUALITY), 30])

The boxes are drawn as follows:

def draw_bbox(img_id, cocoGt, det_bboxes, gt_bboxes):
    path = cocoGt.loadImgs(ids=[img_id])[0]["file_name"]
    img_path = os.path.join(IMG_ROOT, path)
    img_data = cv2.imread(img_path)
    for box in det_bboxes:
        # color_mask = (0, 0, 255)
        # color_mask = np.array([0, 0, 255], dtype=np.int8)
        # bbox_mask = box.astype(np.bool)
        cv2.rectangle(img_data, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 0, 255), 3)
        # img_data[bbox_mask] = img_data[bbox_mask] * 0.5 + color_mask * 0.5
    for box in gt_bboxes:
        # color_mask = np.array([0, 255, 0], dtype=np.int8)
        # bbox_mask = box.astype(np.bool)

        # img_data[bbox_mask] = img_data[bbox_mask] * 0.5 + color_mask * 0.5
        cv2.rectangle(img_data, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 255, 0), 3)
    
    return img_data, path

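This completes the per-image logic.

Evaluating all images with multiple processes

A multiprocessing process pool speeds up the computation. Note that eval_pr reads module-level variables such as det_results and conf_thrs, which the worker processes see only when the pool forks from a process where they are already set (the fork start method).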

def eval_multiprocessing(img_ids):
    from multiprocessing import Pool
    pool = Pool(processes=16)

    results = pool.map(eval_pr, img_ids)
    # Close the pool: no more tasks will be submitted
    pool.close()

    # Wait for all tasks to finish
    pool.join()
    return np.sum(np.array(results), axis=0)
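
A slightly more defensive variant is sketched below, under the assumption that the worker is passed in explicitly rather than hard-coded. It uses the pool as a context manager and returns plain Python ints, including for an empty image list. The names aggregate and evaluate_all are hypothetical.

```python
from multiprocessing import Pool

import numpy as np


def aggregate(per_image_counts):
    """Sum a list of (gt, tp, fp) tuples into dataset-level totals."""
    if not per_image_counts:
        return 0, 0, 0
    gt, tp, fp = np.sum(np.asarray(per_image_counts, dtype=np.int64), axis=0)
    return int(gt), int(tp), int(fp)


def evaluate_all(worker, img_ids, processes=4, chunksize=16):
    """Map `worker` over img_ids in a process pool and aggregate the counts."""
    # pool.map blocks until every task is done, so leaving the block is safe.
    with Pool(processes=processes) as pool:
        results = pool.map(worker, img_ids, chunksize=chunksize)
    return aggregate(results)


totals = aggregate([(2, 1, 1), (3, 3, 0)])
```

With the functions from this post, the call would be evaluate_all(eval_pr, img_ids, processes=16).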

Computing the PR results

Once the TP and FP counts of all images have been collected, precision and recall can be computed.

gt, tp, fp = eval_multiprocessing(img_ids)
eps = np.finfo(np.float32).eps
recalls = tp / np.maximum(gt, eps)
precisions = tp / np.maximum((tp + fp), eps)

print("conf_thrs:{:.3f} iou_thrs:{:.3f}, gt:{:d}, TP={:d}, FP={:d}, P={:.3f}, R={:.3f}".format(conf_thrs, iou_thrs, gt, tp, fp, precisions, recalls))
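
The same arithmetic, wrapped in a small function with made-up counts, looks like this. precision_recall is a hypothetical helper; the eps guard mirrors the snippet above and only matters when a denominator is zero.

```python
import numpy as np


def precision_recall(tp, fp, num_gt):
    """Return (precision, recall) from dataset-level counts."""
    eps = np.finfo(np.float32).eps
    precision = tp / max(tp + fp, eps)
    recall = tp / max(num_gt, eps)
    return float(precision), float(recall)


p, r = precision_recall(tp=80, fp=20, num_gt=160)
print(f"P={p:.3f}, R={r:.3f}")  # prints "P=0.800, R=0.500"
```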

Finally, the complete code is included below for reproduction or reference.

from multiprocessing import Pool
import os
import numpy as np
import json
from pycocotools.coco import COCO
import cv2
from pycocotools import mask as maskUtils

def bbox_overlaps(bboxes1, bboxes2, mode='iou'):
    """Calculate the ious between each bbox of bboxes1 and bboxes2.

    Args:
        bboxes1(ndarray): shape (n, 4)
        bboxes2(ndarray): shape (k, 4)
        mode(str): iou (intersection over union) or iof (intersection
            over foreground)

    Returns:
        ious(ndarray): shape (n, k)
    """

    assert mode in ['iou', 'iof']

    bboxes1 = bboxes1.astype(np.float32)
    bboxes2 = bboxes2.astype(np.float32)
    rows = bboxes1.shape[0]
    cols = bboxes2.shape[0]
    ious = np.zeros((rows, cols), dtype=np.float32)
    if rows * cols == 0:
        return ious
    exchange = False
    if bboxes1.shape[0] > bboxes2.shape[0]:
        bboxes1, bboxes2 = bboxes2, bboxes1
        ious = np.zeros((cols, rows), dtype=np.float32)
        exchange = True
    area1 = (bboxes1[:, 2] - bboxes1[:, 0] + 1) * (bboxes1[:, 3] - bboxes1[:, 1] + 1)
    area2 = (bboxes2[:, 2] - bboxes2[:, 0] + 1) * (bboxes2[:, 3] - bboxes2[:, 1] + 1)
    for i in range(bboxes1.shape[0]):
        x_start = np.maximum(bboxes1[i, 0], bboxes2[:, 0])
        y_start = np.maximum(bboxes1[i, 1], bboxes2[:, 1])
        x_end = np.minimum(bboxes1[i, 2], bboxes2[:, 2])
        y_end = np.minimum(bboxes1[i, 3], bboxes2[:, 3])
        overlap = np.maximum(x_end - x_start + 1, 0) * np.maximum(y_end - y_start + 1, 0)
        if mode == 'iou':
            union = area1[i] + area2 - overlap
        else:
            union = area1[i] if not exchange else area2
        ious[i, :] = overlap / union
    if exchange:
        ious = ious.T
    return ious

def mask_wraper(bboxes1, bboxes2, mode='iou'):
    rows = bboxes1.shape[0]     # det (tpfp_default passes the detections first)
    cols = bboxes2.shape[0]     # gt
    ious = np.zeros((rows, cols), dtype=np.float32)
    if rows * cols == 0:
        return ious
    for i in range(rows):
        for j in range(cols):
            iou = mask_overlaps(bboxes1[i], bboxes2[j])
            ious[i, j] = iou
    return ious

def mask_overlaps(bboxes1, bboxes2, mode='iou'):

    assert mode in ['iou', 'iof']

    bboxes1 = bboxes1.astype(np.bool_)
    bboxes2 = bboxes2.astype(np.bool_)
    
    intersection = np.logical_and(bboxes1, bboxes2)
    union = np.logical_or(bboxes1, bboxes2)

    intersection_area = np.sum(intersection)
    union_area = np.sum(union)

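    # Two empty masks have an empty union; report 0 instead of dividing by zero.
    iou = intersection_area / union_area if union_area > 0 else 0.0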
    return iou


def tpfp_default(det_bboxes,
                 gt_bboxes,
                 gt_bboxes_ignore=None,
                 det_thrs_scores=None,
                 iou_thr=0.5,
                 area_ranges=None):
    """Check if detected bboxes are true positive or false positive.

    Args:
        det_bboxes (ndarray): Detected masks of this image, of shape
            (m, H, W).
        gt_bboxes (ndarray): GT masks of this image, of shape (n, H, W).
        gt_bboxes_ignore (ndarray): Ignored gts of this image. Pass an
            empty array; None is not handled.
        det_thrs_scores (ndarray): Scores of the detections, shape (m,).
        iou_thr (float): IoU threshold to be considered as matched.
            Default: 0.5.
        area_ranges (list[tuple] | None): Range of bbox areas to be evaluated,
            in the format [(min1, max1), (min2, max2), ...]. The area code
            assumes boxes, so keep None when passing masks. Default: None.

    Returns:
        tuple[np.ndarray]: (tp, fp) whose elements are 0 and 1. The shape of
            each array is (num_scales, m).
    """
    # an indicator of ignored gts
    gt_ignore_inds = np.concatenate(
        (np.zeros(gt_bboxes.shape[0], dtype=bool),   # np.bool was removed in NumPy 1.24
         np.ones(gt_bboxes_ignore.shape[0], dtype=bool)))
    # stack gt_bboxes and gt_bboxes_ignore for convenience
    # gt_bboxes = np.vstack((gt_bboxes, gt_bboxes_ignore))

    num_dets = det_bboxes.shape[0]
    num_gts = gt_bboxes.shape[0]
    if area_ranges is None:
        area_ranges = [(None, None)]
    num_scales = len(area_ranges)
    # tp and fp are of shape (num_scales, num_dets), each row is tp or fp of
    # a certain scale
    tp = np.zeros((num_scales, num_dets), dtype=np.float32)
    fp = np.zeros((num_scales, num_dets), dtype=np.float32)

    # if there is no gt bboxes in this image, then all det bboxes
    # within area range are false positives
    if gt_bboxes.shape[0] == 0:
        if area_ranges == [(None, None)]:
            fp[...] = 1
        else:
            det_areas = (det_bboxes[:, 2] - det_bboxes[:, 0] + 1) * (
                det_bboxes[:, 3] - det_bboxes[:, 1] + 1)
            for i, (min_area, max_area) in enumerate(area_ranges):
                fp[i, (det_areas >= min_area) & (det_areas < max_area)] = 1
        return tp, fp

    # ious = bbox_overlaps(det_bboxes, gt_bboxes)
    # ious = mask_overlaps(det_bboxes, gt_bboxes)
    ious = mask_wraper(det_bboxes, gt_bboxes)
    # for each det, the max iou with all gts
    ious_max = ious.max(axis=1)
    # for each det, which gt overlaps most with it
    ious_argmax = ious.argmax(axis=1)
    # sort all dets in descending order by scores
    # sort_inds = np.argsort(-det_bboxes[:, -1])
    sort_inds = np.argsort(-det_thrs_scores)
    for k, (min_area, max_area) in enumerate(area_ranges):
        gt_covered = np.zeros(num_gts, dtype=bool)
        # if no area range is specified, gt_area_ignore is all False
        if min_area is None:
            gt_area_ignore = np.zeros_like(gt_ignore_inds, dtype=bool)
        else:
            gt_areas = (gt_bboxes[:, 2] - gt_bboxes[:, 0] + 1) * (
                gt_bboxes[:, 3] - gt_bboxes[:, 1] + 1)
            gt_area_ignore = (gt_areas < min_area) | (gt_areas >= max_area)
        for i in sort_inds:
            if ious_max[i] >= iou_thr:
                matched_gt = ious_argmax[i]     # index of the best-matching GT
                if not (gt_ignore_inds[matched_gt]
                        or gt_area_ignore[matched_gt]):
                    if not gt_covered[matched_gt]:
                        gt_covered[matched_gt] = True   # mark this GT as taken
                        tp[k, i] = 1            
                    else:
                        fp[k, i] = 1
                # otherwise ignore this detected bbox, tp = 0, fp = 0
            elif min_area is None:
                fp[k, i] = 1
            else:
                bbox = det_bboxes[i, :4]
                area = (bbox[2] - bbox[0] + 1) * (bbox[3] - bbox[1] + 1)
                if area >= min_area and area < max_area:
                    fp[k, i] = 1
    return tp, fp


def _poly2mask(mask_ann, img_h, img_w):
    """Private function to convert masks represented with polygon to
    bitmaps.

    Args:
        mask_ann (list | dict): Polygon mask annotation input.
        img_h (int): The height of output mask.
        img_w (int): The width of output mask.

    Returns:
        numpy.ndarray: The decode bitmap mask of shape (img_h, img_w).
    """

    if isinstance(mask_ann, list):
        # polygon -- a single object might consist of multiple parts
        # we merge all parts into one mask rle code
        rles = maskUtils.frPyObjects(mask_ann, img_h, img_w)
        rle = maskUtils.merge(rles)
    elif isinstance(mask_ann['counts'], list):
        # uncompressed RLE
        rle = maskUtils.frPyObjects(mask_ann, img_h, img_w)
    else:
        # rle
        rle = mask_ann
    mask = maskUtils.decode(rle)
    return mask



def construct_det_results(det_json_path):

    results = dict()
    bbox_results = dict()
    scores  = dict()
    with open(det_json_path) as f:
        json_data = json.load(f)
    for info in json_data:
        img_id = info["image_id"]
        if img_id not in results:
            results[img_id] = list()
            scores[img_id] = list()
            bbox_results[img_id] = list()
        bbox = info["bbox"]
        x1, y1, x2, y2 = bbox[0], bbox[1], bbox[0] + bbox[2], bbox[1] + bbox[3]
        segm = [[x1, y1, x2, y1, x2, y2, x1, y2]]
        # mask = _poly2mask(segm, img_h=1544, img_w=2064)
        score = info["score"]
        # results[img_id].append([x1, y1, x2, y2, score])
        results[img_id].append(segm)
        bbox_results[img_id].append([x1, y1, x2, y2])
        scores[img_id].append(score)
    return results, scores, bbox_results
    
    
def construct_gt_results(gt_json_path):

    results = dict()
    bbox_results = dict()
    cocoGt = COCO(annotation_file=gt_json_path)
    # cat_ids = cocoGt.getCatIds()
    img_ids = cocoGt.getImgIds()
    for id in img_ids:
        anno_ids = cocoGt.getAnnIds(imgIds=[id])
        
        annotations = cocoGt.loadAnns(ids=anno_ids)
        for info in annotations:
            img_id = info["image_id"]
            if img_id not in results:
                results[img_id] = list()
                bbox_results[img_id] = list()
            bbox = info["bbox"]
            x1, y1, x2, y2 = bbox[0], bbox[1], bbox[0] + bbox[2], bbox[1] + bbox[3]
            # results[img_id].append([x1, y1, x2, y2])
            # mask = _poly2mask(info["segmentation"], img_h=1544, img_w=2064)
            results[img_id].append(info["segmentation"])
            bbox_results[img_id].append([x1, y1, x2, y2])
    return results, img_ids, cocoGt, bbox_results



def draw_bbox(img_id, cocoGt, det_bboxes, gt_bboxes):
    path = cocoGt.loadImgs(ids=[img_id])[0]["file_name"]
    img_path = os.path.join(IMG_ROOT, path)
    img_data = cv2.imread(img_path)
    for box in det_bboxes:
        # color_mask = (0, 0, 255)
        # color_mask = np.array([0, 0, 255], dtype=np.int8)
        # bbox_mask = box.astype(np.bool)
        cv2.rectangle(img_data, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 0, 255), 3)
        # img_data[bbox_mask] = img_data[bbox_mask] * 0.5 + color_mask * 0.5
    for box in gt_bboxes:
        # color_mask = np.array([0, 255, 0], dtype=np.int8)
        # bbox_mask = box.astype(np.bool)

        # img_data[bbox_mask] = img_data[bbox_mask] * 0.5 + color_mask * 0.5
        cv2.rectangle(img_data, (int(box[0]), int(box[1])), (int(box[2]), int(box[3])), (0, 255, 0), 3)
    
    return img_data, path

def eval_pr(img_id):
    tp, fp, gt = 0, 0, 0
    gt_bboxes, gt_ignore = [], []
    det_bboxes = list()
    gt_bboxes = list()
    det_thrs_scores = list()
    
    plot_det_bboxes = list()
    plot_gt_bboxes  = list()
    
    if img_id in det_results:
        # for dt in det_results[img_id]:
        for idx, score in enumerate(det_scores[img_id]):
            # score = dt[-1]
            if score > conf_thrs:
                mask = _poly2mask(det_results[img_id][idx], img_h=1544, img_w=2064)
                det_bboxes.append(mask)
                det_thrs_scores.append(score)
                plot_det_bboxes.append(det_tmp_bboxes[img_id][idx])
    if img_id in gt_results:     
        for segm in gt_results[img_id]:
            mask = _poly2mask(segm, img_h=1544, img_w=2064)   
            gt_bboxes.append(mask)
        plot_gt_bboxes = gt_tmp_bboxes[img_id]
            
    det_bboxes = np.array(det_bboxes)
    gt_bboxes = np.array(gt_bboxes)
    det_thrs_scores = np.array(det_thrs_scores)
    gt_ignore = np.array(gt_ignore).reshape(-1, 4)
    
    if len(gt_bboxes) > 0:
        if len(det_bboxes) == 0:
            tp, fp = 0, 0 
        else:
            tp, fp = tpfp_default(det_bboxes, gt_bboxes, gt_ignore, det_thrs_scores, iou_thrs)
            tp, fp = np.sum(tp == 1), np.sum(fp == 1)
        gt = len(gt_bboxes)
        
    else:
        fp = len(det_bboxes)
        
        
    if VIS and (fp > 0 or tp < gt):
        img_data, path = draw_bbox(img_id=img_id, cocoGt=cocoGt, det_bboxes=plot_det_bboxes, gt_bboxes=plot_gt_bboxes)
        if fp > 0:
            save_dir = os.path.join(VIS_ROOT, "tmp/FP/")
            os.makedirs(save_dir, exist_ok=True)
            cv2.imwrite(os.path.join(save_dir, os.path.basename(path)+".jpg"), img_data, [int(cv2.IMWRITE_JPEG_QUALITY), 30])
        if tp < gt:
            save_dir = os.path.join(VIS_ROOT, "tmp/FN/")
            os.makedirs(save_dir, exist_ok=True)
            cv2.imwrite(os.path.join(save_dir, os.path.basename(path)+".jpg"), img_data,
                        [int(cv2.IMWRITE_JPEG_QUALITY), 30])
    return gt, tp, fp

    
def eval_multiprocessing(img_ids):
    from multiprocessing import Pool
    pool = Pool(processes=16)

    results = pool.map(eval_pr, img_ids)
    # Close the pool: no more tasks will be submitted
    pool.close()

    # Wait for all tasks to finish
    pool.join()
    return np.sum(np.array(results), axis=0)



if __name__ == '__main__':
    VIS = 1
    IMG_ROOT = "gaotie_data"
    VIS_ROOT = 'badcase-vis-test-2/'

    conf_thrs = 0.5
    iou_thrs  = 0.001
    det_json_path = "results.bbox.json"
    gt_json_path  = "datasets/gaotie_test_data/annotations/test5_seg_removed.json"
    det_results, det_scores, det_tmp_bboxes = construct_det_results(det_json_path)
    gt_results, img_ids, cocoGt, gt_tmp_bboxes  = construct_gt_results(gt_json_path)

    gt, tp, fp = eval_multiprocessing(img_ids)
    eps = np.finfo(np.float32).eps
    recalls = tp / np.maximum(gt, eps)
    precisions = tp / np.maximum((tp + fp), eps)
    
    print("conf_thrs:{:.3f} iou_thrs:{:.3f}, gt:{:d}, TP={:d}, FP={:d}, P={:.3f}, R={:.3f}".format(conf_thrs, iou_thrs, gt, tp, fp, precisions, recalls))
    

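Summary

This post shows how to compute dataset-level PR for an object detection task whose GT contains polygons. IoU is computed on masks, following the same idea as IoU in semantic segmentation, and the complete code is provided for reference.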
