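Preface
BOP is a benchmark for 6D object pose estimation. It collects multiple datasets in a unified format and runs a recurring challenge; the report on the BOP Challenge 2023 was accepted at CVPR 2024.
It provides 3D object models and RGB-D images, annotated with 6D poses, 2D bounding boxes, 2D masks, and more.
Included datasets: LM, LM-O, T-LESS, ITODD, HB, HOPE, YCB-V, RU-APC, IC-BIN, IC-MI, TUD-L, TYO-L
Dataset overview: https://bop.felk.cvut.cz/datasets/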
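| BOP dataset | Contents | Characteristics |
| --- | --- | --- |
| LM | 15 texture-less household objects of varying color, shape, and size | Scenes have significant clutter but only mild occlusion |
| LM-O | 15 texture-less household objects of varying color, shape, and size | Built on LM, with various levels of occlusion added |
| T-LESS | 30 industry-relevant objects | No significant texture or discriminative color |
| ITODD | 28 objects captured in realistic industrial settings | Captured with a high-quality Gray-D sensor |
| HB | 33 objects: 17 toys, 8 household objects, and 8 industry-relevant objects | Captured in 13 scenes of varying complexity |
| HOPE | 28 toy grocery objects, captured in 50 scenes from 10 household/office environments | Cluttered scenes with varying degrees of occlusion |
| YCB-V | 21 everyday objects of varying shape, size, texture, weight, and rigidity | Captured in 92 videos |
| RU-APC | 14 textured products in cluttered warehouse-shelf scenes | Cluttered scenes |
| IC-BIN | 2 objects from IC-MI in bin-picking scenes | Heavy occlusion |
| IC-MI | 2 texture-less and 4 textured household objects | Clutter and mild occlusion |
| TUD-L | 3 moving objects | Captured under 8 lighting conditions |
| TYO-L | 21 objects, each captured in multiple poses in a table-top setup | 4 different tablecloths and 5 different lighting conditions |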
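1. Dataset Download
BOP hosts the collected datasets in one place.
Download page: https://huggingface.co/datasets/bop-benchmark/datasets/tree/main
For example, to download YCB-V, open its folder on that page and click the download button for the files you need.
The LM (Linemod) dataset can be downloaded the same way.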
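2. Data Format
The datasets use the BOP format (the MegaPose datasets in Section 3 use its sharded BOP-webdataset variant). A dataset has the following structure: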
DATASET_NAME
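├─ camera[_TYPE].json # camera parameters (for sensor simulation only)
├─ dataset_info.json # dataset-specific information
├─ test_targets_bop19.json # list of test targets used for evaluation in the BOP Challenge 2019/2020/2022, etc.
├─ models[_MODELTYPE][_eval] # 3D object models
│ ├─ models_info.json
│ ├─ obj_OBJ_ID.ply
├─ train|val|test[_TYPE] # training, validation, and test splits
│ ├─ SCENE_ID|OBJ_ID
│ │ ├─ scene_camera.json # camera parameters (of the real camera for real data)
│ │ ├─ scene_gt.json
│ │ ├─ scene_gt_info.json
│ │ ├─ depth # depth images
│ │ ├─ mask # masks of the full object silhouettes
│ │ ├─ mask_visib # masks of the visible parts of the object silhouettes
│ │ ├─ rgb|gray # color / grayscale images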
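- Corresponding images share the same ID, e.g. rgb/000000.png and depth/000000.png are the color and depth images of the same RGB-D frame.
- Masks are named IMID_GTID.png, where IMID is the image ID and GTID is the index of the ground-truth annotation (stored in scene_gt.json).
For details, see the official documentation: https://github.com/thodan/bop_toolkit/blob/master/docs/bop_datasets_format.md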
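The naming conventions above can be captured in a small helper. This is a minimal sketch, assuming six-digit zero-padded IDs (as in rgb/000000.png) and PNG files; the function name `frame_paths` and the example root `ycbv/test` are hypothetical, and some datasets may store images with other extensions.

```python
from pathlib import Path


def frame_paths(split_dir, scene_id, im_id, gt_id):
    """Build the per-frame file paths for one annotated object instance.

    Assumes six-digit zero-padded IDs and PNG files (an assumption based on
    the rgb/000000.png example; check your dataset before relying on it).
    """
    scene_dir = Path(split_dir) / f"{scene_id:06d}"
    im_name = f"{im_id:06d}.png"
    mask_name = f"{im_id:06d}_{gt_id:06d}.png"  # IMID_GTID.png
    return {
        "rgb": scene_dir / "rgb" / im_name,
        "depth": scene_dir / "depth" / im_name,
        "mask": scene_dir / "mask" / mask_name,
        "mask_visib": scene_dir / "mask_visib" / mask_name,
    }


# Hypothetical example: scene 48, image 1, first ground-truth annotation.
paths = frame_paths("ycbv/test", scene_id=48, im_id=1, gt_id=0)
print(paths["rgb"].as_posix())   # ycbv/test/000048/rgb/000001.png
print(paths["mask"].as_posix())  # ycbv/test/000048/mask/000001_000000.png
```

Note that one image has a single rgb and depth file but one mask and one mask_visib file per annotated object, which is why the mask name carries the GTID.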
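2.1 Camera Parameters: scene_camera.json
Images are grouped into sets, typically one per scene or per object (the SCENE_ID|OBJ_ID folders), so a dataset consists of multiple sets.
Each set has a scene_camera.json file with the parameters of the real camera, containing the following per-image information:
- cam_K - 3x3 intrinsic camera matrix K (stored row by row as 9 values).
- depth_scale - multiply the depth image by this factor to get depth in millimetres.
- cam_R_w2c (optional) - 3x3 rotation matrix R_w2c (world to camera).
- cam_t_w2c (optional) - 3x1 translation vector t_w2c (world to camera).
- view_level (optional) - viewpoint subdivision level.
Note that K may differ from image to image. camera.json, by contrast, only describes the sensor that is simulated when rendering training images.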
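To illustrate how cam_K and depth_scale work together, the sketch below back-projects one pixel with a raw depth value into a 3D point in camera coordinates. It assumes a pinhole model without skew; the function name `backproject` and the raw depth value 8650 are made up for illustration, while the camera entry is copied from the YCB-V sample in Section 2.5.

```python
import json

# One entry copied from the YCB-V sample in this article (image ID "1").
scene_camera = json.loads("""
{"1": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0],
       "depth_scale": 0.1}}
""")


def backproject(cam, u, v, raw_depth):
    """Convert pixel (u, v) with a raw depth value to a 3D point in mm.

    Assumes a pinhole camera with zero skew (cam_K[1] == 0).
    """
    fx, _, cx, _, fy, cy, _, _, _ = cam["cam_K"]  # row-major 3x3 matrix
    z = raw_depth * cam["depth_scale"]  # raw value -> millimetres
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z


cam = scene_camera["1"]
# A pixel at the principal point lies on the optical axis, so x and y are ~0.
print(backproject(cam, 312.9869, 241.3109, 8650))
```

Because K can differ per image, the camera entry should be looked up by image ID for every frame rather than cached once per scene.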
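2.2 Ground-Truth Pose Information: scene_gt_info.json
The poses themselves (cam_R_m2c, cam_t_m2c, and obj_id per annotation) are stored in scene_gt.json. scene_gt_info.json provides the following meta-information about each ground-truth pose:
- bbox_obj - 2D bounding box of the object silhouette, given as (x, y, width, height), where (x, y) is the top-left corner of the box.
- bbox_visib - 2D bounding box of the visible part of the object silhouette.
- px_count_all - number of pixels in the object silhouette.
- px_count_valid - number of pixels in the object silhouette with a valid depth measurement (i.e. a non-zero value in the depth image).
- px_count_visib - number of pixels in the visible part of the object silhouette.
- visib_fract - visible fraction of the object silhouette (= px_count_visib / px_count_all).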
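As a small illustration of these fields, the sketch below converts a bounding box to corner form and filters annotations by visibility. The helper names and the `min_visib` threshold are hypothetical choices for this example; the annotation values are the first entry of image "1" in the YCB-V sample from Section 2.5.

```python
def bbox_to_corners(bbox):
    """Convert (x, y, width, height) to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = bbox
    return x, y, x + w, y + h


def visible_instances(gt_info, min_visib=0.1):
    """Return indices of annotations whose visible fraction is >= min_visib.

    min_visib is an illustrative threshold, not a value prescribed here.
    """
    return [i for i, g in enumerate(gt_info) if g["visib_fract"] >= min_visib]


# First annotation of image "1" in the YCB-V sample from this article.
info = {"bbox_obj": [206, 126, 132, 194], "px_count_all": 23236,
        "px_count_visib": 21729, "visib_fract": 0.9351437424685832}

print(bbox_to_corners(info["bbox_obj"]))               # (206, 126, 338, 320)
print(info["px_count_visib"] / info["px_count_all"])   # agrees with visib_fract
print(visible_instances([info], min_visib=0.95))       # []
```

The returned indices are positions in the per-image list, so they can be used directly as the GTID when looking up the matching pose in scene_gt.json or the matching mask file.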
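2.3 3D Object Models
The 3D object models are provided in PLY format and include vertex normals.
- Most models also include vertex colors or vertex texture coordinates, with the texture saved as a separate image.
- The vertex normals were computed with MeshLab as the angle-weighted sum of the normals of the faces incident to each vertex.
- Each folder with object models contains a models_info.json file, which gives the 3D bounding box and the diameter of each object model. The diameter is computed as the largest distance between any pair of model vertices.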
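The diameter definition above can be written down directly. The following is a brute-force sketch that compares every pair of vertices; it is meant to make the definition concrete, not to reproduce how the published values were generated, and the box model is a hypothetical toy input.

```python
import itertools
import math


def model_diameter(vertices):
    """Largest Euclidean distance between any two vertices (brute force)."""
    return max(
        (math.dist(a, b) for a, b in itertools.combinations(vertices, 2)),
        default=0.0,
    )


# Hypothetical toy model: the 8 corners of a 100 x 50 x 20 mm box.
box = list(itertools.product((0.0, 100.0), (0.0, 50.0), (0.0, 20.0)))
print(model_diameter(box))  # equals sqrt(100^2 + 50^2 + 20^2), the box diagonal
```

The pairwise loop is quadratic in the number of vertices, so for real meshes with many vertices it is usually worth restricting the search to the convex hull or simply reading the value from models_info.json.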
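2.4 Coordinate Systems
All coordinate systems (model, camera, world) are right-handed.
- In the model coordinate system, the Z axis points up (when the object stands "naturally upright") and the origin coincides with the center of the 3D bounding box of the object model.
- The camera coordinate system follows OpenCV: the camera looks along the Z axis.
Units:
- 3D object models: 1 mm
- Translation vectors: 1 mm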
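These conventions fit together as follows: a model point in millimetres is moved into the camera frame with the model-to-camera pose and then projected with K. The sketch below is a minimal pure-Python version, assuming a pinhole camera with zero skew; the function name `project` is hypothetical, and the numbers are copied from the YCB-V sample in Section 2.5 (image "1", obj_id 1).

```python
def project(point_m, R_m2c, t_m2c, K):
    """Project a 3D model point (mm) to pixel coordinates (u, v).

    R_m2c and K are row-major 3x3 matrices given as flat lists of 9 values;
    t_m2c is a list of 3 values in mm. Assumes zero skew (K[1] == 0).
    """
    # Model -> camera coordinates: X_c = R * X_m + t
    xc, yc, zc = (
        sum(R_m2c[3 * r + c] * point_m[c] for c in range(3)) + t_m2c[r]
        for r in range(3)
    )
    # Pinhole projection; the camera looks along +Z (OpenCV convention).
    fx, cx, fy, cy = K[0], K[2], K[4], K[5]
    return fx * xc / zc + cx, fy * yc / zc + cy


# Values copied from the YCB-V sample in this article (image "1", obj_id 1).
K = [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0]
R = [0.6155426282490462, -0.7872002747152219, -0.03771988906432555,
     -0.19894986552077276, -0.10889926681888931, -0.9739401059834828,
     0.7625789371251613, 0.6070060649389416, -0.22364550906920733]
t = [-31.677025422232273, -17.368816807616497, 865.056765056294]

# The model origin is the center of the model's 3D bounding box, so its
# projection should land inside bbox_obj = [206, 126, 132, 194].
u, v = project([0.0, 0.0, 0.0], R, t, K)
print(round(u, 1), round(v, 1))
```

Because both the models and the translation vectors are in millimetres, no unit conversion is needed between the two steps.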
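2.5 Data Format Example
Taking YCB-V as an example, here is the sample data of one scene folder:
The depth folder holds the depth images.
The mask folder holds the masks of the full object silhouettes.
The mask_visib folder holds the masks of the visible parts of the objects.
The rgb folder holds the color images.
The scene_camera.json file: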
{
"1": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], "cam_R_w2c": [0.775038, 0.630563, -0.0413049, 0.1427, -0.238322, -0.960645, -0.615591, 0.738643, -0.27469], "cam_t_w2c": [22.278120142899976, 67.27103635299997, 833.583980809], "depth_scale": 0.1},
"36": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], "cam_R_w2c": [0.780182, 0.625096, -0.0238987, 0.151021, -0.225288, -0.962516, -0.607049, 0.747329, -0.270168], "cam_t_w2c": [5.508214978099937, 64.68344464100001, 825.7533207070001], "depth_scale": 0.1},
"47": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], "cam_R_w2c": [0.767769, 0.640207, -0.0257968, 0.152583, -0.221792, -0.963082, -0.622293, 0.735488, -0.26797], "cam_t_w2c": [27.66960968699998, 63.071926080000004, 832.8279904959999], "depth_scale": 0.1},
"83": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], "cam_R_w2c": [0.747494, 0.663726, -0.0268341, 0.151009, -0.209129, -0.966158, -0.646876, 0.718145, -0.256551], "cam_t_w2c": [47.37378315260004, 70.23191834300002, 839.83011703], "depth_scale": 0.1},
"112": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], "cam_R_w2c": [0.732966, 0.679258, -0.0369964, 0.154704, -0.219403, -0.963291, -0.66244, 0.700336, -0.265899], "cam_t_w2c": [57.30521621640001, 60.03261259300005, 845.084446656], "depth_scale": 0.1},
"1024": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], "cam_R_w2c": [0.221497, 0.974633, -0.0320828, 0.293663, -0.0980389, -0.950868, -0.929893, 0.201193, -0.307929], "cam_t_w2c": [84.5553717548, 78.09197834190002, 966.855761809], "depth_scale": 0.1},
"1027": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], "cam_R_w2c": [0.208285, 0.977538, -0.0322094, 0.291752, -0.0935287, -0.95191, -0.933541, 0.188871, -0.304679], "cam_t_w2c": [95.29885425239999, 83.36950151319999, 964.165435848], "depth_scale": 0.1},
"1059": {"cam_K": [1066.778, 0.0, 312.9869, 0.0, 1067.487, 241.3109, 0.0, 0.0, 1.0], "cam_R_w2c": [0.184825, 0.982726, -0.00947274, 0.310088, -0.0674605, -0.948311, -0.932569, 0.172334, -0.3172], "cam_t_w2c": [100.85003064122, 67.49651895950004, 969.0728422079999], "depth_scale": 0.1},
...
}
The scene_gt.json file:
{
"1": [{"cam_R_m2c": [0.6155426282490462, -0.7872002747152219, -0.03771988906432555, -0.19894986552077276, -0.10889926681888931, -0.9739401059834828, 0.7625789371251613, 0.6070060649389416, -0.22364550906920733], "cam_t_m2c": [-31.677025422232273, -17.368816807616497, 865.056765056294], "obj_id": 1}, {"cam_R_m2c": [-0.888800154374123, 0.45524946687574985, -0.05275501966907288, 0.15161264602174085, 0.18344755196788856, -0.9712669209190921, -0.4324911950904046, -0.8712604167821704, -0.2320697374100931], "cam_t_m2c": [19.826959191155, 56.52491050704691, 810.7227026810766], "obj_id": 6}, {"cam_R_m2c": [0.800270328363595, -0.5984833728683278, -0.03721742358490441, -0.16673638817023093, -0.16247691893389601, -0.97252257766676, 0.5759917033378716, 0.7844862673094515, -0.22981459219802255], "cam_t_m2c": [-23.310093543898354, -112.62940453072957, 848.1679482820954], "obj_id": 14}, {"cam_R_m2c": [-0.09489741249029623, 0.3561386300696222, -0.9296019480411393, -0.34937416104044705, 0.8625018988232865, 0.36609704814610067, 0.932165006412512, 0.3595205652274382, 0.04257586846391211], "cam_t_m2c": [52.33599371521044, -13.031296641365195, 861.5899289137005], "obj_id": 19}, {"cam_R_m2c": [-0.9984789779904625, -0.046702003100451896, 0.029310392655164948, 0.04118009201782687, -0.27814158992816, 0.9596564448294642, -0.036666017780062427, 0.9594042011059428, 0.27964165456635065], "cam_t_m2c": [-24.98442834314123, 91.09670661036117, 685.170819763342], "obj_id": 20}],
"36": [{"cam_R_m2c": [0.609160304011732, -0.7927779371553533, -0.020665842847375732, -0.18625494062652728, -0.11768910284587072, -0.9754267744096374, 0.7708650760967027, 0.5980404962914244, -0.2193500045932242], "cam_t_m2c": [-47.54392055014297, -20.624979740453274, 857.3694396811588], "obj_id": 1}, {"cam_R_m2c": [-0.885654032823929, 0.46297766711836735, -0.03563597970836668, 0.13642613789028796, 0.18608279820967355, -0.9730156842215326, -0.4438532397201184, -0.8666167711437901, -0.22796686931874305], "cam_t_m2c": [3.517734961917874, 53.76740226064, 803.2989808314162], "obj_id": 6}, {"cam_R_m2c": [0.7956314653073248, -0.6054474964433687, -0.020109911212931725, -0.15218610956201728, -0.1676389990666953, -0.9740308438419983, 0.5863531148783824, 0.7780298269242845, -0.2255194250661368], "cam_t_m2c": [-37.57128763014816, -115.82934553753682, 841.044974335238], "obj_id": 14}, {"cam_R_m2c": [-0.10098129392773866, 0.3390290986563458, -0.9353408405023759, -0.34637500340565003, 0.8693469251931312, 0.35250354851560134, 0.9326447803199308, 0.35957456267807025, 0.029642972835050057], "cam_t_m2c": [36.431813052732934, -15.061671846474432, 854.889665423068], "obj_id": 19}, {"cam_R_m2c": [-0.9984623750612651, -0.05417273393821578, 0.011792611042814375, 0.02623639801568292, -0.274310799977981, 0.9612824039468447, -0.04884079999762013, 0.9601139585387595, 0.27530992305310614], "cam_t_m2c": [-40.27985947489197, 87.0849679017978, 677.0522075201466], "obj_id": 20}],
"47": [{"cam_R_m2c": [0.6245636521890725, -0.7806766111406835, -0.02154245471722178, -0.18248158481557564, -0.11905916022552619, -0.975974147637646, 0.7593548693744658, 0.6134888758861301, -0.21681937469597998], "cam_t_m2c": [-24.395090683468215, -21.783960857519098, 865.0794295130081], "obj_id": 1}, {"cam_R_m2c": [-0.894520458378818, 0.4455202651911757, -0.03668433387925216, 0.13243670002639793, 0.18573890794853912, -0.9736329537290823, -0.42695875163671215, -0.8757921280786526, -0.22515068766577487], "cam_t_m2c": [25.357235475837868, 52.66179328072182, 809.873745451658], "obj_id": 6}, {"cam_R_m2c": [0.807320112848335, -0.5897361063376075, -0.02111051938004192, -0.1481874231217827, -0.16797481282866994, -0.9745899708885118, 0.5712038666214954, 0.7899336536550774, -0.22300135488429054], "cam_t_m2c": [-14.412173387618193, -116.98979989756661, 848.76946639409], "obj_id": 14}, {"cam_R_m2c": [-0.08176790305726558, 0.34291594675714004, -0.9358005665368483, -0.34465464710178356, 0.8712861049416898, 0.34938989394733905, 0.9351606006571631, 0.3510967987887798, 0.046944407423005166], "cam_t_m2c": [59.49748765951539, -15.940989444141788, 860.9665492937945], "obj_id": 19}, {"cam_R_m2c": [-0.9993032187000758, -0.03468586763469483, 0.013799769394337228, 0.02273746658654025, -0.27237133848523803, 0.9619233928797483, -0.02960648331060263, 0.9615662504172701, 0.2729703664836574], "cam_t_m2c": [-20.983454465364908, 85.55142809136696, 684.4252367092936], "obj_id": 20}],
"83": [{"cam_R_m2c": [0.6483560298162269, -0.7610510488761957, -0.02086610662784509, -0.1692011189126235, -0.11731610680604301, -0.9785743648106761, 0.7422968557005021, 0.6379953191679695, -0.2048329708565904], "cam_t_m2c": [-3.182798020397842, -14.00835493497242, 873.8116262427217], "obj_id": 1}, {"cam_R_m2c": [-0.9078584178495492, 0.4177062056136942, -0.03626756862269121, 0.12116889278699025, 0.17857338188688224, -0.9764368511592453, -0.4013865966547271, -0.8908605421313989, -0.21273198641370872], "cam_t_m2c": [44.345559041277, 60.13133053644625, 816.2833963297118], "obj_id": 6}, {"cam_R_m2c": [0.8251367861465224, -0.5645561164478825, -0.020629748503466508, -0.13583743253946046, -0.16282504159094316, -0.9772597366958712, 0.5483580610786056, 0.8091749254401457, -0.2110406512057419], "cam_t_m2c": [7.00537338156821, -109.3266555886105, 858.3053675688028], "obj_id": 14}, {"cam_R_m2c": [-0.051182081267789885, 0.34687788183738805, -0.9365126806882695, -0.33423610323086433, 0.8777158286503173, 0.3433663873103977, 0.9410979073728976, 0.3305899343507701, 0.07101581339481473], "cam_t_m2c": [80.50375242502996, -7.627733393267646, 867.1132222035583], "obj_id": 19}, {"cam_R_m2c": [-0.9998814983012471, -0.003861774590082771, 0.014912517026945876, 0.015403543280134506, -0.26132102760791526, 0.9651287621047637, 0.0001699907030837675, 0.9652438747296362, 0.2613491515087478], "cam_t_m2c": [-5.972908641135359, 91.22401864073694, 691.9137142738838], "obj_id": 20}],
"112": [{"cam_R_m2c": [0.6647649468317126, -0.7464489096042396, -0.03001657167948718, -0.17934682759164805, -0.1204579130201553, -0.9763836821566041, 0.725204605642379, 0.6544489331842311, -0.21394953128846064], "cam_t_m2c": [7.423095042063199, -24.208037318315366, 878.8066953601489], "obj_id": 1}, {"cam_R_m2c": [-0.9164409834258465, 0.3975651734102653, -0.045588733607644205, 0.12901204313969786, 0.1856895576876478, -0.9741022379218828, -0.3788031962736472, -0.8985881514318583, -0.22146426682502685], "cam_t_m2c": [53.92454489616582, 50.30051928831908, 820.916944720947], "obj_id": 6}, {"cam_R_m2c": [0.8371121143302147, -0.5462122309108142, -0.029919488718101708, -0.14479700119590075, -0.1685049959179436, -0.9750081519139011, 0.5275190636003836, 0.8205228433109507, -0.22014774919225136], "cam_t_m2c": [16.82039765317768, -119.40906570753043, 862.1248702849944], "obj_id": 14}, {"cam_R_m2c": [-0.03066579499454619, 0.35842471964468575, -0.9330546466130516, -0.3433961351884973, 0.8728955476024446, 0.34660120723264787, 0.9386892054263638, 0.3310359762596514, 0.09631358656249628], "cam_t_m2c": [90.9578870625636, -18.115127756738985, 870.214102795854], "obj_id": 19}, {"cam_R_m2c": [-0.9995255713303868, 0.017530980458125914, 0.02532545822207572, 0.019633115389425172, -0.27092669739461855, 0.9623997900400193, 0.023733079584714534, 0.9624400538470402, 0.27045397974048285], "cam_t_m2c": [0.8525459893978651, 82.84611660913978, 698.0728419733201], "obj_id": 20}],
"1024": [{"cam_R_m2c": [0.9704843061877538, -0.24109489453234348, 0.005747985968890892, -0.05702875853463301, -0.2525872788160113, -0.9658917605444589, 0.23432352592357092, 0.9370555030237454, -0.2588812841574994], "cam_t_m2c": [71.41915012233497, -8.855300323114646, 1000.9610991395713], "obj_id": 1}, {"cam_R_m2c": [-0.9818049404650917, -0.1895412287960604, -0.011535627356135513, -0.03769862638683262, 0.25409208731971417, -0.9664445035770324, 0.18611318695445572, -0.9484256537221011, -0.2566141917897591], "cam_t_m2c": [67.01964196225809, 72.7948380863339, 934.7868376282179], "obj_id": 6}, {"cam_R_m2c": [0.9997595271389749, 0.021799496359123954, 0.0022441839567555817, 0.007919269824689405, -0.26388702002464454, -0.9645210969783664, -0.020434334502123813, 0.9643072544399508, -0.2639963679079688], "cam_t_m2c": [84.42283358552737, -100.81966035400322, 972.6189230163418], "obj_id": 14}, {"cam_R_m2c": [0.5378109085736478, 0.3445428353846606, -0.7694472931938168, -0.3433293601926506, 0.9230761784574881, 0.17336187404666403, 0.76998924176551, 0.17093743837230568, 0.6147327127097234], "cam_t_m2c": [134.73122615884228, 12.143600548964699, 949.5835934469427], "obj_id": 19}, {"cam_R_m2c": [-0.8145733310772104, 0.5795937271002144, 0.023271274982038334, -0.16022702734378425, -0.2633815493028152, 0.9512920548835528, 0.5574926601599337, 0.77116864864905, 0.3074100481839816], "cam_t_m2c": [-48.587286695603, 95.27932302573576, 863.400450652429], "obj_id": 20}],
"1027": [{"cam_R_m2c": [0.9736380541620182, -0.2280117975833827, 0.006327552801320991, -0.05238012971444529, -0.25049721582618745, -0.9666989890232974, 0.22200334685406545, 0.9408839944912755, -0.25583655962012575], "cam_t_m2c": [82.84236377136222, -3.202382029485854, 998.3668973579089], "obj_id": 1}, {"cam_R_m2c": [-0.9791725627289203, -0.20273762671433873, -0.010923135997289196, -0.0410220439845687, 0.2502411897729927, -0.9673135255248978, 0.19884576291409187, -0.9467190984057884, -0.2533455910721295], "cam_t_m2c": [77.26279404150836, 78.21354304065359, 931.9933078713171], "obj_id": 6}, {"cam_R_m2c": [0.999374696732068, 0.03525470569420298, 0.0027528023614022453, 0.011854531524012881, -0.2606557192210831, -0.9653590276533489, -0.03331690907664211, 0.9647882098792482, -0.26091052240056956], "cam_t_m2c": [95.85264694795609, -95.20576749627322, 970.1547235570508], "obj_id": 14}, {"cam_R_m2c": [0.5490441191287648, 0.34297485437741737, -0.7621805354210485, -0.33866431656913015, 0.9249997374251607, 0.17228159729595274, 0.7641048830135074, 0.1635321897826169, 0.624019860272118], "cam_t_m2c": [145.4036728839161, 17.882345049387965, 946.1122610800769], "obj_id": 19}, {"cam_R_m2c": [-0.8066888252132398, 0.5905149883916639, 0.023371657751691022, -0.16166990776914283, -0.25854647899038274, 0.95237368538674, 0.5684344898274531, 0.764490516199126, 0.30403461998260806], "cam_t_m2c": [-39.34184312110956, 100.00247891645904, 862.0280513435945], "obj_id": 20}],
"1059": [{"cam_R_m2c": [0.9782295367682443, -0.20533659652444655, 0.030068581891794056, -0.026945973299812093, -0.269340018094528, -0.9626676876287587, 0.20576934213763715, 0.9409000441338533, -0.2690093747814608], "cam_t_m2c": [91.91297764509498, -19.065713859707213, 1001.8789157884851], "obj_id": 1}, {"cam_R_m2c": [-0.9742128751147976, -0.22526372153623223, 0.012885668716460606, -0.07208852214553628, 0.25663206687277745, -0.9638163225739518, 0.21380719264399567, -0.9398914255637064, -0.26625313092092107], "cam_t_m2c": [82.90533755190265, 63.01687304404585, 936.714193434043], "obj_id": 6}, {"cam_R_m2c": [0.9979408481092544, 0.05846768501180202, 0.026374504656193298, 0.041352748038722534, -0.27214957636651804, -0.9613658434349226, -0.049031741700048634, 0.9604769635513392, -0.2740072162843031], "cam_t_m2c": [107.1130420578321, -110.27386886274778, 972.2166924287133], "obj_id": 14}, {"cam_R_m2c": [0.5709597738530855, 0.31874303674173754, -0.7565764595490821, -0.3325428987822525, 0.9323597738062869, 0.1418421956290973, 0.750612474435284, 0.17060742606646714, 0.6383363540398707], "cam_t_m2c": [152.9804531357707, 4.513159355754437, 948.9277642986692], "obj_id": 19}, {"cam_R_m2c": [-0.7923237136450533, 0.6101010484593035, 0.000812568704820232, -0.1925048617747842, -0.2512647153904161, 0.9485814388073351, 0.5789352987188576, 0.7514269629780099, 0.3165303330408628], "cam_t_m2c": [-35.40333944213483, 82.3207125125342, 868.9133541232341], "obj_id": 20}],
...
}
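When loading poses like the ones above, a quick sanity check is to confirm that each cam_R_m2c is a proper rotation (orthonormal rows, determinant +1), which is what a right-handed coordinate system requires. This is a minimal sketch; the function name `is_rotation` and the tolerance are choices made for this example. The tolerance is deliberately loose because the stored values are only orthonormal up to rounding.

```python
def is_rotation(R, tol=1e-3):
    """Check that a flat row-major 3x3 matrix is orthonormal with det = +1."""
    rows = [R[0:3], R[3:6], R[6:9]]
    # Rows must be unit length and mutually perpendicular.
    for i in range(3):
        for j in range(3):
            dot = sum(a * b for a, b in zip(rows[i], rows[j]))
            if abs(dot - (1.0 if i == j else 0.0)) > tol:
                return False
    # A determinant of -1 would indicate a reflection (left-handed frame).
    (a, b, c), (d, e, f), (g, h, k) = rows
    det = a * (e * k - f * h) - b * (d * k - f * g) + c * (d * h - e * g)
    return abs(det - 1.0) <= tol


# cam_R_m2c of the first annotation of image "1" in the sample above.
R = [0.6155426282490462, -0.7872002747152219, -0.03771988906432555,
     -0.19894986552077276, -0.10889926681888931, -0.9739401059834828,
     0.7625789371251613, 0.6070060649389416, -0.22364550906920733]

print(is_rotation(R))                                             # True
print(is_rotation([1.0, 0, 0, 0, 1.0, 0, 0, 0, -1.0]))            # False
```

The second call passes a mirror matrix, which is orthonormal but has determinant -1 and is therefore rejected.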
The scene_gt_info.json file:
{
"1": [{"bbox_obj": [206, 126, 132, 194], "bbox_visib": [206, 128, 132, 192], "px_count_all": 23236, "px_count_valid": 21318, "px_count_visib": 21729, "visib_fract": 0.9351437424685832}, {"bbox_obj": [282, 280, 112, 73], "bbox_visib": [282, 280, 112, 73], "px_count_all": 7335, "px_count_valid": 4794, "px_count_visib": 7316, "visib_fract": 0.9974096796182685}, {"bbox_obj": [209, 45, 137, 113], "bbox_visib": [209, 45, 137, 113], "px_count_all": 12243, "px_count_valid": 10658, "px_count_visib": 12206, "visib_fract": 0.9969778649023932}, {"bbox_obj": [329, 116, 90, 213], "bbox_visib": [329, 116, 90, 213], "px_count_all": 7586, "px_count_valid": 5142, "px_count_visib": 7448, "visib_fract": 0.9818085947798576}, {"bbox_obj": [92, 332, 349, 127], "bbox_visib": [92, 332, 349, 127], "px_count_all": 25062, "px_count_valid": 16519, "px_count_visib": 25044, "visib_fract": 0.999281781182667}],
"36": [{"bbox_obj": [186, 122, 132, 194], "bbox_visib": [186, 124, 132, 192], "px_count_all": 23617, "px_count_valid": 21257, "px_count_visib": 22245, "visib_fract": 0.9419062539695982}, {"bbox_obj": [260, 277, 113, 73], "bbox_visib": [260, 277, 113, 73], "px_count_all": 7411, "px_count_valid": 4406, "px_count_visib": 7411, "visib_fract": 1.0}, {"bbox_obj": [191, 40, 137, 112], "bbox_visib": [191, 40, 137, 112], "px_count_all": 12409, "px_count_valid": 11153, "px_count_visib": 12382, "visib_fract": 0.9978241598839552}, {"bbox_obj": [311, 112, 87, 215], "bbox_visib": [311, 112, 87, 215], "px_count_all": 7724, "px_count_valid": 5485, "px_count_visib": 7681, "visib_fract": 0.994432936302434}, {"bbox_obj": [63, 326, 355, 127], "bbox_visib": [63, 326, 355, 127], "px_count_all": 25528, "px_count_valid": 18324, "px_count_visib": 25413, "visib_fract": 0.9954951425885302}],
"47": [{"bbox_obj": [216, 122, 130, 192], "bbox_visib": [216, 124, 130, 190], "px_count_all": 23159, "px_count_valid": 21000, "px_count_visib": 21673, "visib_fract": 0.9358348806079709}, {"bbox_obj": [289, 275, 112, 73], "bbox_visib": [289, 275, 112, 73], "px_count_all": 7260, "px_count_valid": 4759, "px_count_visib": 7260, "visib_fract": 1.0}, {"bbox_obj": [222, 40, 135, 111], "bbox_visib": [222, 40, 135, 111], "px_count_all": 12125, "px_count_valid": 10929, "px_count_visib": 12099, "visib_fract": 0.9978556701030927}, {"bbox_obj": [340, 111, 86, 215], "bbox_visib": [340, 111, 86, 215], "px_count_all": 7553, "px_count_valid": 5084, "px_count_visib": 7461, "visib_fract": 0.9878194095061565}, {"bbox_obj": [97, 322, 350, 124], "bbox_visib": [97, 322, 350, 124], "px_count_all": 24848, "px_count_valid": 15756, "px_count_visib": 24838, "visib_fract": 0.9995975531229878}],
"83": [{"bbox_obj": [244, 133, 127, 189], "bbox_visib": [244, 135, 127, 187], "px_count_all": 22665, "px_count_valid": 19893, "px_count_visib": 21319, "visib_fract": 0.9406132803882639}, {"bbox_obj": [315, 285, 111, 71], "bbox_visib": [315, 285, 111, 71], "px_count_all": 7128, "px_count_valid": 4491, "px_count_visib": 7124, "visib_fract": 0.999438832772166}, {"bbox_obj": [250, 52, 133, 110], "bbox_visib": [250, 52, 133, 110], "px_count_all": 11838, "px_count_valid": 9909, "px_count_visib": 11814, "visib_fract": 0.9979726305119108}, {"bbox_obj": [366, 123, 84, 213], "bbox_visib": [366, 123, 84, 213], "px_count_all": 7510, "px_count_valid": 5251, "px_count_visib": 7389, "visib_fract": 0.9838881491344873}, {"bbox_obj": [123, 330, 346, 120], "bbox_visib": [123, 330, 346, 120], "px_count_all": 24201, "px_count_valid": 15320, "px_count_visib": 24158, "visib_fract": 0.9982232139167803}],
"112": [{"bbox_obj": [257, 121, 127, 188], "bbox_visib": [257, 122, 127, 187], "px_count_all": 22397, "px_count_valid": 20236, "px_count_visib": 21050, "visib_fract": 0.939858016698665}, {"bbox_obj": [327, 272, 111, 71], "bbox_visib": [327, 272, 111, 71], "px_count_all": 7043, "px_count_valid": 4349, "px_count_visib": 7035, "visib_fract": 0.9988641204032372}, {"bbox_obj": [262, 41, 133, 109], "bbox_visib": [262, 41, 133, 109], "px_count_all": 11731, "px_count_valid": 10683, "px_count_visib": 11647, "visib_fract": 0.9928394851248827}, {"bbox_obj": [378, 110, 85, 212], "bbox_visib": [378, 110, 85, 212], "px_count_all": 7558, "px_count_valid": 5128, "px_count_visib": 7393, "visib_fract": 0.9781688277322043}, {"bbox_obj": [134, 317, 344, 118], "bbox_visib": [134, 317, 344, 118], "px_count_all": 23666, "px_count_valid": 14994, "px_count_visib": 23584, "visib_fract": 0.9965351136651737}],
"1024": [{"bbox_obj": [334, 149, 112, 170], "bbox_visib": [334, 154, 112, 165], "px_count_all": 17627, "px_count_valid": 15328, "px_count_visib": 15641, "visib_fract": 0.8873319339649401}, {"bbox_obj": [341, 292, 97, 66], "bbox_visib": [341, 292, 97, 66], "px_count_all": 5742, "px_count_valid": 3714, "px_count_visib": 5705, "visib_fract": 0.9935562521769419}, {"bbox_obj": [342, 80, 127, 101], "bbox_visib": [342, 80, 127, 101], "px_count_all": 9552, "px_count_valid": 8781, "px_count_visib": 9541, "visib_fract": 0.9988484087102177}, {"bbox_obj": [428, 154, 97, 198], "bbox_visib": [428, 154, 97, 198], "px_count_all": 9981, "px_count_valid": 7073, "px_count_visib": 9943, "visib_fract": 0.9961927662558862}, {"bbox_obj": [94, 301, 253, 118], "bbox_visib": [94, 301, 253, 118], "px_count_all": 13689, "px_count_valid": 9113, "px_count_visib": 13622, "visib_fract": 0.9951055592081233}],
"1027": [{"bbox_obj": [346, 155, 113, 170], "bbox_visib": [346, 159, 113, 166], "px_count_all": 17731, "px_count_valid": 15453, "px_count_visib": 15699, "visib_fract": 0.885398454683887}, {"bbox_obj": [352, 298, 98, 66], "bbox_visib": [352, 298, 98, 66], "px_count_all": 5778, "px_count_valid": 3499, "px_count_visib": 5778, "visib_fract": 1.0}, {"bbox_obj": [355, 86, 127, 101], "bbox_visib": [355, 86, 127, 101], "px_count_all": 9650, "px_count_valid": 8657, "px_count_visib": 9639, "visib_fract": 0.998860103626943}, {"bbox_obj": [441, 161, 97, 198], "bbox_visib": [441, 161, 97, 198], "px_count_all": 10080, "px_count_valid": 6927, "px_count_visib": 9978, "visib_fract": 0.9898809523809524}, {"bbox_obj": [105, 306, 253, 120], "bbox_visib": [105, 306, 253, 120], "px_count_all": 13745, "px_count_valid": 9595, "px_count_visib": 13491, "visib_fract": 0.9815205529283376}],
"1059": [{"bbox_obj": [354, 138, 116, 170], "bbox_visib": [354, 140, 116, 168], "px_count_all": 17615, "px_count_valid": 14947, "px_count_visib": 15552, "visib_fract": 0.8828839057621345}, {"bbox_obj": [359, 281, 97, 65], "bbox_visib": [359, 281, 97, 65], "px_count_all": 5683, "px_count_valid": 3645, "px_count_visib": 5676, "visib_fract": 0.9987682562027098}, {"bbox_obj": [368, 69, 127, 102], "bbox_visib": [368, 69, 127, 102], "px_count_all": 9599, "px_count_valid": 8503, "px_count_visib": 9585, "visib_fract": 0.9985415147411189}, {"bbox_obj": [450, 146, 95, 196], "bbox_visib": [450, 146, 95, 196], "px_count_all": 10051, "px_count_valid": 5521, "px_count_visib": 9989, "visib_fract": 0.9938314595562631}, {"bbox_obj": [112, 284, 249, 121], "bbox_visib": [112, 284, 249, 121], "px_count_all": 13366, "px_count_valid": 8972, "px_count_visib": 13219, "visib_fract": 0.9890019452341763}],
...
}
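3. BOP Challenge 2023 Datasets (CVPR 2024)
These datasets contain more than 2M images showing more than 50K distinct objects.
The images were originally synthesized for MegaPose using BlenderProc. The objects come from the Google Scanned Objects and ShapeNetCore datasets, and their 3D models can be downloaded from the respective websites.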
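3.1 MegaPose-GSO Dataset
- The 3D object models can be downloaded from Google Scanned Objects. To make the models compatible with the GT poses, center each object, rescale it so that it fits into a unit sphere, and then scale the normalized model by 0.1. For pseudocode, see the corresponding comment referenced in the official dataset description.
- A mapping from the obj_id used in BOP to the original object identifiers is provided.
- A mapping from image keys to the index of the shard that stores each key is provided.
- The dataset is in the BOP-webdataset format and is split into 1040 shards, each containing ~1000 images together with object annotations and camera parameters.
Download the shards with the following URL template, where <SHARD-ID> ranges from 000000 to 001039: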
https://huggingface.co/datasets/bop-benchmark/datasets/resolve/main/MegaPose-GSO/shard-<SHARD-ID>.tar
For example:
https://bop.felk.cvut.cz/media/data/bop_datasets/bop23_datasets/megapose-gso/train_pbr_web/shard-000000.tar
https://bop.felk.cvut.cz/media/data/bop_datasets/bop23_datasets/megapose-gso/train_pbr_web/shard-000001.tar
......
https://bop.felk.cvut.cz/media/data/bop_datasets/bop23_datasets/megapose-gso/train_pbr_web/shard-001039.tar
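The model normalization described at the start of this section (center, fit into a unit sphere, scale by 0.1) can be sketched as follows. This is only an illustration under a stated assumption: it centers the model at the midpoint of its axis-aligned bounding box, whereas the referenced pseudocode may define the center differently, so verify against that source before using the result with the GT poses. The function name `normalize_vertices` is hypothetical.

```python
import math


def normalize_vertices(vertices, final_scale=0.1):
    """Center vertices, fit them into the unit sphere, then scale by final_scale.

    ASSUMPTION: the model is centered at the midpoint of its axis-aligned
    bounding box; the referenced pseudocode may use a different center.
    """
    mins = [min(v[i] for v in vertices) for i in range(3)]
    maxs = [max(v[i] for v in vertices) for i in range(3)]
    center = [(lo + hi) / 2.0 for lo, hi in zip(mins, maxs)]
    centered = [[v[i] - center[i] for i in range(3)] for v in vertices]
    radius = max(math.hypot(*v) for v in centered)  # farthest vertex
    if radius == 0.0:
        raise ValueError("degenerate model: all vertices coincide")
    scale = final_scale / radius
    return [[c * scale for c in v] for v in centered]


# Hypothetical toy input: four corners of a box in arbitrary units.
verts = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0], [2.0, 4.0, 6.0]]
out = normalize_vertices(verts)
print(max(math.hypot(*v) for v in out))  # farthest vertex is now at distance ~0.1
```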
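3.2 MegaPose-ShapeNetCore Dataset
- The 3D object models can be downloaded from ShapeNet (scale the models by 0.1 to make them compatible with the GT poses).
- A mapping from the obj_id used in BOP to the original object identifiers is provided.
- A mapping from image keys to the index of the shard that stores each key is provided.
- The dataset is in the BOP-webdataset format and is split into 1040 shards, each containing ~1000 images together with object annotations and camera parameters.
Download the shards with the following URL template, where <SHARD-ID> ranges from 000000 to 001039: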
https://huggingface.co/datasets/bop-benchmark/datasets/resolve/main/MegaPose-ShapeNetCore/shard-<SHARD-ID>.tar
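Both MegaPose datasets follow the same URL template, so the full list of shard URLs can be generated rather than typed. The sketch below only builds the URLs from the Hugging Face template given in this article and leaves the actual download to a tool of your choice; the function name `shard_urls` is hypothetical.

```python
BASE = "https://huggingface.co/datasets/bop-benchmark/datasets/resolve/main"


def shard_urls(dataset, num_shards=1040):
    """Return the download URL of every shard of a MegaPose dataset.

    Shard IDs are six-digit, zero-padded, and run from 000000 to 001039.
    """
    return [f"{BASE}/{dataset}/shard-{i:06d}.tar" for i in range(num_shards)]


gso = shard_urls("MegaPose-GSO")
shapenet = shard_urls("MegaPose-ShapeNetCore")
print(len(gso))      # 1040
print(gso[0])
print(shapenet[-1])
```

With roughly 1000 images per shard, it is practical to download a handful of shards first and check the pipeline before fetching all 1040.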
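4. BOP Challenge: 6D Pose Estimation
The BOP Challenge 2023 report was accepted at CVPR 2024. The leaderboards for 6D pose estimation are listed below.
Methods are trained and tested mainly on the LM-O, T-LESS, TUD-L, IC-BIN, ITODD, HB, and YCB-V datasets.
Leaderboard for seen objects: https://bop.felk.cvut.cz/leaderboards/
Leaderboard for unseen objects: https://bop.felk.cvut.cz/leaderboards/pose-estimation-unseen-bop23/core-datasets/
Two top-ranked methods worth a look:
Top 1: FoundationPose (CVPR 2024), https://github.com/NVlabs/FoundationPose
Top 2: SAM-6D (CVPR 2024), https://github.com/JiehongLin/SAM-6D/
That's all for now.
This post stops here; later posts will cover other datasets, algorithms, code, and concrete application examples for 6D pose estimation.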