[Paper Reading | Skim] S-NeRF: Neural Radiance Fields for Street Views

Table of Contents

    • 0. Abstract
    • 1. Introduction
    • 2. Related work
    • 3. Methods: NeRF for Street Views
      • 3.1 CAMERA POSE PROCESSING
      • 3.2 REPRESENTATION OF STREET SCENES
      • 3.3 DEPTH SUPERVISION
      • 3.4 Loss function
    • 4. EXPERIMENTS
    • 5. Conclusion
    • Reference

0. Abstract

Problem introduction:

However, we conjecture that this paradigm does not fit the nature of street views, which are collected by many self-driving cars from large-scale unbounded scenes. Also, the onboard cameras perceive the scenes without much overlap.

Solutions:

  • Consider novel view synthesis of both the large-scale background scenes and the foreground moving vehicles jointly.

  • Improve the scene parameterization function and the camera poses.

  • We also use the noisy and sparse LiDAR points to boost the training, and learn a robust geometry and a reprojection-based confidence to address depth outliers.

  • Extend S-NeRF to reconstructing moving vehicles.

Effect:

Reduces 7∼40% of the mean-squared error in street-view synthesis and achieves a 45% PSNR gain for moving-vehicle rendering.

1. Introduction

Overcome the shortcomings of the work of predecessors:

  • MipNeRF-360 (Barron et al., 2022) is designed for training on unbounded scenes, but it still needs enough intersecting camera rays.

  • BlockNeRF (Tancik et al., 2022) proposes a block-combination strategy with refined poses, appearances, and exposure on the MipNeRF (Barron et al., 2021) base model for processing large-scale outdoor scenes. But it needs a special platform to collect the data, and it is hard to utilize existing datasets.

  • Urban-NeRF (Rematas et al., 2022) takes accurate dense LiDAR depth as supervision for the reconstruction of urban scenes. But accurate dense LiDAR is too expensive; S-NeRF only needs noisy, sparse LiDAR signals.

2. Related work

There are many works in the fields of large-scale NeRF and depth-supervised NeRF.

This paper belongs to both of them.

3. Methods: NeRF for Street Views

3.1 CAMERA POSE PROCESSING

The SfM pose estimation used in previous NeRFs fails in street views. Therefore, two different methods are proposed to reconstruct the camera poses: one for the static background and one for the foreground moving vehicles.

  1. Background scenes

    For the static background, we use the camera parameters obtained from the sensor-fusion SLAM and IMU of the self-driving cars (Caesar et al., 2019; Sun et al., 2020) and further reduce the inconsistency between multiple cameras with a learning-based pose-refinement network.

  2. Moving vehicles

    We now transform the coordinate system by setting the target object's center as the origin:

    $$\hat{P}_i=(P_iP_b^{-1})^{-1}=P_bP_i^{-1},\qquad P^{-1}=\begin{bmatrix}R^T&-R^TT\\\mathbf{0}^T&1\end{bmatrix}.$$

    Here $P=\begin{bmatrix}R&T\\\mathbf{0}^T&1\end{bmatrix}$ represents the original pose of the camera or of the target object.

    After the transformation, only the camera moves, which is favorable for training NeRFs. The key point is how to convert the pose matrices.
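The re-centering above can be sketched in NumPy. This is a minimal illustration of the closed-form pose inversion and the product $P_b P_i^{-1}$; the 4×4 homogeneous pose convention and the function names are my assumptions, not code from the paper.

```python
import numpy as np

def invert_pose(P):
    """Invert a 4x4 rigid-body pose [[R, T], [0, 1]] in closed form:
    P^{-1} = [[R^T, -R^T T], [0, 1]]."""
    R, T = P[:3, :3], P[:3, 3]
    P_inv = np.eye(4)
    P_inv[:3, :3] = R.T
    P_inv[:3, 3] = -R.T @ T
    return P_inv

def camera_pose_in_object_frame(P_i, P_b):
    """Re-express camera pose P_i relative to the object pose P_b:
    P_hat_i = (P_i P_b^{-1})^{-1} = P_b P_i^{-1}."""
    return P_b @ invert_pose(P_i)
```

With the object fixed at the origin, the same multi-view NeRF machinery as for static scenes applies to the moving car.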

3.2 REPRESENTATION OF STREET SCENES

  1. Background scenes

    Constrain the whole unbounded scene into a bounded range.

    This parameterization follows Mip-NeRF 360.

  2. Moving Vehicles

    Compute the dense depth maps for the moving cars as an extra supervision

    • We follow GeoSim (Chen et al., 2021b) to reconstruct a coarse mesh from multi-view images and the sparse LiDAR points.

    • After that, a differentiable neural renderer (Liu et al., 2019) is used to render the corresponding depth map with the camera parameter (Section 3.2).

    • The backgrounds are masked during the training by an instance segmentation network (Wang et al., 2020).

    This depth pipeline combines three existing methods.
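The bounded-range parameterization for the background scenes is the standard Mip-NeRF 360 contraction, which keeps points inside the unit ball unchanged and squashes everything else into a ball of radius 2. A minimal sketch (the function name is mine; the paper adapts this idea rather than using it verbatim):

```python
import numpy as np

def contract(x):
    """Mip-NeRF 360 style contraction: x is a 3-D point (or batch, last axis 3).
    Points with ||x|| <= 1 are kept; points outside are mapped to
    (2 - 1/||x||) * x/||x||, so the whole unbounded scene fits in radius 2."""
    n = np.linalg.norm(x, axis=-1, keepdims=True)
    return np.where(n <= 1.0, x, (2.0 - 1.0 / n) * (x / n))
```

For example, a point 10 units away lands at radius 1.9, and even points a kilometer away stay strictly inside radius 2.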

3.3 DEPTH SUPERVISION


To provide credible depth supervision from defective LiDAR depths, we first propagate the sparse depths and then construct a confidence map to address the depth outliers.

  1. LiDAR depth completion

    Use NLSPN (Park et al., 2020) to propagate the depth information from LiDAR points to surrounding pixels.

  2. Reprojection confidence

    Measure the accuracy of the depths and locate the outliers.

    The warping operation can be represented as:

    $$\mathbf{X}_t=\psi(\psi^{-1}(\mathbf{X}_s,P_s),P_t)$$

    And we use three features to measure the similarity:

    $$\mathcal{C}_{\mathrm{rgb}}=1-|\mathcal{I}_s-\hat{\mathcal{I}}_s|,\quad\mathcal{C}_{\mathrm{ssim}}=\mathrm{SSIM}(\mathcal{I}_s,\hat{\mathcal{I}}_s),\quad\mathcal{C}_{\mathrm{vgg}}=1-\|\mathcal{F}_s-\hat{\mathcal{F}}_s\|.$$

  3. Geometry confidence

    Measure the geometry consistency of the depths and flows across different views.

    The depth:

    $$\mathcal{C}_{\mathrm{depth}}=\gamma\left(|d_t-\hat{d}_t|/d_s\right),\qquad\gamma(x)=\begin{cases}0,&\text{if }x\geq\tau,\\1-x/\tau,&\text{otherwise}.\end{cases}$$

    The flow:

    $$\mathcal{C}_{\mathrm{flow}}=\gamma\left(\frac{\|\Delta_{x,y}-f_{s\rightarrow t}(x_s,y_s)\|}{\|\Delta_{x,y}\|}\right),\qquad\Delta_{x,y}=(x_t-x_s,\;y_t-y_s).$$

  4. Learnable confidence combination

    The final confidence map is learned as $\hat{\mathcal{C}}=\sum_i\omega_i\mathcal{C}_i$, where $\sum_i\omega_i=1$ and $i\in\{\mathrm{rgb},\mathrm{ssim},\mathrm{vgg},\mathrm{depth},\mathrm{flow}\}$.
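A toy stand-in for the LiDAR depth-completion step: NLSPN is a learned non-local propagation network, so the sketch below only mimics the idea of spreading sparse LiDAR depths to surrounding pixels, using a naive nearest-neighbor dilation. The function name, iteration count, and border handling (np.roll wraps around) are all illustrative, not NLSPN.

```python
import numpy as np

def naive_depth_completion(sparse_depth, iters=50):
    """NOT NLSPN -- a crude dilation that repeatedly copies each known depth
    (value > 0) into empty (zero) neighbor pixels, just to illustrate
    'propagating sparse LiDAR depths to surrounding pixels'."""
    d = sparse_depth.copy()
    for _ in range(iters):
        if (d > 0).all():
            break
        # Shift the map in 4 directions; fill holes with any known neighbor.
        # np.roll wraps at the borders, which is fine for a toy sketch.
        for axis, step in ((0, 1), (0, -1), (1, 1), (1, -1)):
            shifted = np.roll(d, step, axis=axis)
            fill = (d == 0) & (shifted > 0)
            d[fill] = shifted[fill]
    return d
```

NLSPN instead learns per-pixel non-local affinities and offsets end-to-end, so its completed depths respect object boundaries rather than flooding uniformly.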
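The reprojection confidence can be illustrated with a pinhole version of the warp $\psi$ plus the $\mathcal{C}_{\mathrm{rgb}}$ term. Camera extrinsics and the SSIM/VGG terms are omitted, and the function names are my assumptions:

```python
import numpy as np

def backproject(uv, depth, K):
    """psi^{-1}: lift pixel (u, v) with depth d to a 3-D point (camera frame)."""
    u, v = uv
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    return np.array([(u - cx) * depth / fx, (v - cy) * depth / fy, depth])

def project(p, K):
    """psi: project a 3-D point back to pixel coordinates (camera frame)."""
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    return np.array([fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy])

def rgb_confidence(I_s, I_s_hat):
    """C_rgb = 1 - |I_s - I_s_hat| on [0, 1] images, averaged over channels."""
    return np.clip(1.0 - np.abs(I_s - I_s_hat).mean(axis=-1), 0.0, 1.0)
```

If the LiDAR depth at a pixel is correct, warping to the target view and back reproduces the source color and the confidence stays near 1; a depth outlier lands on the wrong surface and the confidence drops.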
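The geometry confidences and the learnable combination can be sketched as follows. The threshold τ is a placeholder value, and the softmax over raw scores is just one way to keep the weights normalized (the paper learns the ω_i directly under the constraint Σω_i = 1):

```python
import numpy as np

def gamma(x, tau=0.2):
    """gamma(x) = 0 if x >= tau, else 1 - x/tau (tau is a placeholder)."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= tau, 0.0, 1.0 - x / tau)

def depth_confidence(d_t, d_t_hat, d_s, tau=0.2):
    """C_depth = gamma(|d_t - d_t_hat| / d_s): relative reprojection error."""
    return gamma(np.abs(d_t - d_t_hat) / d_s, tau)

def flow_confidence(delta_xy, flow_st, tau=0.2):
    """C_flow = gamma(||Delta - f_{s->t}|| / ||Delta||): agreement between the
    reprojection displacement and an optical-flow estimate."""
    num = np.linalg.norm(np.asarray(delta_xy) - np.asarray(flow_st), axis=-1)
    den = np.linalg.norm(np.asarray(delta_xy), axis=-1)
    return gamma(num / den, tau)

def combine_confidences(conf_maps, logits):
    """C_hat = sum_i w_i * C_i with sum_i w_i = 1. conf_maps is a dict of
    equally shaped maps; logits are raw scores in sorted-key order."""
    w = np.exp(logits - np.max(logits))
    w = w / w.sum()
    stacked = np.stack([conf_maps[k] for k in sorted(conf_maps)])
    return np.tensordot(w, stacked, axes=1)
```

Consistent depths/flows give confidence 1, errors beyond τ give 0, and the combined map then down-weights the depth loss only where the LiDAR supervision is unreliable.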

3.4 Loss function

An RGB loss, a depth loss, and an edge-aware smoothness constraint to penalize large variances in depth:
$$\begin{aligned}&\mathcal{L}_{\mathrm{color}}=\sum_{\mathbf{r}\in\mathcal{R}}\|I(\mathbf{r})-\hat{I}(\mathbf{r})\|_2^2\\&\mathcal{L}_{\mathrm{depth}}=\sum\hat{\mathcal{C}}\cdot|\mathcal{D}-\hat{\mathcal{D}}|\\&\mathcal{L}_{\mathrm{smooth}}=\sum|\partial_x\hat{\mathcal{D}}|\,e^{-|\partial_x I|}+|\partial_y\hat{\mathcal{D}}|\,e^{-|\partial_y I|}\\&\mathcal{L}_{\mathrm{total}}=\mathcal{L}_{\mathrm{color}}+\lambda_1\mathcal{L}_{\mathrm{depth}}+\lambda_2\mathcal{L}_{\mathrm{smooth}}\end{aligned}$$
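Under the definitions above, the total loss could be assembled as in this sketch. The λ values are placeholders (the paper does not fix them here), sums run over a toy dense batch, and finite differences stand in for the partial derivatives:

```python
import numpy as np

def total_loss(I, I_hat, D, D_hat, C_hat, lam1=0.1, lam2=0.01):
    """L_total = L_color + lam1 * L_depth + lam2 * L_smooth.
    I, I_hat: (H, W, 3) images; D, D_hat: (H, W) depths; C_hat: (H, W) confidence."""
    l_color = np.sum((I - I_hat) ** 2)                  # squared RGB error
    l_depth = np.sum(C_hat * np.abs(D - D_hat))         # confidence-weighted L1
    # Edge-aware smoothness: penalize depth gradients except at image edges,
    # where exp(-|dI|) is small and depth discontinuities are allowed.
    gx = np.abs(np.diff(D_hat, axis=1))
    gy = np.abs(np.diff(D_hat, axis=0))
    ix = np.abs(np.diff(I.mean(axis=-1), axis=1))
    iy = np.abs(np.diff(I.mean(axis=-1), axis=0))
    l_smooth = np.sum(gx * np.exp(-ix)) + np.sum(gy * np.exp(-iy))
    return l_color + lam1 * l_depth + lam2 * l_smooth
```

The confidence map Ĉ from Section 3.3 enters only the depth term, so unreliable LiDAR pixels stop contributing to supervision without being discarded outright.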

4. EXPERIMENTS

  1. Dataset: nuScenes and Waymo
  • For the foreground vehicles, we extract car crops from nuScenes and Waymo video sequences.
  • For the large-scale background scenes, we use scenes with 90∼180 images.
  • In each scene, the ego vehicle moves around 10∼40 meters, and the whole scene spans more than 200 m.
  2. Foreground Vehicles

    Separate experiments on static and moving vehicles, compared with the original NeRF and with GeoSim (the latest non-NeRF car-reconstruction method).

    There is still large room for improvement in PSNR.

  3. Background scenes

    Compared with the state-of-the-art methods Mip-NeRF (Barron et al., 2021), Urban-NeRF (Rematas et al., 2022), and Mip-NeRF 360.

    A 360-degree panorama is also shown to emphasize some details.

  4. Background and Foreground Fusion

    Depth-guided placement and inpainting (e.g. GeoSim, Chen et al. (2021b)) and joint NeRF rendering (e.g. GIRAFFE, Niemeyer & Geiger (2021)) rely heavily on accurate depth maps and 3-D geometry information.

    (A new fusion method seems to be used here without a detailed introduction.)

  5. Ablation study

    The ablation splits the method into the RGB loss, the depth-confidence loss, and the smoothness loss.

5. Conclusion

In the future, we will use block merging as proposed in Block-NeRF (Tancik et al., 2022) to learn a larger, city-level neural representation.

Reference

[1] S-NeRF project page (ziyang-xie.github.io)
