Common PyTorch Tensor Functions with Unexpected Results

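1. Introduction

This article covers exceptions to commonly used PyTorch tensor functions that deep learning practitioners should be aware of. If these functions have ever surprised you, it may help you spot some common mistakes when building deep learning models with PyTorch.

Below are 5 of the most commonly used PyTorch functions, each with a small example and one possible case in which it does not work as expected. So hang in there. Let's get started!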

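2. torch.Tensor.numpy()

Intuitively, this function converts a tensor to a NumPy ndarray. [see reference]

If a tensor is defined with requires_grad set to True and you then try to convert it to a NumPy array, PyTorch raises a runtime error, as shown in Figure 1.

Figure 1: Error when converting a leaf tensor to a NumPy array

This is because PyTorch can convert a tensor to a NumPy array only if the tensor is not part of any dynamic computation in PyTorch. The tensor must first be separated from the dynamic graph with detach() and can then be converted to a NumPy array.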
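As a rough illustration of the behaviour described above, the following sketch reproduces the error and then applies the detach() workaround. The tensor values and the variable names (t, arr, numpy_failed) are made up for this example.

```python
import torch

# A leaf tensor that tracks gradients (values chosen arbitrarily).
t = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

try:
    t.numpy()  # fails because t requires grad
    numpy_failed = False
except RuntimeError as err:
    numpy_failed = True
    print("RuntimeError:", err)

# detach() returns a tensor that is cut off from the computation graph,
# so the conversion to a NumPy array is allowed.
arr = t.detach().numpy()
print(arr, type(arr))
```

Note that the detached tensor shares memory with the original, so in-place changes to arr would also change t.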

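3. torch.Tensor.new_tensor()

This method returns a new tensor with data as the tensor data. [see reference]

When .new_tensor() is used to copy an existing tensor, with requires_grad set to True, only the tensor data is copied; the gradient associated with the source tensor is not.

Consider the following example: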

import torch

# Create tensors.
x = torch.tensor(1., requires_grad=True )
w = torch.tensor(2., requires_grad=True)
b = torch.tensor(3., requires_grad=True)
print("x:",x)
print("w:",w)
print("b:",b)

y = w*x + b 
print(y)

y.backward() #calculates the gradient of output w.r.t inputs

#gradients of outputs w.r.t inputs
print('dy/dx :',x.grad)  
print('dy/dw :',w.grad)
print('dy/db :',b.grad)

a = x.new_tensor(x,requires_grad=True) #creating a copy of a tensor x

print(a.grad)  #no grad of tensor x is copied to tensor a

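Figure 2: Creating a new tensor without copying the gradient

Printing the gradient of tensor a prints None. This is because PyTorch stores gradients only for leaf tensors that took part in the dynamic computation graph (for example, the tensors x, w and b used to compute y in Figure 2). The new_tensor method copies only the tensor data into the new tensor (here, tensor a).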
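The sketch below repeats the idea with the clone().detach() form, which the PyTorch documentation describes as equivalent to new_tensor() and recommends for copying a tensor. It also shows that the copy only gets a gradient once it is used in its own backward pass. The names grad_before and z, and the factor 5, are made up for illustration.

```python
import torch

x = torch.tensor(1.0, requires_grad=True)
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)

y = w * x + b
y.backward()                 # fills .grad on the leaves x, w and b

# Copy of x that tracks gradients; documented equivalent of
# x.new_tensor(x, requires_grad=True).
a = x.clone().detach().requires_grad_(True)

grad_before = a.grad         # the copy carries data only, no gradient
print(x.grad, grad_before, a.is_leaf)

# Once a takes part in its own computation, it receives its own gradient.
z = 5 * a
z.backward()
print(a.grad)
```

If the gradient values themselves are needed elsewhere, they have to be read from x.grad explicitly; copying the tensor does not carry them over.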

4. torch.Tensor.view()

This method returns a new tensor with the same data but a different shape. [see reference]

t1 = torch.randn(2,3)
print(t1)
print("is t1 contiguous??",t1.is_contiguous())
t1_ = t1.transpose(0,1)
print(t1_)
print("is t1_ contiguous??",t1_.is_contiguous())
print(t1_.view(-1))

tensor([[-1.5140,  0.7263, -0.7764],
        [ 0.6795,  0.7041,  1.1364]])
is t1 contiguous?? True
tensor([[-1.5140,  0.6795],
        [ 0.7263,  0.7041],
        [-0.7764,  1.1364]])
is t1_ contiguous?? False
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-45-ac8f2ddf9c04> in <module>
      5 print(t1_)
      6 print("is t1_ contiguous??",t1_.is_contiguous())
----> 7 print(t1_.view(-1))

RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.

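This method can fail on non-contiguous tensors: the requested shape must be compatible with the tensor's size and stride. More information on tensor contiguity can be found here.

Figure 3: .view() error on a non-contiguous tensor

As Figure 3 shows, .view() cannot reshape the non-contiguous tensor t1_. As the error message suggests, .reshape() can be used here to achieve the same result.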
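A minimal sketch of the two usual workarounds, using a small tensor with known values instead of random ones so the result is easy to follow. The variable names (t, t_t, flat_a, flat_b, view_failed) are chosen for this example only.

```python
import torch

t = torch.arange(6.0).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
t_t = t.transpose(0, 1)               # shape (3, 2), non-contiguous

try:
    t_t.view(-1)                      # same failure as in Figure 3
    view_failed = False
except RuntimeError:
    view_failed = True

# Workaround 1: reshape() returns a view when possible and copies otherwise.
flat_a = t_t.reshape(-1)

# Workaround 2: make the memory contiguous explicitly, then view.
flat_b = t_t.contiguous().view(-1)

print(view_failed, flat_a, flat_b)
```

Both workarounds may copy the data, so the flattened result is not guaranteed to share memory with the original tensor.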

5. torch.Tensor.is_leaf

According to the PyTorch documentation (v1.5.0), all tensors with requires_grad set to False are considered leaf tensors. [see reference]

t2 = torch.randn(2,2,requires_grad=True)+2
print(t2)

#As shown above, the tensor t2 is a random tensor with requires_grad set to True 
#but is a result of an operation of adding 2 also. 
#So, in this case it will not be a Leaf Tensor. Let's check!!

print(t2.is_leaf) #t2 is not a leaf Tensor as it is created by an operation with requires_grad = True

A tensor with requires_grad set to True is considered a leaf tensor only if it was created by the user and is not the result of an operation.

tensor([[4.4777, 2.4102],
        [2.5153, 1.6102]], grad_fn=<AddBackward0>)
False

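Figure 4: An attempt to create a leaf tensor that fails

As Figure 4 shows, t2 is not a leaf tensor because it was created by adding 2 to a newly defined tensor, so it is the result of an operation and does not meet the definition of a leaf tensor.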
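The rules above can be checked side by side. In this sketch the names a, b, c and d are illustrative; d shows one possible way to keep the shifted tensor a leaf, by doing the arithmetic first and enabling gradients afterwards.

```python
import torch

a = torch.randn(2, 2, requires_grad=True)     # created directly by the user
b = a + 2                                     # result of an operation on a
c = torch.randn(2, 2) + 2                     # an operation, but requires_grad is False
d = (torch.randn(2, 2) + 2).requires_grad_()  # arithmetic first, gradients enabled after

print(a.is_leaf, b.is_leaf, c.is_leaf, d.is_leaf)
print(b.grad_fn, d.grad_fn)                   # only b records the operation that made it
```

Only leaf tensors have their .grad populated by backward() by default, which is why the distinction matters when defining trainable parameters.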

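6. torch.Tensor.backward()

This function computes the gradient of the current tensor with respect to the graph leaves (leaf tensors). [see reference]

Called without arguments, this method works only on a tensor that holds a single element (a scalar).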

t1 = torch.randn(2,2,requires_grad=True)
t2 = torch.randn(2,2,requires_grad=True)

print(t1,t2)

#Above are 2 randomly defined tensors of dimensions 2x2 with requires_grad = True. 
#Let's involve these variables in a linear computation and calculate the gradient 
#of output y w.r.t these variables. 

y = t1*t2
print(y)
print(y.shape)

y.backward()

tensor([[ 1.0389, -1.0465],
        [-1.0421,  0.7246]], requires_grad=True) tensor([[0.1542, 1.4928],
        [0.0989, 0.2785]], requires_grad=True)
tensor([[ 0.1602, -1.5622],
        [-0.1031,  0.2018]], grad_fn=<MulBackward0>)
torch.Size([2, 2])

Figure 5: Gradient computation error for a non-scalar tensor

RuntimeError                              Traceback (most recent call last)
<ipython-input-58-3a823b03f50a> in <module>
     12 print(y.shape)
     13 
---> 14 y.backward()

~/anaconda3/envs/snowflakes/lib/python3.7/site-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    193                 products. Defaults to ``False``.
    194         """
--> 195         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    196 
    197     def register_hook(self, hook):

~/anaconda3/envs/snowflakes/lib/python3.7/site-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     91         grad_tensors = list(grad_tensors)
     92 
---> 93     grad_tensors = _make_grads(tensors, grad_tensors)
     94     if retain_graph is None:
     95         retain_graph = create_graph

~/anaconda3/envs/snowflakes/lib/python3.7/site-packages/torch/autograd/__init__.py in _make_grads(outputs, grads)
     32             if out.requires_grad:
     33                 if out.numel() != 1:
---> 34                     raise RuntimeError("grad can be implicitly created only for scalar outputs")
     35                 new_grads.append(torch.ones_like(out, memory_format=torch.preserve_format))
     36             else:

RuntimeError: grad can be implicitly created only for scalar outputs

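As Figure 5 shows, y has shape 2x2, so calling y.backward() without arguments cannot compute the gradient of y with respect to t1 and t2, and a runtime error is raised. To compute gradients, depending on the context and requirements, we can either sum all the elements of y and compute the gradient with respect to that sum, or supply a gradient for each element of y. For further discussion on the PyTorch forums, see here.

See the full code here: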
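Both options can be sketched as follows. This is only an illustration: the all-ones weighting in the second option is an assumption that makes it match the sum, and the names grad_from_sum and grad_from_arg are made up here.

```python
import torch

t1 = torch.randn(2, 2, requires_grad=True)
t2 = torch.randn(2, 2, requires_grad=True)

# Option 1: reduce y to a scalar, then call backward() on the scalar.
y = t1 * t2
y.sum().backward()
grad_from_sum = t1.grad.clone()

# Reset the accumulated gradients before trying the second option.
t1.grad = None
t2.grad = None

# Option 2: keep y as a 2x2 tensor and pass one weight per element of y
# through the gradient argument (all ones here, which matches the sum).
y = t1 * t2
y.backward(gradient=torch.ones_like(y))
grad_from_arg = t1.grad.clone()

print(grad_from_sum)
print(grad_from_arg)
```

Since y is an elementwise product, the gradient with respect to t1 is simply t2 in both cases. Other weightings passed through gradient would scale each element's contribution accordingly.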

Pytorch Tensor Functions - Notebook by Souptik (souptikmajumder) | Jovian

GitHub - souptikmajumder/torch.tensor-function-exceptions-pytorch: 5 exceptions to the most regularly used pytorch.tensor functions

The official PyTorch documentation for torch.Tensor can be found here.