PyTorch Tensors

A tensor is a multi-dimensional matrix containing elements of a single data type.

Table of contents
  1. Comparison with NumPy
  2. torch.Tensor
    1. Tensor Attributes
      1. Shape
      2. Data Type
      3. Device
      4. Requires Gradient
    2. Data Types
  3. Initializing Tensors with Other Data
    1. Copy Creation (torch.tensor)
    2. Avoid Copy with NumPy
    3. Avoid Copy with Existing Data
    4. Detach from Autograd
  4. Initializing Tensors with Shape
    1. Shape from Existing Tensor
  5. Moving Tensors to Devices
    1. Upon Creation
    2. Move Later
  6. NumPy-Like Indexing
  7. Single-Element Tensors
  8. Concatenating Tensors
    1. Concatenate Along Existing Dimension
    2. Concatenate Along New Dimension
  9. Two Ways to Matrix Multiply Tensors
  10. Two Ways to Element-Wise Multiply Tensors
  11. Convert Single-Element Tensor to Python Scalar
  12. In-Place Operations
  13. Tensor to NumPy Array
  14. Require Gradient

Comparison with NumPy

  • Similar to ndarray in NumPy
  • Can be run on GPU/TPU
  • Optimized for automatic differentiation
  • Shares memory with NumPy arrays when they are on the CPU

torch.Tensor

import torch

Tensor Attributes

Given a tensor created with tensor = torch.rand(3, 4):

Shape

tensor.shape returns a torch.Size:

torch.Size([3, 4])

Data Type

tensor.dtype returns the data type, e.g. torch.float32.

Device

tensor.device returns the device the tensor is stored on, e.g. cpu, cuda, or mps.

Requires Gradient

tensor.requires_grad returns True if operations are being tracked for autograd.

Data Types

Refer to the list of available data types in the PyTorch documentation.

The recommended way to create a tensor is to specify a dtype with a factory function. For example:

torch.zeros([2, 3], dtype=torch.float32)

The default data type is torch.float32.


Initializing Tensors with Other Data

Copy Creation (torch.tensor)

torch.tensor always copies data.

You can copy data from Python lists, NumPy arrays, or other tensors.

import numpy as np

data = [[1, 2], [3, 4]]
t = torch.tensor(data)    # from a Python list
np_arr = np.array(data)
torch.tensor(np_arr)      # from a NumPy array (copies)
torch.tensor(t)           # from another tensor (copies)

Avoid Copy with NumPy

You can have a tensor share memory with a NumPy array:

torch.from_numpy(np_arr)

Changes made to the tensor will also affect the NumPy array, since they share the same memory.

You cannot resize this tensor.
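
A rough sketch of this shared-memory behavior (the values here are just illustrative):

np_arr = np.array([1, 2, 3])
t = torch.from_numpy(np_arr)
t[0] = 99      # modify the tensor in place
print(np_arr)  # [99  2  3] -- the NumPy array sees the change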

Avoid Copy with Existing Data

torch.as_tensor(data, dtype=..., device=...)

torch.as_tensor tries to avoid copying data and tries to preserve autograd history.

If dtype and device are not changed, it will return the same tensor. However, if they are changed, it will have to copy the data as if doing: data.to(device, dtype).

If data is a NumPy array, it will perform the same operation as torch.from_numpy(data) above.
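
A minimal sketch of the copy/no-copy behavior (variable names are illustrative):

t = torch.ones(3)
same = torch.as_tensor(t)                       # no copy; returns t itself
as_int = torch.as_tensor(t, dtype=torch.int64)  # dtype change forces a copy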

Detach from Autograd

You can avoid copying data while detaching from autograd:

tensor.detach()

This returns a new tensor that does not require gradients. However, it shares the same data as the original tensor so any modifications will affect the original tensor.
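
A short sketch of this shared-data behavior (illustrative values):

x = torch.ones(3, requires_grad=True)
y = x.detach()
print(y.requires_grad)  # False
y.add_(1)               # in-place change to the shared data
print(x)                # tensor([2., 2., 2.], requires_grad=True)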


Initializing Tensors with Shape

Given the following shape:

shape = (2, 3,)
  • torch.rand(shape)

    tensor([[0.6324, 0.2434, 0.1177],
            [0.6004, 0.8779, 0.2302]])
    
  • torch.ones(shape)

    tensor([[1., 1., 1.],
            [1., 1., 1.]])
    
  • torch.zeros(shape)

    tensor([[0., 0., 0.],
            [0., 0., 0.]])
    

Shape from Existing Tensor

x_data = torch.tensor(data)  # ones_like/rand_like expect a tensor, not a Python list
ones = torch.ones_like(x_data)
rand = torch.rand_like(x_data, dtype=torch.float)
tensor([[1, 1],
        [1, 1]])
tensor([[0.6234, 0.9853],
        [0.4259, 0.0843]])

torch.ones_like and torch.rand_like inherit the properties of the input tensor (e.g. dtype, device).

ones_like creates the same shape and fills it with ones. rand_like does the same but fills it with random numbers.

dtype is specified to override the data type of the input tensor; rand_like requires a floating-point dtype, while the input here holds integers.


Moving Tensors to Devices

By default, tensors are created on the CPU, but we can use other devices too:

  • cuda
  • cpu
  • mps

You can move tensors to your specified device in the following ways:

Upon Creation

tensor = torch.ones(3, 4, device="cuda")
# or
tensor = torch.ones(3, 4, device=torch.device("cuda"))

Move Later

tensor.to("cuda")
# or
tensor.to(torch.device("cuda"))
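
In practice, you may want to pick the device based on availability; a minimal sketch:

device = "cuda" if torch.cuda.is_available() else "cpu"
tensor = torch.ones(3, 4).to(device)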

NumPy-Like Indexing

Given some tensor, you can access and modify its elements using NumPy-like indexing:

  • First row: tensor[0]
  • First column: tensor[:, 0]
  • Last column: tensor[..., -1]
  • Fill second column with zeros: tensor[:, 1] = 0
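
For instance, a small sketch of these operations on a 4x4 tensor of ones:

tensor = torch.ones(4, 4)
tensor[:, 1] = 0         # zero out the second column
print(tensor[0])         # first row:   tensor([1., 0., 1., 1.])
print(tensor[..., -1])   # last column: tensor([1., 1., 1., 1.])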

Single-Element Tensors

Single-element tensors, regardless of their shape, can be converted to Python scalars using .item():

torch.tensor([[1]]).item()  # 1
torch.tensor(1).item()  # 1

Concatenating Tensors

Concatenate Along Existing Dimension

To concatenate a sequence of tensors along an existing dimension:

torch.cat([tensor, tensor, tensor], dim=1)
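
For example, concatenating three copies along dim=1 grows that dimension (shapes are illustrative):

t = torch.ones(2, 3)
torch.cat([t, t, t], dim=1).shape   # torch.Size([2, 9])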

Concatenate Along New Dimension

To concatenate tensors along a new dimension:

torch.stack([tensor, tensor, tensor], dim=0)

Inserts an additional dimension at the specified index.
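
By contrast with torch.cat above, stacking inserts a new leading dimension here (illustrative shapes):

t = torch.ones(2, 3)
torch.stack([t, t, t], dim=0).shape  # torch.Size([3, 2, 3])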


Two Ways to Matrix Multiply Tensors

  • tensor @ tensor.T
  • tensor.matmul(tensor.T)
    • You can also specify the output tensor

      y = torch.rand_like(tensor @ tensor.T)  # out tensor must match the result's shape
      torch.matmul(tensor, tensor.T, out=y)
      

Two Ways to Element-Wise Multiply Tensors

  • tensor * tensor
  • tensor.mul(tensor)
    • You can specify the output tensor in the same way as above (see the sketch below)
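
A brief sketch of the out= variant (the variable name z is illustrative):

z = torch.rand_like(tensor)
torch.mul(tensor, tensor, out=z)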

Convert Single-Element Tensor to Python Scalar

For example, if you obtained a single-element tensor via some operation:

agg = tensor.sum()

You can convert it to a Python scalar using .item():

agg_item = agg.item()

In-Place Operations

In-place operations are suffixed with an underscore _.

For example:

  • x.copy_(y): copies y to x
  • x.t_(): transposes x
  • x.add_(5): adds 5 to x element-wise
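
A tiny sketch of an in-place add (values illustrative):

x = torch.ones(2, 2)
x.add_(5)   # x now holds all 6s; no new tensor is allocated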

Tensor to NumPy Array

You can convert a PyTorch tensor to a NumPy array and vice versa:

t = torch.ones(5)
n = t.numpy()            # tensor -> NumPy array (shares memory on CPU)

n = np.ones(5)
t = torch.from_numpy(n)  # NumPy array -> tensor (shares memory)

In-place changes to either will affect the other.
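
A quick sketch of that coupling, continuing from the snippet above (illustrative values):

t.add_(1)
print(n)  # [2. 2. 2. 2. 2.] -- the NumPy array reflects the in-place change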


Require Gradient

To record operations on a tensor for autograd, we can either

  • Set requires_grad=True upon creation

    bias = torch.zeros(3, requires_grad=True)
    
  • Use x.requires_grad_() later

    x.requires_grad_()  # the requires_grad argument defaults to True
    
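A minimal end-to-end sketch of gradient tracking (illustrative values):

x = torch.ones(3, requires_grad=True)
y = (x * 2).sum()
y.backward()
print(x.grad)  # tensor([2., 2., 2.])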
