# Chapter 3 - It Starts with a Tensor

Dec 30, 2022
• "floating point numbers are the way a network deals with information"
• "In the context of deep learning, tensors refer to the generalization of vectors and matrices to an arbitrary number of dimensions,"
• unlike numpy arrays, torch tensors have the ability to run on GPUs, distribute operations to multiple computers, and track the graph of computations that created them
• notes on using tensors in this ipython notebook
• squeezing and unsqueezing
• named tensors
• which will not be used in the book, due to their experimental nature
• Tensor element types
• the tensor documentation lists the available types (a short dtype sketch follows this list)
• float32 default
• float16 can be useful, as it is a default data type on modern GPUs and often the extra precision of 32 bits does not buy you useful training results
• tensors used as indexes into other tensors are expected to be int64
• predicates on tensors produce bool result tensors
• notebook with examples
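
• A quick sketch of my own (not from the book's notebook) illustrating the dtype, indexing, predicate, and squeeze/unsqueeze points above:

```python
import torch

# dtype can be given at creation time; float32 is the default
points = torch.ones(3, 2, dtype=torch.float32)
half = points.to(torch.float16)              # cast down, e.g. for GPU-friendly storage
print(points.dtype, half.dtype)              # torch.float32 torch.float16

# tensors used as indexes into other tensors are expected to be int64
idx = torch.tensor([0, 2], dtype=torch.int64)
print(points[idx].shape)                     # torch.Size([2, 2])

# predicates produce bool result tensors
mask = points > 0.5
print(mask.dtype)                            # torch.bool

# squeeze/unsqueeze remove or add size-1 dimensions
print(points.unsqueeze(0).shape)             # torch.Size([1, 3, 2])
print(points.unsqueeze(0).squeeze().shape)   # torch.Size([3, 2])
```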

## Types of operations

• The torch docs are thorough, but here's an overview of the types of operations, with a short sketch after the list:
• Creation operations
• ones, zeros, rand, from_numpy, range, linspace, full (fill with a given value)
• Pointwise operations
• return a new tensor by applying a function to each element independently
• abs, cos, add, bitwise ops, pow
• Reduction operations
• aggregate values by iterating through tensors
• mean, std, norm, all, quantile
• Comparison operations
• functions for evaluating numerical predicates over tensors
• equal, max
• Spectral operations
• functions for transforming into and operating in the frequency domain
• All the stuff from DSP I don't understand!
• hamming_window, stft (short-time Fourier transform)
• Other operations
• clone, cross, flatten, histogram, renorm, trace (sum of elements of diagonal)
• BLAS and LAPACK operations
• fast matrix math
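
• A minimal sketch of mine (not from the book) with one call from several of these categories:

```python
import torch

# creation: 6 evenly spaced values, reshaped to 2x3
t = torch.linspace(0, 1, steps=6).reshape(2, 3)

# pointwise: function applied to each element independently
print(torch.cos(t))

# reduction: aggregate over all elements, or along one dimension
print(t.mean(), t.mean(dim=0))

# comparison: numerical predicates over tensors
print(torch.eq(t, t).all())        # tensor(True)

# BLAS-backed matrix math
print(t @ t.T)                     # (2x3) @ (3x2) -> 2x2
```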

## Storage

• values in tensors are allocated in contiguous chunks of memory managed by torch.Storage
• storage is 1d, and a tensor is a view of storage capable of indexing into it with an offset and strides
• as such, multiple tensors can index the same storage
• tensors are defined by three pieces of metadata
• size is a tuple indicating how many elements across each dimension the tensor represents
• offset is the index in the storage corresponding to the first element in the tensor
• stride is the number of elements in the storage that need to be skipped over to obtain the next element along each dimension
• in a 2d tensor, accessing element (i, j) is: offset + (stride[0] * i) + (stride[1] * j) (walked through in the sketch after the example below)
• operations that don't reallocate storage are cheap
• transposition or extracting a subtensor, for example
• You can use clone() to allocate a tensor with new storage
• There is a shorthand for transpose: t()
• You can transpose in multiple dimensions by specifying which dimensions should be transposed
• A tensor whose values are laid out in storage starting from the rightmost dimension onward (that is, moving along rows for a 2d tensor) is called contiguous
• What is the "rightmost dimension"? I don't understand this at all
• contiguous tensors are more efficient to iterate over, because visiting elements in order means visiting them in memory order
• improved cache locality
• Some functions only work on contiguous tensors
• is_contiguous will tell us if our tensor is or is not
• a transpose of a contiguous tensor will not be contiguous
• we can use the contiguous method to obtain a contiguous tensor from a non-contiguous one
• It will reallocate
• The final example in the section helps me understand a bit better what contiguous means, so I'll copy it in here from the notebook referenced above

```python
In [23]: p = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
    ...: p_t = p.t()
    ...: p_t, p_t.storage(), p_t.stride(), p_t.is_contiguous()
Out[23]:
(tensor([[4., 5., 2.],
         [1., 3., 1.]]),
  4.0
  1.0
  5.0
  3.0
  2.0
  1.0
 [torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6],
 (1, 2),
 False)

In [24]: p_t_c = p_t.contiguous()
    ...: p_t_c, p_t_c.storage(), p_t_c.stride()
Out[24]:
(tensor([[4., 5., 2.],
         [1., 3., 1.]]),
  4.0
  5.0
  2.0
  1.0
  3.0
  1.0
 [torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 6],
 (3, 1))
```
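
• To connect the stride arithmetic above to this example, here's a small sketch of my own reusing the same values:

```python
import torch

p = torch.tensor([[4.0, 1.0], [5.0, 3.0], [2.0, 1.0]])
p_t = p.t()

# element (i, j) of the view lives at storage index offset + stride[0]*i + stride[1]*j
i, j = 1, 2
flat = p_t.storage_offset() + p_t.stride()[0] * i + p_t.stride()[1] * j
print(p_t[i, j].item(), p_t.storage()[flat])   # 1.0 1.0 -- same element, no copy

# the transpose is just another view of the same storage; contiguous() allocates a new one
print(p.storage().data_ptr() == p_t.storage().data_ptr())                # True
print(p.storage().data_ptr() == p_t.contiguous().storage().data_ptr())  # False
```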

## Moving tensors to the GPU

• You can specify the device as a keyword argument when constructing the tensor
• devices
• cuda is the device for NVIDIA GPUs
• mps is the device for M1 Macs (Metal Performance Shaders)
• this issue tracks coverage for torch operations on the MPS device
• the gaps are still significant
• cpu is the device for the CPU
• To copy a CPU tensor to the GPU, use to with the device kwarg (see the sketch after this list)
• check out the notebook
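
• A small sketch of mine (not from the book) for creating a tensor on a device and moving one there with to(); the availability checks are just how I'd pick a device:

```python
import torch

# pick an accelerator if one is available, otherwise fall back to the CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# the device can be given at creation time...
points_gpu = torch.ones(3, 2, device=device)

# ...or an existing CPU tensor can be copied over with to()
points = torch.ones(3, 2)
points_gpu = points.to(device)
print(points_gpu.device)

# bring results back before handing them to CPU-only code (e.g. numpy())
points_cpu = points_gpu.to("cpu")
```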

## Numpy compatibility

• call numpy() on a tensor to get a numpy array out
• will share storage with the tensor
• call torch.from_numpy(array) to get a tensor out
• will also share storage
• note that numpy defaults to float64, while float32 or float16 are best for us, so you might want to change the dtype (see the sketch below)
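
• A small sketch of my own showing the shared storage and the dtype cast:

```python
import torch
import numpy as np

points = torch.ones(3, 2)
points_np = points.numpy()          # numpy array sharing storage with the tensor
points_np[0, 0] = 99.0
print(points[0, 0])                 # tensor(99.) -- the change shows up on both sides

arr = np.ones((2, 2))               # numpy defaults to float64
back = torch.from_numpy(arr)        # shares storage too, keeps float64
print(back.dtype)                   # torch.float64
print(back.to(torch.float32).dtype) # torch.float32, better suited for training
```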

## Serializing tensors

• calling torch.save(tensor, filename) will save the tensor to a pickle at filename
• can also pass a file handle as the second arg
• torch.load(filename | file_handle) will return the tensor
• Both of those methods save it in a pickle format, which is not great for interop
• we can use h5py to save tensors as HDF5 (sketch below)
• book doesn't mention parquet, but torch seems to have libraries for it
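
• A rough sketch of mine; the file names and the "coords" dataset key are made up for illustration:

```python
import torch
import h5py

points = torch.ones(3, 2)

# torch's native, pickle-based format
torch.save(points, "points.t")
points_again = torch.load("points.t")

# HDF5 via h5py for interop with other tools
with h5py.File("points.hdf5", "w") as f:
    f.create_dataset("coords", data=points.numpy())

with h5py.File("points.hdf5", "r") as f:
    points_from_hdf5 = torch.from_numpy(f["coords"][:])
```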

notebook
