Indexing, Selection & Advanced Operations

XIII. Indexing, Selection & Advanced Operations (索引、选择与高级操作)
1. torch.nonzero() / torch.argwhere()
Returns the coordinates of all non-zero (or True) elements. Used for Sparse Operations (稀疏操作).
```python
import torch

x = torch.tensor([[0, 1, 0], [2, 0, 3]])
idx = torch.nonzero(x)        # tensor([[0, 1], [1, 0], [1, 2]])
idx2 = torch.argwhere(x > 0)  # PyTorch 1.11+
```
2. torch.index_select()
Selects elements along a dimension by index tensor. Similar to NumPy fancy indexing.
```python
x = torch.rand(5, 4)
idx = torch.tensor([0, 2, 4])
out = torch.index_select(x, dim=0, index=idx)  # shape [3, 4]
```
Note: the index must be a 1D LongTensor; for this use case it is more efficient than boolean masking.
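To see what the efficiency note refers to, here is a small sketch (not from the original post) comparing index_select with the boolean-mask equivalent; the tensors are made up for illustration:

```python
import torch

x = torch.rand(5, 4)
idx = torch.tensor([0, 2, 4])

# direct row selection by index
out = torch.index_select(x, dim=0, index=idx)

# equivalent result via a boolean mask (builds a full mask first, then filters)
mask = torch.zeros(5, dtype=torch.bool)
mask[idx] = True
out_masked = x[mask]  # same rows, same order
```

Both paths yield identical rows here; index_select skips materializing the intermediate mask.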
3. torch.masked_select()
Selects elements by boolean mask. Returns a flattened 1D Tensor.
```python
x = torch.randn(3, 3)
mask = x > 0
pos_vals = torch.masked_select(x, mask)  # all positive values, 1D
```
Note: the result is always 1D. Use masked_fill or scatter_ to reconstruct the original shape.
4. torch.sort() / torch.argsort()
Sorts a Tensor along a dimension, returning sorted values and original indices.
```python
x = torch.tensor([3., 1., 4., 1., 5., 9.])
vals, idx = torch.sort(x, descending=True)  # vals: tensor([9., 5., 4., 3., 1., 1.])
order = torch.argsort(x)
```
Note: a key step in NMS (Non-Maximum Suppression, 非极大抑制) is sorting boxes by confidence in descending order.
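The NMS note can be sketched as follows; the boxes and scores are invented for illustration, and only the sorting step of NMS is shown:

```python
import torch

# hypothetical detections: scores and matching [x1, y1, x2, y2] boxes
scores = torch.tensor([0.3, 0.9, 0.6])
boxes = torch.tensor([[0, 0, 10, 10], [1, 1, 11, 11], [5, 5, 15, 15]])

order = torch.argsort(scores, descending=True)  # highest confidence first
boxes_sorted = boxes[order]
scores_sorted = scores[order]
```

From here, NMS would iterate over boxes_sorted and suppress lower-scoring boxes with high overlap.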
5. torch.cumsum() / torch.cumprod()
Cumulative sum (累积和) or cumulative product (累积积) along a dimension.
```python
x = torch.tensor([1., 2., 3., 4.])
print(torch.cumsum(x, dim=0))   # tensor([1., 3., 6., 10.])
print(torch.cumprod(x, dim=0))  # tensor([1., 2., 6., 24.])
```
Note: cumsum provides an efficient way to generate causal attention masks (lower triangular).
6. torch.flip()
Flips a Tensor along specified dimensions — mirror flip augmentation or reverse operation.
```python
x = torch.tensor([[1, 2, 3], [4, 5, 6]])
h = torch.flip(x, dims=[1])  # horizontal: [[3, 2, 1], [6, 5, 4]]
v = torch.flip(x, dims=[0])  # vertical: [[4, 5, 6], [1, 2, 3]]
```
7. torch.bucketize()
Assigns continuous values to discrete buckets (离散化) by given boundaries. Analogous to NumPy's digitize.
```python
boundaries = torch.tensor([0.0, 0.5, 1.0])
x = torch.tensor([-0.1, 0.3, 0.7, 1.5])
bins = torch.bucketize(x, boundaries)  # tensor([0, 1, 2, 3])
```
Note: useful for Feature Engineering (特征工程), e.g. binning continuous features and custom quantile normalization.
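As a sketch of the quantile-binning idea mentioned in the note (the feature tensor and the quartile choice are assumptions for illustration):

```python
import torch

torch.manual_seed(0)
feature = torch.randn(1000)  # a made-up continuous feature

# boundaries at the quartiles, a common binning choice
q = torch.tensor([0.25, 0.50, 0.75])
boundaries = torch.quantile(feature, q)

bins = torch.bucketize(feature, boundaries)  # bucket ids in {0, 1, 2, 3}
```

By construction each of the four buckets receives roughly a quarter of the samples.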
💡 One-line Takeaway
gather picks values by index; scatter_ puts values by index; masked_fill overwrites by condition.
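The takeaway can be illustrated with a minimal sketch (tensors invented for illustration):

```python
import torch

x = torch.tensor([[10., 20.], [30., 40.]])

# gather: pick values by index along a dimension
idx = torch.tensor([[1], [0]])
picked = torch.gather(x, dim=1, index=idx)  # tensor([[20.], [30.]])

# scatter_: put values by index along a dimension
out = torch.zeros(2, 2)
out.scatter_(1, idx, picked)                # tensor([[0., 20.], [30., 0.]])

# masked_fill: overwrite by condition
filled = x.masked_fill(x > 25, 0.0)         # tensor([[10., 20.], [0., 0.]])
```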