NumPy Set Operations

VIII. NumPy Set Operations
NumPy provides set-like operations on 1-D arrays. These treat the array as a set of values and support finding unique elements, intersections, unions, and differences. The functions that return values give back sorted arrays of unique elements; the membership test returns a boolean mask instead.
1. unique(): Find Unique Values
Core idea: Return sorted unique elements. Optionally return counts or indices.
import numpy as np
a = np.array([3, 1, 2, 1, 3, 3, 2])
np.unique(a) # [1 2 3]
# Also return counts (how many times each value occurs)
vals, counts = np.unique(a, return_counts=True)
# vals: [1 2 3]
# counts: [2 2 3]
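As a small sketch of what the counts make possible, the snippet below picks the most frequent value from the same array. The names `most_common` and `frequency` are illustrative and not part of NumPy.

```python
import numpy as np

a = np.array([3, 1, 2, 1, 3, 3, 2])

# Unique values and how often each one occurs
vals, counts = np.unique(a, return_counts=True)

# argmax returns the position of the largest count (the first one on ties)
most_common = vals[np.argmax(counts)]
frequency = counts.max()

print(most_common, frequency)  # 3 3
```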
# Also return first-occurrence indices
vals, idx = np.unique(a, return_index=True)
# idx: [1 2 0] (positions in original array)
2. intersect1d(): Intersection
Core idea: Elements that appear in both arrays.
a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4, 5, 6, 7])
np.intersect1d(a, b) # [3 4 5]
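intersect1d takes two arrays at a time. One way to intersect more than two, sketched here, is to fold the function over the inputs with functools.reduce. The third array `c` is made up for illustration.

```python
from functools import reduce
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4, 5, 6, 7])
c = np.array([5, 4, 9])  # hypothetical third array, unsorted on purpose

# Fold intersect1d pairwise: first a with b, then that result with c
common_all = reduce(np.intersect1d, (a, b, c))

print(common_all)  # [4 5]
```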
# Also return indices in each array
common, ia, ib = np.intersect1d(a, b, return_indices=True)
# ia: [2 3 4] (positions in a)
# ib: [0 1 2] (positions in b)
3. union1d(): Union
Core idea: All elements from either array, deduplicated and sorted.
a = np.array([1, 2, 3])
b = np.array([2, 3, 4, 5])
np.union1d(a, b) # [1 2 3 4 5]
4. setdiff1d(): Difference
Core idea: Elements in a that are not in b (order matters: a − b).
a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4])
np.setdiff1d(a, b) # [1 2 5]: in a but not in b
np.setdiff1d(b, a) # []: b minus a (empty here)
5. in1d(): Membership Test
Core idea: Returns a boolean array — True where elements of a appear in b.
a = np.array([1, 2, 3, 4, 5])
b = np.array([2, 4])
mask = np.in1d(a, b) # [False True False True False]
a[mask] # [2 4]: filter using the mask
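The membership test can also be inverted to keep the elements that are not in b. Unlike setdiff1d, this keeps the original order of a and any duplicates in it. A minimal sketch using np.isin (the shape-preserving counterpart of in1d) and its invert parameter, with illustrative array contents:

```python
import numpy as np

a = np.array([5, 1, 5, 2, 4])
b = np.array([2, 4])

# invert=True flips the membership test: True where a[i] is NOT in b
keep = np.isin(a, b, invert=True)

ordered_diff = a[keep]         # order and duplicates of a are preserved
set_diff = np.setdiff1d(a, b)  # sorted and deduplicated

print(ordered_diff)  # [5 1 5]
print(set_diff)      # [1 5]
```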
# Modern equivalent (NumPy 1.13+)
np.isin(a, b) # same result, more readable
Note: np.isin() is preferred over in1d() in modern NumPy. It preserves the shape of multi-dimensional input and has clearer naming, and in1d() is deprecated as of NumPy 2.0.
6. setxor1d(): Symmetric Difference
Core idea: Elements in either array but not in both (XOR logic).
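Symmetric difference can also be expressed through the other set functions: the union minus the intersection. The sketch below checks that identity on two illustrative arrays. It is a sanity check, not a description of how NumPy implements setxor1d.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5, 6])

xor_direct = np.setxor1d(a, b)

# Union minus intersection: everything, minus what both arrays share
xor_composed = np.setdiff1d(np.union1d(a, b), np.intersect1d(a, b))

print(xor_direct)                                # [1 2 5 6]
print(np.array_equal(xor_direct, xor_composed))  # True
```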
a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5, 6])
np.setxor1d(a, b) # [1 2 5 6]: not in common
7. Visual Summary
a = [1, 2, 3, 4, 5]
b = [3, 4, 5, 6, 7]
intersect1d: [3, 4, 5] ← overlap
union1d: [1, 2, 3, 4, 5, 6, 7] ← all
setdiff1d(a,b): [1, 2] ← only in a
setxor1d: [1, 2, 6, 7] ← not shared
8. Quick Comparison Table
| Function | Result | Analogy |
|---|---|---|
| unique(a) | Deduplicated a | Remove duplicates |
| intersect1d(a, b) | a ∩ b | Both have it |
| union1d(a, b) | a ∪ b | Either has it |
| setdiff1d(a, b) | a − b | Only a has it |
| setxor1d(a, b) | a △ b | Only one has it |
| in1d(a, b) | Boolean mask | Is a[i] in b? |
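The set notation in the table maps directly onto Python's built-in set operators. As a rough cross-check, assuming integer arrays small enough to convert to Python sets, the sketch below compares each NumPy function with its set counterpart. The helper `as_set` is hypothetical.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4, 5, 6, 7])

def as_set(arr):
    """Convert a NumPy array to a Python set of plain ints (illustrative helper)."""
    return set(arr.tolist())

sa, sb = as_set(a), as_set(b)

# Each NumPy function paired with the equivalent built-in set operator
checks = {
    "intersect1d": as_set(np.intersect1d(a, b)) == (sa & sb),
    "union1d": as_set(np.union1d(a, b)) == (sa | sb),
    "setdiff1d": as_set(np.setdiff1d(a, b)) == (sa - sb),
    "setxor1d": as_set(np.setxor1d(a, b)) == (sa ^ sb),
}

print(checks)  # every value is True
```

The NumPy versions return sorted arrays, while Python sets are unordered, so the comparison is on membership only.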
💡 One-line Takeaway
The set functions accept unsorted arrays with duplicates and return sorted, unique results, so calling unique() first is optional. Use intersect1d / union1d / setdiff1d for standard set logic; intersect1d, setdiff1d, and setxor1d also accept assume_unique=True for inputs that are already deduplicated.
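To see how these functions treat raw input, the sketch below passes unsorted arrays with duplicates straight to the set functions, then uses assume_unique=True on inputs that have already been deduplicated. The array values are illustrative.

```python
import numpy as np

a = np.array([5, 3, 3, 1, 5])
b = np.array([3, 7, 5, 5])

# Unsorted inputs with duplicates are handled directly
common = np.intersect1d(a, b)
either = np.union1d(a, b)
only_a = np.setdiff1d(a, b)
print(common)  # [3 5]
print(either)  # [1 3 5 7]
print(only_a)  # [1]

# If the inputs are already unique, assume_unique=True tells NumPy
# it can skip its own deduplication of the inputs
ua, ub = np.unique(a), np.unique(b)
fast_common = np.intersect1d(ua, ub, assume_unique=True)
print(fast_common)  # [3 5]
```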
https://lxy-alexander.github.io/blog/posts/numpy/api/08numpy-set-operations/