NumPy Set Operations

VIII. NumPy Set Operations (集合运算)

NumPy provides set-like operations (集合运算) on 1-D arrays. These functions treat each array as a set of values and support finding unique elements, intersections (交集), unions (并集), and differences (差集). The set-valued results are returned as sorted arrays of unique values; the membership test returns a boolean mask instead.
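As a quick illustration of the "sorted result" behavior, the sketch below feeds unsorted arrays with repeated values into three of the functions covered in this post. The input values are made up for the example.

```python
import numpy as np

# Unsorted inputs that contain repeated values (illustrative data)
a = np.array([5, 3, 3, 1])
b = np.array([3, 7, 5, 5])

common = np.intersect1d(a, b)  # values present in both arrays
either = np.union1d(a, b)      # values present in at least one array
only_a = np.setdiff1d(a, b)    # values in a that are missing from b

print(common)  # [3 5]
print(either)  # [1 3 5 7]
print(only_a)  # [1]
```

Each result is sorted and free of duplicates even though neither input was.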

1. unique() — Find Unique Values (求唯一值)

Core idea: Return sorted unique elements. Optionally return counts or indices.

import numpy as np
a = np.array([3, 1, 2, 1, 3, 3, 2])
np.unique(a) # [1 2 3]
# Also return counts (每个值出现的次数)
vals, counts = np.unique(a, return_counts=True)
# vals: [1 2 3]
# counts: [2 2 3]
# Also return first-occurrence indices (首次出现的索引)
vals, idx = np.unique(a, return_index=True)
# idx: [1 2 0] (positions in original array)
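Besides counts and first-occurrence indices, np.unique also accepts return_inverse=True. The sketch below reuses the array from the example above and shows how the inverse indices map each original element back to its slot in the unique values, which is one way to integer-encode a column.

```python
import numpy as np

a = np.array([3, 1, 2, 1, 3, 3, 2])

# inverse[i] is the position of a[i] within vals
vals, inverse = np.unique(a, return_inverse=True)

# Indexing vals with inverse rebuilds the original array
rebuilt = vals[inverse]

print(vals)     # [1 2 3]
print(inverse)  # [2 0 1 0 2 2 1]
print(rebuilt)  # [3 1 2 1 3 3 2]
```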

2. intersect1d() — Intersection (交集)

Core idea: Elements that appear in both arrays.

a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4, 5, 6, 7])
np.intersect1d(a, b) # [3 4 5]
# Also return indices in each array
common, ia, ib = np.intersect1d(a, b, return_indices=True)
# ia: [2 3 4] (positions in a)
# ib: [0 1 2] (positions in b)
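The returned indices are easiest to read when the inputs are not already sorted. In this sketch (illustrative values), indexing each input with its index array recovers the common values in sorted order.

```python
import numpy as np

a = np.array([40, 10, 30, 20])
b = np.array([20, 50, 40])

common, ia, ib = np.intersect1d(a, b, return_indices=True)

print(common)  # [20 40]
print(ia)      # [3 0]  positions of 20 and 40 in a
print(ib)      # [0 2]  positions of 20 and 40 in b

# Both index arrays point back at the shared values
same_from_a = a[ia]
same_from_b = b[ib]
```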

3. union1d() — Union (并集)

Core idea: All elements from either array, deduplicated and sorted.

a = np.array([1, 2, 3])
b = np.array([2, 3, 4, 5])
np.union1d(a, b) # [1 2 3 4 5]
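union1d takes exactly two arrays. To combine more than two, one common pattern is to fold the function over a list with functools.reduce. The three arrays below are illustrative.

```python
from functools import reduce

import numpy as np

arrays = [
    np.array([1, 3, 4, 3]),
    np.array([3, 1, 2, 1]),
    np.array([6, 3, 4, 2]),
]

# Apply union1d pairwise from left to right
combined = reduce(np.union1d, arrays)

print(combined)  # [1 2 3 4 6]
```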

4. setdiff1d() — Difference (差集)

Core idea: Elements in a that are not in b (order matters: a − b).

a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4])
np.setdiff1d(a, b) # [1 2 5] — in a but not in b
np.setdiff1d(b, a) # [] — b minus a (empty here)
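Because setdiff1d returns sorted unique values, the original ordering of a is lost. If that ordering matters, a boolean mask is one alternative; the sketch below uses np.isin (covered in the membership section) on made-up data and is not the only way to do this.

```python
import numpy as np

a = np.array([5, 1, 4, 2, 3])
b = np.array([3, 4])

sorted_diff = np.setdiff1d(a, b)   # sorted, unique result
ordered_diff = a[~np.isin(a, b)]   # keeps the order (and duplicates) of a

print(sorted_diff)   # [1 2 5]
print(ordered_diff)  # [5 1 2]
```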

5. in1d() — Membership Test (成员检测)

Core idea: Returns a boolean array — True where elements of a appear in b.

a = np.array([1, 2, 3, 4, 5])
b = np.array([2, 4])
mask = np.in1d(a, b) # [False True False True False]
a[mask] # [2 4] — filter using the mask
# Preferred spelling (np.isin has been available since NumPy 1.13)
np.isin(a, b) # same result, more readable
Note: np.isin() is preferred over in1d() in modern NumPy. It preserves the shape of multi-dimensional input and has a clearer name, and in1d() is deprecated as of NumPy 2.0.
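To make the multi-dimensional point concrete, here is a small sketch with a 2-D array. The names grid and allowed are placeholders chosen for the example; invert=True is a real parameter of np.isin that flips the mask.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])
allowed = [2, 4, 6]

# The mask has the same shape as grid
mask = np.isin(grid, allowed)

# invert=True selects the elements that are NOT in allowed
excluded = grid[np.isin(grid, allowed, invert=True)]

print(mask)
# [[False  True False]
#  [ True False  True]]
print(excluded)  # [1 3 5]
```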

6. setxor1d() — Symmetric Difference (对称差集)

Core idea: Elements in either array but not in both (XOR logic).

a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5, 6])
np.setxor1d(a, b) # [1 2 5 6] — not in common
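The symmetric difference can also be read as the union of the two one-sided differences. The sketch below checks that identity on the arrays from the example above; it is meant as a sanity check, not a replacement for setxor1d.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([3, 4, 5, 6])

xor = np.setxor1d(a, b)

# (a - b) union (b - a) gives the same set
via_diffs = np.union1d(np.setdiff1d(a, b), np.setdiff1d(b, a))

print(xor)        # [1 2 5 6]
print(via_diffs)  # [1 2 5 6]
```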

7. Visual Summary

a = [1, 2, 3, 4, 5]
b = [3, 4, 5, 6, 7]
intersect1d: [3, 4, 5] ← overlap
union1d: [1, 2, 3, 4, 5, 6, 7] ← all
setdiff1d(a,b): [1, 2] ← only in a
setxor1d: [1, 2, 6, 7] ← not shared
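The summary above can be reproduced with a few lines of code. This sketch collects the four results in a dictionary (the name results is just for the example) and prints them in the same layout.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([3, 4, 5, 6, 7])

results = {
    "intersect1d": np.intersect1d(a, b),
    "union1d": np.union1d(a, b),
    "setdiff1d(a,b)": np.setdiff1d(a, b),
    "setxor1d": np.setxor1d(a, b),
}

for name, values in results.items():
    print(f"{name}: {values.tolist()}")
```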

8. Quick Comparison Table

| Function (函数) | Result | Analogy |
| --- | --- | --- |
| unique(a) | Deduplicated a | Remove duplicates |
| intersect1d(a, b) | a ∩ b | Both have it |
| union1d(a, b) | a ∪ b | Either has it |
| setdiff1d(a, b) | a − b | Only a has it |
| setxor1d(a, b) | a △ b | Only one has it |
| in1d(a, b) | Boolean mask | Is a[i] in b? |
💡 One-line Takeaway
The set functions accept unsorted input with duplicates and return sorted, unique results, so calling unique() first is not required: use intersect1d / union1d / setdiff1d / setxor1d for standard set logic and isin() for membership masks.
https://lxy-alexander.github.io/blog/posts/numpy/api/08numpy-set-operations/
Author
Alexander Lee
Published at
2026-03-12
License
CC BY-NC-SA 4.0