3994 words
20 minutes
Python collections

I. Python collections Module — Complete Learning Manual
Python's collections module provides specialized container datatypes (特殊容器数据类型) that extend the built-in
dict, list, and tuple. The seven main classes are: defaultdict, Counter, OrderedDict, deque, namedtuple, ChainMap, and UserDict / UserList / UserString. Each solves a specific pain-point of the standard built-ins with minimal overhead. 1. defaultdict — Default Value Dict (默认值字典)
A defaultdict (默认值字典) behaves exactly like a regular
dict, except that accessing a missing key (缺失键) automatically creates it by calling the default_factory (默认工厂函数) — eliminating KeyError and verbose setdefault() boilerplate. 1) Constructor (构造函数)
collections.defaultdict(default_factory=None, **kwargs)default_factory is any zero-argument callable: int, list, set, dict, or a custom lambda.
2) defaultdict(int) — Frequency counter (频率计数)
from collections import defaultdict
text = "apple banana apple cherry banana apple"
freq = defaultdict(int) # missing key → 0
for word in text.split(): freq[word] += 1 # no KeyError on first access
print(dict(freq))# → {'apple': 3, 'banana': 2, 'cherry': 1}
# Compare with plain dict (verbose):freq2 = {}for word in text.split(): freq2[word] = freq2.get(word, 0) + 1 # needs .get()3) defaultdict(list) — Grouping (分组)
from collections import defaultdict
students = [ ("Alice", "Math"), ("Bob", "Science"), ("Alice", "Science"), ("Carol", "Math"), ("Bob", "Math"),]
by_name = defaultdict(list)
for name, subject in students: by_name[name].append(subject) # missing key → [] automatically
print(dict(by_name))# → {'Alice': ['Math', 'Science'], 'Bob': ['Science', 'Math'], 'Carol': ['Math']}4) defaultdict(set) — Unique grouping (去重分组)
from collections import defaultdict
edges = [(1, 2), (1, 3), (2, 3), (1, 2)] # duplicate edge (1,2)
graph = defaultdict(set)
for u, v in edges: graph[u].add(v) graph[v].add(u)
print(dict(graph))# → {1: {2, 3}, 2: {1, 3}, 3: {1, 2}} (no duplicates)5) defaultdict(dict) — Nested dict (嵌套字典)
from collections import defaultdict
# 2-level nested defaultdictmatrix = defaultdict(lambda: defaultdict(int))
matrix["row1"]["col1"] += 10matrix["row1"]["col2"] += 20matrix["row2"]["col1"] += 30
for row, cols in matrix.items(): print(f"{row}: {dict(cols)}")# → row1: {'col1': 10, 'col2': 20}# → row2: {'col1': 30}6) Custom default_factory (自定义工厂函数)
from collections import defaultdict
# Factory that returns a specific default valuedd = defaultdict(lambda: "N/A")dd["name"] = "Alice"
print(dd["name"]) # → Aliceprint(dd["age"]) # → N/A (key created with "N/A")print(dd["city"]) # → N/A
# Factory with counterid_counter = [0]def next_id(): id_counter[0] += 1 return id_counter[0]
registry = defaultdict(next_id)print(registry["alice"]) # → 1print(registry["bob"]) # → 2print(registry["alice"]) # → 1 (already exists)7) default_factory attribute — Inspect and change
from collections import defaultdict
dd = defaultdict(list)print(dd.default_factory) # → <class 'list'>
dd.default_factory = set # change factory at runtimedd["new_key"].add(42)print(dict(dd)) # → {'new_key': {42}}
dd.default_factory = None # disable factory → KeyError on missing keystry: _ = dd["missing"]except KeyError as e: print(f"KeyError: {e}") # → KeyError: 'missing'8) __missing__ — How defaultdict works internally
from collections import defaultdict
class MyDefaultDict(dict): """Manual implementation of defaultdict logic."""
def __init__(self, factory): super().__init__() self.factory = factory
def __missing__(self, key): # Called automatically when key is not found value = self.factory() self[key] = value return value
d = MyDefaultDict(list)d["x"].append(1)d["x"].append(2)d["y"].append(3)print(dict(d)) # → {'x': [1, 2], 'y': [3]}Note:
__missing__ is only triggered by d[key] access, NOT by d.get(key). get() always returns None (or the provided default) without creating the key.9) Inherits all dict methods
from collections import defaultdict
dd = defaultdict(int, a=1, b=2)
# All standard dict methods workprint(dd.keys()) # → dict_keys(['a', 'b'])print(dd.values()) # → dict_values([1, 2])print(dd.items()) # → dict_items([('a', 1), ('b', 2)])print(dd.get("x", 99)) # → 99 (no key created)print("a" in dd) # → Truedd.update({"c": 3})print(dd.pop("a")) # → 1print(dict(dd)) # → {'b': 2, 'c': 3}2. Counter — Multiset / Frequency Map (计数器)
A Counter (计数器) is a
dict subclass designed for counting hashable objects (统计可哈希对象). Missing keys return 0 instead of raising KeyError. It supports arithmetic operations (算术运算) between counters. 1) Constructor — Three ways to create
from collections import Counter
# From an iterablec1 = Counter("abracadabra")print(c1) # → Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
# From a dictc2 = Counter({"cats": 4, "dogs": 8})print(c2) # → Counter({'dogs': 8, 'cats': 4})
# From keyword argumentsc3 = Counter(red=3, blue=1, green=5)print(c3) # → Counter({'green': 5, 'red': 3, 'blue': 1})2) Missing key → 0 (缺失键返回0)
from collections import Counter
c = Counter("hello")print(c["l"]) # → 2 (exists)print(c["z"]) # → 0 (missing — no KeyError!)print("z" in c) # → False (not stored, just returns 0)3) most_common(n) — Top N elements (最高频N个元素)
from collections import Counter
words = "the quick brown fox jumps over the lazy dog the fox".split()c = Counter(words)
print(c.most_common(3))# → [('the', 3), ('fox', 2), ('quick', 1)]
print(c.most_common()) # all elements, sorted by frequencyprint(c.most_common()[:-4:-1])# least common 3 (tail trick)# → [('dog', 1), ('lazy', 1), ('over', 1)]4) elements() — Expand back to iterable (展开为可迭代)
from collections import Counter
c = Counter(a=3, b=1, c=2)
print(list(c.elements()))# → ['a', 'a', 'a', 'b', 'c', 'c'] (ordered by insertion)
# Reconstruct a sorted listprint(sorted(c.elements()))# → ['a', 'a', 'a', 'b', 'c', 'c']
# Elements with count ≤ 0 are excludedc["x"] = -1print(list(c.elements())) # 'x' not included5) subtract() / update() — In-place operations (就地运算)
from collections import Counter
inventory = Counter(apples=10, oranges=5, bananas=8)
# subtract: reduces counts (allows negatives)sold = Counter(apples=3, oranges=5, bananas=10)inventory.subtract(sold)print(inventory)# → Counter({'apples': 7, 'bananas': -2, 'oranges': 0})
# update: adds counts (merges)restocked = Counter(apples=5, bananas=15)inventory.update(restocked)print(inventory)# → Counter({'bananas': 13, 'apples': 12, 'oranges': 0})6) Arithmetic operators (算术运算符)
from collections import Counter
a = Counter(x=4, y=2, z=0)b = Counter(x=1, y=3, w=5)
print(a + b) # add counts# → Counter({'x': 5, 'w': 5, 'y': 5})
print(a - b) # subtract, keep only positives# → Counter({'x': 3})
print(a & b) # intersection: min of each count# → Counter({'x': 1, 'y': 2})
print(a | b) # union: max of each count# → Counter({'w': 5, 'x': 4, 'y': 3})
# Unary operatorsprint(+a) # remove zero and negative countsprint(-a) # negate — flip sign, keep negatives as positives7) Total count and filtering (总计数与过滤)
from collections import Counter
c = Counter(a=5, b=3, c=0, d=-2)
# Total of all positive counts (Python 3.10+)print(c.total()) # → 8 (5+3+0 = 8, negatives excluded)
# Keep only positive countspositive = +cprint(positive) # → Counter({'a': 5, 'b': 3})
# Keep only negative counts (useful for "owed" quantities)negative = -cprint(negative) # → Counter({'d': 2})8) Practical: anagram check, top-K, word frequency
from collections import Counter
# ── Anagram check (变位词检测) ──def is_anagram(s1: str, s2: str) -> bool: return Counter(s1.lower()) == Counter(s2.lower())
print(is_anagram("listen", "silent")) # → Trueprint(is_anagram("hello", "world")) # → False
# ── Character frequency difference ──def missing_chars(have: str, need: str) -> Counter: deficit = Counter(need) - Counter(have) return deficit
print(missing_chars("aab", "aaabbc"))# → Counter({'a': 1, 'b': 1, 'c': 1})
# ── Top-K frequent words ──import re
text = """To be or not to be that is the question whether tis nobler in the mind to suffer"""
words = re.findall(r'\w+', text.lower())top5 = Counter(words).most_common(5)print(top5)# → [('to', 3), ('be', 2), ('the', 2), ('or', 1), ('not', 1)]9) Inherits all dict methods
from collections import Counter
c = Counter("mississippi")
print(c.keys()) # → dict_keys(['m', 'i', 's', 'p'])print(c.values()) # → dict_values([1, 4, 4, 2])print(c.items()) # → dict_items([('m', 1), ('i', 4), ('s', 4), ('p', 2)])print(c.get("i")) # → 4print(c.get("z")) # → None (get() returns None, not 0)
# del sets count to 0 conceptually, but removes the keydel c["m"]print("m" in c) # → Falseprint(c["m"]) # → 0 (missing key returns 0)3. OrderedDict — Ordered Dictionary (有序字典)
Since Python 3.7, plain
dict preserves insertion order. OrderedDict (有序字典) still offers unique advantages: order-sensitive equality, move_to_end(), and popitem(last=True/False) for implementing LRU Cache (LRU缓存) and similar structures. 1) Basic usage and order-sensitive equality
from collections import OrderedDict
od = OrderedDict()od["banana"] = 3od["apple"] = 5od["cherry"] = 1
print(od)# → OrderedDict([('banana', 3), ('apple', 5), ('cherry', 1)])
# Order-sensitive equality (顺序敏感的相等判断)od1 = OrderedDict([("a", 1), ("b", 2)])od2 = OrderedDict([("b", 2), ("a", 1)])d1 = {"a": 1, "b": 2}
print(od1 == od2) # → False (same keys/values, different order)print(od1 == d1) # → True (OrderedDict == dict ignores order)2) move_to_end(key, last=True) — Reposition a key
from collections import OrderedDict
od = OrderedDict.fromkeys("ABCDE")
od.move_to_end("B") # move B to end (last=True default)print(list(od)) # → ['A', 'C', 'D', 'E', 'B']
od.move_to_end("E", last=False) # move E to frontprint(list(od)) # → ['E', 'A', 'C', 'D', 'B']3) popitem(last=True) — LIFO / FIFO removal
from collections import OrderedDict
od = OrderedDict.fromkeys("ABCDE")
print(od.popitem(last=True)) # → ('E', None) LIFO (like a stack)print(od.popitem(last=False)) # → ('A', None) FIFO (like a queue)print(list(od)) # → ['B', 'C', 'D']4) LRU Cache implementation (LRU缓存实现)
from collections import OrderedDict
class LRUCache: """ Least Recently Used Cache (最近最少使用缓存) using OrderedDict for O(1) get and put. """
def __init__(self, capacity: int): self.capacity = capacity self.cache = OrderedDict()
def get(self, key: int) -> int: if key not in self.cache: return -1 self.cache.move_to_end(key) # mark as recently used return self.cache[key]
def put(self, key: int, value: int) -> None: if key in self.cache: self.cache.move_to_end(key) self.cache[key] = value if len(self.cache) > self.capacity: self.cache.popitem(last=False) # evict least recently used
cache = LRUCache(3)cache.put(1, 10)cache.put(2, 20)cache.put(3, 30)print(cache.get(1)) # → 10 (1 moved to end)cache.put(4, 40) # evicts key 2 (least recently used)print(cache.get(2)) # → -1 (evicted)print(cache.get(3)) # → 30print(cache.get(4)) # → 405) __reversed__() — Reverse iteration
from collections import OrderedDict
od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
for key in reversed(od): print(key, od[key])# → c 3# → b 2# → a 14. deque — Double-Ended Queue (双端队列)
A deque (双端队列) supports O(1) append and pop from both ends. Unlike a
list (where insert(0, x) is O(n)), deque is the correct data structure for queues (队列), stacks (栈), and sliding windows (滑动窗口). 1) Constructor
from collections import deque
d1 = deque() # emptyd2 = deque([1, 2, 3, 4, 5]) # from iterabled3 = deque("abcde") # from stringd4 = deque(range(10), maxlen=5) # bounded deque (固定长度)
print(d2) # → deque([1, 2, 3, 4, 5])print(d4) # → deque([5, 6, 7, 8, 9], maxlen=5) (first 5 discarded)2) append() / appendleft() — Add to ends (两端添加)
from collections import deque
d = deque([3, 4, 5])
d.append(6) # right end → deque([3, 4, 5, 6])d.appendleft(2) # left end → deque([2, 3, 4, 5, 6])d.appendleft(1) # → deque([1, 2, 3, 4, 5, 6])
print(d) # → deque([1, 2, 3, 4, 5, 6])3) pop() / popleft() — Remove from ends (两端弹出)
from collections import deque
d = deque([1, 2, 3, 4, 5])
print(d.pop()) # → 5 (right) → deque([1, 2, 3, 4])print(d.popleft()) # → 1 (left) → deque([2, 3, 4])print(d) # → deque([2, 3, 4])4) extend() / extendleft() — Batch add (批量添加)
from collections import deque
d = deque([3, 4])
d.extend([5, 6, 7]) # right: → deque([3, 4, 5, 6, 7])d.extendleft([2, 1, 0]) # left, each prepended individually # 2 → [2,3..], 1 → [1,2,3..], 0 → [0,1,2,3..]print(d) # → deque([0, 1, 2, 3, 4, 5, 6, 7])Note:
extendleft([a, b, c]) results in [c, b, a, ...] because each element is prepended one by one — the iterable is effectively reversed.5) rotate(n) — Circular rotation (循环旋转)
from collections import deque
d = deque([1, 2, 3, 4, 5])
d.rotate(2) # rotate RIGHT by 2print(d) # → deque([4, 5, 1, 2, 3])
d.rotate(-2) # rotate LEFT by 2 (undo)print(d) # → deque([1, 2, 3, 4, 5])
# Circular buffer simulation (循环缓冲区)ring = deque(range(5))for _ in range(8): print(ring[0], end=" ") ring.rotate(-1)# → 0 1 2 3 4 0 1 26) maxlen — Bounded / sliding window (有界滑动窗口)
from collections import deque
# Keep only the last 3 elementswindow = deque(maxlen=3)
for i in range(7): window.append(i) print(f"added {i}: {list(window)}")# → added 0: [0]# → added 1: [0, 1]# → added 2: [0, 1, 2]# → added 3: [1, 2, 3] ← 0 dropped automatically# → added 4: [2, 3, 4]# → added 5: [3, 4, 5]# → added 6: [4, 5, 6]
# Moving average (滑动平均)def moving_average(data, window_size): w = deque(maxlen=window_size) result = [] for val in data: w.append(val) result.append(sum(w) / len(w)) return result
print(moving_average([1, 2, 3, 4, 5, 6], 3))# → [1.0, 1.5, 2.0, 3.0, 4.0, 5.0]7) insert() / remove() / count() / index()
from collections import deque
d = deque([1, 2, 3, 2, 4])
d.insert(2, 99) # insert 99 at position 2print(d) # → deque([1, 2, 99, 3, 2, 4])
d.remove(99) # remove first occurrenceprint(d) # → deque([1, 2, 3, 2, 4])
print(d.count(2)) # → 2 (occurrences of 2)print(d.index(3)) # → 2 (first index of 3)8) reverse() / copy() / clear()
from collections import deque
d = deque([1, 2, 3, 4, 5])
d.reverse()print(d) # → deque([5, 4, 3, 2, 1])
d2 = d.copy() # shallow copyd2.append(0)print(d) # → deque([5, 4, 3, 2, 1]) (original unchanged)
d.clear()print(d) # → deque([])print(len(d)) # → 09) Performance comparison vs list (与list性能对比)
import timeitfrom collections import deque
# prepend 100_000 itemslist_time = timeit.timeit(lambda: [0] * 100_000, number=100)deque_time = timeit.timeit(lambda: deque([0] * 100_000), number=100)
# insert at frontn = 10_000t_list = timeit.timeit(lambda: [None] + list(range(n)), number=1000)t_deque = timeit.timeit(lambda: deque([None]) + deque(range(n)), number=1000)
print(f"list front-insert: {t_list:.4f}s")print(f"deque front-insert: {t_deque:.4f}s")# deque is orders of magnitude faster for front operations5. namedtuple — Immutable Record (具名元组)
namedtuple creates a tuple subclass (元组子类) whose fields can be accessed by name as well as by index. It is immutable (不可变), memory-efficient, and self-documenting. 1) Factory function namedtuple(typename, field_names)
from collections import namedtuple
# Three equivalent ways to define field names:Point = namedtuple("Point", ["x", "y"])Point = namedtuple("Point", "x y")Point = namedtuple("Point", "x, y")
p = Point(3, 4)print(p) # → Point(x=3, y=4)print(p.x, p.y) # → 3 4 (by name)print(p[0], p[1]) # → 3 4 (by index)print(p == (3, 4)) # → True (is a tuple subclass)2) _make() — Create from iterable (从可迭代对象创建)
from collections import namedtuple
Employee = namedtuple("Employee", "name age department salary")
data = ["Alice", 30, "Engineering", 95000]emp = Employee._make(data)print(emp)# → Employee(name='Alice', age=30, department='Engineering', salary=95000)
# From CSV rowimport csv, iocsv_data = "Bob,25,Marketing,60000"for row in csv.reader(io.StringIO(csv_data)): e = Employee._make(row) print(f"{e.name} in {e.department}")# → Bob in Marketing3) _asdict() — Convert to OrderedDict (转换为有序字典)
from collections import namedtuple
Point3D = namedtuple("Point3D", "x y z")p = Point3D(1, 2, 3)
d = p._asdict()print(d) # → {'x': 1, 'y': 2, 'z': 3}print(type(d)) # → <class 'dict'>
# Serialize to JSONimport jsonprint(json.dumps(p._asdict())) # → {"x": 1, "y": 2, "z": 3}4) _replace() — Create modified copy (创建修改副本)
from collections import namedtuple
# namedtuple is IMMUTABLE — _replace() returns a new instancePerson = namedtuple("Person", "name age city")alice = Person("Alice", 30, "NYC")
# "update" one fieldolder_alice = alice._replace(age=31)print(alice) # → Person(name='Alice', age=30, city='NYC') (unchanged)print(older_alice) # → Person(name='Alice', age=31, city='NYC')5) _fields / _field_defaults — Introspection (内省)
from collections import namedtuple
Config = namedtuple("Config", "host port timeout", defaults=["localhost", 8080, 30])
print(Config._fields) # → ('host', 'port', 'timeout')print(Config._field_defaults) # → {'host': 'localhost', 'port': 8080, 'timeout': 30}
c1 = Config() # all defaultsc2 = Config("example.com") # override host onlyprint(c1) # → Config(host='localhost', port=8080, timeout=30)print(c2) # → Config(host='example.com', port=8080, timeout=30)6) rename=True — Auto-rename invalid field names
from collections import namedtuple
# 'class' and '2bad' are invalid Python identifiersT = namedtuple("T", ["class", "2bad", "ok"], rename=True)print(T._fields) # → ('_0', '_1', 'ok') (invalid names → _index)
t = T(1, 2, 3)print(t._0, t._1, t.ok) # → 1 2 37) Subclassing namedtuple — Adding methods
from collections import namedtupleimport math
class Vector(namedtuple("Vector", "x y")): """Extend namedtuple with custom methods."""
def magnitude(self) -> float: return math.sqrt(self.x**2 + self.y**2)
def dot(self, other: "Vector") -> float: return self.x * other.x + self.y * other.y
def __add__(self, other): return Vector(self.x + other.x, self.y + other.y)
v1 = Vector(3, 4)v2 = Vector(1, 2)
print(v1.magnitude()) # → 5.0print(v1.dot(v2)) # → 11.0print(v1 + v2) # → Vector(x=4, y=6)6. ChainMap — Multi-scope Lookup (多层级查找映射)
A ChainMap (链式映射) groups multiple dicts into a single, updateable view. Lookups search the dicts from first to last, returning the first match. Writes always go to the first map. Perfect for modeling variable scopes (变量作用域) like Python's own LEGB rule.
1) Basic lookup (基本查找)
from collections import ChainMap
defaults = {"color": "red", "user": "guest", "timeout": 30}env_vars = {"color": "blue", "debug": True}cli_args = {"timeout": 10}
# Priority: cli_args > env_vars > defaultsconfig = ChainMap(cli_args, env_vars, defaults)
print(config["color"]) # → blue (from env_vars, overrides defaults)print(config["user"]) # → guest (only in defaults)print(config["timeout"]) # → 10 (from cli_args, highest priority)print(config["debug"]) # → True (from env_vars)2) Writes go to first map only (写入仅影响第一个映射)
from collections import ChainMap
base = {"x": 1, "y": 2}overlay = {}
cm = ChainMap(overlay, base)
cm["x"] = 99 # written to overlay (first map)cm["z"] = 0 # new key also goes to overlay
print(overlay) # → {'x': 99, 'z': 0}print(base) # → {'x': 1, 'y': 2} (unchanged!)print(cm["x"]) # → 99 (overlay shadows base)print(cm["y"]) # → 2 (from base)3) new_child(m=None) — Push a new scope (推入新作用域)
from collections import ChainMap
# Simulate nested scopes (模拟嵌套作用域)global_scope = ChainMap({"x": 1, "y": 2})local_scope = global_scope.new_child({"x": 10, "z": 3})
print(local_scope["x"]) # → 10 (local shadows global)print(local_scope["y"]) # → 2 (falls through to global)print(local_scope["z"]) # → 3 (local only)
# Pop the local scope (返回父作用域)parent_scope = local_scope.parentsprint(parent_scope["x"]) # → 1 (original global value)4) maps attribute — Access underlying dicts (访问底层字典列表)
from collections import ChainMap
cm = ChainMap({"a": 1}, {"b": 2}, {"c": 3})
print(cm.maps)# → [{'a': 1}, {'b': 2}, {'c': 3}]
# Modify underlying dicts directlycm.maps[1]["b"] = 99print(cm["b"]) # → 995) Practical: CLI argument + environment + defaults
from collections import ChainMapimport os
def build_config(cli_args: dict) -> ChainMap: """Three-tier configuration (三层配置): CLI > ENV > defaults.""" defaults = { "host": "localhost", "port": 8080, "debug": False, "workers": 4, } env_config = { k.lower().replace("app_", ""): v for k, v in os.environ.items() if k.startswith("APP_") } return ChainMap(cli_args, env_config, defaults)
config = build_config({"port": 9090, "debug": True})print(config["host"]) # → localhost (from defaults)print(config["port"]) # → 9090 (from cli_args)print(config["debug"]) # → True (from cli_args)7. UserDict / UserList / UserString — Custom Containers (自定义容器基类)
UserDict, UserList, and UserString are wrapper classes designed for safe subclassing. Subclassing built-in
dict / list directly can miss overrides because C-level methods call each other without going through Python. UserDict etc. route ALL operations through Python methods. 1) UserDict — Custom dict with validation (带验证的自定义字典)
from collections import UserDict
class TypedDict(UserDict): """A dict that only accepts string keys and int values."""
def __setitem__(self, key, value): if not isinstance(key, str): raise TypeError(f"Key must be str, got {type(key).__name__}") if not isinstance(value, int): raise TypeError(f"Value must be int, got {type(value).__name__}") super().__setitem__(key, value) # delegate to UserDict
td = TypedDict()td["score"] = 100td["count"] = 42print(td) # → {'score': 100, 'count': 42}
try: td[123] = 10 # invalid keyexcept TypeError as e: print(f"Error: {e}") # → Error: Key must be str, got int
try: td["x"] = "hello" # invalid valueexcept TypeError as e: print(f"Error: {e}") # → Error: Value must be int, got str2) UserList — Custom list with constraints (带约束的自定义列表)
from collections import UserList
class BoundedList(UserList): """A list that enforces a maximum length (最大长度限制)."""
def __init__(self, maxlen: int, iterable=()): self.maxlen = maxlen super().__init__() for item in iterable: self.append(item)
def append(self, item): if len(self.data) >= self.maxlen: raise OverflowError(f"List is full (max {self.maxlen})") self.data.append(item)
def insert(self, index, item): if len(self.data) >= self.maxlen: raise OverflowError(f"List is full (max {self.maxlen})") self.data.insert(index, item)
bl = BoundedList(3, [1, 2, 3])print(bl) # → [1, 2, 3]
try: bl.append(4)except OverflowError as e: print(f"Error: {e}") # → Error: List is full (max 3)3) UserString — Custom string with transforms (带转换的自定义字符串)
from collections import UserString
class SlugString(UserString): """Auto-converts string to URL-safe slug (URL友好字符串)."""
def __init__(self, seq=""): import re slug = re.sub(r'[^a-z0-9]+', '-', str(seq).lower()).strip('-') super().__init__(slug)
def __add__(self, other): return SlugString(self.data + "-" + str(other))
s = SlugString("Hello World! This is a Test.")print(s) # → hello-world-this-is-a-test
s2 = s + "extra"print(s2) # → hello-world-this-is-a-test-extraprint(len(s)) # → 28 (all str methods work)print(s.upper()) # → HELLO-WORLD-THIS-IS-A-TEST8. Comparison Table (对比总结)
| Class | Based on | Missing key | Ordered | Mutable | Best use case |
|---|---|---|---|---|---|
defaultdict | dict | auto-creates | insertion | ✅ | Grouping, counting |
Counter | dict | returns 0 | insertion | ✅ | Frequency, multiset ops |
OrderedDict | dict | KeyError | insertion | ✅ | LRU cache, order-sensitive eq |
deque | list-like | IndexError | yes | ✅ | Queue, stack, sliding window |
namedtuple | tuple | AttributeError | yes | ❌ | Immutable records, CSV rows |
ChainMap | dict view | KeyError | first-wins | ✅ (first) | Config layers, scopes |
UserDict | dict | KeyError | insertion | ✅ | Safe dict subclassing |
💡 One-line Takeaway
Use
Use
defaultdict to eliminate KeyError boilerplate, Counter for frequency analysis and multiset arithmetic, deque when you need O(1) operations on both ends, namedtuple for self-documenting immutable records, OrderedDict for LRU caches and order-sensitive comparisons, and ChainMap for multi-tier configuration or scope simulation. Python collections
https://lxy-alexander.github.io/blog/posts/python/python-collections/