Python Counters: How to Use collections.Counter

王林

Published: 2023-05-08 13:34:07

Source: 亿速云 (reposted)

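I. Introduction

Counter is a tool for fast, convenient tallying. It is a dict subclass for counting hashable objects: a collection in which elements are stored as dictionary keys and their counts as the corresponding values. Counts may be any integer, including zero and negative numbers. The Counter class is similar to bags or multisets in other languages. In short, it counts things, and a few examples make that clear.
Examples: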

# Find the 10 most common words
from collections import Counter
import re
text = 'remove an existing key one level down remove an existing key one level down'
words = re.findall(r'\w+', text)
Counter(words).most_common(10)
[('remove', 2), ('an', 2), ('existing', 2), ('key', 2), ('one', 2), ('level', 2), ('down', 2)]


# Count the words in a list
cnt = Counter()
for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
    cnt[word] += 1
cnt
Counter({'blue': 3, 'red': 2, 'green': 1})


# The loop above is a bit tedious; passing the list straight to Counter is simpler
L = ['red', 'blue', 'red', 'green', 'blue', 'blue']
Counter(L)
Counter({'blue': 3, 'red': 2, 'green': 1})

Elements are counted from an iterable or initialized from another mapping (or counter):

from collections import Counter

# Count the characters of a string
Counter('gallahad')
Counter({'a': 3, 'l': 2, 'g': 1, 'h': 1, 'd': 1})

# Initialize from a dict
Counter({'red': 4, 'blue': 2})
Counter({'red': 4, 'blue': 2})

# Initialize from keyword arguments
Counter(cats=4, dogs=8)
Counter({'dogs': 8, 'cats': 4})

# Count the items of a list
Counter(['red', 'blue', 'red', 'green', 'blue', 'blue'])
Counter({'blue': 3, 'red': 2, 'green': 1})
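The introduction says that Counter is a dict subclass, that it counts hashable objects, and that counts may be zero or negative. A minimal sketch of those three points follows; the `stock` data and the variable names are made up for illustration.

```python
from collections import Counter

# Hypothetical data: zero and negative counts are stored like any other value.
stock = Counter(apples=3, pears=0, plums=-1)

# Counter is a dict subclass, so it passes the usual dict checks.
is_dict = isinstance(stock, dict)
stored = dict(stock)

# Keys must be hashable: tuples work as elements, lists do not.
pairs = Counter([(0, 1), (0, 1), (2, 3)])
try:
    Counter([[0, 1], [0, 1]])
    unhashable_ok = True
except TypeError:
    unhashable_ok = False

print(is_dict, stored, pairs[(0, 1)], unhashable_ok)
```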

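II. Basic Operations

1. Counting the occurrences of each element in an iterable

1.1 Applying it to a list or a string

There are two ways to use it: call Counter directly, or keep the instance in a variable first. If you need the result more than once, the second way is cleaner, because you can then call the various Counter methods on the instance. The same pattern works for any other iterable.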

# Import the class first
from collections import Counter
# Apply it to a list
list_01 = [1,9,9,5,0,8,0,9]  # birthday of GNZ48's 陈珂
print(Counter(list_01))  #Counter({9: 3, 0: 2, 1: 1, 5: 1, 8: 1})

# Apply it to a string
temp = Counter('abcdeabcdabcaba')
print(temp)  #Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1})
# The first call uses Counter directly; the second keeps the instance in temp for reuse

1.2 Printing the result

# Check the type
print( type(temp) ) #<class 'collections.Counter'>

# Convert to a dict and print it
print( dict(temp) ) #{'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1}

# Print each (element, count) pair; on Python 3.7+ dicts keep insertion order
for item in dict(temp).items():
    print(item)
"""
('a', 5)
('b', 4)
('c', 3)
('d', 2)
('e', 1)
"""

1.3 Printing with the built-in items() method

This is more convenient than converting to a dict first:

print(temp.items()) #dict_items([('a', 5), ('b', 4), ('c', 3), ('d', 2), ('e', 1)])

for item in temp.items():
    print(item)
"""
('a', 5)
('b', 4)
('c', 3)
('d', 2)
('e', 1)
"""

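2. most_common(): the most frequent elements

most_common() returns a list of the n most common elements and their counts, ordered from most to least common. If n is omitted or None, it returns every element in the counter. Elements with equal counts are ordered by first occurrence. It is often used to find the top words by frequency: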

# Find the most frequent elements of a sequence

from collections import Counter

list_01 = [1,9,9,5,0,8,0,9]
temp = Counter(list_01)

# The single most frequent element
print(temp.most_common(1))   #[(9, 3)]  the element 9 occurs 3 times
print(temp.most_common(2)) #[(9, 3), (0, 2)]  the two most frequent elements

# With no n given, every element is listed
print(temp.most_common())  #[(9, 3), (0, 2), (1, 1), (5, 1), (8, 1)]

Counter('abracadabra').most_common(3)
[('a', 5), ('b', 2), ('r', 2)]

Counter('abracadabra').most_common(5)
[('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]
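most_common() only ranks from the top, so the least frequent elements need a slice of the full list. The sketch below reuses the 'abracadabra' string from above; the variable names are illustrative, and the reverse slice is the idiom given in the standard library documentation.

```python
from collections import Counter

letters = Counter('abracadabra')
n = 2

# Walk the full ranking backwards and stop after n items.
least_common = letters.most_common()[:-n - 1:-1]

# Asking for more elements than exist simply returns all of them.
everything = letters.most_common(10)

print(least_common)
print(everything)
```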

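3. elements() and sorted()

elements() returns an iterator in which each element is repeated as many times as its count. Elements are returned in order of first occurrence. An element whose count is less than one is ignored.
Example: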

c = Counter(a=4, b=2, c=0, d=-2)
list(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']

sorted(c.elements())
['a', 'a', 'a', 'a', 'b', 'b']

c = Counter(a=4, b=2, c=0, d=5)
list(c.elements())
['a', 'a', 'a', 'a', 'b', 'b', 'd', 'd', 'd', 'd', 'd']

from collections import Counter
 
c = Counter('ABCABCCC')
print(c.elements()) #<itertools.chain object at 0x0000027D94126860>
 
# Convert to a list
print(list(c.elements())) #['A', 'A', 'B', 'B', 'C', 'C', 'C', 'C']

# Or sort the elements
print(sorted(c.elements()))  #['A', 'A', 'B', 'B', 'C', 'C', 'C', 'C']

# Calling sorted() on the Counter itself lists the unique elements
# For example
print( sorted(c) ) #['A', 'B', 'C']

An example from the official documentation:

# Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
prime_factors = Counter({2: 2, 3: 3, 17: 1})
product = 1
for factor in prime_factors.elements():  # loop over factors
    product *= factor  # and multiply them
print(product)  #1836
#1836 = 2*2*3*3*3*17
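The same product can be written without an explicit loop. This is a small sketch that assumes Python 3.8 or later, where math.prod is available; it also shows that rebuilding a Counter from elements() loses the entries with counts below one.

```python
import math
from collections import Counter

prime_factors = Counter({2: 2, 3: 3, 17: 1})

# elements() yields 2, 2, 3, 3, 3, 17; math.prod multiplies them together.
product = math.prod(prime_factors.elements())

# Entries with zero or negative counts never appear in elements(),
# so a Counter rebuilt from it keeps only the positive ones.
c = Counter(a=4, b=2, c=0, d=-2)
rebuilt = Counter(c.elements())

print(product, dict(rebuilt))
```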

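4. subtract(): results keep zero and negative counts

subtract() subtracts the elements of an iterable or a mapping. Both the inputs and the resulting counts may be zero or negative.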

c = Counter(a=4, b=2, c=0, d=-2)
d = Counter(a=1, b=2, c=3, d=4)
c.subtract(d)
c
Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})

# Subtract one each of a, b, c and d
str0 = Counter('aabbccdde')
str0
Counter({'a': 2, 'b': 2, 'c': 2, 'd': 2, 'e': 1})

str0.subtract('abcd')
str0
Counter({'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1})

subtract_test01 = Counter("AAB")
subtract_test01.subtract("BCC")
print(subtract_test01)  #Counter({'A': 2, 'B': 0, 'C': -2})

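Counts can drop below zero, so the result may contain zero and negative values:

subtract_test02 = Counter("which")
subtract_test02.subtract("witch")  # subtract the elements of another iterable
subtract_test02.subtract(Counter("watch"))  # a Counter can be subtracted too

# Check the result
print( subtract_test02["h"] )  # 0: "which" has two, minus one in "witch", minus one in "watch"
print( subtract_test02["w"] )  #-1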
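It may help to see subtract() next to the - operator, since the two treat non-positive results differently. The following is a minimal comparison sketch; `left`, `right`, `diff` and `in_place` are illustrative names.

```python
from collections import Counter

left = Counter('AAB')
right = Counter('BCC')

# The - operator returns a new Counter and drops counts of zero or less.
diff = left - right

# subtract() changes the Counter in place, returns None,
# and keeps zero and negative counts.
in_place = Counter(left)      # copy, so that left stays untouched
returned = in_place.subtract(right)

print(dict(diff))
print(dict(in_place))
```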

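5. Dictionary methods

The usual dictionary methods are available on Counter objects, except for two that work differently.

fromkeys(iterable): this class method is not implemented for Counter objects.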

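update([iterable-or-mapping]): counts the elements of an iterable, or adds in the counts from another mapping (or counter). Counts are added to the existing ones rather than replacing them. The iterable is expected to be a sequence of elements, not a sequence of (key, value) pairs.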

sum(c.values())                 # total of all counts
c.clear()                       # reset all counts
list(c)                         # list unique elements
set(c)                          # convert to a set
dict(c)                         # convert to a regular dictionary
c.items()                       # convert to a list of (elem, cnt) pairs
Counter(dict(list_of_pairs))    # convert from a list of (elem, cnt) pairs
c.most_common()[:-n-1:-1]       # n least common elements
+c                              # remove zero and negative counts
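The two methods that differ from dict can be checked directly. This sketch uses made-up variable names and contrasts Counter.update() with dict.update(), then shows what happens when fromkeys() is called on Counter.

```python
from collections import Counter

counts = Counter(a=1)
counts.update('aab')        # iterable of elements: a +2, b +1
counts.update({'a': 10})    # mapping: the count is added, not replaced

plain = {'a': 1}
plain.update({'a': 10})     # a plain dict replaces the value

# fromkeys() is deliberately undefined for Counter.
try:
    Counter.fromkeys('ab')
    fromkeys_ok = True
except NotImplementedError:
    fromkeys_ok = False

print(dict(counts), plain, fromkeys_ok)
```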

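6. Mathematical operations

Several mathematical operations are provided for combining Counter objects to produce multisets (counters whose counts are all greater than zero). Addition and subtraction combine counters by adding or subtracting the counts of corresponding elements. Intersection and union return the minimum and maximum of the corresponding counts. Each operation accepts inputs with signed counts, but the output leaves out any result whose count is zero or less.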

c = Counter(a=3, b=1)
d = Counter(a=1, b=2)
c + d                       # add two counters together:  c[x] + d[x]
Counter({'a': 4, 'b': 3})
c - d                       # subtract (keeping only positive counts)
Counter({'a': 2})
c & d                       # intersection:  min(c[x], d[x]) 
Counter({'a': 1, 'b': 1})
c | d                       # union:  max(c[x], d[x])
Counter({'a': 3, 'b': 2})

print(Counter('AAB') + Counter('BCC'))
#Counter({'A': 2, 'B': 2, 'C': 2})
print(Counter("AAB")-Counter("BCC"))
#Counter({'A': 2})

The "and" (&) and "or" (|) operations:

print(Counter('AAB') & Counter('BBCC'))
#Counter({'B': 1})

print(Counter('AAB') | Counter('BBCC'))
#Counter({'A': 2, 'B': 2, 'C': 2})

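Unary plus and minus are shorthand for adding the counter to an empty counter or subtracting it from one, which amounts to multiplying every count by +1 or -1. As before, the output leaves out any result whose count is zero or less: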

c = Counter(a=2, b=-4)
+c
Counter({'a': 2})
-c
Counter({'b': 4})
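The description of the unary operators can be verified against the binary ones. A short sketch, with illustrative variable names:

```python
from collections import Counter

c = Counter(a=2, b=-4)

# +c behaves like adding c to an empty counter.
plus_matches = dict(+c) == dict(c + Counter())

# -c behaves like subtracting c from an empty counter.
minus_matches = dict(-c) == dict(Counter() - c)

# c itself is unchanged; both operators return new Counter objects.
print(plus_matches, minus_matches, dict(c))
```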

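A text-similarity function, weighted by character counts:

from collections import Counter

def str_sim(str_0, str_1, topn):
    topn = int(topn)
    # Keep only the topn most common characters of each string
    collect0 = Counter(dict(Counter(str_0).most_common(topn)))
    collect1 = Counter(dict(Counter(str_1).most_common(topn)))
    jiao = collect0 & collect1  # intersection: minimum of each count
    bing = collect0 | collect1  # union: maximum of each count
    # Ratio of the intersection total to the union total
    sim = float(sum(jiao.values())) / float(sum(bing.values()))
    return sim

str_0 = '定位手机定位汽车定位GPS定位人定位位置查询'
str_1 = '导航定位手机定位汽车定位GPS定位人定位位置查询'

str_sim(str_0, str_1, 5)
0.75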
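The same intersection-over-union idea also works without the top-n cutoff and on any iterable, for example a list of words. The helper below is a hypothetical variation, not part of the original article; the name `multiset_jaccard` and the convention that two empty inputs score 1.0 are assumptions made for this sketch.

```python
from collections import Counter

def multiset_jaccard(a, b):
    """Intersection total divided by union total (hypothetical helper)."""
    ca, cb = Counter(a), Counter(b)
    union_total = sum((ca | cb).values())
    if union_total == 0:
        return 1.0  # assumption: two empty inputs count as identical
    return sum((ca & cb).values()) / union_total

# Characters: 'AAB' and 'ABB' share one A and one B out of a union of four.
char_score = multiset_jaccard('AAB', 'ABB')

# Words: two of the four distinct words are shared.
word_score = multiset_jaccard('red green blue'.split(), 'red green pink'.split())

print(char_score, word_score)
```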

7. Total count, keys() and values()

from collections import Counter
 
c = Counter('ABCABCCC')
print(sum(c.values()))  # 8  total of all counts
 
print(c.keys())  #dict_keys(['A', 'B', 'C'])
print(c.values())  #dict_values([2, 2, 4])
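On Python 3.10 and later, Counter also has a total() method that returns the same sum. The sketch below guards the call with hasattr so that it still runs on older versions; the variable names are illustrative.

```python
from collections import Counter

c = Counter('ABCABCCC')

# Sum of all counts, which works on every Python 3 version.
total = sum(c.values())

# Counter.total() was added in Python 3.10; fall back to the sum elsewhere.
total_via_method = c.total() if hasattr(c, 'total') else sum(c.values())

# len() gives the number of distinct elements, not the sum of counts.
distinct = len(c)

print(total, total_via_method, distinct)
```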

8. Looking up a single element

from collections import Counter
c = Counter('ABBCC')
# Look up the count of one specific element
print(c["A"])  #1
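Looking up an element that was never counted behaves differently from a plain dict. A small sketch with illustrative variable names:

```python
from collections import Counter

c = Counter('ABBCC')

# Indexing a missing element returns 0 instead of raising KeyError.
missing = c['Z']

# get() is the ordinary dict method, so without a default it returns None.
via_get = c.get('Z')
via_get_default = c.get('Z', 0)

# None of these lookups adds the key to the counter.
was_created = 'Z' in c

print(missing, via_get, via_get_default, was_created)
```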

9. Adding

for elem in 'ADD':  # update counts from an iterable
    c[elem] += 1
print(c.most_common())  #[('A', 2), ('B', 2), ('C', 2), ('D', 2)]
# "A" gained one count, and "D" was added with a count of two

10. Deleting (del)

del c["D"]
print(c.most_common())  #[('A', 2), ('B', 2), ('C', 2)]
del c["C"]
print(c.most_common())  #[('A', 2), ('B', 2)]
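Setting a count to zero is not the same as deleting the element. The sketch below starts from a fresh Counter so that it runs on its own; the variable names are illustrative.

```python
from collections import Counter

c = Counter('ABBCC')

# A zero count leaves the key in the counter.
c['A'] = 0
keys_after_zero = list(c)

# del removes the entry entirely.
del c['A']
keys_after_del = list(c)

# Unlike a plain dict, deleting a missing key from a Counter
# is silently ignored rather than raising KeyError.
del c['Z']

print(keys_after_zero, keys_after_del)
```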

11. Updating with update()

d = Counter("CCDD")
c.update(d)
print(c.most_common())  #[('A', 2), ('B', 2), ('C', 2), ('D', 2)]

12. Clearing with clear()

c.clear()
print(c)  #Counter()

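III. Summary

Counter is a dict subclass used mainly to count how often each object occurs.

Common methods:

elements(): returns an iterator that repeats each element as many times as its count; elements with a count less than 1 are ignored.
most_common([n]): returns a list of the n most frequent elements and their counts.
subtract([iterable-or-mapping]): subtracts the elements of an iterable or mapping; inputs and results may be zero or negative, unlike the - operator, which drops such results.
update([iterable-or-mapping]): counts the elements of an iterable, or adds in the counts from another mapping (or counter).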

Examples:

# Count how often each character occurs
>>> import collections
>>> collections.Counter('hello world')
Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})
# Count words
>>> collections.Counter('hello world hello world hello nihao'.split())
Counter({'hello': 3, 'world': 2, 'nihao': 1})

Common usage:

>>> c = collections.Counter('hello world hello world hello nihao'.split())
>>> c
Counter({'hello': 3, 'world': 2, 'nihao': 1})
# Get the count for a given object; get() also works
>>> c['hello']
3
>>> c = collections.Counter('hello world hello world hello nihao'.split())
# List the elements
>>> list(c.elements())
['hello', 'hello', 'hello', 'world', 'world', 'nihao']
# Add counts; c.update(d) does the same in place
>>> c = collections.Counter('hello world hello world hello nihao'.split())
>>> d = collections.Counter('hello world'.split())
>>> c
Counter({'hello': 3, 'world': 2, 'nihao': 1})
>>> d
Counter({'hello': 1, 'world': 1})
>>> c + d
Counter({'hello': 4, 'world': 3, 'nihao': 1})
# Subtract counts; c.subtract(d) works in place and keeps zero and negative counts
>>> c - d
Counter({'hello': 2, 'world': 1, 'nihao': 1})
# Clear
>>> c.clear()
>>> c
Counter()
