💻 Benchmarks in Pytest

pytest bench

Based on Pytest 8.3.4

Official source repository

raywong's fork

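Source path: pytest/bench/

This directory holds the benchmark code.
Benchmarks measure and evaluate how code performs, for example execution speed and memory usage.
Results can be compared across versions to assess how performance has changed.

The directory contains 7 files: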

bench.py
bench_argcomplete.py
empty.py
manyparam.py
skip.py
unit_test.py
xunit.py

Each one is described below.

`bench.py`

source code file

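The main benchmark driver script.
It runs pytest against a target file under the profiler and prints the collected statistics.

It uses cProfile to profile pytest while it runs the tests, and pstats to turn the profile into performance statistics.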

cProfile

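Python's built-in profiler.

It collects detailed information about a running program, including:

Number of calls to each function

Execution time of each function

Call relationships between functions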

pstats

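Processes the profiling data produced by cProfile.

Main features:

Reading profiling results

Sorting the statistics (for example by cumulative time or call count)

Formatting the output

Filtering and analyzing the data

Output fields:

ncalls: number of calls to the function; a value such as 702/1 means total calls / primitive (non-recursive) calls

tottime: time spent in the function itself, excluding sub-functions

percall: average time per call; the first percall column is tottime divided by ncalls, the second is cumtime divided by primitive calls

cumtime: cumulative time spent in the function, including sub-functions

filename:lineno(function): the file name, line number, and name of the function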
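As a rough illustration of how cProfile and pstats fit together, the sketch below profiles a toy workload and prints the five most expensive entries sorted by cumulative time. The functions `busy` and `workload` are hypothetical stand-ins invented for this example; they are not part of bench.py.

```python
import cProfile
import io
import pstats


def busy(n: int) -> int:
    """Toy workload so the profiler has something to measure."""
    return sum(i * i for i in range(n))


def workload() -> int:
    """Call busy() repeatedly so it shows up with ncalls > 1."""
    return sum(busy(2_000) for _ in range(50))


# Collect profiling data for one call of workload().
profiler = cProfile.Profile()
profiler.enable()
result = workload()
profiler.disable()

# Load the data into pstats, strip directory names, sort by cumulative
# time, and write the top 5 rows into an in-memory buffer.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.strip_dirs().sort_stats("cumulative").print_stats(5)

report = buffer.getvalue()
print(report)
```

The printed table has the same columns as the ones described above (ncalls, tottime, percall, cumtime, percall, filename:lineno(function)).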

Usage:

# Runs empty.py by default
$ python3 bench.py

# Or pass a specific file, for example:
$ python3 bench.py bench_argcomplete.py

`bench_argcomplete.py`

source code file

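Benchmark for command-line argument auto-completion.
It measures how fast path completion runs.

It compares the performance of two file completers:

FilesCompleter from the argcomplete library
FastFilesCompleter from pytest (pytest's optimized version, faster than the original)

For each completer:
1. Import and instantiate the completer (setup phase)
2. Run path completion 1000 times (run phase)
3. Compute and print the total execution time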
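The three steps above can be sketched with timeit alone. To keep the example runnable without argcomplete or pytest installed, it times two hypothetical stand-in completers (`GlobCompleter` and `ListdirCompleter`, both invented here) against a temporary directory. It is a minimal sketch of the measurement pattern, not the code of bench_argcomplete.py.

```python
import glob
import os
import tempfile
import timeit


class GlobCompleter:
    """Hypothetical stand-in completer: expands a glob pattern."""

    def __call__(self, prefix: str) -> list[str]:
        return sorted(glob.glob(glob.escape(prefix) + "*"))


class ListdirCompleter:
    """Hypothetical stand-in completer: filters a single directory listing."""

    def __call__(self, prefix: str) -> list[str]:
        directory, partial = os.path.split(prefix)
        names = os.listdir(directory or ".")
        return sorted(os.path.join(directory, n) for n in names if n.startswith(partial))


COUNT = 1000
SETUP = "fc = completer()"  # step 1: instantiate the completer (setup phase)
RUN = "fc(prefix)"          # step 2: one path completion (run phase)

timings = {}
with tempfile.TemporaryDirectory() as tmp:
    for name in ("data.txt", "docs.md", "other.log"):
        open(os.path.join(tmp, name), "w").close()
    prefix = os.path.join(tmp, "d")

    for completer in (GlobCompleter, ListdirCompleter):
        namespace = {"completer": completer, "prefix": prefix}
        # step 3: total time for COUNT executions of RUN
        timings[completer.__name__] = timeit.timeit(
            RUN, setup=SETUP, number=COUNT, globals=namespace
        )
    # Sanity check data: both stand-ins should complete to the same paths.
    sample = {cls.__name__: cls()(prefix) for cls in (GlobCompleter, ListdirCompleter)}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s for {COUNT} calls")
```

The setup string runs once per measurement, so the cost of creating the completer is excluded from the reported time.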

timeit

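Part of the Python standard library; measures the execution time of small code snippets.

Common parameters:

stmt: the code to time, as a string or a callable

setup: setup code, as a string or a callable; it runs once before timing starts

timer: the timer function, time.perf_counter() by default

number: how many times to execute stmt, 1000000 by default

globals: a namespace (dict) in which the code is executed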
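A short sketch of these parameters in use. The statement being timed (sorting a reversed list) is an arbitrary example chosen for illustration.

```python
import time
import timeit

# stmt and setup given as strings; setup runs once, stmt runs `number` times.
as_string = timeit.timeit(
    stmt="sorted(values)",
    setup="values = list(range(1000, 0, -1))",
    number=2_000,
)

# stmt given as a callable, with the timer passed explicitly.
values = list(range(1000, 0, -1))
as_callable = timeit.timeit(
    stmt=lambda: sorted(values),
    timer=time.perf_counter,
    number=2_000,
)

# repeat() performs the whole measurement several times; taking the minimum
# reduces noise from other processes. globals makes `values` visible to stmt.
best = min(
    timeit.repeat("sorted(values)", globals={"values": values}, repeat=3, number=2_000)
)

print(f"string stmt:   {as_string:.4f} s")
print(f"callable stmt: {as_callable:.4f} s")
print(f"best of 3:     {best:.4f} s")
```

Note that the returned value is the total time for all `number` executions, not the time per call.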

argcomplete

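A command-line argument auto-completion library.

Mainly used for:

Auto-completing command-line arguments

Auto-completing file paths

Demo: comparing the performance of the two file completers

# Run bench_argcomplete.py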
$ python3 bench_argcomplete.py

argcomplete - FilesCompleter:
8.29129308499978
pytest - FastFilesCompleter:
0.060678875001030974

Demo:

# bench.py: print only the first 10 entries
$ python3 bench.py bench_argcomplete.py
Mon Jan 27 09:24:41 2025    prof

         712535 function calls (702969 primitive calls) in 0.758 seconds

   Ordered by: cumulative time
   List reduced from 3982 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       63    0.001    0.000    0.828    0.013 __init__.py:1(<module>)
    702/1    0.020    0.000    0.759    0.759 {built-in method builtins.exec}
        1    0.000    0.000    0.759    0.759 __init__.py:139(main)
        1    0.000    0.000    0.759    0.759 __init__.py:318(_prepareconfig)
   107/70    0.000    0.000    0.732    0.010 _manager.py:111(_hookexec)
   107/70    0.000    0.000    0.732    0.010 _callers.py:53(_multicall)
      2/1    0.000    0.000    0.730    0.730 _hooks.py:498(__call__)
        1    0.000    0.000    0.730    0.730 __init__.py:1136(pytest_cmdline_parse)
        1    0.000    0.000    0.730    0.730 __init__.py:1486(parse)
        1    0.000    0.000    0.730    0.730 __init__.py:1358(_preparse)


<pstats.Stats object at 0x10a18dd30>

`empty.py`

source code file

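An empty test case.
It establishes the performance baseline for the other benchmarks.
This is the most basic scenario: it measures how pytest copes with a large number of test functions, that is, the time spent discovering and loading them.

The code is generated dynamically with exec(). Each loop iteration creates one new test function, named test_func_0, test_func_1, …, test_func_999.
Every function is empty (its body is just a pass statement).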
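The generation technique can be sketched as follows. One deliberate difference from a real test module: this sketch writes the functions into a separate dictionary so they can be counted, whereas a benchmark file would define them at module level for pytest to discover.

```python
# Namespace that receives the generated functions (an assumption of this
# sketch; at module level exec() would use the module's own globals).
namespace: dict[str, object] = {}

for i in range(1000):
    # Each iteration defines one empty test function.
    exec(f"def test_func_{i}(): pass", namespace)

# exec() also adds "__builtins__" to the dict, so filter by prefix.
generated = [name for name in namespace if name.startswith("test_func_")]

print(len(generated))                # 1000
print(generated[0], generated[-1])   # test_func_0 test_func_999
```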

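Interaction with pytest

empty.py contains no code that talks to pytest directly, but it follows pytest's naming conventions, which drive pytest's test discovery:

By default, pytest looks for files whose names match test_*.py or *_test.py.
Within those files, it collects:
- functions whose names start with test_
- methods whose names start with test_, inside classes whose names start with Test

As a result, pytest finds and runs these test functions through its discovery mechanism. The file name empty.py does not match the default file patterns, but a file passed explicitly on the command line is collected anyway, so bench.py can run the tests with pytest.cmdline.main(["empty.py"]).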
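To make the naming rules concrete, here is a deliberately simplified imitation of them using only the standard library. It is not pytest's collector: the helper names `matches_default_file_pattern` and `collect_names` are invented for this sketch, and real collection handles many more cases (configuration options, classes with `__init__`, fixtures, and so on).

```python
import fnmatch
import inspect
import types

FILE_PATTERNS = ("test_*.py", "*_test.py")  # the default file-name patterns


def matches_default_file_pattern(filename: str) -> bool:
    """True if the file name matches one of the default patterns."""
    return any(fnmatch.fnmatchcase(filename, p) for p in FILE_PATTERNS)


def collect_names(module: types.ModuleType) -> list[str]:
    """Return names that the simplified rules would treat as tests."""
    names = []
    for name, obj in vars(module).items():
        if name.startswith("test_") and inspect.isfunction(obj):
            names.append(name)
        elif name.startswith("Test") and inspect.isclass(obj):
            for attr, member in vars(obj).items():
                if attr.startswith("test_") and inspect.isfunction(member):
                    names.append(f"{name}.{attr}")
    return names


# Build a small throwaway module to run the rules against.
demo = types.ModuleType("demo")
exec(
    "def test_a(): pass\n"
    "def helper(): pass\n"
    "class TestB:\n"
    "    def test_c(self): pass\n"
    "    def util(self): pass\n",
    demo.__dict__,
)

print(matches_default_file_pattern("empty.py"))       # False
print(matches_default_file_pattern("test_empty.py"))  # True
print(collect_names(demo))                            # ['test_a', 'TestB.test_c']
```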

Demo:

# bench.py: print only the first 10 entries
$ python3 bench.py empty.py

Mon Jan 27 09:42:51 2025    prof

         712318 function calls (702752 primitive calls) in 1.375 seconds

   Ordered by: cumulative time
   List reduced from 3982 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       63    0.002    0.000    1.556    0.025 __init__.py:1(<module>)
    702/1    0.028    0.000    1.376    1.376 {built-in method builtins.exec}
        1    0.000    0.000    1.376    1.376 __init__.py:139(main)
        1    0.000    0.000    1.376    1.376 __init__.py:318(_prepareconfig)
   107/70    0.000    0.000    1.330    0.019 _manager.py:111(_hookexec)
   107/70    0.000    0.000    1.330    0.019 _callers.py:53(_multicall)
      2/1    0.000    0.000    1.328    1.328 _hooks.py:498(__call__)
        1    0.000    0.000    1.328    1.328 __init__.py:1136(pytest_cmdline_parse)
        1    0.000    0.000    1.328    1.328 __init__.py:1486(parse)
        1    0.000    0.000    1.328    1.328 __init__.py:1358(_preparse)

<pstats.Stats object at 0x101afaf30>

`manyparam.py`

source code file

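Benchmark for heavily parametrized tests.
It measures performance when a large number of parameters has to be handled.
Compared with empty.py, it shows the extra overhead that parametrization adds.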
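A minimal sketch of what a heavily parametrized benchmark module could look like, assuming a parametrized fixture is used. The fixture name `number`, the test names, and the parameter count are assumptions chosen for illustration, not taken from manyparam.py. It requires pytest to be installed.

```python
import pytest

PARAM_COUNT = 500  # assumed size, chosen only for illustration


@pytest.fixture(scope="module", params=range(PARAM_COUNT))
def number(request):
    """Parametrized fixture: every test using it runs once per parameter."""
    return request.param


def test_first(number):
    """Empty body, so the run time is dominated by pytest's own machinery."""
    pass


def test_second(number):
    pass
```

Saved to a file and run with pytest, this expands into 2 × 500 = 1000 test items, the same count as 1000 plain empty functions, which makes a side-by-side comparison of the two runs meaningful.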

Demo:

# bench.py: print only the first 10 entries
$ python3 bench.py manyparam.py

Mon Jan 27 09:54:32 2025    prof

         712398 function calls (702832 primitive calls) in 1.347 seconds

   Ordered by: cumulative time
   List reduced from 3982 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       63    0.002    0.000    1.396    0.022 __init__.py:1(<module>)
    702/1    0.029    0.000    1.349    1.349 {built-in method builtins.exec}
        1    0.000    0.000    1.348    1.348 __init__.py:139(main)
        1    0.000    0.000    1.348    1.348 __init__.py:318(_prepareconfig)
   107/70    0.000    0.000    1.301    0.019 _manager.py:111(_hookexec)
   107/70    0.001    0.000    1.301    0.019 _callers.py:53(_multicall)
      2/1    0.000    0.000    1.298    1.298 _hooks.py:498(__call__)
        1    0.000    0.000    1.298    1.298 __init__.py:1136(pytest_cmdline_parse)
        1    0.000    0.000    1.298    1.298 __init__.py:1486(parse)
        1    0.000    0.000    1.298    1.298 __init__.py:1358(_preparse)

<pstats.Stats object at 0x10fc3a570>

`skip.py`

source code file

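Contains test cases that are skipped.
It exercises pytest's ability to skip specific tests.

It evaluates how efficient the skip mechanism is.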
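A minimal sketch of a skip-heavy benchmark module, assuming the skip is triggered imperatively inside the test body. The test name, the `SKIP` switch, and the parameter count are assumptions for illustration rather than the contents of skip.py. It requires pytest to be installed.

```python
import pytest

SKIP = True  # flip to False to time the same tests without skipping


@pytest.mark.parametrize("x", range(1000))
def test_skipped(x):
    """Each of the 1000 generated items is skipped as soon as it starts."""
    if SKIP:
        pytest.skip("skipped on purpose for the benchmark")
```

Because the test bodies do nothing else, the measured time reflects the cost of setting up, skipping, and reporting each item.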

Demo:

# bench.py: print only the first 10 entries
$ python3 bench.py skip.py

Mon Jan 27 09:58:38 2025    prof

         712529 function calls (702963 primitive calls) in 1.401 seconds

   Ordered by: cumulative time
   List reduced from 3982 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       63    0.002    0.000    1.572    0.025 __init__.py:1(<module>)
    702/1    0.030    0.000    1.403    1.403 {built-in method builtins.exec}
        1    0.000    0.000    1.403    1.403 __init__.py:139(main)
        1    0.000    0.000    1.401    1.401 __init__.py:318(_prepareconfig)
   107/70    0.000    0.000    1.329    0.019 _manager.py:111(_hookexec)
   107/70    0.000    0.000    1.329    0.019 _callers.py:53(_multicall)
      2/1    0.000    0.000    1.326    1.326 _hooks.py:498(__call__)
        1    0.000    0.000    1.326    1.326 __init__.py:1136(pytest_cmdline_parse)
        1    0.000    0.000    1.326    1.326 __init__.py:1486(parse)
        1    0.000    0.000    1.326    1.326 __init__.py:1358(_preparse)

<pstats.Stats object at 0x106579f10>

`unit_test.py`

source code file

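Benchmark that measures how pytest handles a large number of unittest-style test classes (the overhead of class inheritance and class-level methods).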
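The idea of generating many unittest-style classes can be sketched with the standard library alone. The class count and the shape of each generated class are assumptions made for this example, and the sketch runs the classes with unittest's own loader instead of pytest so that it stays self-contained.

```python
import unittest

CLASS_COUNT = 100  # assumed size; kept small so the sketch runs instantly

# Namespace for the generated classes; TestCase must be visible to exec().
namespace = {"TestCase": unittest.TestCase}
for i in range(CLASS_COUNT):
    exec(
        f"""
class Test{i}(TestCase):
    @classmethod
    def setUpClass(cls): pass
    def test_1(self): pass
    def test_2(self): pass
""",
        namespace,
    )

# Load every generated TestCase subclass into one suite.
loader = unittest.TestLoader()
suite = unittest.TestSuite(
    loader.loadTestsFromTestCase(cls)
    for cls in namespace.values()
    if isinstance(cls, type)
    and issubclass(cls, unittest.TestCase)
    and cls is not unittest.TestCase
)

result = unittest.TestResult()
suite.run(result)
print(suite.countTestCases())  # 200
print(result.wasSuccessful())  # True
```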

Demo:

# bench.py: print only the first 10 entries
$ python3 bench.py unit_test.py

Mon Jan 27 10:03:07 2025    prof

         712256 function calls (702690 primitive calls) in 1.350 seconds

   Ordered by: cumulative time
   List reduced from 3982 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       63    0.002    0.000    1.483    0.024 __init__.py:1(<module>)
    702/1    0.028    0.000    1.352    1.352 {built-in method builtins.exec}
        1    0.000    0.000    1.352    1.352 __init__.py:139(main)
        1    0.000    0.000    1.351    1.351 __init__.py:318(_prepareconfig)
   107/70    0.000    0.000    1.305    0.019 _manager.py:111(_hookexec)
   107/70    0.000    0.000    1.305    0.019 _callers.py:53(_multicall)
      2/1    0.000    0.000    1.302    1.302 _hooks.py:498(__call__)
        1    0.000    0.000    1.302    1.302 __init__.py:1136(pytest_cmdline_parse)
        1    0.000    0.000    1.302    1.302 __init__.py:1486(parse)
        1    0.000    0.000    1.302    1.302 __init__.py:1358(_preparse)


<pstats.Stats object at 0x10397dd30>

`xunit.py`

source code file

Benchmark that measures how pytest handles xUnit-style test classes.

unittest vs. xUnit

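from unittest import TestCase

# Class in xunit.py (native pytest style)
class Test0:                          # no base class needed
    @classmethod
    def setup_class(cls): pass        # pytest's setup_class hook

# Class in unit_test.py (unittest style)
class Test0(TestCase):                # inherits from unittest.TestCase
    @classmethod
    def setUpClass(cls): pass         # unittest's setUpClass hook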

Demo:

# bench.py: print only the first 10 entries
$ python3 bench.py xunit.py

Mon Jan 27 10:10:54 2025    prof

         712452 function calls (702886 primitive calls) in 1.366 seconds

   Ordered by: cumulative time
   List reduced from 3982 to 10 due to restriction <10>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       63    0.002    0.000    1.511    0.024 __init__.py:1(<module>)
    702/1    0.032    0.000    1.368    1.368 {built-in method builtins.exec}
        1    0.000    0.000    1.367    1.367 __init__.py:139(main)
        1    0.000    0.000    1.367    1.367 __init__.py:318(_prepareconfig)
   107/70    0.000    0.000    1.320    0.019 _manager.py:111(_hookexec)
   107/70    0.001    0.000    1.320    0.019 _callers.py:53(_multicall)
      2/1    0.000    0.000    1.318    1.318 _hooks.py:498(__call__)
        1    0.000    0.000    1.318    1.318 __init__.py:1136(pytest_cmdline_parse)
        1    0.000    0.000    1.318    1.318 __init__.py:1486(parse)
        1    0.000    0.000    1.318    1.318 __init__.py:1358(_preparse)


<pstats.Stats object at 0x10fbcef30>

That's all.