0

0

Why %util number from iostat is meaningless for MySQL capaci_MySQL

php中文网

php中文网

发布时间:2016-06-01 13:07:34

|

1181人浏览过

|

来源于php中文网

原创

earlier this month i wrote aboutvmstat iowait cpu numbersand some of the comments i got were advertising the use of util% as reported by the iostat tool instead. i find this number even more useless for mysql performance tuning and capacity planning.

Now let me start by saying this is a really tricky and deceptive number. Many DBAs who report instances of their systems having a very busy IO subsystem said the util% in vmstat was above 99% and therefore they believe this number is a good indicator of an overloaded IO subsystem.

Indeed – when your IO subsystem is busy, up to its full capacity, the utilization should be very close to 100%. However, it is perfectly possible for the IO subsystem and MySQL with it to have plenty more capacity than when utilization is showing 100% – as I will show in an example.

Before that though lets see what the iostat manual page has to say on this topic – fromthis main pagewe can read:

Memories.ai
Memories.ai

专注于视频解析的AI视觉记忆模型

下载

%util

Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100% for devices serving requests serially. But for devices serving requests in parallel, such as RAID arrays and modern SSDs, this number does not reflect their performance limits.

Which says right here that the number is useless for pretty much any production database server that is likely to be running RAID, Flash/SSD, SAN or cloud storage (such as EBS) capable of handling multiple requests in parallel.

Let’s look at the following illustration. I will run sysbench on a system with a rather slow storage data size larger than buffer pool and uniform distribution to put pressure on the IO subsystem. I will use a read-only benchmark here as it keeps things more simple…

sysbench –num-threads=1 –max-requests=0 –max-time=6000000 –report-interval=10 –test=oltp –oltp-read-only=on –db-driver=mysql –oltp-table-size=100000000 –oltp-dist-type=uniform –init-rng=on –mysql-user=root –mysql-password= run

I’m seeing some 9 transactions per second, while disk utilization from iostat is at nearly 96%:

[ 80s] threads: 1, tps: 9.30, reads/s: 130.20, writes/s: 0.00 response time: 171.82ms (95%)

[ 90s] threads: 1, tps: 9.20, reads/s: 128.80, writes/s: 0.00 response time: 157.72ms (95%)

[ 100s] threads: 1, tps: 9.00, reads/s: 126.00, writes/s: 0.00 response time: 215.38ms (95%)

[ 110s] threads: 1, tps: 9.30, reads/s: 130.20, writes/s: 0.00 response time: 141.39ms (95%)

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %utildm-0 0.00 0.00 127.90 0.70 4070.40 28.00 31.87 1.01 7.83 7.52 96.68

This makes a lot of sense – with read single thread read workload the drive should be only used getting data needed by the query, which will not be 100% as there is some extra time needed to process the query on the MySQL side as well as passing the result set back to sysbench.

So 96% utilization; 9 transactions per second, this is a close to full-system capacity with less than 5% of device time to spare, right?

Let’s run a benchmark with more concurrency – 4 threads at the time; we’ll see…

[ 110s] threads: 4, tps: 21.10, reads/s: 295.40, writes/s: 0.00 response time: 312.09ms (95%)

[ 120s] threads: 4, tps: 22.00, reads/s: 308.00, writes/s: 0.00 response time: 297.05ms (95%)

[ 130s] threads: 4, tps: 22.40, reads/s: 313.60, writes/s: 0.00 response time: 335.34ms (95%)

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %utildm-0 0.00 0.00 295.40 0.90 9372.80 35.20 31.75 4.06 13.69 3.38 100.01

So we’re seeing 100% utilization now, but what is interesting – we’re able to reclaim much more than less than 5% which was left if we look at utilization – throughput of the system increased about 2.5x

Finally let’s do the test with 64 threads – this is more concurrency than exists at storage level which is conventional hard drives in RAID on this system…

[ 70s] threads: 64, tps: 42.90, reads/s: 600.60, writes/s: 0.00 response time: 2516.43ms (95%)

[ 80s] threads: 64, tps: 42.40, reads/s: 593.60, writes/s: 0.00 response time: 2613.15ms (95%)

[ 90s] threads: 64, tps: 44.80, reads/s: 627.20, writes/s: 0.00 response time: 2404.49ms (95%)

Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %utildm-0 0.00 0.00 601.20 0.80 19065.60 33.60 31.73 65.98 108.72 1.66 100.00

In this case we’re getting 4.5x of throughput compared to single thread and 100% utilization. We’re also getting almost double throughput of the run with 4 thread where 100% utilization was reported. This makes sense – there are 4 drives which can work in parallel and with many outstanding requests they are able to optimize their seeks better hence giving a bit more than 4x.

So what have we so ? The system which was 96% capacity but which could have driven still to provide 4.5x throughput – so it had plenty of extra IO capacity. More powerful storage might have significantly more ability to run requests in parallel so it is quite possible to see 10x or more room after utilization% starts to be reported close to 100%

So if utilization% is not very helpful what can we use to understand our database IO capacity better ? First lets look at the performance reported from those sysbench runs. If we look at 95% response time you can see 1 thread and 4 threads had relatively close 95% time growing just from 150ms to 250-300ms. This is the number I really like to look at- if system is able to respond to the queries with response time not significantly higher than it has with concurrency of 1 it is not overloaded. I like using 3x as multiplier – ie when 95% spikes to be more than 3x of the single concurrency the system might be getting to the overload.

With 64 threads the 95% response time is 15-20x of the one we see with single thread so it is surely overloaded.

Do we have anything reported by iostat which we can use in a similar way? It turns out we do! Check out the “await” column which tells us how much the requester had to wait for the IO request to be serviced. With single concurrency it is 7.8ms which is what this drives can do for random IO and is as good as it gets. With 4 threads it is 13.7ms – less than double of best possible, so also good enough… with concurrency of 64 it is however 108ms which is over 10x of what this device could produce with no waiting and which is surely sign of overload.

A couple words of caution. First, do not look at svctm which is not designed with parallel processing in mind. You can see in our case it actually gets better with high concurrency while really database had to wait a lot longer for requests submitted. Second, iostat mixes together reads and writes in single statistics which specifically for databases and especially on RAID with BBU might be like mixing apples and oranges together – writes should go to writeback cache and be acknowledged essentially instantly while reads only complete when actual data can be delivered. The toolpt-diskstatsfromPercona Tookitbreaks them apart and so can be much more for storage tuning for database workload.

Final note – I used a read-only workload on purpose – when it comes to writes things can be even more complicated – MySQL buffer pool can be more efficient with more intensive writes plus group commit might be able to commit a lot of transactions with single disk write. Still, the same base logic will apply.

Summary:The take away is simple – util% only shows if a device has at least one operation to deal with or is completely busy, which does not reflect actual utilization for a majority of modern IO subsystems. So you may have a lot of storage IO capacity left even when utilization is close to 100%.

本站声明:本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn

热门AI工具

更多
DeepSeek
DeepSeek

幻方量化公司旗下的开源大模型平台

豆包大模型
豆包大模型

字节跳动自主研发的一系列大型语言模型

通义千问
通义千问

阿里巴巴推出的全能AI助手

腾讯元宝
腾讯元宝

腾讯混元平台推出的AI助手

文心一言
文心一言

文心一言是百度开发的AI聊天机器人,通过对话可以生成各种形式的内容。

讯飞写作
讯飞写作

基于讯飞星火大模型的AI写作工具,可以快速生成新闻稿件、品宣文案、工作总结、心得体会等各种文文稿

即梦AI
即梦AI

一站式AI创作平台,免费AI图片和视频生成。

ChatGPT
ChatGPT

最最强大的AI聊天机器人程序,ChatGPT不单是聊天机器人,还能进行撰写邮件、视频脚本、文案、翻译、代码等任务。

相关专题

更多
Python 序列化
Python 序列化

本专题整合了python序列化、反序列化相关内容,阅读专题下面的文章了解更多详细内容。

0

2026.02.02

AO3官网入口与中文阅读设置 AO3网页版使用与访问
AO3官网入口与中文阅读设置 AO3网页版使用与访问

本专题围绕 Archive of Our Own(AO3)官网入口展开,系统整理 AO3 最新可用官网地址、网页版访问方式、正确打开链接的方法,并详细讲解 AO3 中文界面设置、阅读语言切换及基础使用流程,帮助用户稳定访问 AO3 官网,高效完成中文阅读与作品浏览。

91

2026.02.02

主流快递单号查询入口 实时物流进度一站式追踪专题
主流快递单号查询入口 实时物流进度一站式追踪专题

本专题聚合极兔快递、京东快递、中通快递、圆通快递、韵达快递等主流物流平台的单号查询与运单追踪内容,重点解决单号查询、手机号查物流、官网入口直达、包裹进度实时追踪等高频问题,帮助用户快速获取最新物流状态,提升查件效率与使用体验。

27

2026.02.02

Golang WebAssembly(WASM)开发入门
Golang WebAssembly(WASM)开发入门

本专题系统讲解 Golang 在 WebAssembly(WASM)开发中的实践方法,涵盖 WASM 基础原理、Go 编译到 WASM 的流程、与 JavaScript 的交互方式、性能与体积优化,以及典型应用场景(如前端计算、跨平台模块)。帮助开发者掌握 Go 在新一代 Web 技术栈中的应用能力。

11

2026.02.02

PHP Swoole 高性能服务开发
PHP Swoole 高性能服务开发

本专题聚焦 PHP Swoole 扩展在高性能服务端开发中的应用,系统讲解协程模型、异步IO、TCP/HTTP/WebSocket服务器、进程与任务管理、常驻内存架构设计。通过实战案例,帮助开发者掌握 使用 PHP 构建高并发、低延迟服务端应用的工程化能力。

5

2026.02.02

Java JNI 与本地代码交互实战
Java JNI 与本地代码交互实战

本专题系统讲解 Java 通过 JNI 调用 C/C++ 本地代码的核心机制,涵盖 JNI 基本原理、数据类型映射、内存管理、异常处理、性能优化策略以及典型应用场景(如高性能计算、底层库封装)。通过实战示例,帮助开发者掌握 Java 与本地代码混合开发的完整流程。

5

2026.02.02

go语言 注释编码
go语言 注释编码

本专题整合了go语言注释、注释规范等等内容,阅读专题下面的文章了解更多详细内容。

62

2026.01.31

go语言 math包
go语言 math包

本专题整合了go语言math包相关内容,阅读专题下面的文章了解更多详细内容。

55

2026.01.31

go语言输入函数
go语言输入函数

本专题整合了go语言输入相关教程内容,阅读专题下面的文章了解更多详细内容。

27

2026.01.31

热门下载

更多
网站特效
/
网站源码
/
网站素材
/
前端模板

精品课程

更多
相关推荐
/
热门推荐
/
最新课程
关于我们 免责申明 举报中心 意见反馈 讲师合作 广告合作 最新更新
php中文网:公益在线php培训,帮助PHP学习者快速成长!
关注服务号 技术交流群
PHP中文网订阅号
每天精选资源文章推送

Copyright 2014-2026 https://www.php.cn/ All Rights Reserved | php.cn | 湘ICP备2023035733号