CREATE RESOURCE GROUP rg1 RU_per_SEC=1;
CREATE GLOBAL BINDING FOR
SELECT COUNT(*) FROM t1 t JOIN t1 t2 USING (uid)
USING
SELECT /*+ resource_group(rg1) */ COUNT(*) FROM t1 t JOIN t1 t2 USING (uid);
EXPLAIN ANALYZE FORMAT = 'verbose' SELECT COUNT(*) FROM t t1 JOIN t1 t2 USING (uid);
为了实现上述场景,TiDB 实现了若干 SQL 语法,接下来看看如何具体操作。
增:
-- 创建 `rg1` 资源组,限额是每秒 `500 RU`,优先级为 `HIGH`,允许这个资源组的超额 (`BURSTABLE`) 占用资源。
CREATE RESOURCE GROUP IF NOT EXISTS rg1
RU_PER_SEC = 100
PRIORITY = HIGH
BURSTABLE;
-- 创建 `rg2` 资源组,限额是每秒 `500 RU`,其他选项为默认值。
CREATE RESOURCE GROUP rg2 RU_PER_SEC = 200;
删:
-- 删除资源组
DROP RESOURCE GROUP rg2;
改:
-- 修改资源组资源配置
ALTER RESOURCE GROUP rg2 ru_per_sec = 2000;
查:
-- 通过 I_S 表查看
SELECT * FROM INFORMATION_SCHEMA.RESOURCE_GROUPS;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER or RESOURCE_GROUP_ADMIN privilege(s) for this operation
将某个用户绑定到资源组
-- CREATE
CREATE USER u3 RESOURCE GROUP rg3;
-- ALTER
ALTER USER u2 RESOURCE GROUP rg2;
mysql> SELECT User, JSON_EXTRACT(User_attributes, "$.resource_group") AS RG FROM mysql.user ;
+------+-----------+
| User | RG |
+------+-----------+
| root | NULL |
| u1 | "default" |
| u2 | "rg2" |
| u3 | "" |
+------+-----------+
4 rows in set (0.00 sec)
SELECT * FROM INFORMATION_SCHEMA.USER_ATTRIBUTES;
只是目前效果不如意,期待下个版本会得到改进。相关帖子参见:resource control 相关,INFORMATION_SCHEMA.USER_ATTRIBUTES 表取值问题[4]
将当前会话绑定到资源组
SET RESOURCE GROUP rg1;
查看当前会话所使用的资源组
SELECT CURRENT_RESOURCE_GROUP();
重置当前会话绑定资源组
SET RESOURCE GROUP `default`;
SET RESOURCE GROUP ``;
SELECT /*+ RESOURCE_GROUP(rg1) */ * FROM t1 limit 10;
如何查看某条语句消耗的 RU 呢?可以通过实际执行计划来获取,举例如下:
EXPLAIN ANALYZE SELECT /*+ RESOURCE_GROUP(rg1) */ * FROM t1 limit 10;
mysql -h 192.168.195.128 -P 4000 -c -u u1 -e 'select sleep(1000)'
mysql -h 192.168.195.128 -P 4000 -c -u u2 -e 'SET RESOURCE GROUP `rg1`; select sleep(1000)'
mysql -h 192.168.195.128 -P 4000 -c -u u3 -e 'SELECT /*+ RESOURCE_GROUP(rg1) */ * sleep(1000)'
mysql> SELECT USER, RESOURCE_GROUP, INFO from INFORMATION_SCHEMA.PROCESSLIST ORDER BY USER;
+------+----------------+-------------------------------------------------------------------------------------+
| USER | RESOURCE_GROUP | INFO |
+------+----------------+-------------------------------------------------------------------------------------+
| root | default | SELECT USER, RESOURCE_GROUP, INFO from INFORMATION_SCHEMA.PROCESSLIST ORDER BY USER |
| u1 | default | select sleep(1000) |
| u2 | rg1 | select sleep(1000) |
| u3 | rg1 | SELECT /*+ RESOURCE_GROUP(rg1) */ sleep(1000) |
+------+----------------+-------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)
mysql> SELECT USER, RESOURCE_GROUP, INFO from INFORMATION_SCHEMA.PROCESSLIST ORDER BY USER;
+------+----------------+-------------------------------------------------------------------------------------+
| USER | RESOURCE_GROUP | INFO |
+------+----------------+-------------------------------------------------------------------------------------+
| u2 | rg2 | SELECT USER, RESOURCE_GROUP, INFO from INFORMATION_SCHEMA.PROCESSLIST ORDER BY USER |
| u2 | rg1 | select sleep(1000) |
+------+----------------+-------------------------------------------------------------------------------------+
2 rows in set (0.00 sec)
○ 对于表 USER_ATTRIBUTES ,普通用户可以查看所有用户的属性,如果【问题二】的功能实现,那么普通用户就可以查看到所有用户绑定的资源组。
○ 对于表 RESOURCE_GROUPS ,普通用户可以查看所有资源组。那么,关于 Bad SQL 问题,其实就有另外一种处理方式,开发者可以在 SQL 里写入 Hint,使得 Bad SQL 可以“越权”调用 default 资源组,轻则打破平衡,影响其他业务性能,重则打穿资源规划,再现 “一条 SQL 炮轰整个 TiDB 集群” 的威力。
TiDB:通过配置全局变量 tidb_enable_resource_control 控制是否打开资源组流控。
TiKV:通过配置参数 resource-control.enabled 控制是否使用基于资源组配额的请求调度。
-- tidb
SHOW VARIABLES LIKE "tidb_enable_resource_control";
-- tikv
SHOW CONFIG WHERE NAME LIKE "resource-control.enabled";
-- 默认 TPC-C 模型预测,等同于下一条命令
CALIBRATE RESOURCE;
-- 根据类似 TPC-C 的负载模型预测
CALIBRATE RESOURCE WORKLOAD TPCC;
-- 根据类似 sysbench oltp_write_only 的负载模型预测
CALIBRATE RESOURCE WORKLOAD OLTP_WRITE_ONLY;
-- 根据类似 sysbench oltp_read_write 的负载模型预测
CALIBRATE RESOURCE WORKLOAD OLTP_READ_WRITE;
-- 根据类似 sysbench oltp_read_only 的负载模型预测
CALIBRATE RESOURCE WORKLOAD OLTP_READ_ONLY;
在 TiDB Dashboard v7.1.0 面板上,我们可以看到新增了【资源管控】菜单,如图,
...[INFO] [session.go:3878] ... [sql="calibrate resource workload tpcc"]
...[INFO] [session.go:3878] ... [sql="calibrate resource workload oltp_read_write"]
...[INFO] [session.go:3878] ... [sql="calibrate resource workload oltp_read_write"]
...[INFO] [session.go:3878] ... [sql="calibrate resource workload oltp_read_only"]
...[INFO] [session.go:3878] ... [sql="calibrate resource workload oltp_write_only"]
○ 日常 RU 余量监控
○ 异常 RU 突增监控
[2023/06/29 11:27:50.069 +09:00] [INFO] [session.go:3878] [GENERAL_LOG] [conn=7398943596793037429] [user=u2@192.168.195.128] [schemaVersion=53] [txnStartTS=0] [forUpdateTS=0] [isReadConsistency=false] [currentDB=] [isPessimistic=false] [sessionTxnMode=PESSIMISTIC] [sql="select @@version_comment limit 1"]
[2023/06/29 11:27:58.973 +09:00] [INFO] [session.go:3878] [GENERAL_LOG] [conn=7398943596793037429] [user=u2@192.168.195.128] [schemaVersion=53] [txnStartTS=0] [forUpdateTS=0] [isReadConsistency=false] [currentDB=] [isPessimistic=false] [sessionTxnMode=PESSIMISTIC] [sql="select current_resource_group()"]
[2023/06/29 11:28:09.557 +09:00] [INFO] [session.go:3878] [GENERAL_LOG] [conn=7398943596793037429] [user=u2@192.168.195.128] [schemaVersion=53] [txnStartTS=0] [forUpdateTS=0] [isReadConsistency=false] [currentDB=] [isPessimistic=false] [sessionTxnMode=PESSIMISTIC] [sql="set resource group rg1"]
[2023/06/29 11:28:19.532 +09:00] [INFO] [session.go:3878] [GENERAL_LOG] [conn=7398943596793037429] [user=u2@192.168.195.128] [schemaVersion=53] [txnStartTS=0] [forUpdateTS=0] [isReadConsistency=false] [currentDB=] [isPessimistic=false] [sessionTxnMode=PESSIMISTIC] [sql="select * from test.t1 limit 1"]
[2023/06/29 11:28:19.534 +09:00] [INFO] [controller.go:287] ["[resource group controller] create resource group cost controller"] [name=rg1]
如果真的出现【问题三】中描述的,存在 Bad SQL “越权” 运行的情况,可以从日志中查找线索。
-- tiflash replica
ALTER TABLE t SET TIFLASH REPLICA 1;
-- read from tiflash
EXPLAIN ANALYZE FORMAT = 'verbose'
SELECT /*+ read_from_storage(tiflash[t]) */ COUNT(*)
FROM t;
-- read from tikv
EXPLAIN ANALYZE FORMAT = 'verbose'
SELECT /*+ read_from_storage(tikv[t]) */ COUNT(*)
FROM t;
... [INFO] [controller.go:287] ["[resource group controller] create resource group cost controller"] [name=rg1]
EXPLAIN ANALYZE FORMAT = 'verbose' SELECT /*+ RESOURCE_GROUP(rg2), read_from_storage(tikv[t]) */ COUNT(*) FROM t;
EXPLAIN ANALYZE FORMAT = 'verbose' SELECT /*+ RESOURCE_GROUP(rg3), read_from_storage(tiflash[t]) */ COUNT(*) FROM t;
○TiDB, MySQL 中的资源组相关命令,“增 ( CREATE RESOURCE GROUP ) 删 ( DROP RESOURCE GROUP ) 改 ( ALTER RESOURCE GROUP ) 查 ( SELECT * FROM INFORMATION_SCHEMA.RESOURCE_GROUPS )” 语法相近,但命令后跟接参数不同, I_S.RESOURCE_GROUPS 表结构亦不同。
○TiDB, MySQL 均支持 Hint,可以实现语句级资源组调用。
INSERT /*+ RESOURCE_GROUP(rg1) */ into t1 values (1);
[Service]
AmbientCapabilities=CAP_SYS_NICE
2. TiDB 中资源组设定的 RU 是定值,而在 MySQL 中可以指定 vCPU 为范围值,这个范围值对应所有可用的 CPU。
mysql> select version()\G
*************************** 1. row ***************************
version(): 8.0.28
1 row in set (0.00 sec)
mysql> ALTER RESOURCE GROUP rg1 VCPU = 0-1;
Query OK, 0 rows affected (0.01 sec)
3. TiDB 中的一些 SQL 语法或函数 (Functions) 是特有的,与 MySQL 不兼容,如 CURRENT_RESOURCE_GROUP() 。
前文提到了 MySQL 需要借助 Nice 来控制线程优先级,其实熟悉 Linux 系统的朋友都知道 Nice 成名已久,而 cgroup 这位后来者近年逐渐走入人们的视野,尤其是虚拟化、云化技术(如 Docker, Kubernetes)成熟后,cgroup 技术更是不可或缺。比如,从 RHEL 7 之后,可以直接为某个服务在 systemd 文件中设置 CPUAccounting 和 CPUShares 来控制进程对 CPU 的占用率。从 RHEL 8 开始引入了 cgroup v2,以完善功能,简化配置,CPU 控制参数也做了调整,变为 cpu.weight 。
AskTUG: https://asktug.com/
[2]【社区智慧合集】TiDB 相关 SQL 脚本大全: https://asktug.com/t/topic/999618
[3]resource control,default resource group 文档勘误: https://asktug.com/t/topic/1008372/6?u=shawnyan
[4]resource control 相关,INFORMATION_SCHEMA.USER_ATTRIBUTES 表取值问题: https://asktug.com/t/topic/1008437
[5]resource control, 如何查看session级别变量: https://asktug.com/t/topic/1008357
[6]resource control, I_S 权限控制问题: https://asktug.com/t/topic/1008596
[7]官方文档: https://docs.pingcap.com/zh/tidb/stable/tidb-resource-control#%E7%9B%B8%E5%85%B3%E5%8F%82%E6%95%B0
[8]预估集群容量: https://docs.pingcap.com/zh/tidb/stable/tidb-resource-control#%E9%A2%84%E4%BC%B0%E9%9B%86%E7%BE%A4%E5%AE%B9%E9%87%8F
[9]calibrate_resource: https://github.com/pingcap/tidb/blob/v7.1.0/executor/calibrate_resource.go#L295
[10]【资源管控】: https://docs.pingcap.com/zh/tidb/stable/dashboard-resource-manager
[11]“根据实际负载估算容量”: https://docs.pingcap.com/zh/tidb/stable/sql-statement-calibrate-resource#%E6%A0%B9%E6%8D%AE%E5%AE%9E%E9%99%85%E8%B4%9F%E8%BD%BD%E4%BC%B0%E7%AE%97%E5%AE%B9%E9%87%8F
[12]设计文档: https://github.com/pingcap/tidb/blob/master/docs/design/2022-11-25-global-resource-control.md#distributed-token-buckets
[13]resource control, Grafana 面板默认配置: https://asktug.com/t/topic/1008464
[14]resource control, Grafana 文档内容不完整: https://asktug.com/t/topic/1008693
[15]resource_manager: add metrics for avaiable RU #6523: https://github.com/tikv/pd/pull/6523
[16]资源组 (Resource Groups): https://dev.mysql.com/doc/refman/8.0/en/resource-groups.html
[17]WL#9467: Resource Groups: https://dev.mysql.com/worklog/task/?id=9467
[18]使用案例: https://asktug.com/t/topic/37127/2?u=shawnyan
[19]#17706: https://github.com/pingcap/tidb/pull/17706
💡 点击文末【阅读原文】,立即下载试用 TiDB!