DBA在日常工作中常常会面临以下两种常见情况:
- 业务人员会提出问题:“表被锁了,导致业务受阻,请帮忙解决。”
- 业务人员还会反馈:“某个程序通常几秒内就能执行完毕,但现在却运行了好几分钟,不清楚是什么原因?”
针对第二种情况,当我们排除了执行计划突然变化或数据量异常增加的可能性后,很大概率也是由锁造成的。
锁查询思路:区分锁类别
根本思路还是排查锁,只是区分排查当前的锁还是排查历史的锁。
下文以实验场景展示上述两个的分析过程。我使用的环境为:
observer (OceanBase 4.3.0.1)
REVISION: 101000062024032200-b59432e535c48e8b8828190c803b6c7736413ff9
BUILD_BRANCH: HEAD
BUILD_TIME: Mar 22 2024 00:52:21
BUILD_FLAGS: RelWithDebInfo
BUILD_INFO:
实验场景演示排查过程
第一个场景:排查当前正在锁的表。
第一步,我们开启一个事务,然后更新一行数据,但是不提交。
第二步,我们模仿程序,通过java代码调用sql,更新同一条数据。
可以看到,现在程序现在卡死的状态。这时我们就可以去库里排查,代码如下。
MySQL [oceanbase]> select * from __all_virtual_processlist where TENANT='htap' and user_client_ip<>'127.0.0.1'; +------------+------+--------+--------------------+------+---------+--------+------+-------+------+--------------+----------+----------+---------------------+---------------+----------------+-----------+----------+-----------+------------+----------+-------------+------------+-----------+------------+--------+--------+-------------+-----------+---------+-----------+---------------------+-------+-------------------+-----------------------+--------+--------+----------+----------+-----------+------------------+------------+--------------+ | id | user | tenant | host | db | command | sql_id | time | state | info | svr_ip | svr_port | sql_port | proxy_sessid | master_sessid | user_client_ip | user_host | trans_id | thread_id | ssl_cipher | trace_id | trans_state | total_time | retry_cnt | retry_info | action | module | client_info | sql_trace | plan_id | tenant_id | effective_tenant_id | level | sample_percentage | record_policy | lb_vid | lb_vip | lb_vport | in_bytes | out_bytes | user_client_port | proxy_user | service_name | +------------+------+--------+--------------------+------+---------+--------+------+-------+------+--------------+----------+----------+---------------------+---------------+----------------+-----------+----------+-----------+------------+----------+-------------+------------+-----------+------------+--------+--------+-------------+-----------+---------+-----------+---------------------+-------+-------------------+-----------------------+--------+--------+----------+----------+-----------+------------------+------------+--------------+ | 3221643380 | SYS | htap | 134.90.8.212:39832 | SYS | Sleep | | 24 | SLEEP | NULL | 134.90.8.212 | 2882 | 2881 | 9681060055801331737 | NULL | 134.90.8.212 | % | 3917058 | 0 | NULL | NULL | ACTIVE | 24 | 0 | 0 | | | | 1 | 3036 | 1002 | 1002 | 1 | 10 | SAMPLE_AND_SLOW_QUERY | NULL | NULL | NULL | 663 | 13527 | 59424 | NULL | NULL | | 3221704604 | SYS | htap | 134.90.8.212:43442 | SYS | Sleep | | 5 | SLEEP | NULL | 134.90.8.212 | 2882 | 2881 | 9681060055801331738 | NULL | 134.84.16.9 | % | 0 | 0 | NULL | NULL | | 5 | 6 | -6005 | | | | 1 | 3037 | 1002 | 1002 | 1 | 10 | SAMPLE_AND_SLOW_QUERY | NULL | NULL | NULL | 1078 | 11179 | 42302 | NULL | NULL | | 3222051655 | SYS | htap | 134.90.8.212:38844 | SYS | Sleep | | 969 | SLEEP | NULL | 134.90.8.216 | 2882 | 2881 | 9681060055801331737 | NULL | 134.90.8.212 | % | 0 | 0 | NULL | NULL | | 969 | 0 | 0 | | | | 1 | 322 | 1002 | 1002 | 1 | 10 | SAMPLE_AND_SLOW_QUERY | NULL | NULL | NULL | 638 | 8569 | 59424 | NULL | NULL | +------------+------+--------+--------------------+------+---------+--------+------+-------+------+--------------+----------+----------+---------------------+---------------+----------------+-----------+----------+-----------+------------+----------+-------------+------------+-----------+------------+--------+--------+-------------+-----------+---------+-----------+---------------------+-------+-------------------+-----------------------+--------+--------+----------+----------+-----------+------------------+------------+--------------+ 3 rows in set (0.02 sec) |
通过__all_virtual_processlist,我们可以看到持有锁的会话,以及被锁的会话,可以看到第二个session的retry_info有6005获取锁失败的报错,即被阻塞的会话。第一个active是持有锁,阻塞别人的会话,只要第一个会话回滚或提交,释放锁,第二个会话就可以获取到锁。
由于此次测试环境会话较少,能够清晰地看到我们需要的信息,在生产环境中这需要耗费点时间。因此,下面介绍其他方式供参考。
(1)业务反馈有积压,程序卡住,但不确定卡在哪些表上了。
针对这个问题,可以通过以下命令sql查询持有锁的所有会话及表名,并与业务确认卡顿程序涉及的表以定位锁,如果涉及杀会话,需要确认杀掉持有锁的会话会不会影响业务。
MySQL [oceanbase]> select a.tenant_id,a.svr_ip,a.ls_id,a.table_id,c.table_id table_actual_id,c.table_name,a.tablet_id,a.session_id,a.ctx_create_time -> from __all_virtual_trans_lock_stat a -> left join __all_virtual_tablet_to_ls b on b.tablet_id=a.tablet_id -> left join __all_virtual_table c on b.table_id=c.table_id; +-----------+--------------+-------+----------+-----------------+------------+-----------+------------+----------------------------+ | tenant_id | svr_ip | ls_id | table_id | table_actual_id | table_name | tablet_id | session_id | ctx_create_time | +-----------+--------------+-------+----------+-----------------+------------+-----------+------------+----------------------------+ | 1002 | 169.78.3.112 | 1001 | 0 | 500008 | ZRY2 | 200010 | 3221643380 | 2024-04-12 10:32:52.880125 | +-----------+--------------+-------+----------+-----------------+------------+-----------+------------+----------------------------+ 1 row in set (0.09 sec) |
(2)业务反馈涉及某个表的程序卡住
针对这个问题,我们只需要查询该表是否有锁,查询语句如下,该sql是根据表名查询持有锁的会话。
MySQL [oceanbase]> select * from (select a.tenant_id,a.svr_ip,a.ls_id,a.table_id,c.table_id table_actual_id,c.table_name,a.tablet_id,a.session_id,a.ctx_create_time -> from __all_virtual_trans_lock_stat a -> left join __all_virtual_tablet_to_ls b on b.tablet_id=a.tablet_id -> left join __all_virtual_table c on b.table_id=c.table_id) t where t.table_name like '%ZRY2%'; +-----------+--------------+-------+----------+-----------------+------------+-----------+------------+----------------------------+ | tenant_id | svr_ip | ls_id | table_id | table_actual_id | table_name | tablet_id | session_id | ctx_create_time | +-----------+--------------+-------+----------+-----------------+------------+-----------+------------+----------------------------+ | 1002 | 169.78.3.112 | 1001 | 0 | 500008 | ZRY2 | 200010 | 3221643380 | 2024-04-12 10:32:52.880125 | +-----------+--------------+-------+----------+-----------------+------------+-----------+------------+----------------------------+ 1 row in set (0.05 sec) |
(3)程序冲突导致卡顿
有时多个程序逻辑可能相互冲突,导致某个业务的程序被卡住,但是该业务逻辑可以手动调整,此时可能想终止该会话优先运行其他流程。这时,如下sql可以查询到被锁,或者说等待持有锁的会话。
MySQL [oceanbase]> select a.tenant_id,a.svr_ip,c.table_id table_actual_id,c.table_name,a.tablet_id,a.session_id,a.block_session_id,a.trans_id,a.holder_trans_id,a.lock_mode,a.`type` -> from __all_virtual_lock_wait_stat a -> left join __all_virtual_tablet_to_ls b on b.tablet_id=a.tablet_id -> left join __all_virtual_table c on b.table_id=c.table_id; +-----------+--------------+-----------------+------------+-----------+------------+------------------+----------+-----------------+-----------+------+ | tenant_id | svr_ip | table_actual_id | table_name | tablet_id | session_id | block_session_id | trans_id | holder_trans_id | lock_mode | type | +-----------+--------------+-----------------+------------+-----------+------------+------------------+----------+-----------------+-----------+------+ | 1002 | 169.78.3.112 | 500008 | ZRY2 | 200010 | 3221704604 | 3221704604 | 3917301 | 3917058 | X | 1 | +-----------+--------------+-----------------+------------+-----------+------------+------------------+----------+-----------------+-----------+------+ 1 row in set (0.02 sec) |
第二个场景:排查历史被锁的表。
该场景中的问题是DBA经常遇到的,一个程序一反常态运行变慢通常是锁冲突导致的,建议业务人员修改业务逻辑,可是业务人员无法确定是哪些业务冲突,DBA就需要找到被锁住的业务模型,便于业务定位和修改业务逻辑。
第一步,提交update。
可以看到java程序获取到锁,执行结束了,此处需要说明的事,该java程序是我从社区官网的客户案例中复制出来并改了sql,getString报错不影响此次实验的目的。
可以确认数据,java程序修改成功了。
首先,在生产等环境sqlaudit刷新较快,我们需要保留相应的aqlaudit便于分析,在场景复现时关闭audit,保留信息。
MySQL [oceanbase]> alter system set enable_sql_audit='False';
Query OK, 0 rows affected (0.07 sec)
然后,DBA通过业务日志或业务人员提供的sql语句查询sql_audit信息。
MySQL [oceanbase]> select * from gv$ob_sql_audit where query_sql='update zry2 set b=\'javacs\' where a=985 and c=164;' \G *************************** 1. row *************************** SVR_IP: 169.78.3.112 SVR_PORT: 2882 REQUEST_ID: 12119440 SQL_EXEC_ID: 52294148 TRACE_ID: YB42865A08D4-0006157C0336EBEA-0-0 SID: 3221704604 CLIENT_IP: 169.78.3.112 CLIENT_PORT: 43442 TENANT_ID: 1002 TENANT_NAME: htap EFFECTIVE_TENANT_ID: 1002 USER_ID: 200003 USER_NAME: SYS USER_GROUP: 0 USER_CLIENT_IP: 169.78.16.6 DB_ID: 201006 DB_NAME: SYS SQL_ID: 3E393CA2244E2C49F354EA1DE7874597 QUERY_SQL: update zry2 set b='javacs' where a=985 and c=164; PLAN_ID: 3037 AFFECTED_ROWS: 1 RETURN_ROWS: 0 PARTITION_CNT: 1 RET_CODE: 0 QC_ID: 0 DFO_ID: 0 SQC_ID: 0 WORKER_ID: 0 EVENT: P1TEXT: P1: 0 P2TEXT: P2: 0 P3TEXT: P3: 0 LEVEL: 0 WAIT_CLASS_ID: 100 WAIT_CLASS#: 0 WAIT_CLASS: OTHER STATE: MAX_WAIT TIME ZERO WAIT_TIME_MICRO: 0 TOTAL_WAIT_TIME_MICRO: 0 TOTAL_WAITS: 0 RPC_COUNT: 2 PLAN_TYPE: 1 IS_INNER_SQL: 0 IS_EXECUTOR_RPC: 0 IS_HIT_PLAN: 1 REQUEST_TIME: 1712889179734177 ELAPSED_TIME: 94830642 NET_TIME: 0 NET_WAIT_TIME: 4 QUEUE_TIME: 2467 DECODE_TIME: 3 GET_PLAN_TIME: 1698 EXECUTE_TIME: 160381 APPLICATION_WAIT_TIME: 0 CONCURRENCY_WAIT_TIME: 0 USER_IO_WAIT_TIME: 0 SCHEDULE_TIME: 0 ROW_CACHE_HIT: 0 BLOOM_FILTER_CACHE_HIT: 0 BLOCK_CACHE_HIT: 243 DISK_READS: 0 RETRY_CNT: 30 TABLE_SCAN: 1 CONSISTENCY_LEVEL: 3 MEMSTORE_READ_ROW_COUNT: 2 SSSTORE_READ_ROW_COUNT: 3 DATA_BLOCK_READ_CNT: 0 DATA_BLOCK_CACHE_HIT: 212 INDEX_BLOCK_READ_CNT: 0 INDEX_BLOCK_CACHE_HIT: 31 BLOCKSCAN_BLOCK_CNT: 422 BLOCKSCAN_ROW_CNT: 999999 PUSHDOWN_STORAGE_FILTER_ROW_CNT: 0 REQUEST_MEMORY_USED: 1614992 EXPECTED_WORKER_COUNT: 0 USED_WORKER_COUNT: 0 SCHED_INFO: NULL FUSE_ROW_CACHE_HIT: 0 PS_CLIENT_STMT_ID: -1 PS_INNER_STMT_ID: -1 TX_ID: 3917415 SNAPSHOT_VERSION: 1712889274559945492 REQUEST_TYPE: 2 IS_BATCHED_MULTI_STMT: 0 OB_TRACE_INFO: NULL PLAN_HASH: 5849233527757913110 LOCK_FOR_READ_TIME: 0 PARAMS_VALUE: RULE_NAME: PARTITION_HIT: 1 TX_INTERNAL_ROUTING: 0 TX_STATE_VERSION: 0 FLT_TRACE_ID: PL_TRACE_ID: NULL PLSQL_EXEC_TIME: 0 TOTAL_MEMSTORE_READ_ROW_COUNT: 0 TOTAL_SSSTORE_READ_ROW_COUNT: 0 1 row in set (0.82 sec) |
可以看到 ELAPSED_TIME: 94830642,总响应时间超过90s,但其实EXECUTE_TIME: 160381,执行时间才160ms,其他的时间消耗也比较低,RETRY_CNT: 30,重试了30次,基本可以判定是锁冲突导致,我们通过TRACE_ID: YB42865A08D4-0006157C0336EBEA-0-0 在SVR_IP: 169.78.3.112节点的日志中搜索。
xxxxxxxxxroot:/home/admin/oceanbase/log # grep YB42865A08D4-0006157C0336EBEA-0-0 observer.log|more [2024-04-12 10:32:59.739289] WDIAG [STORAGE.TRANS] mvcc_write (ob_mvcc_row.cpp:1023) [9181][T1002_L0_G0][T1002][YB42865A08D4-0006157C0336EBEA-0-0] [lt=21][e rrcode=-6005] mvcc write conflict(ret=-6005, ctx={ObIMvccCtx={alloc_type=0 ctx_descriptor=0 min_table_version=1712482298355992 max_table_version=17124822983 55992 trans_version={val:4611686018427387903, v:0} commit_version={val:0, v:0} lock_wait_start_ts=0 replay_compact_version={val:0, v:0}} end_code=0 tx_statu s=0 is_readonly=false ref=1 trans_id={txid:3917083} ls_id=1001 row_callback[alloc:0, free:0, unsubmit:0] redo[fill:0,sync_succ:0, sync_fail:0] main_list_len =1 pending_log_size=0 callback_list:{cnt=1 need_merge=0 stat:[tx_end=0, rollback_to=0, fast_commit=0, remove_memtable=0] detail:[(log_epoch,length,logged,sy nced,appended,removed,unlog_removed,branch_removed)|0:(1,1,0,0,1,0,0,0)|]}}, node=this=0x7fe662a11a10 trans_version={val:4611686018427387903, v:0} scn={val: 4611686018427387903, v:0} tx_id={txid:3917083} prev=(nil) next=(nil) modify_count=4294967295 acc_checksum=0 version=1712887388091454 type=0 flag=0 snapshot_ barrier=0 snapshot_barrier_flag=0 mtd={dml_flag:2, buf_len:49} seq_no={branch:0, seq:3}, res={can_insert:false, need_insert:false, is_new_locked:false, is_m vcc_undo:false, lock_state:{is_locked:true, trans_version:{val:0, v:0}, lock_trans_id:{txid:3917058}, lock_data_sequence:{branch:0, seq:15984}, lock_dml_fla g:2, is_delayed_cleanout:false, mvcc_row:0x7fe662a040a8, trans_scn:{val:4611686018427387903, v:0}}, is_checked:false, tx_node:NULL}, *this={this=0x7fe662a04 0a8 latch_=unlocked flag=11 first_dml=UPDATE last_dml=UPDATE update_since_compact=2 list_head=0x7fe662a11978 latest_compact_node=(nil) max_trans_version={va l:1712887883522203616, v:0} max_trans_id=3912943 max_elr_trans_version={val:1712887883522203616, v:0} max_elr_trans_id=3912943 latest_compact_ts=0 last_comp act_cnt=0 total_trans_node_cnt=3 max_modify_scn={val:1712887883522203616, v:0} min_modify_scn={val:1712887883518676835, v:0}}) [2024-04-12 10:32:59.739372] INFO [STORAGE.TRANS] on_wlock_retry (ob_memtable_context.cpp:300) [9181][T1002_L0_G0][T1002][YB42865A08D4-0006157C0336EBEA-0-0 ] [lt=78] mvcc_write conflict(key={"BIGINT UNSIGNED":33}, tx_id={txid:3917083}, conflict_tx_id={txid:3917058}, this={ObIMvccCtx={alloc_type=0 ctx_descriptor =0 min_table_version=1712482298355992 max_table_version=1712482298355992 trans_version={val:4611686018427387903, v:0} commit_version={val:0, v:0} lock_wait_ start_ts=0 replay_compact_version={val:0, v:0}} end_code=0 tx_status=0 is_readonly=false ref=1 trans_id={txid:3917083} ls_id=1001 row_callback[alloc:0, free :0, unsubmit:0] redo[fill:0,sync_succ:0, sync_fail:0] main_list_len=1 pending_log_size=0 callback_list:{cnt=1 need_merge=0 stat:[tx_end=0, rollback_to=0, fa st_commit=0, remove_memtable=0] detail:[(log_epoch,length,logged,synced,appended,removed,unlog_removed,branch_removed)|0:(1,1,0,0,1,0,0,0)|]}}) [2024-04-12 10:32:59.739404] WDIAG [STORAGE] update_row (ob_tablet.cpp:3796) [9181][T1002_L0_G0][T1002][YB42865A08D4-0006157C0336EBEA-0-0] [lt=24][errcode=- 6005] failed to set memtable, (ret=-6005) |
在日志中能够明显看到6005获取锁失败的信息,以及持有锁的lock_trans_id:{txid:3917058},根据这个txid就可以找到当时锁住该业务的业务模型。当然,在OceanBase3.2.4版本中可以从日志中获取持有锁的trans_hash的值,通过该值也可以获取该事务的业务模型。
MySQL [oceanbase]> select * from gv$ob_sql_audit where tx_id=3917058 \G *************************** 1. row *************************** SVR_IP: 169.78.3.112 SVR_PORT: 2882 REQUEST_ID: 12116788 SQL_EXEC_ID: 52281688 TRACE_ID: YB42865A08D4-0006157C0336EBE3-0-0 SID: 3221643380 CLIENT_IP: 169.78.3.112 CLIENT_PORT: 39832 TENANT_ID: 1002 TENANT_NAME: htap EFFECTIVE_TENANT_ID: 1002 USER_ID: 200003 USER_NAME: SYS USER_GROUP: 0 USER_CLIENT_IP: 169.78.3.112 DB_ID: 201006 DB_NAME: SYS SQL_ID: 8D589AFA4DFAEEED85FFF5AA78E5FF6A QUERY_SQL: begin PLAN_ID: 0 AFFECTED_ROWS: 0 RETURN_ROWS: 0 PARTITION_CNT: 0 RET_CODE: 0 QC_ID: 0 DFO_ID: 0 SQC_ID: 0 WORKER_ID: 0 EVENT: P1TEXT: P1: 0 P2TEXT: P2: 0 P3TEXT: P3: 0 LEVEL: 0 WAIT_CLASS_ID: 100 WAIT_CLASS#: 0 WAIT_CLASS: OTHER STATE: MAX_WAIT TIME ZERO WAIT_TIME_MICRO: 0 TOTAL_WAIT_TIME_MICRO: 0 TOTAL_WAITS: 0 RPC_COUNT: 0 PLAN_TYPE: 0 IS_INNER_SQL: 0 IS_EXECUTOR_RPC: 0 IS_HIT_PLAN: 0 REQUEST_TIME: 1712889172875953 ELAPSED_TIME: 200 NET_TIME: 0 NET_WAIT_TIME: 8 QUEUE_TIME: 37 DECODE_TIME: 1 GET_PLAN_TIME: 66 EXECUTE_TIME: 53 APPLICATION_WAIT_TIME: 0 CONCURRENCY_WAIT_TIME: 0 USER_IO_WAIT_TIME: 0 SCHEDULE_TIME: 0 ROW_CACHE_HIT: 0 BLOOM_FILTER_CACHE_HIT: 0 BLOCK_CACHE_HIT: 0 DISK_READS: 0 RETRY_CNT: 0 TABLE_SCAN: 0 CONSISTENCY_LEVEL: -1 MEMSTORE_READ_ROW_COUNT: 0 SSSTORE_READ_ROW_COUNT: 0 DATA_BLOCK_READ_CNT: 0 DATA_BLOCK_CACHE_HIT: 0 INDEX_BLOCK_READ_CNT: 0 INDEX_BLOCK_CACHE_HIT: 0 BLOCKSCAN_BLOCK_CNT: 0 BLOCKSCAN_ROW_CNT: 0 PUSHDOWN_STORAGE_FILTER_ROW_CNT: 0 REQUEST_MEMORY_USED: 65536 EXPECTED_WORKER_COUNT: 0 USED_WORKER_COUNT: 0 SCHED_INFO: NULL FUSE_ROW_CACHE_HIT: 0 PS_CLIENT_STMT_ID: -1 PS_INNER_STMT_ID: -1 TX_ID: 3917058 SNAPSHOT_VERSION: 0 REQUEST_TYPE: 2 IS_BATCHED_MULTI_STMT: 0 OB_TRACE_INFO: NULL PLAN_HASH: 0 LOCK_FOR_READ_TIME: 0 PARAMS_VALUE: RULE_NAME: PARTITION_HIT: 1 TX_INTERNAL_ROUTING: 1 TX_STATE_VERSION: 1 FLT_TRACE_ID: 000615dd-16b0-59f2-f7a7-1215e6062770 PL_TRACE_ID: NULL PLSQL_EXEC_TIME: 0 TOTAL_MEMSTORE_READ_ROW_COUNT: 0 TOTAL_SSSTORE_READ_ROW_COUNT: 0 *************************** 2. row *************************** SVR_IP: 169.78.3.112 SVR_PORT: 2882 REQUEST_ID: 12116789 SQL_EXEC_ID: 52281689 TRACE_ID: YB42865A08D4-0006157C0336EBE4-0-0 SID: 3221643380 CLIENT_IP: 169.78.3.112 CLIENT_PORT: 39832 TENANT_ID: 1002 TENANT_NAME: htap EFFECTIVE_TENANT_ID: 1002 USER_ID: 200003 USER_NAME: SYS USER_GROUP: 0 USER_CLIENT_IP: 169.78.3.112 DB_ID: 201006 DB_NAME: SYS SQL_ID: 328A68949A9F6E066CEA786B79D83BE9 QUERY_SQL: update zry2 set b='cs' where a=985 and c=164 PLAN_ID: 3036 AFFECTED_ROWS: 1 RETURN_ROWS: 0 PARTITION_CNT: 1 RET_CODE: 0 QC_ID: 0 DFO_ID: 0 SQC_ID: 0 WORKER_ID: 0 EVENT: P1TEXT: P1: 0 P2TEXT: P2: 0 P3TEXT: P3: 0 LEVEL: 0 WAIT_CLASS_ID: 100 WAIT_CLASS#: 0 WAIT_CLASS: OTHER STATE: MAX_WAIT TIME ZERO WAIT_TIME_MICRO: 0 TOTAL_WAIT_TIME_MICRO: 0 TOTAL_WAITS: 0 RPC_COUNT: 0 PLAN_TYPE: 1 IS_INNER_SQL: 0 IS_EXECUTOR_RPC: 0 IS_HIT_PLAN: 1 REQUEST_TIME: 1712889172876387 ELAPSED_TIME: 4429 NET_TIME: 0 NET_WAIT_TIME: 3 QUEUE_TIME: 145 DECODE_TIME: 0 GET_PLAN_TIME: 71 EXECUTE_TIME: 4196 APPLICATION_WAIT_TIME: 0 CONCURRENCY_WAIT_TIME: 0 USER_IO_WAIT_TIME: 0 SCHEDULE_TIME: 0 ROW_CACHE_HIT: 0 BLOOM_FILTER_CACHE_HIT: 0 BLOCK_CACHE_HIT: 243 DISK_READS: 0 RETRY_CNT: 0 TABLE_SCAN: 1 CONSISTENCY_LEVEL: 3 MEMSTORE_READ_ROW_COUNT: 2 SSSTORE_READ_ROW_COUNT: 3 DATA_BLOCK_READ_CNT: 0 DATA_BLOCK_CACHE_HIT: 212 INDEX_BLOCK_READ_CNT: 0 INDEX_BLOCK_CACHE_HIT: 31 BLOCKSCAN_BLOCK_CNT: 422 BLOCKSCAN_ROW_CNT: 999999 PUSHDOWN_STORAGE_FILTER_ROW_CNT: 0 REQUEST_MEMORY_USED: 1369040 EXPECTED_WORKER_COUNT: 0 USED_WORKER_COUNT: 0 SCHED_INFO: NULL FUSE_ROW_CACHE_HIT: 0 PS_CLIENT_STMT_ID: -1 PS_INNER_STMT_ID: -1 TX_ID: 3917058 SNAPSHOT_VERSION: 1712889172484169270 REQUEST_TYPE: 2 IS_BATCHED_MULTI_STMT: 0 OB_TRACE_INFO: NULL PLAN_HASH: 5849233527757913110 LOCK_FOR_READ_TIME: 0 PARAMS_VALUE: RULE_NAME: PARTITION_HIT: 1 TX_INTERNAL_ROUTING: 1 TX_STATE_VERSION: 2 FLT_TRACE_ID: 000615dd-16b0-59f2-f7a7-1215e6062770 PL_TRACE_ID: NULL PLSQL_EXEC_TIME: 0 TOTAL_MEMSTORE_READ_ROW_COUNT: 0 TOTAL_SSSTORE_READ_ROW_COUNT: 0 *************************** 3. row *************************** SVR_IP: 169.78.3.112 SVR_PORT: 2882 REQUEST_ID: 12119439 SQL_EXEC_ID: 52294147 TRACE_ID: YB42865A08D4-0006157C0336EBEB-0-0 SID: 3221643380 CLIENT_IP: 169.78.3.112 CLIENT_PORT: 39832 TENANT_ID: 1002 TENANT_NAME: htap EFFECTIVE_TENANT_ID: 1002 USER_ID: 200003 USER_NAME: SYS USER_GROUP: 0 USER_CLIENT_IP: 169.78.3.112 DB_ID: 201006 DB_NAME: SYS SQL_ID: FFFCA4D67EA0A788813031B8BBC3B329 QUERY_SQL: commit PLAN_ID: 0 AFFECTED_ROWS: 0 RETURN_ROWS: 0 PARTITION_CNT: 0 RET_CODE: 0 QC_ID: 0 DFO_ID: 0 SQC_ID: 0 WORKER_ID: 0 EVENT: P1TEXT: P1: 0 P2TEXT: P2: 0 P3TEXT: P3: 0 LEVEL: 0 WAIT_CLASS_ID: 100 WAIT_CLASS#: 0 WAIT_CLASS: OTHER STATE: MAX_WAIT TIME ZERO WAIT_TIME_MICRO: 0 TOTAL_WAIT_TIME_MICRO: 0 TOTAL_WAITS: 0 RPC_COUNT: 4 PLAN_TYPE: 0 IS_INNER_SQL: 0 IS_EXECUTOR_RPC: 0 IS_HIT_PLAN: 0 REQUEST_TIME: 1712889274559727 ELAPSED_TIME: 546 NET_TIME: 0 NET_WAIT_TIME: 5 QUEUE_TIME: 75 DECODE_TIME: 1 GET_PLAN_TIME: 94 EXECUTE_TIME: 344 APPLICATION_WAIT_TIME: 0 CONCURRENCY_WAIT_TIME: 0 USER_IO_WAIT_TIME: 0 SCHEDULE_TIME: 0 ROW_CACHE_HIT: 0 BLOOM_FILTER_CACHE_HIT: 0 BLOCK_CACHE_HIT: 0 DISK_READS: 0 RETRY_CNT: 0 TABLE_SCAN: 0 CONSISTENCY_LEVEL: -1 MEMSTORE_READ_ROW_COUNT: 0 SSSTORE_READ_ROW_COUNT: 0 DATA_BLOCK_READ_CNT: 0 DATA_BLOCK_CACHE_HIT: 0 INDEX_BLOCK_READ_CNT: 0 INDEX_BLOCK_CACHE_HIT: 0 BLOCKSCAN_BLOCK_CNT: 0 BLOCKSCAN_ROW_CNT: 0 PUSHDOWN_STORAGE_FILTER_ROW_CNT: 0 REQUEST_MEMORY_USED: 65536 EXPECTED_WORKER_COUNT: 0 USED_WORKER_COUNT: 0 SCHED_INFO: NULL FUSE_ROW_CACHE_HIT: 0 PS_CLIENT_STMT_ID: -1 PS_INNER_STMT_ID: -1 TX_ID: 3917058 SNAPSHOT_VERSION: 0 REQUEST_TYPE: 2 IS_BATCHED_MULTI_STMT: 0 OB_TRACE_INFO: NULL PLAN_HASH: 0 LOCK_FOR_READ_TIME: 0 PARAMS_VALUE: RULE_NAME: PARTITION_HIT: 1 TX_INTERNAL_ROUTING: 0 TX_STATE_VERSION: 2 FLT_TRACE_ID: 000615dd-16b0-59f2-f7a7-1215e6062770 PL_TRACE_ID: NULL PLSQL_EXEC_TIME: 0 TOTAL_MEMSTORE_READ_ROW_COUNT: 0 TOTAL_SSSTORE_READ_ROW_COUNT: 0 3 rows in set (0.58 sec) |
上述实验的模型比较简单,就是开启事务执行update然后commit,生产中的业务模型可能会涉及成百上千条sql,可以根据表名和时间缩小查询的范围,为业务人员提供关键的信息和证据。
总结
因为OceanBase 4.x版本采用了日志流,所以在确认表锁的时候要多关联一张表,但和旧版本的排查思路都是一样的。希望这个分析可以帮助大家学习与理解。