错误信息:
The last packet successfully received from the server was 3,576,246 milliseconds ago. The last packet sent successfully to the server was 3,576,247 milliseconds ago.
参考解决方案 https://github.com/alibaba/druid/issues/5549
如果修改socket-timeout不生效,可以尝试修改 socketTimeout。
思考
出现这个问题的地方是一个根据id更新数据的语句。按照常规理解更新速度很快。不应该出现更新超时的情况。
软件环境
- mysql 8.0
- mysql 驱动 8.0.26
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>8.0.26</version>
</dependency>
- 数据库连接池 druid 1.2.21
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>druid-spring-boot-3-starter</artifactId>
<version>1.2.21</version>
</dependency>
- 数据库操作orm框架 mybatis-plus
<dependency>
<groupId>com.baomidou</groupId>
<artifactId>mybatis-plus-spring-boot3-starter</artifactId>
<version>3.5.5</version>
</dependency>
业务背景
- 多线程操作用户数据。
- 先查询用户是否存在,如果不存在,则创建一个用户。
- 查询用户其他逻辑,对用户回家进行更新。
问题表现
在更新用户数据时,出现锁等待时间过长的问题。
解决思路
- 查询数据库现在执行的语句
select * from information_schema.PROCESSLIST;
通过查询中的info和status 可以看到查询语句以及查询执行状态。发现update 语句的状态是 running。没有出现lock字样。通过客户端工具尝试查询表数据发现速度很快,执行查询结果更新语句就会很慢。
- 查询事务信息
SHOW ENGINE INNODB STATUS;
返回字段status详细内容:
=====================================
2024-09-20 07:45:19 140448521721408 INNODB MONITOR OUTPUT
=====================================
Per second averages calculated from the last 13 seconds
-----------------
BACKGROUND THREAD
-----------------
srv_master_thread loops: 44195 srv_active, 0 srv_shutdown, 55397 srv_idle
srv_master_thread log flush and writes: 0
----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 125312
OS WAIT ARRAY INFO: signal count 112952
RW-shared spins 0, rounds 0, OS waits 0
RW-excl spins 0, rounds 0, OS waits 0
RW-sx spins 0, rounds 0, OS waits 0
Spin rounds per wait: 0.00 RW-shared, 0.00 RW-excl, 0.00 RW-sx
------------
TRANSACTIONS
------------
Trx id counter 398850
Purge done for trx's n:o < 398642 undo n:o < 0 state: running but idle
History list length 20
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421923654702672, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421923654707520, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421923654699440, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 421923654698632, not started
0 lock struct(s), heap size 1128, 0 row lock(s)
---TRANSACTION 398849, ACTIVE 63 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 2 lock struct(s), heap size 1128, 1 row lock(s)
MySQL thread id 2850, OS thread handle 140448519607872, query id 2416543 172.17.0.1 root updating
UPDATE user_info SET resource_flag=0 WHERE (user_id = '123456')
------- TRX HAS BEEN WAITING 63 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 7 page no 192 n bits 704 index user_info_pk of table `test`.`user_info` trx id 398849 lock_mode X waiting
Record lock, heap no 543 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
0: len 15; hex 77656978696e5f3433393930373237; asc weixin_43990727;;
1: len 4; hex 80077a4e; asc zN;;
------------------
---TRANSACTION 398640, ACTIVE 2595 sec
6 lock struct(s), heap size 1128, 4 row lock(s), undo log entries 5
MySQL thread id 2839, OS thread handle 140448056182336, query id 2414289 172.17.0.1 root
Trx read view will not see trx with id >= 398641, sees < 398641
--------
FILE I/O
--------
I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (read thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (write thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] ,
ibuf aio reads:
Pending flushes (fsync) log: 0; buffer pool: 0
9610812 OS file reads, 1442861 OS file writes, 958074 OS fsyncs
0.00 reads/s, 0 avg bytes/read, 0.44 writes/s, 0.00 fsyncs/s
-------------------------------------
INSERT BUFFER AND ADAPTIVE HASH INDEX
-------------------------------------
Ibuf: size 1, free list len 2078, seg size 2080, 1170 merges
merged operations:
insert 1176, delete mark 65, delete 5
discarded operations:
insert 0, delete mark 0, delete 0
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 4 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 7 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
Hash table size 34679, node heap has 0 buffer(s)
Hash table size 34679, node heap has 1 buffer(s)
0.00 hash searches/s, 0.00 non-hash searches/s
---
LOG
---
Log sequence number 4225651809
Log buffer assigned up to 4225651809
Log buffer completed up to 4225651809
Log written up to 4225651809
Log flushed up to 4225651809
Added dirty pages up to 4225651809
Pages flushed up to 4225651809
Last checkpoint at 4225651809
Log minimum file id is 1285
Log maximum file id is 1290
1095775 log i/o's done, 0.00 log i/o's/second
----------------------
BUFFER POOL AND MEMORY
----------------------
Total large memory allocated 0
Dictionary memory allocated 596889
Buffer pool size 8192
Free buffers 1024
Database pages 7154
Old database pages 2620
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 7477079, not young 85684361
1.18 youngs/s, 0.00 non-youngs/s
Pages read 9610535, created 12010, written 260831
0.00 reads/s, 0.00 creates/s, 0.44 writes/s
Buffer pool hit rate 1000 / 1000, young-making rate 25 / 1000 not 0 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 7154, unzip_LRU len: 0
I/O sum[15]:cur[0], unzip sum[0]:cur[0]
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Process ID=1, Main thread ID=140448106538560 , state=sleeping
Number of rows inserted 264792, updated 317559, deleted 1, read 758081039
1.54 inserts/s, 0.00 updates/s, 0.00 deletes/s, 1.54 reads/s
Number of system rows inserted 57, updated 338, deleted 54, read 6826
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
----------------------------
END OF INNODB MONITOR OUTPUT
============================
从上面可以看出有两个事务存在,一个已经存在了63秒,一个已经存在了2595秒。其中第一个事务是更新语句,另一个没有语句。
从上面结果看可以是第二个事务长时间未提交导致第一个事务无法获取到锁。排查代码最终找到问题所在。
- select * from information_schema.INNODB_TRX;
通过这条语句能够更清晰的看到事务执行状态。
避免死锁的方法
- 避免长时间持有一个事务不放。