postgreSQL解决网络抖动问题
一、问题描述
org.mybatis.spring.MyBatisSystemException,org.mybatis.spring.MyBatisSystemException: nested exception is org.apache.ibatis.exceptions.PersistenceException:
### Error querying database. Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: org.postgresql.util.PSQLException: Connection attempt timed out.
### The error may exist in URL [jar:file:/opt/app/xxT.jar!/com/test/mapper/testMapper.xml]
### The error occurred while executing a query
### Cause: org.springframework.jdbc.CannotGetJdbcConnectionException: Could not get JDBC Connection; nested exception is java.sql.SQLException: org.postgresql.util.PSQLException: Connection attempt timed out.!
二、解决过过程
1、一开始认为是dbcp连接池的连接数不够用,所以测试了连接数为0和断网的场景,但异常没有复现
(1)dbcp配置:
dataSource.driverClassName=org.postgresql.Driver
dataSource.url=jdbc:postgresql://ip:port/rimu?loginTimeout=5
dataSource.username=appuser
dataSource.password=RI123text
datasource.maxActive=10(注:最大连接数)
datasource.initialSize=0
datasource.maxIdle=7
datasource.minIdle=0
datasource.maxWait=1000(注:-1为无限等待)
(2)本地模拟断网:
org.springframework.transaction.CannotCreateTransactionException: Could not open JDBC Connection for transaction; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (连线被拒,请检查主机名称和埠号,并确定 postmaster 可以接受 TCP/IP 连线。)
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot create PoolableConnectionFactory (连线被拒,请检查主机名称和埠号,并确定 postmaster 可以接受 TCP/IP 连线。)
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1549)
at org.apache.commons.dbcp.BasicDataSource.createDataSource(BasicDataSource.java:1388)
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doBegin(DataSourceTransactionManager.java:203)
... 38 more
Caused by: org.postgresql.util.PSQLException: 连线被拒,请检查主机名称和埠号,并确定 postmaster 可以接受 TCP/IP 连线。
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:222)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:64)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:143)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:29)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:21)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:38)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:412)
at org.postgresql.Driver.connect(Driver.java:280)
at org.apache.commons.dbcp.DriverConnectionFactory.createConnection(DriverConnectionFactory.java:38)
at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
at org.apache.commons.dbcp.BasicDataSource.validateConnectionFactory(BasicDataSource.java:1556)
at org.apache.commons.dbcp.BasicDataSource.createPoolableConnectionFactory(BasicDataSource.java:1545)
... 41 more
Caused by: java.net.ConnectException: Connection timed out: connect
at java.net.DualStackPlainSocketImpl.connect0(Native Method)
at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at org.postgresql.core.PGStream.<init>(PGStream.java:61)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:110)
... 53 more
(3)本地模拟连接数为0:
org.springframework.transaction.CannotCreateTransactionException: Could not open JDBC Connection for transaction; nested exception is org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
Caused by: org.apache.commons.dbcp.SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:114)
at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doBegin(DataSourceTransactionManager.java:203)
... 38 more
Caused by: java.util.NoSuchElementException: Timeout waiting for idle object
at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1134)
at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
... 40 more
2、针对loginTimeout=5和maxWait=1000做了测试,但是还是看不到loginTimeout的效果
(1)、连接数=0 maxWait=-1 loginTimeout=5 结果:一直等待
(2)、连接数=0 maxWait=10000 loginTimeout=5000 结果:等10s左右报错
(3)、断网,不管maxWait、loginTimeout为多少,都立即报错
3、针对minIdle与timeBetweenEvictionRunsMillis=30000搭配使用才有效做了测试
(1)、timeBetweenEvictionRunsMillis=30000
maxActive initialSize minIdle maxIdle 结果 说明
40 30 30 30 42
40 0 30 30 42 最小空闲有效
40 0 20 30 32 最小空闲有效
40 30 0 30 42 初始连接有效
40 0 0 1 13
说明最小空闲起作用:平常使用是12个,最小空闲为30,结果42=应用的12+空闲的30
(2)、无
maxActive initialSize minIdle maxIdle 结果 说明
40 30 30 30 42
40 0 30 30 13 最小空闲失效
40 30 0 30 42 初始连接有效
40 0 0 1 13
4、判断是jdbc创建连接时网络抖动造成,而和从dbcp中获取连接数无关,于是查postgreSQL-jdbc的源码
(1)、找到了报错的位置:postgresql-9.3-1104-jdbc4.jar 中的driver.java
(注:eclipse安装插件可直接反编译源码:Help-->Eclipse Marketplace-->Decompiler)
(2)代码如图:
所以在原有配置值的基础上加上:dataSource.url=jdbc:postgresql://ip:port/rimu?loginTimeout=5
4、虽然没有复现出生产异常,但是找到了异常产生的原因
三、总结
1、loginTimeout = int: Specify how long to wait for establishment of a database connection. The timeout is specified in seconds. 默认为0,单位s
socketTimeout = int : 如果pg-server超过这个时间不给app-server返回socket,则关闭这个connection
注: loginTimeout是jdbc建立连接超时
maxWait是从dbcp中获取连接超时
2、原因:mapper从dbcp池中获取connection,connection由jdbc创建放到dbcp池中。此报错是jdbc创建connection时超时(网络抖动)。
3、影响:执行的sql会失败,即使dbcp的maxWait为-1,表示获取不到connection时无限等待,但由于jdbc会抛出PSQLException异常,执行sql的方法还是会失败,所以对业务会产生影响。
4、解决:把jdbc的loginTimeout调大些:
dataSource.url=jdbc:postgresql://ip:port/rimu?loginTimeout=5
或者把dbcp的最小空闲调大(默认为0),不会正好在执行sql的时候去创建connection:
minIdle="10"
timeBetweenEvictionRunsMillis = "30000"
或在dbcp上添加检测机制:
testOnBorrow = "true" --->检测此时的connection是否正常,若不正常则放此connection,使用其他的
connection
validationQuery = "SELECT 1" --->检测时使用的sql语句
注:minIdle要与timeBetweenEvictionRunsMillis配合使用才有用,单独使用minIdle不会起作用。
timeBetweenEvictionRunsMillis:每30秒运行一次空闲连接回收器,配置timeBetweenEvictionRunsMillis = "30000"后,每30秒运行一次空闲连接回收器(独立线程)。并每次检查3个连接,如果连接空闲时间超过30分钟就销毁。销毁连接后,连接数量就少了,如果小于minIdle数量,就新建连接,维护数量不少于minIdle,过行了新老更替。
5、postgresql-jdbc官方文档:https://jdbc.postgresql.org/documentation/head/connect.html#connection-parameters