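While recently porting a legacy project to the ARM build of CentOS 7, I ran into a problem: Spring Boot started without trouble, but some pages hung when accessed. Because the target is aarch64, I used OpenJDK; on x86_64 the project had always run on Oracle's Server JRE with no issues. This time startup was normal, but afterwards some pages loaded fine while others hung. The hang lasted an unpredictable length of time, sometimes long and sometimes short, with no apparent pattern. Once a hung page finally loaded, refreshing it never hung again.
The first step was naturally to check the logs. On the first request to a page that hung, the following WARN appeared (the logger name shows it comes from the embedded Tomcat rather than from Spring itself):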
2019-12-09 17:40:32.995 WARN 15161 --- [https-jsse-nio-443-exec-8] o.a.c.util.SessionIdGeneratorBase : Creation of SecureRandom instance for session ID generation using [SHA1PRNG] took [178,241] milliseconds.
Creating the SecureRandom instance for session IDs took nearly 3 minutes (178,241 ms), and that is why the page hung. It also explains why only some pages hung: not every page uses a session. And it explains why a page that had hung was fine on refresh: the SecureRandom instance is created once and then reused.
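Now for the cause. The project uses Tomcat Embed as its embedded web server, and when Tomcat generates session IDs, org.apache.catalina.util.SessionIdGeneratorBase creates a SecureRandom instance. Session IDs must be hard to guess, so a cryptographically strong pseudo-random number generator is needed; Tomcat uses the SHA1PRNG algorithm here. The generator needs a random seed, which on Linux normally comes from /dev/random or /dev/urandom. Both produce random bits from the system's environmental noise. The difference shows when not enough noise has been collected: /dev/random blocks until more is available, whereas /dev/urandom never blocks, trading some security for availability.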
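The delay can be reproduced outside Tomcat with a few lines of Java. The sketch below is not Tomcat's code; it only imitates the step described above (obtain a SHA1PRNG instance, then draw one value so the generator has to seed itself) and measures how long that takes. The class name SeedTiming and the method timeSeedingMillis are hypothetical names chosen for this illustration.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;

// Hypothetical helper for illustration; not part of Tomcat or the JDK.
class SeedTiming {

    // Creates a SecureRandom for the given algorithm, forces it to seed
    // itself by drawing one value, and returns the elapsed time in ms.
    static long timeSeedingMillis(String algorithm) {
        long start = System.nanoTime();
        try {
            SecureRandom random = SecureRandom.getInstance(algorithm);
            // The first request for random data triggers self-seeding,
            // which is where a blocking seed source would stall.
            random.nextInt();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Unknown algorithm: " + algorithm, e);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        long elapsed = timeSeedingMillis("SHA1PRNG");
        System.out.println("SHA1PRNG seeding took " + elapsed + " ms");
    }
}
```

On a machine affected by the problem, this call should stall in roughly the same way the first session-backed page did; on a healthy machine it should return almost immediately.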
Judging by the symptoms, /dev/random was almost certainly the culprit. Looking in $JAVA_HOME/jre/lib/security/java.security turns up the following:
#
# Sun Provider SecureRandom seed source.
#
# Select the primary source of seed data for the "SHA1PRNG" and
# "NativePRNG" SecureRandom implementations in the "Sun" provider.
# (Other SecureRandom implementations might also use this property.)
#
# On Unix-like systems (for example, Solaris/Linux/MacOS), the
# "NativePRNG" and "SHA1PRNG" implementations obtains seed data from
# special device files such as file:/dev/random.
#
# On Windows systems, specifying the URLs "file:/dev/random" or
# "file:/dev/urandom" will enable the native Microsoft CryptoAPI seeding
# mechanism for SHA1PRNG.
#
# By default, an attempt is made to use the entropy gathering device
# specified by the "securerandom.source" Security property. If an
# exception occurs while accessing the specified URL:
#
# SHA1PRNG:
# the traditional system/thread activity algorithm will be used.
#
# NativePRNG:
# a default value of /dev/random will be used. If neither
# are available, the implementation will be disabled.
# "file" is the only currently supported protocol type.
#
# The entropy gathering device can also be specified with the System
# property "java.security.egd". For example:
#
# % java -Djava.security.egd=file:/dev/random MainClass
#
# Specifying this System property will override the
# "securerandom.source" Security property.
#
# In addition, if "file:/dev/random" or "file:/dev/urandom" is
# specified, the "NativePRNG" implementation will be more preferred than
# SHA1PRNG in the Sun provider.
#
securerandom.source=file:/dev/random
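Sure enough, it uses /dev/random. Going by the comments above, the fix is not complicated. Either add a startup argument or edit java.security:
Solution 1:
Add -Djava.security.egd=file:/dev/urandom to the startup arguments, for example: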
java -Djava.security.egd=file:/dev/urandom -jar xxxxx.jar
Solution 2:
Edit $JAVA_HOME/jre/lib/security/java.security, find securerandom.source, and change it to:
securerandom.source=file:/dev/urandom
After restarting the site, the hangs were gone.
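To confirm which seed source a JVM actually ends up with after either change, the two settings can be read back from inside Java. This is a minimal sketch that follows the precedence stated in the java.security comments quoted earlier (the java.security.egd system property overrides the securerandom.source security property). The class name SeedSourceCheck and the method effectiveSeedSource are hypothetical names for this illustration, and the code only reports the configured values; it does not inspect what the provider does internally.

```java
import java.security.Security;

// Hypothetical helper for illustration; not part of the JDK.
class SeedSourceCheck {

    // Returns the configured seed source: the java.security.egd system
    // property if it is set and non-empty, otherwise the
    // securerandom.source security property, otherwise an empty string.
    static String effectiveSeedSource() {
        String egd = System.getProperty("java.security.egd", "");
        if (!egd.isEmpty()) {
            return egd;
        }
        String source = Security.getProperty("securerandom.source");
        return source == null ? "" : source;
    }

    public static void main(String[] args) {
        System.out.println("java.security.egd   = "
                + System.getProperty("java.security.egd"));
        System.out.println("securerandom.source = "
                + Security.getProperty("securerandom.source"));
        System.out.println("configured source   = " + effectiveSeedSource());
    }
}
```

Run it with the same JVM and the same -D arguments as the application, so that it reads the same java.security file and system properties.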