I have configured Tomcat with the following settings:
<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="20000"
maxThreads="500"
maxConnections="20000"
acceptCount="150"
etc... />
The AJP connector uses the same numbers: maxThreads=500 and acceptCount="150".
It works fine most of the time, but at peak times, when I get many more requests than usual, it takes too long to respond: sometimes more than 15 seconds, and in rare cases requests time out. That might look expected, since maxThreads=500 and I have several thousand requests. However, on the Server Status page I see:
Max threads: 500 Current thread count: 17 Current thread busy: 1 Keep alive sockets count: 1
The highest currentThreadCount I have seen so far was 27. If there are so many connections, shouldn't Tomcat create more threads (up to 500) to respond faster?
So, what am I doing wrong? What am I missing? I have a 2-core CPU (max usage during peak hours ~10%) and 2GB of RAM (max usage 60%).
Short info about the web app: normally, each user makes at least 2 requests per session: one for a static JSON response and one that runs a database query. At peak time I have 15-20k active users, but I don't know how many requests per second I get. However, slow responses start at around 5k active users.
I also increased max-active connections in the app properties, with no change in performance. My current application.properties:
spring.jpa.hibernate.ddl-auto=update
spring.datasource.driverClassName=com.mysql.jdbc.Driver
spring.datasource.url=jdbc:mysql://localhost:3306/database_name
spring.datasource.username=$username$
spring.datasource.password=$password$
spring.datasource.tomcat.max-active=200
spring.datasource.tomcat.max-wait=10000
spring.datasource.tomcat.max-idle=50
spring.datasource.tomcat.min-idle=10
spring.datasource.tomcat.initial-size=10
UPDATE: I changed the default JDBC connection pool to Hikari with the following configuration and enabled JTA. However, I didn't notice any difference at peak times:
spring.jta.enabled=true
spring.datasource.hikari.maximum-pool-size=125
spring.datasource.hikari.minimum-idle=5
I am adding the database query below. The results of the query are later added to another object and returned as the ResponseBody.
@Query("select new ObjectClass(s.id, s.a, s.b, s.c) from TableName s " +
"where s.x > :param order by s.id desc")
List<ObjectClass> getObjects(@Param("param") long param);
CPU usage doesn't grow and RAM is almost half free. If I were getting too many requests, shouldn't the server be overloaded? Instead, I just get slow response times. Therefore, I think I have a configuration problem, which I want to resolve.
-Xms512M -Xmx1024M
The app that hangs at peak time:
Active sessions: 3243 Session count: 475330 Max active sessions: 4685 Rejected session creations: 0 Expired sessions: 472105 Longest session alive time: 7457 s Average session alive time: 9 s Processing time: 3177 ms JSPs loaded: 0 JSPs reloaded: 0
Stack trace:
"Attach Listener" #502 daemon prio=9 os_prio=0 tid=0x00007fde58007800 nid=0x3ff waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Abandoned connection cleanup thread" #69 daemon prio=5 os_prio=0 tid=0x00007fde6c03e800 nid=0xa44 in Object.wait() [0x00007fde471ba000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000000c259e618> (a java.lang.ref.ReferenceQueue$Lock)
at com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-exec-25" #68 daemon prio=5 os_prio=0 tid=0x00007fde40016000 nid=0x741 waiting on condition [0x00007fde35fe0000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c1cc6758> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-exec-11" #54 daemon prio=5 os_prio=0 tid=0x00007fde38041800 nid=0x733 waiting on condition [0x00007fde36fee000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c1cc6758> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-AsyncTimeout" #52 daemon prio=5 os_prio=0 tid=0x00007fde884e8800 nid=0x732 waiting on condition [0x00007fde370ef000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.coyote.AbstractProtocol$AsyncTimeout.run(AbstractProtocol.java:1211)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-Acceptor-0" #51 daemon prio=5 os_prio=0 tid=0x00007fde884e6800 nid=0x731 runnable [0x00007fde371f0000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
- locked <0x00000000c019d7e8> (a java.lang.Object)
at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:455)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-ClientPoller-1" #50 daemon prio=5 os_prio=0 tid=0x00007fde884e4800 nid=0x730 runnable [0x00007fde372f1000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000c1da2fa0> (a sun.nio.ch.Util$3)
- locked <0x00000000c1da2f90> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c1d5b1e0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:787)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-ClientPoller-0" #49 daemon prio=5 os_prio=0 tid=0x00007fde884d6000 nid=0x72f runnable [0x00007fde373f2000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000c1d510d8> (a sun.nio.ch.Util$3)
- locked <0x00000000c1d510c8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c1ce78c0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:787)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-exec-10" #48 daemon prio=5 os_prio=0 tid=0x00007fde884c7000 nid=0x72e waiting on condition [0x00007fde374f3000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c1cc6758> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-exec-2" #40 daemon prio=5 os_prio=0 tid=0x00007fde884b7000 nid=0x726 waiting on condition [0x00007fde37cfb000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c1cc6758> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
"ajp-nio-8009-exec-1" #39 daemon prio=5 os_prio=0 tid=0x00007fde884b5000 nid=0x725 waiting on condition [0x00007fde37dfc000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c1cc6758> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:467)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:85)
at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:31)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
"http-nio-127.0.0.1-8080-AsyncTimeout" #38 daemon prio=5 os_prio=0 tid=0x00007fde884b3000 nid=0x724 waiting on condition [0x00007fde37efd000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.coyote.AbstractProtocol$AsyncTimeout.run(AbstractProtocol.java:1211)
at java.lang.Thread.run(Thread.java:748)
"http-nio-127.0.0.1-8080-Acceptor-0" #37 daemon prio=5 os_prio=0 tid=0x00007fde884b1800 nid=0x723 runnable [0x00007fde37ffe000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250)
- locked <0x00000000c01a03b8> (a java.lang.Object)
at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:455)
at java.lang.Thread.run(Thread.java:748)
"http-nio-127.0.0.1-8080-exec-1" #25 daemon prio=5 os_prio=0 tid=0x00007fde88324000 nid=0x717 waiting on condition [0x00007fde46db8000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c1d9c4e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103)
at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
"ContainerBackgroundProcessor[StandardEngine[Catalina]]" #24 daemon prio=5 os_prio=0 tid=0x00007fde88323000 nid=0x716 waiting on condition [0x00007fde476bb000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1355)
at java.lang.Thread.run(Thread.java:748)
"Abandoned connection cleanup thread" #22 daemon prio=5 os_prio=0 tid=0x00007fde4ca72800 nid=0x6f5 in Object.wait() [0x00007fde45c22000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000000c102c4b0> (a java.lang.ref.ReferenceQueue$Lock)
at com.mysql.jdbc.AbandonedConnectionCleanupThread.run(AbandonedConnectionCleanupThread.java:64)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"Tomcat JDBC Pool Cleaner[1595428806:1507838479700]" #21 daemon prio=5 os_prio=0 tid=0x00007fde4ca5b800 nid=0x6f4 in Object.wait() [0x00007fde470b9000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:552)
- locked <0x00000000c0f6fe80> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:505)
"NioBlockingSelector.BlockPoller-2" #13 daemon prio=5 os_prio=0 tid=0x00007fde8847e000 nid=0x66f runnable [0x00007fde478bd000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000c019bd40> (a sun.nio.ch.Util$3)
- locked <0x00000000c019bd30> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c019bbf8> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:339)
"NioBlockingSelector.BlockPoller-1" #12 daemon prio=5 os_prio=0 tid=0x00007fde8846f800 nid=0x66e runnable [0x00007fde479be000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000c019ec10> (a sun.nio.ch.Util$3)
- locked <0x00000000c019ec00> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000c019ead8> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:339)
"GC Daemon" #11 daemon prio=2 os_prio=0 tid=0x00007fde883f9000 nid=0x66b in Object.wait() [0x00007fde741c6000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000000c02f16d8> (a sun.misc.GC$LatencyLock)
at sun.misc.GC$Daemon.run(GC.java:117)
- locked <0x00000000c02f16d8> (a sun.misc.GC$LatencyLock)
"AsyncFileHandlerWriter-1510467688" #10 daemon prio=5 os_prio=0 tid=0x00007fde88168800 nid=0x63e waiting on condition [0x00007fde7475c000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c02f16e8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
at java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522)
at java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684)
at org.apache.juli.AsyncFileHandler$LoggerThread.run(AsyncFileHandler.java:160)
"Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007fde880af000 nid=0x62e runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007fde880ac000 nid=0x62d waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007fde880a9000 nid=0x62c waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007fde880a7000 nid=0x62b runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007fde88080000 nid=0x625 in Object.wait() [0x00007fde74f33000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked <0x00000000c02f7408> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)
"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007fde8807b800 nid=0x622 in Object.wait() [0x00007fde75034000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
- locked <0x00000000c02f7490> (a java.lang.ref.Reference$Lock)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
"main" #1 prio=5 os_prio=0 tid=0x00007fde8800a800 nid=0x589 runnable [0x00007fde8f6af000]
java.lang.Thread.State: RUNNABLE
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
at java.net.ServerSocket.implAccept(ServerSocket.java:545)
at java.net.ServerSocket.accept(ServerSocket.java:513)
at org.apache.catalina.core.StandardServer.await(StandardServer.java:466)
at org.apache.catalina.startup.Catalina.await(Catalina.java:744)
at org.apache.catalina.startup.Catalina.start(Catalina.java:690)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:355)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:495)
"VM Thread" os_prio=0 tid=0x00007fde88073800 nid=0x5fd runnable
"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007fde8801f800 nid=0x597 runnable
"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007fde88021000 nid=0x598 runnable
"VM Periodic Task Thread" os_prio=0 tid=0x00007fde880bd800 nid=0x62f waiting on condition
JNI global references: 317
Update: While I haven't resolved my problem, @Per Huss's answer pushed me in the right direction: analysing each thread separately to find the problem. I have to award my bounty now, so I will award it to him. However, I thank everyone who commented here, as all the comments helped me learn something new.
Update 2: It looks like the problem is within Apache. At peak times even static pages respond slowly, including ones from other apps and the Tomcat manager. So I changed the MPM from prefork to mpm_worker and am currently testing different configurations. I will update this thread with the results soon.
6 Answers
#1
3
I'm afraid the question does not currently have enough information for anything other than guesses. The low CPU usage indicates that your Java process is waiting for something, which could be anything from obtaining a database connection to waiting for the result of a query. I would start by finding out what is causing the wait before trying to fix it. One way to do that is to run jstack <pid> (where <pid> is your process's PID) during peak. It lists a stack trace for each thread. You may be able to spot the problem from that, or you can paste it into the question and perhaps the community can help you out. Good luck with your tuning!
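A long dump is easier to read once it is summarised. The following is only a rough sketch, assuming the dump has the same layout as the one pasted in the question (a quoted thread name on one line, then a `java.lang.Thread.State:` line); the class and method names are made up for illustration. It counts thread states for threads whose name starts with a given prefix, such as the `ajp-nio-8009-exec` workers.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the JDK or Tomcat.
final class JstackSummary {
    // A thread header line starts with the quoted thread name.
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\".*");
    // The state line follows the header, e.g. "java.lang.Thread.State: TIMED_WAITING (parking)".
    private static final Pattern STATE =
            Pattern.compile("^\\s*java\\.lang\\.Thread\\.State: (\\w+).*");

    /** Counts thread states for threads whose name starts with namePrefix. */
    static Map<String, Integer> countStates(String dump, String namePrefix) {
        Map<String, Integer> counts = new TreeMap<>();
        boolean inMatchingThread = false;
        for (String line : dump.split("\\R")) {
            Matcher header = HEADER.matcher(line);
            if (header.matches()) {
                inMatchingThread = header.group(1).startsWith(namePrefix);
                continue;
            }
            Matcher state = STATE.matcher(line);
            if (inMatchingThread && state.matches()) {
                counts.merge(state.group(1), 1, Integer::sum);
                inMatchingThread = false; // only one state line per thread
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Shortened sample in the shape of the dump from the question.
        String sample = String.join("\n",
                "\"ajp-nio-8009-exec-25\" #68 daemon prio=5 waiting on condition",
                "   java.lang.Thread.State: TIMED_WAITING (parking)",
                "\"ajp-nio-8009-Acceptor-0\" #51 daemon prio=5 runnable",
                "   java.lang.Thread.State: RUNNABLE",
                "\"ajp-nio-8009-exec-11\" #54 daemon prio=5 waiting on condition",
                "   java.lang.Thread.State: TIMED_WAITING (parking)");
        System.out.println(countStates(sample, "ajp-nio-8009-exec")); // prints {TIMED_WAITING=2}
    }
}
```

If nearly all worker threads are parked in TaskQueue.poll, as in the dump in the question, they are idle and waiting for work, which suggests the requests are being held up before they reach Tomcat.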
#2
2
You can allow as many threads as you want, but if the number of queries increases, then the response time of the RDBMS will deteriorate, which is probably your root cause.
You need to determine where the bottleneck is. Create a dummy page and hammer it with requests from several computers. If the dummy page responds in time, then your problem has little, if anything, to do with the number of connections and much more to do with your database. It is highly probable that this is the case.
Take a look at your database and make sure your schema is in normal form. Also, if you search frequently by some columns, make sure you create the correct indexes. Take a look at your queries and observe whether they are unnecessarily slow. If so, optimize them. Cache data that does not change too frequently and reuse it.
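To make the dummy-page test concrete, here is a minimal sketch of a load generator. It is only an illustration: the URL `http://localhost:8080/dummy`, the request count and the concurrency are assumptions, not values from the question, and it needs Java 11 or later for `java.net.http` (the stack trace in the question looks like Java 8, so run it with a separate JDK, ideally from another machine).

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration only.
final class DummyPageLoadTest {
    /** Nearest-rank percentile of an ascending-sorted array; p is in (0, 100]. */
    static long percentile(long[] sortedMillis, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sortedMillis.length);
        return sortedMillis[Math.max(0, rank - 1)];
    }

    /** Sends `total` GET requests from `concurrency` threads; returns sorted latencies in ms. */
    static long[] run(String url, int total, int concurrency) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < total; i++) {
                tasks.add(() -> {
                    long start = System.nanoTime();
                    // The body is discarded; only the round-trip time matters here.
                    client.send(request, HttpResponse.BodyHandlers.discarding());
                    return (System.nanoTime() - start) / 1_000_000;
                });
            }
            long[] latencies = new long[total];
            int i = 0;
            for (Future<Long> f : pool.invokeAll(tasks)) {
                latencies[i++] = f.get(); // rethrows if a request failed
            }
            Arrays.sort(latencies);
            return latencies;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed URL of a page that does no database work.
        String url = args.length > 0 ? args[0] : "http://localhost:8080/dummy";
        long[] ms = run(url, 1000, 50);
        System.out.println("median=" + percentile(ms, 50) + " ms, p95=" + percentile(ms, 95) + " ms");
    }
}
```

Run it once against the dummy page and once against the page that queries the database. If only the second one is slow, look at the database; if both are slow, the bottleneck sits in front of the application.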
#3
1
When using Spring, the default server is Tomcat. However, you can use Netty, Undertow or Jetty for better performance. Please also remember that with a 2-core CPU you don't really have 500 threads running in parallel.
However, the suggestion in the answer above, to actually simulate how your application reacts to traffic, is probably the best way to go. If you use a relational database, remember that writes can be as much as ten times slower than reads (you can see some interesting statistics on that in the Cassandra documentation). If you use Hibernate, you may want to look for the n+1 problem too. The best way to do that: write an integration test and log the SQL sent to the database. If your test sends 51 queries instead of one, there you have it.
#4
1
If there are so many connections, shouldn't tomcat create more threads (up to 500) to respond faster?
=> As per the Tomcat 8 docs, if more simultaneous requests are received than can be handled by the currently available request processing threads, additional threads will be created up to the configured maximum (the value of the maxThreads attribute). If still more simultaneous requests are received, they are stacked up inside the server socket created by the Connector, up to the configured maximum (the value of the acceptCount attribute).
So your Tomcat should be creating threads as required. Also, Tomcat 8 works in NIO mode by default, meaning a worker thread is not tied up by an idle connection, so a small number of threads can serve many connections. You can confirm that behavior by running a monitoring tool like "jvisualvm" during your load test. It reports:
- Live threads: the current number of live threads, including both daemon and non-daemon threads.
- Live peak: the peak count of live threads since the Java virtual machine started or the peak was reset.
- Daemon threads: the current number of live daemon threads.
- Total threads: the total number of threads created and started since the Java virtual machine started.
So, what am I doing wrong? What am I missing? I have 2 core CPU (max usage during peak hours ~10%) and 2GB of RAM (max usage 60%). CPU usage doesn't grow, RAM is almost half-free if I am having too many requests, shouldn't I have overloaded on the server? Instead, I just get slow response time.
=> IMO, threads are blocking while fetching data from the DB. It could be due to poor query performance under load. I would suggest enabling "hibernate.show_sql" to capture the SQL. Check the execution plan of the SQL and ensure that indexes are being used. You can also check the performance of the query under load by executing it in a SQL client.
#5
1
I finally solved my problem. In fact, it was Apache that didn't allow enough connections. First of all, I changed prefork to mpm_worker. Later, I increased the number of MaxRequestWorkers.
<IfModule mpm_worker_module>
StartServers 2
MinSpareThreads 50
MaxSpareThreads 125
ThreadLimit 64
ThreadsPerChild 25
ServerLimit 5000
MaxRequestWorkers 5000
MaxConnectionsPerChild 4500
</IfModule>
Earlier, I was already getting slow response times with 3000 active users. With the new configuration, even 17000 active users didn't increase the response time, and it worked as it does in normal times. As expected, CPU and RAM usage increased at peak time and then went back to normal.
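For anyone sizing this themselves: as far as I understand the worker MPM, the number of child processes needed is MaxRequestWorkers divided by ThreadsPerChild, that count must not exceed ServerLimit, and ThreadsPerChild must not exceed ThreadLimit. A small sketch of that arithmetic with the values from the config above (the class and method names are made up for illustration):

```java
// Hypothetical helper for illustration; it only does arithmetic on the directive values.
final class WorkerMpmSizing {
    /** Child processes needed to provide maxRequestWorkers threads (rounded up). */
    static int childProcessesNeeded(int maxRequestWorkers, int threadsPerChild) {
        return (maxRequestWorkers + threadsPerChild - 1) / threadsPerChild;
    }

    /** True if the values are consistent under the assumptions stated above. */
    static boolean fits(int maxRequestWorkers, int threadsPerChild,
                        int serverLimit, int threadLimit) {
        return threadsPerChild <= threadLimit
                && childProcessesNeeded(maxRequestWorkers, threadsPerChild) <= serverLimit;
    }

    public static void main(String[] args) {
        // Values from the config above: MaxRequestWorkers 5000, ThreadsPerChild 25,
        // ServerLimit 5000, ThreadLimit 64.
        System.out.println(childProcessesNeeded(5000, 25)); // prints 200
        System.out.println(fits(5000, 25, 5000, 64));       // prints true
    }
}
```

So 5000 workers at 25 threads per child come to 200 processes, which means a ServerLimit of 5000 is far more than this configuration needs.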
#6
1
With this kind of bug, we should first identify where the problem is. Here is a plan of action for debugging this type of issue:
For example, in your case, requests come from the user to Tomcat, which hands them to your application.
First of all, check where the issue is. It can be in any of the following places:
- One or all of your application's APIs have started taking time, but Tomcat threads are free
- Your Tomcat threads are not free and each of them takes a long time to process its request, so latency occurs
- Your database has started taking time
- As you are querying the database, more data may be loaded into your app than expected, causing Java GC issues
In the first case, check your application logs. If there is no logging, add it and check whether any part of your application is taking time (logs never lie).
In the second case, check your Tomcat logs to see what the condition is there.
In the third case, check your database logs to see whether queries are taking more time.
In the fourth case, you can monitor the health of your JVM. There are many tools for that, such as JFR, jvisualvm, etc.
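For the fourth case, if attaching a tool is not convenient, the JVM can also report its own GC totals through the standard management beans. This is a minimal sketch with made-up class and method names; it only tells you something about your app if it runs inside the same JVM as the app (for example, called from a diagnostic endpoint, which is an assumption here). Run standalone, it reports on itself.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration only.
final class GcSnapshot {
    /** Total time spent in GC since JVM start, in milliseconds, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** GC time as a percentage of JVM uptime; a rough indicator, not a precise measurement. */
    static double gcOverheadPercent() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis == 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.println("GC overhead %: " + gcOverheadPercent());
    }
}
```

A figure that climbs sharply at peak time would point to the GC case; a figure that stays low, together with the ~10% CPU usage from the question, makes GC an unlikely cause.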
Also, your question does not have enough detail. Please answer the following:
- What is the structure of your application?
- What do you do to bring your application back to a normal state? For example, does restarting solve the issue? I am asking because if you need to restart it, there may be a deadlock, so you might need to take a jstack dump and analyse it
- How much Xmx is given to the application?
- Are your database server and application server on the same machine? There may be an I/O problem at peak time on one of the machines, so we need to check both
Please first identify where the problem is; then we can proceed to how to solve it.
Thanks
#1
3
I'm afraid the question does not currently have enough information for anything other than guesses. The low CPU usage indicates that your java process is waiting for something, which could be anything from obtaining a connection to the database, waiting for the result of a query, or anything else. I would start by looking at what is causing the wait, before trying to fix it. One way to do that is to run
我担心这个问题目前没有足够的信息用于猜测以外的任何事情。低CPU使用率表明您的java进程正在等待某些事情,这可能是从获取连接到数据库,等待查询结果或其他任何事情。在尝试修复之前,我将首先查看导致等待的原因。一种方法是运行
jstack <pid>
(where <pid>
would be your process' pid) during peak. It will list a stack trace for each thread. You may be able to spot the problem from that, or you can paste it into the question and perhaps the community can help you out. Good luck with your tuning!
(高峰期间,
#2
2
You can allow as many threads as you want, but if the number of queries increases, then the response time of the RDBMS will deteriorate, which is probably your root cause.
您可以根据需要允许任意数量的线程,但如果查询数量增加,则RDBMS的响应时间将会恶化,这可能是您的根本原因。
You need to determine where the bottleneck is. Create a dummy page and hammer it with requests from several computers. If the dummy page responds in time, then your problem has little, if anything, to do with the number of connections and much more to do with your database. It is highly probable that this is the case.
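The dummy-page test can be sketched with the JDK's own HTTP client (Java 11+). This is a rough sketch, not a replacement for a proper load-testing tool. The class name, the /dummy path and the concurrency numbers are assumptions, and the built-in local server only exists so that the example runs on its own. In practice you would point the URI at a trivial page on your real Tomcat.

```java
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class DummyPageLoadTest {
    /** Sends `total` GET requests from `concurrency` threads; returns the slowest latency in ms. */
    static long maxLatencyMillis(URI target, int concurrency, int total) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_1_1)
                .build();
        HttpRequest request = HttpRequest.newBuilder(target).GET().build();
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            List<Callable<Long>> calls = new ArrayList<>();
            for (int i = 0; i < total; i++) {
                calls.add(() -> {
                    long start = System.nanoTime();
                    client.send(request, HttpResponse.BodyHandlers.discarding());
                    return (System.nanoTime() - start) / 1_000_000;
                });
            }
            long worst = 0;
            // invokeAll blocks until every request has completed.
            for (Future<Long> f : pool.invokeAll(calls)) {
                worst = Math.max(worst, f.get());
            }
            return worst;
        } finally {
            pool.shutdown();
        }
    }

    /** Starts a throwaway local "dummy page" so the sketch runs without a real server. */
    static HttpServer startDummyServer() throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/dummy", exchange -> {
            byte[] body = "ok".getBytes();
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = startDummyServer();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/dummy");
            System.out.println("slowest request: " + maxLatencyMillis(uri, 20, 200) + " ms");
        } finally {
            server.stop(0);
        }
    }
}
```

Running the same measurement against the dummy page and against a page that hits the database gives a direct comparison of where the time goes.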
Take a look at your database and make sure your schema is in normal form. Also, if you frequently search by certain columns, make sure you create the right indexes for them. Look at your queries and check whether they are unnecessarily slow. If so, optimize them. Cache data that does not change too frequently and reuse it.
#3
1
When using Spring, the default server is Tomcat. However, you can try Netty, Undertow or Jetty, which may give better performance. Please also remember that with a 2-core CPU you do not really have 500 threads running in parallel: at most two of them execute at any instant.
However, the suggestion in the answer above, to actually simulate how your application reacts to traffic, is probably the best way to go. If you use a relational database, remember that writes can be as much as ten times slower than reads (the Cassandra documentation has some interesting statistics on that). If you use Hibernate, you may want to look for the N+1 problem too. The best way to do that is to write an integration test and log the SQL sent to the database. If your test sends 51 queries instead of one, there you have it.
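Where the 51 comes from can be shown without Hibernate at all. The sketch below uses a hypothetical CountingDb that only counts the statements it receives. It is not Hibernate's API, just an illustration of the arithmetic: one query for 50 parents plus one query per parent, versus a single joined query.

```java
import java.util.ArrayList;
import java.util.List;

final class NPlusOneDemo {
    /** Hypothetical stand-in for a database that only counts the statements it receives. */
    static final class CountingDb {
        int statements = 0;

        List<Integer> loadParentIds(int n) {
            statements++; // one SELECT for all parents
            List<Integer> ids = new ArrayList<>();
            for (int i = 1; i <= n; i++) ids.add(i);
            return ids;
        }

        String loadChildFor(int parentId) {
            statements++; // one extra SELECT per parent: the "+N" part
            return "child-of-" + parentId;
        }

        List<String> loadParentsWithChildren(int n) {
            statements++; // a single joined SELECT
            List<String> rows = new ArrayList<>();
            for (int i = 1; i <= n; i++) rows.add("child-of-" + i);
            return rows;
        }
    }

    /** Lazy access pattern: 1 query for the parents plus 1 per parent. */
    static int lazyQueryCount(int parents) {
        CountingDb db = new CountingDb();
        for (int id : db.loadParentIds(parents)) {
            db.loadChildFor(id);
        }
        return db.statements;
    }

    /** Join-style access pattern: everything in one query. */
    static int joinedQueryCount(int parents) {
        CountingDb db = new CountingDb();
        db.loadParentsWithChildren(parents);
        return db.statements;
    }

    public static void main(String[] args) {
        System.out.println("lazy:   " + lazyQueryCount(50) + " statements");   // 51
        System.out.println("joined: " + joinedQueryCount(50) + " statements"); // 1
    }
}
```

In a real integration test the counting would come from the logged SQL rather than from a fake like this one.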
#4
1
If there are so many connections, shouldn't tomcat create more threads (up to 500) to respond faster?
=> As per the Tomcat 8 docs, if more simultaneous requests are received than can be handled by the currently available request processing threads, additional threads will be created up to the configured maximum (the value of the maxThreads attribute). If still more simultaneous requests are received, they are stacked up inside the server socket created by the Connector, up to the configured maximum (the value of the acceptCount attribute).
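The "threads are created on demand, up to the maximum" behaviour can be illustrated with a plain java.util.concurrent.ThreadPoolExecutor. This is a stand-in, not Tomcat's own executor. The SynchronousQueue is chosen so that a new thread is started whenever no idle thread is available, and the 10 core threads are an arbitrary value for the sketch.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {
    /**
     * Runs `concurrent` blocking tasks at once and reports how many threads
     * the pool created. `concurrent` must not exceed `maxThreads`, otherwise
     * this executor rejects the extra tasks.
     */
    static int threadsUsed(int maxThreads, int concurrent) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                10, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < concurrent; i++) {
                pool.execute(() -> {
                    try {
                        release.await(); // simulate a request that is still being processed
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            return pool.getLargestPoolSize();
        } finally {
            release.countDown();
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(threadsUsed(500, 27)); // 27
        System.out.println(threadsUsed(500, 3));  // 3
    }
}
```

The pool grows only as far as the number of requests that are in flight at the same moment. A maximum of 500 with a thread count of 27 therefore suggests that no more than about 27 requests ever reached the pool simultaneously.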
So your Tomcat should be creating threads as required. Also, Tomcat 8 uses the NIO connector by default, which means a thread is not tied up by a connection that is idle (for example, between keep-alive requests), so a small number of threads can serve many connections. You can confirm that behaviour by running a monitoring tool such as jvisualvm during your load test. Its thread view reports:
Live threads: the current number of live threads, including both daemon and non-daemon threads.
Live peak: the peak count of live threads since the Java virtual machine started or the peak was reset.
Daemon threads: the current number of live daemon threads.
Total threads: the total number of threads created and started since the Java virtual machine started.
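The same four counters are exposed by the standard ThreadMXBean, so they can also be read programmatically. This is a minimal sketch with an illustrative class name. It reports on the JVM it runs in, so for Tomcat it would have to run inside the application or be read remotely over JMX.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

final class ThreadCounters {
    /** Returns {live, peak, daemon, totalStarted} for the current JVM. */
    static long[] snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return new long[] {
            threads.getThreadCount(),            // live threads (daemon and non-daemon)
            threads.getPeakThreadCount(),        // peak since JVM start or last reset
            threads.getDaemonThreadCount(),      // live daemon threads
            threads.getTotalStartedThreadCount() // all threads started since JVM start
        };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.println("live=" + s[0] + " peak=" + s[1]
                + " daemon=" + s[2] + " totalStarted=" + s[3]);
    }
}
```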
So, what am I doing wrong? What am I missing? I have 2 core CPU (max usage during peak hours ~10%) and 2GB of RAM (max usage 60%). CPU usage doesn't grow and RAM is almost half free. If I am getting too many requests, shouldn't the server be overloaded? Instead, I just get slow response times.
=> IMO, the threads are blocking while fetching data from the DB. This could be due to poor query performance under load. I would suggest enabling "hibernate.show_sql" to capture the SQL. Check the execution plan of the SQL and ensure that indexes are being used. You can also check how the query performs under load by executing it in a SQL client.
#5
1
I finally solved my problem. In fact, it was Apache that didn't allow enough connections. First of all, I changed the MPM from prefork to worker. Later, I increased MaxRequestWorkers.
<IfModule mpm_worker_module>
    StartServers            2
    MinSpareThreads         50
    MaxSpareThreads         125
    ThreadLimit             64
    ThreadsPerChild         25
    ServerLimit             5000
    MaxRequestWorkers       5000
    MaxConnectionsPerChild  4500
</IfModule>
Earlier, I was already getting slow response times with 3000 active users. With the new configuration, even 17000 active users didn't increase the response time, and it worked as in normal times. As expected, CPU and RAM usage increased at peak time and then went back to normal.
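How the worker MPM numbers relate to each other can be checked with a little arithmetic. With the worker MPM, MaxRequestWorkers is the total number of threads across all child processes, ThreadsPerChild may not exceed ThreadLimit, and the resulting number of child processes may not exceed ServerLimit. The helper below is only a sketch with illustrative names.

```java
final class MpmWorkerMath {
    /** Child processes needed to reach maxRequestWorkers threads, rounded up. */
    static int childProcessesNeeded(int maxRequestWorkers, int threadsPerChild) {
        return (maxRequestWorkers + threadsPerChild - 1) / threadsPerChild;
    }

    /** True when the four limits are consistent with each other. */
    static boolean isConsistent(int maxRequestWorkers, int threadsPerChild,
                                int threadLimit, int serverLimit) {
        return threadsPerChild <= threadLimit
                && childProcessesNeeded(maxRequestWorkers, threadsPerChild) <= serverLimit;
    }

    public static void main(String[] args) {
        // Values taken from the configuration above.
        System.out.println(childProcessesNeeded(5000, 25));   // 200
        System.out.println(isConsistent(5000, 25, 64, 5000)); // true
    }
}
```

With these values, 5000 workers at 25 threads per child need 200 child processes, so a ServerLimit of 5000 leaves far more headroom than the configuration can use.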
#6
1
With this kind of bug, we should first of all identify where the problem is. Here is a plan of action for debugging this type of issue.
For example, in your case:
Requests come from the user to Tomcat, which then hands them to your application.
First of all, check where the issue is. There can be issues in the following places:
- One or all of your application's APIs have started taking longer, but the Tomcat threads are free
- Your Tomcat threads are not free and each thread takes a long time to process its request, so latency occurs
- Your database starts taking longer
- As you are querying the database, more data may be getting loaded into your app, causing Java GC issues
So, in the first case, check your application logs. If there is no logging, add it and check whether any part of your application is taking time (logs never lie).
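If the application has no timing logs yet, a small wrapper around the suspect calls is one way to add them. The sketch below is generic Java and assumes nothing about your code base. The class name SlowCallLogger, the labels and the threshold are illustrative, and a real application would use its logging framework instead of System.err.

```java
import java.util.function.Supplier;

final class SlowCallLogger {
    /** Holds a result together with how long it took to produce. */
    static final class Timed<T> {
        final T value;
        final long millis;

        Timed(T value, long millis) {
            this.value = value;
            this.millis = millis;
        }
    }

    /** Runs the call and measures its wall-clock duration. */
    static <T> Timed<T> time(Supplier<T> call) {
        long start = System.nanoTime();
        T value = call.get();
        return new Timed<>(value, (System.nanoTime() - start) / 1_000_000);
    }

    /** Runs the call and logs a line when it takes at least thresholdMillis. */
    static <T> T logIfSlow(String label, long thresholdMillis, Supplier<T> call) {
        Timed<T> t = time(call);
        if (t.millis >= thresholdMillis) {
            System.err.println("SLOW " + label + " took " + t.millis + " ms");
        }
        return t.value;
    }

    /** Simulated slow operation, standing in for a database query. */
    static String slowQuery() {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "row";
    }

    public static void main(String[] args) {
        String result = logIfSlow("db-query", 50, SlowCallLogger::slowQuery);
        System.out.println("result = " + result);
    }
}
```

Wrapping the database call and the handler as a whole separately shows whether the time is spent inside the query or elsewhere in the request.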
In the second case, check your Tomcat logs to see what the condition is there.
In the third case, check your database logs to see whether queries are taking more time.
In the fourth case, monitor the health of the JVM. There are many tools for this, such as JFR (Java Flight Recorder) and jvisualvm.
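Besides the graphical tools, the JVM's own GarbageCollectorMXBean gives a quick reading of how much garbage collection has happened. This is a minimal sketch with an illustrative class name. It reports on the JVM it runs in, so it assumes you can call it from inside the application.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcReport {
    /** Total time in ms spent in GC so far, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
        System.out.println("total GC time (ms): " + totalGcMillis());
    }
}
```

Sampling this before and during peak shows whether GC time grows sharply under load, which is what the fourth case would look like.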
Also, your question does not give enough detail. Please answer the following:
- What is the general structure of your application?
- What do you do to bring your application back to a normal state? For example, does restarting it solve the issue? I am asking because if you need to restart it, there may be a deadlock, in which case you should take a jstack dump and analyse it.
- How much heap (-Xmx) is given to the application?
- Are your database server and application server on the same machine? There may be an I/O problem on one of the machines at peak time, so we need to check both.
Please first identify where the problem is. Then we can work out how to solve it.
Thanks