Arthas
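Arthas User Guide
Arthas is an online monitoring and diagnostics tool. It gives a global, real-time view of an application's load, memory, GC, and thread state, and it can diagnose business problems without modifying application code: inspecting method arguments, return values, and exceptions, measuring method execution time, viewing class-loading information, and so on. This greatly speeds up troubleshooting in production.
This chapter is long because it includes usage examples, command explanations, and output walkthroughs; use the table of contents to jump to the part you need.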
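Download and installation
- Download Arthas
Download Arthas with the following command:
$ curl -O https://arthas.aliyun.com/arthas-boot.jar
- Start Arthas
Start Arthas with the following command:
$ java -jar arthas-boot.jar
If it reports that jps cannot be found or that JVM processes cannot be listed, pass the pid of the target process explicitly:
$ java -jar arthas-boot.jar 10
- Select the target process
After Arthas starts, it lists the running Java processes. Type the number shown for the target process and press Enter to attach to it.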
[INFO] JAVA_HOME: /Library/Java/JavaVirtualMachines/graalvm-community-openjdk-21/Contents/Home
[INFO] arthas-boot version: 4.0.4
[INFO] Found existing java process, please choose one and input the serial number of the process, eg : 1. Then hit ENTER.
* [1]: 11688 team.aikero.blade.examples.file.BladeFileApplicationKt
Common commands
Viewing application status
Use the dashboard command to view the current JVM's memory, thread, and GC information:
$ dashboard
Runtime
ID NAME GROUP PRIORITY STATE %CPU DELTA_TIME TIME INTERRUPTED DAEMON
14 JVMCI-nat system 9 RUNNABLE 0.31 0.015 0:3.008 false true
67 Timer-for system 5 RUNNABLE 0.13 0.006 0:0.042 false true
65 arthas-Ne system 5 RUNNABLE 0.05 0.002 0:0.083 false true
40 Catalina- main 1 WAITING 0.02 0.000 0:0.136 false false
39 Catalina- main 1 TIMED_WAITING 0.02 0.000 0:0.149 false false
52 http-nio- main 5 RUNNABLE 0.0 0.000 0:0.078 false true
9 Reference system 10 RUNNABLE 0.0 0.000 0:0.002 false true
10 Finalizer system 8 WAITING 0.0 0.000 0:0.000 false true
Memory used total max usage GC
heap 39M 62M 136M 29.37% gc.g1_young_generation.count 15
g1_eden_space 5M 18M -1 27.78% gc.g1_young_generation.time(ms) 65
g1_old_gen 31M 41M 136M 23.48% gc.g1_concurrent_gc.count 8
g1_survivor_space 3M 3M -1 100.00% gc.g1_concurrent_gc.time(ms) 33
nonheap 92M 96M -1 96.06% gc.g1_old_generation.count 0
codeheap_'non-nmethods' 1M 2M 6M 24.09% gc.g1_old_generation.time(ms) 0
metaspace 65M 66M -1 99.09%
From this JVM status output we can read off the following:
- Memory usage
Heap: 39M used, 62M total, 136M max, usage 29.37%.
G1 Eden Space: 5M used, 18M total, no max defined (-1), usage 27.78%.
G1 Old Gen: 31M used, 41M total, 136M max, usage 23.48%.
G1 Survivor Space: 3M used, 3M total, no max defined (-1), usage 100%.
Non-Heap: 92M used, 96M total, no max defined (-1), usage 96.06%.
Metaspace: 65M used, 66M total, no max defined (-1), usage 99.09%.
- GC (garbage collection)
- G1 Young Generation: 15 collections, 65 ms in total.
- G1 Old Generation: 0 collections, 0 ms.
- G1 Concurrent GC: 8 concurrent collections, 33 ms in total.
- Thread state
- JVMCI-native: uses 0.31% CPU, state RUNNABLE.
- Timer-for: uses 0.13% CPU, state RUNNABLE.
- arthas-Netty: uses 0.05% CPU, state RUNNABLE.
- Catalina-utility: in WAITING and TIMED_WAITING states, with low CPU usage.
- http-nio: uses 0% CPU, state RUNNABLE.
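The usage column in the memory figures above appears to be computed against max when a maximum is defined, and against total when max is -1 (for example, 5M / 18M for g1_eden_space gives 27.78%). The sketch below reproduces that arithmetic and prints the same kind of per-pool figures for its own JVM through the standard java.lang.management API. The rule is inferred from the sample output rather than taken from the Arthas source, and the names MemoryPoolUsage and usagePercent are illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Illustrative sketch: class and method names are hypothetical, not part of Arthas.
final class MemoryPoolUsage {

    // Usage relative to max when a maximum is defined, otherwise relative to
    // the committed ("total") size. A max of -1 means no limit is defined.
    static double usagePercent(long used, long committed, long max) {
        long base = max > 0 ? max : committed;
        return base > 0 ? 100.0 * used / base : 0.0;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            System.out.printf("%-36s used=%dK total=%dK max=%s usage=%.2f%%%n",
                    pool.getName(),
                    usage.getUsed() / 1024,
                    usage.getCommitted() / 1024,
                    usage.getMax() < 0 ? "-1" : (usage.getMax() / 1024) + "K",
                    usagePercent(usage.getUsed(), usage.getCommitted(), usage.getMax()));
        }
    }
}
```

Read this way, a pool with no maximum that reports close to 100% is simply using most of what the JVM has allocated for it so far, which is a weaker signal than a bounded pool approaching its limit.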
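Summary:
- Memory: heap usage is low. Non-heap usage, Metaspace in particular, looks high, but these pools report a max of -1 (no limit), so the percentage is relative to the currently allocated total rather than to a hard cap. It deserves attention mainly if the used size keeps growing.
- GC: the young generation has been collected 15 times at roughly 4 ms per collection, and the old generation has not been collected at all, so overall GC pressure is low.
- Threads: system threads and Arthas's own threads use little CPU, and the Tomcat threads are waiting, so overall thread state looks normal.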
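The per-collection cost mentioned in the summary is just the accumulated collection time divided by the collection count: 65 ms over 15 young collections is about 4.3 ms each. The sketch below computes the same average for every collector of its own JVM using GarbageCollectorMXBean, whose count and time counters are the same kind of figures shown in the GC column. Class and method names are illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch: names are hypothetical, not part of Arthas.
final class GcAverages {

    // Average time per collection in milliseconds. Returns 0 when nothing has
    // been collected yet or when the JVM reports the time as undefined (-1).
    static double averageMillis(long count, long totalMillis) {
        return count > 0 && totalMillis >= 0 ? (double) totalMillis / count : 0.0;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%-24s count=%d time=%dms avg=%.2fms%n",
                    gc.getName(),
                    gc.getCollectionCount(),
                    gc.getCollectionTime(),
                    averageMillis(gc.getCollectionCount(), gc.getCollectionTime()));
        }
    }
}
```

The counters are cumulative since JVM start, so comparing two readings taken some time apart gives a better picture of current GC activity than a single snapshot.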
Viewing thread information
Use the thread command to list threads and view their stack traces.
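The stack trace in the thread -n 1 output in this section passes through sun.management.ThreadImpl, the JDK's implementation of ThreadMXBean. As a rough illustration of the data that API provides, the sketch below lists the threads of its own JVM ordered by accumulated CPU time and prints the stack of the top one. It ranks by total CPU time, not by the sampled %CPU that Arthas displays, and all names in it are illustrative.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: names are hypothetical, not part of Arthas.
final class BusyThreads {

    // Returns up to n live threads, ordered by accumulated CPU time (highest first).
    static List<ThreadInfo> topByCpuTime(int n) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = mx.isThreadCpuTimeSupported();
        Map<Long, Long> cpuNanos = new HashMap<>();
        List<ThreadInfo> live = new ArrayList<>();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds(), Integer.MAX_VALUE)) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            live.add(info);
            long id = info.getThreadId();
            cpuNanos.put(id, cpuSupported ? mx.getThreadCpuTime(id) : -1L);
        }
        live.sort(Comparator
                .comparingLong((ThreadInfo t) -> cpuNanos.get(t.getThreadId()))
                .reversed());
        return new ArrayList<>(live.subList(0, Math.min(n, live.size())));
    }

    public static void main(String[] args) {
        for (ThreadInfo t : topByCpuTime(1)) {
            System.out.printf("\"%s\" Id=%d %s%n",
                    t.getThreadName(), t.getThreadId(), t.getThreadState());
            for (StackTraceElement frame : t.getStackTrace()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```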
Listing active threads
$ thread
Threads Total: 33, NEW: 0, RUNNABLE: 14, BLOCKED: 0, WAITING: 14, TIMED_WAITING: 5, TERMINATED: 0
ID NAME GROUP PRIORITY STATE %CPU DELTA_TIME TIME INTERRUPTED DAEMON
66 arthas-command-execute system 5 RUNNABLE 0.23 0.000 0:0.004 false true
9 Reference Handler system 10 RUNNABLE 0.0 0.000 0:0.002 false true
10 Finalizer system 8 WAITING 0.0 0.000 0:0.000 false true
11 Signal Dispatcher system 9 RUNNABLE 0.0 0.000 0:0.000 false true
14 JVMCI-native CompilerThread0 system 9 RUNNABLE 0.0 0.000 0:3.599 false true
Viewing the top N busiest threads
$ thread -n 1
"arthas-command-execute" Id=66 cpuUsage=0.23% deltaTime=0ms time=41ms RUNNABLE
at java.management@21.0.2/sun.management.ThreadImpl.dumpThreads0(Native Method)
at java.management@21.0.2/sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:482)
at com.taobao.arthas.core.command.monitor200.ThreadCommand.processTopBusyThreads(ThreadCommand.java:206)
at com.taobao.arthas.core.command.monitor200.ThreadCommand.process(ThreadCommand.java:122)
at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl.process(AnnotatedCommandImpl.java:82)
at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl.access$100(AnnotatedCommandImpl.java:18)
at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl$ProcessHandler.handle(AnnotatedCommandImpl.java:111)
at com.taobao.arthas.core.shell.command.impl.AnnotatedCommandImpl$ProcessHandler.handle(AnnotatedCommandImpl.java:108)
at com.taobao.arthas.core.shell.system.impl.ProcessImpl$CommandProcessTask.run(ProcessImpl.java:385)
at java.base@21.0.2/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
at java.base@21.0.2/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:317)
at java.base@21.0.2/java.util.concurrent.FutureTask.run(FutureTask.java)
at java.base@21.0.2/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base@21.0.2/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base@21.0.2/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base@21.0.2/java.lang.Thread.runWith(Thread.java:1596)
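at java.base@21.0.2/java.lang.Thread.run(Thread.java:1583)
Decompiling a class
Use the jad command to decompile a given class:
Tip
Decompilation goes from the class file back to Java, so the result is not exactly equivalent to the source. It is useful for verifying logic, but it will not necessarily match the source line for line.
The Kotlin compiler transforms Kotlin code considerably, so the compiled class differs noticeably from the source file; this does not prevent using jad.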
$ jad team.aikero.blade.examples.file.BladeFileApplicationKt
ClassLoader:
+-jdk.internal.loader.ClassLoaders$AppClassLoader@18b4aac2
+-jdk.internal.loader.ClassLoaders$PlatformClassLoader@5377414a
Location:
/Users/sivan/Developer/IDEA/Work-ZJ/blade-examples/blade-examples-file/build/classes/kotlin/main/
/*
* Decompiled with CFR.
*
* Could not load the following classes:
* team.aikero.blade.examples.file.BladeFileApplication
*/
package team.aikero.blade.examples.file;
import java.util.Arrays;
import kotlin.Metadata;
import kotlin.jvm.internal.Intrinsics;
import kotlin.jvm.internal.SourceDebugExtension;
import org.jetbrains.annotations.NotNull;
import org.springframework.boot.SpringApplication;
import team.aikero.blade.examples.file.BladeFileApplication;
@Metadata(mv={2, 0, 0}, k=2, xi=48, d1={"\n\n\n\n\n\n\b02\f\b00¢¨"}, d2={"main", "", "args", "", "", "([Ljava/lang/String;)V", "blade-examples-file"})
@SourceDebugExtension(value={"SMAP\nBladeFileApplication.kt\nKotlin\n*S Kotlin\n*F\n+ 1 BladeFileApplication.kt\nteam/aikero/blade/examples/file/BladeFileApplicationKt\n+ 2 SpringApplicationExtensions.kt\norg/springframework/boot/SpringApplicationExtensionsKt\n*L\n1#1,13:1\n34#2:14\n*S KotlinDebug\n*F\n+ 1 BladeFileApplication.kt\nteam/aikero/blade/examples/file/BladeFileApplicationKt\n*L\n11#1:14\n*E\n"})
public final class BladeFileApplicationKt {
public static final void main(@NotNull String[] args) {
Intrinsics.checkNotNullParameter(args, "args");
/*11*/ String[] args$iv = Arrays.copyOf(args, args.length);
boolean $i$f$runApplication = false;
Intrinsics.checkNotNullExpressionValue(SpringApplication.run(BladeFileApplication.class, Arrays.copyOf(args$iv, args$iv.length)), "run(T::class.java, *args)");
}
}
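Affect(row-cnt:1) cost in 144 ms.
Viewing JVM information
Use the jvm command to view an overview of the JVM: the arguments it was started with, class loading, memory areas, operating system, threads, and so on: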
$ jvm
RUNTIME
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
MACHINE-NAME 11688@Sivan.local
JVM-START-TIME 2025-01-22 17:38:42
MANAGEMENT-SPEC-VERSION 4.0
SPEC-NAME Java Virtual Machine Specification
SPEC-VENDOR Oracle Corporation
SPEC-VERSION 21
VM-NAME OpenJDK 64-Bit Server VM
VM-VENDOR GraalVM Community
VM-VERSION 21.0.2+13-jvmci-23.1-b30
INPUT-ARGUMENTS -XX:ThreadPriorityPolicy=1
-XX:+UnlockExperimentalVMOptions
-XX:+EnableJVMCIProduct
-XX:-UnlockExperimentalVMOptions
-agentlib:jdwp=transport=dt_socket,address=127.0.0.1:58275,suspend=y,server=n
-Xmx130m
-XX:+HeapDumpOnOutOfMemoryError
-javaagent:/Users/sivan/.gradle/caches/modules-2/files-2.1/org.jetbrains.kotlinx/kotlinx-coroutines-core-jvm/1.10.1/fe066928754beda3d59c8282e04289546465a360/kotlinx-coroutines-core-jvm-1.10.1.jar
-Dspring.output.ansi.enabled=always
-Dcom.sun.management.jmxremote
-Dspring.jmx.enabled=true
-Dspring.liveBeansView.mbeanDomain
-Dspring.application.admin.enabled=true
-Dmanagement.endpoints.jmx.exposure.include=*
-Dkotlinx.coroutines.debug.enable.creation.stack.trace=false
-Ddebugger.agent.enable.coroutines=true
-Dkotlinx.coroutines.debug.enable.flows.stack.trace=true
-Dkotlinx.coroutines.debug.enable.mutable.state.flows.stack.trace=true
-Dfile.encoding=UTF-8
-Dsun.stdout.encoding=UTF-8
-Dsun.stderr.encoding=UTF-8
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CLASS-LOADING
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
LOADED-CLASS-COUNT 13746
TOTAL-LOADED-CLASS-COUNT 13995
UNLOADED-CLASS-COUNT 249
IS-VERBOSE false
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
COMPILATION
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
NAME HotSpot 64-Bit Tiered Compilers
TOTAL-COMPILE-TIME 7512
[time (ms)]
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GARBAGE-COLLECTORS
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
G1 Young Generation name : G1 Young Generation
[count/time (ms)] collectionCount : 26
collectionTime : 99
G1 Concurrent GC name : G1 Concurrent GC
[count/time (ms)] collectionCount : 14
collectionTime : 51
G1 Old Generation name : G1 Old Generation
[count/time (ms)] collectionCount : 0
collectionTime : 0
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
MEMORY-MANAGERS
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
CodeCacheManager CodeHeap 'non-nmethods'
CodeHeap 'profiled nmethods'
CodeHeap 'non-profiled nmethods'
Metaspace Manager Metaspace
Compressed Class Space
G1 Young Generation G1 Eden Space
G1 Survivor Space
G1 Old Gen
G1 Concurrent GC G1 Old Gen
G1 Old Generation G1 Eden Space
G1 Survivor Space
G1 Old Gen
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
MEMORY
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HEAP-MEMORY-USAGE init : 142606336(136.0 MiB)
[memory in bytes] used : 78803288(75.2 MiB)
committed : 104857600(100.0 MiB)
max : 142606336(136.0 MiB)
NO-HEAP-MEMORY-USAGE init : 7667712(7.3 MiB)
[memory in bytes] used : 108579560(103.5 MiB)
committed : 112590848(107.4 MiB)
max : -1(-1 B)
PENDING-FINALIZE-COUNT 0
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OPERATING-SYSTEM
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
OS Mac OS X
ARCH aarch64
PROCESSORS-COUNT 10
LOAD-AVERAGE 4.22119140625
VERSION 15.2
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
THREAD
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
COUNT 33
DAEMON-COUNT 29
PEAK-COUNT 34
STARTED-COUNT 40
DEADLOCK-COUNT 0
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
FILE-DESCRIPTOR
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
MAX-FILE-DESCRIPTOR-COUNT -1
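OPEN-FILE-DESCRIPTOR-COUNT -1
Inspecting Spring objects
Use the vmtool command to look up live object instances in the current JVM by class:
# Get the Spring context and print the current application name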
$ vmtool --action getInstances --className org.springframework.context.ConfigurableApplicationContext --express 'instances[0].getEnvironment().getProperty("spring.application.name")'
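@String[blade-examples-file]
Changing the log level
$ logger --name ROOT --level debug
Finding a method's call stack
When business logic is complex, a method may be reachable from several entry points. The stack command prints the call stack of an invocation, so you can see which path led to the method.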
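The commands that follow target demo.MathGame primeFactors. For readers without that demo at hand, here is a simplified stand-in, assuming primeFactors takes an int and rejects values below 2. It is a hypothetical sketch for illustration, not the demo's actual source. It gives primeFactors two entry points, which is the situation where a call stack is useful, and it records the immediate caller to show the kind of information a stack trace carries.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, hypothetical stand-in for the class targeted by the stack commands.
final class MathGameSketch {

    // Name of the method that most recently called primeFactors (demo only).
    static String lastCaller;

    // Trial-division factorization; rejects numbers below 2.
    static List<Integer> primeFactors(int number) {
        if (number < 2) {
            throw new IllegalArgumentException("number is: " + number + ", need >= 2");
        }
        // Frame [0] is this method, frame [1] is whoever called it.
        lastCaller = new Throwable().getStackTrace()[1].getMethodName();
        List<Integer> factors = new ArrayList<>();
        for (int i = 2; i <= number / i; i++) { // same as i * i <= number, without overflow
            while (number % i == 0) {
                factors.add(i);
                number /= i;
            }
        }
        if (number > 1) {
            factors.add(number); // whatever remains is itself prime
        }
        return factors;
    }

    // Two different entry points that end up in the same method.
    static List<Integer> runOnce(int n) { return primeFactors(n); }
    static List<Integer> handleRequest(int n) { return primeFactors(n); }

    public static void main(String[] args) {
        System.out.println(runOnce(12) + " via " + lastCaller);
        System.out.println(handleRequest(7) + " via " + lastCaller);
        try {
            runOnce(-1); // a negative first argument, as in the params[0]<0 condition
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In a class like this, the negative-argument call is the kind of invocation the params[0]<0 condition is meant to single out, and the reported stack tells you whether it arrived through runOnce or handleRequest.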
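# Print the call stack when the method takes longer than 5 ms
$ stack demo.MathGame primeFactors '#cost>5'
# Print when the method's first argument is less than 0; -n limits the number of outputs
$ stack demo.MathGame primeFactors 'params[0]<0' -n 2
Advanced commands
The watch command
The watch command observes a method's arguments and return value: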
$ watch team.aikero.blade.examples.file.UploadController upload "{params,returnObj,target,throwExp}" -b -e -s
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 128 ms, listenerId: 1
method=team.aikero.blade.examples.file.UploadController.upload location=AtEnter
ts=2025-01-22 19:09:07.533; [cost=0.550334ms] result=@ArrayList[
@Object[][isEmpty=false;size=1],
null,
@UploadController[team.aikero.blade.examples.file.UploadController@68682797],
null,
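]
Command explained:
- "{params,returnObj,target,throwExp}": the values to observe: the method arguments, the return value, the object the method is invoked on, and the thrown exception
- -x 3: depth to which object properties are expanded (not used in the command above; the default is 1 and the maximum is 4)
- -b: output before the method is invoked
- -s: output after the method returns normally
- -e: output after the method throws an exception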
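The -b, -s, and -e flags choose the point of an invocation at which the expression is evaluated. As a mental model only (Arthas works by instrumenting bytecode, and this is not its implementation), the sketch below wraps a call and records the four observed values at each point, which shows why returnObj and throwExp are still null on entry. The labels AtExit and AtExceptionExit and every name in the code are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Conceptual model of watch's observation points; hypothetical code, not Arthas internals.
final class WatchPoints {

    static final List<String> observations = new ArrayList<>();

    private static void observe(String location, Object[] params, Object returnObj,
                                Object target, Throwable throwExp) {
        // Same order as the expression "{params,returnObj,target,throwExp}".
        observations.add(location + " "
                + Arrays.asList(Arrays.toString(params), returnObj, target, throwExp));
    }

    // Calls fn with arg and records the observed values at each point.
    // Only RuntimeException is intercepted here; an Error would pass through unobserved.
    static <A, R> R invoke(String target, Function<A, R> fn, A arg) {
        Object[] params = {arg};
        observe("AtEnter", params, null, target, null);            // -b: nothing returned yet
        try {
            R result = fn.apply(arg);
            observe("AtExit", params, result, target, null);       // -s: normal return
            return result;
        } catch (RuntimeException e) {
            observe("AtExceptionExit", params, null, target, e);   // -e: exception thrown
            throw e;
        }
    }

    public static void main(String[] args) {
        Function<String, Integer> parse = Integer::parseInt;
        invoke("parser", parse, "42");
        try {
            invoke("parser", parse, "oops");
        } catch (NumberFormatException expected) {
            // already recorded at AtExceptionExit
        }
        observations.forEach(System.out::println);
    }
}
```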
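Output explained:
method=team.aikero.blade.examples.file.UploadController.upload location=AtEnter <-- when the result was emitted: on entering the method
ts=2025-01-22 19:09:07.533; <-- timestamp of the invocation
[cost=0.550334ms] <-- elapsed time
result=@ArrayList[ <-- observed values, in the order given in "{params,returnObj,target,throwExp}"
@Object[][isEmpty=false;size=1], <-- params; always an Object[], however many parameters the method has
null, <-- returnObj, the return value (null here because the method has not returned yet at AtEnter)
@UploadController[team.aikero.blade.examples.file.UploadController@68682797], <-- target, the current object
null, <-- throwExp, the thrown exception
]
The trace command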
The trace command follows the calls made inside a method and reports the time spent in each.
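In the trace output shown in this section, the percentage in front of each child call is that call's time as a share of the entry method's total. A small sketch of the arithmetic, using figures from the sample run; the class and method names are illustrative, and rounding to two decimals is an assumption about how the value is displayed.

```java
import java.util.Locale;

// Illustrative arithmetic only; names are hypothetical.
final class TraceShare {

    // Child call time as a percentage of the entry method's total time.
    static String share(double childMillis, double totalMillis) {
        return String.format(Locale.ROOT, "%.2f%%", 100.0 * childMillis / totalMillis);
    }

    public static void main(String[] args) {
        double total = 565.124833;                    // UploadController:upload()
        System.out.println(share(507.259, total));    // OssTemplate:upload$default(), prints 89.76%
        System.out.println(share(57.106917, total));  // OssTemplate:remove$default(), prints 10.11%
    }
}
```

The two OssTemplate calls account for almost all of the 565 ms, which is how a trace tree points at the slow step.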
$ trace team.aikero.blade.examples.file.UploadController upload
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 112 ms, listenerId: 2
`---ts=2025-01-22 19:14:08.680;thread_name=http-nio-8080-exec-2;id=43;is_daemon=true;priority=5;TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@316acbb5
`---[565.124833ms] team.aikero.blade.examples.file.UploadController:upload()
+---[0.00% 0.021083ms ] kotlin.jvm.internal.Intrinsics:checkNotNullParameter()
+---[0.06% 0.332541ms ] org.springframework.web.multipart.MultipartFile:getInputStream() #33
+---[0.00% 0.008667ms ] team.aikero.blade.examples.file.UploadController:getOssTemplate() #34
+---[0.00% 0.003458ms ] org.springframework.web.multipart.MultipartFile:getOriginalFilename() #34
+---[0.00% min=0.012458ms,max=0.0155ms,total=0.027958ms,count=2] kotlin.jvm.internal.Intrinsics:checkNotNull() #34
+---[0.00% 0.027875ms ] org.springframework.web.multipart.MultipartFile:getSize() #34
+---[89.76% 507.259ms ] team.aikero.blade.oss.OssTemplate:upload$default() #34
+---[0.00% 0.017209ms ] kotlin.io.CloseableKt:closeFinally() #33
+---[0.00% 0.002708ms ] team.aikero.blade.examples.file.UploadController:getOssTemplate() #36
+---[0.00% 0.003125ms ] org.springframework.web.multipart.MultipartFile:getOriginalFilename() #36
+---[0.00% 0.0025ms ] kotlin.jvm.internal.Intrinsics:checkNotNull() #36
`---[10.11% 57.106917ms ] team.aikero.blade.oss.OssTemplate:remove$default() #36
Output explained:
`---ts=2025-01-22 19:14:08.680; <-- timestamp of the invocation
thread_name=http-nio-8080-exec-2; <-- executing thread
id=43; <-- thread id
is_daemon=true; <-- whether it is a daemon thread
priority=5; <-- thread priority
TCCL=org.springframework.boot.web.embedded.tomcat.TomcatEmbeddedWebappClassLoader@316acbb5 <-- thread context ClassLoader
# Total time of the entry method
`---[565.124833ms] team.aikero.blade.examples.file.UploadController:upload()
[share of total time, elapsed time] fully qualified method
+---[0.00% 0.021083ms ] kotlin.jvm.internal.Intrinsics:checkNotNullParameter()
+---[0.06% 0.332541ms ] org.springframework.web.multipart.MultipartFile:getInputStream() #33
+---[0.00% 0.008667ms ] team.aikero.blade.examples.file.UploadController:getOssTemplate() #34
+---[0.00% 0.003458ms ] org.springframework.web.multipart.MultipartFile:getOriginalFilename() #34
[share of total time, then min, max, and total elapsed time plus call count when the same call runs more than once] fully qualified method
+---[0.00% min=0.012458ms,max=0.0155ms,total=0.027958ms,count=2] kotlin.jvm.internal.Intrinsics:checkNotNull() #34
+---[0.00% 0.027875ms ] org.springframework.web.multipart.MultipartFile:getSize() #34
+---[89.76% 507.259ms ] team.aikero.blade.oss.OssTemplate:upload$default() #34
+---[0.00% 0.017209ms ] kotlin.io.CloseableKt:closeFinally() #33
+---[0.00% 0.002708ms ] team.aikero.blade.examples.file.UploadController:getOssTemplate() #36
+---[0.00% 0.003125ms ] org.springframework.web.multipart.MultipartFile:getOriginalFilename() #36
+---[0.00% 0.0025ms ] kotlin.jvm.internal.Intrinsics:checkNotNull() #36
`---[10.11% 57.106917ms ] team.aikero.blade.oss.OssTemplate:remove$default() #36
The tt command
The tt command records a method's invocations and can replay them; think of it as a watch command that keeps a history.
$ tt -t team.aikero.blade.examples.file.UploadController upload
Press Q or Ctrl+C to abort.
Affect(class count: 1 , method count: 1) cost in 83 ms, listenerId: 3
INDEX TIMESTAMP COST(ms) IS-RET IS-EXP OBJECT CLASS METHOD
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1000 2025-01-22 19:22:28.945 625.192417 true false 0x68682797 UploadController upload
1001 2025-01-22 19:22:31.361 388.678167 true false 0x68682797 UploadController upload
$ tt -i 1000
# Replay the invocation (re-runs the method with the arguments kept in memory)
$ tt -i 1000 -p
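# View the invocation's arguments and return value, using the same expression as watch
$ tt -i 1000 -w "{params,returnObj,target,throwExp}"
Conclusion
This chapter covers the most commonly used Arthas commands, which are enough to investigate most runtime problems. For more advanced features such as filtering on method arguments, object lookup, and heap dumps, see:
- The official documentation
- GitHub Issues
- Contact @李子凡 to discuss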
