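Java threads on Linux
- On Linux, each Java thread is backed by a native thread, which the kernel schedules as a lightweight process (LWP)
- top -Hp [pid] lists the threads of the Java process
- ps -mp [pid] -o THREAD,tid,time lists the threads together with their CPU time
- printf "%x\n" [tid] converts a thread ID to hexadecimal, for matching against the jstack output in the next step
- jstack [pid] | grep -A 30 [tid-hex] shows the stack of the matching thread; -A 30 prints the 30 lines after the match
- pstack [pid] prints the native stack of every thread in the process
- pstree -a [pid] or pstree -c [pid] shows the parent/child relationship of the threads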
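The printf and jstack steps above can be chained into one helper. A minimal sketch, assuming jstack is on the PATH and prints native thread IDs as nid=0x<hex> (as the JDK 8 era tools do); the function names tid_to_nid and thread_stack are made up for this note:

```shell
# Convert a decimal Linux thread ID into the nid=0x<hex> form seen in jstack output.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}

# Print the jstack entry of one thread.
#   $1 = Java pid, $2 = decimal thread ID (for example taken from top -Hp)
# Needs jstack and permission to attach to the process.
thread_stack() {
    # -w keeps nid=0x3039 from also matching nid=0x30391
    jstack "$1" | grep -w -A 30 "$(tid_to_nid "$2")"
}
```

Usage would look like thread_stack [pid] [tid], with both values in decimal.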
AWK
Regular expression syntax
awk --re-interval '{match($0,/{"appver":"(.*)","time":(.*),"uid":(.*),"usercode":(.*),"eventId":"(.*)"}/,a) ;print a[1]"\001"a[2]"\001"a[3]"\001"a[4]"\001"a[5]}'
Counting records with awk
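awk 'BEGIN { D = ""; n = 0 }
{
    split($2, a, ",")          # group key: the part of field 2 before the first comma
    if (D != a[1]) {           # key changed: report the finished group if it is large
        if (n > 900) print D " " n
        D = a[1]
        n = 1                  # the current line is the first record of the new group
    } else {
        n = n + 1
    }
}
END { print D " " n }' ~/wankun/n1.log > ~/wankun/n2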
grep 'Accepted socket connection' zookeeper.log.15 |awk '{print $12}' |awk -F ':' '{a[$1]+=1;} END { for(i in a) {print a[i]" "i;}}'
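The per-key count at the end of the last pipeline is the usual awk associative-array idiom. A small sketch with made-up sample input; count_by_host is a hypothetical helper name, and the sort is added only because for (i in a) visits keys in no defined order:

```shell
# Read lines of the form host:port on stdin and print "<count> <host>" per host.
count_by_host() {
    awk -F ':' '{ a[$1] += 1 } END { for (i in a) print a[i] " " i }' | LC_ALL=C sort
}

# Sample input standing in for the addresses cut out of the ZooKeeper log
printf '10.0.0.1:5000\n10.0.0.2:5001\n10.0.0.1:5002\n' | count_by_host
```

With this sample, 10.0.0.1 is counted twice and 10.0.0.2 once.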
Common functions
awk -F ',' '/":/{gsub("uid == null logger get rmd ","",$5);split($5,a,"&");sub("uid=","",a[1]);sub("usercode=","",a[2]);sub("title=","",a[3]);sub("subtitle=","",a[4]);sub("=","",a[5]); print $1"\001"a[1]"\001"a[2]"\001"a[3]"\001"a[4]"\001"a[5]}'
Shell semaphore
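# Counting semaphore built on a FIFO: at most 6 jobs run at the same time
[ -e ./fd1 ] || mkfifo ./fd1
exec 3<> ./fd1            # open the FIFO for reading and writing on fd 3
rm -f ./fd1               # fd 3 stays usable after the name is removed
for i in $(seq 1 6)
do
    echo >&3              # put one token per allowed job into the pipe
done
for task in "$@"          # assumption: one job per script argument
do
    read -u 3             # take a token; blocks while 6 jobs are running
    {
        do_something "$task"   # placeholder for the real work
        echo >&3          # give the token back
    } &
done
wait                      # wait for the remaining jobs
exec 3<&-
exec 3>&-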
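The same token-passing idea can be wrapped in a reusable function. This is a sketch rather than the original script: run_limited and its arguments are names invented here, it assumes Linux (where opening a FIFO read/write does not block), and it sticks to POSIX sh constructs (read <&3 instead of read -u 3):

```shell
# run_limited MAX CMD ARG...
# Run "CMD ARG" once per ARG in the background, never more than MAX at a time.
# The body is a subshell, so fd 3 and the variables do not leak to the caller.
run_limited() (
    max=$1
    cmd=$2
    shift 2
    dir=$(mktemp -d) || exit 1
    mkfifo "$dir/sem" || exit 1
    exec 3<>"$dir/sem"        # open the FIFO read/write on fd 3
    rm -rf "$dir"             # the pipe lives on through fd 3
    i=0
    while [ "$i" -lt "$max" ]; do
        echo >&3              # one token per allowed concurrent job
        i=$((i + 1))
    done
    for arg in "$@"; do
        read -r token <&3     # take a token, or block until one is returned
        {
            "$cmd" "$arg"
            echo >&3          # return the token when the job ends
        } &
    done
    wait                      # wait for the jobs still running
)
```

For example, run_limited 2 gzip a.log b.log c.log would compress three (hypothetical) files with at most two gzip processes alive at once.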
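JVM memory management
ManagementFactory.getMemoryMXBean().getHeapMemoryUsage() returns the current heap usage as a MemoryUsage object; getMax() gives the maximum heap size and getUsed() the amount currently in use, both in bytes.
Reference code: org.apache.hadoop.hive.ql.exec.mapjoin.MapJoinMemoryExhaustionHandler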
Viewing the stack of a running Tez task
jstack -l 100597 |grep -A 30 -E "TezChild.*daemon"
Daily incremental data volume
hadoop dfs -du -s /user/hive/warehouse/fact.db/*/dt=20180130 | awk '{sum+=$1} END{print sum/1024/1024/1024}'
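The awk stage of that command can be tried without a cluster. A minimal sketch, assuming du-style input whose first column is a size in bytes; sum_gib is a hypothetical helper name and the two input lines are fake:

```shell
# Sum the first column (bytes) of du-style output on stdin and print the total in GiB.
sum_gib() {
    awk '{ sum += $1 } END { printf "%.2f\n", sum / 1024 / 1024 / 1024 }'
}

# Two fake "size path" lines, 1 GiB and 2 GiB
printf '1073741824 /a/dt=20180130\n2147483648 /b/dt=20180130\n' | sum_gib
```

On a real cluster the printf line would be replaced by the hadoop dfs -du -s command above.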