Android Notes - Linux Kernel Ftrace (Function Trace) Explained (an extremely powerful performance-analysis method)

Date: 2022-01-23 08:56:36

hlchou@mail2000.com.tw

by loda

When developing software, we often face the need to tune system performance: we want to know which blocks of code or which functions are executed most frequently, or consume the most processor time, so that we can optimize how the code is written or improve CPU-hungry algorithms. On the Linux platform, OProfile (http://oprofile.sourceforge.net ) is a commonly recommended tool. OProfile supports time-based profiling, collecting and aggregating information about the current execution context via the system timer interrupt, as well as event-based profiling; on ARM this means the performance-control unit backed by the hardware Performance Monitor Unit in CP15 (for more information see: http://infocenter.arm.com/help/topic/com.arm.doc.dai0195b/index.html). The ARM PMU reports, for example, instruction/data cache usage (miss, write-back, write-buffer, etc.), memory/MMU access status, IRQ/FIQ latency, branch-prediction statistics, I/D-TCM status, and so on. Profiling built on this mechanism can gather system-wide performance statistics while affecting system performance as little as possible. Another option is to attach a JTAG interface to the ARM core and analyze with the profiling features of ICE software.

However, if we want to know more precisely how many times each function is executed (time-based OProfile can derive the corresponding proportions as a basis for decisions, given a long enough sampling period), the execution flow and per-call time cost, or information tied to kernel objects such as scheduling (scheduling and wake-up), interrupts, block/net, or kernel memory allocation, then Ftrace is another useful source of information. Unlike OProfile, Ftrace uses gcc -pg to insert a call to the mcount function at the entry of every function; for branch statistics it also instruments plain if statements or the likely/unlikely macros. Ftrace can therefore provide more complete kernel-level statistics than OProfile, but because OProfile mainly relies on the Performance Monitor unit of ARM or other processor platforms, OProfile can collect statistics with a lower impact on system performance (well, time-based function profiling does affect the system under test too); overall, both approaches are lighter on the system than planting mcount into every function. Which mechanism to use for performance analysis ultimately depends on your development goals and the problems at hand.

The most important reference for Ftrace is the file Documentation/ftrace.txt in the Linux kernel source tree. From that document and a quick search, Ftrace's author is Steven Rostedt of Red Hat. Its main goal is to provide a performance-analysis tool for the Linux kernel, for debugging and for improving/optimizing system performance. Ftrace is built around function tracing, and also covers process context switches, the time cost from wake-up/ready to actually running, how long interrupts are kept disabled, and which function calls triggered these events, all of which provide the information needed to pinpoint problems in a complex running system.

Next, we introduce GCC's compiler-level support for Ftrace profiling and the related builtin functions, so that the underlying mechanisms are clear. Finally, focusing on Ftrace itself, we describe each mechanism. This article does not dig into the implementation of every Ftrace module; it covers only the parts the author considers worth explaining. Developers interested in the details are encouraged to explore them on their own.

GCC "-pg" profiling and builtin-function support for Ftrace branch profiling

Ftrace supports branch profiling at locations annotated with "likely/unlikely", as well as branch profiling of every if statement in the kernel (choosing the latter, of course, has a more visible performance impact: far more places must be recorded). Via Kernel hacking ---> Tracers ---> Branch Profiling ---> the user can choose "No branch profiling", "Trace likely/unlikely profiler", or "Profile all if conditionals" to branch-profile the system. (Four config settings relate to branch profiling: CONFIG_TRACE_BRANCH_PROFILING, CONFIG_BRANCH_PROFILE_NONE, CONFIG_PROFILE_ANNOTATED_BRANCHES, and CONFIG_PROFILE_ALL_BRANCHES.)

Referring to the implementation in include/linux/compiler.h, if "Profile all if conditionals" is selected, every if keyword is redefined by the gcc preprocessor as the macro __trace_if, as shown below:

#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
#define __trace_if(cond) \
	if (__builtin_constant_p((cond)) ? !!(cond) : \
	({ \
		int ______r; \
		static struct ftrace_branch_data \
			__attribute__((__aligned__(4))) \
			__attribute__((section("_ftrace_branch"))) \
			______f = { \
				.func = __func__, \
				.file = __FILE__, \
				.line = __LINE__, \
			}; \
		______r = !!(cond); \
		______f.miss_hit[______r]++; \
		______r; \
	}))

If the if condition is a constant (that is, the compiler can already decide the if/else path at compile time), it is not counted; otherwise, the number of hits is tallied according to the result of the condition (______r = 0 or 1), as a reference for tuning the if/else layout. (Indeed, optimizing compile-time branch prediction through likely/unlikely is quite helpful.)

If "Trace likely/unlikely profiler" is selected instead, the likely and unlikely macros are defined as follows:

/*
 * Using __builtin_constant_p(x) to ignore cases where the return
 * value is always the same. This idea is taken from a similar patch
 * written by Daniel Walker.
 */
# ifndef likely
# define likely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 1))
# endif
# ifndef unlikely
# define unlikely(x) (__builtin_constant_p(x) ? !!(x) : __branch_check__(x, 0))
# endif

where __branch_check__ is defined as:

#define likely_notrace(x) __builtin_expect(!!(x), 1)
#define unlikely_notrace(x) __builtin_expect(!!(x), 0)

#define __branch_check__(x, expect) ({ \
	int ______r; \
	static struct ftrace_branch_data \
		__attribute__((__aligned__(4))) \
		__attribute__((section("_ftrace_annotated_branch"))) \
		______f = { \
			.func = __func__, \
			.file = __FILE__, \
			.line = __LINE__, \
		}; \
	______r = likely_notrace(x); \
	ftrace_likely_update(&______f, ______r, expect); \
	______r; \
})

The function ftrace_likely_update (located in kernel/trace/trace_branch.c) is implemented as follows:

void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect)
{
	/*
	 * I would love to have a trace point here instead, but the
	 * trace point code is so inundated with unlikely and likely
	 * conditions that the recursive nightmare that exists is too
	 * much to try to get working. At least for now.
	 */
	trace_likely_condition(f, val, expect);

	/* FIXME: Make this atomic! */
	if (val == expect)
		f->correct++;
	else
		f->incorrect++;
}

We will not trace into trace_likely_condition here and will discuss only ftrace_likely_update. This function compares the if/else ordering the developer declared with likely/unlikely against the actual outcome at run time, accumulating correct/incorrect counters. From statistics gathered at run time, we can see whether the original likely/unlikely annotations need correcting (moving them toward the statistically correct direction avoids processor pipeline flushes), yielding better execution performance.

If no branch profiling is enabled at all, likely and unlikely simply use the builtin function __builtin_expect (supported since GCC 2.96) to optimize branch prediction at compile time, declared as follows:

# define likely(x) __builtin_expect(!!(x), 1)

# define unlikely(x) __builtin_expect(!!(x), 0)

The compiler builtin __builtin_constant_p determines whether the given value is a compile-time constant (__builtin_constant_p returns 1 if it is, and 0 if it is not). If it is, the corresponding action can be arranged at compile time, so that not all of the behavior has to be emitted into the compiled code and decided at run time.

The following code verifies this behavior:

#define Y 6

void Func(int Z)
{
	int vResult;
	vResult=__builtin_constant_p(Z)?(Z*100):-1;
	printf("5:Result:%d\n",vResult);
	vResult=(Z*100);
	printf("6:Result:%d\n",vResult);
}

int main()
{
	int X=5;
	int vResult;
	vResult=__builtin_constant_p(X)?(X*100):-1;
	printf("1:Result:%d\n",vResult);
	vResult=(X*100);
	printf("2:Result:%d\n",vResult);
	vResult=__builtin_constant_p(Y)?(Y*100):-1;
	printf("3:Result:%d\n",vResult);
	vResult=(Y*100);
	printf("4:Result:%d\n",vResult);
	Func(7);
	return 0;
}

Verified with gcc 4.1.2: compiled with -O0, X is a local variable and __builtin_constant_p(X) returns 0, whereas Y is a #defined constant and __builtin_constant_p(Y) returns 1. If a constant is passed in through the function parameter Z, the value goes through the stack (or registers, per each processor's calling convention; on ARM, R0-R3 first), so __builtin_constant_p(Z) returns 0. Compiled with -O1 or -O2, the compiler can determine the value of the local variable X, so __builtin_constant_p(X) returns 1, while for the function parameter Z, __builtin_constant_p(Z) still returns 0: the optimization is still scoped to the function itself. Enabling optimization lets the builtin functions work to better effect.

Per the GCC documentation, __builtin_constant_p can also be used to choose the initializer of a constant (the documentation suggests GCC 3.0.1 or later), as shown below: if EXPRESSION is a constant, table is initialized with it; otherwise the initializer is -1.

static const int table[] = {
	__builtin_constant_p (EXPRESSION) ? (EXPRESSION) : -1,
};

Another builtin worth introducing is __builtin_expect, whose main purpose is to give the compiler branch-prediction hints, as in the following code:

void FuncA(int X)
{
	if(__builtin_expect(X,1))
	{
		printf("FuncA 1:%d\n",X*0x100);
	}
	else
	{
		printf("FuncA 2:%d\n",X);
	}
}

void FuncB(int X)
{
	if(__builtin_expect(X,0))
	{
		printf("FuncB 1:%d\n",X*0x100);
	}
	else
	{
		printf("FuncB 2:%d\n",X);
	}
}

int main()
{
	FuncA(7);
	FuncB(8);
	return 0;
}

The output is:

FuncA 1:1792

FuncB 1:2048

(1792 = 0x700 and 2048 = 0x800.)

Compiled with gcc 4.1.2 and -O2 (in this test environment, __builtin_expect has no effect at -O0), the output is identical either way; the disassembly is shown below.

FuncA/B (-O0):

push %ebp
mov %esp,%ebp
sub $0x8,%esp
mov 0x8(%ebp),%eax
test %eax,%eax
je 80483a9 <FuncA+0x25>
mov 0x8(%ebp),%eax
shl $0x8,%eax
mov %eax,0x4(%esp)
movl $0x8048500,(%esp)
call 8048298 <printf@plt>
jmp 80483bc <FuncA+0x38>
80483a9:
mov 0x8(%ebp),%eax
mov %eax,0x4(%esp)
movl $0x804850d,(%esp)
call 8048298 <printf@plt>
leave
ret

FuncA (-O2) - if(__builtin_expect(X,1)):

push %ebp
mov %esp,%ebp
sub $0x8,%esp
mov 0x8(%ebp),%eax
test %eax,%eax
je 80483f2 <FuncA+0x22>
shl $0x8,%eax
mov %eax,0x4(%esp)
movl $0x804853a,(%esp)
call 8048298 <printf@plt>
leave
ret
80483f2:
movl $0x0,0x4(%esp)
movl $0x8048547,(%esp)
call 8048298 <printf@plt>
leave
ret

FuncB (-O2) - if(__builtin_expect(X,0)):

push %ebp
mov %esp,%ebp
sub $0x8,%esp
mov 0x8(%ebp),%eax
test %eax,%eax
jne 80483b3 <FuncB+0x23>
movl $0x0,0x4(%esp)
movl $0x804852d,(%esp)
call 8048298 <printf@plt>
leave
ret
80483b3:
shl $0x8,%eax
mov %eax,0x4(%esp)
movl $0x8048520,(%esp)
call 8048298 <printf@plt>
leave
ret

We can see that __builtin_expect(X,1) places the if block first in the straight-line code, while __builtin_expect(X,0) places the else block first in the straight-line code; which block actually executes at run time still depends on whether X is 0. Per the GCC documentation, this can also be combined with conditional expressions in the source.

See the implementation in the Linux kernel's include/linux/compiler.h, shown below:

# define likely(x) __builtin_expect(!!(x), 1)

# define unlikely(x) __builtin_expect(!!(x), 0)

If the developer judges that the if block is the one executed more often, likely should be used, for example:

if (likely(success))
{….}
else // the else block may be error-handling logic; most of the time we want to take the if path
{….}

If the developer expects the else block to be the common path, unlikely can be used, for example:

if (unlikely(error))
{….}
else //success
{….}

The processor also has its own branch prediction. If a branch is not in a remembered BP entry, the processor usually fetches instructions sequentially, and flushes the pipeline once it discovers the prediction was wrong. Through __builtin_expect, we can decide at compile time, based on the expected value, how the if/else blocks are laid out in the machine code, reducing the chance of a pipeline flush, which also helps execution performance considerably.

GCC support for the GNU profiler gprof

For an introduction to GNU gprof, see http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html . Based on GCC's profiling support, developers can pass the -pg compile flag to have gcc add profiling into the code.

Take the following code as an example:

int FuncA()
{
	int i;
	int vRet=0;
	for(i=0;i<20000;i++)
	{
		vRet+=i;
	}
	return vRet;
}

int FuncB(int I)
{
	int i;
	int vRet=0;
	for(i=0;i<9999;i++)
	{
		vRet+=I+FuncA();
	}
	return vRet;
}

int main()
{
	int vResult;
	vResult=FuncA();
	vResult=FuncB(vResult);
	printf("Result:%d\n",vResult);
	return 0;
}

The difference in gcc's output with and without -pg is as follows:

main, without -pg:

lea 0x4(%esp),%ecx
and $0xfffffff0,%esp
pushl 0xfffffffc(%ecx)
push %ebp
mov %esp,%ebp
push %ecx
sub $0x24,%esp
call 8048384 <FuncA>
mov %eax,0xfffffff8(%ebp)
mov 0xfffffff8(%ebp),%eax
mov %eax,(%esp)
call 80483b2 <FuncB>
mov %eax,0xfffffff8(%ebp)
mov 0xfffffff8(%ebp),%eax
mov %eax,0x4(%esp)
movl $0x8048500,(%esp)
call 8048298 <printf@plt>
mov $0x0,%eax
add $0x24,%esp
pop %ecx
pop %ebp
lea 0xfffffffc(%ecx),%esp
ret

main, with -pg:

lea 0x4(%esp),%ecx
and $0xfffffff0,%esp
pushl 0xfffffffc(%ecx)
push %ebp
mov %esp,%ebp
push %ecx
sub $0x24,%esp
call 804837c <mcount@plt> => the mcount function provided by glibc
call 80484b4 <FuncA>
mov %eax,0xfffffff8(%ebp)
mov 0xfffffff8(%ebp),%eax
mov %eax,(%esp)
call 80484e7 <FuncB>
mov %eax,0xfffffff8(%ebp)
mov 0xfffffff8(%ebp),%eax
mov %eax,0x4(%esp)
movl $0x8048680,(%esp)
call 804835c <printf@plt>
mov $0x0,%eax
add $0x24,%esp
pop %ecx
pop %ebp
lea 0xfffffffc(%ecx),%esp
ret

FuncA, without -pg:

push %ebp
mov %esp,%ebp
sub $0x10,%esp
movl $0x0,0xfffffffc(%ebp)
movl $0x0,0xfffffff8(%ebp)
jmp 80483a4 <FuncA+0x20>
mov 0xfffffff8(%ebp),%eax
add %eax,0xfffffffc(%ebp)
addl $0x1,0xfffffff8(%ebp)
cmpl $0x4e1f,0xfffffff8(%ebp)
jle 804839a <FuncA+0x16>
mov 0xfffffffc(%ebp),%eax
leave
ret

FuncA, with -pg:

push %ebp
mov %esp,%ebp
sub $0x10,%esp
call 804837c <mcount@plt> => the mcount function provided by glibc
movl $0x0,0xfffffffc(%ebp)
movl $0x0,0xfffffff8(%ebp)
jmp 80484d9 <FuncA+0x25>
mov 0xfffffff8(%ebp),%eax
add %eax,0xfffffffc(%ebp)
addl $0x1,0xfffffff8(%ebp)
cmpl $0x4e1f,0xfffffff8(%ebp)
jle 80484cf <FuncA+0x1b>
mov 0xfffffffc(%ebp),%eax
leave
ret

FuncB, without -pg:

push %ebp
mov %esp,%ebp
sub $0x10,%esp
movl $0x0,0xfffffffc(%ebp)
movl $0x0,0xfffffff8(%ebp)
jmp 80483d7 <FuncB+0x25>
call 8048384 <FuncA>
add 0x8(%ebp),%eax
add %eax,0xfffffffc(%ebp)
addl $0x1,0xfffffff8(%ebp)
cmpl $0x270e,0xfffffff8(%ebp)
jle 80483c8 <FuncB+0x16>
mov 0xfffffffc(%ebp),%eax
leave
ret

FuncB, with -pg:

push %ebp
mov %esp,%ebp
sub $0x10,%esp
call 804837c <mcount@plt> => the mcount function provided by glibc
movl $0x0,0xfffffffc(%ebp)
movl $0x0,0xfffffff8(%ebp)
jmp 8048511 <FuncB+0x2a>
call 80484b4 <FuncA>
add 0x8(%ebp),%eax
add %eax,0xfffffffc(%ebp)
addl $0x1,0xfffffff8(%ebp)
cmpl $0x270e,0xfffffff8(%ebp)
jle 8048502 <FuncB+0x1b>
mov 0xfffffffc(%ebp),%eax
leave
ret

[root@localhost pg]# gcc -g -pg pg.c -o pg
[root@localhost pg]# ./pg
Result:785467424
[root@localhost pg]# ls -l gmon.out
-rw-r--r-- 1 root root 464 May 13 06:32 gmon.out
[root@localhost pg]# gprof --brief ./pg

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
100.54     0.47      0.47    10000     0.05     0.05  FuncA
  0.00     0.47      0.00        1     0.00   472.49  FuncB

=> In short: each call to FuncA takes 0.05 ms; it was called 10000 times, for 0.47 s in total (well, saying each call takes 0.047 ms would be a bit more precise). FuncB was called once and took 472.49 ms (FuncB itself runs FuncA 9999 times).

Call graph

granularity: each sample hit covers 2 byte(s) for 2.12% of 0.47 seconds

index  % time    self  children    called        name
                 0.00     0.00       1/10000        main [2]
                 0.47     0.00    9999/10000        FuncB [3]
[1]       100    0.47     0.00   10000           FuncA [1]

=> In short: viewed from FuncA, it ran 10000 times for 0.47 s in total; FuncB called FuncA 9999 times (9999/10000 of the total) and main called FuncA once (1/10000 of the total).

index  % time    self  children    called        name
[2]       100    0.00     0.47                   main [2]
                 0.00     0.47       1               FuncB [3]
                 0.00     0.00       1/10000         FuncA [1]

=> In short: viewed from main, the functions it called (children) took 0.47 s: the call to FuncB accounts for 9999/10000 (roughly all) of that time, and the call to FuncA for 1/10000 of it.

index  % time    self  children    called        name
                 0.00     0.47       1               main [2]
[3]       100    0.00     0.47       1           FuncB [3]
                 0.47     0.00    9999/10000         FuncA [1]

=> In short: viewed from FuncB: the functions main called (children, FuncA and FuncB) took 0.47 s, and FuncB's own calls into FuncA took 0.47 s (since the smallest unit is 0.01 s, two numbers within 0.01 s of each other can show up as equal). FuncB was called once, and FuncA was called 9999 times by FuncB.

Index by function name

[1] FuncA [3] FuncB

Infrastructure for profiling code inserted by 'gcc -pg'.

http://www.network-theory.co.uk/docs/gccintro/gccintro_80.html

Enabling tracers in the Linux kernel

To support Ftrace in the kernel, besides GCC supporting -pg, the C library used must also provide the related mcount functions. With the arm-eabi-gcc 4.4.0 shipped with the Android SDK, the author got the build error "undefined reference to `__gnu_mcount_nc'". The author then downloaded the arm-2007q3 toolchain and library environment from Sourcery at http://www.codesourcery.com/sgpp/lite/arm/portal/subscription3053, and building the ARM Linux kernel with Ftrace enabled on that basis resolved the '__gnu_mcount_nc' link error. (The choice of toolchain and kernel source version varies as versions evolve; if you hit build errors, judge based on the environment you have chosen.)

Next, run make menuconfig.

On an x86 machine, to enable the "Kernel Function Graph Tracer" feature you must turn off "Optimize for size" via General setup ---> Optimize for size. On the ARM platform there is currently no "Kernel Function Graph Tracer" option, so this restriction does not apply there.

Additionally, the Linux Kernel 2.6.31 configuration for ARM does not currently support dynamic Ftrace, so mechanisms such as set_ftrace_filter and set_ftrace_notrace, which dynamically set function-name filter rules, are unavailable in the author's environment; where relevant, the x86 version will be used for illustration instead. Whether this is supported later depends on the kernel version's evolution.

In the menu, go to Kernel hacking ---> Tracers ---> to choose which tracer features to enable. The author's config for the ARM build is as follows (I checked everything...):

-*- Kernel Function Tracer

[*] Interrupts-off Latency Tracer

[*] Scheduling Latency Tracer

-*- Trace process context switches

[*] Trace boot initcalls

[*] Trace likely/unlikely profiler

[*] Profile all if conditionals

[*] Trace likely/unlikely instances

[*] Trace max stack

Finally, confirm that Debug Filesystem is checked, at Kernel hacking ---> -*- Debug Filesystem.

Exit the menu. Since the "CONFIG_FUNCTION_TRACER" option was just selected, -pg is added to KBUILD_CFLAGS and a call to mcount is planted in every kernel function. Note that kernel/trace/Makefile strips -pg back out for the tracer's own code:

ifdef CONFIG_FUNCTION_TRACER
ORIG_CFLAGS := $(KBUILD_CFLAGS)
KBUILD_CFLAGS = $(subst -pg,,$(ORIG_CFLAGS))
...
endif

Kernel files related to debugging, or Ftrace's own files, have the -pg compile flag removed to avoid skewing the statistics; for example, in the kernel/Makefile file:

ifdef CONFIG_FUNCTION_TRACER
# Do not trace debug files and internal ftrace files
CFLAGS_REMOVE_lockdep.o = -pg
CFLAGS_REMOVE_lockdep_proc.o = -pg
CFLAGS_REMOVE_mutex-debug.o = -pg
............
endif

Comparing a kernel built with the function tracer enabled against one without it shows the GCC compiler's effect on the output:

do_fork without Ftrace:

00001458 <do_fork>:
push {r4, r5, r6, r7, r8, r9, sl, lr}
tst r0, #268435456 ; 0x10000000
sub sp, sp, #32 ; 0x20
mov r5, r0
mov r7, r1
mov r6, r2
mov sl, r3
bne 1688 <do_fork+0x230>
ands r4, r5, #33554432 ; 0x2000000
moveq r9, #0 ; 0x0
movne r9, #1 ; 0x1
bne 16c0 <do_fork+0x268>
ldr r3, [r6, #64]
tst r3, #15 ; 0xf
bne 149c <do_fork+0x44>
tst r5, #8388608 ; 0x800000
beq 1604 <do_fork+0x1ac>
mov r8, #0 ; 0x0
ldr ip, [sp, #68]
mov r1, r7
str ip, [sp]
mov r2, r6

do_fork with Ftrace:

000018b0 <do_fork>:
mov ip, sp
push {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
sub fp, ip, #4 ; 0x4
sub sp, sp, #52 ; 0x34
mov ip, lr
bl 0 <mcount>
.word 0x00000040
tst r0, #268435456 ; 0x10000000
mov r7, r0
mov sl, r1
mov r6, r2
str r3, [fp, #-72]
bne 1c7c <do_fork+0x3cc>
ands r5, r7, #33554432 ; 0x2000000
moveq r8, #0 ; 0x0
movne r8, #1 ; 0x1
mov r1, r8
ldr r0, [pc, #1060] ; 1d20 <do_fork+0x470>
mov r2, #0 ; 0x0
bl 0 <ftrace_likely_update>
subs r1, r8, #0 ; 0x0

DebugFS FileSystem

Ftrace uses the virtual filesystem debugfs to store its configuration files and output. We can first create a /debug directory in the root, then mount debugfs onto /debug (the Ftrace documentation suggests mounting at /sys/kernel/debug):

# mkdir /debug

# mount -t debugfs nodev /debug

Then enter the /debug/tracing directory:

# cd /debug/tracing

Query which tracers the system supports:

$ cat available_tracers

function_graph function_duration function sched_switch nop

Select sched_switch as the active tracer:

# echo sched_switch > current_tracer

View the trace file in this directory to observe sched_switch's output:

# cat trace

# tracer: sched_switch

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

<idle>-0 [000] 745.934676: 0:140:R ==> [000] 2892:120:R

gnome-terminal-2892 [000] 745.934770: 2892:120:S ==> [000] 0:140:R

<idle>-0 [000] 745.934861: 0:140:R ==> [000] 2892:120:R

gnome-terminal-2892 [000] 745.934937: 2892:120:S ==> [000] 0:140:R

bash-5175 [001] 745.934960: 5175:120:S + [000] 2892:120:S

bash-5175 [001] 745.935149: 5175:120:S + [000] 2892:120:S

Pause Ftrace:

# echo nop > current_tracer

The steps above show the basics of driving Ftrace through debugfs. Next, let us get acquainted with Ftrace's control files.

The files under Ftrace's tracing directory

Taking an Android ARM platform with Linux Kernel 2.6.31 as the example, first create the /data/debug directory and mount the debugfs virtual filesystem:

# mkdir /data/debug

# mount -t debugfs nodev /data/debug

# cd /data/debug

Listing the directory contents:

# ls

sched_features

mmc0

tracing

bdi

#

After entering the tracing directory, ls shows:

# cd tracing

# ls -l

-r--r--r-- root root 0 1970-01-01 00:00 stack_trace

-rw-r--r-- root root 0 1970-01-01 00:00 stack_max_size

drwxr-xr-x root root 1970-01-01 00:00 events

-rw-r--r-- root root 0 1970-01-01 00:00 set_event

-r--r--r-- root root 0 1970-01-01 00:00 available_events

-r--r--r-- root root 0 1970-01-01 00:00 printk_formats

drwxr-xr-x root root 1970-01-01 00:00 per_cpu

drwxr-xr-x root root 1970-01-01 00:00 options

-r--r--r-- root root 0 1970-01-01 00:00 saved_cmdlines

--w--w---- root root 0 1970-01-01 00:00 trace_marker

-rw-r--r-- root root 0 1970-01-01 00:00 buffer_size_kb

-r--r--r-- root root 0 1970-01-01 00:00 trace_pipe

-r--r--r-- root root 0 1970-01-01 00:00 README

-rw-r--r-- root root 0 1970-01-01 00:00 tracing_thresh

-rw-r--r-- root root 0 1970-01-01 00:00 tracing_max_latency

-rw-r--r-- root root 0 1970-01-01 00:00 current_tracer

-r--r--r-- root root 0 1970-01-01 00:00 available_tracers

-rw-r--r-- root root 0 1970-01-01 00:00 trace

-rw-r--r-- root root 0 1970-01-01 00:00 tracing_cpumask

-rw-r--r-- root root 0 1970-01-01 00:00 trace_options

-rw-r--r-- root root 0 1970-01-01 00:00 tracing_enabled

-rw-r--r-- root root 0 1970-01-01 00:00 tracing_on

-rw-r--r-- root root 0 1970-01-01 00:00 function_profile_enabled

drwxr-xr-x root root 1970-01-01 00:00 trace_stat

-rw-r--r-- root root 0 1970-01-01 00:00 set_ftrace_pid

Next, from the files in this directory, the author picks out the ones worth explaining, as follows.


current_tracer

Shows the tracer currently set. If no tracer is set, it returns nop:

# cat current_tracer

nop

To demonstrate this file, set sched_switch as current_tracer:

# echo > /data/debug/tracing/trace => clear the trace contents

# echo sched_switch > /data/debug/tracing/current_tracer

# echo 1 > /data/debug/tracing/tracing_enabled

# sleep 1

# echo 0 > /data/debug/tracing/tracing_enabled

# cat /data/debug/tracing/trace

# tracer: sched_switch

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

bash-9560 [000] 52138.209612: 9560:120:R ==> [000] 9538:120:R kworker/0:1

kworker/0:1-9538 [000] 52138.209903: 9538:120:S ==> [000] 9558:120:R in.telnetd

in.telnetd-9558 [000] 52138.211017: 9558:120:S ==> [000] 9560:120:R bash

bash-9560 [000] 52138.211412: 9560:120:S + [000] 9560:120:S bash

bash-9560 [000] 52138.211613: 9560:120:R + [000] 9538:120:R kworker/0:1

bash-9560 [000] 52138.211820: 9560:120:R ==> [000] 9538:120:R kworker/0:1

kworker/0:1-9538 [000] 52138.211990: 9538:120:R + [000] 9558:120:R in.telnetd

kworker/0:1-9538 [000] 52138.212148: 9538:120:S ==> [000] 9558:120:R in.telnetd

in.telnetd-9558 [000] 52138.212953: 9558:120:S ==> [000] 9560:120:R bash

bash-9560 [000] 52138.213221: 9560:120:S ==> [000] 0:120:R <idle>

<idle>-0 [000] 52138.414815: 0:120:R + [000] 9538:120:R kworker/0:1

<idle>-0 [000] 52138.414976: 0:120:R ==> [000] 9538:120:R kworker/0:1

kworker/0:1-9538 [000] 52138.415243: 9538:120:S ==> [000] 0:120:R <idle>

<idle>-0 [000] 52138.417111: 0:120:R + [000] 9538:120:R kworker/0:1

<idle>-0 [000] 52138.417312: 0:120:R ==> [000] 9538:120:R kworker/0:1

kworker/0:1-9538 [000] 52138.417541: 9538:120:S ==> [000] 0:120:R <idle>

<idle>-0 [000] 52138.448254: 0:120:R + [000] 1811:120:R pcscd

<idle>-0 [000] 52138.448547: 0:120:R ==> [000] 1811:120:R pcscd

pcscd-1811 [000] 52138.450170: 1811:120:S ==> [000] 0:120:R <idle>

<idle>-0 [000] 52138.540657: 0:120:R + [000] 1892:139:R gam_server

available_tracers

Lists the tracers enabled in the current kernel build. Writing one of these names into current_tracer activates the corresponding tracer. The tracers supported in the author's environment are:

# cat available_tracers

blk kmemtrace branch wakeup_rt wakeup irqsoff function sched_switch initcall nop

tracing_enabled

Starts or stops the tracer: 1 means the tracer is running, 0 means it is stopped, as shown below.

Start the tracer:

# echo 1 > /data/debug/tracing/tracing_enabled

Pause the tracer:

# echo 0 > /data/debug/tracing/tracing_enabled

trace

Presents the tracer's results in a human-readable format.

# more /data/debug/tracing/trace

Clear the trace contents:

# echo > /data/debug/tracing/trace

trace_pipe

Supports reading the same content as the trace file through a pipe, with blocking reads that return only once data arrives; each read consumes the data, so the next read returns only new data, which suits sequential consumption of the trace contents.

If you want the complete trace buffer on every read, read the trace file; if instead you want only the data not yet read, read from trace_pipe.

trace_options

Controls which data the trace displays. (Well, the plug-in tracer must also support the corresponding options.)

#cat trace_options

print-parent

nosym-offset

nosym-addr

noverbose

noraw

nohex

nobin

noblock

nostacktrace

nosched-tree

trace_printk

noftrace_preempt

nobranch

annotate

nouserstacktrace

nosym-userobj

noprintk-msg-only

context-info

nolatency-format

noglobal-clock

sleep-time

graph-time

Options are toggled by writing the option name with a no prefix (to disable) or without it (to enable).

For example, to disable print-parent:

# echo noprint-parent > /data/debug/tracing/trace_options

cat trace_options again and print-parent has become noprint-parent.

Conversely, nosym-addr is set by default; after

# echo sym-addr > /data/debug/tracing/trace_options

cat trace_options again and nosym-addr has become sym-addr.

tracing_max_latency

Shows the maximum latency (in us) measured by the latency tracer that just ran (e.g., irqsoff or wakeup, etc.), helping kernel designers find the less efficient parts of the system and improve them.

# echo irqsoff > /data/debug/tracing/current_tracer

# echo 1 > /data/debug/tracing/tracing_enabled

….do something....sleep....

# echo 0 > /data/debug/tracing/tracing_enabled

# more tracing_max_latency

7367

# more /data/debug/tracing/trace

# tracer: irqsoff

#

# irqsoff latency trace v1.1.5 on 2.6.38.6

# --------------------------------------------------------------------

# latency: 7367 us, #58/58, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)

# -----------------

# | task: bash-1905 (uid:0 nice:0 policy:0 rt_prio:0)

# -----------------

# => started at: apic_timer_interrupt

# => ended at: call_on_stack

#

#

# _------=> CPU#

# / _-----=> irqs-off

# | / _----=> need-resched

# || / _---=> hardirq/softirq

# ||| / _--=> preempt-depth

# |||| /_--=> lock-depth

# |||||/ delay

# cmd pid |||||| time | caller

# / / |||||| / | /

...etc

buffer_size_kb

Sets how much memory each processor uses as its tracer buffer (in kbytes; a page is normally 4 kbytes, and every processor is allocated a buffer of this same size).

For example:

#cat buffer_size_kb

1408

means that in the author's environment each processor uses 1408 KB (about 1.4 MB) as its tracer buffer.

This value can only be modified while current_tracer contains "nop" (that is, while no tracer plug-in is active).

tracing_cpumask

Sets which processors the tracer collects information on, displayed as a hexadecimal string. On the author's single-core ARM environment:

#cat tracing_cpumask

1

#

set_ftrace_filter

Used under the dynamic Ftrace configuration. The build instruments the code with mcount calls via gcc -pg; in an environment supporting dynamic Ftrace, only the functions set through this file are included in function tracing.

To trace calls to functions whose names start with account or run, use the following commands:

#echo > /debug/tracing/trace

#echo 'account*' 'run*' > /debug/tracing/set_ftrace_filter

#echo function > /debug/tracing/current_tracer

#echo 1 > /debug/tracing/tracing_enabled

#sleep 1

#echo 0 > /debug/tracing/tracing_enabled

#cat /debug/tracing/trace

# tracer: function

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

bash-1951 [000] 2602.286489: account_process_tick <-update_process_times

bash-1951 [000] 2602.286501: account_system_time <-account_process_tick

bash-1951 [000] 2602.286510: run_local_timers <-update_process_times

bash-1951 [000] 2602.286565: run_posix_cpu_timers <-update_process_times

bash-1951 [000] 2602.286605: run_timer_softirq <-__do_softirq

kworker/0:0-4 [000] 2602.287494: account_process_tick <-update_process_times

kworker/0:0-4 [000] 2602.287503: account_system_time <-account_process_tick

kworker/0:0-4 [000] 2602.287509: run_local_timers <-update_process_times

kworker/0:0-4 [000] 2602.287565: run_posix_cpu_timers <-update_process_times

kworker/0:0-4 [000] 2602.287605: run_timer_softirq <-__do_softirq

….

In the author's Linux Kernel 2.6.31 ARM setup, HAVE_DYNAMIC_FTRACE is not supported, which means this file is unusable in the ARM build. (Please check your own kernel version for newer updates.)

set_ftrace_notrace

The opposite of set_ftrace_filter: under the dynamic Ftrace configuration, functions whose names are set in this file have their mcount calls removed and are excluded from function tracing. If the same function is set in both set_ftrace_filter and set_ftrace_notrace, set_ftrace_notrace takes precedence and the function is not traced.

To avoid tracing function names containing the strings cpu or align, use the following commands:

# echo '*cpu*' '*align*' > /debug/tracing/set_ftrace_notrace

#echo function > /debug/tracing/current_tracer

#echo 1 > /debug/tracing/tracing_enabled

#sleep 1

#echo 0 > /debug/tracing/tracing_enabled

#cat /debug/tracing/trace

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

bash-1951 [000] 2331.630911: calc_global_load <-do_timer

bash-1951 [000] 2331.630921: update_process_times <-tick_periodic

bash-1951 [000] 2331.630929: account_process_tick <-update_process_times

bash-1951 [000] 2331.630937: account_system_time <-account_process_tick

bash-1951 [000] 2331.630945: run_local_timers <-update_process_times

bash-1951 [000] 2331.630953: hrtimer_run_queues <-run_local_timers

bash-1951 [000] 2331.630967: __current_kernel_time <-hrtimer_run_queues

bash-1951 [000] 2331.630974: __get_wall_to_monotonic <-hrtimer_run_queues

bash-1951 [000] 2331.630984: _raw_spin_lock <-hrtimer_run_queues

bash-1951 [000] 2331.630995: raise_softirq <-run_local_timers

bash-1951 [000] 2331.631006: rcu_check_callbacks <-update_process_times

bash-1951 [000] 2331.631014: rcu_bh_qs <-rcu_check_callbacks

bash-1951 [000] 2331.631022: __rcu_pending <-rcu_check_callbacks

….

In the author's Linux Kernel 2.6.31 ARM setup, HAVE_DYNAMIC_FTRACE is not supported, which means this file is unusable in the ARM build. (Please check your own kernel version for newer updates.)

set_ftrace_pid

Restricts the function tracer to collecting for a specific PID. (Collecting system-wide without setting this can easily hang the system => in my ARM environment, not narrowing the scope with a PID does hang it...@_@)

Example:

#cat /debug/tracing/set_ftrace_pid

no pid

# ps => pick a process ID to trace

USER PID PPID VSIZE RSS WCHAN PC NAME

….

system 65 31 152436 25296 ffffffff afd0db4c S system_server

app_23 128 31 111664 20728 ffffffff afd0eb08 S com.android.launcher

….

# echo 65 > /data/debug/tracing/set_ftrace_pid

# echo function > /data/debug/tracing/current_tracer

# echo 1 > /data/debug/tracing/tracing_enabled

.do something....

#cat /data/debug/tracing/trace

….the trace then shows only that PID's tracing results...

available_filter_functions

Lists all function names currently available as filter conditions for "set_ftrace_filter" or "set_ftrace_notrace".

For example:

# more /debug/tracing/available_filter_functions

_stext

do_one_initcall

run_init_process

init_post

name_to_dev_t

match_dev_by_uuid

thread_saved_pc

get_wchan

prepare_to_copy

release_thread

copy_thread

start_thread

__show_regs

cpu_idle

setup_sigcontext

align_sigframe

signal_fault

sys_sigaltstack

restore_sigcontext

sys_sigaction

sys_rt_sigreturn

do_notify_resume

….etc

Ftrace's tracer modules

Ftrace is designed as a plug-in framework: as kernel modules evolve, when performance statistics are needed for a new kernel subsystem, a new Ftrace module can be written as a plug-in to serve as the new tracer, making this an extensible tracer architecture. This section introduces the main tracer modules Ftrace supports; their number and features may change as the Linux kernel evolves (for example, kernel 2.6.39 removed the big kernel lock), so for any given version, take the latest Linux kernel source code as the authority.


Function tracer

Config path: Kernel hacking ---> Tracers ---> Kernel Function Tracer

Shows the flow and order of function calls.

Example:

# echo 1855 > /debug/tracing/set_ftrace_pid

=> choose vsftpd's PID (=1855)

# echo function > /debug/tracing/current_tracer

=> select the function tracer

# echo 1 > /debug/tracing/tracing_enabled

=> then connect to this host over FTP from another machine to record the daemon's function-level behavior

# echo 0 > /debug/tracing/tracing_enabled

# cat /debug/tracing/trace > /trace.txt

# echo nop > /debug/tracing/current_tracer

=> remove the tracer

The result in trace.txt is as follows:

# tracer: function

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

vsftpd-1964 [000] 272.754213: irq_exit <-smp_apic_timer_interrupt

vsftpd-1964 [000] 272.754222: do_softirq <-irq_exit

vsftpd-1964 [000] 272.754229: call_on_stack <-do_softirq

vsftpd-1964 [000] 272.754235: __do_softirq <-call_on_stack

vsftpd-1964 [000] 272.754241: __local_bh_disable <-__do_softirq

vsftpd-1964 [000] 272.754279: run_timer_softirq <-__do_softirq

vsftpd-1964 [000] 272.754302: hrtimer_run_pending <-run_timer_softirq

vsftpd-1964 [000] 272.754324: _raw_spin_lock_irq <-run_timer_softirq

vsftpd-1964 [000] 272.754347: rcu_bh_qs <-__do_softirq

vsftpd-1964 [000] 272.754378: rcu_process_callbacks <-__do_softirq

vsftpd-1964 [000] 272.754400: __rcu_process_callbacks <-rcu_process_callbacks

vsftpd-1964 [000] 272.754422: force_quiescent_state <-__rcu_process_callbacks

vsftpd-1964 [000] 272.754444: rcu_gp_in_progress <-force_quiescent_state

…...

We can see the kernel functions invoked, and their timestamps, while vsftpd's worker process serves a client's requests.

Dynamic selection of functions

Config path: Kernel hacking ---> Tracers ---> enable/disable ftrace tracepoints dynamically

Tracing all functions is very costly; in the author's ARM environment it nearly makes the system unresponsive (mmm, with an ARM Cortex-A-class processor it should run smoothly...). Ftrace therefore also provides dynamic selection of functions to trace. As of the Linux Kernel 2.6.31 the author verified, the ARM platform does not yet support this mechanism, while x86 does.

For example, to trace only kernel function calls whose names contain the strings spin or process:

#echo '*spin*' '*process*' > /debug/tracing/set_ftrace_filter

The output looks like this:

# tracer: function

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

vsftpd-1855 [000] 5064.651690: _raw_spin_lock_bh <-lock_sock_nested

vsftpd-1855 [000] 5064.651739: _raw_spin_lock_bh <-release_sock

vsftpd-1855 [000] 5064.651760: _raw_spin_unlock_bh <-release_sock

vsftpd-1855 [000] 5064.651780: _raw_spin_lock_bh <-lock_sock_nested

vsftpd-1855 [000] 5064.651800: _raw_spin_lock_bh <-release_sock

vsftpd-1855 [000] 5064.651819: _raw_spin_unlock_bh <-release_sock

vsftpd-1855 [000] 5064.651840: _raw_spin_lock <-fd_install

vsftpd-1855 [000] 5064.651871: _raw_spin_lock_irq <-sigprocmask

vsftpd-1855 [000] 5064.651924: _raw_spin_lock_irq <-sigprocmask

vsftpd-1855 [000] 5064.651974: _raw_spin_lock <-tick_periodic

vsftpd-1855 [000] 5064.652009: update_process_times <-tick_periodic

vsftpd-1855 [000] 5064.652017: account_process_tick <-update_process_times

vsftpd-1855 [000] 5064.652034: _raw_spin_lock <-hrtimer_run_queues

vsftpd-1855 [000] 5064.652082: _raw_spin_lock <-scheduler_tick

vsftpd-1855 [000] 5064.652164: _raw_spin_lock_irq <-run_timer_softirq

vsftpd-1855 [000] 5064.652241: copy_process <-do_fork

vsftpd-1855 [000] 5064.652278: _raw_spin_lock <-cache_alloc_refill

…..

Only calls to functions whose names contain the strings spin or process appear.

Next, to additionally filter out functions whose names contain raw or times:

# echo '*raw*' '*times*' > /debug/tracing/set_ftrace_notrace

The output looks like this:

# tracer: function

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

vsftpd-2099 [000] 5220.419859: tcp_rcv_state_process <-tcp_v4_do_rcv

vsftpd-2099 [000] 5220.419996: account_process_tick <-update_process_times

vsftpd-2099 [000] 5220.420520: tcp_child_process <-tcp_v4_do_rcv

vsftpd-2099 [000] 5220.420542: tcp_rcv_state_process <-tcp_child_process

vsftpd-2099 [000] 5220.420913: wake_up_process <-wake_up_worker

vsftpd-2099 [000] 5220.421737: account_process_tick <-update_process_times

vsftpd-1855 [000] 5220.422045: account_process_tick <-update_process_times

vsftpd-1855 [000] 5220.422358: copy_process <-do_fork

vsftpd-1855 [000] 5220.422953: account_process_tick <-update_process_times

vsftpd-2099 [000] 5220.423830: tcp_rcv_state_process <-tcp_v4_do_rcv

vsftpd-2099 [000] 5220.423903: kick_process <-signal_wake_up

vsftpd-2099 [000] 5220.423961: account_process_tick <-update_process_times

vsftpd-2099 [000] 5220.424257: kick_process <-signal_wake_up

…..

The final result is the set of function calls containing the strings spin or process, minus those that also contain the strings raw or times.

Function graph tracer

Config path: Kernel hacking ---> Tracers ---> Kernel Function Tracer ---> Kernel Function Graph Tracer

As of the Linux Kernel 2.6.31 the author verified, the ARM platform does not yet support this mechanism, while x86 does.

Example:

#echo '*' > /debug/tracing/set_ftrace_filter

=> restore tracing of all functions

#echo function_graph > /debug/tracing/current_tracer

#echo 1 > /debug/tracing/tracing_enabled

=> let an FTP client connect

# echo 0 > /debug/tracing/tracing_enabled

The output is shown below

# tracer: function_graph

#

# TIME CPU DURATION FUNCTION CALLS

# | | | | | | | |

0) 7.525 us | } /* idle_cpu */

0) + 36.032 us | } /* irq_enter */

0) | tick_handle_periodic() {

0) | tick_periodic() {

0) | do_timer() {

0) + 16.017 us | update_wall_time();

0) 6.749 us | calc_global_load();

0) + 43.684 us | }

0) | account_process_tick() {

0) 6.852 us | account_system_time();

0) + 20.041 us | }

0) | run_local_timers() {

0) | hrtimer_run_queues() {

0) 6.243 us | __current_kernel_time();

0) 6.183 us | __get_wall_to_monotonic();

0) + 35.781 us | }

0) 9.591 us | raise_softirq();

0) + 65.198 us | }

0) | rcu_check_callbacks() {

0) 6.357 us | idle_cpu();

0) | __rcu_pending() {

0) 6.199 us | rcu_gp_in_progress();

0) 6.218 us | cpu_has_callbacks_ready_to_invoke();

0) 6.265 us | cpu_needs_another_gp();

0) 6.175 us | rcu_gp_in_progress();

0) + 59.513 us | }

0) 6.867 us | raise_softirq();

0) + 98.833 us | }

0) 6.292 us | printk_tick();

0) | scheduler_tick() {

0) + 11.142 us | ktime_get();

0) 7.172 us | update_rq_clock();

0) | sched_avg_update() {

0) 6.353 us | sched_avg_period();

0) + 19.685 us | }

0) | task_tick_fair() {

0) 6.826 us | update_curr();

0) | sched_slice() {

0) 6.851 us | calc_delta_mine();

0) + 20.522 us | }

By default the output shows relative durations; trace_options can be used to display absolute timestamps instead

# echo funcgraph-abstime >/debug/tracing/trace_options

The output is shown below

# tracer: function_graph

#

# TIME CPU DURATION FUNCTION CALLS

# | | | | | | | |

8919.132197 | 0) ! 201.630 us | } /* __do_softirq */

8919.132218 | 0) ! 265.733 us | } /* do_softirq */

8919.132229 | 0) ! 297.422 us | } /* irq_exit */

------------------------------------------

0) vsftpd-2125 => <...>-1855

------------------------------------------

8919.132798 | 0) + 50.964 us | irq_enter();

8919.132908 | 0) + 25.462 us | raise_softirq();

8919.133053 | 0) | irq_exit() {

8919.133076 | 0) | do_softirq() {

8919.133095 | 0) | __do_softirq() {

8919.133153 | 0) + 43.064 us | run_timer_softirq();

8919.133245 | 0) ! 148.201 us | }

8919.133260 | 0) ! 185.542 us | }

8919.133272 | 0) ! 221.503 us | }

8919.134684 | 0) + 47.945 us | irq_enter();

8919.134785 | 0) + 19.704 us | raise_softirq();

8919.134911 | 0) | irq_exit() {

8919.134931 | 0) | do_softirq() {

8919.134949 | 0) | __do_softirq() {

8919.135019 | 0) + 17.437 us | irq_enter();

8919.135080 | 0) + 18.143 us | raise_softirq();

8919.135165 | 0) + 19.679 us | irq_exit();

8919.135230 | 0) + 39.134 us | run_timer_softirq();

8919.135307 | 0) + 33.181 us | run_timer_softirq();

8919.135384 | 0) ! 434.587 us | }

8919.135398 | 0) ! 468.233 us | }

8919.135408 | 0) ! 500.772 us | }

------------------------------------------

0) <...>-1855 => vsftpd-2125

------------------------------------------

8919.135778 | 0) + 43.128 us | irq_enter();

8919.135900 | 0) + 18.773 us | raise_softirq();

8919.136019 | 0) | irq_exit() {

8919.136037 | 0) | do_softirq() {

8919.136052 | 0) | __do_softirq() {

8919.136100 | 0) + 37.945 us | run_timer_softirq();

8919.136179 | 0) ! 125.562 us | }

8919.136192 | 0) ! 155.517 us | }

8919.136201 | 0) ! 184.853 us | }

Latency Tracing

1, irqsoff tracer

The configuration path is Kernel hacking ---> Tracers ---> Interrupts-off Latency Tracer

This traces how long interrupts stay disabled, and reports the longest such period together with the call path that caused it, as input for performance analysis. For example, doing too much I/O inside an interrupt handler hurts system performance; once identified, such work can be moved into a bottom half (HISR), reducing the time the system keeps interrupts off. (I/O waits belong in the system's idle time; an interrupt handler should finish as quickly as possible and hand the processor back to high-priority tasks.)

An example session is shown below

# echo nop > current_tracer

# echo irqsoff > current_tracer

# echo 1 > tracing_enabled

# sleep 3

# echo 0 >tracing_enabled

The output is shown below

# tracer: irqsoff

#

# irqsoff latency trace v1.1.5 on 2.6.38.6

# --------------------------------------------------------------------

# latency: 10747 us, #123/123, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)

# -----------------

# | task: vsftpd-1961 (uid:0 nice:0 policy:0 rt_prio:0)

# -----------------

# => started at: __make_request

# => ended at: __blk_run_queue

#

#

# _------=> CPU#

# / _-----=> irqs-off

# | / _----=> need-resched

# || / _---=> hardirq/softirq

# ||| / _--=> preempt-depth

# |||| /_--=> lock-depth

# |||||/ delay

# cmd pid |||||| time | caller

# / / |||||| / | /

vsftpd-1961 0dN... 23us+: _raw_spin_lock_irq <-__make_request

vsftpd-1961 0dN... 103us+: cpu_coregroup_mask <-__make_request

vsftpd-1961 0dN... 118us!: elv_queue_empty <-__make_request

vsftpd-1961 0dN... 997us+: cfq_queue_empty <-elv_queue_empty

vsftpd-1961 0dN... 1055us+: blk_plug_device <-__make_request

vsftpd-1961 0dN... 1109us!: mod_timer <-blk_plug_device

vsftpd-1961 0dN... 1216us+: lock_timer_base <-mod_timer

vsftpd-1961 0dN... 1231us+: _raw_spin_lock_irqsave <-lock_timer_base

vsftpd-1961 0dN... 1253us+: internal_add_timer <-mod_timer

vsftpd-1961 0dN... 1270us+: _raw_spin_unlock_irqrestore <-mod_timer

vsftpd-1961 0dN... 1369us!: drive_stat_acct <-__make_request

vsftpd-1961 0dN... 1480us!: disk_map_sector_rcu <-drive_stat_acct

vsftpd-1961 0dN... 1713us+: part_round_stats <-drive_stat_acct

vsftpd-1961 0dN... 1782us+: part_round_stats_single <-part_round_stats

vsftpd-1961 0dN... 1832us!: part_round_stats_single <-part_round_stats

vsftpd-1961 0dN... 1938us+: __elv_add_request <-__make_request

vsftpd-1961 0dN... 1993us!: elv_insert <-__elv_add_request

vsftpd-1961 0dN... 2100us!: elv_rqhash_add <-elv_insert

vsftpd-1961 0dN... 2259us+: cfq_insert_request <-elv_insert

vsftpd-1961 0dN... 2330us+: cfq_init_prio_data <-cfq_insert_request

vsftpd-1961 0dN... 2421us+: cfq_add_rq_rb <-cfq_insert_request

vsftpd-1961 0dN... 2489us!: elv_rb_add <-cfq_add_rq_rb

vsftpd-1961 0dN... 2637us+: cfq_resort_rr_list <-cfq_add_rq_rb

vsftpd-1961 0dN... 2689us+: cfq_service_tree_add <-cfq_resort_rr_list

vsftpd-1961 0dN... 2724us+: cfqq_type <-cfq_service_tree_add

..

vsftpd-1961 0dN... 10463us+: _raw_spin_lock_irqsave <-lock_timer_base

vsftpd-1961 0dN... 10476us+: internal_add_timer <-mod_timer

vsftpd-1961 0dN... 10490us+: _raw_spin_unlock_irqrestore <-mod_timer

vsftpd-1961 0dN... 10562us!: _raw_spin_lock <-scsi_request_fn

vsftpd-1961 0dN... 10736us+: scsi_request_fn <-__blk_run_queue

vsftpd-1961 0dN... 10760us+: trace_hardirqs_on <-__blk_run_queue

….

The six attribute fields in the middle have the following meanings

1, CPU: the ID of the CPU the task is running on

2, irqs-off: 'd' = interrupts disabled, '.' = interrupts enabled, 'X' = the platform does not support reading the IRQ flag

3, need-resched: 'N' = need_resched is set (the process will be rescheduled), '.' = not set

4, hardirq/softirq: 'H' = a hardware interrupt occurred while a software interrupt was in progress, 'h' = hardware interrupt in progress, 's' = software interrupt in progress, '.' = normal process context

5, preempt-depth: the preempt_disabled nesting level

6, lock-depth:

Next comes the time field, in us, giving the time elapsed from the calling function to the called function. If this exceeds preempt_mark_thresh (default 100), the time is suffixed with '!'; if it exceeds 1 us, with '+'; at 1 us or less no marker is shown.

In this example, the region starts when __make_request (implemented in block/blk-core.c) calls spin_lock_irq to disable hardware interrupts, and ends after __blk_run_queue (implemented in block/blk-core.c) re-enables them.

=> Interrupts are re-enabled by calling trace_hardirqs_on (implemented in kernel/lockdep.c), which in turn calls trace_hardirqs_on_caller (also in kernel/lockdep.c) to complete the operation.

2, Wakeup tracer

The configuration path is Kernel hacking ---> Tracers ---> Scheduling Latency Tracer

This shows how long a real-time task has to wait before it gets to run. An example session is shown below

# echo wakeup > current_tracer

# echo 1 > tracing_enabled

# chrt -f 10 ps ==> run the ps command with real-time priority 10

The output is as follows

# tracer: wakeup

#

# wakeup latency trace v1.1.5 on 2.6.38.6

# --------------------------------------------------------------------

# latency: 5961 us, #149/149, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)

# -----------------

# | task: kworker/0:1-9481 (uid:0 nice:0 policy:0 rt_prio:0)

# -----------------

#

# _------=> CPU#

# / _-----=> irqs-off

# | / _----=> need-resched

# || / _---=> hardirq/softirq

# ||| / _--=> preempt-depth

# |||| /_--=> lock-depth

# |||||/ delay

# cmd pid |||||| time | caller

# / / |||||| / | /

gam_serv-1892 0dNs.. 15us!: 1892:139:R + [000] 9481:120:R kworker/0:1

gam_serv-1892 0dNs.. 171us+: wake_up_process <-wake_up_worker

gam_serv-1892 0dNs.. 187us+: check_preempt_curr <-try_to_wake_up

gam_serv-1892 0dNs.. 199us+: check_preempt_wakeup <-check_preempt_curr

gam_serv-1892 0dNs.. 214us+: update_curr <-check_preempt_wakeup

….

gam_serv-1892 0.N... 5392us+: mutex_lock <-inotify_poll

gam_serv-1892 0.N... 5416us+: _cond_resched <-mutex_lock

gam_serv-1892 0.N... 5431us!: __cond_resched <-_cond_resched

gam_serv-1892 0.N... 5668us+: schedule <-__cond_resched

gam_serv-1892 0.N... 5697us+: rcu_note_context_switch <-schedule

gam_serv-1892 0.N... 5727us+: rcu_sched_qs <-rcu_note_context_switch

gam_serv-1892 0.N... 5755us+: _raw_spin_lock_irq <-schedule

gam_serv-1892 0dN... 5785us+: update_rq_clock <-schedule

gam_serv-1892 0dN... 5798us+: put_prev_task_fair <-schedule

gam_serv-1892 0dN... 5805us+: update_curr <-put_prev_task_fair

gam_serv-1892 0dN... 5812us+: check_spread <-put_prev_task_fair

gam_serv-1892 0dN... 5821us+: __enqueue_entity <-put_prev_task_fair

gam_serv-1892 0dN... 5839us+: pick_next_task_fair <-schedule

gam_serv-1892 0dN... 5845us+: __pick_next_entity <-pick_next_task_fair

gam_serv-1892 0dN... 5853us+: clear_buddies <-pick_next_task_fair

gam_serv-1892 0dN... 5862us+: set_next_entity <-pick_next_task_fair

gam_serv-1892 0dN... 5869us+: update_stats_wait_end <-set_next_entity

gam_serv-1892 0dN... 5880us+: __dequeue_entity <-set_next_entity

gam_serv-1892 0d.... 5937us+: schedule <-__cond_resched

gam_serv-1892 0d.... 5943us : 1892:139:R ==> [000] 9481:120:R kworker/0:1

From the output above, about 5961 us elapsed from the time kworker was woken until it regained the CPU. In the listing, function calls taking more than 1 us are marked '+' and those taking more than 100 us are marked '!'. In the lines below, for example, 5880-5869 = 11 us elapsed between set_next_entity calling update_stats_wait_end and set_next_entity calling __dequeue_entity (marked '+' because it exceeds 1 us)

gam_serv-1892 0dN... 5869us+: update_stats_wait_end <-set_next_entity

gam_serv-1892 0dN... 5880us+: __dequeue_entity <-set_next_entity

or

gam_serv-1892 0.N... 5431us!: __cond_resched <-_cond_resched

gam_serv-1892 0.N... 5668us+: schedule <-__cond_resched

Task Context-Switch Schedule tracer

The configuration path is Kernel hacking ---> Tracers ---> Scheduling Latency Tracer

This analyzes task context-switch information; see the following example

# echo > /data/debug/tracing/trace => clear the trace buffer

# echo sched_switch > /data/debug/tracing/current_tracer

# echo 1 > /data/debug/tracing/tracing_enabled

# sleep 1

# echo 0 > /data/debug/tracing/tracing_enabled

#cat /data/debug/tracing/trace

# tracer: sched_switch

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

events/0-4 [000] 1044.598519: 4:115:S ==> [000] 87:120:R AlarmManager

AlarmManager-87 [000] 1044.598749: 87:120:R + [000] 69:118:S ActivityManager

AlarmManager-87 [000] 1044.599057: 87:120:R ==> [000] 69:118:R ActivityManager

ActivityManager-69 [000] 1044.599411: 69:118:S ==> [000] 87:120:R AlarmManager

AlarmManager-87 [000] 1044.599643: 87:120:R + [000] 69:118:S ActivityManager

AlarmManager-87 [000] 1044.599957: 87:120:R ==> [000] 69:118:R ActivityManager

ActivityManager-69 [000] 1044.600498: 69:118:S ==> [000] 87:120:R AlarmManager

AlarmManager-87 [000] 1044.600929: 87:120:R + [000] 69:118:S ActivityManager

AlarmManager-87 [000] 1044.601509: 87:120:R ==> [000] 69:118:R ActivityManager

ActivityManager-69 [000] 1044.602102: 69:118:R + [000] 72:120:S ProcessStats

ActivityManager-69 [000] 1044.604484: 69:118:R + [000] 67:118:S er.ServerThread

ActivityManager-69 [000] 1044.604661: 69:118:R ==> [000] 67:118:R er.ServerThread

er.ServerThread-67 [000] 1044.605145: 67:118:S ==> [000] 72:120:R ProcessStats

ProcessStats-72 [000] 1044.611966: 72:120:R ==> [000] 69:118:R ActivityManager

As shown above, each record includes the task ID/name, the CPU it runs on (for analysis on multi-core systems), and the timestamp of each event. Some of the lines above are explained below

AlarmManager-87 [000] 1044.600929: 87:120:R + [000] 69:118:S ActivityManager

=> AlarmManager (PID 87, priority 120, state Running, on CPU 0) wakes up ActivityManager (PID 69, priority 118, state Sleep, on CPU 0)

AlarmManager-87 [000] 1044.601509: 87:120:R ==> [000] 69:118:R ActivityManager

=> AlarmManager (PID 87, priority 120, state Running, on CPU 0) context-switches to ActivityManager (PID 69, priority 118, state Running, on CPU 0)

ActivityManager-69 [000] 1044.602102: 69:118:R + [000] 72:120:S ProcessStats

=> ActivityManager (PID 69, priority 118, state Running, on CPU 0) wakes up ProcessStats (PID 72, priority 120, state Sleep, on CPU 0)

ActivityManager-69 [000] 1044.604484: 69:118:R + [000] 67:118:S er.ServerThread

=> ActivityManager (PID 69, priority 118, state Running, on CPU 0) wakes up er.ServerThread (PID 67, priority 118, state Sleep, on CPU 0)

ActivityManager-69 [000] 1044.604661: 69:118:R ==> [000] 67:118:R er.ServerThread

=> ActivityManager (PID 69, priority 118, state Running, on CPU 0) context-switches to er.ServerThread (PID 67, priority 118, state Running, on CPU 0)

Task priorities are summarized in the following table

(For the technical background, see also the author's article "Linux 2.4/2.6 核心排程機制剖析" at http://loda.hala01.com/2009/04/linux-2426-%e6%a0%b8%e5%bf%83%e6%8e%92%e7%a8%8b%e6%a9%9f%e5%88%b6%e5%89%96%e6%9e%90/)

Kernel task priority   Description
0 to 99                Real-time priority 99 to 0
100 to 139             Maps to task nice values -20 to 19
140                    Idle task priority

Task states are summarized in the following table

State symbol     Description
R (Running)      Wants to run, may not actually be running
S (Sleep)        Process is waiting to be woken up (handles signals)
D (Disk sleep)   Uninterruptible sleep; process must be woken up (ignores signals)
T (Stopped)      Process suspended
t (Traced)       Process is being traced (with something like gdb)
Z (Zombie)       Process waiting to be cleaned up
X (Unknown)      Unknown state

Block Device Tracer

The configuration path is Kernel hacking ---> Tracers ---> Support for tracing block io actions

This can be used to monitor the usage and behavior of storage devices.

For example, to enable the block device tracer on sda2, the partition on which the author is currently compiling code, run the following commands

#echo 0 > /debug/tracing/tracing_enabled

#echo 1 >/sys/block/sda/sda2/trace/enable

# echo blk > current_tracer

#echo 1 > /debug/tracing/tracing_enabled

….. run the build …..

#echo 0 > /debug/tracing/tracing_enabled

#echo 0 >/sys/block/sda/sda2/trace/enable

Then examine the results in trace, shown below,

# tracer: blk

#

mv-21550 [000] 6663.719323: 8,2 m N cfq21550 sl_used=4 disp=1 charge=4 iops=0 sect=8

mv-21550 [000] 6663.719341: 8,2 m N cfq21550 del_from_rr

mv-21550 [000] 6663.719367: 8,2 m N cfq21550 put_queue

cc1-21557 [000] 6663.920569: 8,2 A R 221728392 + 56 <- (253,0) 221728008

cc1-21557 [000] 6663.920576: 8,2 A R 221937237 + 56 <- (8,2) 221728392

cc1-21557 [000] 6663.920577: 8,2 Q R 221937237 + 56 [cc1]

cc1-21557 [000] 6663.920751: 8,2 m N cfq21557 alloced

cc1-21557 [000] 6663.920816: 8,2 G R 221937237 + 56 [cc1]

cc1-21557 [000] 6663.920875: 8,2 P N [cc1]

cc1-21557 [000] 6663.921844: 8,2 I R 221937237 + 56 [cc1]

cc1-21557 [000] 6663.921870: 8,2 m N cfq21557 insert_request

cc1-21557 [000] 6663.921906: 8,2 m N cfq21557 add_to_rr

as-21558 [000] 6663.923906: 8,2 UT N [as] 1

kworker/0:1-16318 [000] 6663.924047: 8,2 U N [kworker/0:1] 1

kworker/0:1-16318 [000] 6663.924102: 8,2 m N cfq workload slice:100

kworker/0:1-16318 [000] 6663.924146: 8,2 m N cfq21557 set_active wl_prio:0 wl_type:2

…..............

ld-21578 [000] 6665.826767: 8,2 U N [ld] 0

ld-21578 [000] 6665.826865: 8,2 U N [ld] 0

ld-21578 [000] 6665.829130: 8,2 U N [ld] 0

ld-21578 [000] 6665.831853: 8,2 U N [ld] 0

ld-21578 [000] 6665.839538: 8,2 m N cfq21578 put_queue

<...>-21532 [000] 6665.902373: 8,2 m N cfq21532 put_queue

make-21583 [000] 6665.920115: 8,2 A R 221728880 + 8 <- (253,0) 221728496

make-21583 [000] 6665.920121: 8,2 A R 221937725 + 8 <- (8,2) 221728880

make-21583 [000] 6665.920123: 8,2 Q R 221937725 + 8 [make]

make-21583 [000] 6665.920468: 8,2 m N cfq21583 alloced

make-21583 [000] 6665.920646: 8,2 G R 221937725 + 8 [make]

make-21583 [000] 6665.920691: 8,2 P N [make]

make-21583 [000] 6665.920741: 8,2 I R 221937725 + 8 [make]

make-21583 [000] 6665.920770: 8,2 m N cfq21583 insert_request

make-21583 [000] 6665.920800: 8,2 m N cfq21583 add_to_rr

make-21583 [000] 6665.920861: 8,2 U N [make] 1

make-21583 [000] 6665.920897: 8,2 m N cfq workload slice:100

make-21583 [000] 6665.920923: 8,2 m N cfq21583 set_active wl_prio:0 wl_type:2

…..............

Branch tracer

1, Trace likely/unlikely profiler

The configuration path is Kernel hacking ---> Tracers ---> Branch Profiling ---> Trace likely/unlikely profiler

This shows run-time statistics for the kernel's likely/unlikely annotations

An example session is shown below

#echo 0 > tracing_enabled

# echo branch > current_tracer

#echo 1 > tracing_enabled

…..do something.....

#echo 0 > tracing_enabled

The output is as follows

# tracer: branch

#

# TASK-PID CPU# TIMESTAMP CORRECT FUNC:FILE:LINE

# | | | | | |

<idle>-0 [000] 55415.397835: [ ok ] read_seqretry:seqlock.h:110

<idle>-0 [000] 55415.397848: [ ok ] native_sched_clock:tsc.c:55

<idle>-0 [000] 55415.397859: [ ok ] native_sched_clock:tsc.c:55

….

<idle>-0 [000] 55415.398682: [ ok ] __local_bh_enable:softirq.c:138

<idle>-0 [000] 55415.398695: [ ok ] update_wall_time:timekeeping.c:789

<idle>-0 [000] 55415.398710: [ MISS ] timekeeping_adjust:timekeeping.c:711

<idle>-0 [000] 55415.398725: [ ok ] update_wall_time:timekeeping.c:839

<idle>-0 [000] 55415.398734: [ ok ] update_wall_time:timekeeping.c:859

…..

<idle>-0 [000] 55415.401654: [ ok ] update_wall_time:timekeeping.c:789

<idle>-0 [000] 55415.401668: [ MISS ] timekeeping_adjust:timekeeping.c:711

<idle>-0 [000] 55415.401677: [ ok ] update_wall_time:timekeeping.c:839

<idle>-0 [000] 55415.401687: [ ok ] update_wall_time:timekeeping.c:859

…..

For code with a high MISS count, we can adjust the likely/unlikely placement so that gcc's __builtin_expect does its job: the truly hot path is laid out as straight-line code while the cold path sits behind a jump/branch, reducing the chance of the processor pipeline being flushed and improving overall efficiency.

2, Profile all if conditionals

The configuration path is Kernel hacking ---> Tracers ---> Branch Profiling ---> Profile all if conditionals

An example session is shown below

#echo 0 > tracing_enabled

# echo branch > current_tracer

#echo 1 > tracing_enabled

…..do something.....

#echo 0 > tracing_enabled

The output is as follows

# tracer: branch

#

# TASK-PID CPU# TIMESTAMP CORRECT FUNC:FILE:LINE

# | | | | | |

<...>-1806 [000] 297.131833: [ ok ] native_sched_clock:tsc.c:55

<...>-1806 [000] 297.131843: [ ok ] native_sched_clock:tsc.c:55

<...>-1806 [000] 297.131850: [ ok ] sched_clock_local:sched_clock.c:149

<...>-1806 [000] 297.131860: [ ok ] sched_clock_cpu:sched_clock.c:219

<...>-1806 [000] 297.131866: [ ok ] sched_clock_cpu:sched_clock.c:219

<...>-1806 [000] 297.131873: [ ok ] sched_clock_cpu:sched_clock.c:224

<...>-1806 [000] 297.131879: [ ok ] native_sched_clock:tsc.c:55

<...>-1806 [000] 297.131886: [ ok ] sched_clock_local:sched_clock.c:149

<...>-1806 [000] 297.131896: [ ok ] update_curr:sched_fair.c:576

<...>-1806 [000] 297.131903: [ ok ] calc_delta_fair:sched_fair.c:477

<...>-1806 [000] 297.131913: [ ok ] trace_sched_stat_runtime:sched.h:363

<...>-1806 [000] 297.131922: [ ok ] __sched_period:sched_fair.c:496

<...>-1806 [000] 297.131928: [ ok ] sched_slice:sched_fair.c:521

<...>-1806 [000] 297.131935: [ MISS ] calc_delta_mine:sched.c:1327

<...>-1806 [000] 297.131942: [ ok ] resched_task:sched.c:1158

<...>-1806 [000] 297.131949: [ MISS ] test_tsk_need_resched:sched.h:2378

<...>-1806 [000] 297.131956: [ ok ] perf_event_task_tick:perf_event.c:1828

<...>-1806 [000] 297.131963: [ ok ] perf_event_task_tick:perf_event.c:1828

<...>-1806 [000] 297.131970: [ MISS ] trigger_load_balance:sched_fair.c:3989

<...>-1806 [000] 297.131978: [ ok ] run_posix_cpu_timers:posix-cpu-timers.c:1312

<...>-1806 [000] 297.131995: [ ok ] __local_bh_disable:softirq.c:98

<...>-1806 [000] 297.132002: [ ok ] __local_bh_disable:softirq.c:98

<...>-1806 [000] 297.132032: [ ok ] trace_softirq_entry:irq.h:117

<...>-1806 [000] 297.132054: [ ok ] trace_softirq_exit:irq.h:131

Kernel memory tracer

The memory tracer tracks slab allocator activity, including calls to kfree, kmem_cache_alloc, and related APIs. From the information the tracer collects, user programs can analyze internal fragmentation, locate the code paths that allocate memory most frequently, and so on.

The configuration path is Kernel hacking ---> Tracers ---> Trace SLAB allocations

An example session is shown below

# echo 0 > tracing_enabled

# echo kmemtrace > current_tracer

# echo 1 > tracing_enabled

# echo 0 > tracing_enabled

….etc

Workqueue statistical tracer

This is a statistic tracer that reports the current state of every workqueue in the system: how many work items have been inserted into each queue, how many have been executed, and the work names. It gives developers useful feedback on how the workqueue mechanism behaves.

The configuration path is Kernel hacking ---> Tracers ---> Trace workqueues. After boot, the contents can be viewed under /data/debug/tracing/trace_stat/workqueues, as shown below

# CPU INSERTED EXECUTED NAME

# | | | |

0 2139 2139 events/0

0 0 0 khelper

0 0 0 suspend

0 1 1 kblockd/0

0 1 1 kmmcd

0 0 0 aio/0

0 0 0 crypto/0

0 0 0 rpciod/0

Profiling

1, unlikely/likely

Shows the statistics for likely/unlikely conditionals

The statistics are located at /debug/tracing/trace_stat/branch_annotated

as shown below

correct incorrect % Function File Line

------- --------- - -------- ---- ----

0 1 100 tcp_synack_options tcp_output.c 535

0 1 100 tcp_synack_options tcp_output.c 529

0 1 100 tcp_synack_options tcp_output.c 524

0 9 100 tcp_options_write tcp_output.c 397

0 3622 100 tcp_established_options tcp_output.c 562

0 7 100 sys_inotify_add_watch inotify_user.c 750

0 7 100 fput_light file.h 29

0 17920 100 get_futex_key futex.c 269

0 28 100 clocksource_adjust timekeeping.c 472

0 4 100 blocking_notifier_chain_regist notifier.c 220

0 1 100 signal_pending sched.h 2251

0 6849 100 trace_workqueue_execution workqueue.h 53

0 8 100 trace_workqueue_creation workqueue.h 76

0 6849 100 trace_workqueue_insertion workqueue.h 31

0 38 100 yield_task_fair sched_fair.c 1016

0 25617 100 _pick_next_task_rt sched_rt.c 1041

0 92856 100 sched_info_switch sched_stats.h 269

0 85509 100 sched_info_queued sched_stats.h 222

0 54733 100 sched_info_dequeued sched_stats.h 177

0 1 100 consistent_init dma-mapping.c 470

9 1710 99 tcp_options_write tcp_output.c 396

2 354 99 disk_put_part genhd.h 209

606 67234 99 fput_light file.h 29

35 9260 99 trace_kmalloc kmem.h 79

313 12957 97 fput_light file.h 29

542 10337 95 sys_gettimeofday time.c 111

70 770 91 blk_update_request blk-core.c 1968

28116 135585 82 fget_light file_table.c 332

2 8 80 trace_kmalloc kmem.h 79

24893 53424 68 fput_light file.h 29

4547 8951 66 need_resched sched.h 2273

750 1338 64 fput_light file.h 29

15323 25008 62 wakeup_gran sched_fair.c 1382

44 61 58 mmdrop sched.h 2024

821 1006 55 pskb_trim_rcsum skbuff.h 1690

84797 95325 52 calc_delta_fair sched_fair.c 400

1 1 50 remap_pte_range memory.c 1634

270 270 50 isolate_lru_pages vmscan.c 926

32028 27195 45 rmqueue_bulk page_alloc.c 907

1970 1505 43 copy_pte_range memory.c 618

24549 17984 42 trace_kfree kmem.h 208

2, branch all

Shows the statistics for all if conditionals

The statistics are located at /debug/tracing/trace_stat/branch_all

as shown below

miss hit % Function File Line

------- --------- - -------- ---- ----

0 1 100 i386_start_kernel head32.c 50

1 0 0 reserve_ebda_region head.c 51

1 0 0 reserve_ebda_region head.c 47

0 1 100 reserve_ebda_region head.c 42

0 0 X nrcpus main.c 167

0 0 X maxcpus main.c 178

1098 0 0 do_one_initcall main.c 773

1098 0 0 do_one_initcall main.c 769

1098 0 0 do_one_initcall main.c 765

1098 0 0 do_one_initcall main.c 762

1098 0 0 do_one_initcall main.c 755

1 0 0 start_kernel main.c 682

0 1 100 start_kernel main.c 675

1 0 0 start_kernel main.c 661

1 0 0 start_kernel main.c 647

1 0 0 start_kernel main.c 630

1 0 0 start_kernel main.c 611

1 0 0 kernel_init main.c 914

0 1 100 kernel_init main.c 911

1 0 0 kernel_init main.c 901

1 0 0 smp_init main.c 399

1 0 0 smp_init main.c 397

1 1 50 cpumask_next cpumask.h 172

0 0 X init_post main.c 850

0 1 100 init_post main.c 838

0 0 X unknown_bootoption main.c 321

0 0 X unknown_bootoption main.c 313

0 0 X unknown_bootoption main.c 309

0 0 X unknown_bootoption main.c 305

0 0 X unknown_bootoption main.c 302

0 0 X unknown_bootoption main.c 299

0 2 100 unknown_bootoption main.c 295

0 0 X unknown_bootoption main.c 286

0 1 100 unknown_bootoption main.c 284

1 1 50 unknown_bootoption main.c 282

0 2 100 obsolete_checksetup main.c 235

2 0 0 obsolete_checksetup main.c 231

0 0 X obsolete_checksetup main.c 229

2 0 0 obsolete_checksetup main.c 224

28 2 6 obsolete_checksetup main.c 223

1 1 50 parse_early_param main.c 496

0 0 X do_early_param main.c 476

368 0 0 do_early_param main.c 475

1 0 0 readonly do_mounts.c 44

3, Functions

Shows execution counts and timing statistics for all functions.

An example session is shown below

# echo 1 > /debug/tracing/function_profile_enabled

…..do something.....

# echo 0 > /debug/tracing/function_profile_enabled

#cat /debug/tracing/trace_stat/function0

The results look like this

Function Hit Time Avg s^2

-------- --- ---- ------

schedule 752 99800000 us 132712.7 us 2635264000 us

poll_schedule_timeout 57 37432000 us 656701.7 us 2143996599 us

schedule_hrtimeout_range 55 37429000 us 680527.2 us 339918459 us

schedule_hrtimeout_range_clock 55 37424000 us 680436.3 us 237544596 us

sys_select 87 25028000 us 287678.1 us 306802216 us

core_sys_select 87 25012000 us 287494.2 us 155578451 us

do_select 87 24875000 us 285919.5 us 323446561 us

smp_apic_timer_interrupt 15677 16056000 us 1024.175 us 76796.05 us

do_timer 15677 15677000 us 1000.000 us 0.000 us

tick_periodic 15677 15677000 us 1000.000 us 0.000 us

tick_handle_periodic 15677 15677000 us 1000.000 us 0.000 us

sys_poll 14 14227000 us 1016214 us 3503583162 us

do_sys_poll 14 14217000 us 1015500 us 3379054126 us

sys_nanosleep 15 13957000 us 930466.6 us 1834329984 us

hrtimer_nanosleep 15 13945000 us 929666.6 us 1719729983 us

do_nanosleep 15 13944000 us 929600.0 us 1710319131 us

sys_read 167 13503000 us 80856.28 us 1298120022 us

vfs_read 172 13490000 us 78430.23 us 2981769264 us

tty_read 61 13297000 us 217983.6 us 2678381427 us

n_tty_read 61 13256000 us 217311.4 us 2834883001 us

schedule_timeout 20 12966000 us 648300.0 us 2458710875 us

Max Stack Tracer

Shows the deepest kernel stack usage observed, and the call stack that produced it.

The configuration path is Kernel hacking ---> Tracers ---> Trace max stack

# echo 1 > /proc/sys/kernel/stack_tracer_enabled

# sleep 10

# echo 0 > /proc/sys/kernel/stack_tracer_enabled

# cat stack_max_size

2152

# cat stack_trace

Depth Size Location (22 entries)

----- ---- --------

0) 2084 68 update_curr+0x100/0x210

1) 2016 56 enqueue_task_fair+0x90/0x4e0

2) 1960 24 enqueue_task+0x8d/0xb0

3) 1936 12 activate_task+0x25/0x50

4) 1924 88 try_to_wake_up+0x24a/0x5f0

5) 1836 8 wake_up_process+0x14/0x20

6) 1828 8 hrtimer_wakeup+0x2c/0x30

7) 1820 80 hrtimer_run_queues+0x1d4/0x480

8) 1740 8 run_local_timers+0xd/0x20

9) 1732 20 update_process_times+0x34/0x90

10) 1712 8 tick_periodic+0x7a/0x90

11) 1704 32 tick_handle_periodic+0x1e/0xb0

12) 1672 8 smp_apic_timer_interrupt+0x9c/0x9e

13) 1664 108 apic_timer_interrupt+0x2f/0x34

14) 1556 232 schedule+0x62b/0x10f0

15) 1324 76 schedule_hrtimeout_range_clock+0x14b/0x170

16) 1248 12 schedule_hrtimeout_range+0x17/0x20

17) 1236 20 poll_schedule_timeout+0x4c/0x70

18) 1216 764 do_select+0x789/0x7c0

19) 452 332 core_sys_select+0x27d/0x460

20) 120 40 sys_select+0x40/0xe0

21) 80 80 syscall_call+0x7/0xb

This lists the deepest stack size seen so far and, for that case, the full call stack together with the space each function occupies (including local variables and function parameters). Developers can use it to review whether a function and its call path are reasonably designed.

Event tracing

Events are where almost all of Ftrace's information comes together; listing the /debug/tracing/events directory shows the following

drwxr-xr-x 4 root root 0 XXX 20 23:55 bkl

drwxr-xr-x 20 root root 0 XXX 20 23:55 block

-rw-r--r-- 1 root root 0 XXX 20 23:55 enable

drwxr-xr-x 14 root root 0 XXX 20 23:55 ftrace

-r--r--r-- 1 root root 0 XXX 20 23:55 header_event

-r--r--r-- 1 root root 0 XXX 20 23:55 header_page

drwxr-xr-x 7 root root 0 XXX 20 23:55 irq

drwxr-xr-x 14 root root 0 XXX 20 23:55 kmem

drwxr-xr-x 3 root root 0 XXX 20 23:55 mce

drwxr-xr-x 7 root root 0 XXX 20 23:55 module

drwxr-xr-x 3 root root 0 XXX 20 23:55 napi

drwxr-xr-x 6 root root 0 XXX 20 23:55 net

drwxr-xr-x 12 root root 0 XXX 20 23:55 power

drwxr-xr-x 4 root root 0 XXX 20 23:55 raw_syscalls

drwxr-xr-x 18 root root 0 XXX 20 23:55 sched

drwxr-xr-x 7 root root 0 XXX 20 23:55 scsi

drwxr-xr-x 6 root root 0 XXX 20 23:55 signal

drwxr-xr-x 5 root root 0 XXX 20 23:55 skb

drwxr-xr-x 634 root root 0 XXX 20 23:55 syscalls

drwxr-xr-x 14 root root 0 XXX 20 23:55 timer

drwxr-xr-x 15 root root 0 XXX 20 23:55 vmscan

drwxr-xr-x 6 root root 0 XXX 20 23:55 workqueue

drwxr-xr-x 23 root root 0 XXX 20 23:55 writeback

Apart from the three control files enable, header_event and header_page, each directory holds one category of event parameters. In the author's environment, the following event categories are available

Name

Description

bkl

The BKL dates back to Linux Kernel 2.2, where it provided a simple interim SMP mechanism: besides letting multiple tasks run concurrently on multi-core architectures, a global spin-lock was needed to ensure that only one task at a time operated inside a subsystem that required protection, hence the BKL. The mechanism has many shortcomings, however (the author plans to cover multi-core kernel operation in a future article), so newer kernels have been gradually removing the BKL in favor of designs better suited to multi-core systems.

See the kernel lock implementation in lib/kernel_lock.c. BKL stands for Big Kernel Lock; it lets individual subsystems ensure that only one task operates in the subsystem at a time (for example, the reiserfs filesystem in Kernel 2.6.31 uses the BKL on SMP (Symmetric Multi-Processors) configurations to ensure that only one task can operate on reiserfs at a time). The BKL is not recommended for newly written code, and as of Linux Kernel 2.6.39 it has been removed entirely.

The events/bkl directory supports the following events

lock_kernel unlock_kernel

#echo 1 > /debug/tracing/events/bkl/enable

Then examine the results in /debug/tracing/trace,

block

This records the actions handled by blk-core (in block/blk-core.c), letting developers observe how processes interact with block devices.

The events/block directory supports the following events

block_bio_backmerge block_bio_remap block_rq_insert block_split block_bio_bounce block_getrq block_rq_issue block_unplug_io block_bio_complete block_plug block_rq_remap block_unplug_timer block_bio_frontmerge block_rq_abort block_rq_requeue block_bio_queue block_rq_complete block_sleeprq

#echo 1 > /debug/tracing/events/block/enable

Then examine the results in /debug/tracing/trace,

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

kjournald-48 [000] 5699.102000: block_bio_queue: 253,0 WS 217891064 + 8 [kjournald]

kjournald-48 [000] 5699.102000: block_bio_remap: 8,2 WS 217891448 + 8 <- (253,0) 217891064

kjournald-48 [000] 5699.102000: block_bio_remap: 8,0 WS 218100293 + 8 <- (8,2) 217891448

kjournald-48 [000] 5699.102000: block_bio_queue: 8,0 WS 218100293 + 8 [kjournald]

kjournald-48 [000] 5699.102000: block_getrq: 8,0 WS 218100293 + 8 [kjournald]

kjournald-48 [000] 5699.102000: block_plug: [kjournald]

kjournald-48 [000] 5699.102000: block_rq_insert: 8,0 WS 0 () 218100293 + 8 [kjournald]

kjournald-48 [000] 5699.102000: block_rq_issue: 8,0 WS 0 () 218100293 + 8 [kjournald]

<...>-13394 [000] 5699.114000: block_rq_complete: 8,0 WS () 218100293 + 8 [0]

<...>-13394 [000] 5699.114000: block_bio_complete: 253,0 WS 217891064 + 8 [0]

kjournald-48 [000] 5699.126000: block_bio_queue: 253,0 WS 61864 + 8 [kjournald]

kjournald-48 [000] 5699.126000: block_bio_remap: 8,2 WS 62248 + 8 <- (253,0) 61864

kjournald-48 [000] 5699.126000: block_bio_remap: 8,0 WS 271093 + 8 <- (8,2) 62248

kjournald-48 [000] 5699.126000: block_bio_queue: 8,0 WS 271093 + 8 [kjournald]

kjournald-48 [000] 5699.126000: block_getrq: 8,0 WS 271093 + 8 [kjournald]

kjournald-48 [000] 5699.126000: block_plug: [kjournald]

kjournald-48 [000] 5699.126000: block_rq_insert: 8,0 WS 0 () 271093 + 8 [kjournald]

kjournald-48 [000] 5699.126000: block_rq_issue: 8,0 WS 0 () 271093 + 8 [kjournald]

kjournald-48 [000] 5699.127000: block_rq_complete: 8,0 WS () 271093 + 8 [0]

kjournald-48 [000] 5699.127000: block_bio_complete: 253,0 WS 61864 + 8 [0]

kjournald-48 [000] 5699.128000: block_bio_queue: 253,0 WS 61872 + 8 [kjournald]

kjournald-48 [000] 5699.128000: block_bio_remap: 8,2 WS 62256 + 8 <- (253,0) 61872

kjournald-48 [000] 5699.128000: block_bio_remap: 8,0 WS 271101 + 8 <- (8,2) 62256

kjournald-48 [000] 5699.128000: block_bio_queue: 8,0 WS 271101 + 8 [kjournald]

kjournald-48 [000] 5699.128000: block_getrq: 8,0 WS 271101 + 8 [kjournald]

kjournald-48 [000] 5699.128000: block_plug: [kjournald]

kjournald-48 [000] 5699.128000: block_rq_insert: 8,0 WS 0 () 271101 + 8 [kjournald]

kjournald-48 [000] 5699.128000: block_rq_issue: 8,0 WS 0 () 271101 + 8 [kjournald]

kjournald-48 [000] 5699.129000: block_rq_complete: 8,0 WS () 271101 + 8 [0]

kjournald-48 [000] 5699.129000: block_bio_complete: 253,0 WS 61872 + 8 [0]

kjournald-48 [000] 5699.129000: block_bio_queue: 253,0 WS 61880 + 8 [kjournald]

irq

The events/irq directory supports the following events

irq_handler_entry softirq_entry softirq_raise irq_handler_exit softirq_exit

#echo 1 > /debug/tracing/events/irq/enable

Then examine the results in /debug/tracing/trace,

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

<idle>-0 [000] 5601.436000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.436000: softirq_raise: vec=9 [action=RCU]

<idle>-0 [000] 5601.436000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.436000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.436000: softirq_entry: vec=9 [action=RCU]

<idle>-0 [000] 5601.436000: softirq_exit: vec=9 [action=RCU]

<idle>-0 [000] 5601.437000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.437000: softirq_raise: vec=9 [action=RCU]

<idle>-0 [000] 5601.437000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.437000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.437000: softirq_entry: vec=9 [action=RCU]

<idle>-0 [000] 5601.437000: softirq_exit: vec=9 [action=RCU]

<idle>-0 [000] 5601.438000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.438000: softirq_raise: vec=9 [action=RCU]

<idle>-0 [000] 5601.438000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.438000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.438000: softirq_entry: vec=9 [action=RCU]

<idle>-0 [000] 5601.438000: softirq_exit: vec=9 [action=RCU]

<idle>-0 [000] 5601.439000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.439000: softirq_raise: vec=9 [action=RCU]

<idle>-0 [000] 5601.439000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.439000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.439000: softirq_entry: vec=9 [action=RCU]

<idle>-0 [000] 5601.439000: softirq_exit: vec=9 [action=RCU]

<idle>-0 [000] 5601.440000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.440000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.440000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.441000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.441000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.441000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.442000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.442000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.442000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.443000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.443000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.443000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.444000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.444000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.444000: softirq_exit: vec=1 [action=TIMER]

<idle>-0 [000] 5601.445000: softirq_raise: vec=1 [action=TIMER]

<idle>-0 [000] 5601.445000: softirq_entry: vec=1 [action=TIMER]

<idle>-0 [000] 5601.445000: softirq_exit: vec=1 [action=TIMER]
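Besides enabling the whole irq group, each event also has its own enable file (for example `echo 1 > /debug/tracing/events/irq/softirq_entry/enable`). Trace output like the above can then be summarized with standard shell tools; the sketch below (the excerpt file and /tmp path are illustrative) tallies how often each softirq event fires, broken down by action:

```shell
# Excerpt of the irq trace shown above; on a live system, capture it with
#   cat /debug/tracing/trace > /tmp/irq.trace
cat > /tmp/irq.trace <<'EOF'
<idle>-0 [000] 5601.436000: softirq_raise: vec=1 [action=TIMER]
<idle>-0 [000] 5601.436000: softirq_raise: vec=9 [action=RCU]
<idle>-0 [000] 5601.436000: softirq_entry: vec=1 [action=TIMER]
<idle>-0 [000] 5601.436000: softirq_exit: vec=1 [action=TIMER]
EOF
# Tally occurrences of each softirq event per vector/action
summary=$(grep -o 'softirq_[a-z]*: vec=[0-9]* \[action=[A-Z_]*\]' /tmp/irq.trace | sort | uniq -c)
echo "$summary"
```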

kmem

Listing the events/kmem directory shows that the following events are supported:

kmalloc_node mm_page_alloc mm_page_pcpu_drain kmem_cache_alloc mm_page_alloc_extfrag mm_pagevec_free kfree kmem_cache_alloc_node mm_page_alloc_zone_locked kmalloc kmem_cache_free mm_page_free_direct

#echo 1 > /debug/tracing/events/kmem/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

cc1-12884 [000] 5513.761000: kmem_cache_alloc: call_site=c0527d6e ptr=f4ed5300 bytes_req=192 bytes_alloc=192 gfp_flags=GFP_KERNEL|GFP_ZERO

cc1-12884 [000] 5513.762000: kmem_cache_free: call_site=c047fb8b ptr=f59e7740

cc1-12884 [000] 5513.762000: kmem_cache_free: call_site=c0527d1b ptr=f4eb3680

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050c13a ptr=f4f2217c bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL|GFP_ZERO

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050c13a ptr=f4f2222c bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL|GFP_ZERO

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050be6e ptr=f4f53b1c bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL|GFP_ZERO

cc1-12884 [000] 5513.762000: mm_page_alloc: page=f7490cc0 pfn=316902 order=0 migratetype=0 gfp_flags=GFP_KERNEL|GFP_REPEAT|GFP_ZERO|0x2

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c0511475 ptr=f4f407d0 bytes_req=24 bytes_alloc=24 gfp_flags=GFP_KERNEL

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c05114c5 ptr=f4f58ac8 bytes_req=16 bytes_alloc=16 gfp_flags=GFP_KERNEL

cc1-12884 [000] 5513.762000: mm_page_alloc: page=f7214760 pfn=235451 order=0 migratetype=2 gfp_flags=GFP_HIGHUSER_MOVABLE

cc1-12884 [000] 5513.762000: kmalloc: call_site=c057c8e7 ptr=f4cc7800 bytes_req=192 bytes_alloc=192 gfp_flags=GFP_KERNEL

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050c13a ptr=f4f79124 bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL|GFP_ZERO

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050af83 ptr=f4f221d4 bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL

cc1-12884 [000] 5513.762000: kmem_cache_free: call_site=c050a063 ptr=f4f221d4

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050c13a ptr=f4f221d4 bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL|GFP_ZERO

cc1-12884 [000] 5513.762000: mm_page_alloc: page=f7490c80 pfn=316900 order=0 migratetype=0 gfp_flags=GFP_KERNEL|GFP_REPEAT|GFP_ZERO|0x2

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c0511475 ptr=f4edb368 bytes_req=24 bytes_alloc=24 gfp_flags=GFP_KERNEL

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c05114c5 ptr=f4f58438 bytes_req=16 bytes_alloc=16 gfp_flags=GFP_KERNEL

cc1-12884 [000] 5513.762000: mm_page_alloc: page=f7284d80 pfn=249836 order=0 migratetype=2 gfp_flags=GFP_HIGHUSER_MOVABLE

cc1-12884 [000] 5513.762000: kfree: call_site=c057cac7 ptr=f4cc7800

cc1-12884 [000] 5513.762000: kfree: call_site=c057be25 ptr=f4f517e0

cc1-12884 [000] 5513.762000: kfree: call_site=c057be2d ptr=f5af3280

cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050baa3 ptr=f4f79f94 bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL|GFP_ZERO

cc1-12884 [000] 5513.762000: mm_page_alloc: page=f727d7a0 pfn=248893 order=0 migratetype=2 gfp_flags=GFP_HIGHUSER_MOVABLE|GFP_ZERO

cc1-12884 [000] 5513.762000: kfree: call_site=c057b030 ptr=f4ee5f40

cc1-12884 [000] 5513.762000: kfree: call_site=c052ce9b ptr=f5af3480

cc1-12884 [000] 5513.762000: kmem_cache_free: call_site=c0536936 ptr=f4cdf000

cc1-12884 [000] 5513.762000: mm_page_alloc: page=f72571c0 pfn=243982 order=0 migratetype=2 gfp_flags=GFP_HIGHUSER_MOVABLE

cc1-12884 [000] 5513.763000: mm_page_alloc: page=f7490be0 pfn=316895 order=0 migratetype=0 gfp_flags=GFP_KERNEL|GFP_REPEAT|GFP_ZERO|0x2

cc1-12884 [000] 5513.763000: mm_page_alloc: page=f7274860 pfn=247747 order=0 migratetype=2 gfp_flags=GFP_HIGHUSER_MOVABLE
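A useful way to digest the kmem events is to sum bytes_alloc per call_site; the resulting addresses can then be resolved to function names via System.map or /proc/kallsyms. The sketch below works on an excerpt of the trace above (the /tmp path is illustrative):

```shell
# Excerpt of the kmem trace shown above
cat > /tmp/kmem.trace <<'EOF'
cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050c13a ptr=f4f2217c bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL|GFP_ZERO
cc1-12884 [000] 5513.762000: kmem_cache_alloc: call_site=c050c13a ptr=f4f2222c bytes_req=88 bytes_alloc=88 gfp_flags=GFP_KERNEL|GFP_ZERO
cc1-12884 [000] 5513.762000: kmalloc: call_site=c057c8e7 ptr=f4cc7800 bytes_req=192 bytes_alloc=192 gfp_flags=GFP_KERNEL
EOF
# Sum bytes_alloc per call_site to find the heaviest allocation sites
alloc=$(awk 'match($0, /call_site=[0-9a-f]+/) { site = substr($0, RSTART+10, RLENGTH-10) }
             match($0, /bytes_alloc=[0-9]+/)  { total[site] += substr($0, RSTART+12, RLENGTH-12) }
             END { for (s in total) print s, total[s] }' /tmp/kmem.trace | sort)
echo "$alloc"
```

For this excerpt, call_site c050c13a accounts for 176 bytes and c057c8e7 for 192 bytes.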

mce

Listing the events/mce directory shows that the following events are supported:

mce_record

module

Listing the events/module directory shows that the following events are supported:

module_free module_load module_request module_get module_put

#echo 1 > /debug/tracing/events/module/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

insmod-12522 [000] 5384.763000: module_load: cramfs

insmod-12522 [000] 5384.766000: module_put: cramfs call_site=sys_init_module refcnt=2

rmmod-12526 [000] 5396.917000: module_free: cramfs

napi

Listing the events/napi directory shows that the following events are supported:

napi_poll

#echo 1 > /debug/tracing/events/napi/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

<...>-12085 [000] 5144.521000: napi_poll: napi poll on napi struct f5981460 for device eth0

<...>-12090 [000] 5144.695000: napi_poll: napi poll on napi struct f5981460 for device eth0

<...>-12090 [000] 5144.695000: napi_poll: napi poll on napi struct f5981460 for device eth0

make-12099 [000] 5144.895000: napi_poll: napi poll on napi struct f5981460 for device eth0

make-12099 [000] 5144.895000: napi_poll: napi poll on napi struct f5981460 for device eth0

cc1-12102 [000] 5144.969000: napi_poll: napi poll on napi struct f5981460 for device eth0

cc1-12102 [000] 5144.972000: napi_poll: napi poll on napi struct f5981460 for device eth0

as-12103 [000] 5145.005000: napi_poll: napi poll on napi struct f5981460 for device eth0

make-12111 [000] 5145.175000: napi_poll: napi poll on napi struct f5981460 for device eth0

date-12120 [000] 5145.363000: napi_poll: napi poll on napi struct f5981460 for device eth0

in.telnetd-1955 [000] 5145.366000: napi_poll: napi poll on napi struct f5981460 for device eth0

grep-12132 [000] 5145.566000: napi_poll: napi poll on napi struct f5981460 for device eth0

sh-12136 [000] 5145.631000: napi_poll: napi poll on napi struct f5981460 for device eth0

bash-1957 [000] 5145.634000: napi_poll: napi poll on napi struct f5981460 for device eth0

sh-12143 [000] 5145.834000: napi_poll: napi poll on napi struct f5981460 for device eth0

top-9088 [000] 5145.855000: napi_poll: napi poll on napi struct f5981460 for device eth0

sh-12143 [000] 5145.857000: napi_poll: napi poll on napi struct f5981460 for device eth0

make-12147 [000] 5145.915000: napi_poll: napi poll on napi struct f5981460 for device eth0

make-12147 [000] 5145.920000: napi_poll: napi poll on napi struct f5981460 for device eth0

cc1-12152 [000] 5146.053000: napi_poll: napi poll on napi struct f5981460 for device eth0

cc1-12152 [000] 5146.053000: napi_poll: napi poll on napi struct f5981460 for device eth0

as-12153 [000] 5146.113000: napi_poll: napi poll on napi struct f5981460 for device eth0

make-12161 [000] 5146.253000: napi_poll: napi poll on napi struct f5981460 for device eth0

sh-12162 [000] 5146.540000: napi_poll: napi poll on napi struct f5981460 for device eth0

rm-12168 [000] 5146.543000: napi_poll: napi poll on napi struct f5981460 for device eth0

make-12173 [000] 5146.720000: napi_poll: napi poll on napi struct f5981460 for device eth0

bash-1957 [000] 5146.724000: napi_poll: napi poll on napi struct f5981460 for device eth0

ld-12179 [000] 5146.901000: napi_poll: napi poll on napi struct f5981460 for device eth0

ld-12179 [000] 5146.903000: napi_poll: napi poll on napi struct f5981460 for device eth0

make-12185 [000] 5147.004000: napi_poll: napi poll on napi struct f5981460 for device eth0

net

Listing the events/net directory shows that the following events are supported:

net_dev_queue net_dev_xmit netif_receive_skb netif_rx

#echo 1 > /debug/tracing/events/net/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

in.telnetd-1955 [000] 4446.238000: net_dev_queue: dev=eth0 skbaddr=f5aa2470 len=56

in.telnetd-1955 [000] 4446.238000: net_dev_xmit: dev=eth0 skbaddr=f5aa2470 len=56 rc=0

<idle>-0 [000] 4446.436000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

<idle>-0 [000] 4446.436000: net_dev_queue: dev=eth0 skbaddr=f5aa2bf0 len=80

<idle>-0 [000] 4446.436000: net_dev_xmit: dev=eth0 skbaddr=f5aa2bf0 len=80 rc=0

<idle>-0 [000] 4446.593000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

in.telnetd-1955 [000] 4446.596000: net_dev_queue: dev=eth0 skbaddr=f5aa2bf0 len=66

in.telnetd-1955 [000] 4446.596000: net_dev_xmit: dev=eth0 skbaddr=f5aa2bf0 len=66 rc=0

in.telnetd-9058 [000] 4446.689000: net_dev_queue: dev=eth0 skbaddr=f5aa2470 len=136

in.telnetd-9058 [000] 4446.689000: net_dev_xmit: dev=eth0 skbaddr=f5aa2470 len=136 rc=0

<idle>-0 [000] 4446.782000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

in.telnetd-1955 [000] 4446.785000: net_dev_queue: dev=eth0 skbaddr=f5aa2bf0 len=95

in.telnetd-1955 [000] 4446.785000: net_dev_xmit: dev=eth0 skbaddr=f5aa2bf0 len=95 rc=0

<idle>-0 [000] 4446.888000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

<idle>-0 [000] 4446.888000: net_dev_queue: dev=eth0 skbaddr=f5aa2d70 len=953

<idle>-0 [000] 4446.888000: net_dev_xmit: dev=eth0 skbaddr=f5aa2d70 len=953 rc=0

<idle>-0 [000] 4446.983000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

<idle>-0 [000] 4447.087000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

<idle>-0 [000] 4447.550000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

in.telnetd-1955 [000] 4447.552000: net_dev_queue: dev=eth0 skbaddr=f5aa2d70 len=82

in.telnetd-1955 [000] 4447.552000: net_dev_xmit: dev=eth0 skbaddr=f5aa2d70 len=82 rc=0

<idle>-0 [000] 4447.752000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

<idle>-0 [000] 4447.974000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

in.telnetd-1955 [000] 4447.976000: net_dev_queue: dev=eth0 skbaddr=f5aa2d70 len=56

in.telnetd-1955 [000] 4447.976000: net_dev_xmit: dev=eth0 skbaddr=f5aa2d70 len=56 rc=0

<idle>-0 [000] 4448.175000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

<idle>-0 [000] 4448.175000: net_dev_queue: dev=eth0 skbaddr=f5aa2bf0 len=1138

<idle>-0 [000] 4448.175000: net_dev_xmit: dev=eth0 skbaddr=f5aa2bf0 len=1138 rc=0

<idle>-0 [000] 4448.373000: netif_receive_skb: dev=eth0 skbaddr=f4f6ad40 len=46

in.telnetd-9058 [000] 4448.701000: net_dev_queue: dev=eth0 skbaddr=f5aa2bf0 len=1004

in.telnetd-9058 [000] 4448.701000: net_dev_xmit: dev=eth0 skbaddr=f5aa2bf0 len=1004 rc=0
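The net_dev_xmit events carry the length of each skb handed to the driver, so total transmit volume per device falls out of a one-liner. A sketch over an excerpt of the trace above (the /tmp path is illustrative):

```shell
# Excerpt of the net trace shown above
cat > /tmp/net.trace <<'EOF'
in.telnetd-1955 [000] 4446.238000: net_dev_xmit: dev=eth0 skbaddr=f5aa2470 len=56 rc=0
<idle>-0 [000] 4446.436000: net_dev_xmit: dev=eth0 skbaddr=f5aa2bf0 len=80 rc=0
in.telnetd-1955 [000] 4446.596000: net_dev_xmit: dev=eth0 skbaddr=f5aa2bf0 len=66 rc=0
EOF
# Add up the bytes handed to the driver, per device
xmit=$(awk '/net_dev_xmit/ {
              if (match($0, /dev=[^ ]+/)) dev = substr($0, RSTART+4, RLENGTH-4)
              if (match($0, /len=[0-9]+/)) total[dev] += substr($0, RSTART+4, RLENGTH-4)
            }
            END { for (d in total) print d, total[d], "bytes" }' /tmp/net.trace)
echo "$xmit"
```

For this excerpt the result is `eth0 202 bytes` (56 + 80 + 66).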

power

Listing the events/power directory shows that the following events are supported:

clock_disable cpu_frequency power_end clock_enable cpu_idle machine_suspend power_frequency clock_set_rate power_domain_target power_start

#echo 1 > /debug/tracing/events/power/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

<idle>-0 [000] 4370.240000: power_start: type=1 state=1 cpu_id=0

<idle>-0 [000] 4370.240000: cpu_idle: state=1 cpu_id=0

<idle>-0 [000] 4370.241000: power_end: cpu_id=0

<idle>-0 [000] 4370.241000: cpu_idle: state=4294967295 cpu_id=0

<idle>-0 [000] 4370.241000: power_start: type=1 state=1 cpu_id=0

<idle>-0 [000] 4370.241000: cpu_idle: state=1 cpu_id=0

<idle>-0 [000] 4370.242000: power_end: cpu_id=0

<idle>-0 [000] 4370.242000: cpu_idle: state=4294967295 cpu_id=0

<idle>-0 [000] 4370.242000: power_start: type=1 state=1 cpu_id=0

<idle>-0 [000] 4370.242000: cpu_idle: state=1 cpu_id=0

<idle>-0 [000] 4370.243000: power_end: cpu_id=0

<idle>-0 [000] 4370.243000: cpu_idle: state=4294967295 cpu_id=0

<idle>-0 [000] 4370.243000: power_start: type=1 state=1 cpu_id=0

<idle>-0 [000] 4370.243000: cpu_idle: state=1 cpu_id=0

<idle>-0 [000] 4370.244000: power_end: cpu_id=0

<idle>-0 [000] 4370.244000: cpu_idle: state=4294967295 cpu_id=0

<idle>-0 [000] 4370.244000: power_start: type=1 state=1 cpu_id=0

<idle>-0 [000] 4370.244000: cpu_idle: state=1 cpu_id=0

<idle>-0 [000] 4370.245000: power_end: cpu_id=0

<idle>-0 [000] 4370.245000: cpu_idle: state=4294967295 cpu_id=0

<idle>-0 [000] 4370.245000: power_start: type=1 state=1 cpu_id=0

<idle>-0 [000] 4370.245000: cpu_idle: state=1 cpu_id=0

<idle>-0 [000] 4370.246000: power_end: cpu_id=0

<idle>-0 [000] 4370.246000: cpu_idle: state=4294967295 cpu_id=0

<idle>-0 [000] 4370.246000: power_start: type=1 state=1 cpu_id=0

<idle>-0 [000] 4370.246000: cpu_idle: state=1 cpu_id=0

<idle>-0 [000] 4370.247000: power_end: cpu_id=0

<idle>-0 [000] 4370.247000: cpu_idle: state=4294967295 cpu_id=0

<idle>-0 [000] 4370.247000: power_start: type=1 state=1 cpu_id=0

<idle>-0 [000] 4370.247000: cpu_idle: state=1 cpu_id=0
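In the cpu_idle events, state=4294967295 is (u32)-1, which the power tracepoints use to mark the exit from the idle state. Pairing each power_start with the following power_end therefore yields the idle residency of each cycle; a sketch over an excerpt of the trace above (the /tmp path is illustrative):

```shell
# Excerpt of the power trace shown above
cat > /tmp/power.trace <<'EOF'
<idle>-0 [000] 4370.240000: power_start: type=1 state=1 cpu_id=0
<idle>-0 [000] 4370.241000: power_end: cpu_id=0
<idle>-0 [000] 4370.241000: power_start: type=1 state=1 cpu_id=0
<idle>-0 [000] 4370.242000: power_end: cpu_id=0
EOF
# $3 is the timestamp column; the trailing ':' is dropped by awk's
# numeric conversion. Each power_start/power_end pair is one idle cycle.
idle=$(awk '/power_start/ { start = $3 + 0 }
            /power_end/   { if (start) printf "idle %.0f us\n", ($3 - start) * 1e6; start = 0 }' /tmp/power.trace)
echo "$idle"
```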

raw_syscalls

Listing the events/raw_syscalls directory shows that the following events are supported:

sys_enter sys_exit

#echo 1 > /debug/tracing/events/raw_syscalls/enable

Reading /debug/tracing/trace then gives output like the following:

bash-3798 [000] 9209.897945: sys_enter: NR 63 (a, 1, a, 0, a, bff53448)

bash-3798 [000] 9209.897966: sys_exit: NR 63 = 1

bash-3798 [000] 9209.897970: sys_enter: NR 221 (a, 1, 0, 0, b77cfff4, bff53328)

bash-3798 [000] 9209.897972: sys_exit: NR 221 = 1

bash-3798 [000] 9209.897975: sys_enter: NR 6 (a, 1, 0, 0, 0, bff53448)

bash-3798 [000] 9209.897976: sys_exit: NR 6 = 0

bash-3798 [000] 9209.897988: sys_enter: NR 175 (0, 0, 80fe4dc, 8,b77cfff4, bff535bc)

bash-3798 [000] 9209.897991: sys_exit: NR 175 = 0

bash-3798 [000] 9209.897994: sys_enter: NR 174 (2, bff53400, bff53374, 8, b77cfff4, bff53498)

bash-3798 [000] 9209.897996: sys_exit: NR 174 = 0
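raw_syscalls reports only the system call number; the number-to-name mapping is architecture specific and lives in that architecture's unistd header. The sketch below decodes an excerpt of the trace above with a tiny hand-made table (an i386 subset, where NR 6 is close, NR 63 is dup2, and NR 221 is fcntl64; the /tmp path is illustrative):

```shell
# Excerpt of the raw_syscalls trace shown above
cat > /tmp/raw.trace <<'EOF'
bash-3798 [000] 9209.897970: sys_enter: NR 63 (a, 1, a, 0, a, bff53448)
bash-3798 [000] 9209.897975: sys_enter: NR 6 (a, 1, 0, 0, 0, bff53448)
bash-3798 [000] 9209.897976: sys_exit: NR 6 = 0
EOF
# Translate the raw numbers via a small lookup table (i386 subset only)
names=$(awk 'BEGIN { name[6] = "close"; name[63] = "dup2"; name[221] = "fcntl64" }
             match($0, /NR [0-9]+/) {
               nr = substr($0, RSTART+3, RLENGTH-3)
               nm = (nr in name) ? name[nr] : "?"
               print nr, nm
             }' /tmp/raw.trace | sort -un)
echo "$names"
```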

sched

Listing the events/sched directory shows that the following events are supported:

sched_process_exit sched_stat_sleep sched_process_fork sched_stat_wait sched_kthread_stop sched_process_free sched_switch sched_kthread_stop_ret sched_process_wait sched_wait_task sched_migrate_task sched_stat_iowait sched_wakeup sched_pi_setprio sched_stat_runtime sched_wakeup_new

#echo 1 > /debug/tracing/events/sched/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

bash-1957 [000] 4232.264000: sched_stat_wait: comm=kworker/0:0 pid=4 delay=0 [ns]

bash-1957 [000] 4232.264000: sched_switch: prev_comm=bash prev_pid=1957 prev_prio=120 prev_state=R ==> next_comm=kworker/0:0 next_pid=4 next_prio=120

kworker/0:0-4 [000] 4232.265000: sched_stat_runtime: comm=kworker/0:0 pid=4 runtime=956546 [ns] vruntime=1353114806115 [ns]

kworker/0:0-4 [000] 4232.265000: sched_stat_sleep: comm=in.telnetd pid=1955 delay=3893232 [ns]

kworker/0:0-4 [000] 4232.265000: sched_wakeup: comm=in.telnetd pid=1955 prio=120 success=1 target_cpu=000

kworker/0:0-4 [000] 4232.265000: sched_stat_wait: comm=in.telnetd pid=1955 delay=0 [ns]

kworker/0:0-4 [000] 4232.265000: sched_switch: prev_comm=kworker/0:0 prev_pid=4 prev_prio=120 prev_state=S ==> next_comm=in.telnetd next_pid=1955 next_prio=120

in.telnetd-1955 [000] 4232.265000: sched_stat_wait: comm=bash pid=1957 delay=956546 [ns]

in.telnetd-1955 [000] 4232.265000: sched_switch: prev_comm=in.telnetd prev_pid=1955 prev_prio=120 prev_state=S ==> next_comm=bash next_pid=1957 next_prio=120

bash-1957 [000] 4232.266000: sched_stat_runtime: comm=bash pid=1957 runtime=1329219 [ns] vruntime=1353118178788 [ns]

bash-1957 [000] 4232.267000: sched_stat_runtime: comm=bash pid=1957 runtime=672432 [ns] vruntime=1353118851220 [ns]

bash-1957 [000] 4232.267000: sched_wakeup: comm=bash pid=1957 prio=120 success=0 target_cpu=000

bash-1957 [000] 4232.267000: sched_switch: prev_comm=bash prev_pid=1957 prev_prio=120 prev_state=S ==> next_comm=swapper next_pid=0 next_prio=120

<idle>-0 [000] 4232.268000: sched_stat_sleep: comm=kworker/0:0 pid=4 delay=3016584 [ns]

<idle>-0 [000] 4232.268000: sched_wakeup: comm=kworker/0:0 pid=4 prio=120 success=1 target_cpu=000

<idle>-0 [000] 4232.268000: sched_stat_wait: comm=kworker/0:0 pid=4 delay=0 [ns]

<idle>-0 [000] 4232.268000: sched_switch: prev_comm=swapper prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:0 next_pid=4 next_prio=120

kworker/0:0-4 [000] 4232.268000: sched_stat_sleep: comm=in.telnetd pid=1955 delay=3016584 [ns]

kworker/0:0-4 [000] 4232.268000: sched_wakeup: comm=in.telnetd pid=1955 prio=120 success=1 target_cpu=000

kworker/0:0-4 [000] 4232.268000: sched_stat_wait: comm=in.telnetd pid=1955 delay=0 [ns]

kworker/0:0-4 [000] 4232.268000: sched_switch: prev_comm=kworker/0:0 prev_pid=4 prev_prio=120 prev_state=S ==> next_comm=in.telnetd next_pid=1955 next_prio=120

in.telnetd-1955 [000] 4232.268000: sched_switch: prev_comm=in.telnetd prev_pid=1955 prev_prio=120 prev_state=S ==> next_comm=swapper next_pid=0 next_prio=120

<idle>-0 [000] 4232.533000: sched_stat_sleep: comm=top pid=9088 delay=2003729805 [ns]

<idle>-0 [000] 4232.533000: sched_wakeup: comm=top pid=9088 prio=120 success=1 target_cpu=000

<idle>-0 [000] 4232.533000: sched_stat_wait: comm=top pid=9088 delay=292215 [ns]

<idle>-0 [000] 4232.533000: sched_switch: prev_comm=swapper prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=top next_pid=9088 next_prio=120

<...>-9088 [000] 4232.533000: sched_stat_sleep: comm=in.telnetd pid=9058 delay=2005198147 [ns]

<...>-9088 [000] 4232.533000: sched_wakeup: comm=in.telnetd pid=9058 prio=120 success=1 target_cpu=000

<...>-9088 [000] 4232.534000: sched_stat_runtime: comm=top pid=9088 runtime=889219 [ns] vruntime=1353116740439 [ns]

<...>-9088 [000] 4232.535000: sched_stat_runtime: comm=top pid=9088 runtime=872178 [ns] vruntime=1353117612617 [ns]

<...>-9088 [000] 4232.536000: sched_stat_runtime: comm=top pid=9088 runtime=1030578 [ns] vruntime=1353118643195 [ns]
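The sched_stat_runtime events report each task's accumulated runtime in nanoseconds, so per-task CPU time over the traced window is a simple sum. A sketch over an excerpt of the trace above (the /tmp path is illustrative):

```shell
# Excerpt of the sched trace shown above
cat > /tmp/sched.trace <<'EOF'
kworker/0:0-4 [000] 4232.265000: sched_stat_runtime: comm=kworker/0:0 pid=4 runtime=956546 [ns] vruntime=1353114806115 [ns]
bash-1957 [000] 4232.266000: sched_stat_runtime: comm=bash pid=1957 runtime=1329219 [ns] vruntime=1353118178788 [ns]
bash-1957 [000] 4232.267000: sched_stat_runtime: comm=bash pid=1957 runtime=672432 [ns] vruntime=1353118851220 [ns]
EOF
# Accumulate CPU time per task from the sched_stat_runtime events
runtime=$(awk '/sched_stat_runtime/ {
                 if (match($0, /comm=[^ ]+/))     comm = substr($0, RSTART+5, RLENGTH-5)
                 if (match($0, /runtime=[0-9]+/)) total[comm] += substr($0, RSTART+8, RLENGTH-8)
               }
               END { for (c in total) print c, total[c], "ns" }' /tmp/sched.trace | sort)
echo "$runtime"
```

For this excerpt, bash used 2001651 ns and kworker/0:0 used 956546 ns.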

scsi

Listing the events/scsi directory shows that the following events are supported:

scsi_dispatch_cmd_done scsi_dispatch_cmd_start scsi_eh_wakeup scsi_dispatch_cmd_error scsi_dispatch_cmd_timeout

#echo 1 > /debug/tracing/events/scsi/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

kjournald-48 [000] 4120.730000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=218489949 txlen=8 protect=0 raw=2a 00 0d 05 e4 5d 00 00 08 00)

<...>-10718 [000] 4120.730000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=218489949 txlen=8 protect=0 raw=2a 00 0d 05 e4 5d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.734000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267293 txlen=8 protect=0 raw=2a 00 00 04 14 1d 00 00 08 00)

kjournald-48 [000] 4120.734000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267293 txlen=8 protect=0 raw=2a 00 00 04 14 1d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.734000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267301 txlen=8 protect=0 raw=2a 00 00 04 14 25 00 00 08 00)

kjournald-48 [000] 4120.734000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267301 txlen=8 protect=0 raw=2a 00 00 04 14 25 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.735000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267309 txlen=8 protect=0 raw=2a 00 00 04 14 2d 00 00 08 00)

kjournald-48 [000] 4120.735000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267309 txlen=8 protect=0 raw=2a 00 00 04 14 2d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.736000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267317 txlen=8 protect=0 raw=2a 00 00 04 14 35 00 00 08 00)

kjournald-48 [000] 4120.736000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267317 txlen=8 protect=0 raw=2a 00 00 04 14 35 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.736000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267325 txlen=8 protect=0 raw=2a 00 00 04 14 3d 00 00 08 00)

kjournald-48 [000] 4120.736000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267325 txlen=8 protect=0 raw=2a 00 00 04 14 3d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.738000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267333 txlen=8 protect=0 raw=2a 00 00 04 14 45 00 00 08 00)

kjournald-48 [000] 4120.738000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267333 txlen=8 protect=0 raw=2a 00 00 04 14 45 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.738000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267341 txlen=8 protect=0 raw=2a 00 00 04 14 4d 00 00 08 00)

kjournald-48 [000] 4120.738000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267341 txlen=8 protect=0 raw=2a 00 00 04 14 4d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.738000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267349 txlen=8 protect=0 raw=2a 00 00 04 14 55 00 00 08 00)

kjournald-48 [000] 4120.738000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267349 txlen=8 protect=0 raw=2a 00 00 04 14 55 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.742000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267357 txlen=8 protect=0 raw=2a 00 00 04 14 5d 00 00 08 00)

kjournald-48 [000] 4120.742000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267357 txlen=8 protect=0 raw=2a 00 00 04 14 5d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.742000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267365 txlen=8 protect=0 raw=2a 00 00 04 14 65 00 00 08 00)

kjournald-48 [000] 4120.742000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267365 txlen=8 protect=0 raw=2a 00 00 04 14 65 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.743000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267373 txlen=8 protect=0 raw=2a 00 00 04 14 6d 00 00 08 00)

kjournald-48 [000] 4120.743000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267373 txlen=8 protect=0 raw=2a 00 00 04 14 6d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.744000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267381 txlen=8 protect=0 raw=2a 00 00 04 14 75 00 00 08 00)

kjournald-48 [000] 4120.744000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267381 txlen=8 protect=0 raw=2a 00 00 04 14 75 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.744000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267389 txlen=8 protect=0 raw=2a 00 00 04 14 7d 00 00 08 00)

kjournald-48 [000] 4120.744000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267389 txlen=8 protect=0 raw=2a 00 00 04 14 7d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.745000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267397 txlen=8 protect=0 raw=2a 00 00 04 14 85 00 00 08 00)

kjournald-48 [000] 4120.745000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267397 txlen=8 protect=0 raw=2a 00 00 04 14 85 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)

kjournald-48 [000] 4120.745000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267405 txlen=8 protect=0 raw=2a 00 00 04 14 8d 00 00 08 00)

kjournald-48 [000] 4120.745000: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267405 txlen=8 protect=0 raw=2a 00 00 04 14 8d 00 00 08 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)
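Each scsi_dispatch_cmd_start event carries the decoded CDB, including the opcode, starting lba, and transfer length in sectors, so total I/O per opcode can be tallied directly. A sketch over an excerpt of the trace above (the /tmp path is illustrative):

```shell
# Excerpt of the scsi trace shown above
cat > /tmp/scsi.trace <<'EOF'
kjournald-48 [000] 4120.734000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267293 txlen=8 protect=0 raw=2a 00 00 04 14 1d 00 00 08 00)
kjournald-48 [000] 4120.734000: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=1 prot_sgl=0 cmnd=(WRITE_10 lba=267301 txlen=8 protect=0 raw=2a 00 00 04 14 25 00 00 08 00)
EOF
# Total sectors submitted per SCSI opcode
sectors=$(awk '/scsi_dispatch_cmd_start/ {
                 if (match($0, /cmnd=\([A-Z_0-9]+/)) op = substr($0, RSTART+6, RLENGTH-6)
                 if (match($0, /txlen=[0-9]+/)) total[op] += substr($0, RSTART+6, RLENGTH-6)
               }
               END { for (o in total) print o, total[o], "sectors" }' /tmp/scsi.trace)
echo "$sectors"
```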

signal

Listing the events/signal directory shows that the following events are supported:

signal_deliver signal_lose_info signal_generate signal_overflow_fail

#echo 1 > /debug/tracing/events/signal/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

<...>-9862 [000] 3893.720000: signal_generate: sig=17 errno=0 code=262145 comm=bash pid=1957

<...>-9863 [000] 3893.732000: signal_generate: sig=17 errno=0 code=262145 comm=bash pid=1957

bash-1957 [000] 3893.733000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=807aa60 sa_flags=0

<...>-9864 [000] 3894.941000: signal_generate: sig=17 errno=0 code=262145 comm=bash pid=1957

<...>-9865 [000] 3894.949000: signal_generate: sig=17 errno=0 code=262145 comm=bash pid=1957

bash-1957 [000] 3894.949000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=807aa60 sa_flags=0

<...>-9868 [000] 3899.215000: signal_generate: sig=17 errno=0 code=262145 comm=sh pid=9867

<...>-9869 [000] 3899.225000: signal_generate: sig=17 errno=0 code=262145 comm=sh pid=9867

<...>-9867 [000] 3899.226000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=807aa60 sa_flags=0

<...>-9867 [000] 3899.228000: signal_generate: sig=17 errno=0 code=262145 comm=make pid=9866

make-9866 [000] 3899.228000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=8053850 sa_flags=10000000

<...>-9870 [000] 3899.245000: signal_generate: sig=17 errno=0 code=262145 comm=make pid=9866

make-9866 [000] 3899.246000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=8053850 sa_flags=10000000

<...>-9873 [000] 3899.336000: signal_generate: sig=17 errno=0 code=262145 comm=arm-none-eabi-g pid=9872

<...>-9874 [000] 3899.412000: signal_generate: sig=17 errno=0 code=262145 comm=arm-none-eabi-g pid=9872

<...>-9872 [000] 3899.414000: signal_generate: sig=17 errno=0 code=262145 comm=sh pid=9871

<...>-9871 [000] 3899.414000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=807aa60 sa_flags=0

<...>-9875 [000] 3899.428000: signal_generate: sig=17 errno=0 code=262145 comm=sh pid=9871

<...>-9871 [000] 3899.428000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=807aa60 sa_flags=0

<...>-9871 [000] 3899.431000: signal_generate: sig=17 errno=0 code=262145 comm=make pid=9866

make-9866 [000] 3899.431000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=8053850 sa_flags=10000000

<...>-9878 [000] 3899.514000: signal_generate: sig=17 errno=0 code=262145 comm=arm-none-eabi-g pid=9877

<...>-9879 [000] 3899.578000: signal_generate: sig=17 errno=0 code=262145 comm=arm-none-eabi-g pid=9877

<...>-9877 [000] 3899.580000: signal_generate: sig=17 errno=0 code=262145 comm=sh pid=9876

<...>-9876 [000] 3899.580000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=807aa60 sa_flags=0

<...>-9880 [000] 3899.593000: signal_generate: sig=17 errno=0 code=262145 comm=sh pid=9876

<...>-9876 [000] 3899.594000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=807aa60 sa_flags=0

<...>-9876 [000] 3899.595000: signal_generate: sig=17 errno=0 code=262145 comm=make pid=9866

make-9866 [000] 3899.595000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=8053850 sa_flags=10000000

<...>-9883 [000] 3899.682000: signal_generate: sig=17 errno=0 code=262145 comm=arm-none-eabi-g pid=9882
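In the output above, sig=17 is SIGCHLD, and code=262145 is 0x40001, which on kernels of this vintage appears to decode as the kernel-internal __SI_CHLD (0x40000) class combined with CLD_EXITED (1), i.e. a child exiting. Counting generated versus delivered signals per signal number is a quick sanity check; a sketch over an excerpt of the trace above (the /tmp path is illustrative):

```shell
# Excerpt of the signal trace shown above
cat > /tmp/signal.trace <<'EOF'
<...>-9862 [000] 3893.720000: signal_generate: sig=17 errno=0 code=262145 comm=bash pid=1957
bash-1957 [000] 3893.733000: signal_deliver: sig=17 errno=0 code=262145 sa_handler=807aa60 sa_flags=0
<...>-9868 [000] 3899.215000: signal_generate: sig=17 errno=0 code=262145 comm=sh pid=9867
EOF
# Count generated vs. delivered signals, per signal number
sigs=$(awk 'match($0, /signal_(generate|deliver)/) { ev = substr($0, RSTART, RLENGTH) }
            match($0, /sig=[0-9]+/) { print ev, substr($0, RSTART+4, RLENGTH-4) }' /tmp/signal.trace |
       sort | uniq -c)
echo "$sigs"
```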

skb

Listing the events/skb directory shows that the following events are supported:

consume_skb kfree_skb skb_copy_datagram_iovec

#echo 1 > /debug/tracing/events/skb/enable

Reading /debug/tracing/trace then gives output like the following:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

bash-1957 [000] 3800.664000: consume_skb: skbaddr=f5aa2ef0

<idle>-0 [000] 3800.864000: consume_skb: skbaddr=f5aa2a70

top-9088 [000] 3801.184000: consume_skb: skbaddr=f5aa2a70

in.telnetd-1955 [000] 3801.281000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=3

<idle>-0 [000] 3801.285000: consume_skb: skbaddr=f5aa2ef0

in.telnetd-1955 [000] 3801.437000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=3

bash-1957 [000] 3801.440000: consume_skb: skbaddr=f5aa2ef0

in.telnetd-1955 [000] 3802.094000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=3

bash-1957 [000] 3802.098000: consume_skb: skbaddr=f5aa2ef0

in.telnetd-1955 [000] 3802.605000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=1

bash-1957 [000] 3802.607000: kfree_skb: skbaddr=f4f6a500 protocol=0 location=c0814f61

bash-1957 [000] 3802.609000: consume_skb: skbaddr=f5aa2ef0

<idle>-0 [000] 3802.804000: consume_skb: skbaddr=f5aa2a70

top-9088 [000] 3803.193000: consume_skb: skbaddr=f5aa2a70

<idle>-0 [000] 3803.392000: consume_skb: skbaddr=f5aa2ef0

top-9088 [000] 3805.206000: consume_skb: skbaddr=f5aa2ef0

top-9088 [000] 3805.211000: consume_skb: skbaddr=f5aa2a70

top-9088 [000] 3805.212000: consume_skb: skbaddr=f5aa22f0

in.telnetd-1955 [000] 3805.970000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=3

in.telnetd-1955 [000] 3805.974000: consume_skb: skbaddr=f5aa22f0

in.telnetd-1955 [000] 3806.251000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=1

bash-1957 [000] 3806.252000: kfree_skb: skbaddr=f4f6a500 protocol=0 location=c0814f61

bash-1957 [000] 3806.255000: consume_skb: skbaddr=f5aa22f0

<idle>-0 [000] 3806.457000: consume_skb: skbaddr=f5aa2a70

top-9088 [000] 3807.222000: consume_skb: skbaddr=f5aa2a70

top-9088 [000] 3807.222000: consume_skb: skbaddr=f5aa22f0

top-9088 [000] 3807.223000: consume_skb: skbaddr=f5aa22f0

in.telnetd-1955 [000] 3807.300000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=1

more-9857 [000] 3807.303000: consume_skb: skbaddr=f5aa2ef0

<idle>-0 [000] 3807.431000: consume_skb: skbaddr=f5aa2a70

<idle>-0 [000] 3807.471000: consume_skb: skbaddr=f5aa2770

in.telnetd-1955 [000] 3807.471000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=3
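Every line in these trace listings follows the header layout (TASK-PID, CPU#, TIMESTAMP, FUNCTION), so they are easy to post-process offline. As a minimal sketch, assuming the single-CPU format shown in this article (newer kernels insert additional flag columns between CPU# and the timestamp), a parser might look like:

```python
import re

# Matches lines such as:
#   bash-1957 [000] 3800.664000: consume_skb: skbaddr=f5aa2ef0
# following the header printed by /debug/tracing/trace:
#   TASK-PID  CPU#  TIMESTAMP  FUNCTION
TRACE_RE = re.compile(
    r'^\s*(?P<task>\S+)-(?P<pid>\d+)\s+'
    r'\[(?P<cpu>\d+)\]\s+'
    r'(?P<ts>\d+\.\d+):\s+'
    r'(?P<event>\w+):\s*(?P<args>.*)$'
)

def parse_trace_line(line):
    """Split one ftrace event line into its fields, or return None."""
    m = TRACE_RE.match(line)
    if not m:
        return None
    d = m.groupdict()
    d['pid'] = int(d['pid'])
    d['cpu'] = int(d['cpu'])
    d['ts'] = float(d['ts'])
    # Split key=value pairs such as "skbaddr=f4f6a500 len=3".
    d['args'] = dict(kv.split('=', 1) for kv in d['args'].split() if '=' in kv)
    return d

line = 'in.telnetd-1955 [000] 3801.281000: skb_copy_datagram_iovec: skbaddr=f4f6a500 len=3'
ev = parse_trace_line(line)
print(ev['event'], ev['args']['len'])   # skb_copy_datagram_iovec 3
```

With the fields split out this way, it is straightforward to, for example, count consume_skb versus kfree_skb per skbaddr to spot where packets are being dropped rather than consumed.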

syscalls

Looking under the events/syscalls directory, the following events are supported:

sys_enter_accept

sys_enter_accept4

sys_enter_access

sys_enter_acct

sys_enter_add_key

sys_enter_adjtimex

sys_enter_alarm

sys_enter_bdflush

sys_enter_bind

sys_enter_brk

….

sys_exit_vmsplice

sys_exit_wait4

sys_exit_waitid

sys_exit_waitpid

sys_exit_write

sys_exit_writev

#echo 1 > /debug/tracing/events/syscalls/enable

Examining the results in /debug/tracing/trace:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

bash-1957 [000] 3716.573000: sys_open -> 0x3

bash-1957 [000] 3716.573000: sys_fcntl64(fd: 1, cmd: 1, arg: 0)

bash-1957 [000] 3716.573000: sys_fcntl64 -> 0x0

bash-1957 [000] 3716.573000: sys_fcntl64(fd: 1, cmd: 0, arg: a)

bash-1957 [000] 3716.573000: sys_fcntl64 -> 0xa

bash-1957 [000] 3716.573000: sys_fcntl64(fd: 1, cmd: 1, arg: 0)

bash-1957 [000] 3716.573000: sys_fcntl64 -> 0x0

bash-1957 [000] 3716.573000: sys_fcntl64(fd: a, cmd: 2, arg: 1)

bash-1957 [000] 3716.573000: sys_fcntl64 -> 0x0

bash-1957 [000] 3716.573000: sys_dup2(oldfd: 3, newfd: 1)

bash-1957 [000] 3716.573000: sys_dup2 -> 0x1

bash-1957 [000] 3716.573000: sys_close(fd: 3)

bash-1957 [000] 3716.573000: sys_close -> 0x0

bash-1957 [000] 3716.574000: sys_write(fd: 1, buf: b781c000, count: 1)

bash-1957 [000] 3716.574000: sys_write -> 0x1

bash-1957 [000] 3716.574000: sys_dup2(oldfd: a, newfd: 1)

bash-1957 [000] 3716.574000: sys_dup2 -> 0x1

bash-1957 [000] 3716.574000: sys_fcntl64(fd: a, cmd: 1, arg: 0)

bash-1957 [000] 3716.574000: sys_fcntl64 -> 0x1

bash-1957 [000] 3716.574000: sys_close(fd: a)

bash-1957 [000] 3716.574000: sys_close -> 0x0

bash-1957 [000] 3716.574000: sys_rt_sigprocmask(how: 0, set: 0, oset: 80fe4dc, sigsetsize: 8)

bash-1957 [000] 3716.574000: sys_rt_sigprocmask -> 0x0

bash-1957 [000] 3716.574000: sys_rt_sigaction(sig: 2, act: bfd40dc0, oact: bfd40d34, sigsetsize: 8)

bash-1957 [000] 3716.574000: sys_rt_sigaction -> 0x0

bash-1957 [000] 3716.574000: sys_time(tloc: 0)

bash-1957 [000] 3716.574000: sys_time -> 0x4dd7d442

bash-1957 [000] 3716.574000: sys_rt_sigprocmask(how: 0, set: bfd4054c, oset: bfd404cc, sigsetsize: 8)

bash-1957 [000] 3716.574000: sys_rt_sigprocmask -> 0x0

bash-1957 [000] 3716.574000: sys_ioctl(fd: ff, cmd: 5410, arg: bfd404c4)

bash-1957 [000] 3716.574000: sys_ioctl -> 0x0
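The sequence above is bash performing an output redirection: cmd: 0 is F_DUPFD (duplicating stdout up to fd 0xa to save it), cmd: 1 and cmd: 2 are F_GETFD and F_SETFD (setting FD_CLOEXEC on the saved copy), then dup2 points fd 1 at the opened file, the write happens, and dup2 restores the original stdout. A hedged userspace sketch reproducing the same syscall pattern (the path and fd numbers are illustrative, not anything bash itself uses):

```python
import fcntl
import os

# Reproduce the redirection pattern bash shows in the syscalls trace:
# save stdout, point fd 1 at a file, write, then restore.
fd = os.open('/tmp/ftrace_demo.txt', os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)

saved = fcntl.fcntl(1, fcntl.F_DUPFD, 10)            # sys_fcntl64(fd: 1, cmd: 0, arg: a)
fcntl.fcntl(saved, fcntl.F_SETFD, fcntl.FD_CLOEXEC)  # sys_fcntl64(cmd: 2, arg: 1)
os.dup2(fd, 1)                                       # sys_dup2(oldfd: 3, newfd: 1)
os.close(fd)                                         # sys_close(fd: 3)
os.write(1, b'x')                                    # sys_write(fd: 1, ..., count: 1)
os.dup2(saved, 1)                                    # restore stdout
os.close(saved)

print(open('/tmp/ftrace_demo.txt').read())
```

Running this under the syscalls tracer should produce a trace very close to the bash excerpt above, which is a handy way to confirm an interpretation of unfamiliar trace output.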

timer

Looking under the events/timer directory, the following events are supported:

hrtimer_expire_exit itimer_state timer_init hrtimer_init timer_cancel timer_start hrtimer_cancel hrtimer_start timer_expire_entry hrtimer_expire_entry itimer_expire timer_expire_exit

#echo 1 > /debug/tracing/events/timer/enable

Examining the results in /debug/tracing/trace:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

bash-1925 [000] 3590.212000: timer_cancel: timer=f58a10cc

bash-1925 [000] 3590.213000: timer_expire_entry: timer=f58a10cc function=delayed_work_timer_fn now=3290213

bash-1925 [000] 3590.213000: timer_expire_exit: timer=f58a10cc

in.telnetd-1923 [000] 3590.213000: timer_start: timer=f5446b4c function=tcp_write_timer expires=3290566 [timeout=353]

bash-1925 [000] 3590.215000: timer_start: timer=f58a10cc function=delayed_work_timer_fn expires=3290216 [timeout=1]

<idle>-0 [000] 3590.216000: timer_cancel: timer=f58a10cc

<idle>-0 [000] 3590.216000: timer_expire_entry: timer=f58a10cc function=delayed_work_timer_fn now=3290216

<idle>-0 [000] 3590.216000: timer_expire_exit: timer=f58a10cc

<idle>-0 [000] 3590.275000: timer_cancel: timer=f5446b68

<idle>-0 [000] 3590.275000: timer_expire_entry: timer=f5446b68 function=tcp_delack_timer now=3290275

<idle>-0 [000] 3590.275000: timer_expire_exit: timer=f5446b68

<idle>-0 [000] 3590.409000: timer_cancel: timer=f5446b4c

<idle>-0 [000] 3590.409000: timer_start: timer=f5446b4c function=tcp_write_timer expires=3290767 [timeout=358]

<idle>-0 [000] 3590.458000: hrtimer_cancel: hrtimer=f4d21b80

<idle>-0 [000] 3590.458000: hrtimer_expire_entry: hrtimer=f4d21b80 function=hrtimer_wakeup now=3612200721124

<idle>-0 [000] 3590.458000: hrtimer_expire_exit: hrtimer=f4d21b80

gam_server-1890 [000] 3590.459000: hrtimer_init: hrtimer=f4d21b80 clockid=CLOCK_MONOTONIC mode=HRTIMER_MODE_ABS

gam_server-1890 [000] 3590.459000: hrtimer_start: hrtimer=f4d21b80 function=hrtimer_wakeup expires=3613207309303 softexpires=3613202309315

<idle>-0 [000] 3590.767000: timer_cancel: timer=f5446b4c

<idle>-0 [000] 3590.767000: timer_expire_entry: timer=f5446b4c function=tcp_write_timer now=3290767

<idle>-0 [000] 3590.767000: timer_expire_exit: timer=f5446b4c

<idle>-0 [000] 3591.002000: timer_cancel: timer=f64036b0

<idle>-0 [000] 3591.002000: timer_expire_entry: timer=f64036b0 function=delayed_work_timer_fn now=3291002

<idle>-0 [000] 3591.002000: timer_expire_exit: timer=f64036b0

kworker/0:0-4 [000] 3591.002000: timer_start: timer=f64036b0 function=delayed_work_timer_fn expires=3292002 [timeout=1000]

<idle>-0 [000] 3591.004000: timer_cancel: timer=f598135c

<idle>-0 [000] 3591.004000: timer_expire_entry: timer=f598135c function=e1000_watchdog now=3291004

<idle>-0 [000] 3591.004000: timer_start: timer=f598135c function=e1000_watchdog expires=3293004 [timeout=2000]
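The expires and [timeout=...] fields are in jiffies, and in this log the timestamps suggest HZ=1000, so one jiffy is one millisecond (an assumption inferred from this particular trace, not a universal constant). The timeout is simply expires minus the current jiffies value; checking the tcp_write_timer start above:

```python
# Values copied from the timer lines above.
HZ = 1000   # assumption: this log's timestamps track jiffies/1000 exactly

now_jiffies = 3290213   # timer_expire_entry: ... now=3290213
expires     = 3290566   # timer_start: timer=f5446b4c ... expires=3290566

timeout = expires - now_jiffies
print(timeout)        # 353, matching "[timeout=353]" in the trace
print(timeout / HZ)   # 0.353 seconds until tcp_write_timer fires
```

The same arithmetic applied to the delayed_work_timer_fn and e1000_watchdog entries confirms their 1-second and 2-second periods.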

vmscan

Looking under the events/vmscan directory, the following events are supported:

mm_vmscan_memcg_isolate mm_vmscan_memcg_reclaim_begin mm_vmscan_direct_reclaim_begin mm_vmscan_memcg_reclaim_end mm_vmscan_direct_reclaim_end mm_vmscan_memcg_softlimit_reclaim_begin mm_vmscan_kswapd_sleep mm_vmscan_memcg_softlimit_reclaim_end mm_vmscan_kswapd_wake mm_vmscan_wakeup_kswapd mm_vmscan_lru_isolate mm_vmscan_writepage mm_vmscan_lru_shrink_inactive

#echo 1 > /debug/tracing/events/vmscan/enable

Examining the results in /debug/tracing/trace:

workqueue

Looking under the events/workqueue directory, the following events are supported:

workqueue_activate_work workqueue_execute_start workqueue_execute_end workqueue_queue_work

#echo 1 > /debug/tracing/events/workqueue/enable

Examining the results in /debug/tracing/trace:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

bash-1957 [000] 3205.721000: workqueue_queue_work: work struct=f58a24bc function=flush_to_ldisc workqueue=f600dac0 req_cpu=0 cpu=0

bash-1957 [000] 3205.721000: workqueue_activate_work: work struct f58a24bc

kworker/0:0-4 [000] 3205.721000: workqueue_execute_start: work struct f58a24bc: function flush_to_ldisc

kworker/0:0-4 [000] 3205.721000: workqueue_execute_end: work struct f58a24bc

<idle>-0 [000] 3205.723000: workqueue_queue_work: work struct=f58a24bc function=flush_to_ldisc workqueue=f600dac0 req_cpu=0 cpu=0

<idle>-0 [000] 3205.723000: workqueue_activate_work: work struct f58a24bc

kworker/0:0-4 [000] 3205.723000: workqueue_execute_start: work struct f58a24bc: function flush_to_ldisc

kworker/0:0-4 [000] 3205.723000: workqueue_execute_end: work struct f58a24bc

<idle>-0 [000] 3206.002000: workqueue_queue_work: work struct=f64036a0 function=vmstat_update workqueue=f600dac0 req_cpu=0 cpu=0

<idle>-0 [000] 3206.002000: workqueue_activate_work: work struct f64036a0

kworker/0:0-4 [000] 3206.002000: workqueue_execute_start: work struct f64036a0: function vmstat_update

kworker/0:0-4 [000] 3206.002000: workqueue_execute_end: work struct f64036a0

<idle>-0 [000] 3206.347000: workqueue_queue_work: work struct=f58bf8bc function=flush_to_ldisc workqueue=f600dac0 req_cpu=0 cpu=0

<idle>-0 [000] 3206.347000: workqueue_activate_work: work struct f58bf8bc

kworker/0:0-4 [000] 3206.347000: workqueue_execute_start: work struct f58bf8bc: function flush_to_ldisc

kworker/0:0-4 [000] 3206.347000: workqueue_execute_end: work struct f58bf8bc

<idle>-0 [000] 3206.349000: workqueue_queue_work: work struct=f58a24bc function=flush_to_ldisc workqueue=f600dac0 req_cpu=0 cpu=0

<idle>-0 [000] 3206.349000: workqueue_activate_work: work struct f58a24bc

kworker/0:0-4 [000] 3206.349000: workqueue_execute_start: work struct f58a24bc: function flush_to_ldisc

kworker/0:0-4 [000] 3206.349000: workqueue_execute_end: work struct f58a24bc

<idle>-0 [000] 3206.654000: workqueue_queue_work: work struct=f58bf8bc function=flush_to_ldisc workqueue=f600dac0 req_cpu=0 cpu=0

<idle>-0 [000] 3206.654000: workqueue_activate_work: work struct f58bf8bc

kworker/0:0-4 [000] 3206.654000: workqueue_execute_start: work struct f58bf8bc: function flush_to_ldisc

kworker/0:0-4 [000] 3206.654000: workqueue_execute_end: work struct f58bf8bc
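Each work item passes through the lifecycle workqueue_queue_work, workqueue_activate_work, workqueue_execute_start, workqueue_execute_end. Pairing queue and execute_start events by the work-struct address gives the queue-to-run latency, a useful metric when a workqueue is suspected of being starved. A minimal sketch over a few lines copied from the trace above (at this log's millisecond resolution the latencies come out as zero):

```python
import re

# Lines copied from the workqueue trace above.
LINES = [
    "bash-1957 [000] 3205.721000: workqueue_queue_work: work struct=f58a24bc function=flush_to_ldisc workqueue=f600dac0 req_cpu=0 cpu=0",
    "kworker/0:0-4 [000] 3205.721000: workqueue_execute_start: work struct f58a24bc: function flush_to_ldisc",
    "<idle>-0 [000] 3206.002000: workqueue_queue_work: work struct=f64036a0 function=vmstat_update workqueue=f600dac0 req_cpu=0 cpu=0",
    "kworker/0:0-4 [000] 3206.002000: workqueue_execute_start: work struct f64036a0: function vmstat_update",
]

TS_RE   = re.compile(r'\s(\d+\.\d+):\s+(workqueue_\w+):')
# The queue event prints "work struct=ADDR", execute_start "work struct ADDR:".
ADDR_RE = re.compile(r'work struct[= ]([0-9a-f]+)')

queued = {}
latencies = []
for line in LINES:
    ts, event = TS_RE.search(line).groups()
    addr = ADDR_RE.search(line).group(1)
    if event == 'workqueue_queue_work':
        queued[addr] = float(ts)
    elif event == 'workqueue_execute_start' and addr in queued:
        latencies.append((addr, float(ts) - queued.pop(addr)))

for addr, lat in latencies:
    print(addr, 'queue-to-run latency: %.3f s' % lat)
```

On a loaded system, a large gap between queue_work and execute_start points at kworker contention rather than at the work function itself.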

writeback

Looking under the events/writeback directory, the following events are supported:

writeback_exec writeback_nothread wbc_balance_dirty_start writeback_nowork wbc_balance_dirty_wait writeback_pages_written wbc_balance_dirty_written writeback_queue wbc_writeback_start writeback_thread_start wbc_writeback_wait writeback_thread_stop wbc_writeback_written writeback_wait_iff_congested wbc_writepage writeback_wake_background writeback_bdi_register writeback_wake_forker_thread writeback_bdi_unregister writeback_wake_thread writeback_congestion_wait

#echo 1 > /debug/tracing/events/writeback/enable

Examining the results in /debug/tracing/trace:

# tracer: nop

#

# TASK-PID CPU# TIMESTAMP FUNCTION

# | | | | |

<...>-2719 [000] 2262.160000: writeback_wake_thread: bdi 253:0

<...>-1921 [000] 2262.160000: wbc_writeback_start: bdi 253:0: towrt=1024 skip=0 mode=0 kupd=1 bgrd=0 reclm=0 cyclic=1 more=0 older=0x1d7b80 start=0x0 end=0x0

<...>-1921 [000] 2262.160000: wbc_writeback_written: bdi 253:0: towrt=1024 skip=0 mode=0 kupd=1 bgrd=0 reclm=0 cyclic=1 more=0 older=0x1d7b80 start=0x0 end=0x0

<...>-1921 [000] 2262.160000: writeback_pages_written: 0

<...>-1921 [000] 2267.160000: wbc_writeback_start: bdi 253:0: towrt=1024 skip=0 mode=0 kupd=1 bgrd=0 reclm=0 cyclic=1 more=0 older=0x1d8f08 start=0x0 end=0x0

<...>-1921 [000] 2267.160000: wbc_writeback_written: bdi 253:0: towrt=1024 skip=0 mode=0 kupd=1 bgrd=0 reclm=0 cyclic=1 more=0 older=0x1d8f08 start=0x0 end=0x0

<...>-1921 [000] 2267.160000: writeback_pages_written: 0

<...>-1921 [000] 2272.160000: wbc_writeback_start: bdi 253:0: towrt=1024 skip=0 mode=0 kupd=1 bgrd=0 reclm=0 cyclic=1 more=0 older=0x1da290 start=0x0 end=0x0

<...>-1921 [000] 2272.160000: writeback_pages_written: 0

<...>-1921 [000] 2277.160000: wbc_writeback_start: bdi 253:0: towrt=1024 skip=0 mode=0 kupd=1 bgrd=0 reclm=0 cyclic=1 more=0 older=0x1db618 start=0x0 end=0x0

<...>-1921 [000] 2277.160000: wbc_writeback_written: bdi 253:0: towrt=1024 skip=0 mode=0 kupd=1 bgrd=0 reclm=0 cyclic=1 more=0 older=0x1db618 start=0x0 end=0x0

<...>-1921 [000] 2277.160000: writeback_pages_written: 0

<...>-1921 [000] 2282.160000: wbc_writeback_start: bdi 253:0: towrt=1024 skip=0 mode=0 kupd=1 bgrd=0 reclm=0 cyclic=1 more=0 older=0x1dc9a0 start=0x0 end=0x0
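The kupd=1 entries are the periodic kupdate flusher; note that they fire every 5 seconds, which matches the kernel's default vm.dirty_writeback_centisecs of 500 (an assumption about this system's tuning, easily confirmed via /proc/sys/vm/dirty_writeback_centisecs). A quick sanity check on the timestamps copied from the trace above:

```python
# Timestamps of the successive kupd=1 wbc_writeback_start lines above.
starts = [2262.160, 2267.160, 2272.160, 2277.160, 2282.160]

# Interval between consecutive kupdate wakeups, rounded to milliseconds.
intervals = [round(b - a, 3) for a, b in zip(starts, starts[1:])]
print(intervals)   # [5.0, 5.0, 5.0, 5.0]
```

writeback_pages_written: 0 on each pass simply means the system was idle, with no dirty pages old enough to flush.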

Conclusion

In platform porting and development work, performance tuning has a deep impact on productization. This article has introduced the concepts behind Ftrace together with hands-on examples. For the author, this tool helps locate system performance bottlenecks amid complex behavior, for example whether interrupts stay disabled for too long, or whether a modified kernel function misbehaves and monopolizes processor time. Quite a few tools are available on the Linux platform today (commercial ICE products are another option), and the better we understand how these tools work internally, the better positioned we are to help the team pick the right one and analyze the root cause when a problem occurs. Ftrace supplies only part of the information about any system problem; the other available tools, and the background knowledge needed to analyze problems with them, are things that developers committed to this industry must keep working at.