Copying a function from user space into the kernel and executing it

Time: 2022-04-23 17:49:38

First of all, I am doing this for fun so don't judge me.

What I did is pass a function pointer from user space to the kernel, copy the function body with copy_from_user into a static array in the kernel, and then jump into that array to execute it.

In the kernel:

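/* page-aligned buffer in the kernel's data section */
static char handler_text[PAGE_SIZE] __page_aligned_data;
/* copy one page starting at the user-space function, then jump into the copy */
copy_from_user((void *)handler_text, (const void __user *)my_handler, PAGE_SIZE);
((void (*)(void))handler_text)();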

In user space, the function itself is very simple:

void my_handler(){
    volatile unsigned long * p = (volatile unsigned long *)0xF0000c10;
    *p = 0x0000000;
}

10000938 <my_handler>: 
10000938:   3d 20 f0 00     lis     r9,-4096 
1000093c:   39 40 00 00     li      r10,0 
10000940:   61 29 0c 10     ori     r9,r9,3088 
10000944:   91 49 00 00     stw     r10,0(r9) 
10000948:   4e 80 00 20     blr 
1000094c:   00 01 88 08     .long 0x18808
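
As a cross-check of the listing above, the address that my_handler writes to can be rebuilt from the two instruction words that load r9. This is only a sketch of the arithmetic: the helper names (ppc_lis, ppc_ori, my_handler_store_address) are made up for illustration, and only the 32-bit lis and ori forms are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Value loaded by "lis rD,imm" (addis rD,0,imm): the 16-bit immediate
 * ends up in the upper half of the 32-bit register. */
static uint32_t ppc_lis(uint32_t insn)
{
    assert((insn >> 26) == 15);           /* primary opcode 15 = addis */
    assert(((insn >> 16) & 0x1Fu) == 0);  /* RA field 0 selects the lis form */
    return (insn & 0xFFFFu) << 16;
}

/* Value produced by "ori rA,rS,imm": the immediate is zero-extended
 * and ORed into the source register value. */
static uint32_t ppc_ori(uint32_t rs_value, uint32_t insn)
{
    assert((insn >> 26) == 24);           /* primary opcode 24 = ori */
    return rs_value | (insn & 0xFFFFu);
}

/* Address stored to by my_handler, rebuilt from the instruction words. */
static uint32_t my_handler_store_address(void)
{
    uint32_t r9 = ppc_lis(0x3d20f000u);   /* lis r9,-4096   -> 0xF0000000 */
    return ppc_ori(r9, 0x61290c10u);      /* ori r9,r9,3088 -> 0xF0000C10 */
}
```

my_handler_store_address() evaluates to 0xF0000C10, the constant in the C source, so the five instructions (lis, li, ori, stw, blr) are the whole function and the trailing .long is not part of its body.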

The problem is that the first time I do this, it always generates an Oops. From the second time onward the problem is gone and there is no Oops any more. By reading the memory I can clearly see that the kernel executed the function. I am running on a PowerPC target, and the Oops shows exception 700, which is a program exception. In the instruction dump from the Oops, the instruction at the NIP and those after it are exactly the instructions of my_handler.

Instruction dump:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 <3d20f000> 39400000 61290c10 91490000

I can't make any sense of this. Can anyone? Thanks.

2 Solutions

#1


I hate to discourage an admirable notion, but what you're trying to do is difficult, if not impossible, without some serious extra work.

Your function is linked at location F in user space. You're copying it into kernel space at the location of the static array, A. A is probably in the kernel's data section, so execution may not be possible. Also, your function is then running at a location it was not linked for (i.e. F != A).

Further, even if your function could link to the correct location A, how are you handling the relocation of symbols within it (e.g. If it calls printk, how are you relinking the address inside the function to match the actual printk address)?
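
To make the relocation point concrete, here is a small sketch of how a PowerPC relative branch (b/bl) resolves its target. The helper name ppc_branch_target is made up, the kernel address 0xC0500000 used below is purely hypothetical, and only the 32-bit I-form branch is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Target of a PowerPC I-form branch (b, bl, ba, bla) that sits at
 * address cia. Relative forms add a signed displacement to cia. */
static uint32_t ppc_branch_target(uint32_t insn, uint32_t cia)
{
    assert((insn >> 26) == 18);            /* primary opcode 18 = branch */

    uint32_t disp = insn & 0x03FFFFFCu;    /* LI field, already shifted left by 2 */
    if (disp & 0x02000000u)                /* sign-extend the 26-bit displacement */
        disp |= 0xFC000000u;

    int absolute = (insn >> 1) & 1;        /* AA bit: 1 = absolute address */
    return absolute ? disp : cia + disp;   /* unsigned wrap gives two's complement add */
}
```

The word 0x48000101 encodes "bl" with a displacement of +0x100. At the user-space address 0x10000938 it would call 0x10000A38; the same bytes copied to 0xC0500000 would call 0xC0500100, which is whatever happens to sit 0x100 bytes past the copy rather than the function the linker intended. my_handler contains no such branch (only lis, li, ori, stw and blr, none of which depend on where the code lives), which would explain why the copy could still run once the bytes were actually visible to instruction fetch.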

It is much easier to create a kernel module and load it (via modprobe); then you can do whatever you want.

Side note: This is a huge security vulnerability. A similar one was used by the "Stuxnet" worm to penetrate Windows.


UPDATE:

The dump occurs [in time] long after the exception event. By that time, it has the correct data, so the dump shows the current state, so to speak, but not what happened on the exact cycle in question [due to the nature of this "self-modifying" code].

But when it was initially executed, the area may have held garbage (hence the 700). I'm not sure about PPC, but other arches have separate inst and data caches, along with out-of-order execution. The data would be in the data cache, but not necessarily in the inst cache [or queue]. And they tend to operate independently for speed ["Harvard" architecture].

(e.g.) On x86, after setting the static area, you must flush/synchronize so that the execution unit refetches the area. Otherwise, it may have already speculatively prefetched the instruction data (i.e. it isn't expecting the code to be "self-modifying") and be holding something other than what is expected [probably 0x00000000].

Consider: After the copy_from_user the desired data is in the data cache, but has not yet been flushed to RAM. The execution unit [and inst cache], not having any data from the static area, will fetch from RAM. Because self-modifying code is rare, the inst and data caches do not snoop each other [it would slow things down].

So, the execution unit got its data from RAM (e.g. 0x00000000) instead of the loaded data [which is only in the data cache].

The second time works because the execution unit now fetches the data written during the first attempt [which has had time to flush to RAM]. That is, the static area has already been populated, and the second copy_from_user is, effectively, a NOP.

A "post-mortem" dump of the area [as mentioned] would not be able to show this discrepancy.
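
The same copy-then-execute pattern can be tried safely in user space. The following is only a sketch, not the kernel path: it assumes a GCC or Clang toolchain (for __builtin___clear_cache), a system that permits a writable and executable anonymous mapping, and the listed "return 42" encodings for each architecture. The name copy_and_run is made up for illustration.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS on strict toolchains */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

#define BUF_BYTES 4096

/* Machine code for "return 42"; the encodings are per-architecture assumptions. */
#if defined(__x86_64__) || defined(__i386__)
static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 }; /* mov eax,42; ret */
#elif defined(__aarch64__)
static const uint32_t code[] = { 0x52800540u, 0xD65F03C0u };  /* mov w0,#42; ret */
#elif defined(__powerpc__) && !defined(__powerpc64__)
static const uint32_t code[] = { 0x3860002Au, 0x4E800020u };  /* li r3,42; blr */
#else
#define COPY_AND_RUN_UNSUPPORTED 1
#endif

/* Copies the code into a fresh buffer, synchronizes the instruction side,
 * and calls it. Returns 42 on success, or -1 if the architecture is not
 * covered or the mapping is refused. */
static int copy_and_run(void)
{
#ifdef COPY_AND_RUN_UNSUPPORTED
    return -1;
#else
    assert(sizeof code <= BUF_BYTES);

    void *buf = mmap(NULL, BUF_BYTES, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    memcpy(buf, code, sizeof code);            /* bytes arrive through the data side */

    /* Make the freshly written bytes visible to instruction fetch.
     * This is the step that was missing in the scenario described above. */
    __builtin___clear_cache((char *)buf, (char *)buf + sizeof code);

    int (*fn)(void) = (int (*)(void))buf;
    int result = fn();

    munmap(buf, BUF_BYTES);
    return result;
#endif
}
```

On machines whose instruction cache is kept coherent with stores, the builtin may compile to nothing and the call works either way; on machines with independent instruction and data caches, leaving it out can reproduce the kind of first-run failure described here.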

#2


Figured it out. It turned out to be the cache issue. Thanks to both Ctx and Craig. I added a

flush_dcache_icache_page(virt_to_page((unsigned long)(handler_text)));

after

copy_from_user((void *)handler_text, (const void __user *)my_handler, PAGE_SIZE);

And it all works now. Before asking the question I had tried only flush_dcache_page, and that didn't work, so both the dcache and the icache have to be flushed to make this work. Thanks again.
