Demystifying Process Switching

Date: 2021-10-21 21:16:27

Student ID: SA12**6112

The previous post analyzed the main work the kernel does when a process switches from user mode into kernel mode; this post looks at what the kernel does during a process switch.

In kernel mode, a process switch consists of two main steps:

1: Switch the page global directory (see the sketch right after this list).

2: Switch the kernel stack and the hardware context.
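
The first step happens before switch_to, inside switch_mm(). The following is a trimmed sketch of what the 3.3 kernel's switch_mm() (arch/x86/include/asm/mmu_context.h) boils down to; the per-CPU TLB-state bookkeeping is omitted, so treat it as an illustration rather than the exact source:

static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
                             struct task_struct *tsk)
{
        unsigned cpu = smp_processor_id();

        if (likely(prev != next)) {
                /* Mark this CPU as running in next's address space. */
                cpumask_set_cpu(cpu, mm_cpumask(next));

                /* Load next's page global directory into CR3; this is the
                 * actual switch and implicitly flushes the TLB. */
                load_cr3(next->pgd);

                /* Stop TLB-flush IPIs for the old address space. */
                cpumask_clear_cpu(cpu, mm_cpumask(prev));

                /* Reload the LDT if the two processes use different ones. */
                if (unlikely(prev->context.ldt != next->context.ldt))
                        load_LDT_nolock(&next->context);
        }
}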

 

Let prev point to the descriptor of the process being replaced, and next point to the descriptor of the process being activated.

The following analyzes the second step of the process switch.

The second step is implemented mainly by the switch_to macro:

In the 3.3 kernel, for the x86 architecture, at line 48 of /arch/x86/include/asm/system.h:

 48 #define switch_to(prev, next, last)                                     \
 49 do {                                                                    \
 50         /*                                                              \
 51          * Context-switching clobbers all registers, so we clobber      \
 52          * them explicitly, via unused output variables.                \
 53          * (EAX and EBP is not listed because EBP is saved/restored     \
 54          * explicitly for wchan access and EAX is the return value of   \
 55          * __switch_to())                                               \
 56          */                                                             \
 57         unsigned long ebx, ecx, edx, esi, edi;                          \
 58                                                                         \
 59         asm volatile("pushfl\n\t"               /* save    flags */     \
 60                      "pushl %%ebp\n\t"          /* save    EBP   */     \
 61                      "movl %%esp,%[prev_sp]\n\t"        /* save    ESP   */ \
 62                      "movl %[next_sp],%%esp\n\t"        /* restore ESP   */ \
 63                      "movl $1f,%[prev_ip]\n\t"  /* save    EIP   */     \
 64                      "pushl %[next_ip]\n\t"     /* restore EIP   */     \
 65                      __switch_canary                                    \
 66                      "jmp __switch_to\n"        /* regparm call  */     \
 67                      "1:\t"                                             \
 68                      "popl %%ebp\n\t"           /* restore EBP   */     \
 69                      "popfl\n"                  /* restore flags */     \
 70                                                                         \
 71                      /* output parameters */                            \
 72                      : [prev_sp] "=m" (prev->thread.sp),                \
 73                        [prev_ip] "=m" (prev->thread.ip),                \
 74                        "=a" (last),                                     \
 75                                                                         \
 76                        /* clobbered output registers: */                \
 77                        "=b" (ebx), "=c" (ecx), "=d" (edx),              \
 78                        "=S" (esi), "=D" (edi)                           \
 79                                                                         \
 80                        __switch_canary_oparam                           \
 81                                                                         \
 82                        /* input parameters: */                          \
 83                      : [next_sp]  "m" (next->thread.sp),                \
 84                        [next_ip]  "m" (next->thread.ip),                \
 85                                                                         \
 86                        /* regparm parameters for __switch_to(): */      \
 87                        [prev]     "a" (prev),                           \
 88                        [next]     "d" (next)                            \
 89                                                                         \
 90                        __switch_canary_iparam                           \
 91                                                                         \
 92                      : /* reloaded segment registers */                 \
 93                         "memory");                                      \
 94 } while (0)

 

I: From the code above, switching the kernel stack consists of the following steps:

1: Push the eflags and ebp registers onto prev's kernel stack.

2: Save esp into prev->thread.sp, and the resume address (the "1:" label) into prev->thread.ip as the saved eip.

3: Load next->thread.sp (from the new process pointed to by next) into esp, and push next->thread.ip so that it becomes the eip at which next resumes once __switch_to returns.

At this point the kernel-stack switch is complete (the thread_struct fields involved are sketched below).
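
For reference, the sp and ip fields used above live in struct thread_struct (arch/x86/include/asm/processor.h). The excerpt below keeps only the fields relevant to this post; everything else is omitted and the ordering is only approximate:

struct thread_struct {
        struct desc_struct tls_array[GDT_ENTRY_TLS_ENTRIES]; /* per-thread TLS descriptors */
        unsigned long sp0;              /* top of the kernel stack, copied into the TSS's esp0 */
        unsigned long sp;               /* kernel esp saved by switch_to */
        unsigned long ip;               /* resume eip saved by switch_to (the "1:" label) */
        unsigned long gs;               /* saved %gs selector */
        unsigned long *io_bitmap_ptr;   /* per-task I/O permission bitmap, if any */
        unsigned long iopl;
        /* ... debug registers, sysenter_cs, FPU state, etc. ... */
};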

 

II: After the kernel stack has been switched, the TSS must be updated accordingly.

This is because, under Linux, all processes running on a given CPU share a single TSS; once the running process changes, the TSS has to change with it.

Linux relies on the TSS in two main ways (the relevant fields are sketched after this list):

(1) Whenever a process traps from user mode into kernel mode, the CPU fetches the kernel stack pointer from the TSS.

(2) User-mode I/O port access is checked against the I/O permission bitmap in the TSS.
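
These two uses correspond to two fields of the per-CPU TSS. Below is a heavily abridged excerpt of the relevant structures from arch/x86/include/asm/processor.h; only the fields discussed here are shown:

struct x86_hw_tss {
        unsigned short back_link, __blh;
        unsigned long  sp0;            /* kernel stack pointer the CPU loads on a user->kernel trap */
        unsigned short ss0, __ss0h;
        /* ... the rest of the hardware-defined TSS layout ... */
};

struct tss_struct {
        struct x86_hw_tss x86_tss;                      /* the hardware TSS */
        unsigned long io_bitmap[IO_BITMAP_LONGS + 1];   /* I/O permission bitmap */
        /* ... */
} ____cacheline_aligned;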

Therefore a process switch must also update the values of esp0 and the I/O permission bitmap in the TSS. This is done mainly in the __switch_to function:

In the 3.3 kernel, for the x86 architecture, at line 296 of /arch/x86/kernel/process_32.c:

296 __notrace_funcgraph struct task_struct *
297 __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
298 {
299         struct thread_struct *prev = &prev_p->thread,
300                                  *next = &next_p->thread;
301         int cpu = smp_processor_id();
302         struct tss_struct *tss = &per_cpu(init_tss, cpu);
303         fpu_switch_t fpu;
304 
305         /* never put a printk in __switch_to... printk() calls wake_up*() indirectly */
306 
307         fpu = switch_fpu_prepare(prev_p, next_p, cpu);
308 
309         /*
310          * Reload esp0.
311          */
312         load_sp0(tss, next);
313 
314         /*
315          * Save away %gs. No need to save %fs, as it was saved on the
316          * stack on entry.  No need to save %es and %ds, as those are
317          * always kernel segments while inside the kernel.  Doing this
318          * before setting the new TLS descriptors avoids the situation
319          * where we temporarily have non-reloadable segments in %fs
320          * and %gs.  This could be an issue if the NMI handler ever
321          * used %fs or %gs (it does not today), or if the kernel is
322          * running inside of a hypervisor layer.
323          */
324         lazy_save_gs(prev->gs);
325 
326         /*
327          * Load the per-thread Thread-Local Storage descriptor.
328          */
329         load_TLS(next, cpu);
330 
331         /*
332          * Restore IOPL if needed.  In normal use, the flags restore
333          * in the switch assembly will handle this.  But if the kernel
334          * is running virtualized at a non-zero CPL, the popf will
335          * not restore flags, so it must be done in a separate step.
336          */
337         if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
338                 set_iopl_mask(next->iopl);
339 
340         /*
341          * Now maybe handle debug registers and/or IO bitmaps
342          */
343         if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV ||
344                      task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
345                 __switch_to_xtra(prev_p, next_p, tss);
346 
347         /*
348          * Leave lazy mode, flushing any hypercalls made here.
349          * This must be done before restoring TLS segments so
350          * the GDT and LDT are properly updated, and must be
351          * done before math_state_restore, so the TS bit is up
352          * to date.
353          */
354         arch_end_context_switch(next_p);
355 
356         /*
357          * Restore %gs if needed (which is common)
358          */
359         if (prev->gs | next->gs)
360                 lazy_load_gs(next->gs);
361 
362         switch_fpu_finish(next_p, fpu);
363 
364         percpu_write(current_task, next_p);
365 
366         return prev_p;
367 }

From the code above, the TSS update consists mainly of:

1: load_sp0(tss, next); — take the next process's sp0 from its thread field and use it to update sp0 in the TSS.

2: __switch_to_xtra(prev_p, next_p, tss); — update the I/O permission bitmap in the TSS when necessary (see the sketch below).
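
On bare hardware, load_sp0() resolves to native_load_sp0(). The sketch below is roughly what that helper looks like in 3.3-era kernels (arch/x86/include/asm/processor.h); treat the details, in particular the sysenter handling, as an approximation:

/* Approximate 32-bit native_load_sp0(): point the per-CPU TSS at the
 * incoming task's kernel stack so the next user->kernel trap lands on it. */
static inline void native_load_sp0(struct tss_struct *tss,
                                   struct thread_struct *thread)
{
        tss->x86_tss.sp0 = thread->sp0;
#ifdef CONFIG_X86_32
        /* Keep the SYSENTER MSR in sync if this task uses a different
         * sysenter code segment. */
        if (unlikely(tss->x86_tss.ss1 != thread->sysenter_cs)) {
                tss->x86_tss.ss1 = thread->sysenter_cs;
                wrmsr(MSR_IA32_SYSENTER_CS, thread->sysenter_cs, 0);
        }
#endif
}

Roughly speaking, when either task has a private I/O bitmap, __switch_to_xtra() (arch/x86/kernel/process.c) copies next->thread.io_bitmap_ptr into tss->io_bitmap (or invalidates the bitmap if next has none), which is what keeps user-mode I/O permission checks correct after the switch.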