模拟linux内核进程调度的代码及对其的分析

作者：刘磊

参考代码出处：《Linux内核分析》MOOC课程 http://mooc.study.163.com/course/USTC-1000029000

多道程序设计技术是在计算机内存中同时存放几道相互独立的程序，使它们在管理程序控制下，相互穿插运行，两个或两个以上程序在计算机系统中同处于开始到结束之间的状态, 这些程序共享计算机系统资源。在单核的物理机上，从宏观的角度描述多道程序，是多个进程并行地执行；而从微观的角度说，多道程序实际上是轮流拥有CPU的计算资源，是串行地执行，只是切换的时间较为短暂，使用者察觉不到程序在单个处理器上的切换。

而为了提高进程切换的效率，当进程消耗完属于它的时间片后会发出一个中断请求，中断处理程序会保存进程的“现场”，便于下一次调度再次执行。在“现场”保存完毕后，CPU被让渡给任务队列中下一个就绪的进程，并对其进行“现场”恢复，继续执行其之前没有执行完的任务。

下面我就根据参考代码，同大家一起学习这一机制。

代码结构入下图所示：

模拟linux内核进程调度的代码及对其的分析

1 mypcb.h

1.1 常量定义

#define MAX_TASK_NUM 10 // max num of task in system
#define KERNEL_STACK_SIZE 1024*8
#define PRIORITY_MAX 30 //priority range from 0 to 30

其中MAX_TASK_NUM常量为模拟内核的最大任务数，KERNEL_STACK_SIZE为内核堆栈的大小，PRIORITY_MAX为进程调度的最大优先级。

1.2 结构体定义

/* CPU-specific state of this task */
struct Thread {
    unsigned long    ip;//point to cpu run address
    unsigned long    sp;//point to the thread stack's top address
    //todo add other attrubte of system thread
};
//PCB Struct
typedef struct PCB{
    int pid; // pcb id 
    volatile long state;    /* -1 unrunnable, 0 runnable, >0 stopped */
    char stack[KERNEL_STACK_SIZE];// each pcb stack size is 1024*8
    /* CPU-specific state of this task */
    struct Thread thread;
    unsigned long    task_entry;//the task execute entry memory address
    struct PCB *next;//pcb is a circular linked list
    unsigned long priority;// task priority ////////
    //todo add other attrubte of process control block
}tPCB;

在每一个操作系统的内核中都有一套对线程的描述机制，在这里我们用一个PCB（进程管理块）来描述一个真实存在的进程。

（1）在PCB结构体中，pid即代表进程号，是此进程在计算机系统中的唯一标识，可用于线程的调度及其他的管理。

（2）state属性用于描述进程现在处于的状态，可以取三类值。

（3）stack为系统分配给进程的堆栈，用于现场保护等。

（4） Thread结构体用于保存ip和sp两个指针。

（5）task_entry指向系统分配给的内存空间。

（6）在模拟代码中，任务队列是由循环链式队列构成，next指针指向下一个PCB块，便于PCB块的管理。

（7）priority记录了当前进程在CPU调度时的优先级，系统在调度资源时会考虑到优先级进行系统资源的分配。

2 myinterrupt.c

2.1 定时器

void my_timer_handler(void)
{
#if 1
    // make sure need schedule after system circle 2000 times.
    if(time_count%2000 == 0 && my_need_sched != 1)
    {
        my_need_sched = 1;
    //time_count=0;
    }
    time_count ++ ;
#endif
    return;
}

此函数每1000个单位时间产生一个中断信号，并将my_need_sched设置为1，通知mymain.c中的my_process函数对进程进行切换。

2.2 进程调度策略

tPCB * get_next(void)
{
    int pid,i;
    tPCB * point=NULL;
    tPCB * hig_pri=NULL;//points to the the hightest task
    all_task_print();
    hig_pri=my_current_task;
    for(i=0;i<MAX_TASK_NUM;i++)
        if(task[i].priority<hig_pri->priority)    
            hig_pri=&task[i];
    printk("                higst process is:%d priority is:%d\n",hig_pri->pid,hig_pri->priority);
    return hig_pri;

}//end of priority_schedule

此模拟内核的调度策略是，遍历任务队列中的所有进程，选取优先级最高的进程，并将下一个时间片分配给它。

2.3 进程的切换

void my_schedule(void)
{
    tPCB * next;
    tPCB * prev;
    // if there no task running or only a task ,it shouldn't need schedule
    if(my_current_task == NULL
        || my_current_task->next == NULL)
    {
    printk(KERN_NOTICE "                time out!!!,but no more than 2 task,need not schedule\n");
     return;
    }
    /* schedule */

    next = get_next();
    prev = my_current_task;
    printk(KERN_NOTICE "                the next task is %d priority is %u\n",next->pid,next->priority);
    if(next->state == 0)/* -1 unrunnable, 0 runnable, >0 stopped */
    {//save current scene
     /* switch to next process */
     asm volatile(    
         "pushl %%ebp\n\t" /* save ebp */
         "movl %%esp,%0\n\t" /* save esp */
         "movl %2,%%esp\n\t" /* restore esp */
         "movl $1f,%1\n\t" /* save eip */    
         "pushl %3\n\t"
         "ret\n\t" /* restore eip */
         "1:\t" /* next process start here */
         "popl %%ebp\n\t"
         : "=m" (prev->thread.sp),"=m" (prev->thread.ip)
         : "m" (next->thread.sp),"m" (next->thread.ip)
     );
     my_current_task = next;//switch to the next task
    printk(KERN_NOTICE "                switch from %d process to %d process\n                >>>process %d running!!!<<<\n\n",prev->pid,next->pid,next->pid);

  }
    else
    {
        next->state = 0;
        my_current_task = next;
    printk(KERN_NOTICE "                switch from %d process to %d process\n                >>>process %d running!!!<<<\n\n\n",prev->pid,next->pid,next->pid);

     /* switch to new process */
     asm volatile(    
         "pushl %%ebp\n\t" /* save ebp */
         "movl %%esp,%0\n\t" /* save esp */
         "movl %2,%%esp\n\t" /* restore esp */
         "movl %2,%%ebp\n\t" /* restore ebp */
         "movl $1f,%1\n\t" /* save eip */    
         "pushl %3\n\t"
         "ret\n\t" /* restore eip */
         : "=m" (prev->thread.sp),"=m" (prev->thread.ip)
         : "m" (next->thread.sp),"m" (next->thread.ip)
     );
    }
    return;    
}//end of my_schedule

进程间的切换主要涉及两个方面，一是保存当前进程的上下文，即“现场”保护，目的是再次调度到此进程时，能够将CPU的寄存器状态恢复到上次中断的时间点，以继续执行之后的程序；二是恢复下一个要执行的进程的“现场”。在保护“现场”时应该按照一定的顺序将各个段寄存器和指令寄存器等的值进行压栈，在恢复“现场”的时候根据“先进后出”的原则将各寄存器的内容进行出栈操作。

3 mymain.c

3.1 进程优先级的初始化

void sand_priority(void)
{
    int i;
    for(i=0;i<MAX_TASK_NUM;i++)
        task[i].priority=get_rand(PRIORITY_MAX);
}

unsigned long get_rand(max)
{
    unsigned long a;
    unsigned long umax;
    umax=(unsigned long)max;
     get_random_bytes(&a, sizeof(unsigned long ));
    a=(a+umax)%umax;
    return a;
}

循环遍历初始任务队列，每遍历至一个节点，获取一个伪随机数对其优先级进行初始化。

3.2 模拟CPU运行环境

void my_process(void)
{
    int i = 0;
    while(1)
    {
        i++;
        if(i%10000000 == 0)
        {
      
            if(my_need_sched == 1)
            {
                my_need_sched = 0;
        sand_priority();
           my_schedule();  
        
       }
        }
    }
}//end of my_process

此函数模拟了一直处于工作状态的CPU，在收到定时器的中断信号，即变量my_need_sched的值设置为1时，调用进程调度函数，切换正在运行的进程。

3.3 进程初始化

void __init my_start_kernel(void)
{
    int pid = 0;
    /* Initialize process 0*/
    task[pid].pid = pid;
    task[pid].state = 0;/* -1 unrunnable, 0 runnable, >0 stopped */
    // set task 0 execute entry address to my_process
    task[pid].task_entry = task[pid].thread.ip = (unsigned long)my_process;
    task[pid].thread.sp = (unsigned long)&task[pid].stack[KERNEL_STACK_SIZE-1];
    task[pid].next = &task[pid];
    /*fork more process */
    for(pid=1;pid<MAX_TASK_NUM;pid++)
    {
        memcpy(&task[pid],&task[0],sizeof(tPCB));
        task[pid].pid = pid;
        task[pid].state = -1;
        task[pid].thread.sp = (unsigned long)&task[pid].stack[KERNEL_STACK_SIZE-1];
    task[pid].priority=get_rand(PRIORITY_MAX);//each time all tasks get a random priority
    }
    task[MAX_TASK_NUM-1].next=&task[0];
    printk(KERN_NOTICE "\n\n\n\n\n\n                system begin :>>>process 0 running!!!<<<\n\n");
    /* start process 0 by task[0] */
    pid = 0;
    my_current_task = &task[pid];
asm volatile(
     "movl %1,%%esp\n\t" /* set task[pid].thread.sp to esp */
     "pushl %1\n\t" /* push ebp */
     "pushl %0\n\t" /* push task[pid].thread.ip */
     "ret\n\t" /* pop task[pid].thread.ip to eip */
     "popl %%ebp\n\t"
     :
     : "c" (task[pid].thread.ip),"d" (task[pid].thread.sp)    /* input c or d mean %ecx/%edx*/
);
}

此函数创建了模拟的初始队列，并对每个进程控制块的各个属性字段进行了赋值，并指定了模拟CPU程序的函数入口。

4 总结

此次对简单的时间片轮转多道程序内核代码进行了分析，使较为枯燥繁琐的linux内核代码变得简单立体，理解起来更为容易。在分析代码的同时又对操作系统的知识作了一次复习，使本人对进程调度方面的认识又上了一个台阶。

秒客网

模拟linux内核进程调度的代码及对其的分析

相关文章