How to improve the execution speed of sorting large data in C

Time: 2021-10-17 03:53:43

I managed to roll off an insertion sort routine as shown:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct{
    int n;
    char l;
    char z;
} dat;

void sortx(dat* y){
    char tmp[sizeof(dat)+1];
    dat *sp=y;
    while(y->l){
        dat *ip=y;
        while(ip>sp && ip->n < (ip-1)->n){
            memcpy(tmp,ip,sizeof(dat));
            memcpy(ip,ip-1,sizeof(dat));
            memcpy(ip-1,tmp,sizeof(dat));
            ip--;
        }
        y++;
    }
}

void printa(dat* y){
    while(y->l){printf("%c %d,",y->l,y->n);y++;}
    printf("\n");
}

int main(int argc,char* argv[]){
    const long sz=10000;
    dat* new=calloc(sz+2,sizeof(dat));
    dat* randx=new;
    //fill struct array with random values
    int i;
    for (i = 0 ; i < sz ; i++) {
        randx->l = (unsigned char)(65+(rand() % 25));
        randx->n = (rand() % 1000);randx++;
    }
    //sort - takes forever
    sortx(new);
    printa(new);
    free(new);
    return 0;
}

My sorting routine was partly derived from: http://www.programmingsimplified.com/c/source-code/c-program-insertion-sort. Because I am sorting the array by the numeric value in the struct, I swap whole elements with memcpy, and that has worked for me so far.

The computer I'm using to execute this code has a 1.6 GHz Pentium processor, and when I change sz in the main function to 20000 or more, I have to wait two seconds to see the results on the screen.

I'm testing large numbers because I want to process server logs in C and will be sorting the entries by timestamp. The logs can sometimes become very large, and I don't want to put too much strain on the CPU, as it is already running other processes such as apache.

Is there any way I can improve this code so I don't have to wait two seconds to see 20000 structs sorted?

3 Answers

#1


2  

There is already a function that does this, and it's built into the C standard library: qsort. You just have to provide a suitable comparison function.

This function has to return a negative value if the item taken as the left argument should come earlier in the desired order, a positive value if it should come later, or 0 if qsort should consider the two items equal. Returning -1, 1, and 0 satisfies this.

int dat_sorter(const void* l, const void* r)
{
    const dat* left = (const dat*)l;
    const dat* right = (const dat*)r;
    if(left->n > right->n)
        return 1;
    else if(left->n < right->n)
        return -1;
    else
        return 0;
}

void sortx(dat* y)
{
    /* find the length */
    dat* it = y;
    size_t count = 0;
    while(it->l)
    {
        count++;
        it++;
    }
    /* do the sorting */
    qsort(y, count, sizeof(dat), dat_sorter);
}

If you want to speed it up even more, you can make the sortx function take the length of the array as a parameter, so it doesn't need to work it out on its own.
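As a rough sketch of that last suggestion, the version below takes the element count directly. The names sortx_n and dat_cmp are made up for illustration and are not part of the original code; the struct is copied from the question.

```c
#include <stdlib.h>

typedef struct {
    int n;
    char l;
    char z;
} dat;

/* Comparison callback for qsort: orders dat records by ascending n. */
static int dat_cmp(const void *a, const void *b)
{
    const dat *left = a;
    const dat *right = b;
    /* (x > y) - (x < y) yields 1, 0 or -1 and cannot overflow,
       unlike the tempting "left->n - right->n". */
    return (left->n > right->n) - (left->n < right->n);
}

/* Hypothetical variant of sortx: the caller passes the element count,
   so no scan for the l == 0 terminator is needed. */
void sortx_n(dat *y, size_t count)
{
    qsort(y, count, sizeof(dat), dat_cmp);
}
```

With the main function from the question, the call would become sortx_n(new, sz), since sz is already known there.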

#2


0  

Use quick sort, heap sort, or bottom-up merge sort. Wikipedia has examples of these in its articles, and there are typically more complete examples on each article's talk page.
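To give an idea of what a bottom-up merge sort might look like for the struct in the question, here is a minimal sketch. The function names merge_runs and merge_sort_dat are invented for this example, and it assumes one temporary buffer the size of the array is acceptable.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    int n;
    char l;
    char z;
} dat;

/* Merge the sorted runs src[lo..mid) and src[mid..hi) into dst[lo..hi). */
static void merge_runs(const dat *src, dat *dst,
                       size_t lo, size_t mid, size_t hi)
{
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi) {
        /* Take from the right run only if it is strictly smaller,
           so equal keys keep their original order (stable). */
        if (src[j].n < src[i].n)
            dst[k++] = src[j++];
        else
            dst[k++] = src[i++];
    }
    while (i < mid) dst[k++] = src[i++];
    while (j < hi)  dst[k++] = src[j++];
}

/* Bottom-up merge sort by n. Returns 0 on success, -1 if malloc fails. */
int merge_sort_dat(dat *a, size_t count)
{
    if (count < 2)
        return 0;
    dat *buf = malloc(count * sizeof *buf);
    if (buf == NULL)
        return -1;

    dat *src = a, *dst = buf;
    /* Runs of length 1 are sorted; double the run length each pass. */
    for (size_t width = 1; width < count; width *= 2) {
        for (size_t lo = 0; lo < count; lo += 2 * width) {
            size_t mid = (lo + width < count) ? lo + width : count;
            size_t hi  = (lo + 2 * width < count) ? lo + 2 * width : count;
            merge_runs(src, dst, lo, mid, hi);
        }
        dat *t = src; src = dst; dst = t;   /* result of this pass is now in src */
    }
    if (src != a)
        memcpy(a, src, count * sizeof *a);
    free(buf);
    return 0;
}
```

Unlike qsort, this version is guaranteed to be stable, which may matter if log entries with the same timestamp should keep their original order.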

#3


0  

Insertion sort has O(n^2) time complexity, while algorithms such as mergesort, quicksort, and heapsort give you O(n log n) (for quicksort, on average). It looks like you are sorting by an integer, so you might also want to consider LSD radix sort, which runs in O(n) time for fixed-width keys.
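For reference, an LSD radix sort over the n field could be sketched roughly as follows. This is only an illustration: radix_sort_dat is a made-up name, it processes one byte per pass, and it assumes every n is non-negative (true for the rand() % 1000 values in the question).

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    int n;
    char l;
    char z;
} dat;

/* LSD radix sort by n, one byte (radix 256) per pass.
   Assumption: all n values are non-negative.
   Returns 0 on success, -1 if malloc fails. */
int radix_sort_dat(dat *a, size_t count)
{
    if (count < 2)
        return 0;
    dat *buf = malloc(count * sizeof *buf);
    if (buf == NULL)
        return -1;

    for (unsigned shift = 0; shift < sizeof(int) * CHAR_BIT; shift += 8) {
        size_t bucket[257] = {0};

        /* Count how many keys have each byte value at this position. */
        for (size_t i = 0; i < count; i++)
            bucket[(((unsigned)a[i].n >> shift) & 0xFF) + 1]++;

        /* Prefix sum: bucket[v] becomes the first output index for byte v. */
        for (int v = 0; v < 256; v++)
            bucket[v + 1] += bucket[v];

        /* Scatter in input order, which keeps each pass stable. */
        for (size_t i = 0; i < count; i++)
            buf[bucket[((unsigned)a[i].n >> shift) & 0xFF]++] = a[i];

        memcpy(a, buf, count * sizeof *a);
    }
    free(buf);
    return 0;
}
```

Each pass is linear in the number of elements, and the number of passes depends only on the width of int, which is where the O(n) bound for fixed-width keys comes from. Whether it actually beats qsort for a given log size is something to measure rather than assume.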
