如何最好地对循环缓冲区的一部分进行排序?

时间:2022-08-02 03:58:46

I have a circular, statically allocated buffer in C, which I'm using as a queue for a depth breadth first search. I'd like have the top N elements in the queue sorted. It would be easy to just use a regular qsort() - except it's a circular buffer, and the top N elements might wrap around. I could, of course, write my own sorting implementation that uses modular arithmetic and knows how to wrap around the array, but I've always thought that writing sorting functions is a good exercise, but something better left to libraries.

我在C中有一个循环的,静态分配的缓冲区,我将其用作深度广度优先搜索的队列。我希望排队队列中的前N个元素。使用常规qsort()很容易 - 除了它是一个循环缓冲区,并且前N个元素可能会环绕。当然,我可以编写自己的排序实现,它使用模块化算法并知道如何包装数组,但我一直认为编写排序函数是一个很好的练习,但最好留给库。

I thought of several approaches:

我想到了几种方法:

  1. Use a separate linear buffer - first copy the elements from the circular buffer, then apply qsort, then copy them back. Using an additional buffer means an additional O(N) space requirement, which brings me to
    • Sort the "top" and "bottom" halve using qsort, and then merge them using the additional buffer
    • 使用qsort对“top”和“bottom”half进行排序,然后使用其他缓冲区合并它们

    • Same as 2. but do the final merge in-place (I haven't found much on in-place merging, but the implementations I've seen don't seem worth the reduced space complexity)
    • 与2.相同但是就地进行最终合并(我在就地合并方面没有找到太多内容,但我看到的实现似乎不值得降低空间复杂度)

  2. 使用单独的线性缓冲区 - 首先从循环缓冲区复制元素,然后应用qsort,然后将它们复制回来。使用额外的缓冲区意味着额外的O(N)空间要求,这使我使用qsort对“top”和“bottom”half进行排序,然后使用附加缓冲区将它们合并为2.但是最后合并在 - 地方(我没有找到很多就地合并,但我看到的实现似乎不值得减少空间复杂度)

On the other hand, spending an hour contemplating how to elegantly avoid writing my own quicksort, instead of adding those 25 (or so) lines might not be the most productive either...

另一方面,花一个小时考虑如何优雅地避免编写我自己的快速排序,而不是添加那些25(左右)线可能不是最有效率...

Correction: Made a stupid mistake of switching DFS and BFS (I prefer writing a DFS, but in this particular case I have to use a BFS), sorry for the confusion.

更正:切换DFS和BFS犯了一个愚蠢的错误(我更喜欢编写DFS,但在这种特殊情况下我必须使用BFS),抱歉让人感到困惑。

Further description of the original problem:

进一步描述原始问题:

I'm implementing a breadth first search (for something not unlike the fifteen puzzle, just more complicated, with about O(n^2) possible expansions in each state, instead of 4). The "bruteforce" algorithm is done, but it's "stupid" - at each point, it expands all valid states, in a hard-coded order. The queue is implemented as a circular buffer (unsigned queue[MAXLENGTH]), and it stores integer indices into a table of states. Apart from two simple functions to queue and dequeue an index, it has no encapsulation - it's just a simple, statically allocated array of unsigned's.

我正在实施广度优先搜索(对于与十五个难题不同的东西,只是更复杂,在每个状态中有大约O(n ^ 2)个可能的扩展,而不是4)。完成了“强力”算法,但它是“愚蠢的” - 在每个点上,它以硬编码的顺序扩展所有有效状态。队列实现为循环缓冲区(unsigned queue [MAXLENGTH]),并将整数索引存储到状态表中。除了两个简单的函数来排队和出列索引之外,它没有封装 - 它只是一个简单的,静态分配的无符号数组。

Now I want to add some heuristics. The first thing I want to try is to sort the expanded child states after expansion ("expand them in a better order") - just like I would if I were programming a simple best-first DFS. For this, I want to take part of the queue (representing the most recent expanded states), and sort them using some kind of heuristic. I could also expand the states in a different order (so in this case, it's not really important if I break the FIFO properties of the queue).

现在我想添加一些启发式方法。我想要尝试的第一件事是在扩展后对扩展的子状态进行排序(“以更好的顺序扩展它们”) - 就像我编写一个简单的最佳优先DFS一样。为此,我想参与队列的一部分(代表最近的扩展状态),并使用某种启发式对它们进行排序。我也可以按不同的顺序扩展状态(所以在这种情况下,如果我打破队列的FIFO属性,这并不重要)。

My goal is not to implement A*, or a depth first search based algorithm (I can't afford to expand all states, but if I don't, I'll start having problems with infinite cycles in the state space, so I'd have to use something like iterative deepening).

我的目标不是实现A *,或者是基于深度优先搜索的算法(我不能扩展所有状态,但如果我不这样做,我将开始遇到状态空间中无限循环的问题,所以我'必须使用迭代加深之类的东西)。

6 个解决方案

#1


2  

I'm not seeing exactly the solution you asked for in c. You might consider one of these ideas:

我没有看到你要求的解决方案。您可以考虑以下其中一个想法:

  • If you have access to the source for your libc's qsort(), you might copy it and simply replace all the array access and indexing code with appropriately generalized equivalents. This gives you some modest assurance that the underling sort is efficient and has few bugs. No help with the risk of introducing your own bugs, of course. Big O like the system qsort, but possibly with a worse multiplier.

    如果您可以访问libc的qsort()的源代码,则可以复制它,并简单地用适当的通用等价物替换所有数组访问和索引代码。这为您提供了一些适度的保证,即基础排序是有效的,并且几乎没有错误。当然,没有帮助引入自己的错误的风险。 Big O喜欢系统qsort,但可能有更差的乘数。

  • If the region to be sorted is small compared to the size of the buffer, you could use the straight ahead linear sort, guarding the call with a test-for-wrap and doing the copy-to-linear-buffer-sort-then-copy-back routine only if needed. Introduces an extra O(n) operation in the cases that trip the guard (for n the size of the region to be sorted), which makes the average O(n^2/N) < O(n).

    如果要排序的区域与缓冲区的大小相比较小,则可以使用直接线性排序,使用测试换行保护调用并执行复制到线性缓冲区排序 - 然后 - 仅在需要时复制例程。在使防护装置跳闸的情况下(对于要分类的区域的大小为n)引入额外的O(n)操作,这使得平均O(n ^ 2 / N) (n)。


I see that C++ is not an option for you. ::sigh:: I will leave this here in case someone else can use it.

我看到C ++不是你的选择。 ::叹气::我会留下这个以防万一其他人可以使用它。

  • If C++ is an option you could (subclass the buffer if needed and) overload the [] operator to make the standard sort algorithms work. Again, should work like the standard sort with a multiplier penalty.
  • 如果C ++是一个选项,你可以(如果需要的话继承缓冲区)并重载[]运算符以使标准排序算法起作用。同样,应该像标准排序一样工作,带有乘数惩罚。

#2


5  

I think you need to take a big step back from the problem and try to solve it as a whole - chances are good that the semi-sorted circular buffer is not the best way to store your data. If it is, then you're already committed and you will have to write the buffer to sort the elements - whether that means performing an occasional sort with an outside library, or doing it when elements are inserted I don't know. But at the end of the day it's going to be ugly because a FIFO and sorted buffer are fundamentally different.

我认为您需要从问题中退一步并尝试将其作为一个整体来解决 - 半排序循环缓冲区不是存储数据的最佳方式。如果是,那么你已经提交了,你必须编写缓冲区来对元素进行排序 - 这是否意味着偶尔对外部库进行排序,或者在插入元素时进行,我不知道。但是在一天结束时它会变得丑陋,因为FIFO和排序缓冲区根本不同。


Previous answer, which assumes your sort library has a robust and feature filled API (as requested in your question, this does not require you to write your own mod sort or anything - it depends on the library supporting arbitrary located data, usually through a callback function. If your sort doesn't support linked lists, it can't handle this):

以前的答案,假设您的排序库具有强大且功能丰富的API(在您的问题中要求,这不需要您编写自己的mod排序或任何东西 - 它取决于支持任意位置数据的库,通常通过回调如果你的排序不支持链表,它就无法解决这个问题):

The circular buffer has already solved this problem using % (mod) arithmetic. QSort, etc don't care about the locations in memory - they just need a scheme to address the data in a linear manner.

循环缓冲区已经使用%(mod)算法解决了这个问题。 QSort等不关心内存中的位置 - 它们只需要一种以线性方式处理数据的方案。

They work as well for linked lists (which are not linear in memory) as they do for 'real' linear non circular arrays.

它们对于链接列表(在内存中不是线性的)也适用于“真实”线性非圆形阵列。

So if you have a circular array with 100 entries, and you find you need to sort the top 10, and the top ten happen to wrap in half at the top, then you feed the sort the following two bits of information:

因此,如果您有一个包含100个条目的循环数组,并且您发现需要对前10个进行排序,并且前十个恰好在顶部包裹了一半,那么您将对以下两位信息进行排序:

  • The function to locate an array item is (x % 100)
  • 定位数组项的函数是(x%100)

  • The items to be sorted are at locations 95 to 105
  • 要分类的项目位于95到105的位置

The function will convert the addresses the sort uses into an index used in the real array, and the fact that the array wraps around is hidden, although it may look weird to sort an array past its bounds, a circular array, by definition, has no bounds. The % operator handles that for you, and you might as well be referring to the part of the array as 1295 to 1305 for all it cares.

该函数将sort排序使用的地址转换为真实数组中使用的索引,并且数组环绕的事实是隐藏的,尽管将数组排序超出其边界可能看起来很奇怪,根据定义,圆形数组具有没有界限。 %运算符会为您处理,您可能也会将数组的部分称为1295到1305,只关注它。

Bonus points for having an array with 2^n elements.

具有2 ^ n个元素的数组的加分点。


Additional points of consideration:

It sounds to me that you're using a sorting library which is incapable of sorting anything other than a linear array - so it can't sort linked lists, or arrays with anything other than simple ordering. You really only have three choices:

听起来你正在使用一个排序库,它不能排序除线性数组之外的任何东西 - 因此它不能对链接列表进行排序,也不能对除简单排序以外的任何数组进行排序。你真的只有三个选择:

  • You can re-write the library to be more flexible (ie, when you call it you give it a set of function pointers for comparison operations, and data access operations)
  • 您可以重新编写库以使其更加灵活(即,当您调用它时,您可以为其提供一组用于比较操作和数据访问操作的函数指针)

  • You can re-write your array so it somehow fits your existing libraries
  • 您可以重新编写数组,以便它以某种方式适合您现有的库

  • You can write custom sorts for your particular solution.
  • 您可以为特定解决方案编写自定义排序。

Now, for my part I'd re-write the sort code so it was more flexible (or duplicate it and edit the new copy so you have sorts which are fast for linear arrays, and sorts which are flexible for non-linear arrays)

现在,就我而言,我会重新编写排序代码,因此它更灵活(或者复制它并编辑新副本,因此您可以对线性数组进行快速排序,对非线性数组进行灵活排序)

But the reality is that right now your sort library is so simple you can't even tell it how to access data that is non linearly stored.

但实际情况是,现在您的排序库非常简单,您甚至无法告诉它如何访问非线性存储的数据。

If it's that simple, there should be no hesitation to adapting the library itself to your particular needs, or adapting your buffer to the library.

如果这很简单,应该毫不犹豫地使库本身适应您的特定需求,或者使您的缓冲区适应库。

Trying an ugly kludge, like somehow turning your buffer into a linear array, sorting it, and then putting it back in is just that - an ugly kludge that you're going to have to understand and maintain later. You're going to 'break' into your FIFO and fiddle with the innards.

尝试一个丑陋的kludge,就像以某种方式将你的缓冲区变成一个线性阵列,对它进行排序,然后把它放回去 - 这是一个丑陋的kludge,你将不得不理解和维护以后。你将“打破”你的先进先出,然后摆弄内脏。

-Adam

#3


2  

Perhaps a priority queue could be adapted to solve your issue.'

也许可以调整优先级队列来解决您的问题。

#4


2  

You could rotate the circular queue until the subset in question no longer wraps around. Then just pass that subset to qsort like normal. This might be expensive if you need to sort frequently or if the array element size is very large. But if your array elements are just pointers to other objects then rotating the queue may be fast enough. And in fact if they are just pointers then your first approach might also be fast enough: making a separate linear copy of a subset, sorting it, and writing the results back.

您可以旋转循环队列,直到有问题的子集不再包围。然后像普通一样将该子集传递给qsort。如果您需要经常排序或者数组元素大小非常大,这可能会很昂贵。但是如果你的数组元素只是指向其他对象的指针,那么旋转队列可能足够快。事实上,如果它们只是指针,那么你的第一种方法也可能足够快:制作一个子集的单独线性副本,对其进行排序,然后将结果写回来。

#5


1  

Do you know about the rules regarding optimization? You can google them (you'll find a few versions, but they all say pretty much the same thing, DON'T).

你知道有关优化的规则吗?你可以谷歌他们(你会找到几个版本,但他们都说同样的事情,不要)。

It sounds like you are optimizing without testing. That's a huge no-no. On the other hand, you're using straight C, so you are probably on a restricted platform that requires some level of attention to speed, so I expect you need to skip the first two rules because I assume you have no choice:

听起来你在没有测试的情况下进行优化。这是一个巨大的禁忌。另一方面,你正在使用直接C,所以你可能在受限制的平台上需要一定程度的关注速度,所以我希望你需要跳过前两条规则,因为我认为你别无选择:

Rules of optimization:

优化规则:

  1. Don't optimize.

  2. If you know what you are doing, see rule #1

    如果您知道自己在做什么,请参阅规则#1

You can go to the more advanced rules:

您可以转到更高级的规则:

Rules of optimization (cont):

优化规则(续):

  1. If you have a spec that requires a certain level of performance, write the code unoptimized and write a test to see if it meets that spec. If it meets it, you're done. NEVER write code taking performance into consideration until you have reached this point.

    如果您的规范需要一定程度的性能,请编写未经优化的代码并编写测试以查看它是否符合该规范。如果遇到它,你就完成了。在您达到这一点之前,永远不要编写考虑性能的代码。

  2. If you complete step 3 and your code does not meet the specs, recode it leaving your original "most obvious" code in there as comments and retest. If it does not meet the requirements, throw it away and use the unoptimized code.

    如果您完成第3步并且您的代码不符合规范,请将其重新编码,将原始“最明显”的代码留在那里作为注释并重新测试。如果它不符合要求,请将其丢弃并使用未经优化的代码。

  3. If your improvements made the tests pass, ensure that the tests remain in the codebase and are re-run, and that your original code remains in there as comments.

    如果您的改进使测试通过,请确保测试保留在代码库中并重新运行,并且原始代码仍作为注释保留在那里。

Note: that should be 3. 4. 5. Something is screwed up--I'm not even using any markup tags.

注意:应该是3. 4. 5.有些东西搞砸了 - 我甚至没有使用任何标记标签。

Okay, so finally--I'm not saying this because I read it somewhere. I've spent DAYS trying to untangle some god-awful messes that other people coded because it was "Optimized"--and the really funny part is that 9 times out of 10, the compiler could have optimized it better than they did.

好的,最后 - 我不是这么说的,因为我在某处读到了它。我花了DAYS试图解开其他人编码的一些可怕的混乱,因为它是“优化的” - 真正有趣的部分是10次中的9次,编译器可以比他们更好地优化它。

I realize that there are times when you will NEED to optimize, all I'm saying is write it unoptimized, test and recode it. It really won't take you much longer--might even make writing the optimized code easier.

我意识到有时候你需要进行优化,我所说的只是将其写入未经优化,测试并重新编码。它真的不会花费你更长的时间 - 甚至可以使编写优化的代码更容易。

The only reason I'm posting this is because almost every line you've written concerns performance, and I'm worried that the next person to see your code is going to be some poor sap like me.

我发布这个的唯一原因是因为你写的几乎每一行都涉及到性能,我担心下一个看到你代码的人会像我一样陷入困境。

#6


0  

How about somthing like this example here. This example easely sorts a part or whatever you want without having to redefine a lot of extra memory. It takes inly two pointers a status bit and a counter for the for loop.

这个例子怎么样?这个例子可以轻松地对一部分或任何你想要的东西进行分类,而无需重新定义大量的额外内存。它需要两个指针状态位和一个for循环计数器。

#define _PRINT_PROGRESS
#define N 10
BYTE buff[N]={4,5,2,1,3,5,8,6,4,3};
BYTE *a = buff;
BYTE *b = buff;
BYTE changed = 0;
int main(void)
{
    BYTE n=0;
    do
    {
        b++;
        changed = 0;
        for(n=0;n<(N-1);n++)
        {
            if(*a > *b)
            {
                *a ^= *b;
                *b ^= *a;
                *a ^= *b;
                changed = 1;
            }
            a++;
            b++;
        }
        a = buff;
        b = buff;
#ifdef _PRINT_PROGRESS
        for(n=0;n<N;n++)
            printf("%d",buff[n]);
        printf("\n");
    }
#endif
    while(changed);
    system( "pause" );
}

#1


2  

I'm not seeing exactly the solution you asked for in c. You might consider one of these ideas:

我没有看到你要求的解决方案。您可以考虑以下其中一个想法:

  • If you have access to the source for your libc's qsort(), you might copy it and simply replace all the array access and indexing code with appropriately generalized equivalents. This gives you some modest assurance that the underling sort is efficient and has few bugs. No help with the risk of introducing your own bugs, of course. Big O like the system qsort, but possibly with a worse multiplier.

    如果您可以访问libc的qsort()的源代码,则可以复制它,并简单地用适当的通用等价物替换所有数组访问和索引代码。这为您提供了一些适度的保证,即基础排序是有效的,并且几乎没有错误。当然,没有帮助引入自己的错误的风险。 Big O喜欢系统qsort,但可能有更差的乘数。

  • If the region to be sorted is small compared to the size of the buffer, you could use the straight ahead linear sort, guarding the call with a test-for-wrap and doing the copy-to-linear-buffer-sort-then-copy-back routine only if needed. Introduces an extra O(n) operation in the cases that trip the guard (for n the size of the region to be sorted), which makes the average O(n^2/N) < O(n).

    如果要排序的区域与缓冲区的大小相比较小,则可以使用直接线性排序,使用测试换行保护调用并执行复制到线性缓冲区排序 - 然后 - 仅在需要时复制例程。在使防护装置跳闸的情况下(对于要分类的区域的大小为n)引入额外的O(n)操作,这使得平均O(n ^ 2 / N) (n)。


I see that C++ is not an option for you. ::sigh:: I will leave this here in case someone else can use it.

我看到C ++不是你的选择。 ::叹气::我会留下这个以防万一其他人可以使用它。

  • If C++ is an option you could (subclass the buffer if needed and) overload the [] operator to make the standard sort algorithms work. Again, should work like the standard sort with a multiplier penalty.
  • 如果C ++是一个选项,你可以(如果需要的话继承缓冲区)并重载[]运算符以使标准排序算法起作用。同样,应该像标准排序一样工作,带有乘数惩罚。

#2


5  

I think you need to take a big step back from the problem and try to solve it as a whole - chances are good that the semi-sorted circular buffer is not the best way to store your data. If it is, then you're already committed and you will have to write the buffer to sort the elements - whether that means performing an occasional sort with an outside library, or doing it when elements are inserted I don't know. But at the end of the day it's going to be ugly because a FIFO and sorted buffer are fundamentally different.

我认为您需要从问题中退一步并尝试将其作为一个整体来解决 - 半排序循环缓冲区不是存储数据的最佳方式。如果是,那么你已经提交了,你必须编写缓冲区来对元素进行排序 - 这是否意味着偶尔对外部库进行排序,或者在插入元素时进行,我不知道。但是在一天结束时它会变得丑陋,因为FIFO和排序缓冲区根本不同。


Previous answer, which assumes your sort library has a robust and feature filled API (as requested in your question, this does not require you to write your own mod sort or anything - it depends on the library supporting arbitrary located data, usually through a callback function. If your sort doesn't support linked lists, it can't handle this):

以前的答案,假设您的排序库具有强大且功能丰富的API(在您的问题中要求,这不需要您编写自己的mod排序或任何东西 - 它取决于支持任意位置数据的库,通常通过回调如果你的排序不支持链表,它就无法解决这个问题):

The circular buffer has already solved this problem using % (mod) arithmetic. QSort, etc don't care about the locations in memory - they just need a scheme to address the data in a linear manner.

循环缓冲区已经使用%(mod)算法解决了这个问题。 QSort等不关心内存中的位置 - 它们只需要一种以线性方式处理数据的方案。

They work as well for linked lists (which are not linear in memory) as they do for 'real' linear non circular arrays.

它们对于链接列表(在内存中不是线性的)也适用于“真实”线性非圆形阵列。

So if you have a circular array with 100 entries, and you find you need to sort the top 10, and the top ten happen to wrap in half at the top, then you feed the sort the following two bits of information:

因此,如果您有一个包含100个条目的循环数组,并且您发现需要对前10个进行排序,并且前十个恰好在顶部包裹了一半,那么您将对以下两位信息进行排序:

  • The function to locate an array item is (x % 100)
  • 定位数组项的函数是(x%100)

  • The items to be sorted are at locations 95 to 105
  • 要分类的项目位于95到105的位置

The function will convert the addresses the sort uses into an index used in the real array, and the fact that the array wraps around is hidden, although it may look weird to sort an array past its bounds, a circular array, by definition, has no bounds. The % operator handles that for you, and you might as well be referring to the part of the array as 1295 to 1305 for all it cares.

该函数将sort排序使用的地址转换为真实数组中使用的索引,并且数组环绕的事实是隐藏的,尽管将数组排序超出其边界可能看起来很奇怪,根据定义,圆形数组具有没有界限。 %运算符会为您处理,您可能也会将数组的部分称为1295到1305,只关注它。

Bonus points for having an array with 2^n elements.

具有2 ^ n个元素的数组的加分点。


Additional points of consideration:

It sounds to me that you're using a sorting library which is incapable of sorting anything other than a linear array - so it can't sort linked lists, or arrays with anything other than simple ordering. You really only have three choices:

听起来你正在使用一个排序库,它不能排序除线性数组之外的任何东西 - 因此它不能对链接列表进行排序,也不能对除简单排序以外的任何数组进行排序。你真的只有三个选择:

  • You can re-write the library to be more flexible (ie, when you call it you give it a set of function pointers for comparison operations, and data access operations)
  • 您可以重新编写库以使其更加灵活(即,当您调用它时,您可以为其提供一组用于比较操作和数据访问操作的函数指针)

  • You can re-write your array so it somehow fits your existing libraries
  • 您可以重新编写数组,以便它以某种方式适合您现有的库

  • You can write custom sorts for your particular solution.
  • 您可以为特定解决方案编写自定义排序。

Now, for my part I'd re-write the sort code so it was more flexible (or duplicate it and edit the new copy so you have sorts which are fast for linear arrays, and sorts which are flexible for non-linear arrays)

现在,就我而言,我会重新编写排序代码,因此它更灵活(或者复制它并编辑新副本,因此您可以对线性数组进行快速排序,对非线性数组进行灵活排序)

But the reality is that right now your sort library is so simple you can't even tell it how to access data that is non linearly stored.

但实际情况是,现在您的排序库非常简单,您甚至无法告诉它如何访问非线性存储的数据。

If it's that simple, there should be no hesitation to adapting the library itself to your particular needs, or adapting your buffer to the library.

如果这很简单,应该毫不犹豫地使库本身适应您的特定需求,或者使您的缓冲区适应库。

Trying an ugly kludge, like somehow turning your buffer into a linear array, sorting it, and then putting it back in is just that - an ugly kludge that you're going to have to understand and maintain later. You're going to 'break' into your FIFO and fiddle with the innards.

尝试一个丑陋的kludge,就像以某种方式将你的缓冲区变成一个线性阵列,对它进行排序,然后把它放回去 - 这是一个丑陋的kludge,你将不得不理解和维护以后。你将“打破”你的先进先出,然后摆弄内脏。

-Adam

#3


2  

Perhaps a priority queue could be adapted to solve your issue.'

也许可以调整优先级队列来解决您的问题。

#4


2  

You could rotate the circular queue until the subset in question no longer wraps around. Then just pass that subset to qsort like normal. This might be expensive if you need to sort frequently or if the array element size is very large. But if your array elements are just pointers to other objects then rotating the queue may be fast enough. And in fact if they are just pointers then your first approach might also be fast enough: making a separate linear copy of a subset, sorting it, and writing the results back.

您可以旋转循环队列,直到有问题的子集不再包围。然后像普通一样将该子集传递给qsort。如果您需要经常排序或者数组元素大小非常大,这可能会很昂贵。但是如果你的数组元素只是指向其他对象的指针,那么旋转队列可能足够快。事实上,如果它们只是指针,那么你的第一种方法也可能足够快:制作一个子集的单独线性副本,对其进行排序,然后将结果写回来。

#5


1  

Do you know about the rules regarding optimization? You can google them (you'll find a few versions, but they all say pretty much the same thing, DON'T).

你知道有关优化的规则吗?你可以谷歌他们(你会找到几个版本,但他们都说同样的事情,不要)。

It sounds like you are optimizing without testing. That's a huge no-no. On the other hand, you're using straight C, so you are probably on a restricted platform that requires some level of attention to speed, so I expect you need to skip the first two rules because I assume you have no choice:

听起来你在没有测试的情况下进行优化。这是一个巨大的禁忌。另一方面,你正在使用直接C,所以你可能在受限制的平台上需要一定程度的关注速度,所以我希望你需要跳过前两条规则,因为我认为你别无选择:

Rules of optimization:

优化规则:

  1. Don't optimize.

  2. If you know what you are doing, see rule #1

    如果您知道自己在做什么,请参阅规则#1

You can go to the more advanced rules:

您可以转到更高级的规则:

Rules of optimization (cont):

优化规则(续):

  1. If you have a spec that requires a certain level of performance, write the code unoptimized and write a test to see if it meets that spec. If it meets it, you're done. NEVER write code taking performance into consideration until you have reached this point.

    如果您的规范需要一定程度的性能,请编写未经优化的代码并编写测试以查看它是否符合该规范。如果遇到它,你就完成了。在您达到这一点之前,永远不要编写考虑性能的代码。

  2. If you complete step 3 and your code does not meet the specs, recode it leaving your original "most obvious" code in there as comments and retest. If it does not meet the requirements, throw it away and use the unoptimized code.

    如果您完成第3步并且您的代码不符合规范,请将其重新编码,将原始“最明显”的代码留在那里作为注释并重新测试。如果它不符合要求,请将其丢弃并使用未经优化的代码。

  3. If your improvements made the tests pass, ensure that the tests remain in the codebase and are re-run, and that your original code remains in there as comments.

    如果您的改进使测试通过,请确保测试保留在代码库中并重新运行,并且原始代码仍作为注释保留在那里。

Note: that should be 3. 4. 5. Something is screwed up--I'm not even using any markup tags.

注意:应该是3. 4. 5.有些东西搞砸了 - 我甚至没有使用任何标记标签。

Okay, so finally--I'm not saying this because I read it somewhere. I've spent DAYS trying to untangle some god-awful messes that other people coded because it was "Optimized"--and the really funny part is that 9 times out of 10, the compiler could have optimized it better than they did.

好的,最后 - 我不是这么说的,因为我在某处读到了它。我花了DAYS试图解开其他人编码的一些可怕的混乱,因为它是“优化的” - 真正有趣的部分是10次中的9次,编译器可以比他们更好地优化它。

I realize that there are times when you will NEED to optimize, all I'm saying is write it unoptimized, test and recode it. It really won't take you much longer--might even make writing the optimized code easier.

我意识到有时候你需要进行优化,我所说的只是将其写入未经优化,测试并重新编码。它真的不会花费你更长的时间 - 甚至可以使编写优化的代码更容易。

The only reason I'm posting this is because almost every line you've written concerns performance, and I'm worried that the next person to see your code is going to be some poor sap like me.

我发布这个的唯一原因是因为你写的几乎每一行都涉及到性能,我担心下一个看到你代码的人会像我一样陷入困境。

#6


0  

How about somthing like this example here. This example easely sorts a part or whatever you want without having to redefine a lot of extra memory. It takes inly two pointers a status bit and a counter for the for loop.

这个例子怎么样?这个例子可以轻松地对一部分或任何你想要的东西进行分类,而无需重新定义大量的额外内存。它需要两个指针状态位和一个for循环计数器。

#define _PRINT_PROGRESS
#define N 10
BYTE buff[N]={4,5,2,1,3,5,8,6,4,3};
BYTE *a = buff;
BYTE *b = buff;
BYTE changed = 0;
int main(void)
{
    BYTE n=0;
    do
    {
        b++;
        changed = 0;
        for(n=0;n<(N-1);n++)
        {
            if(*a > *b)
            {
                *a ^= *b;
                *b ^= *a;
                *a ^= *b;
                changed = 1;
            }
            a++;
            b++;
        }
        a = buff;
        b = buff;
#ifdef _PRINT_PROGRESS
        for(n=0;n<N;n++)
            printf("%d",buff[n]);
        printf("\n");
    }
#endif
    while(changed);
    system( "pause" );
}