What is the best way to handle and report memory allocation errors due to integer overflow in Objective-C?

Time: 2022-09-02 10:45:47

To begin with, let me say that I understand how and why the problem I'm describing can happen. I was a Computer Science major, and I understand overflow/underflow and signed/unsigned arithmetic. (For those unfamiliar with the topic, Apple's Secure Coding Guide discusses integer overflow briefly.)

My question is about reporting and recovering from such an error once it has been detected, and more specifically in the case of an Objective-C framework. (I write and maintain CHDataStructures.) I have a few collections classes that allocate memory for storing objects and dynamically expand as necessary. I haven't yet seen any overflow-related crashes, probably because my test cases mostly use sane data. However, given unvalidated values, things could explode rather quickly, and I want to prevent that.

I have identified at least two common cases where this can occur:

  1. The caller passes a very large unsigned value (or negative signed value) to -initWithCapacity:.

  2. Enough objects have been added to cause the capacity to dynamically expand, and the capacity has grown large enough to cause overflow.

The easy part is detecting whether overflow will occur. (For example, before attempting to allocate length * sizeof(void*) bytes, I can check whether length <= UINT_MAX / sizeof(void*), since failing this test will mean that the product will overflow and potentially allocate a much smaller region of memory than desired. On platforms that support it, the checkint.h API is another alternative.) The harder part is determining how to deal with it gracefully. In the first scenario, the caller is perhaps better equipped (or at least in the mindset) to deal with a failure. The second scenario can happen anywhere in the code that an object is added to the collection, which may be quite non-deterministic.
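
As a rough illustration of the detection step described above, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). The helper name ch_alloc_slots is hypothetical and not part of CHDataStructures. It compares against SIZE_MAX rather than UINT_MAX, on the assumption that the byte count is handed to malloc as a size_t.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for `count` object pointers.
 * Returns NULL if count * sizeof(void *) would overflow size_t,
 * or if malloc itself fails. */
static void **ch_alloc_slots(size_t count) {
    if (count > SIZE_MAX / sizeof(void *)) {
        return NULL; /* the product would wrap around */
    }
    return malloc(count * sizeof(void *));
}
```

The caller still has to decide what a NULL result means, which is the harder part of the question.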

My question, then, is this: How is "good citizen" Objective-C code expected to act when integer overflow occurs in this type of situation? (Ideally, since my project is a framework in the same spirit as Foundation in Cocoa, I'd like to model off of the way it behaves for maximum "impedance matching". The Apple documentation I've found doesn't mention much at all about this.) I figure that in any case, reporting the error is a given. Since the APIs to add an object (which could cause scenario 2) don't accept an error parameter, what can I really do to help resolve the problem, if anything? What is really considered okay in such situations? I'm loath to knowingly write crash-prone code if I can do better...

5 answers

#1


3  

There are two issues at hand:

(1) An allocation has failed and you are out of memory.

(2) You have detected an overflow or other erroneous condition that will lead to (1) if you continue.

In the case of (1), you are hosed (unless the failed allocation was stupidly large and you know that it was the only one that failed). If this happens, the best thing you can do is to crash as quickly as possible and leave behind as much evidence as you can. In particular, creating a function with a name like IAmCrashingOnPurposeBecauseYourMemoryIsDepleted() that calls abort() will leave evidence in the crash log.
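
A minimal sketch of that idea in plain C might look like the following. The wrapper name ch_malloc_or_die and the log message are assumptions made for illustration; only the descriptive function name comes from the text above.

```c
#include <stdio.h>
#include <stdlib.h>

/* The function name is what appears in the crash log's backtrace,
 * so it is deliberately descriptive. */
void IAmCrashingOnPurposeBecauseYourMemoryIsDepleted(size_t bytes) {
    fprintf(stderr, "Out of memory: failed to allocate %zu bytes\n", bytes);
    abort();
}

/* Hypothetical wrapper: returns the result of malloc, or does not
 * return at all if a non-empty request fails. */
void *ch_malloc_or_die(size_t bytes) {
    void *p = malloc(bytes);
    if (p == NULL && bytes != 0) {
        IAmCrashingOnPurposeBecauseYourMemoryIsDepleted(bytes);
    }
    return p;
}
```

An optimizing compiler may inline the crashing function, in which case its name would not show up in the backtrace; marking it noinline (a Clang/GCC attribute) is one way to guard against that.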

If it is really (2), then there are additional questions. Specifically, can you recover from the situation and, regardless, is the user's data still intact? If you can recover, then grand... do so and the user never has to know. If not, then you need to make absolutely sure that the user's data is not corrupt. If it isn't, then save and die. If the user's data is corrupt, then do your best to not persist the corrupted data and let the user know that something has gone horribly wrong. If the user's data is already persisted, but corrupt, then... well... ouch... you might want to consider creating a recovery tool of some kind.

#2


4  

Log and raise an exception.

You can only really be a good citizen to other programmers, not the end user, so pass the problem upstairs and do it in a way that clearly explains what is going on, what the problem is (give numbers) and where it is happening so the root cause can be removed.
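
To make "give numbers" concrete, here is a small sketch in plain C that builds such a message. The helper name ch_describe_overflow and the message layout are hypothetical. In Objective-C the resulting text could serve as the reason when raising an exception (for example through +[NSException raise:format:]); which exception name fits best is a judgment call.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: describe a rejected capacity request, including
 * the numbers involved and where it was detected. Returns snprintf's
 * result, i.e. the length the full message needs. */
static int ch_describe_overflow(char *buf, size_t buflen, const char *where,
                                size_t requested, size_t maximum) {
    return snprintf(buf, buflen,
                    "[%s] requested capacity %zu exceeds maximum %zu",
                    where, requested, maximum);
}
```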

#3


3  

With regard to dynamically growing, array-based storage, there's only so much that can be done. I'm a developer on the Moab scheduler for supercomputers, and we also deal with very large numbers on systems with thousands of processors, thousands of jobs, and massive amounts of job output. At some point, you can't declare a buffer to be any bigger without creating a whole new data type to deal with sizes larger than UINT_MAX, LONG_LONG_MAX, etc., at which point on most "normal" machines you'll be running out of stack/heap space anyway. So I'd say log a meaningful error message and keep the collection from exploding. If the user needs to add that many things to a CHDataStructures collection, they ought to know that there are issues dealing with very large numbers, and the caller ought to check whether the add was successful (keep track of the size of the collection, etc.).
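
One way to picture "keep the collection from exploding" while letting the caller check for success is the sketch below, in plain C. The ChArray type and ch_array_add function are hypothetical and do not reflect the actual CHDataStructures layout; the doubling growth policy and the initial capacity of 8 are also assumptions.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable pointer array. */
typedef struct {
    void **items;
    size_t count;
    size_t capacity;
} ChArray;

/* Appends `obj`. Returns false, leaving the array untouched, if growing
 * would overflow size_t or if realloc fails. */
static bool ch_array_add(ChArray *a, void *obj) {
    if (a->count == a->capacity) {
        size_t max_slots = SIZE_MAX / sizeof(void *);
        if (a->capacity > max_slots / 2) {
            return false; /* doubling would overflow the byte count */
        }
        size_t new_cap = a->capacity ? a->capacity * 2 : 8;
        void **grown = realloc(a->items, new_cap * sizeof(void *));
        if (grown == NULL) {
            return false; /* the original buffer is still valid */
        }
        a->items = grown;
        a->capacity = new_cap;
    }
    a->items[a->count++] = obj;
    return true;
}
```

Returning a flag differs from the Foundation-style -addObject: signature, which returns nothing, so in a framework modeled on Cocoa this check would more likely feed a log message or an exception than a return value.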

Another possibility is to convert array-based storage to dynamically allocated, linked-list-based storage when you get to the point when you can't allocate a larger array with an unsigned int or unsigned long. This would be expensive, but would happen rarely enough that it shouldn't be terribly noticeable to users of the framework. Since the limit on the size of a dynamically allocated, linked-list-based collection is the size of the heap, any user that added enough items to a collection to "overflow" it then would have bigger problems than whether or not his item was successfully added.

#4


1  

I'd say the correct thing to do would be to do what the Cocoa collections do. For example, if I have the following code:

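#import <Foundation/Foundation.h>
#include <limits.h>

int main (int argc, const char * argv[]) {
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

    NSMutableArray * a = [[NSMutableArray alloc] init];

    // Keep adding objects in batches until the process runs out of memory.
    // The outer counter has its own name so the inner loop does not shadow it.
    for (unsigned long round = 0; round < ULONG_MAX; ++round) {
        for (uint32_t i = 0; i < 10000000; ++i) {
            [a addObject:@"foo"];
        }
        NSLog(@"%lu rounds of 10,000,000 completed", round + 1);
    }

    [a release];

    [pool drain];
    return 0;
}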

...and just let it run, it will eventually die with EXC_BAD_ACCESS. (I compiled and ran this as a 32-bit app so I could be sure to run out of space when I hit 2**32 objects.)

In other words, throwing an exception would be nice, but I don't think you really have to do anything.

#5


0  

Using assertions and a custom assertion handler may be the best available option for you.

With assertions, you could easily have many checkpoints in your code, where you verify that things work as they should. If they don't, by default the assertion macro logs the error (developer-defined string), and throws an exception. You can also override the default behavior using a custom assertion handler and implement a different way to handle error conditions (even avoid throwing exceptions).
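
The pattern of an assertion macro with a replaceable handler can be sketched in plain C as follows. This only mirrors the idea behind Foundation's NSAssert and NSAssertionHandler; every name here (CH_ASSERT, ch_set_assert_handler, ch_counting_handler, and so on) is hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical handler type: receives the failed expression and location. */
typedef void (*ch_assert_handler)(const char *expr, const char *file, int line);

/* Default policy: log the failure and abort. */
static void ch_default_handler(const char *expr, const char *file, int line) {
    fprintf(stderr, "Assertion failed: %s (%s:%d)\n", expr, file, line);
    abort();
}

static ch_assert_handler ch_current_handler = ch_default_handler;

/* Install a different policy. Passing NULL restores the default. */
static void ch_set_assert_handler(ch_assert_handler h) {
    ch_current_handler = h ? h : ch_default_handler;
}

/* Alternative policy for illustration: count failures instead of aborting. */
static int ch_failure_count = 0;
static void ch_counting_handler(const char *expr, const char *file, int line) {
    (void)expr; (void)file; (void)line;
    ++ch_failure_count;
}

#define CH_ASSERT(cond) \
    do { \
        if (!(cond)) ch_current_handler(#cond, __FILE__, __LINE__); \
    } while (0)
```

With this shape, a check such as CH_ASSERT(count <= SIZE_MAX / sizeof(void *)) can sit at each allocation site, and the decision between crashing, logging, or recovering lives in one place.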

This approach allows for a greater degree of flexibility and you can easily modify your error handling strategy (throwing exceptions vs. dealing with errors internally) at any point.

The documentation is very concise: Assertions and Logging.
