Large multidimensional array (jagged array) in C#: is there a workaround?

Date: 2020-12-03 21:35:44

I'm trying to initialize a three-dimensional array to load a voxel world.

The total size of the map should be (2048/1024/2048). I tried to initialize a jagged array of "int", but an out-of-memory exception is thrown. What is the size limit? Size of my table: 2048 * 1024 * 2048 = 4'294'967'296

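As a rough check of the arithmetic above, the sketch below computes the number of cells and the raw payload size for a grid of this shape. The class and method names (`VoxelMapSize`, `ElementCount`, `PayloadBytes`) are made up for illustration, and object headers and other runtime overhead are ignored.

```csharp
using System;

static class VoxelMapSize
{
    // Total number of cells in an x * y * z grid, computed in 64-bit
    // arithmetic so the product cannot overflow an int.
    public static long ElementCount(int x, int y, int z) => (long)x * y * z;

    // Raw payload in bytes for a given element size (4 for int, 1 for byte).
    // Object headers and per-array overhead are deliberately left out.
    public static long PayloadBytes(int x, int y, int z, int bytesPerElement)
        => ElementCount(x, y, z) * bytesPerElement;
}
```

For 2048 x 1024 x 2048 this gives 2^32 = 4'294'967'296 cells, and `PayloadBytes(2048, 1024, 2048, sizeof(int))` gives 17'179'869'184 bytes, that is 16 GiB of element data alone.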

Does anyone know a way around this problem?

// System.OutOfMemoryException is thrown here!
int[][][] matrice = CreateJaggedArray<int[][][]>(2048, 1024, 2048);
// A normal multidimensional initialization also throws the exception
int[,,] matrice2 = new int[2048, 1024, 2048];

    static T CreateJaggedArray<T>(params int[] lengths)
    {
        return (T)InitializeJaggedArray(typeof(T).GetElementType(), 0, lengths);
    }

    static object InitializeJaggedArray(Type type, int index, int[] lengths)
    {
        Array array = Array.CreateInstance(type, lengths[index]);
        Type elementType = type.GetElementType();

        if (elementType != null)
        {
            for (int i = 0; i < lengths[index]; i++)
            {
                array.SetValue(
                    InitializeJaggedArray(elementType, index + 1, lengths), i);
            }
        }

        return array;
    }

4 Answers

#1


4  

The maximum size of a single object in C# is 2 GB. Because you are creating a multidimensional array rather than a jagged array (despite the name of your method), it is a single object that has to hold all of those items, not several objects. If you actually used a jagged array, no single object would hold all of that data (the total memory footprint would be a tad larger, not smaller; it is just spread across more objects).

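To make the "spread out more" point concrete, here is a small sketch that counts how many separate innermost arrays a jagged `int[x][y][z]` consists of and how much element data each one holds. The names `JaggedLayout`, `InnerArrayCount` and `InnerArrayBytes` are hypothetical, and per-object headers are not included.

```csharp
static class JaggedLayout
{
    // Number of innermost int[] rows in a jagged int[x][y][z]:
    // one separate object per (x, y) pair.
    public static long InnerArrayCount(int x, int y) => (long)x * y;

    // Element payload of one innermost row in bytes (header not included).
    public static long InnerArrayBytes(int z) => (long)z * sizeof(int);
}
```

For the map in the question this means 2'097'152 separate rows of 8'192 bytes each. Every row is far below the per-object limit, but the rows still add up to the same 16 GiB of element data.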

#2


3  

From the MSDN documentation on arrays (emphasis added):

By default, the maximum size of an Array is 2 gigabytes (GB). In a 64-bit environment, you can avoid the size restriction by setting the enabled attribute of the gcAllowVeryLargeObjects configuration element to true in the run-time environment. However, the array will still be limited to a total of 4 billion elements, and to a maximum index of 0X7FEFFFFF in any given dimension (0X7FFFFFC7 for byte arrays and arrays of single-byte structures).

So, despite the answers above, even if you set the flag to allow larger objects, the array is still limited by the 32-bit cap on the number of elements.

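The two limits quoted above can be checked against the requested map with a few lines of arithmetic. This is only a sketch: `ArrayLimits` and its members are invented names, the default limit is approximated as exactly 2^31 bytes, and the element cap is assumed to be `UInt32.MaxValue`, which is how the "32-bit limit" is read here.

```csharp
using System;

static class ArrayLimits
{
    // Assumption: the default 2 GB per-object limit, approximated as 2^31 bytes.
    public const long DefaultMaxObjectBytes = 2L * 1024 * 1024 * 1024;

    // Assumption: with gcAllowVeryLargeObjects enabled, the total element
    // count is still capped at UInt32.MaxValue.
    public const long MaxElements = uint.MaxValue;

    // True if the element data alone stays below the default object limit.
    public static bool FitsDefaultLimit(long elements, int bytesPerElement)
        => elements * bytesPerElement < DefaultMaxObjectBytes;

    // True if the element count stays within the 32-bit element cap.
    public static bool FitsElementCap(long elements) => elements <= MaxElements;
}
```

With 2048 * 1024 * 2048 = 2^32 elements, both checks return false: the data is far larger than 2 GB, and the element count exceeds the assumed cap by exactly one element.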

EDIT: You'll likely have to redesign to eliminate the need for the multidimensional array as you're currently using it (as others have suggested, there are a few ways to do this, such as actual jagged arrays or some other collection per dimension). Given the number of elements, it may be best to use a design that allocates objects/memory dynamically as they are used, instead of arrays that have to pre-allocate everything (unless you don't mind using many gigabytes of memory).

EDITx2: That is, perhaps you can define data structures that describe only the filled content rather than every possible voxel in the world, including the "empty" ones. (I'm assuming the vast majority of voxels are "empty" rather than "filled".)

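One possible reading of "define only the filled content" is a dictionary keyed by the packed voxel coordinate. The sketch below is an illustration under stated assumptions, not the answerer's design: the class name `SparseVoxelWorld` is made up, the value 0 is assumed to mean "empty", and the map dimensions are hard-coded to those in the question.

```csharp
using System;
using System.Collections.Generic;

// Stores only non-empty voxels; any coordinate never written reads as 0.
class SparseVoxelWorld
{
    const int SizeX = 2048, SizeY = 1024, SizeZ = 2048;
    readonly Dictionary<long, int> filled = new Dictionary<long, int>();

    // Packs (x, y, z) into one key: z in bits 0-10, y in bits 11-20, x above.
    static long Key(int x, int y, int z) => ((long)x << 21) | ((long)y << 11) | (long)z;

    // Number of voxels that currently hold a non-zero value.
    public int Count => filled.Count;

    public int this[int x, int y, int z]
    {
        get
        {
            CheckBounds(x, y, z);
            return filled.TryGetValue(Key(x, y, z), out int value) ? value : 0;
        }
        set
        {
            CheckBounds(x, y, z);
            if (value == 0) filled.Remove(Key(x, y, z)); // writing "empty" frees the entry
            else filled[Key(x, y, z)] = value;
        }
    }

    static void CheckBounds(int x, int y, int z)
    {
        // The unsigned casts reject negative coordinates in the same comparison.
        if ((uint)x >= SizeX || (uint)y >= SizeY || (uint)z >= SizeZ)
            throw new ArgumentOutOfRangeException();
    }
}
```

Memory use then scales with the number of filled voxels rather than with the volume of the world, at the cost of a hash lookup per access.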

EDIT: Although it is not trivial, your best bet, especially if most of the space is considered "empty", would be to introduce some sort of spatial tree that lets you efficiently query your world to see which objects are in a particular area, for example octrees (as Eric suggested) or R-trees.

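As a very small illustration of the octree idea, the sketch below stores individual voxels in a tree that only allocates nodes along paths to written cells. It is a simplification under stated assumptions: the name `SparseOctree` is invented, the tree covers a cube whose side is a power of two (a side of 2048 would cover the 2048 x 1024 x 2048 map), unwritten voxels read as 0, and it neither merges uniform regions nor supports area queries.

```csharp
using System;

// Minimal sparse octree over a cube of side Size (a power of two).
class SparseOctree
{
    sealed class Node
    {
        public Node[] Children; // null at leaves; otherwise 8 slots, null = empty octant
        public int Value;       // meaningful only at leaves (cubes of side 1)
    }

    public int Size { get; }
    Node root;

    public SparseOctree(int size)
    {
        if (size <= 0 || (size & (size - 1)) != 0)
            throw new ArgumentException("size must be a power of two", nameof(size));
        Size = size;
    }

    // Index (0..7) of the child octant containing (x, y, z) at the level
    // whose child cubes have side 'half'.
    static int Octant(int x, int y, int z, int half)
        => ((x & half) != 0 ? 1 : 0) | ((y & half) != 0 ? 2 : 0) | ((z & half) != 0 ? 4 : 0);

    public void Set(int x, int y, int z, int value)
    {
        CheckBounds(x, y, z);
        if (root == null) root = new Node();
        Node node = root;
        // Descend one level per halving, creating missing nodes on the way.
        for (int half = Size / 2; half >= 1; half /= 2)
        {
            if (node.Children == null) node.Children = new Node[8];
            int i = Octant(x, y, z, half);
            if (node.Children[i] == null) node.Children[i] = new Node();
            node = node.Children[i];
        }
        node.Value = value;
    }

    public int Get(int x, int y, int z)
    {
        CheckBounds(x, y, z);
        Node node = root;
        // Stop early as soon as the path runs into an unallocated octant.
        for (int half = Size / 2; half >= 1 && node != null; half /= 2)
            node = node.Children?[Octant(x, y, z, half)];
        return node?.Value ?? 0;
    }

    void CheckBounds(int x, int y, int z)
    {
        if ((uint)x >= (uint)Size || (uint)y >= (uint)Size || (uint)z >= (uint)Size)
            throw new ArgumentOutOfRangeException();
    }
}
```

A production octree would usually collapse octants whose voxels all share one value, which is where most of the memory savings for large empty regions come from.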

#3


3  

Thank you so much to everyone who tried to help me understand and solve my problem.

I tried several solutions for loading a large amount of data and storing it in a table. After two days, here are my tests and, finally, the solution that can store 4'294'967'296 entries in one array.

I'm adding my final solution in the hope that it helps someone.

The goal

To recap the goal: initialize an integer array of size [2048/1024/2048] to store 4'294'967'296 values.


Test 1: with JaggedArray method (failure)


A System.OutOfMemoryException is thrown.

            /* ******************** */
            /* Jagged Array method  */
            /* ******************** */

            // allocate the first dimension
            int[][][] bigData = new int[2048][][];
            for (int x = 0; x < 2048; x++)
            {
                // allocate the second dimension;
                bigData[x] = new int[1024][];
                for (int y = 0; y < 1024; y++)
                {
                    // the last dimension allocation
                    bigData[x][y] = new int[2048];
                }
            }

Test 2: with List method (failure)


A System.OutOfMemoryException is thrown. The idea was to divide the big array into several smaller arrays. Unfortunately it does not work: I assumed that "List<>" allows a maximum of 2 GB of RAM, like a simple array. Note, however, that each chunk here is only 32 MiB (256 * 128 * 256 ints) and the list itself holds just 512 references, so the 16 GiB total is the more likely cause.

        /* ******************** */
        /* List method          */
        /* ******************** */

        List<int[,,]> bigData = new List<int[,,]>(512);
        for (int a = 0; a < 512; a++)
        {
            bigData.Add(new int[256, 128, 256]);
        }

Test 3: with MemoryMappedFile (Solution)


I finally found the solution! Use the MemoryMappedFile class, which maps the contents of a file into virtual memory.

See MemoryMappedFile on MSDN. I use it with a custom class that I found on CodeProject. The initialization takes a long time, but it works well!

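The CodeProject class used in this test is not reproduced in the thread. As a rough idea of what such a wrapper might look like, here is a minimal sketch built directly on the built-in `System.IO.MemoryMappedFiles` API. It is not the `GenericMemoryMappedArray<T>` class itself: `MappedIntArray` and `Flatten` are hypothetical names, the backing file is created or overwritten at the given path, and mapping the full 16 GiB in one view is assumed to require a 64-bit process.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// A flat array of ints backed by a file on disk instead of the managed heap.
sealed class MappedIntArray : IDisposable
{
    readonly MemoryMappedFile file;
    readonly MemoryMappedViewAccessor view;

    public long Length { get; }

    public MappedIntArray(string path, long length)
    {
        Length = length;
        // Creates (or overwrites) the backing file with room for 'length' ints.
        file = MemoryMappedFile.CreateFromFile(path, FileMode.Create, null, length * sizeof(int));
        view = file.CreateViewAccessor();
    }

    public int this[long index]
    {
        get { CheckIndex(index); return view.ReadInt32(index * sizeof(int)); }
        set { CheckIndex(index); view.Write(index * sizeof(int), value); }
    }

    // Row-major flattening of (x, y, z) for a grid with the given Y and Z extents.
    public static long Flatten(int x, int y, int z, int sizeY, int sizeZ)
        => ((long)x * sizeY + y) * sizeZ + z;

    void CheckIndex(long index)
    {
        if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
    }

    public void Dispose()
    {
        view.Dispose();
        file.Dispose();
    }
}
```

With this sketch, a voxel at (x, y, z) would be addressed as `array[MappedIntArray.Flatten(x, y, z, 1024, 2048)]`. The operating system pages parts of the file in and out on demand, so the data does not need to fit in RAM at once.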

        /* ************************ */
        /* MemoryMappedFile method  */
        /* ************************ */

        string path = AppDomain.CurrentDomain.BaseDirectory;            
        var myList = new GenericMemoryMappedArray<int>(2048L*1024L*2048L, path); 
        using (myList)
        {
            myList.AutoGrow = false;

            /*
            // 'a' must be a long: the upper bound (2^32) does not fit in an int
            for (long a = 0; a < (2048L * 1024L * 2048L); a++)
            {
                myList[a] = (int)a;
            }
            */

            myList[12456] = 8;
            myList[1939848234] = 1;
            // etc...
        }

#4


1  

Creating this object as described, either as a standard array or as a jagged array, is going to destroy the locality of reference that your CPU relies on for performance. I recommend a structure like this instead:

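class BigArray
{
    // 2048/64 x 1024/64 x 2048/64 cells; a cell stays null until first written.
    ArrayCell[,,] arrayCell = new ArrayCell[32, 16, 32];

    public int this[int i, int j, int k]
    {
        get
        {
            ArrayCell cell = arrayCell[i / 64, j / 64, k / 64];
            return cell == null ? 0 : cell[i % 64, j % 64, k % 64];
        }
        set
        {
            ArrayCell cell = arrayCell[i / 64, j / 64, k / 64];
            if (cell == null)
                arrayCell[i / 64, j / 64, k / 64] = cell = new ArrayCell();
            cell[i % 64, j % 64, k % 64] = value;
        }
    }
}


class ArrayCell
{
    // One 64 x 64 x 64 block (1 MiB of ints) stored contiguously.
    int[,,] cell = new int[64, 64, 64];

    public int this[int i, int j, int k]
    {
        get { return cell[i, j, k]; }
        set { cell[i, j, k] = value; }
    }
}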
