Best way to split an array

Date: 2022-08-25 19:43:27

Afternoon. I need to find out the best way to split an array into smaller "chunks".

I am passing in about 1200 items and need to split them into easier-to-handle groups of 100, and then pass each group on to be processed.

Could anyone please make some suggestions?

9 answers

#1


17  

You can use LINQ to group the items by chunk index and then create a new array from each group.

// build sample data with 1200 Strings
string[] items = Enumerable.Range(1, 1200).Select(i => "Item" + i).ToArray();
// split on groups with each 100 items
String[][] chunks = items
                    .Select((s, i) => new { Value = s, Index = i })
                    .GroupBy(x => x.Index / 100)
                    .Select(grp => grp.Select(x => x.Value).ToArray())
                    .ToArray();

for (int i = 0; i < chunks.Length; i++)
{
    foreach (var item in chunks[i])
        Console.WriteLine("chunk:{0} {1}", i, item);
}

Note that it's not necessary to create new arrays (that costs CPU cycles and memory). If you omit the two ToArray calls, you get an IEnumerable<IEnumerable<String>> instead.

Here's the running code: http://ideone.com/K7Hn2
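To illustrate the remark about leaving out the ToArray calls, here is a minimal sketch of the same grouping that returns the sequences directly. The class name LazyChunks and the method name Split are made up for this example and are not part of the original answer. Nothing runs until the result is enumerated, and GroupBy then reads the whole source once to build its groups, so this saves the array copies rather than the grouping work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyChunks
{
    // Hypothetical helper: same grouping as in the answer, without the two ToArray() calls.
    public static IEnumerable<IEnumerable<string>> Split(IEnumerable<string> items, int chunkSize)
    {
        return items
            .Select((s, i) => new { Value = s, Index = i }) // pair each item with its position
            .GroupBy(x => x.Index / chunkSize)              // integer division gives the chunk number
            .Select(grp => grp.Select(x => x.Value));       // drop the index again, keep the values
    }
}
```

It would be called as LazyChunks.Split(items, 100) and consumed with nested foreach loops.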

#2


44  

Array.Copy has been around since .NET 1.1 and does an excellent job of chunking arrays.

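string[] buffer;

for (int i = 0; i < source.Length; i += 100)
{
    // the last chunk may hold fewer than 100 items
    int length = Math.Min(100, source.Length - i);
    buffer = new string[length];
    Array.Copy(source, i, buffer, 0, length);
    // process array
}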

And to make an extension method for it:

public static class Extensions
{
    public static T[] Slice<T>(this T[] source, int index, int length)
    {       
        T[] slice = new T[length];
        Array.Copy(source, index, slice, 0, length);
        return slice;
    }
}

And to use the extension:

string[] source = new string[] { 1200 items here };

// get the first 100
string[] slice = source.Slice(0, 100);
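Building on the same Array.Copy idea, the whole array could be split in one call. This is only a sketch: the class name ArrayChunkExtensions and the method name SplitIntoChunks are assumptions for illustration, not something from the answer. It sizes the final chunk to whatever is left over, so the source length does not have to be a multiple of the chunk size.

```csharp
using System;

public static class ArrayChunkExtensions
{
    // Hypothetical helper: splits source into arrays of at most chunkSize items.
    public static T[][] SplitIntoChunks<T>(this T[] source, int chunkSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        // Round up so a partial final chunk gets its own slot.
        int chunkCount = (source.Length + chunkSize - 1) / chunkSize;
        T[][] chunks = new T[chunkCount][];

        for (int c = 0; c < chunkCount; c++)
        {
            int offset = c * chunkSize;
            // The last chunk may be shorter than chunkSize.
            int length = Math.Min(chunkSize, source.Length - offset);
            chunks[c] = new T[length];
            Array.Copy(source, offset, chunks[c], 0, length);
        }
        return chunks;
    }
}
```

With 1200 items, source.SplitIntoChunks(100) would give 12 arrays of 100 items each.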

Update: I think you might want ArraySegment<>. There is no need for performance checks, because it simply uses the original array as its source and maintains Offset and Count properties to determine the 'segment'. Unfortunately, there isn't a way to retrieve just the segment as an array, so some folks have written wrappers for it, like here: ArraySegment - Returning the actual segment C#

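ArraySegment<string> segment;

for (int i = 0; i < source.Length; i += 100)
{
    // clamp the count so the last segment does not run past the end of the array
    segment = new ArraySegment<string>(source, i, Math.Min(100, source.Length - i));

    // and to loop through the segment only (Offset up to Offset + Count)
    for (int s = segment.Offset; s < segment.Offset + segment.Count; s++)
    {
        Console.WriteLine(segment.Array[s]);
    }
}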
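A wrapper of the kind mentioned above could look roughly like the following. This is a sketch, not the code from the linked question: the class name ArraySegmentExtensions and the method name ToSegmentArray are made up here, and the name was chosen to avoid clashing with the built-in ToArray() that, I believe, newer .NET versions provide on ArraySegment<T>.

```csharp
using System;

public static class ArraySegmentExtensions
{
    // Hypothetical helper: copies only the items covered by the segment into a new array.
    public static T[] ToSegmentArray<T>(this ArraySegment<T> segment)
    {
        T[] result = new T[segment.Count];
        // A default(ArraySegment<T>) has a null Array and a Count of 0, so skip the copy then.
        if (segment.Count > 0)
        {
            Array.Copy(segment.Array, segment.Offset, result, 0, segment.Count);
        }
        return result;
    }
}
```

The copy brings back the allocation cost that ArraySegment avoids, so it only makes sense when an API really requires a standalone array.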

Performance of Array.Copy vs Skip/Take vs LINQ

Test method (in Release mode):

static void Main(string[] args)
{
    string[] source = new string[1000000];
    for (int i = 0; i < source.Length; i++)
    {
        source[i] = "string " + i.ToString();
    }

    string[] buffer;

    Console.WriteLine("Starting stop watch");

    Stopwatch sw = new Stopwatch();

    for (int n = 0; n < 5; n++)
    {
        sw.Reset();
        sw.Start();
        for (int i = 0; i < source.Length; i += 100)
        {
            buffer = new string[100];
            Array.Copy(source, i, buffer, 0, 100);
        }

        sw.Stop();
        Console.WriteLine("Array.Copy: " + sw.ElapsedMilliseconds.ToString());

        sw.Reset();
        sw.Start();
        for (int i = 0; i < source.Length; i += 100)
        {
            buffer = new string[100];
            buffer = source.Skip(i).Take(100).ToArray();
        }
        sw.Stop();
        Console.WriteLine("Skip/Take: " + sw.ElapsedMilliseconds.ToString());

        sw.Reset();
        sw.Start();
        String[][] chunks = source                            
            .Select((s, i) => new { Value = s, Index = i })                            
            .GroupBy(x => x.Index / 100)                            
            .Select(grp => grp.Select(x => x.Value).ToArray())                            
            .ToArray();
        sw.Stop();
        Console.WriteLine("LINQ: " + sw.ElapsedMilliseconds.ToString());
    }
    Console.ReadLine();
}

Results (in milliseconds):

Array.Copy:    15
Skip/Take:  42464
LINQ:         881

Array.Copy:    21
Skip/Take:  42284
LINQ:         585

Array.Copy:    11
Skip/Take:  43223
LINQ:         760

Array.Copy:     9
Skip/Take:  42842
LINQ:         525

Array.Copy:    24
Skip/Take:  43134
LINQ:         638

#3


10  

You can use Skip() and Take():

string[] items = new string[]{ "a", "b", "c"};
string[] chunk = items.Skip(1).Take(1).ToArray();

#4


7  

    string[] amzProductAsins = GetProductAsin();
    List<string[]> chunks = new List<string[]>();
    // arrays expose Length, not Count
    for (int i = 0; i < amzProductAsins.Length; i += 100)
    {
        chunks.Add(amzProductAsins.Skip(i).Take(100).ToArray());
    }

#5


6  

Here is another LINQ solution I found:

int[] source = new[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
int i = 0;
int chunkSize = 3;
int[][] result = source.GroupBy(s => i++ / chunkSize).Select(g => g.ToArray()).ToArray();

//result = [1,2,3][4,5,6][7,8,9]

#6


1  

You can use List<T>.GetRange:

for(var i = 0; i < source.Count; i += chunkSize)
{
    List<string> items = source.GetRange(i, Math.Min(chunkSize, source.Count - i));
}

Although not as fast as Array.Copy, I think it looks cleaner:

var list = Enumerable.Range(0, 723748).ToList();

var stopwatch = new Stopwatch();

for (int n = 0; n < 5; n++)
{
    stopwatch.Reset();
    stopwatch.Start();
    for(int i = 0; i < list.Count; i += 100)
    {
        List<int> c = list.GetRange(i, Math.Min(100, list.Count - i));
    }
    stopwatch.Stop();
    Console.WriteLine("List<T>.GetRange: " + stopwatch.ElapsedMilliseconds.ToString());

    stopwatch.Reset();
    stopwatch.Start();
    for (int i = 0; i < list.Count; i += 100)
    {
        List<int> c = list.Skip(i).Take(100).ToList();
    }
    stopwatch.Stop();
    Console.WriteLine("Skip/Take: " + stopwatch.ElapsedMilliseconds.ToString());

    stopwatch.Reset();
    stopwatch.Start();
    var test = list.ToArray();
    for (int i = 0; i < list.Count; i += 100)
    {
        int length = Math.Min(100, list.Count - i);
        int[] c = new int[length];
        Array.Copy(test, i, c, 0, length);
    }
    stopwatch.Stop();
    Console.WriteLine("Array.Copy: " + stopwatch.ElapsedMilliseconds.ToString());

    stopwatch.Reset();
    stopwatch.Start();
    List<List<int>> chunks = list
        .Select((s, i) => new { Value = s, Index = i })
        .GroupBy(x => x.Index / 100)
        .Select(grp => grp.Select(x => x.Value).ToList())
        .ToList();
    stopwatch.Stop();
    Console.WriteLine("LINQ: " + stopwatch.ElapsedMilliseconds.ToString());
}

Results in milliseconds:

List<T>.GetRange: 1
Skip/Take: 9820
Array.Copy: 1
LINQ: 161

List<T>.GetRange: 9
Skip/Take: 9237
Array.Copy: 1
LINQ: 148

List<T>.GetRange: 5
Skip/Take: 9470
Array.Copy: 1
LINQ: 186

List<T>.GetRange: 0
Skip/Take: 9498
Array.Copy: 1
LINQ: 110

List<T>.GetRange: 8
Skip/Take: 9717
Array.Copy: 1
LINQ: 148

#7


0  

With LINQ, you can use the Take() and Skip() methods.

#8


0  

General recursive extension method:

    public static IEnumerable<IEnumerable<T>> SplitList<T>(this IEnumerable<T> source, int maxPerList)
    {
        var enumerable = source as IList<T> ?? source.ToList();
        if (!enumerable.Any())
        {
            return new List<IEnumerable<T>>();
        }
        return (new List<IEnumerable<T>>() { enumerable.Take(maxPerList) }).Concat(enumerable.Skip(maxPerList).SplitList<T>(maxPerList));
    }
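For comparison, the same splitting can be done without recursion in a single pass over the source. The sketch below is an alternative, not the method above rewritten: the names ChunkExtensions and SplitListIterative are assumptions for illustration. Each item is visited once, whereas repeated Skip calls walk the earlier items again for every chunk. On .NET 6 and later, the built-in Enumerable.Chunk method covers the same need.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkExtensions
{
    // Hypothetical helper: yields lists of at most maxPerList items in one pass.
    public static IEnumerable<List<T>> SplitListIterative<T>(this IEnumerable<T> source, int maxPerList)
    {
        // Validate eagerly; the iterator below only runs once enumeration starts.
        if (source == null) throw new ArgumentNullException("source");
        if (maxPerList <= 0) throw new ArgumentOutOfRangeException("maxPerList");
        return Iterate(source, maxPerList);
    }

    private static IEnumerable<List<T>> Iterate<T>(IEnumerable<T> source, int maxPerList)
    {
        List<T> current = new List<T>(maxPerList);
        foreach (T item in source)
        {
            current.Add(item);
            if (current.Count == maxPerList)
            {
                yield return current;
                current = new List<T>(maxPerList);
            }
        }
        // Emit the shorter final chunk, if any.
        if (current.Count > 0)
        {
            yield return current;
        }
    }
}
```

Usage would be foreach (var chunk in items.SplitListIterative(100)) { ... }, with each chunk handed to the processing step as it is produced.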

#9


-3  

    public static string[] SplitArrey(string[] ArrInput, int n_column)
    {

        string[] OutPut = new string[n_column];
        int NItem = ArrInput.Length; // number of elements
        int ItemsForColum = NItem / n_column; // elements per array
        int _total = ItemsForColum * n_column; // total elements when divided evenly
        int MissElement = NItem - _total; // leftover elements

        int[] _Arr = new int[n_column];
        for (int i = 0; i < n_column; i++)
        {
            int AddOne = (i < MissElement) ? 1 : 0;
            _Arr[i] = ItemsForColum + AddOne;
        }

        int offset = 0;
        for (int Row = 0; Row < n_column; Row++)
        {
            for (int i = 0; i < _Arr[Row]; i++)
            {
                OutPut[Row] += ArrInput[i + offset] + " "; // <- change the separator here to alter how the strings are joined
            }
            offset += _Arr[Row];
        }
        return OutPut;
    }
