Getting a reference to a struct inside an array

Date: 2022-01-28 13:13:35

I want to modify a field of a struct that is inside an array without having to set the entire struct. In the example below, I want to set one field of element 543 in the array. I don't want to copy the entire element (because copying MassiveStruct would hurt performance).

class P
{
    struct S
    {
      public int a;
      public MassiveStruct b;
    }

    void f(ref S s)
    {
      s.a = 3;
    }

    public static void Main()
    {
      S[] s = new S[1000];
      f(ref s[543]);  // Error: An object reference is required for the non-static field, method, or property
    }
}

Is there a way to do this in C#? Or do I always have to copy the entire struct out of the array, modify the copy, and then put the modified copy back into the array?

4 answers

#1


10  

The only problem is that you're trying to call an instance method from a static method, without an instance of P.

Make f a static method (or create an instance of P on which to call it) and it'll be fine. It's all about reading the compiler error :)

Having said that, I would strongly advise you to:

  • Avoid creating massive structs if at all possible
  • Avoid creating mutable structs if at all possible
  • Avoid public fields
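
The fix described in this answer can be sketched as follows. This is only a minimal illustration: the question never shows MassiveStruct, so the definition here is a hypothetical stand-in, and Run is a helper added purely so the result can be inspected.

```csharp
using System;

// Hypothetical stand-in for the question's large value type (not shown in the question).
struct MassiveStruct
{
    public long x0, x1, x2, x3, x4, x5, x6, x7;
}

class P
{
    struct S
    {
        public int a;
        public MassiveStruct b;
    }

    // static, so it can be called from Main without an instance of P
    static void f(ref S s)
    {
        s.a = 3;   // writes through the reference into the array element
    }

    // Hypothetical helper: applies the fix and returns the field for inspection.
    public static int Run()
    {
        S[] s = new S[1000];
        f(ref s[543]);   // passes a reference to the element; the struct is not copied
        return s[543].a;
    }

    public static void Main()
    {
        Console.WriteLine(Run());   // prints 3
    }
}
```

The only change to the question's code that matters is the static modifier on f; the ref argument was already doing the right thing.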

#2


47  

[edit 2017: see important comments regarding C# 7 at the end of this post]

After many years of wrestling with this exact problem, I'll summarize the few techniques and solutions I have found. Stylistic tastes aside, arrays of structs are really the only bulk storage method available in C#. If your app truly processes millions of medium-sized objects under high throughput conditions, there is little other choice.

I agree with @kaalus that object headers and GC pressure can quickly mount; my grammar processing system can manipulate 8-10 gigabytes (or more) of structural analyses in less than a minute when parsing or generating lengthy natural language sentences. Cue the chorus of "C# isn't meant for these problems, switch to assembly language, wire-wrap up an FPGA, etc." Instead, let's run some tests.

First of all, it is critical to have a thorough understanding of the full spectrum of value type (struct) management issues and the class vs. struct tradeoff sweet spots. The same goes for boxing, pinning/unsafe code, fixed buffers, GCHandle, IntPtr, and more, but most important of all, in my opinion, is the wise use of managed pointers.

Your mastery of this topic will also include knowledge of the fact that, should you happen to include in your struct one or more references to managed types (as opposed to just blittable primitives), then your options for accessing the struct with unsafe pointers are greatly reduced. This is not a problem for the managed pointer method I'll mention below. So generally, including object references is fine and doesn't change much regarding this discussion.

Oh, and if you do really need to preserve your unsafe access, you can use a GCHandle in 'Normal' mode to store object reference(s) in your struct indefinitely. Fortunately, putting the GCHandle into your struct does not trigger the unsafe-access prohibition. (Note that GCHandle is itself a value-type, and you can even define and go to town with

var gch = GCHandle.Alloc("spookee",GCHandleType.Normal);
GCHandle* p = &gch;
String s = (String)p->Target;

...and so forth.) As a value type, the GCHandle is imaged directly into your struct, but obviously any reference types it stores are not. They are out in the heap, not included in the physical layout of your array. Finally, beware of GCHandle's copy semantics: you'll have a memory leak if you don't eventually Free each GCHandle you allocate.
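
As a rough sketch of that pattern, the example below stores a GCHandle in a struct that lives in an array, reads the target back, and frees the handle. The Row type and the RoundTrip helper are hypothetical names chosen for illustration; only GCHandle.Alloc, Target and Free come from the framework.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record type: a blittable field plus a GCHandle standing in for an object reference.
struct Row
{
    public int id;
    public GCHandle name;   // GCHandle is a value type, so it is stored inline in the array
}

static class GcHandleDemo
{
    public static string RoundTrip(string text)
    {
        var rows = new Row[4];
        rows[2].id = 42;
        rows[2].name = GCHandle.Alloc(text, GCHandleType.Normal);
        try
        {
            // Target returns the object that the handle keeps alive.
            return (string)rows[2].name.Target;
        }
        finally
        {
            // Free each allocated handle exactly once; copies of the struct
            // refer to the same underlying handle.
            rows[2].name.Free();
        }
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip("spookee"));   // prints spookee
    }
}
```

Because copying a Row also copies the handle value, it helps to decide up front which copy "owns" the handle and is responsible for freeing it.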

@Ani reminds us that some people consider mutable structs "evil," but it's really the fact that they are accident prone that's the problem. Indeed, referring to the OP's example,

s[543].a = 3;

is exactly what we're trying to achieve: access our data records in-situ. (Note that the syntax for a jagged array is identical, but I'm specifically discussing only non-jagged arrays of user-defined value types here.) For my own programs, I generally consider it a severe bug if I encounter an oversized blittable struct that has (accidentally) been wholly imaged out of its array storage row:

rec no_no = s[543];   // don't do
no_no.a = 3;          // it like this

As for how big (wide) your struct can or should be, it won't matter, because you are going to be careful never to let them do what I just showed, that is, migrate out of their array. It's easy enough to define a struct to overlay our array. (Actually, thinking of the struct as a vacuous "memory template", as opposed to a data container or encapsulator, encourages the right thinking.)

public struct rec
{
    public int a, b, c, d, e, f;
}

This one has 6 ints for a total of 24 bytes. You'll want to be aware of packing options to obtain an alignment-friendly size, but excessive padding can cut into your memory budget, and that matters because a more important consideration is the 85,000-byte limit on non-LOH (Large Object Heap) objects. Make sure your record size multiplied by the expected number of rows does not exceed this limit.

So for this example, you would be best advised to keep your arrays of recs to no more than 3,000 rows each (3,000 × 24 = 72,000 bytes). Hopefully your application can be designed around this sweet spot. This is not so limiting when you remember that the alternative is for each row to be a separate garbage-collected object, instead of just the one array. You've cut your object proliferation by three orders of magnitude, which is good for a day's work. Thus the .NET environment here is strongly steering us with a pretty specific constraint: it seems that if you target your app's memory design towards monolithic allocations in the 30-70 KB range, then you really can get away with lots and lots of them, and in fact you'll instead become limited by a thornier set of performance bottlenecks (namely, bandwidth on the hardware bus).
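
The sizing arithmetic above can be checked with a few lines of code. This is a sketch under stated assumptions: the ArrayOverhead constant is a rough guess at the array's own header (the real figure depends on runtime and bitness), and LohBudget and MaxRows are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

public struct rec
{
    public int a, b, c, d, e, f;
}

static class LohBudget
{
    // Threshold cited in the text: allocations of 85,000 bytes or more land on the LOH.
    const int LohThreshold = 85000;

    // Assumption: rough allowance for the array header; not an exact runtime figure.
    const int ArrayOverhead = 32;

    // Largest row count that keeps the whole array under the threshold.
    public static int MaxRows(int recordSize)
    {
        return (LohThreshold - ArrayOverhead) / recordSize;
    }

    public static void Main()
    {
        int size = Marshal.SizeOf(typeof(rec));   // 24 for six ints
        Console.WriteLine("record size: {0}, max rows: {1}", size, MaxRows(size));
    }
}
```

With a 24-byte record this yields a ceiling of about 3,500 rows, so the 3,000-row figure leaves a comfortable margin.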

So now you have a single .NET reference type (array) with 3,000 6-tuples in physically contiguous tabular storage. First and foremost, we must be super-careful to never "pick up" one of the structs. As Jon Skeet notes above, "Massive structs will often perform worse than classes," and this is absolutely correct. There's no better way to paralyze your memory bus than to start throwing plump value types around willy-nilly.

So let's capitalize on an infrequently-mentioned aspect of the array of structs: All objects (and fields of those objects or structs) of all rows of the entire array are always initialized to their default values. You can start plugging values in, one at a time, in any row or column (field), anywhere in the array. You can leave some fields at their default values, or replace neighbor fields without disturbing one in the middle. Gone is that annoying manual initialization required with stack-resident (local variable) structs before use.
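
A small sketch of that behavior, using the rec struct from above; DefaultInitDemo and Build are hypothetical names used only for this illustration.

```csharp
using System;

public struct rec
{
    public int a, b, c, d, e, f;
}

static class DefaultInitDemo
{
    public static rec[] Build()
    {
        var rows = new rec[3000];   // every field of every row starts at 0
        rows[10].c = 7;             // set one field in place
        rows[10].e = 9;             // neighbors b, d and f are left untouched
        return rows;
    }

    public static void Main()
    {
        rec[] rows = Build();
        // prints 7 0 9 0
        Console.WriteLine("{0} {1} {2} {3}", rows[10].c, rows[10].d, rows[10].e, rows[2999].a);
    }
}
```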

Sometimes it's hard to maintain the field-by-field approach because .NET is always trying to get us to blast in an entire new'd-up struct--but to me, this so-called "initialization" is just a violation of our taboo (against plucking the whole struct out of the array), in a different guise.

Now we get to the crux of the matter. Clearly, accessing your tabular data in-situ minimizes data-shuffling busywork. But often this is an inconvenient hassle. Array accesses can be slow in .NET, due to bounds-checking. So how do you maintain a "working" pointer into the interior of an array, so as to avoid having the system constantly recompute the indexing offsets?

Evaluation

Let's evaluate the performance of five different methods for the manipulation of individual fields within value-type array storage rows. The test below is designed to measure the efficiency of intensively accessing the data fields of a struct positioned at some array index, in situ--that is, "where they lie," without extracting or rewriting the entire struct (array element). Five different access methods are compared, with all other factors held the same.

The five methods are as follows:

  1. Normal, direct array access via square-brackets and the field specifier dot. Note that, in .NET, arrays are a special and unique primitive of the Common Type System. As @Ani mentions above, this syntax cannot be used to change an individual field of a reference instance, such as a list, even when it is parameterized with a value-type.
  2. Using the undocumented __makeref C# language keyword.
  3. Managed pointer via a delegate which uses the ref keyword.
  4. "Unsafe" pointers.
  5. Same as #3, but using a C# function instead of a delegate.

Before I give the C# test results, here's the test harness implementation. These tests were run on .NET 4.5, an AnyCPU release build running on x64, Workstation GC. (Note that, because the test isn't interested in the efficiency of allocating and de-allocating the array itself, the LOH consideration mentioned above does not apply.)

const int num_test = 100000;
static rec[] s1, s2, s3, s4, s5;
static long t_n, t_r, t_m, t_u, t_f;
static Stopwatch sw = Stopwatch.StartNew();
static Random rnd = new Random();

static void test2()
{
    s1 = new rec[num_test];
    s2 = new rec[num_test];
    s3 = new rec[num_test];
    s4 = new rec[num_test];
    s5 = new rec[num_test];

    for (int x, i = 0; i < 5000000; i++)
    {
        x = rnd.Next(num_test);
        test_m(x); test_n(x); test_r(x); test_u(x); test_f(x);
        x = rnd.Next(num_test);
        test_n(x); test_r(x); test_u(x); test_f(x); test_m(x);
        x = rnd.Next(num_test);
        test_r(x); test_u(x); test_f(x); test_m(x); test_n(x);
        x = rnd.Next(num_test);
        test_u(x); test_f(x); test_m(x); test_n(x); test_r(x);
        x = rnd.Next(num_test);
        test_f(x); test_m(x); test_n(x); test_r(x); test_u(x);
        x = rnd.Next(num_test);
    }
    Debug.Print("Normal (subscript+field):          {0,18}", t_n);
    Debug.Print("Typed-reference:                   {0,18}", t_r);
    Debug.Print("C# Managed pointer: (ref delegate) {0,18}", t_m);
    Debug.Print("C# Unsafe pointer:                 {0,18}", t_u);
    Debug.Print("C# Managed pointer: (ref func):    {0,18}", t_f);
}

Because the code fragments which implement the test for each specific method are long-ish, I'll give the results first. Time is 'ticks;' lower means better.

Normal (subscript+field):             20,804,691
Typed-reference:                      30,920,655
Managed pointer: (ref delegate)       18,777,666   // <- a close 2nd
Unsafe pointer:                       22,395,806
Managed pointer: (ref func):          18,767,179   // <- winner

I was surprised that these results were so unequivocal. TypedReferences are slowest, presumably because they lug around type information along with the pointer. Considering the heft of the IL-code for the belabored "Normal" version, it performed surprisingly well. Mode transitions seem to hurt unsafe code to the point where you really have to justify, plan, and measure each place you're going to deploy it.

But the hands down fastest times are achieved by leveraging the ref keyword in functions' parameter passing for the purpose of pointing to an interior part of the array, thus eliminating the "per-field-access" array indexing computation.

Perhaps the design of my test favors this one, but the test scenarios are representative of empirical use patterns in my app. What surprised me about those numbers is that the advantage of staying in managed mode--while having your pointers, too--was not cancelled by having to call a function or invoke through a delegate.

The Winner

Fastest one: (And perhaps simplest too?)

static void f(ref rec e)
{
    e.a = 4;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.b = 5;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.c = 6;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.d = 7;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.e = 8;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.f = 9;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.a = 10;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
}
static void test_f(int ix)
{
    long q = sw.ElapsedTicks;
    f(ref s5[ix]);
    t_f += sw.ElapsedTicks - q;
}

But it has the disadvantage that you can't keep related logic together in your program: the implementation of the function is divided across two C# functions, f and test_f.

We can address this particular problem with only a tiny sacrifice in performance. The next one is basically identical to the foregoing, but embeds one of the functions within the other as a lambda function...

A Close Second

Replacing the static function in the preceding example with an inline delegate requires the use of ref arguments, which in turn precludes the use of the Func<T> lambda syntax; instead you must use an explicit delegate from old-style .NET.

By adding this global declaration once:

delegate void b(ref rec ee);

...we can use it throughout the program to directly ref into elements of array rec[], accessing them inline:

static void test_m(int ix)
{
    long q = sw.ElapsedTicks;
    /// the element to manipulate "e", is selected at the bottom of this lambda block
    ((b)((ref rec e) =>
    {
        e.a = 4;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.b = 5;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.c = 6;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.d = 7;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.e = 8;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.f = 9;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.a = 10;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
    }))(ref s3[ix]);
    t_m += sw.ElapsedTicks - q;
}

Also, although it may look like a new lambda function is being instantiated on each call, this won't happen if you're careful: when using this method, make sure you do not "close over" any local variables (that is, refer to variables which are outside the lambda function, from within its body), or do anything else that will bar your delegate instance from being static. If a local variable happens to fall into your lambda and the lambda thus gets promoted to an instance/class, you'll "probably" notice a difference as it tries to create five million delegates.

As long as you keep the lambda function clear of these side-effects, there won't be multiple instances; what's happening here is that, whenever C# determines that a lambda has no non-explicit dependencies, it lazily creates (and caches) a static singleton. It's a little unfortunate that a performance alteration this drastic is hidden from our view as a silent optimization. Overall, I like this method. It's fast and clutter-free--except for the bizarre parentheses, none of which can be omitted here.
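
To make the distinction concrete, here is a hedged sketch contrasting a lambda that captures nothing with one that closes over a variable. RecAction is a hypothetical delegate name standing in for the delegate b declared earlier. The caching of the non-capturing delegate is a compiler implementation detail, so the first ReferenceEquals result is typical rather than guaranteed.

```csharp
using System;

public struct rec
{
    public int a, b, c, d, e, f;
}

// Hypothetical name; plays the role of the 'b' delegate declared earlier.
delegate void RecAction(ref rec e);

static class LambdaCachingDemo
{
    // Captures nothing: current C# compilers typically create this delegate once and reuse it.
    public static RecAction NonCapturing()
    {
        return (ref rec e) => { e.a = 4; };
    }

    // Captures 'value': a closure object and a fresh delegate are allocated on every call.
    public static RecAction Capturing(int value)
    {
        return (ref rec e) => { e.a = value; };
    }

    public static void Main()
    {
        var rows = new rec[8];
        NonCapturing()(ref rows[3]);
        Capturing(10)(ref rows[5]);
        Console.WriteLine("{0} {1}", rows[3].a, rows[5].a);   // prints 4 10

        // Typically True for the cached delegate (implementation detail).
        Console.WriteLine(ReferenceEquals(NonCapturing(), NonCapturing()));
        // A new delegate per call, so this comparison is False.
        Console.WriteLine(ReferenceEquals(Capturing(1), Capturing(1)));
    }
}
```

In a hot loop, the capturing form turns every call into two allocations, which is the cost the paragraph above warns about.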

And the rest

For completeness, here are the rest of the tests: normal bracketing-plus-dot; TypedReference; and unsafe pointers.

static void test_n(int ix)
{
    long q = sw.ElapsedTicks;
    s1[ix].a = 4;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].b = 5;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].c = 6;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].d = 7;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].e = 8;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].f = 9;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].a = 10;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    t_n += sw.ElapsedTicks - q;
}


static void test_r(int ix)
{
    long q = sw.ElapsedTicks;
    var tr = __makeref(s2[ix]);
    __refvalue(tr, rec).a = 4;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).b = 5;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).c = 6;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).d = 7;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).e = 8;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).f = 9;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).a = 10;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    t_r += sw.ElapsedTicks - q;
}

static void test_u(int ix)
{
    long q = sw.ElapsedTicks;

    fixed (rec* p = &s4[ix])
    {
        p->a = 4;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->b = 5;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->c = 6;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->d = 7;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->e = 8;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->f = 9;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->a = 10;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
    }
    t_u += sw.ElapsedTicks - q;
}

Summary

For memory-intensive work in large-scale C# apps, using managed pointers to directly access the fields of value-typed array elements in-situ is the way to go.

If you're really serious about performance, this might be enough reason to use C++/CLI (or CIL, for that matter) instead of C# for the relevant parts of your app, because those languages allow you to directly declare managed pointers within a function body.

In C#, the only way to create a managed pointer is to declare a function with a ref or out argument, and then the callee will observe the managed pointer. Thus, to get the performance benefits in C#, you have to use one of the (top two) methods shown above. [see C#7 below]

Sadly, these deploy the kludge of splitting a function into multiple parts just for the purpose of accessing an array element. Although considerably less elegant than the equivalent C++/CLI code would be, tests indicate that even in C#, for high-throughput applications we still obtain a big performance benefit versus naïve value-type array access.

[edit 2017: While perhaps conferring a small degree of prescience to this article's exhortations in general, the release of C# 7 in Visual Studio 2017 concomitantly renders the specific methods described above entirely obsolete. In short, the new ref locals feature in the language permits you to declare your own managed pointer as a local variable, and use it to consolidate the single array dereferencing operation. So given for example the test structure from above...

public struct rec { public int a, b, c, d, e, f; }
static rec[] s7 = new rec[100000];

...here is how the same test function from above can now be written:

static void test_7(int ix)
{
    ref rec e = ref s7[ix];         // <---  C#7 ref local
    e.a = 4;  e.e = e.a; e.b = e.d; e.f = e.d; e.b = e.e; e.a = e.c;
    e.b = 5;  e.d = e.f; e.c = e.b; e.e = e.a; e.b = e.d; e.f = e.d;
    e.c = 6;  e.b = e.e; e.a = e.c; e.d = e.f; e.c = e.b; e.e = e.a;
    e.d = 7;  e.b = e.d; e.f = e.d; e.b = e.e; e.a = e.c; e.d = e.f;
    e.e = 8;  e.c = e.b; e.e = e.a; e.b = e.d; e.f = e.d; e.b = e.e;
    e.f = 9;  e.a = e.c; e.d = e.f; e.c = e.b; e.e = e.a; e.b = e.d;
    e.a = 10; e.f = e.d; e.b = e.e; e.a = e.c; e.d = e.f; e.c = e.b;
}

Notice how this completely eliminates the need for kludges such as those I discussed above. The sleeker use of a managed pointer avoids the unnecessary function call that was used in "the winner," the best-performing methodology of those I reviewed. Therefore, the performance with the new feature can only be better than that of the winner among the methods compared above.

Ironically enough, C# 7 also adds local functions, a feature which would directly solve the complaint about poor encapsulation I raised for two of the aforementioned hacks. Happily enough, the whole enterprise of proliferating dedicated functions just for the purpose of gaining access to managed pointers is now completely moot.]
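
For illustration, a short sketch combining the two C# 7 features just mentioned: a ref local bound to an array element and a local function that takes a ref parameter. RefLocalDemo, Update and Fill are hypothetical names, and the arithmetic is arbitrary.

```csharp
using System;

public struct rec
{
    public int a, b, c, d, e, f;
}

static class RefLocalDemo
{
    static rec[] s7 = new rec[100000];

    public static int Update(int ix)
    {
        // C# 7 local function: the helper sits next to the code that uses it.
        void Fill(ref rec e)
        {
            e.a = 4;
            e.b = e.a + 1;
        }

        ref rec row = ref s7[ix];   // C# 7 ref local: one array dereference
        Fill(ref row);              // the same managed pointer is passed along
        row.c = row.a + row.b;      // still writing in place, no copy of the struct
        return s7[ix].c;
    }

    public static void Main()
    {
        Console.WriteLine(Update(543));   // prints 9
    }
}
```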

#3


2  

While Jon Skeet is correct about why your program doesn't compile, you can just do:

s[543].a = 3;

...and it will operate directly on the struct in the array rather than on a copy.

Note that this idea works for arrays only; other collections such as lists will return a copy from the indexer getter (giving you a compiler error if you try something similar on the resulting value).
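
The difference between arrays and lists can be sketched like this. The struct S here is a cut-down, hypothetical version of the question's type, and the commented-out line is the one the compiler rejects (error CS1612).

```csharp
using System;
using System.Collections.Generic;

public struct S
{
    public int a;
}

static class ListCopyDemo
{
    public static int ArrayInPlace()
    {
        var arr = new S[1000];
        arr[543].a = 3;          // array element access yields the variable itself
        return arr[543].a;
    }

    public static int ListRoundTrip()
    {
        var list = new List<S>(new S[1000]);
        // list[543].a = 3;      // does not compile (CS1612): the indexer returns a copy
        S tmp = list[543];       // copy out
        tmp.a = 3;               // modify the copy
        list[543] = tmp;         // copy back
        return list[543].a;
    }

    public static void Main()
    {
        Console.WriteLine("{0} {1}", ArrayInPlace(), ListRoundTrip());   // prints 3 3
    }
}
```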

On another note, mutable structs are considered evil. Is there a strong reason why you don't want to make S a class?

#4


0  

You could try to use a forwarding empty struct which does not hold the actual data but only keeps an index to a data provider object. This way you can store huge amounts of data without complicating the object graph. I am very certain that it should be quite easy in your case to replace your giant struct with a forwarding empty struct as long as you do not try to marshal it into unmanaged code.

Have a look at this struct. It can contain as much data inside it as you wish. The trick is that you store the actual data in another object. This way you get reference semantics along with the advantages of structs, which consume less memory than class objects and give faster GC cycles due to a simpler object graph (if you have many instances, say millions, of them around).

    [StructLayout(LayoutKind.Sequential, Pack=1)]
    public struct ForwardingEmptyValueStruct
    {
        int _Row;
        byte _ProviderIdx;


        public ForwardingEmptyValueStruct(byte providerIdx, int row)
        {
            _ProviderIdx = providerIdx;
            _Row = row;
        }

        public double V1
        {
            get { return DataProvider._DataProviders[_ProviderIdx].Value1[_Row];  }
        }

        public int V2
        {
            get { return DataProvider._DataProviders[_ProviderIdx].Value2[_Row];  }
        }
    }
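
The answer does not show the DataProvider type, so the version below is purely a guess at its shape: a class holding column arrays, plus a static registry indexed by the byte stored in each struct. The struct itself is repeated from above so the sketch compiles on its own.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical provider (not shown in the answer): holds the real data in column arrays.
public class DataProvider
{
    // Registry indexed by the byte kept in each forwarding struct, so at most 256 providers.
    public static readonly DataProvider[] _DataProviders = new DataProvider[256];

    public double[] Value1;
    public int[] Value2;

    public DataProvider(int rows)
    {
        Value1 = new double[rows];
        Value2 = new int[rows];
    }
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct ForwardingEmptyValueStruct
{
    int _Row;
    byte _ProviderIdx;

    public ForwardingEmptyValueStruct(byte providerIdx, int row)
    {
        _ProviderIdx = providerIdx;
        _Row = row;
    }

    public double V1 { get { return DataProvider._DataProviders[_ProviderIdx].Value1[_Row]; } }
    public int V2 { get { return DataProvider._DataProviders[_ProviderIdx].Value2[_Row]; } }
}

static class ForwardingDemo
{
    public static double ReadBack()
    {
        var provider = new DataProvider(1000);
        provider.Value1[543] = 1.5;                 // the data lives in the provider
        DataProvider._DataProviders[0] = provider;  // register it under index 0

        var handle = new ForwardingEmptyValueStruct(0, 543);
        return handle.V1;                           // forwarded read, no data in the struct
    }

    public static void Main()
    {
        Console.WriteLine(ReadBack());
    }
}
```

Note that each struct carries only five bytes of state; writes go through the provider's arrays, which are ordinary arrays and can be updated in place.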

#1


10  

The only problem is that you're trying to call an instance method from a static method, without an instance of P.

唯一的问题是,您试图从静态方法调用实例方法,而没有P实例。

Make f a static method (or create an instance of P on which to call it) and it'll be fine. It's all about reading the compiler error :)

让f成为一个静态方法(或者创建一个P的实例来调用它),这样就可以了。这都是关于读取编译器错误:)

Having said that, I would strongly advise you to:

话虽如此,我强烈建议你:

  • Avoid creating massive structs if at all possible
  • 如果可能的话,避免创建大量的结构。
  • Avoid creating mutable structs if at all possible
  • 尽可能避免创建可变结构体
  • Avoid public fields
  • 避免公共字段

#2


47  

[edit 2017: see important comments regarding C# 7 at the end of this post]

[编辑2017:关于c# 7的重要评论见本文结尾]

After many years of wrestling with this exact problem, I'll summarize the few techniques and solu­tions I have found. Stylistic tastes aside, arrays of structs are really the only bulk storage method available in C#. If your app truly processes millions of medium-sized objects under high throughput conditions, there is little other choice.

经过多年的努力解决这个问题,我总结了一些设计技巧和溶解­我发现。除了风格的偏好,结构的数组实际上是c#中唯一可用的批量存储方法。如果你的应用程序真的在高吞吐量条件下处理数百万个中等大小的对象,那么就没有其他选择了。

I agree with @kaalus that object headers and GC pressure can quickly mount; my grammar pro­cessing system can manipulate 8-10 gigabytes (or more) of structural analyses in less than a min­ute when parsing or generating lengthy natural language sentences. Cue the chorus of "C# isn't meant for these problems, switch to assembly language, wire-wrap up a FPGA, etc." Instead let's run some tests.

我同意@kaalus,对象头和GC压力可以快速挂载;我的语法pro­ces系统可以操纵8 - 10 g(或更多)的结构分析在不到一分钟­ute当解析或自然语言生成冗长的句子。提示:“c#并不是为了解决这些问题,而是为了使用汇编语言,用线包FPGA等等。”让我们运行一些测试。

First of all, it is critical to have total understanding of the full spectrum of value type (struct) man­agement issues and the class vs. struct tradeoff sweet-spots. Also of course boxing, pinning/unsafe code, fixed buffers, GCHandle, IntPtr, and more, but most importantly of all in my opinion, wise use of managed pointers.

首先,它总理解至关重要的各种值类型(结构)男人­管理问题和类目的与结构权衡。当然还有装箱、固定/不安全代码、固定缓冲区、GCHandle、IntPtr等等,但在我看来,最重要的是,明智地使用托管指针。

Your mastery of this topic will also include knowledge of the fact that, should you happen to include in your struct one or more references to managed types (as opposed to just blittable primitives), then your options for accessing the struct with unsafe pointers are greatly reduced. This is not a problem for the managed pointer method I'll mention below. So generally, including object references is fine and doesn't change much regarding this discussion.

Oh, and if you do really need to preserve your unsafe access, you can use a GCHandle in 'Normal' mode to store object reference(s) in your struct indefinitely. Fortunately, putting the GCHandle into your struct does not trigger the unsafe-access prohibition. (Note that GCHandle is itself a value type, so in unsafe code you can even declare a pointer to one and go to town with it, as in the following fragment.)

var gch = GCHandle.Alloc("spookee",GCHandleType.Normal);
GCHandle* p = &gch;
String s = (String)p->Target;

...and so forth. As a value type, the GCHandle is imaged directly into your struct, but obviously any reference types it stores are not. They are out in the heap, not included in the physical layout of your array. Finally on GCHandle, beware of its copy-semantics, though, because you'll have a memory leak if you don't eventually Free each GCHandle you allocate.
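
The allocate-then-free discipline can be sketched without any unsafe code. This is only an illustration: the struct `LabeledRec` and the class `GCHandleSketch` are hypothetical names, not part of the original program. The point is that the handle lives inside the array element, the string it refers to stays out on the heap, and `Free` is called on the element itself rather than on a copy of the struct.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record type: one plain field plus a handle to a managed object.
public struct LabeledRec
{
    public int a;
    public GCHandle label;   // value type; the object it refers to stays on the heap
}

public static class GCHandleSketch
{
    public static string RoundTrip(string text)
    {
        var rows = new LabeledRec[4];
        rows[2].a = 7;
        rows[2].label = GCHandle.Alloc(text, GCHandleType.Normal);
        try
        {
            // Target returns object, so cast back to the stored type.
            return (string)rows[2].label.Target;
        }
        finally
        {
            // Free through the array element itself, not through a copy of the struct.
            rows[2].label.Free();
        }
    }
}
```

Calling `GCHandleSketch.RoundTrip("spookee")` hands back the same string after a trip through the handle, and the `finally` block makes sure the handle is released even if the cast throws.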

@Ani reminds us that some people consider mutable structs "evil," but it's really the fact that they are accident prone that's the problem. Indeed, referring to the OP's example,

s[543].a = 3;

is exactly what we're trying to achieve: access our data records in-situ. (Note that the syntax for a jagged array is identical, but I'm specifically discussing only non-jagged arrays of user-defined value types here.) For my own programs, I generally consider it a severe bug if I encounter an oversized blittable struct that has (accidentally) been wholly imaged out of its array storage row:

rec no_no = s[543];   // don't do
no_no.a = 3;          // it like this
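
To make the difference concrete, here is a small sketch (the class name `CopyVersusInPlace` is made up for illustration; `rec` is the six-int struct used throughout this answer). Editing a copy leaves the array untouched, whereas the subscript-plus-field form writes into the element where it lies.

```csharp
public struct rec { public int a, b, c, d, e, f; }

public static class CopyVersusInPlace
{
    // Returns { s[543].a after editing a copy, s[543].a after editing in place }.
    public static int[] Run()
    {
        var s = new rec[1000];

        rec copy = s[543];   // the whole 24-byte struct is copied out of the array
        copy.a = 3;          // changes only the local copy
        int afterCopyEdit = s[543].a;

        s[543].a = 3;        // writes straight into the array element
        int afterInPlaceEdit = s[543].a;

        return new[] { afterCopyEdit, afterInPlaceEdit };
    }
}
```

`Run()` returns `{ 0, 3 }`: the first edit never reached the array.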

As far as how big (wide) your struct can or should be, it won't matter, because you are going to be careful never to let them do what I just showed, that is, migrate out of their array. It's easy enough to define a struct to overlay our array (actually, thinking of the struct as a vacuous "memory template"--as opposed to a data container, encapsulator--encourages the right thinking.)

public struct rec
{
    public int a, b, c, d, e, f;
}

This one has 6 ints for a total of 24 bytes. You'll want to consider and be aware of packing options to obtain an alignment-friendly size. But excessive padding can cut into your memory budget, and that matters because a more important consideration is the 85,000 byte limit on non-LOH objects. Make sure your record size multiplied by the expected number of rows does not exceed this limit.

So for this example, you would be best advised to keep your array of recs to no more than 3,000 rows each. Hopefully your application can be designed around this sweet-spot. This is not so limiting when you remember that--alternatively--each row would be a separate garbage-collected object, instead of just the one array. You've cut your object proliferation by three orders of magnitude, which is good for a day's work. Thus the .NET environment here is strongly steering us with a pretty specific constraint: it seems that if you target your app's memory design towards monolithic allocations in the 30-70 KB range, then you really can get away with lots and lots of them, and in fact you'll instead become limited by a thornier set of performance bottlenecks (namely, bandwidth on the hardware bus).
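
The row budget is simple arithmetic, sketched below. The class `LohBudget` is a hypothetical helper; it uses the 85,000-byte figure quoted above and ignores the array's own header, so treat the result as an upper bound and leave some headroom. For the 24-byte rec, 3,000 rows occupy 72,000 bytes, comfortably under the threshold.

```csharp
public static class LohBudget
{
    // Threshold quoted in the text; arrays at or above it land on the Large Object Heap.
    public const int LohThresholdBytes = 85000;

    // Largest row count whose raw element data stays under the threshold.
    // Array header overhead is ignored here, so leave some headroom in practice.
    public static int MaxRows(int recordSizeBytes)
    {
        return (LohThresholdBytes - 1) / recordSizeBytes;
    }

    // Raw bytes occupied by the elements alone.
    public static int ElementBytes(int rows, int recordSizeBytes)
    {
        return rows * recordSizeBytes;
    }
}
```

`LohBudget.MaxRows(24)` gives 3541, which is why a round 3,000 rows is a safe target for this struct.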

So now you have a single .NET reference type (array) with 3,000 6-tuples in physically contiguous tabular storage. First and foremost, we must be super-careful to never "pick up" one of the structs. As Jon Skeet notes above, "Massive structs will often perform worse than classes," and this is absolutely correct. There's no better way to paralyze your memory bus than to start throwing plump value types around willy-nilly.

So let's capitalize on an infrequently-mentioned aspect of the array of structs: All objects (and fields of those objects or structs) of all rows of the entire array are always initialized to their default values. You can start plugging values in, one at a time, in any row or column (field), anywhere in the array. You can leave some fields at their default values, or replace neighbor fields without disturbing one in the middle. Gone is that annoying manual initialization required with stack-resident (local variable) structs before use.
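
A minimal sketch of that field-by-field style follows (the class name `DefaultInit` is invented for the example). Nothing is new'd up per row; every field starts at zero and only the fields you touch change.

```csharp
public struct rec { public int a, b, c, d, e, f; }

public static class DefaultInit
{
    public static rec[] Build()
    {
        var rows = new rec[3];   // every field of every row starts at 0

        rows[1].c = 42;          // set one field, leaving its neighbours untouched
        rows[2].a = 1;
        rows[2].f = 6;           // b..e in this row stay at their defaults

        return rows;
    }
}
```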

Sometimes it's hard to maintain the field-by-field approach because .NET is always trying to get us to blast in an entire new'd-up struct--but to me, this so-called "initialization" is just a violation of our taboo (against plucking the whole struct out of the array), in a different guise.

Now we get to the crux of the matter. Clearly, accessing your tabular data in-situ minimizes data-shuffling busywork. But often this is an inconvenient hassle. Array accesses can be slow in .NET, due to bounds-checking. So how do you maintain a "working" pointer into the interior of an array, so as to avoid having the system constantly recompute the indexing offsets?

Evaluation

Let's evaluate the performance of five different methods for the manipulation of individual fields within value-type array storage rows. The test below is designed to measure the efficiency of intensively accessing the data fields of a struct positioned at some array index, in situ--that is, "where they lie," without extracting or rewriting the entire struct (array element). Five different access methods are compared, with all other factors held the same.

The five methods are as follows:

  1. Normal, direct array access via square-brackets and the field specifier dot. Note that, in .NET, arrays are a special and unique primitive of the Common Type System. As @Ani mentions in another answer, this syntax cannot be used to change an individual field through a reference instance such as a list, even when it is parameterized with a value-type.
  2. Using the undocumented __makeref C# language keyword.
  3. Managed pointer via a delegate which uses the ref keyword.
  4. "Unsafe" pointers.
  5. Same as #3, but using a C# function instead of a delegate.

Before I give the C# test results, here's the test harness implementation. These tests were run on .NET 4.5, an AnyCPU release build running on x64, Workstation gc. (Note that, because the test isn't interested in the efficiency of allocating and de-allocating the array itself, the LOH consideration mentioned above does not apply.)

const int num_test = 100000;
static rec[] s1, s2, s3, s4, s5;
static long t_n, t_r, t_m, t_u, t_f;
static Stopwatch sw = Stopwatch.StartNew();
static Random rnd = new Random();

static void test2()
{
    s1 = new rec[num_test];
    s2 = new rec[num_test];
    s3 = new rec[num_test];
    s4 = new rec[num_test];
    s5 = new rec[num_test];

    for (int x, i = 0; i < 5000000; i++)
    {
        x = rnd.Next(num_test);
        test_m(x); test_n(x); test_r(x); test_u(x); test_f(x);
        x = rnd.Next(num_test);
        test_n(x); test_r(x); test_u(x); test_f(x); test_m(x);
        x = rnd.Next(num_test);
        test_r(x); test_u(x); test_f(x); test_m(x); test_n(x);
        x = rnd.Next(num_test);
        test_u(x); test_f(x); test_m(x); test_n(x); test_r(x);
        x = rnd.Next(num_test);
        test_f(x); test_m(x); test_n(x); test_r(x); test_u(x);
        x = rnd.Next(num_test);
    }
    Debug.Print("Normal (subscript+field):          {0,18}", t_n);
    Debug.Print("Typed-reference:                   {0,18}", t_r);
    Debug.Print("C# Managed pointer: (ref delegate) {0,18}", t_m);
    Debug.Print("C# Unsafe pointer:                 {0,18}", t_u);
    Debug.Print("C# Managed pointer: (ref func):    {0,18}", t_f);
}

Because the code fragments which implement the test for each specific method are long-ish, I'll give the results first. Time is 'ticks;' lower means better.

Normal (subscript+field):             20,804,691
Typed-reference:                      30,920,655
Managed pointer: (ref delegate)       18,777,666   // <- a close 2nd
Unsafe pointer:                       22,395,806
Managed pointer: (ref func):          18,767,179   // <- winner

I was surprised that these results were so unequivocal. TypedReferences are slowest, presumably because they lug around type information along with the pointer. Considering the heft of the IL-code for the belabored "Normal" version, it performed surprisingly well. Mode transitions seem to hurt unsafe code to the point where you really have to justify, plan, and measure each place you're going to deploy it.
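
To put the tick counts above in proportion, a quick calculation divides each total by the "Normal" total. This is only a sketch; the class and member names are made up for illustration, and the constants are copied from the results listing. The two ref-based variants come out at roughly 0.90 of the baseline, the unsafe pointer at roughly 1.08, and the typed reference at roughly 1.49.

```csharp
public static class TickRatios
{
    // Tick counts copied from the results listing above.
    public const long Normal = 20804691;
    public const long TypedReference = 30920655;
    public const long RefDelegate = 18777666;
    public const long UnsafePointer = 22395806;
    public const long RefFunc = 18767179;

    // Ratio of a method's total ticks to the "Normal" baseline; below 1.0 means faster.
    public static double RelativeToNormal(long ticks)
    {
        return (double)ticks / Normal;
    }
}
```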

But the hands down fastest times are achieved by leveraging the ref keyword in functions' parameter passing for the purpose of pointing to an interior part of the array, thus eliminating the "per-field-access" array indexing computation.
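
Applied to the shape of the original question, the idea looks roughly like this. It is a sketch under assumptions: the four long fields merely stand in for the question's MassiveStruct, and `RefIntoArray`, `SetA` and `Run` are illustrative names.

```csharp
public struct S
{
    public int a;
    public long b0, b1, b2, b3;   // stand-in for the question's MassiveStruct
}

public static class RefIntoArray
{
    // Static, so it can be called from a static method without an instance.
    public static void SetA(ref S s)
    {
        s.a = 3;   // writes through the managed pointer into the array element
    }

    public static int Run()
    {
        var s = new S[1000];
        SetA(ref s[543]);   // passes the address of element 543; the struct is not copied
        return s[543].a;
    }
}
```

The element is indexed once at the call site, and everything inside `SetA` then works through the managed pointer.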

Perhaps the design of my test favors this one, but the test scenarios are representative of empirical use patterns in my app. What surprised me about those numbers is that the advantage of staying in managed mode--while having your pointers, too--was not cancelled by having to call a function or invoke through a delegate.

The Winner

Fastest one: (And perhaps simplest too?)

static void f(ref rec e)
{
    e.a = 4;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.b = 5;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.c = 6;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.d = 7;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.e = 8;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.f = 9;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.a = 10;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
}
static void test_f(int ix)
{
    long q = sw.ElapsedTicks;
    f(ref s5[ix]);
    t_f += sw.ElapsedTicks - q;
}

But it has the disadvantage that you can't keep related logic together in your program: the implementation of the function is divided across two C# functions, f and test_f.

We can address this particular problem with only a tiny sacrifice in performance. The next one is basically identical to the foregoing, but embeds one of the functions within the other as a lambda function...

A Close Second

Replacing the static function in the preceding example with an inline delegate requires the use of ref arguments, which in turn precludes the use of the Func<T> lambda syntax; instead you must use an explicit delegate from old-style .NET.

By adding this global declaration once:

delegate void b(ref rec ee);

...we can use it throughout the program to directly ref into elements of array rec[], accessing them inline:

static void test_m(int ix)
{
    long q = sw.ElapsedTicks;
    /// the element to manipulate "e", is selected at the bottom of this lambda block
    ((b)((ref rec e) =>
    {
        e.a = 4;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.b = 5;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.c = 6;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.d = 7;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.e = 8;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.f = 9;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.a = 10;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
    }))(ref s3[ix]);
    t_m += sw.ElapsedTicks - q;
}

Also, although it may look like a new lambda function is being instantiated on each call, this won't happen if you're careful: when using this method, make sure you do not "close over" any local variables (that is, refer to variables which are outside the lambda function, from within its body), or do anything else that will bar your delegate instance from being static. If a local variable happens to fall into your lambda and the lambda thus gets promoted to an instance/class, you'll "probably" notice a difference as it tries to create five million delegates.

As long as you keep the lambda function clear of these side-effects, there won't be multiple instances; what's happening here is that, whenever C# determines that a lambda has no non-explicit dependencies, it lazily creates (and caches) a static singleton. It's a little unfortunate that a performance alteration this drastic is hidden from our view as a silent optimization. Overall, I like this method. It's fast and clutter-free--except for the bizarre parentheses, none of which can be omitted here.
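
The contrast between a capturing and a non-capturing lambda can be sketched as follows. The delegate name `RecAction` and the class `LambdaCapture` are hypothetical (`RecAction` plays the role of the delegate `b` declared above). Note that the caching of the non-capturing delegate is a compiler implementation detail rather than a language guarantee, so the sketch only checks the part that is certain: a capturing lambda yields a fresh delegate on every call.

```csharp
using System;

public struct rec { public int a, b, c, d, e, f; }

// Hypothetical name; same shape as the delegate "b" used in the test above.
public delegate void RecAction(ref rec e);

public static class LambdaCapture
{
    // Refers only to its own parameter, so the compiler is free to reuse one delegate instance.
    public static RecAction NonCapturing()
    {
        return (ref rec e) => { e.a = 4; };
    }

    // Captures "value", so every call allocates a closure object and a new delegate.
    public static RecAction Capturing(int value)
    {
        return (ref rec e) => { e.a = value; };
    }

    // True when two calls to Capturing produce distinct delegate objects.
    public static bool CapturingAllocatesEachTime()
    {
        return !object.ReferenceEquals(Capturing(1), Capturing(1));
    }
}
```

Both delegates still write through the ref parameter into the array element; the difference is only in how many delegate objects get created along the way.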

And the rest

For completeness, here are the rest of the tests: normal bracketing-plus-dot; TypedReference; and unsafe pointers.

static void test_n(int ix)
{
    long q = sw.ElapsedTicks;
    s1[ix].a = 4;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].b = 5;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].c = 6;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].d = 7;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].e = 8;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].f = 9;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].a = 10;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    t_n += sw.ElapsedTicks - q;
}


static void test_r(int ix)
{
    long q = sw.ElapsedTicks;
    var tr = __makeref(s2[ix]);
    __refvalue(tr, rec).a = 4;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).b = 5;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).c = 6;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).d = 7;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).e = 8;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).f = 9;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).a = 10;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    t_r += sw.ElapsedTicks - q;
}

static void test_u(int ix)
{
    long q = sw.ElapsedTicks;

    fixed (rec* p = &s4[ix])
    {
        p->a = 4;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->b = 5;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->c = 6;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->d = 7;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->e = 8;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->f = 9;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->a = 10;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
    }
    t_u += sw.ElapsedTicks - q;
}

Summary

For memory-intensive work in large-scale C# apps, using managed pointers to directly access the fields of value-typed array elements in-situ is the way to go.

If you're really serious about performance, this might be enough reason to use C++/CLI (or CIL, for that matter) instead of C# for the relevant parts of your app, because those languages allow you to directly declare managed pointers within a function body.

In C#, the only way to create a managed pointer is to declare a function with a ref or out argument, and then the callee will observe the managed pointer. Thus, to get the performance benefits in C#, you have to use one of the (top two) methods shown above. [see C#7 below]

Sadly, these deploy the kludge of splitting a function into multiple parts just for the purpose of accessing an array element. Although considerably less elegant than the equivalent C++/CLI code would be, tests indicate that even in C#, for high-throughput applications we still obtain a big performance benefit versus naïve value-type array access.

[edit 2017: While perhaps conferring a small degree of prescience to this article's exhortations in general, the release of C# 7 in Visual Studio 2017 concomitantly renders the specific methods described above entirely obsolete. In short, the new ref locals feature in the language permits you to declare your own managed pointer as a local variable, and use it to consolidate the single array dereferencing operation. So given for example the test structure from above...

public struct rec { public int a, b, c, d, e, f; }
static rec[] s7 = new rec[100000];

...here is how the same test function from above can now be written:

static void test_7(int ix)
{
    ref rec e = ref s7[ix];         // <---  C#7 ref local
    e.a = 4;  e.e = e.a; e.b = e.d; e.f = e.d; e.b = e.e; e.a = e.c;
    e.b = 5;  e.d = e.f; e.c = e.b; e.e = e.a; e.b = e.d; e.f = e.d;
    e.c = 6;  e.b = e.e; e.a = e.c; e.d = e.f; e.c = e.b; e.e = e.a;
    e.d = 7;  e.b = e.d; e.f = e.d; e.b = e.e; e.a = e.c; e.d = e.f;
    e.e = 8;  e.c = e.b; e.e = e.a; e.b = e.d; e.f = e.d; e.b = e.e;
    e.f = 9;  e.a = e.c; e.d = e.f; e.c = e.b; e.e = e.a; e.b = e.d;
    e.a = 10; e.f = e.d; e.b = e.e; e.a = e.c; e.d = e.f; e.c = e.b;
}

Notice how this completely eliminates the need for kludges such as those I discussed above. The sleeker use of a managed pointer avoids the unnecessary function call that was used in "the winner," the best-performing methodology of those I reviewed. Therefore, the performance with the new feature can only be better than the winner of methods compared above.

Ironically enough, C# 7 also adds local functions, a feature which would directly solve the complaint about poor encapsulation I raised for two of the aforementioned hacks. Happily enough the whole enterprise of proliferating dedicated functions just for the purpose of gaining access to managed pointers is now completely moot.
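
For readers who still want the function-with-ref-parameter form, a C# 7 local function keeps the two halves in one place. This is a sketch with invented names (`LocalFunctionSketch`, `Update`, `Apply`) and a shortened field sequence, not a rerun of the benchmark.

```csharp
public struct rec { public int a, b, c, d, e, f; }

public static class LocalFunctionSketch
{
    public static void Update(rec[] table, int ix)
    {
        // C# 7 local function: the field logic stays inside the method that uses it.
        void Apply(ref rec e)
        {
            e.a = 4;
            e.e = e.a;
            e.b = 5;
        }

        Apply(ref table[ix]);   // the element is selected here, once
    }
}
```

With ref locals available the local function is no longer needed just to obtain the managed pointer, but it remains a tidy option when the same field logic is applied at several call sites within one method.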

#3


2  

While Jon Skeet is correct about why your program doesn't compile, you can just do:

s[543].a = 3;

...and it will operate directly on the struct in the array rather than on a copy.

Note that this idea works for arrays only; other collections such as lists will return a copy from the indexer getter (giving you a compiler error if you try something similar on the resulting value).
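
A short sketch of the difference (class and method names are illustrative; `rec` is the six-int struct from answer #2). With an array the field assignment compiles and writes in place; with a `List<T>` the same statement is rejected with error CS1612, so the element has to be copied out, modified and written back.

```csharp
using System.Collections.Generic;

public struct rec { public int a, b, c, d, e, f; }

public static class ListVersusArray
{
    public static int ArrayInPlace()
    {
        var arr = new rec[10];
        arr[5].a = 3;             // array element access yields a variable
        return arr[5].a;
    }

    public static int ListCopyBack()
    {
        var list = new List<rec>(new rec[10]);
        // list[5].a = 3;         // does not compile (CS1612): the indexer returns a copy
        rec tmp = list[5];        // copy out
        tmp.a = 3;                // modify the copy
        list[5] = tmp;            // copy back
        return list[5].a;
    }
}
```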

On another note, mutable structs are considered evil. Is there a strong reason why you don't want to make S a class?

#4


0  

You could try to use a forwarding empty struct which does not hold the actual data but only keeps an index to a dataprovider object. This way you can store huge amounts of data without complicating the object graph. I am very certain that it should be quite easy in your case to replace your giant struct with a forwarding empty struct as long as you do not try to marshal it into unmanaged code.

Have a look at this struct. It can contain as much data inside it as you wish. The trick is that you store the actual data in another object. This way you get reference semantics together with the advantages of structs: they consume less memory than class objects, and GC cycles are faster due to a simpler object graph (if you have many instances, say millions, of them around).

    [StructLayout(LayoutKind.Sequential, Pack=1)]
    public struct ForwardingEmptyValueStruct
    {
        int _Row;
        byte _ProviderIdx;


        public ForwardingEmptyValueStruct(byte providerIdx, int row)
        {
            _ProviderIdx = providerIdx;
            _Row = row;
        }

        public double V1
        {
            get { return DataProvider._DataProviders[_ProviderIdx].Value1[_Row];  }
        }

        public int V2
        {
            get { return DataProvider._DataProviders[_ProviderIdx].Value2[_Row];  }
        }
    }
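
The struct above relies on a DataProvider type that the answer does not show. One possible shape for it is sketched below, purely as an assumption: the column arrays `Value1` and `Value2`, the constructor, and the 256-slot provider table (sized to what a byte index can address) are guesses made so that the struct compiles. The struct itself is repeated unchanged so the example is self-contained.

```csharp
using System.Runtime.InteropServices;

// Hypothetical provider: holds the real data in column arrays.
public class DataProvider
{
    // One slot per possible byte-sized provider index (an assumption).
    public static readonly DataProvider[] _DataProviders = new DataProvider[256];

    public readonly double[] Value1;
    public readonly int[] Value2;

    public DataProvider(int rows)
    {
        Value1 = new double[rows];
        Value2 = new int[rows];
    }
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct ForwardingEmptyValueStruct
{
    int _Row;
    byte _ProviderIdx;

    public ForwardingEmptyValueStruct(byte providerIdx, int row)
    {
        _ProviderIdx = providerIdx;
        _Row = row;
    }

    // Each property forwards to the provider's column array at this row.
    public double V1
    {
        get { return DataProvider._DataProviders[_ProviderIdx].Value1[_Row]; }
    }

    public int V2
    {
        get { return DataProvider._DataProviders[_ProviderIdx].Value2[_Row]; }
    }
}
```

To use it, register a provider in a slot, fill its column arrays, and hand out `new ForwardingEmptyValueStruct(slot, row)` values; copying such a struct copies only the row and provider index, never the data.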