Is it good to use Fixnums as keys in a Ruby hash?

Time: 2021-04-07 16:20:43

I'm creating a hash to represent a few of the records in a MySQL database. The hash keys correspond to the database ID fields and the hash values correspond to the database name fields.

Which is better, and why?

  1. Array

    This works, but Ruby seems inefficient with sparse arrays: it appears to incur the extra overhead of setting every intermediate index to nil.

    fruits = []
    fruits[23] = "apple"
    fruits[109] = "orange"
    # ...
    fruits[23429] = "banana"
    
  2. Hash with fixnum as keys

    I like this the best, but I've always read that it's best to use symbols as keys in a hash. Is it just as good to use fixnums as keys? I'm not sure, but I think so, given 34.hash and the nature of fixnums, i.e., 34.equal? 34 is true whereas "hi".equal? "hi" is false.

    fruits = {
      23 => "apple",
      109 => "orange",
      # ...
      23429 => "banana"
    }
    
  3. Hash with interned string representations of fixnums as keys

    By converting the fixnums to strings and then to symbols, I'm able to use symbols as keys. This conversion, however, is annoying, and someone once told me that interning strings is inefficient. Is that so? The keys also just look ugly to me.

    fruits = {
      :"23" => "apple",
      :"109" => "orange",
      # ...
      :"23429" => "banana"
    }
    
  4. Hash with symbols as keys

    I can get prettier symbols (and also use the new Ruby 1.9 hash syntax) by prefixing each key with an alpha character, but then, this solution also requires conversion.

    fruits = {
      i23: "apple",
      i109: "orange",
      # ...
      i23429: "banana"
    }
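
On the identity question raised under option 2: as far as I know, Hash finds keys through `hash` and `eql?` rather than `equal?`, so whether two keys are the same object does not decide whether a lookup succeeds. A minimal sketch, assuming Ruby 2.4 or later (where Fixnum is folded into Integer); the variable names are illustrative only.

```ruby
# Two distinct String objects with the same content.
a = String.new("hi")
b = String.new("hi")

same_object  = a.equal?(b)        # identity check: two separate objects
same_content = a.eql?(b)          # value equality, which Hash relies on
same_hash    = a.hash == b.hash   # equal content gives equal hash codes

# Small integers are immediate values, so identity and equality coincide.
int_identity = 34.equal?(34)

by_string = { a => "greeting" }
by_int    = { 34 => "apple" }

puts same_object, same_content, same_hash, int_identity
puts by_string[b]   # found, even though b is a different object than a
puts by_int[34]
```

So the `equal?` difference says something about how the objects are stored, not about whether strings or integers are usable as keys.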
    

2 answers

#1


2  

AFAIK the reasoning is that symbol.hash is constant, so calling hash on a symbol is a simple property lookup and quite fast; symbols are optimized for this particular use. The hash value for a string needs to be computed, so calling hash on a string involves real work, and strings don't appear to cache their hash values. The hash value for a Fixnum appears to be computed with some simple bit mangling on the Fixnum's internal object ID (a constant), so it should also be quick. Don't take any of this as authoritative: I just did a quick review of the 1.9.2 source, and I'm hardly an expert on the Ruby internals.
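
One way to check the speed claims on your own interpreter, rather than relying on a reading of the source, is a rough micro-benchmark with the standard `benchmark` library. This is only a sketch: the iteration count `N` and the hash names are arbitrary choices, and the timings will vary with Ruby version and machine.

```ruby
require "benchmark"

N = 200_000  # arbitrary iteration count for the sketch

int_keys = { 23 => "apple", 109 => "orange", 23429 => "banana" }
sym_keys = { i23: "apple", i109: "orange", i23429: "banana" }
str_keys = { "23" => "apple", "109" => "orange", "23429" => "banana" }

# Time N lookups of the same record under each key type.
Benchmark.bm(8) do |x|
  x.report("integer") { N.times { int_keys[109] } }
  x.report("symbol")  { N.times { sym_keys[:i109] } }
  x.report("string")  { N.times { str_keys["109"] } }
end
```

Treat the output as a sanity check only; micro-benchmarks like this are easily skewed by interpreter optimizations.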

That said, I'd use Fixnums as hash keys. That gives you a natural representation for a sparse array that is also efficient in terms of memory. Any speed differences will probably be irrelevant noise. So, go with the clearest approach and worry about optimization when there is a real speed problem.
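
The sparse-array point can be seen by comparing sizes. A small sketch using the IDs from the question; the names `sparse_array` and `sparse_hash` are mine.

```ruby
sparse_array = []
sparse_array[23] = "apple"
sparse_array[109] = "orange"
sparse_array[23429] = "banana"

sparse_hash = { 23 => "apple", 109 => "orange", 23429 => "banana" }

puts sparse_array.size           # 23430 slots, almost all of them nil
puts sparse_array.compact.size   # 3 real values
puts sparse_hash.size            # 3 entries

# A missing ID reads as nil in both structures.
puts sparse_array[50].inspect
puts sparse_hash[50].inspect
```

The array has to cover every index up to the largest ID, while the hash only stores the entries that exist.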

#2


5  

My suggestion: use a Hash with Fixnum keys.

As you say, this will allow a sparse object. There are special speed and memory optimizations that apply to Fixnums. They compare as expected and convert readily to other types. It should be faster and simpler than symbols, and you won't have the strangeness of interning strings that couldn't ordinarily have been parsed as symbols.
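
To make the conversion cost concrete, here is a sketch that builds both kinds of hash from the same data. The `rows` array is a hypothetical stand-in for `[id, name]` pairs read from the database, and the block form of `to_h` assumes Ruby 2.6 or later.

```ruby
# Hypothetical rows, shaped like [id, name] pairs read from the database.
rows = [[23, "apple"], [109, "orange"], [23429, "banana"]]

# Integer keys: the rows convert directly, and lookups use the ID as-is.
by_id = rows.to_h

# Symbol keys (option 3): every ID goes through to_s.to_sym on the way in...
by_symbol = rows.to_h { |id, name| [id.to_s.to_sym, name] }

# ...and again on every lookup that starts from a numeric ID.
id = 109
puts by_id[id]
puts by_symbol[id.to_s.to_sym]

# A symbol such as :"23" needs the quoted form, since a bare :23 does not parse.
puts by_symbol.keys.first.inspect
```

With integer keys the IDs coming out of the database can be used unchanged at both ends.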
