Subclassing Ruby Hash: object has no Hash methods?

I'm creating a hash-like object for a small script that reads a file one line at a time and collects arrays in my hash class. I get wildly different results depending on whether I subclass Hash or not, and using super changes things in ways I don't understand.

My main issue is that without subclassing Hash (< Hash) it works perfectly, but I get none of Hash's methods (such as iterating over the keys and getting things out of it). Subclassing Hash lets me do those things, but it seems that only the last element of each array is ever stored. Any insight into how to get those methods in a subclass would be appreciated. The Dictionary class is a great example I found on this site, and it does exactly what I want, so I'm trying to understand how to use it properly.

filename = 'inputfile.txt.'

# ??? class Dictionary < Hash
class Dictionary
  def initialize()
    @data = Hash.new { |hash, key| hash[key] = [] }
  end
  def [](key)
    @data[key]
  end
  def []=(key,words)
    @data[key] += [words].flatten
    @data[key]
#    super(key,words)
  end
end


listData = Dictionary.new

File.open(filename, 'r').each_line do |line|
  line = line.strip.split(/[^[:alpha:]|@|\.]/)
  puts "LIST-> #{line[0]}  SUB->  #{line[1]}  "
  listData[line[0]] = ("#{line[1]}")  
end

puts '====================================='
puts listData.inspect
puts '====================================='
print listData.reduce('') {|s, (k, v)|
  s << "The key is #{k} and the value is #{v}.\n"
}

If anyone understands what is going on here subclassing hash, and has some pointers, that would be excellent.

Running without explicit < Hash:

./list.rb:34:in `<main>': undefined method `reduce' for #<Dictionary:0x007fcf0a8879e0> (NoMethodError)

That is the typical error I see when I try and iterate in any way over my hash.

Here is a sample input file:

listA   billg@microsoft.com
listA   ed@apple.com
listA   frank@lotus.com
listB   evanwhite@go.com
listB   joespink@go.com
listB   fredgrey@stop.com

2 Solutions

#1

I can't reproduce your problem using your code:

d = Dictionary.new               #=> #<Dictionary:0x007f903a1adef8 @data={}>
d[4] << 5                        #=> [5]
d[5] << 6                        #=> [6]
d                                #=> #<Dictionary:0x007f903a1adef8 @data={4=>[5], 5=>[6]}>
d.instance_variable_get(:@data)  #=> {4=>[5], 5=>[6]}

But of course you won't get reduce if you don't subclass or include a class/module that defines it, or define it yourself!

The way you have implemented Dictionary is bound to have problems. You should call super instead of reimplementing wherever possible. For example, simply this works:

class Dictionary < Hash
  def initialize
    super { |hash, key| hash[key] = [] }
  end
end

d = Dictionary.new  #=> {}
d['answer'] << 42   #=> [42]
d['pi'] << 3.14     #=> [3.14]
d                   #=> {"answer"=>[42], "pi"=>[3.14]}
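
One possible explanation for the "only the last element is stored" symptom in the question, offered as an assumption since the exact failing variant was not posted: if `< Hash` is enabled together with the commented-out `super(key, words)` line, the arrays accumulate in `@data`, while the storage inherited from Hash (which `each`, `reduce`, and `to_a` read) only ever receives the most recent value. The class name `SplitDictionary` is hypothetical, chosen only to avoid clashing with `Dictionary`.

```ruby
# Sketch of the question's class with both `< Hash` and `super` enabled.
class SplitDictionary < Hash
  def initialize
    # A second, separate hash that the inherited Hash methods never look at.
    @data = Hash.new { |hash, key| hash[key] = [] }
  end

  def [](key)
    @data[key]
  end

  def []=(key, words)
    @data[key] += [words].flatten
    super(key, words) # overwrites the inherited storage with the latest value only
  end
end

d = SplitDictionary.new
d['listA'] = 'billg@microsoft.com'
d['listA'] = 'ed@apple.com'

p d['listA'] # read through @data: both addresses
p d.to_a     # read through the inherited storage: only the last address
```

The two reads disagree because the object effectively holds two hashes, which is why calling super without a separate `@data` is the simpler route.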

If you want to reimplement how and where the internal hash is stored (i.e., using @data), you'd have to reimplement at least each (since that is what almost all Enumerable methods are built on) as well as the getters/setters. That's not worth the effort when you can just change one method instead.
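
For comparison, here is a minimal sketch of that heavier route: keeping `@data` and adding `each` plus `include Enumerable`. The class name `WrappedDictionary` is hypothetical and not part of the original answer; it only illustrates how much has to be written by hand to get `reduce` back.

```ruby
class WrappedDictionary
  include Enumerable # provides reduce, map, select, ... on top of each

  def initialize
    @data = Hash.new { |hash, key| hash[key] = [] }
  end

  def [](key)
    @data[key]
  end

  def []=(key, words)
    @data[key].concat([words].flatten)
  end

  # Every Enumerable method is built on this one method.
  def each(&block)
    return enum_for(:each) unless block_given?

    @data.each(&block)
    self
  end
end

d = WrappedDictionary.new
d['listA'] = 'billg@microsoft.com'
d['listA'] = 'ed@apple.com'
d['listB'] = 'evanwhite@go.com'

summary = d.reduce(''.dup) do |s, (k, v)|
  s << "The key is #{k} and the value is #{v}.\n"
end
print summary
```

Hash-specific methods such as `keys` or `fetch` would still be missing here and would need their own definitions or delegation.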

#2

While Andrew Marshall's answer is already correct, you could also try the alternative below.

Going by your code, we can assume that you want to create an object that acts like a Hash, but with slightly different behaviour. Hence our first line of code looks like this.

class Dictionary < Hash

Assigning a value to a key in the dictionary works differently here. As in your example above, the assignment does not replace the previous value; instead it pushes the new value onto the existing array or, if the key doesn't exist yet, creates a new array initialized with the new value.

Here I use the << operator as shorthand for Array's push method. The method also returns the value, since that is what super does (see the if branch).

  def []=(key, value)
    if self[key]
      self[key] << value
      return value # here we mimic what super do
    else
      super(key, [value])
    end
  end

The advantage of using our own class is that we can add new methods to it, and they will be available to all of its instances. That way we don't need to monkeypatch the Hash class, which is considered dangerous.

  def size_of(key)
    return self[key].size if self[key]
    return 0   # the case for non existing key
  end

Now, if we combine all of the above, we get this code:

class Dictionary < Hash
  def []=(key, value)
    if self[key]
      self[key] << value
      return value
    else
      super(key, [value])
    end
  end

  def size_of(key)
    return self[key].size if self[key]
    return 0   # the case for non existing key
  end
end

player_emails = Dictionary.new

player_emails["SAO"] = "kirito@sao.com" # note no << operator needed here
player_emails["ALO"] = "lyfa@alo.com"
player_emails["SAO"] = "lizbeth@sao.com"
player_emails["SAO"] = "asuna@sao.com"

player_emails.size_of("SAO") #=> 3
player_emails.size_of("ALO") #=> 1
player_emails.size_of("GGO") #=> 0

p player_emails
#=> {"SAO" => ["kirito@sao.com", "lizbeth@sao.com", "asuna@sao.com"],
#=>  "ALO" => ["lyfa@alo.com"] }
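
Because this Dictionary inherits from Hash, the iteration methods the question asked about are available without extra code. The sketch below illustrates that, along with one caveat worth knowing: only `[]=` was overridden, so other writers such as `store` bypass the array-wrapping logic. The `GGO` address is a made-up value used purely for illustration.

```ruby
class Dictionary < Hash
  def []=(key, value)
    if self[key]
      self[key] << value
    else
      super(key, [value])
    end
  end
end

emails = Dictionary.new
emails["SAO"] = "kirito@sao.com"
emails["SAO"] = "asuna@sao.com"
emails["ALO"] = "lyfa@alo.com"

# Inherited from Hash and Enumerable:
emails.each { |k, v| puts "#{k}: #{v.join(', ')}" }
counts = emails.reduce({}) { |acc, (k, v)| acc.merge(k => v.size) }
p counts

# Caveat: store is a separate method and does not go through our []=.
emails.store("GGO", "sinon@ggo.example") # hypothetical address
p emails["GGO"].class # String, not Array
```

If uniform behaviour matters, `store` (and bulk writers such as `merge!`) would need the same treatment, or callers should stick to `[]=`.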

But, of course, the class definition could be replaced with this single line:

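player_emails = Hash.new { |hash, key| hash[key] = [] }
# (a bare Hash.new { [] } would hand back a fresh array
# on every lookup without ever storing it)
#
# note that we won't use
#
#     player_emails[key] = value
#
# but instead
#
#     player_emails[key] << value
#
# Oh, and if you count the comments,
# it is no longer a single line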
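
A small sketch of how the default block behaves, since the difference is easy to miss: a block that merely returns a new array hands back a throwaway object on each lookup, whereas a block that assigns into the hash stores it. The variable names are illustrative only.

```ruby
# The block returns a new array but never stores it in the hash.
without_store = Hash.new { [] }
without_store["SAO"] << "kirito@sao.com"
p without_store.size # the pushed value was lost with the throwaway array

# The block assigns the new array to the missing key before returning it.
with_store = Hash.new { |hash, key| hash[key] = [] }
with_store["SAO"] << "kirito@sao.com"
with_store["SAO"] << "asuna@sao.com"
p with_store["SAO"]
```

One side effect of the storing form is that merely reading a missing key creates it; `fetch(key, [])` reads without inserting.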

With the answer finished, I want to comment on some of your example code:

filename = 'inputfile.txt.'
# Maybe it's better to use ARGF instead,
# so you can supply the filename on the command line.
# Also, does the filename really end with a dot? O.o;

File.open(filename, 'r').each_line do |line|
# This line opens the file anonymously,
# then iterates over each line of the file.
# The file is never closed explicitly here; it stays open until
# the File object is garbage-collected or the script exits.

# Safer version:
File.open(filename, 'r') do |file|
  file.each_line do |line|
    # ...
  end
end   # the file is closed when we reach here

# ARGF version:
ARGF.each_line do |line|
  # ...
end

# Inside the each_line block
line = line.strip.split(/[^[:alpha:]|@|\.]/)
# I don't know what you intended with that line,
# but with the sample input that regex yields
#
#     ["listA", "", "", "billg@microsoft.com"]
#
# Hence, your example will fail, since
# line[0] == "listA" and line[1] == "".
# Also note that your regex means
#
# any character except:
#   letters, '|', '@', '.'
#
# If you want to split on one or more
# whitespace characters, use \s+ instead.
# Hence we could replace it with:
line = line.strip.split(/\s+/)

puts "LIST-> #{line[0]} SUB-> #{line[1]}   "
# OK, is this supposed to debug the line?
# Tip: the simplest way to debug is:
#
#     p line
#
# that's all.

listData[line[0]] = ("#{line[1]}")
# Why wrap it in (), then "", then #{}?
# I suggest:
listData[line[0]] = line[1]

# But to make it simpler, you could actually do this instead:
key, value = line.strip.split(/\s+/)
listData[key] = value

# Outside the block:
puts '====================================='
# OK, that's too loooooooooong...
puts '=' * 30
# or better, assign it to a variable, since you use it twice
a = '=' * 30
puts a
p listData # better way to debug
puts a

# next:
print listData.reduce('') { |s, (k, v)|
  s << "The key is #{k} and the value is #{v}.\n"
}
# Why use reduce?
# For debugging you could use `p listData` instead.
# But since you are printing it anyway, why not iterate over
# each element and print it?
listData.each do |k, v|
  puts "The key is #{k} and the value is #{v}."
end
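
Putting the comments above together, a cleaned-up version of the script might look like the following sketch. The helper name `group_addresses` is hypothetical, and a StringIO stands in for the real input file so the example runs on its own.

```ruby
require 'stringio'

# Groups "list address" lines into { list => [addresses] }.
# Accepts any object that responds to each_line (File, ARGF, StringIO).
def group_addresses(io)
  lists = Hash.new { |hash, key| hash[key] = [] }
  io.each_line do |line|
    key, value = line.strip.split(/\s+/)
    next if key.nil? || value.nil? # skip blank or malformed lines

    lists[key] << value
  end
  lists
end

sample = StringIO.new(<<~INPUT)
  listA   billg@microsoft.com
  listA   ed@apple.com
  listB   evanwhite@go.com
INPUT

group_addresses(sample).each do |k, v|
  puts "The key is #{k} and the value is #{v}."
end

# With a real file, the block form closes the handle automatically:
#   File.open('inputfile.txt') { |file| group_addresses(file) }
```

Taking an IO-like argument rather than a filename keeps the grouping logic independent of where the lines come from.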

OK, sorry for rambling so much. Hope it helps.
