Ruby: how do I use sort_by to sort one array by another array?

I'd like to sort the first array:

 filenames = ["z.pdf", "z.txt", "a.pdf", "z.rf", "a.rf","a.txt", "z.html", "a.html"]

by the order of the following array of file extensions:

 extensions = ["html", "txt", "pdf", "rf"]

using sort_by. But when I try:

 filenames.sort_by { |x| extensions.index x.split('.')[1] }

I get:

 ["a.html", "z.html", "z.txt", "a.txt", "a.pdf", "z.pdf", "z.rf", "a.rf"]

The filenames with the extensions "txt" and "rf" are not sorted by name within their group. I've tried to figure out how sort_by sorts when given a tuple, but I haven't been able to find the source code for sort_by.

How can I sort one array by another array using sort_by?

Edit:

The result should look like:

["a.html", "z.html", "a.txt", "z.txt", "a.pdf", "z.pdf", "a.rf", "z.rf"]

7 Answers

#1

Sort by the index of the extensions array, then the filename:

filenames = ["z.pdf", "z.txt", "a.pdf", "z.rf", "a.rf","a.txt", "z.html", "a.html"]
extensions = ["html", "txt", "pdf", "rf"]

p sorted = filenames.sort_by{|fn| [extensions.index(File.extname(fn)[1..-1]), fn]} #[1..-1] chops off the dot
#=> ["a.html", "z.html", "a.txt", "z.txt", "a.pdf", "z.pdf", "a.rf", "z.rf"]

#2

sorted = filenames.sort_by do |filename|
  extension = File.extname(filename).gsub(/^\./, '')
  [
    extensions.index(extension) || -1,
    filename,
  ]
end
p sorted
# => ["a.html", "z.html", "a.txt", "z.txt", "a.pdf", "z.pdf", "a.rf", "z.rf"]

This uses the fact that arrays are compared element by element, in order. That means that if the block passed to sort_by returns an array, the first element of the array is the primary sort key, the second element is the secondary sort key, and so on. We exploit that to sort by extension first and by filename second.
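
The element-by-element comparison can be seen directly with Array#<=>. The following is a minimal sketch; the helper name sort_by_extension_order is hypothetical, and it assumes every file's extension appears in the order list.

```ruby
# Arrays compare element by element, left to right (Array#<=>).
p [0, "z.html"] <=> [1, "a.txt"]   # => -1 (first elements differ, so they decide)
p [1, "a.txt"]  <=> [1, "z.txt"]   # => -1 (first elements tie, so the second decides)

# Hypothetical helper: sort names by [position of extension, name].
# Assumes every extension is present in `order`.
def sort_by_extension_order(names, order)
  names.sort_by { |name| [order.index(File.extname(name)[1..-1]), name] }
end

p sort_by_extension_order(["z.txt", "a.txt", "z.html"], ["html", "txt"])
# => ["z.html", "a.txt", "z.txt"]
```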

If an extension is not in the list, this code puts it first by virtue of || -1. To put an unknown extension last, replace -1 with extensions.size.
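
As a rough illustration of the two fallback choices, here is a small sketch. The helper sort_with_fallback and its unknown_rank parameter are made up for this example. Without a fallback, index returns nil for an unlisted extension, and sort_by cannot compare a key containing nil with one containing an Integer.

```ruby
# Hypothetical helper; `unknown_rank` decides where unlisted extensions go.
def sort_with_fallback(names, order, unknown_rank)
  names.sort_by do |name|
    ext = File.extname(name).sub(/\A\./, "")   # drop the leading dot
    [order.index(ext) || unknown_rank, name]   # nil index falls back to unknown_rank
  end
end

names = ["b.txt", "x.zip", "a.html"]
order = ["html", "txt"]

p sort_with_fallback(names, order, -1)          # => ["x.zip", "a.html", "b.txt"]
p sort_with_fallback(names, order, order.size)  # => ["a.html", "b.txt", "x.zip"]
```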

#3

How about:

>> filenames.sort.group_by{ |s| File.extname(s)[1..-1] }.values_at(*extensions).flatten
[
    [0] "a.html",
    [1] "z.html",
    [2] "a.txt",
    [3] "z.txt",
    [4] "a.pdf",
    [5] "z.pdf",
    [6] "a.rf",
    [7] "z.rf"
]

group_by comes from Enumerable, and is a nice tool in our collection toolbox, letting us group things by "like" attributes. In this case, it's grouping on the file's extension, retrieved using File.extname, minus its leading '.'.
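
To see what the grouping step produces on its own, here is a small sketch. The helper name group_by_extension is hypothetical, and it assumes every name has an extension.

```ruby
# Hypothetical helper: group sorted names by extension (without the dot).
# Sorting first leaves each group's values in alphabetical order.
def group_by_extension(names)
  names.sort.group_by { |name| File.extname(name)[1..-1] }
end

groups = group_by_extension(["z.txt", "a.html", "a.txt"])
p groups.keys     # => ["html", "txt"]
p groups["txt"]   # => ["a.txt", "z.txt"]
```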

It's worth understanding why File.extname matters here. A file name can have multiple sections delimited by '.', for various reasons. Simply using split('.') is a recipe for disaster in that case, because the code following the split has to deal with more than two strings. Other file names don't contain a delimiting '.' at all. File.extname makes a reasonable attempt to retrieve the last extension found in the name, so it is a saner way of dealing with file names and extensions. From the documentation:

File.extname("test.rb")         #=> ".rb"
File.extname("a/b/d/test.rb")   #=> ".rb"
File.extname("foo.")            #=> ""
File.extname("test")            #=> ""
File.extname(".profile")        #=> ""
File.extname(".profile.sh")     #=> ".sh"
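
To make the split('.') pitfall concrete, this short sketch compares the two approaches on a few made-up file names:

```ruby
# Each line prints: [name, result of split(".")[1], result of File.extname]
["report.pdf", "archive.tar.gz", "README"].each do |name|
  p [name, name.split(".")[1], File.extname(name)]
end
# ["report.pdf", "pdf", ".pdf"]
# ["archive.tar.gz", "tar", ".gz"]
# ["README", nil, ""]
```

With split, the second name yields "tar" rather than "gz", and the third yields nil, which the rest of the code would have to handle.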

values_at comes from Hash, and extracts the values from a hash, in the order of the keys/parameters passed in. It's great for this sort of situation because we can force the order of the values to match the order of keys. When you have a huge hash and want to cherry-pick certain values from it in one action, values_at is the tool to grab. If you need your "by-extensions" order to be different, change extensions and the output will automagically reflect that as a result of values_at.
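
Here is a minimal sketch of how changing the extension order changes the output. The helper name order_by_extensions is hypothetical. Note two assumptions it makes visible: values_at returns nil for an extension that has no files, so compact is added here to drop those slots, and files whose extension is not listed are left out of the result.

```ruby
# Hypothetical helper: emit names grouped in the order given by `order`.
def order_by_extensions(names, order)
  groups = names.sort.group_by { |name| File.extname(name).sub(/\A\./, "") }
  # values_at returns nil for a key with no group, so compact drops those slots.
  groups.values_at(*order).compact.flatten
end

names = ["z.pdf", "a.txt", "a.pdf", "z.txt"]
p order_by_extensions(names, ["txt", "pdf"])          # => ["a.txt", "z.txt", "a.pdf", "z.pdf"]
p order_by_extensions(names, ["pdf", "html", "txt"])  # => ["a.pdf", "z.pdf", "a.txt", "z.txt"]
```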

#4

filenames.sort_by do |f|
  base, ext = f.split(".")   # assumes exactly one dot per filename
  [extensions.index(ext), base]
end

#5

extensions = [".html", ".txt", ".pdf", ".rf"]
filenames.sort_by { |file_name_string|
  [ extensions.index( File.extname file_name_string ), file_name_string ]
}

#6

filenames = ["z.pdf", "z.txt", "a.pdf", "z.rf", "a.rf","a.txt", "z.html", "a.html"]
extensions = ["html", "txt", "pdf", "rf"]
extensions.each_with_object([]){|k,ob| ob << filenames.find_all {|i| File.extname(i)[1..-1] == k }.sort}.flatten
#=> ["a.html", "z.html", "a.txt", "z.txt", "a.pdf", "z.pdf", "a.rf", "z.rf"]

#7

There is no need to use the File class. A light, simple regex is enough.

filenames.sort_by { |i| [extensions.index(i[/\.([^.]+)\z/, 1]), i] }
