I have an array in Bash, for example:
array=(a c b f 3 5)
I need to sort the array: not just print its contents in sorted order, but get a new array holding the sorted elements. The sorted array can be a completely new one or can replace the old one.
15 answers
#1
129
You don't really need all that much code:
IFS=$'\n' sorted=($(sort <<<"${array[*]}"))
unset IFS
Supports whitespace in elements (as long as it's not a newline), and works in Bash 3.x.
e.g.:
$ array=("a c" b f "3 5")
$ IFS=$'\n' sorted=($(sort <<<"${array[*]}"))
$ printf "[%s]\n" "${sorted[@]}"
[3 5]
[a c]
[b]
[f]
Note: @sorontar has pointed out that care is required if elements contain wildcards such as * or ?:
The sorted=($(...)) part is using the "split and glob" operator. You should turn glob off: set -f or set -o noglob or shopt -os noglob, or an element of the array like * will be expanded to a list of files.
What's happening:
The result is a culmination of six things that happen in this order:
1. IFS=$'\n'
2. "${array[*]}"
3. <<<
4. sort
5. sorted=($(...))
6. unset IFS
First, the IFS=$'\n'
This is an important part of our operation that affects the outcome of 2 and 5 in the following way:
Given:
- "${array[*]}" expands to every element delimited by the first character of IFS
- sorted=() creates elements by splitting on every character of IFS
IFS=$'\n' sets things up so that elements are expanded using a newline as the delimiter, and then later created in a way that each line becomes an element (i.e. splitting on a newline).
Delimiting by a newline is important because that's how sort operates (sorting per line). Splitting on only a newline is less important, but is needed to preserve elements that contain spaces or tabs.
The default value of IFS is a space, a tab, followed by a newline, and would be unfit for our operation.
Next, the sort <<<"${array[*]}" part
<<<, called here strings, takes the expansion of "${array[*]}", as explained above, and feeds it into the standard input of sort.
With our example, sort is fed the following string:
a c
b
f
3 5
Since sort sorts, it produces:
3 5
a c
b
f
Next, the sorted=($(...)) part
The $(...) part, called command substitution, causes its content (sort <<<"${array[*]}") to run as a normal command, while taking the resulting standard output as the literal that goes wherever $(...) was.
In our example, this produces something similar to simply writing:
sorted=(3 5
a c
b
f
)
sorted then becomes an array that's created by splitting this literal on every newline.
Finally, the unset IFS
This unsets IFS, which makes the shell fall back to its default splitting behavior, and is just good practice.
It's to ensure we don't cause trouble with anything that relies on IFS later in our script. (Otherwise we'd need to remember that we've switched things around, something that might be impractical for complex scripts.)
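The steps above can be wrapped in a function so that the IFS change stays local and globbing is switched off only for the duration of the sort. This is a minimal sketch, not part of the original answer: the function name sort_into_sorted and the variable restore_glob are made up for illustration, and the result is stored in the global array sorted.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): sorts its arguments line-wise
# and stores the result in the global array `sorted`.
sort_into_sorted() {
    local IFS=$'\n'            # confined to this function; no `unset IFS` needed
    local restore_glob=0
    # Remember whether globbing was enabled so it is only re-enabled if it was.
    [[ $- != *f* ]] && restore_glob=1
    set -f                     # keep elements such as '*' from expanding to file names
    sorted=($(sort <<<"$*"))   # "$*" joins on the first character of IFS (newline)
    (( restore_glob )) && set +f
    return 0
}

array=("a c" b '*' "3 5")
sort_into_sorted "${array[@]}"
printf '[%s]\n' "${sorted[@]}"
```

Elements containing newlines are still split apart, exactly as with the one-liner.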
#2
31
Original response:
array=(a c b "f f" 3 5)
readarray -t sorted < <(for a in "${array[@]}"; do echo "$a"; done | sort)
output:
$ for a in "${sorted[@]}"; do echo "$a"; done
3
5
a
b
c
f f
Note that this version copes with values that contain special characters or whitespace (except newlines).
Note that readarray is supported in Bash 4+.
Edit: Based on the suggestion by @Dimitre I had updated it to:
readarray -t sorted < <(printf '%s\0' "${array[@]}" | sort -z | xargs -0n1)
which has the benefit of correctly sorting even elements with embedded newline characters. Unfortunately, as correctly pointed out by @ruakh, this didn't mean the result of readarray would be correct, because readarray (before Bash 4.4 added -d) has no option to use NUL instead of regular newlines as line separators.
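On newer shells the NUL-delimited pipeline can be read back directly. The following is a sketch under two assumptions that go beyond the answer: Bash 4.4 or later (where readarray accepts -d) and a sort that supports -z (GNU or BSD).

```shell
#!/usr/bin/env bash
# Assumes Bash 4.4+ (for `readarray -d`) and a sort that supports -z.
array=(a c $'b\nb' "f f" 3 5)

# Emit NUL-terminated elements, sort them as NUL-terminated records,
# and read them back with NUL as the delimiter (-d '').
readarray -d '' -t sorted < <(printf '%s\0' "${array[@]}" | LC_ALL=C sort -z)

# %q shows each element unambiguously, including the embedded newline.
printf '%q\n' "${sorted[@]}"
```

LC_ALL=C is only there to make the ordering independent of the locale; drop it if locale-aware collation is wanted.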
#3
28
Here's a pure Bash quicksort implementation:
#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
qsort() {
    local pivot i smaller=() larger=()
    qsort_ret=()
    (($#==0)) && return 0
    pivot=$1
    shift
    for i; do
        if [[ $i < $pivot ]]; then
            smaller+=( "$i" )
        else
            larger+=( "$i" )
        fi
    done
    qsort "${smaller[@]}"
    smaller=( "${qsort_ret[@]}" )
    qsort "${larger[@]}"
    larger=( "${qsort_ret[@]}" )
    qsort_ret=( "${smaller[@]}" "$pivot" "${larger[@]}" )
}
Use as, e.g.,
$ array=(a c b f 3 5)
$ qsort "${array[@]}"
$ declare -p qsort_ret
declare -a qsort_ret='([0]="3" [1]="5" [2]="a" [3]="b" [4]="c" [5]="f")'
This implementation is recursive… so here's an iterative quicksort:
#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
# Note: iterative, NOT recursive! :)
qsort() {
    qsort_ret=()    # clear first, so an empty call leaves no stale result
    (($#==0)) && return 0
    local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger
    qsort_ret=("$@")
    while ((${#stack[@]})); do
        beg=${stack[0]}
        end=${stack[1]}
        stack=( "${stack[@]:2}" )
        smaller=() larger=()
        pivot=${qsort_ret[beg]}
        for ((i=beg+1;i<=end;++i)); do
            if [[ "${qsort_ret[i]}" < "$pivot" ]]; then
                smaller+=( "${qsort_ret[i]}" )
            else
                larger+=( "${qsort_ret[i]}" )
            fi
        done
        qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" )
        if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi
        if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi
    done
}
In both cases, you can change the ordering used: I used string comparisons, but you can use arithmetic comparisons, compare by file modification time, etc.; just use the appropriate test. You can even make it more generic and have it take, as its first argument, the name of the test function to use, e.g.,
#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
# Note: iterative, NOT recursive! :)
# First argument is a function name that takes two arguments and compares them
qsort() {
    qsort_ret=()    # clear first, so a call without elements leaves no stale result
    (($#<=1)) && return 0
    local compare_fun=$1
    shift
    local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger
    qsort_ret=("$@")
    while ((${#stack[@]})); do
        beg=${stack[0]}
        end=${stack[1]}
        stack=( "${stack[@]:2}" )
        smaller=() larger=()
        pivot=${qsort_ret[beg]}
        for ((i=beg+1;i<=end;++i)); do
            if "$compare_fun" "${qsort_ret[i]}" "$pivot"; then
                smaller+=( "${qsort_ret[i]}" )
            else
                larger+=( "${qsort_ret[i]}" )
            fi
        done
        qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" )
        if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi
        if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi
    done
}
Then you can have this comparison function:
compare_mtime() { [[ $1 -nt $2 ]]; }
and use:
$ qsort compare_mtime *
$ declare -p qsort_ret
to have the files in the current folder sorted by modification time (newest first).
NOTE: These functions are pure Bash: no external utilities and no subshells. They are safe with respect to any funny symbols you may have (spaces, newline characters, glob characters, etc.).
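As an illustration of swapping the ordering, here is a sketch that sorts integers numerically. It carries its own copy of the iterative, comparator-taking quicksort (with qsort_ret cleared up front) so that it runs on its own; the comparison functions compare_num and compare_num_desc are hypothetical names, not part of the answer.

```shell
#!/usr/bin/env bash
# Copy of the iterative quicksort that takes a comparison function as first argument.
qsort() {
    qsort_ret=()
    (($#<=1)) && return 0
    local compare_fun=$1
    shift
    local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger
    qsort_ret=("$@")
    while ((${#stack[@]})); do
        beg=${stack[0]}
        end=${stack[1]}
        stack=( "${stack[@]:2}" )
        smaller=() larger=()
        pivot=${qsort_ret[beg]}
        for ((i=beg+1;i<=end;++i)); do
            if "$compare_fun" "${qsort_ret[i]}" "$pivot"; then
                smaller+=( "${qsort_ret[i]}" )
            else
                larger+=( "${qsort_ret[i]}" )
            fi
        done
        qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" )
        if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi
        if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi
    done
}

# Hypothetical comparison functions (names are illustrative only).
compare_num()      { (( $1 < $2 )); }   # succeeds if $1 sorts before $2, ascending
compare_num_desc() { (( $1 > $2 )); }   # same, descending

qsort compare_num 10 9 100 1
declare -p qsort_ret
```

A string comparison would place 10 and 100 before 9; the arithmetic test treats the elements as integers instead, so it only makes sense when every element is one.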
#4
25
If you don't need to handle special shell characters in the array elements:
array=(a c b f 3 5)
sorted=($(printf '%s\n' "${array[@]}"|sort))
With bash you'll need an external sorting program anyway.
With zsh no external programs are needed and special shell characters are easily handled:
% array=('a a' c b f 3 5); printf '%s\n' "${(o)array[@]}"
3
5
a a
b
c
f
ksh has set -s to sort ASCIIbetically.
#5
8
On the 3-hour train trip from Munich to Frankfurt (which I had trouble reaching because Oktoberfest starts tomorrow) I was thinking about my first post. Employing a global array is a much better idea for a general sort function. The following function handles arbitrary strings (newlines, blanks etc.):
declare BSORT=()
function bubble_sort()
{   #
    # @param [ARGUMENTS]...
    #
    # Sort all positional arguments and store them in global array BSORT.
    # Without arguments sort this array. Print the number of passes made.
    #
    # Bubble sorting lets the heaviest element sink to the bottom.
    #
    (($# > 0)) && BSORT=("$@")
    local j=0 ubound=$((${#BSORT[*]} - 1))
    while ((ubound > 0))
    do
        local i=0
        while ((i < ubound))
        do
            if [ "${BSORT[$i]}" \> "${BSORT[$((i + 1))]}" ]
            then
                local t="${BSORT[$i]}"
                BSORT[$i]="${BSORT[$((i + 1))]}"
                BSORT[$((i + 1))]="$t"
            fi
            ((++i))
        done
        ((++j))
        ((--ubound))
    done
    echo $j
}
bubble_sort a c b 'z y' 3 5
echo ${BSORT[@]}
This prints the pass count (echoed by bubble_sort itself) followed by the sorted elements:
5
3 5 a b c z y
The same output is created from
BSORT=(a c b 'z y' 3 5)
bubble_sort
echo ${BSORT[@]}
Note that Bash probably uses smart pointers internally, so the swap operation could be cheap (although I doubt it). However, bubble_sort demonstrates that more advanced functions like merge_sort are also within reach of the shell language.
#6
6
tl;dr:
Sort array a_in and store the result in a_out (elements must not have embedded newlines [1]):
Bash v4+:
readarray -t a_out < <(printf '%s\n' "${a_in[@]}" | sort)
Bash v3:
IFS=$'\n' read -d '' -r -a a_out < <(printf '%s\n' "${a_in[@]}" | sort)
Advantages over antak's solution:
- You needn't worry about accidental globbing (accidental interpretation of the array elements as filename patterns), so no extra command is needed to disable globbing (set -f, and set +f to restore it later).
- You needn't worry about resetting IFS with unset IFS. [2]
Optional reading: explanation and sample code
The above combines Bash code with the external utility sort for a solution that works with arbitrary single-line elements and either lexical or numerical sorting (optionally by field):
- Performance: For around 20 elements or more, this will be faster than a pure Bash solution, significantly and increasingly so once you get beyond around 100 elements. (The exact thresholds will depend on your specific input, machine, and platform.)
  - The reason it is fast is that it avoids Bash loops.
- printf '%s\n' "${a_in[@]}" | sort performs the sorting (lexically, by default; see sort's POSIX spec):
  - "${a_in[@]}" safely expands to the elements of array a_in as individual arguments, whatever they contain (including whitespace).
  - printf '%s\n' then prints each argument, i.e., each array element, on its own line, as-is.
- Note the use of a process substitution (<(...)) to provide the sorted output as input to read / readarray (via redirection to stdin, <), because read / readarray must run in the current shell (must not run in a subshell) in order for output variable a_out to be visible to the current shell (for the variable to remain defined in the remainder of the script).
- Reading sort's output into an array variable:
  - Bash v4+: readarray -t a_out reads the individual lines output by sort into the elements of array variable a_out, without including the trailing \n in each element (-t).
  - Bash v3: readarray doesn't exist, so read must be used: IFS=$'\n' read -d '' -r -a a_out tells read to read into array (-a) variable a_out, reading the entire input, across lines (-d ''), but splitting it into array elements by newlines (IFS=$'\n'; $'\n', which produces a literal newline (LF), is a so-called ANSI C-quoted string). (-r, an option that should virtually always be used with read, disables unexpected handling of \ characters.)
Annotated sample code:
#!/usr/bin/env bash
# Define input array `a_in`:
# Note the element with embedded whitespace ('a c') and the element that looks like
# a glob ('*'), chosen to demonstrate that elements with line-internal whitespace
# and glob-like contents are correctly preserved.
a_in=( 'a c' b f 5 '*' 10 )
# Sort and store output in array `a_out`
# Saving back into `a_in` is also an option.
IFS=$'\n' read -d '' -r -a a_out < <(printf '%s\n' "${a_in[@]}" | sort)
# Bash 4.x: use the simpler `readarray -t`:
# readarray -t a_out < <(printf '%s\n' "${a_in[@]}" | sort)
# Print sorted output array, line by line:
printf '%s\n' "${a_out[@]}"
Due to the use of sort without options, this yields lexical sorting (digits sort before letters, and digit sequences are treated lexically, not as numbers):
*
10
5
a c
b
f
If you wanted numerical sorting by the 1st field, you'd use sort -k1,1n instead of just sort, which yields (non-numbers sort before numbers, and numbers sort correctly):
*
a c
b
f
5
10
[1] To handle elements with embedded newlines, use the following variant (Bash v4.4+, with GNU sort): readarray -d '' -t a_out < <(printf '%s\0' "${a_in[@]}" | sort -z). Michał Górny's helpful answer has a Bash v3 solution.
[2] While IFS is set in the Bash v3 variant, the change is scoped to the command. By contrast, what follows IFS=$'\n' in antak's answer is an assignment rather than a command, in which case the IFS change is global.
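The scoping difference described in footnote [2] can be observed directly. The sketch below is illustrative only; the variable names ifs_after_read and ifs_after_assignment are made up here, and the trailing || true is an addition that keeps read's end-of-input status from mattering.

```shell
#!/usr/bin/env bash
a_in=( 'a c' b f )

# Assignment prefixed to a command (the `read` builtin): applies to that command only.
# `read -d ''` reports a non-zero status at end of input because it never sees a
# NUL delimiter, although the array is filled; `|| true` absorbs that status.
IFS=$'\n' read -d '' -r -a a_out < <(printf '%s\n' "${a_in[@]}" | sort) || true
ifs_after_read=$IFS

# Two assignments and no command: both persist in the current shell.
IFS=$'\n' sorted=($(printf '%s\n' "${a_in[@]}" | sort))
ifs_after_assignment=$IFS

unset IFS   # back to default splitting behaviour

printf 'after read:       %q\n' "$ifs_after_read"
printf 'after assignment: %q\n' "$ifs_after_assignment"
```

After the read line IFS still holds its previous value, whereas after the assignment line it holds only a newline until it is unset or reassigned.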
#7
5
Another solution that uses external sort and copes with any special characters (except for NULs :)). Should work with bash-3.2 and GNU or BSD sort (sadly, POSIX doesn't include -z).
local e new_array=()
while IFS= read -r -d '' e; do
    new_array+=( "${e}" )
done < <(printf "%s\0" "${array[@]}" | LC_ALL=C sort -z)
First look at the input redirection at the end. We're using the printf built-in to write out the array elements, zero-terminated. The quoting makes sure array elements are passed as-is, and the shell's printf reuses the format string for each remaining parameter. That is, it's equivalent to something like:
for e in "${array[@]}"; do
    printf "%s\0" "${e}"
done
The null-terminated element list is then passed to sort. The -z option causes it to read null-terminated elements, sort them and output them null-terminated as well. If you need only the unique elements, you can pass -u, since it is more portable than uniq -z. LC_ALL=C ensures a stable sort order independent of the locale, which is sometimes useful for scripts. If you want sort to respect the locale, remove it.
The <() construct obtains the descriptor to read from the spawned pipeline, and < redirects the standard input of the while loop to it. If you need to access the standard input inside the pipe, you may use another descriptor (an exercise for the reader :)).
Now, back to the beginning. The read built-in reads output from the redirected stdin. Setting an empty IFS disables word splitting, which is unnecessary here; as a result, read reads the whole 'line' of input into the single provided variable. The -r option disables escape processing, which is undesired here as well. Finally, -d '' sets the line delimiter to NUL, that is, tells read to read zero-terminated strings.
As a result, the loop is executed once for every successive zero-terminated array element, with the value being stored in e. The example just puts the items in another array, but you may prefer to process them directly :).
Of course, that's just one of the many ways of achieving the same goal. As I see it, it is simpler than implementing a complete sorting algorithm in bash, and in some cases it will be faster. It handles all special characters including newlines and should work on most common systems. Most importantly, it may teach you something new and awesome about bash :).
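Because local is only valid inside a function, the snippet is easiest to try when wrapped in one. The following is a sketch, not the author's code: the function name sort_args_z is hypothetical, the result goes to the global array new_array, and a sort supporting -z (GNU or BSD) is assumed.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: sorts its arguments and stores them in the
# global array `new_array`. Assumes a sort that supports -z.
sort_args_z() {
    local e
    new_array=()
    (( $# )) || return 0      # no arguments: printf would still emit one empty record
    while IFS= read -r -d '' e; do
        new_array+=( "${e}" )
    done < <(printf '%s\0' "$@" | LC_ALL=C sort -z)
}

array=( 'b c' $'a\nz' '*' 'a' )
sort_args_z "${array[@]}"
printf '<%s>\n' "${new_array[@]}"
```

The element with the embedded newline and the lone * both come back unchanged, since nothing is ever word-split or glob-expanded.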
#8
4
min sort:
#!/bin/bash
array=(.....)
index_of_element1=0
while (( ${index_of_element1} < ${#array[@]} )); do
    element_1="${array[${index_of_element1}]}"
    index_of_element2=$((index_of_element1 + 1))
    index_of_min=${index_of_element1}
    min_element="${element_1}"
    for element_2 in "${array[@]:$((index_of_element1 + 1))}"; do
        min_element="`printf "%s\n%s" "${min_element}" "${element_2}" | sort | head -n+1`"
        if [[ "${min_element}" == "${element_2}" ]]; then
            index_of_min=${index_of_element2}
        fi
        let index_of_element2++
    done
    array[${index_of_element1}]="${min_element}"
    array[${index_of_min}]="${element_1}"
    let index_of_element1++
done
#9
3
try this:
echo ${array[@]} | awk 'BEGIN{RS=" ";} {print $1}' | sort
Output will be (one element per line):
3
5
a
b
c
f
Problem solved.
#10
2
array=(a c b f 3 5)
new_array=($(echo "${array[@]}" | sed 's/ /\n/g' | sort))
echo ${new_array[@]}
The echoed contents of new_array will be:
3 5 a b c f
#11
1
If you can compute a unique integer for each element in the array, like this:
tab='0123456789abcdefghijklmnopqrstuvwxyz'
# build the reversed ordinal map
for ((i = 0; i < ${#tab}; i++)); do
    declare -g ord_${tab:i:1}=$i
done
function sexy_int() {
    local sum=0
    local i ch ref
    for ((i = 0; i < ${#1}; i++)); do
        ch="${1:i:1}"
        ref="ord_$ch"
        (( sum += ${!ref} ))
    done
    return $sum
}
sexy_int hello
echo "hello -> $?"
sexy_int world
echo "world -> $?"
then you can use these integers as array indexes, because Bash always uses sparse arrays, so there is no need to worry about unused indexes:
array=(a c b f 3 5)
for el in "${array[@]}"; do
    sexy_int "$el"
    sorted[$?]="$el"
done
echo "${sorted[@]}"
- Pros. Fast.
- Cons. Duplicated elements are merged, and it can be impossible to map contents to unique 32-bit integers. Note also that return only carries values from 0 to 255, so larger sums wrap around.
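The limit on return values is easy to check, and a global variable avoids it. This is a sketch for illustration, not the author's method: ordinal_sum and ordinal_sum_ret are hypothetical names, and it sums character codes (via printf's leading-quote form) rather than using the ordinal table above.

```shell
#!/usr/bin/env bash
# `return` can only carry an exit status in the range 0..255.
status_300() { return 300; }
status_300
wrapped=$?          # 300 modulo 256

# Hypothetical alternative: hand the number back in a global variable.
ordinal_sum() {
    local i ch sum=0
    for ((i = 0; i < ${#1}; i++)); do
        printf -v ch '%d' "'${1:i:1}"   # character code of the i-th character
        (( sum += ch ))
    done
    ordinal_sum_ret=$sum
}

ordinal_sum hello
echo "return 300 -> $wrapped"
echo "hello      -> $ordinal_sum_ret"
```

A plain sum is still neither unique nor order-preserving for multi-character strings, so this only lifts the 0 to 255 limit; it does not make the mapping a valid sort key in general.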
#12
0
I am not convinced that you'll need an external sorting program in Bash.
Here is my implementation of the simple bubble-sort algorithm.
function bubble_sort()
{   #
    # Sorts all positional arguments and echoes them back.
    #
    # Bubble sorting lets the heaviest (lexically greatest) element sink to the bottom.
    #
    local array=("$@") max=$(($# - 1))
    while ((max > 0))
    do
        local i=0
        while ((i < max))
        do
            if [ "${array[$i]}" \> "${array[$((i + 1))]}" ]
            then
                local t=${array[$i]}
                array[$i]=${array[$((i + 1))]}
                array[$((i + 1))]=$t
            fi
            ((i += 1))
        done
        ((max -= 1))
    done
    echo "${array[@]}"
}
array=(a c b f 3 5)
echo " input: ${array[@]}"
echo "output: $(bubble_sort "${array[@]}")"
This shall print:
input: a c b f 3 5
output: 3 5 a b c f
#13
0
a=(e b 'c d')
shuf -e "${a[@]}" | sort >/tmp/f
mapfile -t g </tmp/f
#14
0
There is a workaround for the usual problem of spaces and newlines:
Use a character that is not in the original array (like $'\1' or $'\4' or similar).
This function gets the job done:
# Sort an array that may have spaces or newlines, with a workaround (wa=$'\4')
sortarray(){ local wa=$'\4' IFS=''
    if [[ $* =~ [$wa] ]]; then
        echo "$0: error: array contains the workaround char" >&2
        exit 1
    fi
    set -f; local IFS=$'\n' x nl=$'\n'
    set -- $(printf '%s\n' "${@//$nl/$wa}" | sort -n)
    for x
    do sorted+=("${x//$wa/$nl}")
    done
}
This will sort the array:
$ array=( a b 'c d' $'e\nf' $'g\1h')
$ sortarray "${array[@]}"
$ printf '<%s>\n' "${sorted[@]}"
<a>
<b>
<c d>
<e
f>
<gh>
This will complain that the source array contains the workaround character:
$ array=( a b 'c d' $'e\nf' $'g\4h')
$ sortarray "${array[@]}"
./script: error: array contains the workaround char
Description
- We set two local variables: wa (the workaround char) and a null IFS.
- Then (with IFS null) we test that the whole array $*
  - does not contain any workaround char: [[ $* =~ [$wa] ]].
  - If it does, raise a message and signal an error: exit 1
- Avoid filename expansions: set -f
- Set a new value of IFS (IFS=$'\n'), a loop variable x and a newline var (nl=$'\n').
- We print all values of the arguments received (the input array $@),
  - but we replace any newline by the workaround char: "${@//$nl/$wa}".
  - Send those values to be sorted: sort -n.
  - And place all the sorted values back in the positional arguments: set --.
- Then we assign each argument one by one (to preserve newlines),
  - in a loop: for x
  - to a new array: sorted+=(…)
  - inside quotes to preserve any existing newline,
  - restoring the workaround char to a newline: "${x//$wa/$nl}".
- Done.
#15
-1
sorted=($(echo ${array[@]} | tr " " "\n" | sort))
In the spirit of bash / linux, I would pipe the best command-line tool for each step. sort does the main job but needs input separated by newlines instead of spaces, so the very simple pipeline above simply does:
Echo array content --> replace space by newline --> sort
$() captures the output of the pipeline
($()) puts that captured output into an array
Note: as @sorontar mentioned in a comment to a different question:
The sorted=($(...)) part is using the "split and glob" operator. You should turn glob off: set -f or set -o noglob or shopt -os noglob, or an element of the array like * will be expanded to a list of files.
#1
129
You don't really need all that much code:
你不需要这么多代码:
IFS=$'\n' sorted=($(sort <<<"${array[*]}"))
unset IFS
Supports whitespace in elements (as long as it's not a newline), and works in Bash 3.x.
支持元素中的空白(只要它不是换行符),并在Bash 3.x中工作。
e.g.:
例如:
$ array=("a c" b f "3 5")
$ IFS=$'\n' sorted=($(sort <<<"${array[*]}"))
$ printf "[%s]\n" "${sorted[@]}"
[3 5]
[a c]
[b]
[f]
Note: @sorontar has pointed out that care is required if elements contain wildcards such as *
or ?
:
注意:@sorontar指出,如果元素包含诸如*或?
The sorted=($(...)) part is using the "split and glob" operator. You should turn glob off:
set -f
orset -o noglob
orshopt -op noglob
or an element of the array like*
will be expanded to a list of files.排序=($(…))部分使用“split和glob”操作符。您应该将glob关闭:设置-f或set -o noglob或shopt -op noglob或像*这样的数组元素将被扩展到一个文件列表。
What's happening:
The result is a culmination six things that happen in this order:
其结果是在这个顺序中发生了六件事:
IFS=$'\n'
- IFS = $ ' \ n '
"${array[*]}"
- " $ {阵列[*]} "
<<<
- < < <
sort
- 排序
sorted=($(...))
- 排序=((…)美元)
unset IFS
- 设置IFS
First, the IFS=$'\n'
This is an important part of our operation that affects the outcome of 2 and 5 in the following way:
这是我们手术的一个重要部分,它影响着2和5的结果:
Given:
考虑到:
-
"${array[*]}"
expands to every element delimited by the first character ofIFS
- “${array[*]}”扩展到每个元素,由IFS的第一个字符分隔。
-
sorted=()
creates elements by splitting on every character ofIFS
- 排序=()通过分解IFS的每个字符来创建元素。
IFS=$'\n'
sets things up so that elements are expanded using a new line as the delimiter, and then later created in a way that each line becomes an element. (i.e. Splitting on a new line.)
IFS=$'\n'设置这些元素,使元素以新行作为分隔符展开,然后以每一行成为元素的方式创建。(也就是说,在新线路上进行拆分。)
Delimiting by a new line is important because that's how sort
operates (sorting per line). Splitting by only a new line is not-as-important, but is needed preserve elements that contain spaces or tabs.
通过一条新行进行限制是很重要的,因为这是排序操作的方式(每行排序)。仅由一条新行进行拆分并不重要,但需要保存包含空格或制表符的元素。
The default value of IFS
is a space, a tab, followed by a new line, and would be unfit for our operation.
IFS的默认值是一个空格、一个制表符、后跟一个新行,并且不适合我们的操作。
Next, the sort <<<"${array[*]}"
part
<<<
, called here strings, takes the expansion of "${array[*]}"
, as explained above, and feeds it into the standard input of sort
.
<<,称为字符串,以“${array[*]}”的扩展为例,并将其输入到sort的标准输入中。
With our example, sort
is fed this following string:
在我们的示例中,sort被输入以下字符串:
a c
b
f
3 5
Since sort
sorts, it produces:
由于排序,它产生:
3 5
a c
b
f
Next, the sorted=($(...))
part
The $(...)
part, called command substitution, causes its content (sort <<<"${array[*]}
) to run as a normal command, while taking the resulting standard output as the literal that goes where ever $(...)
was.
$(…)部分,称为命令替换,将其内容(sort << ${数组[*]})作为一个普通命令运行,同时将生成的标准输出作为文本,以$(…)的形式输出。
In our example, this produces something similar to simply writing:
在我们的例子中,这产生了类似于简单的写:
sorted=(3 5
a c
b
f
)
sorted
then becomes an array that's created by splitting this literal on every new line.
排序然后变成一个数组,它是通过在每一个新行上拆分这个文本来创建的。
Finally, the unset IFS
This resets the value of IFS
to the default value, and is just good practice.
它将IFS的值重置为默认值,这是很好的实践。
It's to ensure we don't cause trouble with anything that relies on IFS
later in our script. (Otherwise we'd need to remember that we've switched things around--something that might be impractical for complex scripts.)
它的目的是确保我们不会在脚本中出现依赖于IFS的任何东西带来麻烦。(否则,我们需要记住我们已经切换了一些东西——对于复杂的脚本来说,这可能是不切实际的。)
#2
31
Original response:
最初的反应:
array=(a c b "f f" 3 5)
readarray -t sorted < <(for a in "${array[@]}"; do echo "$a"; done | sort)
output:
输出:
$ for a in "${sorted[@]}"; do echo "$a"; done
3
5
a
b
c
f f
Note this version copes with values that contains special characters or whitespace (except newlines)
注意,这个版本包含了包含特殊字符或空格的值(换行除外)
Note readarray is supported in bash 4+.
注意,在bash 4+中支持readarray。
Edit Based on the suggestion by @Dimitre I had updated it to:
根据@Dimitre的建议进行编辑,我将其更新为:
readarray -t sorted < <(printf '%s\0' "${array[@]}" | sort -z | xargs -0n1)
which has the benefit of even understanding sorting elements with newline characters embedded correctly. Unfortunately, as correctly signaled by @ruakh this didn't mean the the result of readarray
would be correct, because readarray
has no option to use NUL
instead of regular newlines as line-separators.
它的好处是,甚至可以理解正确地嵌入新行字符的排序元素。不幸的是,正如@ruakh的正确信号,这并不意味着readarray的结果是正确的,因为readarray没有选择使用NUL而不是常规的换行符作为行分隔符。
#3
28
Here's a pure Bash quicksort implementation:
这是一个纯粹的Bash快速排序实现:
#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
qsort() {
local pivot i smaller=() larger=()
qsort_ret=()
(($#==0)) && return 0
pivot=$1
shift
for i; do
if [[ $i < $pivot ]]; then
smaller+=( "$i" )
else
larger+=( "$i" )
fi
done
qsort "${smaller[@]}"
smaller=( "${qsort_ret[@]}" )
qsort "${larger[@]}"
larger=( "${qsort_ret[@]}" )
qsort_ret=( "${smaller[@]}" "$pivot" "${larger[@]}" )
}
Use as, e.g.,
使用,例如,
$ array=(a c b f 3 5)
$ qsort "${array[@]}"
$ declare -p qsort_ret
declare -a qsort_ret='([0]="3" [1]="5" [2]="a" [3]="b" [4]="c" [5]="f")'
This implementation is recursive… so here's an iterative quicksort:
这个实现是递归的,这是一个迭代的快速排序:
#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
# Note: iterative, NOT recursive! :)
qsort() {
(($#==0)) && return 0
local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger
qsort_ret=("$@")
while ((${#stack[@]})); do
beg=${stack[0]}
end=${stack[1]}
stack=( "${stack[@]:2}" )
smaller=() larger=()
pivot=${qsort_ret[beg]}
for ((i=beg+1;i<=end;++i)); do
if [[ "${qsort_ret[i]}" < "$pivot" ]]; then
smaller+=( "${qsort_ret[i]}" )
else
larger+=( "${qsort_ret[i]}" )
fi
done
qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" )
if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi
if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi
done
}
In both cases, you can change the order you use: I used string comparisons, but you can use arithmetic comparisons, compare wrt file modification time, etc. just use the appropriate test; you can even make it more generic and have it use a first argument that is the test function use, e.g.,
在这两种情况下,您可以更改您使用的顺序:我使用了字符串比较,但是您可以使用算术比较,比较wrt文件修改时间等等,只要使用适当的测试;你甚至可以使它更通用,并让它使用第一个参数,即测试函数使用,例如,
#!/bin/bash
# quicksorts positional arguments
# return is in array qsort_ret
# Note: iterative, NOT recursive! :)
# First argument is a function name that takes two arguments and compares them
qsort() {
(($#<=1)) && return 0
local compare_fun=$1
shift
local stack=( 0 $(($#-1)) ) beg end i pivot smaller larger
qsort_ret=("$@")
while ((${#stack[@]})); do
beg=${stack[0]}
end=${stack[1]}
stack=( "${stack[@]:2}" )
smaller=() larger=()
pivot=${qsort_ret[beg]}
for ((i=beg+1;i<=end;++i)); do
if "$compare_fun" "${qsort_ret[i]}" "$pivot"; then
smaller+=( "${qsort_ret[i]}" )
else
larger+=( "${qsort_ret[i]}" )
fi
done
qsort_ret=( "${qsort_ret[@]:0:beg}" "${smaller[@]}" "$pivot" "${larger[@]}" "${qsort_ret[@]:end+1}" )
if ((${#smaller[@]}>=2)); then stack+=( "$beg" "$((beg+${#smaller[@]}-1))" ); fi
if ((${#larger[@]}>=2)); then stack+=( "$((end-${#larger[@]}+1))" "$end" ); fi
done
}
Then you can have this comparison function:
然后你可以有这个比较函数:
compare_mtime() { [[ $1 -nt $2 ]]; }
and use:
和使用:
$ qsort compare_mtime *
$ declare -p qsort_ret
to have the files in current folder sorted by modification time (newest first).
将当前文件夹中的文件按修改时间排序(最新的)。
NOTE. These functions are pure Bash! no external utilities, and no subshells! they are safe wrt any funny symbols you may have (spaces, newline characters, glob characters, etc.).
请注意。这些函数是纯粹的Bash!没有外部实用程序,也没有子shell !它们是安全的wrt,任何你可能拥有的有趣的符号(空格,换行符,glob字符,等等)。
#4
25
If you don't need to handle special shell characters in the array elements:
如果您不需要处理数组元素中的特殊shell字符:
array=(a c b f 3 5)
sorted=($(printf '%s\n' "${array[@]}"|sort))
With bash you'll need an external sorting program anyway.
With zsh no external programs are needed and special shell characters are easily handled:
% array=('a a' c b f 3 5); printf '%s\n' "${(o)array[@]}"
3
5
a a
b
c
f
ksh has set -s to sort ASCIIbetically.
#5
8
During the 3-hour train trip from Munich to Frankfurt (which I had trouble reaching because Oktoberfest starts tomorrow) I was thinking about my first post. Employing a global array is a much better idea for a general sort function. The following function handles arbitrary strings (newlines, blanks etc.):
declare BSORT=()
function bubble_sort()
{   #
    # @param [ARGUMENTS]...
    #
    # Sort all positional arguments and store them in global array BSORT.
    # Without arguments sort this array. Return the number of iterations made.
    #
    # Bubble sorting lets the heaviest element sink to the bottom.
    #
    (($# > 0)) && BSORT=("$@")
    local j=0 ubound=$((${#BSORT[*]} - 1))
    while ((ubound > 0))
    do
        local i=0
        while ((i < ubound))
        do
            if [ "${BSORT[$i]}" \> "${BSORT[$((i + 1))]}" ]
            then
                local t="${BSORT[$i]}"
                BSORT[$i]="${BSORT[$((i + 1))]}"
                BSORT[$((i + 1))]="$t"
            fi
            ((++i))
        done
        ((++j))
        ((--ubound))
    done
    echo $j
}
bubble_sort a c b 'z y' 3 5
echo ${BSORT[@]}
This prints the iteration count (from echo $j) followed by the sorted elements:
5
3 5 a b c z y
The same output is created from
BSORT=(a c b 'z y' 3 5)
bubble_sort
echo ${BSORT[@]}
Note that Bash probably uses smart pointers internally, so the swap operation could be cheap (although I doubt it). However, bubble_sort demonstrates that more advanced functions like merge_sort are also within reach of the shell language.
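To illustrate the remark that something like merge_sort is within reach, here is a rough sketch of a bottom-up merge sort in pure Bash. It is not the author's code: the function name merge_sort and the global result array MSORT are assumptions made for this example, and the `[[ ... < ... ]]` comparison follows the current locale.

```bash
#!/bin/bash
# Hypothetical helper: bottom-up merge sort of the positional arguments.
# The sorted result is left in the global array MSORT.
merge_sort() {
    MSORT=("$@")
    local n=${#MSORT[@]} width lo mid hi i j
    local -a tmp
    # Merge runs of length 1, 2, 4, ... until one run covers the whole array
    for ((width = 1; width < n; width *= 2)); do
        tmp=()
        for ((lo = 0; lo < n; lo += 2 * width)); do
            mid=$((lo + width));    ((mid > n)) && mid=$n
            hi=$((lo + 2 * width)); ((hi > n)) && hi=$n
            i=$lo j=$mid
            # Take the smaller head of the two runs; ties go to the left run
            while ((i < mid && j < hi)); do
                if [[ ${MSORT[j]} < ${MSORT[i]} ]]; then
                    tmp+=("${MSORT[j]}"); j=$((j + 1))
                else
                    tmp+=("${MSORT[i]}"); i=$((i + 1))
                fi
            done
            # Copy whatever is left in either run
            while ((i < mid)); do tmp+=("${MSORT[i]}"); i=$((i + 1)); done
            while ((j < hi));  do tmp+=("${MSORT[j]}"); j=$((j + 1)); done
        done
        MSORT=("${tmp[@]}")
    done
}

merge_sort a c b 'z y' 3 5
printf '[%s]\n' "${MSORT[@]}"
```

Printing with printf and a quoted expansion keeps 'z y' visible as a single element, which echo ${BSORT[@]} does not.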
#6
6
tl;dr:
Sort array a_in and store the result in a_out (elements must not have embedded newlines[1]):
Bash v4+:
readarray -t a_out < <(printf '%s\n' "${a_in[@]}" | sort)
Bash v3:
IFS=$'\n' read -d '' -r -a a_out < <(printf '%s\n' "${a_in[@]}" | sort)
Advantages over antak's solution:
- You needn't worry about accidental globbing (accidental interpretation of the array elements as filename patterns), so no extra command is needed to disable globbing (set -f, and set +f to restore it later).
- You needn't worry about resetting IFS with unset IFS.[2]
Optional reading: explanation and sample code
The above combines Bash code with the external utility sort for a solution that works with arbitrary single-line elements and either lexical or numerical sorting (optionally by field):
- Performance: For around 20 elements or more, this will be faster than a pure Bash solution, significantly and increasingly so once you get beyond around 100 elements. (The exact thresholds will depend on your specific input, machine, and platform.)
  - The reason it is fast is that it avoids Bash loops.
- printf '%s\n' "${a_in[@]}" | sort performs the sorting (lexically, by default; see sort's POSIX spec):
  - "${a_in[@]}" safely expands to the elements of array a_in as individual arguments, whatever they contain (including whitespace).
  - printf '%s\n' then prints each argument, i.e., each array element, on its own line, as-is.
- Note the use of a process substitution (<(...)) to provide the sorted output as input to read / readarray (via redirection to stdin, <), because read / readarray must run in the current shell (must not run in a subshell) in order for output variable a_out to be visible to the current shell (for the variable to remain defined in the remainder of the script).
- Reading sort's output into an array variable:
  - Bash v4+: readarray -t a_out reads the individual lines output by sort into the elements of array variable a_out, without including the trailing \n in each element (-t).
  - Bash v3: readarray doesn't exist, so read must be used: IFS=$'\n' read -d '' -r -a a_out tells read to read into array (-a) variable a_out, reading the entire input, across lines (-d ''), but splitting it into array elements by newlines (IFS=$'\n'. $'\n', which produces a literal newline (LF), is a so-called ANSI C-quoted string). (-r, an option that should virtually always be used with read, disables unexpected handling of \ characters.)
Annotated sample code:
#!/usr/bin/env bash
# Define input array `a_in`:
# Note the element with embedded whitespace ('a c') and the element that looks like
# a glob ('*'), chosen to demonstrate that elements with line-internal whitespace
# and glob-like contents are correctly preserved.
a_in=( 'a c' b f 5 '*' 10 )
# Sort and store output in array `a_out`
# Saving back into `a_in` is also an option.
IFS=$'\n' read -d '' -r -a a_out < <(printf '%s\n' "${a_in[@]}" | sort)
# Bash 4.x: use the simpler `readarray -t`:
# readarray -t a_out < <(printf '%s\n' "${a_in[@]}" | sort)
# Print sorted output array, line by line:
printf '%s\n' "${a_out[@]}"
Due to use of sort without options, this yields lexical sorting (digits sort before letters, and digit sequences are treated lexically, not as numbers):
*
10
5
a c
b
f
If you wanted numerical sorting by the 1st field, you'd use sort -k1,1n instead of just sort, which yields (non-numbers sort before numbers, and numbers sort correctly):
*
a c
b
f
5
10
[1] To handle elements with embedded newlines, use the following variant (Bash v4.4+, which added the -d option to readarray, with GNU sort): readarray -d '' -t a_out < <(printf '%s\0' "${a_in[@]}" | sort -z). Michał Górny's helpful answer has a Bash v3 solution.
[2] While IFS is set in the Bash v3 variant, the change is scoped to the command. By contrast, what follows IFS=$'\n' in antak's answer is an assignment rather than a command, in which case the IFS change is global.
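The scoping difference described in footnote [2] can be checked with a small experiment. This is only a sketch: the variable names old_ifs, after_read, and after_assign are made up for the demonstration, and it assumes IFS is set (not unset) when the snippet starts.

```bash
#!/bin/bash
old_ifs=$IFS                      # remember the current value for comparison

# Assignment prefixed to a command: IFS changes only for this `read`.
# (`read -d ''` returns non-zero at end of input because no NUL delimiter
# is found; `|| true` merely keeps that from tripping `set -e` scripts.)
IFS=$'\n' read -d '' -r -a a_out < <(printf '%s\n' b a | sort) || true
after_read=$IFS

# Two assignments and no command: both persist in the current shell.
IFS=$'\n' sorted=($(printf '%s\n' b a | sort))
after_assign=$IFS

IFS=$old_ifs                      # restore (antak's answer uses `unset IFS`)

if [[ $after_read == "$old_ifs" ]]; then echo "read: IFS unchanged"; fi
if [[ $after_assign == $'\n' ]]; then echo "assignment: IFS is now a newline"; fi
```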
#7
5
Another solution that uses external sort and copes with any special characters (except for NULs :)). Should work with bash-3.2 and GNU or BSD sort (sadly, POSIX doesn't include -z).
local e new_array=()
while IFS= read -r -d '' e; do
    new_array+=( "${e}" )
done < <(printf "%s\0" "${array[@]}" | LC_ALL=C sort -z)
First look at the input redirection at the end. We're using the printf built-in to write out the array elements, zero-terminated. The quoting makes sure array elements are passed as-is, and the specifics of shell printf cause it to reuse the format string for each remaining parameter. That is, it's equivalent to something like:
for e in "${array[@]}"; do
    printf "%s\0" "${e}"
done
The null-terminated element list is then passed to sort. The -z option causes it to read null-terminated elements, sort them and output null-terminated as well. If you needed to get only the unique elements, you can pass -u since it is more portable than uniq -z. The LC_ALL=C ensures stable sort order independently of locale, which is sometimes useful for scripts. If you want the sort to respect locale, remove that.
The <() construct obtains the descriptor to read from the spawned pipeline, and < redirects the standard input of the while loop to it. If you need to access the standard input inside the pipe, you may use another descriptor (exercise for the reader :)).
Now, back to the beginning. The read built-in reads output from the redirected stdin. Setting an empty IFS disables word splitting, which is unnecessary here; as a result, read reads the whole 'line' of input into the single provided variable. The -r option disables escape processing, which is undesired here as well. Finally, -d '' sets the line delimiter to NUL; that is, it tells read to read zero-terminated strings.
As a result, the loop is executed once for every successive zero-terminated array element, with the value being stored in e. The example just puts the items in another array but you may prefer to process them directly :).
Of course, that's just one of the many ways of achieving the same goal. As I see it, it is simpler than implementing a complete sorting algorithm in bash, and in some cases it will be faster. It handles all special characters including newlines and should work on most common systems. Most importantly, it may teach you something new and awesome about bash :).
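Because local is only valid inside a function, the loop above has to live in one. A minimal sketch of such a wrapper follows; the name sort_array_z and the choice to leave the result in a global array new_array are assumptions made for this example, and sort -z requires GNU or BSD sort as noted above.

```bash
#!/bin/bash
# Hypothetical wrapper around the NUL-delimited loop.
# Sorts its arguments and leaves the result in the global array new_array.
sort_array_z() {
    local e
    new_array=()
    (($#)) || return 0            # no arguments: avoid producing one empty element
    while IFS= read -r -d '' e; do
        new_array+=("$e")
    done < <(printf '%s\0' "$@" | LC_ALL=C sort -z)
}

# Elements with an embedded newline, a space, and a glob character
array=($'b\nc' 'a b' '*' a)
sort_array_z "${array[@]}"
printf '[%s]\n' "${new_array[@]}"
```

The element containing a newline survives as one element, and '*' is never expanded because no unquoted expansion is involved.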
#8
4
min sort:
#!/bin/bash
array=(.....)
index_of_element1=0
while (( ${index_of_element1} < ${#array[@]} )); do
    element_1="${array[${index_of_element1}]}"
    index_of_element2=$((index_of_element1 + 1))
    index_of_min=${index_of_element1}
    min_element="${element_1}"
    for element_2 in "${array[@]:$((index_of_element1 + 1))}"; do
        # Keep whichever of the two strings sort orders first
        min_element="$(printf "%s\n%s" "${min_element}" "${element_2}" | sort | head -n 1)"
        if [[ "${min_element}" == "${element_2}" ]]; then
            index_of_min=${index_of_element2}
        fi
        let index_of_element2++
    done
    # Swap the smallest remaining element into the current position
    array[${index_of_element1}]="${min_element}"
    array[${index_of_min}]="${element_1}"
    let index_of_element1++
done
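The loop above starts a sort and a head process for every single comparison. The same selection-sort idea can be sketched without external commands by comparing with `[[ ... < ... ]]`; the function name selection_sort and the result array SSORT are hypothetical, and the comparison follows the current locale rather than sort's rules.

```bash
#!/bin/bash
# Hypothetical pure-Bash selection sort; the result is left in the global array SSORT.
selection_sort() {
    SSORT=("$@")
    local n=${#SSORT[@]} i j min tmp
    for ((i = 0; i < n - 1; i++)); do
        min=$i
        # Find the index of the smallest element in SSORT[i..n-1]
        for ((j = i + 1; j < n; j++)); do
            if [[ ${SSORT[j]} < ${SSORT[min]} ]]; then min=$j; fi
        done
        # Swap it into position i
        if ((min != i)); then
            tmp=${SSORT[i]}
            SSORT[i]=${SSORT[min]}
            SSORT[min]=$tmp
        fi
    done
}

selection_sort f a 'c d' b 3
printf '[%s]\n' "${SSORT[@]}"
```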
#9
3
try this:
echo ${array[@]} | awk 'BEGIN{RS=" ";} {print $1}' | sort
Output will be:
3 5 a b c f
Problem solved.
#10
2
array=(a c b f 3 5)
new_array=($(echo "${array[@]}" | sed 's/ /\n/g' | sort))
echo ${new_array[@]}
echo contents of new_array will be:
3 5 a b c f
#11
1
If you can compute a unique integer for each element in the array, like this:
tab='0123456789abcdefghijklmnopqrstuvwxyz'
# build the reversed ordinal map
for ((i = 0; i < ${#tab}; i++)); do
    declare -g ord_${tab:i:1}=$i
done
function sexy_int() {
    local sum=0
    local i ch ref
    for ((i = 0; i < ${#1}; i++)); do
        ch="${1:i:1}"
        ref="ord_$ch"
        (( sum += ${!ref} ))
    done
    return $sum
}
sexy_int hello
echo "hello -> $?"
sexy_int world
echo "world -> $?"
then you can use these integers as array indexes; Bash indexed arrays are always sparse, so there is no need to worry about unused indexes:
array=(a c b f 3 5)
for el in "${array[@]}"; do
    sexy_int "$el"
    sorted[$?]="$el"
done
echo "${sorted[@]}"
- Pros. Fast.
- Cons. Duplicated elements are merged, and it can be impossible to map contents to 32-bit unique integers.
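One practical limit worth keeping in mind: a function's return status only carries values from 0 to 255, so keys passed back through $? cannot exceed that range. When the elements are themselves non-negative decimal integers (without leading zeros), the sparse-array idea can be sketched without any mapping function at all. The array names nums and sparse are made up for this example, and duplicates are still merged, as noted in the cons.

```bash
#!/bin/bash
nums=(42 7 300 19)
sparse=()
for n in "${nums[@]}"; do
    sparse[n]=$n            # the value doubles as its own index
done
# Expanding an indexed array walks its indexes in ascending order,
# which compacts the sparse array into a sorted one.
sorted=("${sparse[@]}")
printf '%s\n' "${sorted[@]}"
```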
#12
0
I am not convinced that you'll need an external sorting program in Bash.
Here is my implementation for the simple bubble-sort algorithm.
function bubble_sort()
{   #
    # Sorts all positional arguments and echoes them back.
    #
    # Bubble sorting lets the heaviest (longest) element sink to the bottom.
    #
    local array=($@) max=$(($# - 1))
    while ((max > 0))
    do
        local i=0
        while ((i < max))
        do
            if [ ${array[$i]} \> ${array[$((i + 1))]} ]
            then
                local t=${array[$i]}
                array[$i]=${array[$((i + 1))]}
                array[$((i + 1))]=$t
            fi
            ((i += 1))
        done
        ((max -= 1))
    done
    echo ${array[@]}
}
array=(a c b f 3 5)
echo " input: ${array[@]}"
echo "output: $(bubble_sort ${array[@]})"
This shall print:
input: a c b f 3 5
output: 3 5 a b c f
#13
0
a=(e b 'c d')
shuf -e "${a[@]}" | sort >/tmp/f
mapfile -t g </tmp/f
#14
0
There is a workaround for the usual problem of spaces and newlines:
Use a character that is not in the original array (like $'\1' or $'\4' or similar).
This function gets the job done:
# Sort an array that may contain spaces or newlines, using a workaround char (wa=$'\4')
sortarray(){ local wa=$'\4' IFS=''
    if [[ $* =~ [$wa] ]]; then
        echo "$0: error: array contains the workaround char" >&2
        exit 1
    fi
    set -f; local IFS=$'\n' x nl=$'\n'
    set -- $(printf '%s\n' "${@//$nl/$wa}" | sort -n)
    for x
    do  sorted+=("${x//$wa/$nl}")
    done
}
This will sort the array:
$ array=( a b 'c d' $'e\nf' $'g\1h')
$ sortarray "${array[@]}"
$ printf '<%s>\n' "${sorted[@]}"
<a>
<b>
<c d>
<e
f>
<gh>
This will complain that the source array contains the workaround character:
$ array=( a b 'c d' $'e\nf' $'g\4h')
$ sortarray "${array[@]}"
./script: error: array contains the workaround char
description
- We set two local variables: wa (the workaround char) and a null IFS.
- Then (with IFS null) we test that the whole array $*
  - does not contain any workaround char: [[ $* =~ [$wa] ]].
  - If it does, print a message and signal an error: exit 1.
- Avoid filename expansions: set -f.
- Set a new value of IFS (IFS=$'\n'), a loop variable x, and a newline variable (nl=$'\n').
- We print all values of the arguments received (the input array $@),
  - but we replace any newline with the workaround char: "${@//$nl/$wa}";
  - send those values to be sorted: sort -n;
  - and place all the sorted values back in the positional arguments: set --.
- Then we assign each argument one by one (to preserve newlines),
  - in a loop: for x
  - to a new array: sorted+=(…)
  - inside quotes to preserve any existing newline,
  - restoring the workaround char to a newline: "${x//$wa/$nl}".
- Done.
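Two side effects of the function as written are worth knowing: exit 1 terminates the calling shell when the function runs in it directly, and globbing stays disabled after set -f. A possible variation is sketched below; sortarray_safe is a hypothetical name, the use of return 1 and of LC_ALL=C sort (instead of sort -n) are choices made for this example, and resetting sorted on each call is likewise an assumption about the desired behaviour.

```bash
#!/bin/bash
# Hypothetical variation of sortarray: returns an error status instead of
# exiting, and restores globbing if it was enabled on entry.
sortarray_safe() {
    local wa=$'\4' nl=$'\n' x restore_glob=0 IFS=''
    if [[ $* == *"$wa"* ]]; then
        echo "sortarray_safe: array contains the workaround char" >&2
        return 1
    fi
    [[ $- == *f* ]] || restore_glob=1     # remember whether globbing was on
    set -f
    IFS=$'\n'                             # split the sorted output on newlines only
    set -- $(printf '%s\n' "${@//$nl/$wa}" | LC_ALL=C sort)
    ((restore_glob)) && set +f
    sorted=()
    for x; do
        sorted+=("${x//$wa/$nl}")         # turn the workaround char back into a newline
    done
}

array=('c d' $'e\nf' a '*')
sortarray_safe "${array[@]}"
printf '<%s>\n' "${sorted[@]}"
```

As with the original, empty elements are lost in the word-splitting step, so this remains a workaround rather than a general solution.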
#15
-1
sorted=($(echo ${array[@]} | tr " " "\n" | sort))
In the spirit of bash / linux, I would pipe the best command-line tool for each step. sort does the main job but needs input separated by newlines instead of spaces, so the very simple pipeline above simply does:
Echo array content --> replace space by newline --> sort
$() captures the output of the pipeline (command substitution).
($()) puts that captured output into an array, one element per word.
Note: as @sorontar mentioned in a comment to a different question:
The sorted=($(...)) part is using the "split and glob" operator. You should turn glob off: set -f or set -o noglob or shopt -op noglob or an element of the array like * will be expanded to a list of files.
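Following that note, the one-liner can be wrapped in set -f / set +f so that an element such as * is kept literally. This is a sketch only: LC_ALL=C is added here just to make the ordering predictable, and elements containing spaces are still split apart by this approach.

```bash
#!/bin/bash
array=(a c b f 3 5 '*')

set -f      # disable filename expansion (globbing)
sorted=($(echo ${array[@]} | tr " " "\n" | LC_ALL=C sort))
set +f      # re-enable it afterwards

printf '[%s]\n' "${sorted[@]}"
```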