I am looking for a way to split a string in bash on a delimiter string and place the parts in an array.
Simple case:
#!/bin/bash
b="aaaaa/bbbbb/ddd/ffffff"
echo "simple string: $b"
IFS='/' b_split=($b)
echo ;
echo "split"
for i in ${b_split[@]}
do
echo "------ new part ------"
echo "$i"
done
This gives the output:
simple string: aaaaa/bbbbb/ddd/ffffff
split
------ new part ------
aaaaa
------ new part ------
bbbbb
------ new part ------
ddd
------ new part ------
ffffff
More complex case:
#!/bin/bash
c=$(echo "AA=A"; echo "B=BB"; echo "======="; echo "C==CC"; echo "DD=D"; echo "======="; echo "EEE"; echo "FF";)
echo "more complex string"
echo "$c";
echo ;
echo "split";
IFS='=======' c_split=($c) ;# <---- LINE TO BE CHANGED
for i in ${c_split[@]}
do
echo "------ new part ------"
echo "$i"
done
This gives the output:
more complex string
AA=A
B=BB
=======
C==CC
DD=D
=======
EEE
FF
split
------ new part ------
AA
------ new part ------
A
B
------ new part ------
BB
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
C
------ new part ------
------ new part ------
CC
DD
------ new part ------
D
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
------ new part ------
EEE
FF
I would like the second output to look like this:
------ new part ------
AA=A
B=BB
------ new part ------
C==CC
DD=D
------ new part ------
EEE
FF
That is, I want to split the string on a sequence of characters instead of a single character. How can I do this?
I am looking for an answer that only modifies this line in the second script:
IFS='=======' c_split=($c) ;# <---- LINE TO BE CHANGED
5 answers
#1
17
IFS disambiguation
IFS means Input Field Separators: a list of characters, each of which can act as a separator. It is not a separator string, so IFS='=======' behaves exactly like IFS='='.
By default it is set to space, tab and newline ($' \t\n'), meaning that any run (of length greater than zero) of spaces, tabs and/or newlines counts as one separator.
So in the string:
" blah foo=bar
baz "
leading and trailing separators are ignored, and the string splits into only 3 parts: blah, foo=bar and baz.
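A small sketch of the two points above, using made-up variable names (parts, long_ifs, short_ifs): default word splitting yields three fields, and an IFS made of seven = characters splits exactly like an IFS of a single =.

```bash
#!/bin/bash
# Default IFS: runs of space, tab and newline act as one separator,
# and leading/trailing separators are ignored.
s=$'  blah  foo=bar\nbaz  '
parts=($s)                  # unquoted on purpose, so word splitting applies
echo "${#parts[@]}"         # prints 3
printf '<%s>\n' "${parts[@]}"

# IFS is a set of characters: seven '=' form the same set as one '='.
t='a=======b'
IFS='=======' read -r -a long_ifs  <<< "$t"
IFS='='       read -r -a short_ifs <<< "$t"
# Both counts are identical; every single '=' delimits a field.
echo "${#long_ifs[@]} ${#short_ifs[@]}"
```

This is why the second script in the question produces so many empty parts.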
Splitting a string using IFS is possible if you know a valid field separator character that is not used in your string.
OIFS="$IFS"
IFS='§'
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
c_split=(${c//=======/§})
IFS="$OIFS"
printf -- "------ new part ------\n%s\n" "${c_split[@]}"
------ new part ------
AA=A
B=BB
------ new part ------
C==CC
DD=D
------ new part ------
EEE
FF
But this works only as long as the string does not contain §.
You could use another character, like IFS=$'\026';c_split=(${c//=======/$'\026'}), but this may still lead to further bugs.
You could scan the character table to find a character that is not in your string:
myIfs=""
for i in {1..255};do
    # Build the character with octal code $i
    printf -v char "$(printf "\\\%03o" $i)"
    # Quoting "$char" makes the pattern match it literally
    [ "$c" == "${c#*"$char"}" ] && myIfs="$char" && break
done
if ! [ "$myIfs" ] ;then
    echo no split char found, could not do the job, sorry.
    exit 1
fi
But I find this solution a bit overkill.
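A variant of the placeholder idea above that leaves IFS untouched. This is only a sketch: it assumes bash 4.4 or newer (for mapfile -d) and that the byte $'\x01' never occurs in the data; the names sep and ph are made up for the example.

```bash
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
sep=$'\n=======\n'   # delimiter string, including the surrounding newlines
ph=$'\x01'           # placeholder byte, assumed absent from the data

# Replace every delimiter with the placeholder, then let mapfile cut on it.
mapfile -d "$ph" -t c_split < <(printf '%s' "${c//"$sep"/$ph}")

printf -- '------ new part ------\n%s\n' "${c_split[@]}"
```

Because nothing is word-split, spaces and glob characters inside the parts are preserved as they are.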
Splitting on spaces (or without modifying IFS)
Under bash, we can use this bashism:
b="aaaaa/bbbbb/ddd/ffffff"
b_split=(${b//// })
In fact, the syntax ${varname//pattern/replacement} performs a substitution (its parts delimited by /), here replacing every occurrence of / with a space before the result is assigned to the array b_split.
Of course, this still uses IFS and splits the array on whitespace.
This is not the best way, but it can work in specific cases.
You could even drop unwanted spaces before splitting:
b='12 34 / 1 3 5 7 / ab'
b1=${b// }
b_split=(${b1//// })
printf "<%s>, " "${b_split[@]}" ;echo
<1234>, <1357>, <ab>,
or swap them for a placeholder and restore them afterwards:
b1=${b// /§}
b_split=(${b1//// })
printf "<%s>, " "${b_split[@]//§/ }" ;echo
<12 34 >, < 1 3 5 7 >, < ab>,
Splitting a line on strings:
So IFS cannot do what you are asking for, but bash does have nice features:
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
echo "more complex string"
echo "$c";
echo ;
echo "split";
mySep='======='
while [ "$c" != "${c#*$mySep}" ];do
echo "------ new part ------"
echo "${c%%$mySep*}"
c="${c#*$mySep}"
done
echo "------ last part ------"
echo "$c"
Let's see:
more complex string
AA=A
B=BB
=======
C==CC
DD=D
=======
EEE
FF
split
------ new part ------
AA=A
B=BB
------ new part ------
C==CC
DD=D
------ last part ------
EEE
FF
Note: leading and trailing newlines are not deleted. If you need that, you could use:
mySep=$'\n=======\n'
instead of simply =======.
Or you could rewrite the split loop to strip them explicitly:
mySep=$'======='
while [ "$c" != "${c#*$mySep}" ];do
echo "------ new part ------"
part="${c%%$mySep*}"
part="${part##$'\n'}"
echo "${part%%$'\n'}"
c="${c#*$mySep}"
done
echo "------ last part ------"
c=${c##$'\n'}
echo "${c%%$'\n'}"
In any case, this matches what the SO question asked for (and its sample):
------ new part ------
AA=A
B=BB
------ new part ------
C==CC
DD=D
------ last part ------
EEE
FF
Finally, creating an array:
#!/bin/bash
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
echo "more complex string"
echo "$c";
echo ;
echo "split";
mySep=$'======='
export -a c_split
while [ "$c" != "${c#*$mySep}" ];do
part="${c%%$mySep*}"
part="${part##$'\n'}"
c_split+=("${part%%$'\n'}")
c="${c#*$mySep}"
done
c=${c##$'\n'}
c_split+=("${c%%$'\n'}")
for i in "${c_split[@]}"
do
echo "------ new part ------"
echo "$i"
done
This works fine:
more complex string
AA=A
B=BB
=======
C==CC
DD=D
=======
EEE
FF
split
------ new part ------
AA=A
B=BB
------ new part ------
C==CC
DD=D
------ new part ------
EEE
FF
Some explanations:
- export -a var defines var as an array and marks it for export. Note that bash does not actually pass array variables to child processes, so declare -a var would serve equally well here.
- ${variablename%string*} and ${variablename%%string*} return the left part of variablename, up to but not including string. A single % removes the shortest matching suffix, so the cut is made at the last occurrence of string; %% removes the longest matching suffix, so the cut is made at the first occurrence. If string is not found, the full content of variablename is returned.
- ${variablename#*string} does the same in the other direction: it returns the right part of variablename, after but not including string. A single # removes the shortest matching prefix (cut at the first occurrence) and ## removes the longest one (cut at the last occurrence).
Note that in these patterns the character * is a wildcard that matches any number of any characters.
The command echo "${c%%$'\n'}" echoes variable c without a newline at the end of the string, if there is one. As the pattern contains no wildcard, at most one newline is removed.
So if variable contains Hello WorldZorGluBHello youZorGluBI'm happy:
variable="Hello WorldZorGluBHello youZorGluBI'm happy"
$ echo ${variable#*ZorGluB}
Hello youZorGluBI'm happy
$ echo ${variable##*ZorGluB}
I'm happy
$ echo ${variable%ZorGluB*}
Hello WorldZorGluBHello you
$ echo ${variable%%ZorGluB*}
Hello World
$ echo ${variable%%ZorGluB}
Hello WorldZorGluBHello youZorGluBI'm happy
$ echo ${variable%happy}
Hello WorldZorGluBHello youZorGluBI'm
$ echo ${variable##* }
happy
All this is explained in the man page:
$ man -Len -Pless\ +/##word bash
$ man -Len -Pless\ +/%%word bash
$ man -Len -Pless\ +/^\\\ *export\\\ .*word bash
Step by step, the splitting loop:
The separator:
mySep=$'======='
Declare c_split as an array (export marks it for export, although bash does not pass arrays to child processes):
export -a c_split
While variable c contains at least one occurrence of mySep:
while [ "$c" != "${c#*$mySep}" ];do
Truncate c from the first mySep to the end of the string and assign the result to part:
part="${c%%$mySep*}"
Remove a leading newline:
part="${part##$'\n'}"
Remove a trailing newline and add the result as a new array element to c_split:
c_split+=("${part%%$'\n'}")
Reassign c with the rest of the string, once everything up to and including the first mySep is removed:
c="${c#*$mySep}"
Done ;-)
done
Remove a leading newline from the last part:
c=${c##$'\n'}
Remove a trailing newline and add the result as a new array element to c_split:
c_split+=("${c%%$'\n'}")
Into a function:
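ssplit() {
    # $1: string to split, $2: name of the target array, $3: delimiter string
    local string="$1" array=${2:-ssplited_array} delim="${3:- }" pos=0
    # Quoting "$delim" inside the patterns makes it match literally
    while [ "$string" != "${string#*"$delim"}" ];do
        printf -v "$array[pos++]" "%s" "${string%%"$delim"*}"
        string="${string#*"$delim"}"
    done
    printf -v "$array[pos]" "%s" "$string"
}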
Usage:
ssplit "<quoted string>" [array name] [delimiter string]
where the array name is ssplited_array by default and the delimiter is a single space.
You could use it like this:
c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
ssplit "$c" c_split $'\n=======\n'
printf -- "--- part ----\n%s\n" "${c_split[@]}"
--- part ----
AA=A
B=BB
--- part ----
C==CC
DD=D
--- part ----
EEE
FF
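As a variation on the function above, here is a sketch that uses a nameref instead of printf -v and empties the target array before filling it. It assumes bash 4.3 or newer (for local -n); the name split_on and its argument order are made up for this example, and the caller's array must not itself be called out, rest or delim.

```bash
#!/bin/bash
# split_on STRING ARRAY_NAME DELIM   (hypothetical helper)
split_on() {
    local rest=$1 delim=$3
    local -n out=$2                 # nameref: 'out' aliases the caller's array
    out=()
    # Guard: an empty delimiter would never shorten the string.
    if [ -z "$delim" ]; then out=("$rest"); return; fi
    while [[ $rest == *"$delim"* ]]; do
        out+=("${rest%%"$delim"*}") # text before the first delimiter
        rest=${rest#*"$delim"}      # drop that text and the delimiter
    done
    out+=("$rest")                  # whatever follows the last delimiter
}

c=$'AA=A\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'
split_on "$c" c_split $'\n=======\n'
printf -- '--- part ----\n%s\n' "${c_split[@]}"

# The delimiter is matched literally, even if it is a glob character.
split_on 'a*b*c' stars '*'
```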
#2
3
Do it with awk:
awk -vRS='\n=*\n' '{print "----- new part -----";print}' <<< "$c"
Output:
kent$ awk -vRS='\n=*\n' '{print "----- new part -----";print}' <<< "$c"
----- new part -----
AA=A
B=BB
----- new part -----
C==CC
DD=D
----- new part -----
EEE
FF
#3
1
The following script was tested in bash:
kent@7pLaptop:/tmp/test$ bash --version
GNU bash, version 4.2.42(2)-release (i686-pc-linux-gnu)
The script (named t.sh):
#!/bin/bash
c=$(echo "AA=A"; echo "B=BB"; echo "======="; echo "C==CC"; echo "DD=D"; echo "======="; echo "EEE"; echo "FF";)
echo "more complex string"
echo "$c"
echo "split now"
c_split=($(echo "$c"|awk -vRS="\n=*\n" '{gsub(/\n/,"\\n");printf "%s ", $0}'))
for i in ${c_split[@]}
do
echo "---- new part ----"
echo -e "$i"
done
Output:
kent@7pLaptop:/tmp/test$ ./t.sh
more complex string
AA=A
B=BB
=======
C==CC
DD=D
=======
EEE
FF
split now
---- new part ----
AA=A
B=BB
---- new part ----
C==CC
DD=D
---- new part ----
EEE
FF
Note the echo statement in the for loop: if you remove the -e option, you will see:
---- new part ----
AA=A\nB=BB
---- new part ----
C==CC\nDD=D
---- new part ----
EEE\nFF\n
Whether to use -e or not depends on your requirements.
#4
1
Here's an approach that doesn't fumble when the data contains literal backslash sequences, spaces and other special characters:
c=$(echo "AA=A"; echo "B=BB"; echo "======="; echo "C==CC"; echo "DD=D"; echo "======="; echo "EEE"; echo "FF";)
echo "more complex string"
echo "$c";
echo ;
echo "split";
c_split=()
while IFS= read -r -d '' part
do
c_split+=( "$part" )
done < <(printf "%s" "$c" | sed -e 's/=======/\x00/g')
c_split+=( "$part" )
for i in "${c_split[@]}"
do
echo "------ new part ------"
echo "$i"
done
Note that the string is actually split on "=======" as requested, so the line feeds around it become part of the data (causing extra blank lines when echo adds its own).
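If those line feeds are not wanted, they can be stripped from every element after the split. A minimal sketch, starting from an array that already holds the parts as the loop above would leave them (the literal array contents here are written out by hand for the example):

```bash
#!/bin/bash
# Elements as produced by splitting on "=======" alone: newlines still attached.
c_split=($'AA=A\nB=BB\n' $'\nC==CC\nDD=D\n' $'\nEEE\nFF')

c_split=("${c_split[@]#$'\n'}")   # strip one leading newline from every element
c_split=("${c_split[@]%$'\n'}")   # strip one trailing newline from every element

printf -- '------ new part ------\n%s\n' "${c_split[@]}"
```

The # and % operators are applied to each array element in turn, so no loop is needed.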
#5
1
I added some lines to the example text because of this comment:
This breaks if you replace AA=A with AA =A or with AA=\nA – that other guy
EDIT: I added a suggestion that isn't sensitive to any particular delimiter character appearing in the text. It does not use the "one line split" that the OP was asking for, but this is how I would do it in bash if I wanted the result in an array.
script.sh (NEW):
#!/bin/bash
text=$(
echo "AA=A"; echo "AA =A"; echo "AA=\nA"; echo "B=BB"; echo "=======";
echo "C==CC"; echo "DD=D"; echo "======="; echo "EEE"; echo "FF";
)
echo "more complex string"
echo "$text"
echo "split now"
c_split[0]=""
current=""
del=""
ind=0
# newline
newl=$'\n'
# Save IFS (not necessary when run as sub shell)
saveIFS="$IFS"
IFS="$newl"
for row in $text; do
if [[ $row =~ ^=+$ ]]; then
c_split[$ind]="$current"
((ind++))
current=""
# Avoid preceding newline
del=""
continue
fi
current+="$del$row"
del="$newl"
done
# Restore IFS
IFS="$saveIFS"
# If there is a last poor part of the text
if [[ -n $current ]]; then
c_split[$ind]="$current"
fi
# The result is an array
for i in "${c_split[@]}"
do
echo "---- new part ----"
echo "$i"
done
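The same line-by-line idea can also be written without touching IFS globally, by reading the text with while IFS= read -r. This is only a sketch of that variant, not the script above; the variable sep stands in for del, and the sample text is the one used in this answer.

```bash
#!/bin/bash
text=$'AA=A\nAA =A\nAA=\\nA\nB=BB\n=======\nC==CC\nDD=D\n=======\nEEE\nFF'

c_split=()
current="" sep=""
while IFS= read -r row; do
    if [[ $row =~ ^=+$ ]]; then     # a line made only of '=' closes the current part
        c_split+=("$current")
        current="" sep=""
    else
        current+="$sep$row"         # join the rows of one part with newlines
        sep=$'\n'
    fi
done <<< "$text"
# Keep the last part if the text did not end with a separator line.
if [ -n "$current" ]; then c_split+=("$current"); fi

for i in "${c_split[@]}"; do
    echo "---- new part ----"
    echo "$i"
done
```

Since read -r takes each line verbatim, rows are not subject to pathname expansion and backslashes are kept as they are.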
script.sh (OLD, with "one line split"):
(I stole the idea with awk from @Kent and adjusted it a bit)
#!/bin/bash
c=$(
echo "AA=A"; echo "AA =A"; echo "AA=\nA"; echo "B=BB"; echo "=======";
echo "C==CC"; echo "DD=D"; echo "======="; echo "EEE"; echo "FF";
)
echo "more complex string"
echo "$c"
echo "split now"
# Now, this will be almost absolute secure,
# perhaps except a direct hit by lightning.
del=""
for ch in $'\1' $'\2' $'\3' $'\4' $'\5' $'\6' $'\7'; do
if [ -z "`echo "$c" | grep "$ch"`" ]; then
del="$ch"
break
fi
done
if [ -z "$del" ]; then
echo "Sorry, all this testing but no delimiter to use..."
exit 1
fi
IFS="$del" c_split=($(echo "$c" | awk -vRS="\n=+\n" -vORS="$del" '1'))
for i in ${c_split[@]}
do
echo "---- new part ----"
echo "$i"
done
Output:
[244an]$ bash --version
GNU bash, version 4.2.24(1)-release (x86_64-pc-linux-gnu)
[244an]$ ./script.sh
more complex string
AA=A
AA =A
AA=\nA
B=BB
=======
C==CC
DD=D
=======
EEE
FF
split now
---- new part ----
AA=A
AA =A
AA=\nA
B=BB
---- new part ----
C==CC
DD=D
---- new part ----
EEE
FF
I'm not using -e for echo, so that AA=\nA does not produce a newline.