How do I sanitize user input in a bash script?

I'm looking for the best way to take a simple input:

echo -n "Enter a string here: "
read -e STRING

and clean it up by removing non-alphanumeric characters, converting it to lowercase, and replacing spaces with underscores.

Does order matter? Is tr the best / only way to go about this?

5 Solutions

#1

As dj_segfault points out, the shell can do most of this for you. Looks like you'll have to fall back on something external for lower-casing the string, though. For this you have many options, like the perl one-liners above, etc., but I think tr is probably the simplest.

# first, strip underscores
CLEAN=${STRING//_/}
# next, replace spaces with underscores
CLEAN=${CLEAN// /_}
# now, clean out anything that's not alphanumeric or an underscore
CLEAN=${CLEAN//[^a-zA-Z0-9_]/}
# finally, lowercase with TR
CLEAN=$(printf '%s' "$CLEAN" | tr 'A-Z' 'a-z')

The order here is somewhat important. We want to get rid of underscores, plus replace spaces with underscores, so we have to be sure to strip underscores first. By waiting to pass things to tr until the end, we know we have only alphanumeric and underscores, and we can be sure we have no spaces, so we don't have to worry about special characters being interpreted by the shell.
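
The steps above can be wrapped into one function. This is only a sketch: the name sanitize is hypothetical, and the last step assumes bash 4.0 or newer, where the ${var,,} expansion lowercases a string without calling tr. On older shells the tr call is still needed.

```bash
#!/usr/bin/env bash
# Hypothetical helper: "sanitize" is not part of the original answer.
sanitize() {
    local clean=$1
    # strip existing underscores first, so they cannot be confused
    # with the underscores that will stand in for spaces
    clean=${clean//_/}
    # replace every space with an underscore
    clean=${clean// /_}
    # drop anything that is not an ASCII letter, digit, or underscore
    clean=${clean//[^a-zA-Z0-9_]/}
    # lowercase without an external process (assumes bash >= 4.0)
    printf '%s\n' "${clean,,}"
}

sanitize 'Hello, World_2!'
```

Swapping the first two steps would delete the underscores that had just replaced the spaces, which is why the order matters.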

#2

Bash can do this all on its own, thank you very much. If you look at the section of the man page on Parameter Expansion, you'll see that bash has built-in substitutions, substring, trim, rtrim, etc.

To eliminate all non-alphanumeric characters, use:

CLEANSTRING=${STRING//[^a-zA-Z0-9]/}

That's Occam's razor. No need to launch another process.
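
For reference, here is a rough sketch of the other expansions mentioned above (substitution, substring, and trimming from either end). The sample value and the variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# Illustrative value only; not taken from the original answer.
STRING='  Report (final).TXT  '

# substitution: delete every character that is not an ASCII letter or digit
alnum_only=${STRING//[^a-zA-Z0-9]/}

# substring: 6 characters starting at offset 2
first_word=${STRING:2:6}

# trim from the right: remove the shortest suffix matching ".*"
no_ext=${STRING%.*}

# trim from the left: first isolate the leading spaces, then strip them
leading=${STRING%%[! ]*}
ltrimmed=${STRING#"$leading"}

printf '[%s]\n' "$alnum_only" "$first_word" "$no_ext" "$ltrimmed"
```

All of these run inside the shell itself, so no extra process is started.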

#3

Quick and dirty:

STRING=$(echo 'dit /ZOU/ een test123' | perl -pe 's/ //g; tr/A-Z/a-z/; s/[^a-zA-Z0-9]//g')

#4

You could run it through perl.

export CLEANSTRING=$(STRING="$STRING" perl -e 'print join( q//, map { s/\s+/_/g; lc } split /[^\s\w]+/, $ENV{STRING} )')

I'm using a ksh-style subshell here; I'm not totally sure that it works in bash.

That's the nice thing about the shell: you can use perl, awk, sed, grep...

#5

After a bit of looking around it seems tr is indeed the simplest way:

export CLEANSTRING="`echo -n "${STRING}" | tr -cd '[:alnum:] [:space:]' | tr '[:space:]' '-'  | tr '[:upper:]' '[:lower:]'`"

Occam's razor, I suppose.
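
A variant of the same pipeline, as a sketch rather than a drop-in replacement: it emits underscores (as the question asks) instead of hyphens, and uses tr -s so that a run of whitespace collapses into a single underscore. The function name sanitize_tr is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: "sanitize_tr" is not part of the original answer.
sanitize_tr() {
    # keep only letters, digits, and whitespace
    # then turn each run of whitespace into a single underscore
    # then lowercase whatever is left
    printf '%s' "$1" |
        tr -cd '[:alnum:][:space:]' |
        tr -s '[:space:]' '_' |
        tr '[:upper:]' '[:lower:]'
}

CLEANSTRING=$(sanitize_tr 'Hello,   World!')
printf '%s\n' "$CLEANSTRING"
```

Note that the first tr also deletes any underscores already present in the input, since they are neither alphanumeric nor whitespace.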
