I have a script that checks a server's name and extracts its domain name. For example, given the server name example.ru01, I need to get example.ru. My script:
#!/bin/bash
hostname=example.com01
echo $hostname
reg0="\(\(\w*\.[a-z]*\)\|\(\w*\.[a-z]*\.[a-z]*\)\)"
domain=`expr match $hostname $reg0`
echo $domain
This works. The output is:
example.com01
example.com
However, in my infrastructure I have some domains with hyphens, for example test-test.com01, and my script does not work for them. How can I solve this? I changed my regular expression like this:
\(\(\w*\.[a-z_-]*\)\|\(\w*\.[a-z_-]*\.[a-z_-]*\)\)
But it still doesn't work. Where is my error? Thanks for your attention.
2 answers
#1
1
Yes, test-test.com01 will not match.
However, www.test-test.com01 will:
$ hostname="www.test-test.com01"
$ reg0="\(\(\w*\.[a-z_-]*\)\|\(\w*\.[a-z_-]*\.[a-z_-]*\)\)"
$ expr match $hostname $reg0
www.test-test.com
The problem is that your pattern requires \w* (zero or more word characters) followed by a literal dot \. . Since \w does not match a hyphen, \w* stops after "test", and the dot then cannot match the "-" that follows.
Note that \w means "a word character", not the letter w. If what you wanted to match is the "www" prefix, you should remove the backslash.
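To make the failure visible, here is a minimal sketch. It assumes the POSIX form expr STRING : REGEX rather than expr match, and it writes \w out as [[:alnum:]_] so that it does not depend on GNU extensions; the variable names are only illustrative.

```shell
# \w is written out as [[:alnum:]_] so the example stays within POSIX expr.
hostname="test-test.com01"
word='[[:alnum:]_]*'

# The run of word characters alone matches only up to the first hyphen.
prefix=$(expr "$hostname" : "\($word\)")
echo "$prefix"    # test

# Requiring a dot right after that run fails, because the next character
# is "-". expr then prints an empty string and returns status 1.
full=$(expr "$hostname" : "\($word\.[a-z]*\)") || true
echo "[$full]"    # []
```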
Also, underscores are not valid in host names. This is the regex I would suggest instead:
reg0="\(\(w\{1,3\}\.\)\?[a-z-]\+\(\.[a-z-]*\)\?\)"
In this one, the www. prefix is matched optionally, followed by one or (optionally) two names with a dot in between.
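A quick check of that pattern could look like the following. This is a sketch that assumes GNU expr, since expr match and the \? and \+ operators are GNU extensions; the variable names plain and with_www are made up for the example.

```shell
# Assumes GNU expr: "match", \? and \+ are GNU extensions to POSIX BRE.
reg0="\(\(w\{1,3\}\.\)\?[a-z-]\+\(\.[a-z-]*\)\?\)"

# expr prints the text captured by the first \( \) group,
# which here wraps the whole pattern.
plain=$(expr match "test-test.com01" "$reg0")
with_www=$(expr match "www.test-test.com01" "$reg0")

echo "$plain"       # test-test.com
echo "$with_www"    # www.test-test.com
```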
However, domain names can also include digits, for example www.1and1.com, which the character class [a-z-] does not cover.
And watch out: nowadays domain names may also contain non-ASCII characters, expressed for example in UTF-8 (internationalized domain names).
From section 3.3 of RFC 6531:
The definition of <domain> is extended to permit both the RFC 5321 definition and a UTF-8 string in a DNS label that conforms with IDNA definitions [RFC5890].
And section 2.3.2.1 of RFC 5890:
A "U-label" is an IDNA-valid string of Unicode characters, in Normalization Form C (NFC) and including at least one non-ASCII character, expressed in a standard Unicode Encoding Form (such as UTF-8).
#2
0
You are on the right track. The small problem is that you added - to the part of the regex that matches the last part of the domain, such as .com, .net or .ru. Instead, you should add - to the first part of the regex. This should work:
reg0="\(\([a-z0-9_-]*\.[a-z]*\)\|\([a-z0-9_-]*\.[a-z0-9_-]*\.[a-z]*\)\)"
The class [a-z0-9_] can be shortened to the token \w, which works without any problem on its own. However, \w does not seem to work inside [] (there the backslash is taken literally), so I spelled it out as [a-z0-9_] in order to add - to the class.
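The same idea (hyphens and digits allowed in the leading labels, letters only in the last one) can be sketched without the GNU-only \| operator. The helper name domain_of is hypothetical, and the pattern is a generalisation for illustration rather than the regex given above; it assumes the POSIX form expr STRING : REGEX and leaves out the underscore.

```shell
# Hypothetical helper: print the domain part of a server name such as
# "test-test.com01". A POSIX BRE sketch, not the regex from the answer.
domain_of() {
    # One or more "label." groups made of letters, digits and hyphens,
    # then a purely alphabetic final label. Trailing digits are left
    # unmatched, so they do not appear in the captured group.
    expr "$1" : '\(\([a-z0-9-]\{1,\}\.\)\{1,\}[a-z]\{1,\}\)'
}

d1=$(domain_of "example.com01")
d2=$(domain_of "test-test.com01")
d3=$(domain_of "www.1and1.com01")

echo "$d1"    # example.com
echo "$d2"    # test-test.com
echo "$d3"    # www.1and1.com
```

If the name contains no dot at all, expr prints an empty string and returns a non-zero status, so a caller can test for that case.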