Basically, for each row I need to get the domain and subdomain name from the URL, that is, the whole website name excluding www.
My DB table looks like this:
+----+-------------------------+
| id | website                 |
+----+-------------------------+
| 1  | https://www.google.com  |
| 2  | http://www.google.co.in |
| 3  | www.google.com          |
| 4  | www.google.co.in        |
| 5  | google.com              |
| 6  | google.co.in            |
| 7  | http://google.co.in     |
+----+-------------------------+
Expected output:
google.com
google.co.in
google.com
google.co.in
google.com
google.co.in
google.co.in
My Postgres Query looks like this:
select id, substring(website from '.*://([^/]*)') as website_domain from contacts
But the above query returns blank values for the websites. So, how can I get the desired output?
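A likely reason for the blanks: the pattern requires a literal :// in the value, so substring() yields NULL for rows that have no scheme, and rows that do have one still keep their www. prefix. The sketch below mirrors that behaviour with Python's re module as a stand-in for PostgreSQL (an assumption; the two engines should agree on a pattern this simple). question_extract is a hypothetical helper name, not something from the original query.

```python
import re

# The pattern from the question; it only matches when "://" is present.
QUESTION_PATTERN = re.compile(r".*://([^/]*)")

def question_extract(website):
    """Return capture group 1, or None when the pattern does not match."""
    match = QUESTION_PATTERN.search(website)
    return match.group(1) if match else None

# Values taken from the table in the question.
for value in ["https://www.google.com", "www.google.com", "google.co.in"]:
    print(value, "->", question_extract(value))
```

Only the first value matches, and its result still starts with www., which is why both the scheme and the www. part need to be optional in the pattern.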
2 Solutions
#1
2
You may use
SELECT REGEXP_REPLACE(website, '^(https?://)?(www\.)?', '') from tbl;
See the regex demo.
Details
- ^ - start of string
- (https?://)? - 1 or 0 occurrences of http:// or https://
- (www\.)? - 1 or 0 occurrences of www.
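To try the pattern outside the database, the three parts listed above can be exercised with Python's re module. This is only a sketch under the assumption that Python and PostgreSQL treat this pattern the same way; strip_prefix and SAMPLES are hypothetical names, and the sample values are the rows from the question.

```python
import re

# ^            start of string
# (https?://)? optional http:// or https://
# (www\.)?     optional www.
PREFIX = re.compile(r"^(https?://)?(www\.)?")

def strip_prefix(website):
    """Remove an optional scheme and an optional leading www. from the value."""
    # count=1 mirrors REGEXP_REPLACE without the 'g' flag (first match only).
    return PREFIX.sub("", website, count=1)

SAMPLES = [
    "https://www.google.com",
    "http://www.google.co.in",
    "www.google.com",
    "www.google.co.in",
    "google.com",
    "google.co.in",
    "http://google.co.in",
]

for value in SAMPLES:
    print(value, "->", strip_prefix(value))
```

Because both groups are optional, a value with neither prefix produces an empty match at the start and is returned unchanged.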
See the PostgreSQL demo:
CREATE TABLE tb1
(website character varying)
;
INSERT INTO tb1
(website)
VALUES
('https://www.google.com'),
('http://www.google.co.in'),
('www.google.com'),
('www.google.co.in'),
('google.com'),
('google.co.in'),
('http://google.co.in')
;
SELECT REGEXP_REPLACE(website, '^(https?://)?(www\.)?', '') from tb1;
Result:
google.com
google.co.in
google.com
google.co.in
google.com
google.co.in
google.co.in
#2
4
You need optional "non-capturing" groups, written (?:...), to cope with the websites that have no "http://" prefix, so that the only capturing group left is the domain itself, like this:
select
id,
substring(website from '(?:.*://)?(?:www\.)?([^/]*)')
as website_domain
from contacts
http://sqlfiddle.com/#!17/197fb/14
https://www.postgresql.org/docs/9.3/static/functions-matching.html#POSIX-ATOMS-TABLE
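One practical difference from the REGEXP_REPLACE approach is that ([^/]*) stops at the first slash, so anything after the host is dropped rather than kept. The sketch below illustrates this with Python's re module standing in for PostgreSQL (an assumption that both engines agree on this pattern); host_only is a hypothetical helper, and the URL with a /maps path is a made-up input that does not appear in the question's table.

```python
import re

# (?:.*://)?  optional scheme, not captured
# (?:www\.)?  optional www., not captured
# ([^/]*)     the only capturing group: everything up to the first slash
HOST = re.compile(r"(?:.*://)?(?:www\.)?([^/]*)")

def host_only(website):
    """Return the single captured group, i.e. the host without scheme or www."""
    # Every part of the pattern may be empty, so match() always succeeds.
    return HOST.match(website).group(1)

for value in ["https://www.google.com", "www.google.co.in",
              "google.com", "https://www.google.com/maps"]:
    print(value, "->", host_only(value))
```

If the path should be preserved instead, the replace-based query is the closer fit; if only the host is wanted, this capture-based form does that in one step.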