urlencode with only built-in functions

Date: 2023-01-15 21:45:47

Without using plpgsql, I'm trying to urlencode a given text within a pgsql SELECT statement.

The problem with this approach:

select regexp_replace('héllo there','([^A-Za-z0-9])','%' || encode(E'\\1','hex'),'g')

...is that the encode function never receives the regexp match: the replacement argument is evaluated once, before regexp_replace runs, so encode only sees the literal \1. Unless there's another way to call functions from within the replacement expression that actually works, I'm wondering if there's a replacement expression that, by itself, can encode matches into hex values.
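To make the evaluation order concrete, here is a small sketch. It wraps the literal in convert_to so that encode receives a bytea value explicitly; that wrapper and the sample input 'a b' are illustrative assumptions, not part of the original query.

```sql
-- The replacement argument is an ordinary expression, computed once
-- before regexp_replace looks at any match.
SELECT '%' || encode(convert_to(E'\\1', 'UTF8'), 'hex') AS replacement;
-- replacement: %5c31  (hex of the two characters "\" and "1")

-- Every match is therefore replaced by that same constant,
-- not by the hex value of the matched character.
SELECT regexp_replace('a b', '[^A-Za-z0-9]',
                      '%' || encode(convert_to(E'\\1', 'UTF8'), 'hex'),
                      'g') AS result;
-- result: a%5c31b
```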

There may be other combinations of functions. I thought there would be a clever regex (and that may still be the answer) out there, but I'm having trouble finding it.

4 answers

#1


6  

select regexp_replace(encode('héllo there','hex'),'(..)',E'%\\1','g');

This doesn't leave the alphanumeric characters human-readable, though.
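For reference, a variant of the same idea that pins the byte encoding with convert_to and uppercases the hex digits. Both choices are assumptions added here for illustration, and the output shown assumes the é in the literal is the single code point U+00E9.

```sql
-- Hex-encode every byte of the UTF-8 representation, then prefix each pair with %.
SELECT upper(
         regexp_replace(
           encode(convert_to('héllo there', 'UTF8'), 'hex'),
           '(..)', E'%\\1', 'g')
       ) AS encoded;
-- encoded: %68%C3%A9%6C%6C%6F%20%74%68%65%72%65
```

Every character is encoded, including the plain letters, which is the readability problem mentioned above.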

#2


1  

Here's a function I wrote that handles encoding using built-in functions while preserving the readability of the URL.

The regex captures pairs consisting of a run of (optional) safe characters followed by (at most one) unsafe character. Nested selects then encode each pair and recombine the pairs into a fully encoded string.
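To see the pairs the regex produces, the matching step can be run on its own. This is only a sketch: the sample input 'my file/a+b' is made up, and the safe set is written out inline rather than built from a variable.

```sql
-- Each row is one pair: a run of safe characters, then at most one unsafe character.
-- unsafe_char is NULL when a run of safe characters reaches the end of the input,
-- which is why the function wraps both parts in COALESCE.
SELECT m[1] AS safe_prefix,
       m[2] AS unsafe_char
FROM regexp_matches(
       'my file/a+b',
       '([a-zA-Z0-9_~/.-]*)([^a-zA-Z0-9_~/.-])?',
       'g') AS m;
```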

I've run it through a test suite with all sorts of permutations (leading, trailing, only, and repeated encoded characters), and thus far it seems to encode correctly.

The safe special characters are _ ~ . - and /. My inclusion of "/" on that list is probably non-standard, but fits the use case I have where the input text may be a path and I want that to remain.

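CREATE OR REPLACE FUNCTION oseberg.encode_uri(input text)
  RETURNS text
  LANGUAGE plpgsql
  IMMUTABLE STRICT
AS $function$
DECLARE
  parsed text;
  -- Characters left unencoded; "/" is included so that paths stay readable.
  safePattern text := 'a-zA-Z0-9_~/\-\.';
BEGIN
  -- Only do the work if the input contains at least one unsafe character.
  IF input ~ ('[^' || safePattern || ']') THEN
    SELECT STRING_AGG(fragment, '')
    INTO parsed
    FROM (
      SELECT prefix || encoded AS fragment
      FROM (
        SELECT COALESCE(match[1], '') AS prefix,
               -- Encode byte by byte so a multibyte character becomes %XX%YY, not %XXYY.
               COALESCE(
                 upper(regexp_replace(
                   encode(convert_to(match[2], 'UTF8'), 'hex'),
                   '(..)', E'%\\1', 'g')),
                 '') AS encoded
        FROM (
          -- Each match is a run of safe characters plus at most one unsafe character.
          SELECT regexp_matches(
            input,
            '([' || safePattern || ']*)([^' || safePattern || '])?',
            'g') AS match
        ) matches
      ) parsed
    ) fragments;
    RETURN parsed;
  ELSE
    RETURN input;
  END IF;
END;
$function$;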

#3


0  

Here is a pretty short version, and it's even a "pure SQL" function, not plpgsql. Multibyte characters (including 3- and 4-byte emoji) are supported.

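create or replace function urlencode(in_str text, OUT _result text) returns text as $$
  select
    string_agg(
      case
        -- encode multibyte characters and anything outside the safe set
        when ol>1 or ch !~ '[0-9a-zA-Z:/@._?#-]+'
          -- convert_to avoids the text-to-bytea cast, which rejects a lone backslash
          then regexp_replace(upper(encode(convert_to(ch, 'UTF8'), 'hex')), '(..)', E'%\\1', 'g')
        else ch
      end,
      ''
    )
  from (
    -- split the input into single characters and record each one's byte length
    select ch, octet_length(ch) as ol
    from regexp_split_to_table($1, '') as ch
  ) as s;
$$ language sql immutable strict;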
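Since the question asks for something usable directly inside a SELECT, the same per-character idea can also be written inline, without creating a function. This is a sketch under a few assumptions: WITH ORDINALITY needs PostgreSQL 9.4 or later, and the safe set here is only the RFC 3986 unreserved characters, which is narrower than the set used in the function above.

```sql
-- Split the text into characters, keep unreserved ones, percent-encode the rest.
SELECT string_agg(
         CASE
           WHEN ch ~ '^[A-Za-z0-9_.~-]$' THEN ch
           ELSE upper(regexp_replace(
                  encode(convert_to(ch, 'UTF8'), 'hex'),
                  '(..)', E'%\\1', 'g'))
         END,
         '' ORDER BY ord) AS encoded
FROM regexp_split_to_table('héllo there', '') WITH ORDINALITY AS t(ch, ord);
-- encoded: h%C3%A9llo%20there
```

The ORDER BY ord inside string_agg makes the reassembly order explicit instead of relying on the order in which the rows happen to arrive.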

#4


-3  

You can use CLR and import the namespace, or use the function shown at the link below, which creates a T-SQL function that does the encoding.

http://www.sqljunkies.com/WebLog/peter_debetta/archive/2007/03/09/28987.aspx
