Find the ` character with a regular expression and replace it

Time: 2023-01-20 19:26:11

I have certain text within a command \grk{} that looks like this:

\grk{s`u e@i `o qrist`os <o u<i`ws to~u jeo`u `ao~u z~wntos} 

I need to find every instance of a white space followed by ` and replace it with the white space followed by the word XLFY.

The result from the above should be:

\grk{s`u e@i XLFYo qrist`os <o u<i`ws to~u jeo`u XLFYao~u z~wntos} 

All other instances of a white space followed by ` outside \grk{} should be ignored.

I got this far:

(?<=grk\{)(.*?)(?=\})

This finds and selects all the text within \grk{}.

Any idea how I can select just the white space followed by the ` that is inside, and replace it?

2 solutions

#1

If you have a file with many \grk{} sections (and other content), probably the fastest way to achieve the goal is what @Jan suggested. @noob's regex is fine for a single \grk{}.

The problem with (?<=grk\{)(.*?)(?=\}) is that most regex engines only support fixed-length lookbehind, so a single lookbehind cannot skip the variable amount of text between \grk{ and " `". Take a look at this post.

You can also use a bash script:

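#!/bin/bash
# Usage: ./replace.sh FILE
# Writes the result to FILE_replaced and leaves FILE untouched.
file=$1
newFile="${file}_replaced"
val=$(cat -- "$file")
# A literal \grk{ ... } group (non-greedy, up to the first closing brace).
regex='\\grk\{.*?\}'

# Read the matches via process substitution so that $val survives the loop.
while IFS= read -r line; do
    # Inside the group, turn "<whitespace>`" into "<whitespace>XLFY".
    replacement=$(printf '%s\n' "$line" | sed -E 's/([[:space:]])`/\1XLFY/g')
    # Quoting makes bash treat the group as literal text, not as a pattern.
    val=${val//"$line"/"$replacement"}
done < <(grep -oP -- "$regex" "$file")

printf '%s\n' "$val" > "$newFile"
cat -- "$newFile"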

It takes a file as an argument and creates file_replaced, which meets your conditions.

EDIT: To run the script for each file in a directory:

for file in *; do ./replace.sh "$file"; done

Before that, change the script so that it overwrites the existing file:

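#!/bin/bash
# Usage: ./replace.sh FILE   (overwrites FILE in place)
file=$1
val=$(cat -- "$file")
# A literal \grk{ ... } group (non-greedy, up to the first closing brace).
regex='\\grk\{.*?\}'

while IFS= read -r line; do
    # Inside the group, turn "<whitespace>`" into "<whitespace>XLFY".
    replacement=$(printf '%s\n' "$line" | sed -E 's/([[:space:]])`/\1XLFY/g')
    # Quoting makes bash treat the group as literal text, not as a pattern.
    val=${val//"$line"/"$replacement"}
done < <(grep -oP -- "$regex" "$file")

printf '%s\n' "$val" > "$file"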

But if you don't use a VCS, please back up your files first!

EDIT2: a version with debug output:

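#!/bin/bash
# Usage: ./replace.sh FILE   (overwrites FILE in place, printing each step)
file=$1
val=$(cat -- "$file")
echo '--- file ---'
printf '%s\n' "$val"
regex='\\grk\{.*?\}'
printf 'regex: %s\n' "$regex"
while IFS= read -r line; do
    printf 'LINE:        %s\n' "$line"
    replacement=$(printf '%s\n' "$line" | sed -E 's/([[:space:]])`/\1XLFY/g')
    printf 'REPLACEMENT: %s\n' "$replacement"
    # Quoting makes bash treat the group as literal text, not as a pattern.
    val=${val//"$line"/"$replacement"}
done < <(grep -oP -- "$regex" "$file")
printf '%s\n' "$val" > "$file"
echo '--- file after ---'
cat -- "$file"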

#2

You could do it pretty easily with the help of a programming language. Here is some PHP code to show the concept (it could be achieved with other languages as well); it also takes care of reading and writing the files:

<?php
foreach (glob("*.txt") as $filename) {
    // load the file content
    $content = file_get_contents($filename);
    // match each \grk{...} group (no nested braces)
    $regex = '#\\\\grk\{[^}]+\}#';

    $newContent = preg_replace_callback(
        $regex,
        function ($matches) {
            // inside the group, replace "<horizontal whitespace>`" with " XLFY"
            $regex = '#\h`#';
            return preg_replace($regex, ' XLFY', $matches[0]);
        },
        $content);

    // write it back to the original file
    file_put_contents($filename, $newContent);
}
?>

The idea is to grab each \grk{...} group in the first step, and then to replace every occurrence of a whitespace followed by "`" inside it.
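
The same two-step idea can also be sketched in shell with plain awk, for readers who do not have PHP at hand. This is only an illustration under assumptions: the function name replace_two_step is made up, each group is assumed to sit on a single line without nested braces, and only a space or a tab before the backtick is handled.

```shell
# Hypothetical helper: two-step replacement, line by line.
# Assumes each \grk{...} group sits on one line and has no nested braces.
replace_two_step() {
    awk '{
        out = ""
        rest = $0
        # Step 1: locate the next \grk{...} group on the line.
        while (match(rest, /\\grk[{][^}]*[}]/)) {
            group = substr(rest, RSTART, RLENGTH)
            # Step 2: rewrite only inside the group (space or tab before `).
            gsub(/ `/, " XLFY", group)
            gsub(/\t`/, "\tXLFY", group)
            # Keep the text before the group unchanged, then move past it.
            out = out substr(rest, 1, RSTART - 1) group
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }' "$@"
}

# Example call: text outside the group is left alone.
printf '%s\n' 'a `b \grk{c `d} e `f' | replace_two_step
```

Because the text outside each group is copied through untouched, any " `" outside \grk{} is left alone, which is the condition stated in the question.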
