Notepad++ regular expression for replacing mixed characters

Date: 2021-08-12 23:36:42

Please help me write a regular expression for this kind of text in Notepad++.

I have a text with mixed Russian and German sentences and words. I need to insert the tag <"RUSSIAN"> before each Russian passage and the tag <"GERMAN"> before each German passage, like this:

INPUT:

Текст на русском, раз два три, german text - русский текст: german text - some other german text русский текст = еще русский текст. Длинный текст на русском. A long text on german

OUTPUT:

<"RUSSIAN">Текст на русском, раз два три, <"GERMAN">german text - <"RUSSIAN">русский текст: <"GERMAN">german text - some other german text <"RUSSIAN">русский текст = еще русский текст. Длинный текст на русском. <"GERMAN">A long text on german

I guess it could be done by searching for Cyrillic characters like "А,а,Б,б,В,в,Г,г,Д,д,Е,е,Ё,ё,Ж,ж,З,з,И,и,Й,й,К,к,Л,л,М,м,Н,н,О,о,П,п,Р,р,С,с,Т,т,У,у,Ф,ф,Х,х,Ц,ц,Ч,ч,Ш,ш,Щ,щ,Ъ,ъ,Ы,ы,Ь,ь,Э,э,Ю,ю,Я,я"

and German characters like "A,a,B,b,C,c,D,d,E,e,F,f,G,g,H,h,I,i,J,j,K,k,L,l,M,m,N,n,O,o,P,p,Q,q,R,r,S,s,T,t,U,u,V,v,W,w,X,x,Y,y,Z,z,Ä,ä,Ö,ö,Ü,ü,ß".

1 Answer

#1


3  

Punctuation and numbers make this a bit iffy, but you can match any Cyrillic character and capture everything up to the next Latin character:

Find: ([А-я].+?)([a-z])
Replace with: <ru>\1</ru>\2


Then the other language is between </ru> and <ru>.
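
For readers who want to try the idea outside Notepad++, here is a minimal Python sketch, not part of the original answer. `wrap_russian` ports the Find/Replace pair above, writing `[A-Za-z]` explicitly instead of relying on the editor's "Match case" setting (an assumption about how the search was configured). `tag_languages` is a hypothetical variant that emits the tags the question asked for. The function names and the exact German letter set (`ÄäÖöÜüß`) are assumptions made for illustration.

```python
import re

# Sample input copied from the question.
SAMPLE = (
    "Текст на русском, раз два три, german text - русский текст: "
    "german text - some other german text русский текст = еще русский текст. "
    "Длинный текст на русском. A long text on german"
)

# Character sets; the umlauts and ß are an assumption about "German characters".
CYRILLIC = "А-Яа-яЁё"
LATIN = "A-Za-zÄäÖöÜüß"


def wrap_russian(text: str) -> str:
    """Port of the answer's pattern: wrap each Cyrillic run in <ru>...</ru>.

    A run is only wrapped when a Latin letter follows it.
    """
    return re.sub(r"([А-я].+?)([A-Za-z])", r"<ru>\1</ru>\2", text)


# A run starts at a letter of one script and continues over anything that is
# not a letter of the other script (spaces, punctuation, digits).
RUN = re.compile(
    rf"(?P<ru>[{CYRILLIC}][^{LATIN}]*)|(?P<de>[{LATIN}][^{CYRILLIC}]*)"
)


def tag_languages(text: str) -> str:
    """Insert <"RUSSIAN"> or <"GERMAN"> before each run of the matching script."""

    def add_tag(match: re.Match) -> str:
        tag = '<"RUSSIAN">' if match.group("ru") is not None else '<"GERMAN">'
        return tag + match.group(0)

    return RUN.sub(add_tag, text)


if __name__ == "__main__":
    print(wrap_russian(SAMPLE))
    print(tag_languages(SAMPLE))
```

On the sample from the question, `tag_languages` reproduces the requested OUTPUT line. Punctuation and digits stay attached to whichever run precedes them, which is the same "iffy" behaviour the answer warns about.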
