In a project we have text files looking like this:
mv A, R3
mv R2, B
mv R1, R3
mv B, R4
add A, R1
add B, R1
add R1, R2
add R3, R3
add R21, X
add R12, Y
mv X, R2
I need to replace the strings according to the following, but I am looking for a more general solution.
R1 => R2
R2 => R3
R3 => R1
R12 => R21
R21 => R12
I know I could do it in Perl (see the replace() function in the following code), but the real application is written in Java, so the solution needs to be in Java as well.
#!/usr/bin/perl
use strict;
use warnings;
use File::Slurp qw(read_file write_file);

my %map = (
    R1  => 'R2',
    R2  => 'R3',
    R3  => 'R1',
    R12 => 'R21',
    R21 => 'R12',
);

replace(\%map, \@ARGV);

sub replace {
    my ($map, $files) = @_;

    # Create R12|R21|R1|R2|R3
    # making sure R12 is before R1
    my $regex = join "|",
                sort { length($b) <=> length($a) }
                keys %$map;

    my $ts = time;
    foreach my $file (@$files) {
        my $data = read_file($file);
        $data =~ s/\b($regex)\b/$map{$1}/g;
        rename $file, "$file.$ts";    # backup with current timestamp
        write_file( $file, $data);
    }
}
Your help for the Java implementation would be appreciated.
5 solutions
#1
5
I've actually had to use this sort of algorithm several times in the past two weeks. So here it is in the world's second-most verbose language...
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceDemo {
    public static void main( String[] args ) {
        /*
          R1  => R2
          R2  => R3
          R3  => R1
          R12 => R21
          R21 => R12
        */
        String inputString
            = "mv A, R3\n"
            + "mv R2, B\n"
            + "mv R1, R3\n"
            + "mv B, R4\n"
            + "add A, R1\n"
            + "add B, R1\n"
            + "add R1, R2\n"
            + "add R3, R3\n"
            + "add R21, X\n"
            + "add R12, Y\n"
            + "mv X, R2"
            ;
        System.out.println( "inputString = \"" + inputString + "\"" );

        Map<String, String> h = new HashMap<String, String>();
        h.put( "R1", "R2" );
        h.put( "R2", "R3" );
        h.put( "R3", "R1" );
        h.put( "R12", "R21" );
        h.put( "R21", "R12" );

        // Matches exactly R1, R12, R2, R21 and R3 as whole words.
        Pattern p = Pattern.compile( "\\b(R(?:12?|21?|3))\\b" );
        Matcher m = p.matcher( inputString );
        StringBuilder sbuff = new StringBuilder();
        int lastEnd = 0;
        while ( m.find() ) {
            // Copy the unmatched text between the previous match and this one.
            sbuff.append( inputString, lastEnd, m.start() );
            // Every key the pattern can match is present in the map.
            sbuff.append( h.get( m.group( 1 ) ) );
            lastEnd = m.end();
        }
        // Copy whatever follows the last match.
        sbuff.append( inputString.substring( lastEnd ) );
        System.out.println( "sbuff = \"" + sbuff + "\"" );
    }
}
This can be Java-ified by these classes:
import java.util.Comparator;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

interface StringReplacer {
    CharSequence getReplacement( Matcher matcher );
}

class Replacementifier {

    // Longest keys first, so that R12 is tried before R1; ties are broken alphabetically.
    static final Comparator<String> keyComparator = new Comparator<String>() {
        public int compare( String s1, String s2 ) {
            int diff = s2.length() - s1.length();
            return diff != 0 ? diff : s1.compareTo( s2 );
        }
    };

    Map<String, String> replaceMap = null;

    public Replacementifier( Map<String, String> aMap ) {
        if ( aMap != null ) {
            setReplacements( aMap );
        }
    }

    public void setReplacements( Map<String, String> aMap ) {
        replaceMap = aMap;
    }

    // Builds an expression such as \b(R12|R21|R1|R2|R3)\b; keys are quoted so they match literally.
    private static String createKeyExpression( Map<String, String> m ) {
        Set<String> set = new TreeSet<String>( keyComparator );
        set.addAll( m.keySet() );
        Iterator<String> sit = set.iterator();
        StringBuilder sb = new StringBuilder( "\\b(" + Pattern.quote( sit.next() ) );
        while ( sit.hasNext() ) {
            sb.append( "|" ).append( Pattern.quote( sit.next() ) );
        }
        sb.append( ")\\b" );
        return sb.toString();
    }

    public String replace( Pattern pattern, CharSequence input, StringReplacer replaceFilter ) {
        StringBuilder output = new StringBuilder();
        Matcher matcher = pattern.matcher( input );
        int lastEnd = 0;
        while ( matcher.find() ) {
            // Copy the unmatched text between the previous match and this one.
            output.append( input, lastEnd, matcher.start() );
            CharSequence cs = replaceFilter.getReplacement( matcher );
            if ( cs != null ) {
                output.append( cs );
            }
            lastEnd = matcher.end();
        }
        // Copy whatever follows the last match.
        output.append( input, lastEnd, input.length() );
        return output.toString();
    }

    public String replace( Map<String, String> rMap, CharSequence input ) {
        final Map<String, String> repMap = rMap != null ? rMap : replaceMap;
        // pre-condition: without any replacements the input is returned unchanged
        if ( repMap == null || repMap.isEmpty() ) return input.toString();
        Pattern pattern = Pattern.compile( createKeyExpression( repMap ) );
        StringReplacer replacer = new StringReplacer() {
            public CharSequence getReplacement( Matcher matcher ) {
                return repMap.get( matcher.group( 1 ) );
            }
        };
        return replace( pattern, input, replacer );
    }
}
#2
2
The Perl solution has the advantage of replacing all strings in one shot, sort of "transactionally". If you don't have the same option in Java (and I can't think of a way to make it happen), you need to be careful about replacing R1=>R2 and then R2=>R3. In that case, both R1 and R2 end up being replaced with R3.
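To make the pitfall concrete, here is a minimal sketch that compares chained replaceAll() calls with a single pass over the input. The class and method names are made up for illustration. The single-pass variant relies on java.util.regex.Matcher, so a one-shot replacement is in fact available in Java.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ChainedVsSinglePass {

    // Applies each rule to the output of the previous rule.
    public static String chained(String input, Map<String, String> rules) {
        String s = input;
        for (Map.Entry<String, String> e : rules.entrySet()) {
            s = s.replaceAll("\\b" + Pattern.quote(e.getKey()) + "\\b",
                             Matcher.quoteReplacement(e.getValue()));
        }
        return s;
    }

    // Scans the input once; text that has been replaced is never rescanned.
    public static String singlePass(String input, Map<String, String> rules) {
        Matcher m = Pattern.compile("\\b\\w+\\b").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String repl = rules.get(m.group());
            // Words without a rule are copied through unchanged.
            m.appendReplacement(sb, Matcher.quoteReplacement(repl != null ? repl : m.group()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so R1 => R2 runs before R2 => R3.
        Map<String, String> rules = new LinkedHashMap<String, String>();
        rules.put("R1", "R2");
        rules.put("R2", "R3");
        System.out.println(chained("add R1, R2", rules));    // add R3, R3
        System.out.println(singlePass("add R1, R2", rules)); // add R2, R3
    }
}
```

The chained version collapses R1 and R2 into R3 exactly as described above, while the single pass keeps them apart.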
#3
0
Here's a less verbose way to do this in one pass, using Matcher's lower-level API: appendReplacement() and appendTail().
import java.util.*;
import java.util.regex.*;

public class Test
{
  public static void main(String[] args) throws Exception
  {
    String inputString
        = "mv A, R3\n"
        + "mv R2, B\n"
        + "mv R1, R3\n"
        + "mv B, R4\n"
        + "add A, R1\n"
        + "add B, R1\n"
        + "add R1, R2\n"
        + "add R3, R3\n"
        + "add R21, X\n"
        + "add R12, Y\n"
        + "mv X, R2"
        ;
    System.out.println(inputString);
    System.out.println();
    System.out.println(doReplace(inputString));
  }

  public static String doReplace(String str)
  {
    Map<String, String> map = new HashMap<String, String>()
    {{
      put("R1", "R2");
      put("R2", "R3");
      put("R3", "R1");
      put("R12", "R21");
      put("R21", "R12");
    }};
    Pattern p = Pattern.compile("\\bR\\d\\d?\\b");
    Matcher m = p.matcher(str);
    StringBuffer sb = new StringBuffer();
    while (m.find())
    {
      String repl = map.get(m.group());
      if (repl != null)
      {
        m.appendReplacement(sb, "");
        sb.append(repl);
      }
    }
    m.appendTail(sb);
    return sb.toString();
  }
}
Note that appendReplacement() processes the replacement string to replace $n sequences with text from capture groups, which we don't want in this case. To avoid that, I pass it an empty string, then use StringBuffer's append() method instead.
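An alternative to the empty-string trick is Matcher.quoteReplacement(), which escapes '$' and '\' so that appendReplacement() inserts the value literally. The sketch below is only an illustration: the class name and the map contents, including the replacement value "$1", are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteReplacementDemo {

    public static String doReplace(String str, Map<String, String> map) {
        Matcher m = Pattern.compile("\\bR\\d\\d?\\b").matcher(str);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String repl = map.get(m.group());
            if (repl != null) {
                // quoteReplacement() escapes '$' and '\' so the value is inserted as-is.
                m.appendReplacement(sb, Matcher.quoteReplacement(repl));
            }
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<String, String>();
        map.put("R1", "$1"); // hypothetical replacement that contains a '$'
        map.put("R2", "R3");
        System.out.println(doReplace("mv R1, R2", map)); // mv $1, R3
    }
}
```

Without the quoting, a value such as "$1" would be read as a reference to capture group 1, which this pattern does not have.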
Elliott Hughes has published a pre-packaged implementation of this technique here. (He tends to throw in references to other utility classes he's written, so you may want to delete the tests in his main() method before you compile it.)
#4
0
My suggestion would be to replace the strings while reading from the file itself. You can use RandomAccessFile. While reading the file character by character, you can check for the pattern and do the replacement then and there. Then you can write all the content into the file at once. I think this will save you more time.
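For files of modest size, a simpler whole-file variant might look like the sketch below. Note that it does not use RandomAccessFile: it reads the file with java.nio.file.Files, replaces in one pass, and mirrors the backup-then-write step of the Perl script in the question. The class name, the method names, the \bR\d+\b pattern and the UTF-8 encoding are all assumptions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FileReplace {

    // One pass over the text; register names without a mapping are left alone.
    public static String replaceRegisters(String text, Map<String, String> map) {
        Matcher m = Pattern.compile("\\bR\\d+\\b").matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String repl = map.get(m.group());
            m.appendReplacement(sb, Matcher.quoteReplacement(repl != null ? repl : m.group()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Rewrites the file in place, keeping the original as "<name>.<timestamp>".
    public static void rewrite(Path file, Map<String, String> map) throws IOException {
        // Assumption: the files are UTF-8 encoded.
        String data = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String result = replaceRegisters(data, map);
        long ts = System.currentTimeMillis() / 1000;
        Files.move(file, file.resolveSibling(file.getFileName() + "." + ts));
        Files.write(file, result.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> map = new HashMap<String, String>();
        map.put("R1", "R2");
        map.put("R2", "R3");
        map.put("R3", "R1");
        map.put("R12", "R21");
        map.put("R21", "R12");
        for (String name : args) {
            rewrite(Paths.get(name), map);
        }
    }
}
```

Reading the whole file trades memory for simplicity; for very large files a streaming approach along the lines of this answer would be the better fit.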
#5
-2
You can use a HashMap:
Map<String, String> map = new HashMap<String, String>();
map.put("R1", "R2");
map.put("R2", "R3");
for (String key : map.keySet()) {
    // Strings are immutable, so the result of replaceAll() must be assigned back.
    str = str.replaceAll(key, map.get(key));
}
replaceAll also handles regular expressions.
EDIT: The above solution, as many have pointed out, doesn't work because it doesn't handle cyclic replacements. So this is my second approach:
public class Replacement {
    private String newS;
    private String old;

    public Replacement(String old, String newS) {
        this.newS = newS;
        this.old = old;
    }

    public String getOld() {
        return old;
    }

    public String getNew() {
        return newS;
    }
}

// The integer key fixes the order in which the replacements are applied.
SortedMap<Integer, Replacement> map = new TreeMap<Integer, Replacement>();
map.put(1, new Replacement("R2", "R3"));
map.put(2, new Replacement("R1", "R2"));
for (Replacement r : map.values()) {
    // Strings are immutable, so the result of replaceAll() must be assigned back.
    str = str.replaceAll(r.getOld(), r.getNew());
}
This works provided that you order the replacements properly and guard against cyclic replacements. Some sets of replacements cannot be ordered at all, because they form a cycle:
R1 -> R2
R2 -> R3
R3 -> R1
You must use some 'temp' variables for these:
R1 -> R@2
R2 -> R@3
R3 -> R1
R@(\d{1}) -> R\1
You could write a library that does all of this for you.
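A rough sketch of that placeholder idea follows. It marks every replacement, not only the ones that take part in a cycle, and it assumes that '@' never occurs in the input and that no replacement value is empty. The class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderReplace {

    public static String replace(String input, Map<String, String> map) {
        String s = input;
        // Phase 1: rewrite each key to a marked form of its target, e.g. R1 -> R@2.
        // The marker keeps already-replaced text from matching a later key.
        for (Map.Entry<String, String> e : map.entrySet()) {
            String target = e.getValue();
            String marked = target.charAt(0) + "@" + target.substring(1);
            s = s.replaceAll("\\b" + Pattern.quote(e.getKey()) + "\\b",
                             Matcher.quoteReplacement(marked));
        }
        // Phase 2: strip the markers (assumes the input itself contains no '@').
        return s.replace("@", "");
    }

    public static void main(String[] args) {
        Map<String, String> map = new LinkedHashMap<String, String>();
        map.put("R1", "R2");
        map.put("R2", "R3");
        map.put("R3", "R1");
        map.put("R12", "R21");
        map.put("R21", "R12");
        System.out.println(replace("add R1, R2\nadd R3, R3\nadd R21, X\nadd R12, Y", map));
    }
}
```

Because every target is marked, the order of the rules no longer matters, at the cost of one extra pass to strip the markers.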