Skipping parts of a file when reading and writing in Android / Java

Time: 2021-07-26 21:54:04

I'm trying to learn Java/Android, and right now I'm doing some experiments with the replaceAll function. I've found that with large text files the process gets sluggish, so I was wondering if there is a way to skip the "useless" parts of a file to get better performance. (Note: just skip them, not delete them.)

Note: I am not trying to "count lines" or "println" or "system.out", I'm just replacing strings and saving the changes in the same file.

Example

AAAA

CCCC- 9234802394819102948102948104981209381'238901'2309'129831'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20391'20931'20853029573098341'290831'20893'12894093274019799919208310293810293810293810293810298'120931¿2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD

CCCC-9234802394819102948102948104981209381'238901'2309'129831'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20391'20931'20853029573098341'290831'20893'12894093274019799919208310293810293810293810293810298'120931¿ 2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD

AAAA

CCCC- 9234802394819102948102948104981209381'238901'2309'129831'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20391'20931'20853029573098341'290831'20893'12894093274019799919208310293810293810293810293810298'120931¿2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD

CCCC-9234802394819102948102948104981209381'238901'2309'129831'2381'2381'23081'23081'284091824098304982390482304981'20841'948023984129048'1489039842039481'204891'29031'923481290381'20391'294872385710239841'20391'20931'20853029573098341'290831'20893'12894093274019799919208310293810293810293810293810298'120931¿ 2093¿12039¿120931¿203912¿0391¿203912¿039¿12093¿12093¿12093¿12093¿12093¿1209312¿0390¿... DDDD

and so on....like a zillion times

I want to replace all "AAAA" with "BBBB", but there are large portions of data between the strings I am replacing. Also, these portions always begin with "CCCC" and end with "DDDD".

Here's the code I am using to replace the string.

File file = new File("my_file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = "", oldtext = "";
while((line = reader.readLine()) != null) {
   oldtext += line + "\r\n";
}
reader.close();

// Replacing "AAAA" strings
String newtext= oldtext.replaceAll("AAAA", "BBBB");

FileWriter writer = new FileWriter("my_file.txt");
writer.write(newtext);
writer.close();

I think reading all lines is inefficient, especially when you won't be modifying these parts (and they represent 90% of the file).

Does anyone know a solution???

1 solution

#1


You are wasting a lot of time on this line:

oldtext += line + "\r\n";

In Java, a String is immutable, which means it cannot be modified once created. Therefore, when you do the concatenation, Java is actually making a complete copy of oldtext. So, for every line in your file, you are recopying every line that came before into a new String. Take a look at StringBuilder for a way to build a String while avoiding these copies.
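As a rough illustration of the StringBuilder suggestion, here is a minimal sketch of the read loop from the question with the += replaced by append calls. The class name ReadAllLines and the helper readAll are made up for this example, and a StringReader stands in for the FileReader so the snippet runs without a file on disk.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class ReadAllLines {
    // Hypothetical helper: reads every line into one String, appending to a
    // StringBuilder instead of using += on a String.
    static String readAll(BufferedReader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            // Same "\r\n" line ending as the code in the question.
            sb.append(line).append("\r\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a StringReader replaces new FileReader("my_file.txt").
        BufferedReader reader =
                new BufferedReader(new StringReader("AAAA\nCCCC-123 DDDD\nAAAA"));
        String oldtext = readAll(reader);
        reader.close();
        System.out.print(oldtext.replaceAll("AAAA", "BBBB"));
    }
}
```

This still holds the whole file in memory, so it only removes the repeated copying, not the memory cost.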

However, in your case, you do not need the whole file in memory, because you can process line by line. By moving your replaceAll and write into your loop, you can operate on each line as you read it. This will keep the memory footprint of the routine down, because you are only keeping a single line in memory.
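Once each line is handled on its own, the per-line step can also decline to search the bulk data at all. The sketch below assumes, as in the example from the question, that every "CCCC" ... "DDDD" block sits on a single line and that such lines never need changing. The class LineFilter and the method processLine are hypothetical names. It uses String.replace rather than replaceAll, since the search text is a plain literal and not a regular expression.

```java
public class LineFilter {
    // Hypothetical helper: returns bulk data lines untouched and applies the
    // AAAA -> BBBB replacement to every other line.
    static String processLine(String line) {
        if (line.startsWith("CCCC")) {
            // Assumption: lines starting with CCCC are never modified,
            // so the long line is not scanned for "AAAA".
            return line;
        }
        return line.replace("AAAA", "BBBB");
    }

    public static void main(String[] args) {
        System.out.println(processLine("AAAA"));                // prints "BBBB"
        System.out.println(processLine("CCCC-9234 AAAA DDDD")); // printed unchanged
    }
}
```

The file still has to be read from start to end to find the line boundaries; the check only avoids the search work on the long lines.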

Note that since the FileWriter is opened before you read the input file, you need to have a different name for the output file. If you want to keep the same name, you can do a renameTo on the File after you close it.

File file = new File("my_file.txt");
BufferedReader reader = new BufferedReader(new FileReader(file));
FileWriter writer = new FileWriter("my_out_file.txt");
String line;
while ((line = reader.readLine()) != null) {
    // Replacing "AAAA" strings
    String newtext = line.replaceAll("AAAA", "BBBB");
    // readLine() strips the line terminator, so write it back explicitly
    writer.write(newtext + "\r\n");
}
reader.close();
writer.close();
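For the rename step mentioned above, a minimal sketch could look like the following. The class ReplaceOriginal and the method replaceOriginal are hypothetical names. File.renameTo returns false on failure and may refuse to overwrite an existing file on some platforms, so this version deletes the original first; that is an assumption about what is acceptable here, since the two steps together are not atomic. The main method works on temporary files so it can run anywhere.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class ReplaceOriginal {
    // Hypothetical helper: moves the finished output file over the original.
    static void replaceOriginal(File original, File output) throws IOException {
        // Remove the original first, because renameTo may not overwrite it.
        if (original.exists() && !original.delete()) {
            throw new IOException("Could not delete " + original);
        }
        // renameTo reports failure through its return value, not an exception.
        if (!output.renameTo(original)) {
            throw new IOException("Could not rename " + output + " to " + original);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: temp files stand in for my_file.txt and my_out_file.txt.
        File original = File.createTempFile("my_file", ".txt");
        File output = File.createTempFile("my_out_file", ".txt");
        try (FileWriter writer = new FileWriter(output)) {
            writer.write("BBBB\r\n");
        }
        replaceOriginal(original, output);
        System.out.println("original size: " + original.length());
        original.delete(); // clean up the demo file
    }
}
```

If the rename fails after the delete has succeeded, the data only exists in the output file, so keeping a backup copy of the original may be worth considering for important files.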
