I made a method that takes a File and a String. It replaces the file with a new file with that string as its contents.
This is what I made:
public static void Save(File file, String textToSave) {
    file.delete();
    try {
        BufferedWriter out = new BufferedWriter(new FileWriter(file));
        out.write(textToSave);
        out.close();
    } catch (IOException e) {
    }
}
However it is painfully slow. It sometimes takes over a minute.
How can I write large files with tens of thousands to maybe up to a million characters in them?
6 solutions
#1
18
Make sure you allocate a large enough buffer:
BufferedWriter out = new BufferedWriter(new FileWriter(file), 32768);
What sort of OS are you running on? That can make a big difference too. However, taking a minute to write out a file of less-than-enormous size sounds like a system problem. On Linux or other *ix systems, you can use things like strace to see if the JVM is making lots of unnecessary system calls. (A very long time ago, Java I/O was pretty dumb and would make insane numbers of low-level write() system calls if you weren't careful, but when I say "a long time ago" I mean 1998 or so.)
edit — note that the situation of a Java program writing a simple file in a simple way, and yet being really slow, is an inherently odd one. Can you tell if the CPU is heavily loaded while the file is being written? It shouldn't be; there should be almost no CPU load from such a thing.
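For reference, here is a minimal sketch of the question's method with the larger buffer applied. The class name BufferedSave and the lowercase method name save are made up for illustration. It also assumes the file.delete() call can be dropped, because FileWriter truncates an existing file when it opens it, and it lets the IOException propagate instead of swallowing it.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical helper class; only the 32768-character buffer comes from the answer above.
final class BufferedSave {

    // Overwrites 'file' with 'textToSave' using a 32 KiB character buffer.
    static void save(File file, String textToSave) throws IOException {
        // new FileWriter(file) truncates an existing file, so no delete() is needed.
        // try-with-resources flushes and closes the writer even if write() throws.
        try (BufferedWriter out = new BufferedWriter(new FileWriter(file), 32768)) {
            out.write(textToSave);
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("buffered-save-demo", ".txt");
        save(tmp, "hello, file");
        System.out.println(tmp.length() + " bytes written to " + tmp);
    }
}
```

If this is still slow on your machine, the buffer size is probably not the bottleneck, which is consistent with the suspicion above that something at the system level is involved.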
#2
13
A simple test for you
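// Build a string of 100 * 1024 * 1024 'A' characters (one byte each when written).
char[] chars = new char[100*1024*1024];
Arrays.fill(chars, 'A');
String text = new String(chars);
// Time a single buffered write of the whole string.
long start = System.nanoTime();
BufferedWriter bw = new BufferedWriter(new FileWriter("/tmp/a.txt"));
bw.write(text);
bw.close();
long time = System.nanoTime() - start;
// characters * 1000 / nanoseconds = megabytes per second
System.out.println("Wrote " + chars.length*1000L/time+" MB/s.");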
Prints
Wrote 135 MB/s.
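The throughput expression in that test is terse, so here is the arithmetic spelled out as a small sketch. The class and method names are hypothetical. It assumes one byte per character (true for 'A' in common encodings) and decimal megabytes (10^6 bytes).

```java
// Hypothetical helper that isolates the unit conversion used in the test above.
final class Throughput {

    // bytes / 1e6 is megabytes and nanos / 1e9 is seconds,
    // so MB/s = (bytes / 1e6) / (nanos / 1e9) = bytes * 1000 / nanos.
    static long megabytesPerSecond(long bytes, long nanos) {
        return bytes * 1000L / nanos; // integer division, rounds down
    }

    public static void main(String[] args) {
        long bytes = 100L * 1024 * 1024; // same size as the test above
        long nanos = 1_000_000_000L;     // assumed elapsed time: exactly one second
        System.out.println(megabytesPerSecond(bytes, nanos) + " MB/s"); // prints "104 MB/s"
    }
}
```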
#3
4
You could look into Java's NIO capabilities. It may support what you want to do.
Java NIO FileChannel versus FileOutputstream performance / usefulness
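As a rough idea of what the NIO route could look like, here is a minimal sketch that writes the string through a FileChannel. The ChannelSave class and its save method are hypothetical names, and UTF-8 is an assumed encoding (the question does not say which one is wanted). Whether this is actually faster than a BufferedWriter is something you would have to measure.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class illustrating a plain FileChannel write.
final class ChannelSave {

    // Replaces the contents of 'file' with 'text', encoded as UTF-8 (an assumption).
    static void save(Path file, String text) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // write() may write fewer bytes than requested, so loop until the buffer is drained.
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("channel-save-demo", ".txt");
        save(tmp, "hello, channel");
        System.out.println(Files.size(tmp) + " bytes written to " + tmp);
    }
}
```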
#4
3
Try using memory mapped files:
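// Encode first: the mapping must be sized in bytes, not characters,
// otherwise non-ASCII text overflows the buffer.
byte[] bytes = textToSave.getBytes();
FileChannel rwChannel = new RandomAccessFile("textfile.txt", "rw").getChannel();
ByteBuffer wrBuf = rwChannel.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
wrBuf.put(bytes);
rwChannel.close();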
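A slightly fuller sketch of the memory-mapped idea follows. The MappedSave class and save method are hypothetical names, and UTF-8 is an assumed encoding. Two details differ from the snippet above: the mapping is sized from the encoded byte array, and the file is truncated first so that a longer previous version does not leave stale bytes at the end.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class illustrating a memory-mapped write.
final class MappedSave {

    // Replaces the contents of 'file' with 'text', encoded as UTF-8 (an assumption).
    static void save(Path file, String text) throws IOException {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // A READ_WRITE mapping needs a channel opened for both reading and writing.
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            ch.truncate(bytes.length); // drop leftover bytes from a longer previous file
            if (bytes.length == 0) {
                return; // nothing to map for empty text
            }
            // map() grows the file to the requested size if it is currently shorter.
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, bytes.length);
            buf.put(bytes);
            buf.force(); // ask the OS to write the mapped region back to the file
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mapped-save-demo", ".txt");
        save(tmp, "hello, mapping");
        System.out.println(Files.size(tmp) + " bytes written to " + tmp);
    }
}
```

Note that on Windows a mapped file generally cannot be deleted or truncated until the mapping has been garbage collected, so this approach may not suit a method that is called repeatedly on the same file.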
#5
0
I tried two approaches for creating big files, running the program on a 64-bit Windows 7 machine with 8 GB of RAM and JDK 8. The results are below. In both cases the program created a 180 MB file containing one number per line, from 1 to 20 million (2 crore in the Indian numbering system).
The Java process's memory grows gradually to about 600 MB.
First output
Approach = approach-1 (Using FileWriter)
Completed file writing in milli seconds = 4521 milli seconds.
Second output
Approach = approach-2 (Using FileChannel and ByteBuffer)
Completed file writing in milli seconds = 3590 milli seconds.
One observation: I track the write position (the pos variable) in approach 2. If I comment that out, every chunk is mapped at position 0 and overwrites the previous one, so only the last chunk remains in the file, but the time drops to nearly 2000 milliseconds.
Attaching code.
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.concurrent.TimeUnit;

public class TestLargeFile {

    public static void main(String[] args) {
        writeBigFile();
    }

    private static void writeBigFile() {
        System.out.println("--------writeBigFile-----------");
        long nanoTime = System.nanoTime();
        String fn = "big-file.txt";
        boolean approach1 = false;
        System.out.println("Approach = " + (approach1 ? "approach-1" : "approach-2"));
        int numLines = 20_000_000;
        try {
            if (approach1) {
                //Approach 1 -- for 2 crore lines takes 4.5 seconds with 180 mb file size
                approach1(fn, numLines);
            } else {
                //Approach 2 -- for 2 crore lines takes nearly 2 to 2.5 seconds with 180 mb file size
                approach2(fn, numLines);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println("Completed file writing in milli seconds = " + TimeUnit.MILLISECONDS.convert((System.nanoTime() - nanoTime), TimeUnit.NANOSECONDS));
    }

    private static void approach2(String fn, int numLines) throws IOException {
        StringBuilder sb = new StringBuilder();
        FileChannel rwChannel = new RandomAccessFile(fn, "rw").getChannel();
        ByteBuffer wrBuf;
        int pos = 0;
        for (int i = 1; i <= numLines; i++) {
            sb.append(i).append(System.lineSeparator());
            if (i % 100000 == 0) {
                wrBuf = rwChannel.map(FileChannel.MapMode.READ_WRITE, pos, sb.length());
                pos += sb.length();
                wrBuf.put(sb.toString().getBytes());
                sb = new StringBuilder();
            }
        }
        if (sb.length() > 0) {
            wrBuf = rwChannel.map(FileChannel.MapMode.READ_WRITE, pos, sb.length());
            wrBuf.put(sb.toString().getBytes());
        }
        rwChannel.close();
    }

    private static void approach1(String fn, int numLines) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= numLines; i++) {
            sb.append(i).append(System.lineSeparator());
        }
        FileWriter fileWriter = new FileWriter(fn);
        fileWriter.write(sb.toString());
        fileWriter.flush();
        fileWriter.close();
    }
}
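The memory growth mentioned above comes largely from building the text in a StringBuilder before writing it. As a possible third variant (not part of the measurements above, and not benchmarked here), each line can be streamed straight into a BufferedWriter so that only the writer's buffer is held in memory. The class name StreamingLines and the method writeNumbers are hypothetical, and the demo uses a smaller line count than the 20 million used in the answer.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical variant that streams lines instead of building one large string.
final class StreamingLines {

    // Writes the numbers 1..numLines, one per line, replacing any existing file.
    static void writeNumbers(Path file, int numLines) throws IOException {
        // Digits and line separators are plain ASCII, so US_ASCII is sufficient here.
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.US_ASCII)) {
            for (int i = 1; i <= numLines; i++) {
                out.write(Integer.toString(i));
                out.newLine(); // platform line separator, like System.lineSeparator()
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("streaming-lines-demo", ".txt");
        long start = System.nanoTime();
        writeNumbers(tmp, 1_000_000); // the answer above used 20_000_000
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("Wrote " + Files.size(tmp) + " bytes in " + millis + " ms");
    }
}
```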
#6
-3
In Java, BufferedWriter is very slow: use the native methods directly, and call them as little as possible (give them as much data per call as you can).
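try {
    // 'file' is the File to overwrite and 'content' is assumed to be a byte[];
    // 'D' is the answerer's own logging helper.
    FileOutputStream out = new FileOutputStream(file);
    out.write(content);
    out.close();
} catch (Throwable e) {
    D.error(e);
}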
Also, deleting the file can take a while (maybe it is being copied to the recycle bin first). Just overwrite the file, like in the above code.
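A self-contained version of this overwrite-in-place idea might look like the sketch below. The StreamSave class and save method are hypothetical names, and UTF-8 is an assumed encoding since the snippet above does not show how content is produced.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class: one write() call, no delete() beforehand.
final class StreamSave {

    // Overwrites 'file' with 'text', encoded as UTF-8 (an assumption).
    static void save(File file, String text) throws IOException {
        byte[] content = text.getBytes(StandardCharsets.UTF_8);
        // new FileOutputStream(file) truncates an existing file, so it is overwritten in place.
        try (FileOutputStream out = new FileOutputStream(file)) {
            out.write(content); // a single call hands the whole array to the stream
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("stream-save-demo", ".txt");
        save(tmp, "hello, stream");
        System.out.println(tmp.length() + " bytes written to " + tmp);
    }
}
```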