I am working on a Java application where I need to send an array of 500,000 integers from one Android phone to another Android phone over a socket connection as quickly as possible. The main bottleneck seems to be converting the integers so the socket can take them, whether I use ObjectOutputStreams, ByteBuffers, or a low level mask-and-shift conversion. What is the fastest way to send an int[] over a socket from one Java app to another?
Here is the code for everything I've tried so far, with benchmarks on the LG Optimus V I'm testing on (600 MHz ARM processor, Android 2.2).
Low level mask-and-shift: 0.2 seconds
public static byte[] intToByte(int[] input)
{
    byte[] output = new byte[input.length*4];
    for(int i = 0; i < input.length; i++) {
        output[i*4] = (byte)(input[i] & 0xFF);
        output[i*4 + 1] = (byte)((input[i] & 0xFF00) >>> 8);
        output[i*4 + 2] = (byte)((input[i] & 0xFF0000) >>> 16);
        output[i*4 + 3] = (byte)((input[i] & 0xFF000000) >>> 24);
    }
    return output;
}
Using ByteBuffer and IntBuffer: 0.75 seconds
public static byte[] intToByte(int[] input)
{
    ByteBuffer byteBuffer = ByteBuffer.allocate(input.length * 4);
    IntBuffer intBuffer = byteBuffer.asIntBuffer();
    intBuffer.put(input);
    byte[] array = byteBuffer.array();
    return array;
}
ObjectOutputStream: 3.1 seconds (I tried variations of this using DataOutputStream and writeInt() instead of writeObject(), but it didn't make much of a difference)
public static void sendSerialDataTCP(String address, int[] array) throws IOException
{
    Socket senderSocket = new Socket(address, 4446);
    OutputStream os = senderSocket.getOutputStream();
    BufferedOutputStream bos = new BufferedOutputStream(os);
    ObjectOutputStream oos = new ObjectOutputStream(bos);
    oos.writeObject(array);
    oos.flush();
    bos.flush();
    os.flush();
    oos.close();
    os.close();
    bos.close();
    senderSocket.close();
}
Lastly, the code I used to send the byte[], which takes an additional 0.2 seconds on top of the intToByte() functions:
public static void sendDataTCP(String address, byte[] data) throws IOException
{
    Socket senderSocket = new Socket(address, 4446);
    OutputStream os = senderSocket.getOutputStream();
    os.write(data, 0, data.length);
    os.flush();
    senderSocket.close();
}
I'm writing the code on both sides of the socket so I can try any kind of endianness, compression, serialization, etc. There's got to be a way to do this conversion more efficiently in Java. Please help!
4 Answers
#1
5
As I noted in a comment, I think you're banging against the limits of your processor. As this might be helpful to others, I'll break it down. Here's your loop to convert integers to bytes:
for(int i = 0; i < input.length; i++) {
    output[i*4] = (byte)(input[i] & 0xFF);
    output[i*4 + 1] = (byte)((input[i] & 0xFF00) >>> 8);
    output[i*4 + 2] = (byte)((input[i] & 0xFF0000) >>> 16);
    output[i*4 + 3] = (byte)((input[i] & 0xFF000000) >>> 24);
}
This loop executes 500,000 times. Your 600 MHz processor can process roughly 600,000,000 operations per second. So each operation in the loop body, repeated across all 500,000 iterations, will consume roughly 1/1200 of a second.
Again, using very rough numbers (I don't know the ARM instruction set, so there may be more or less per action), here's an operation count:
- Test/branch: 5 (retrieve counter, retrieve array length, compare, branch, increment counter)
- Mask and shift: 10 x 4 (retrieve counter, retrieve input array base, add, retrieve mask, and, shift, multiply counter, add offset, add to output base, store)
OK, so in rough numbers, this loop takes at best 45/1200 of a second, or about 0.04 seconds. However, you're not dealing with the best-case scenario. For one thing, with an array this large you're not going to benefit from the processor cache, so you'll introduce wait states into every array store and load.
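For anyone who wants to redo the back-of-envelope arithmetic, here is a small sketch. It sums the operation counts listed above (5 for test/branch plus 10 x 4 for mask-and-shift) and assumes, very optimistically, one operation per clock cycle. The class and method names are made up for illustration.

```java
public class LoopCostEstimate {

    // Best-case time in seconds, ASSUMING one operation completes per clock cycle.
    public static double estimateSeconds(long iterations, long opsPerIteration, long clockHz) {
        return (double) iterations * opsPerIteration / clockHz;
    }

    public static void main(String[] args) {
        long iterations = 500000L;         // array length from the question
        long clockHz = 600000000L;         // 600 MHz processor
        long opsPerIteration = 5 + 10 * 4; // test/branch + four mask-and-shift stores

        double seconds = estimateSeconds(iterations, opsPerIteration, clockHz);
        System.out.println(opsPerIteration + " ops per iteration, about " + seconds + " s");
    }
}
```

Changing opsPerIteration shows how sensitive the estimate is to the guessed instruction count; it says nothing about cache misses or interpretation overhead.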
Plus, the basic operations that I described may or may not translate directly into machine code. If not (and I suspect not), the loop will cost more than I've described.
Finally, if you're really unlucky, the JVM hasn't JIT-ed your code, so for some portion (or all) of the loop it's interpreting bytecode rather than executing native instructions. I don't know enough about Dalvik to comment on that.
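One rough way to see whether interpretation or JIT warm-up is part of the cost is to time the same conversion several times in one process and compare the first run with the later ones. The sketch below assumes a runtime that provides System.nanoTime(); the class name and the timeRuns() helper are hypothetical, and the numbers it prints will vary by device and runtime.

```java
public class ConversionTiming {

    // Same mask-and-shift conversion as in the question.
    public static byte[] intToByte(int[] input) {
        byte[] output = new byte[input.length * 4];
        for (int i = 0; i < input.length; i++) {
            output[i * 4] = (byte) (input[i] & 0xFF);
            output[i * 4 + 1] = (byte) ((input[i] & 0xFF00) >>> 8);
            output[i * 4 + 2] = (byte) ((input[i] & 0xFF0000) >>> 16);
            output[i * 4 + 3] = (byte) ((input[i] & 0xFF000000) >>> 24);
        }
        return output;
    }

    // Runs the conversion `runs` times and records the elapsed nanoseconds of each run.
    public static long[] timeRuns(int[] input, int runs) {
        long[] elapsed = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            byte[] out = intToByte(input);
            elapsed[r] = System.nanoTime() - start;
            // Use the result so the call cannot be discarded as dead code.
            if (out.length != input.length * 4) {
                throw new IllegalStateException("unexpected output length");
            }
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int[] data = new int[500000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i * 31;
        }
        long[] elapsed = timeRuns(data, 10);
        for (int r = 0; r < elapsed.length; r++) {
            System.out.println("run " + r + ": " + (elapsed[r] / 1000000.0) + " ms");
        }
    }
}
```

If later runs are much faster than the first, the first measurement probably included interpreted or not-yet-compiled code.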
#2
1
Java was IMO never intended to be able to efficiently reinterpret a memory region from int[] to byte[] like you could do in C. It doesn't even have such a memory address model.
You either need to go native to send the data or you can try to find some micro optimizations. But I doubt you will gain a lot.
E.g. this could be slightly faster than your version (if it works at all):
public static byte[] intToByte(int[] input)
{
    byte[] output = new byte[input.length*4];
    for(int i = 0; i < input.length; i++) {
        int position = i << 2;
        output[position | 0] = (byte)((input[i] >> 0) & 0xFF);
        output[position | 1] = (byte)((input[i] >> 8) & 0xFF);
        output[position | 2] = (byte)((input[i] >> 16) & 0xFF);
        output[position | 3] = (byte)((input[i] >> 24) & 0xFF);
    }
    return output;
}
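To check the "if it works at all" part, a round trip is enough. The sketch below pairs the conversion above with a receiver-side inverse. The byteToInt() method and the class name are not from the original post; they are an assumed counterpart for the same least-significant-byte-first layout.

```java
import java.util.Arrays;

public class IntByteRoundTrip {

    // Sender side: the conversion from this answer (least significant byte first).
    public static byte[] intToByte(int[] input) {
        byte[] output = new byte[input.length * 4];
        for (int i = 0; i < input.length; i++) {
            int position = i << 2; // i * 4; the low two bits are zero, so | k equals + k
            output[position | 0] = (byte) ((input[i] >> 0) & 0xFF);
            output[position | 1] = (byte) ((input[i] >> 8) & 0xFF);
            output[position | 2] = (byte) ((input[i] >> 16) & 0xFF);
            output[position | 3] = (byte) ((input[i] >> 24) & 0xFF);
        }
        return output;
    }

    // Receiver side (hypothetical): rebuilds each int from four bytes in the same order.
    public static int[] byteToInt(byte[] input) {
        int[] output = new int[input.length / 4];
        for (int i = 0; i < output.length; i++) {
            int position = i << 2;
            // & 0xFF undoes the sign extension that happens when a byte widens to int.
            output[i] = (input[position] & 0xFF)
                    | ((input[position + 1] & 0xFF) << 8)
                    | ((input[position + 2] & 0xFF) << 16)
                    | ((input[position + 3] & 0xFF) << 24);
        }
        return output;
    }

    public static void main(String[] args) {
        int[] original = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE, 0x12345678};
        int[] decoded = byteToInt(intToByte(original));
        System.out.println("round trip ok: " + Arrays.equals(original, decoded));
    }
}
```

Negative values are the interesting case here, which is why the test data includes -1 and Integer.MIN_VALUE.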
#3
1
I would do it like this:
Socket senderSocket = new Socket(address, 4446);
OutputStream os = senderSocket.getOutputStream();
BufferedOutputStream bos = new BufferedOutputStream(os);
DataOutputStream dos = new DataOutputStream(bos);
dos.writeInt(array.length);
for(int i : array)
    dos.writeInt(i);
dos.close();
On the other side, read it like this:
Socket receiverSocket = ...;
InputStream is = receiverSocket.getInputStream();
BufferedInputStream bis = new BufferedInputStream(is);
DataInputStream dis = new DataInputStream(bis);
int length = dis.readInt();
int[] array = new int[length];
for(int i = 0; i < length; i++)
    array[i] = dis.readInt();
dis.close();
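The two halves above can be tried without a network by pointing them at in-memory streams. This is only a sketch: the class and method names are invented, and ByteArrayOutputStream / ByteArrayInputStream stand in for the socket streams. Note that DataOutputStream.writeInt() writes the high byte first, so both ends need to use the Data streams (or agree on that byte order).

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class LengthPrefixedInts {

    // Writes the array length first, then every element, through a buffered stream.
    public static void writeInts(OutputStream out, int[] array) throws IOException {
        DataOutputStream dos = new DataOutputStream(new BufferedOutputStream(out));
        dos.writeInt(array.length);
        for (int value : array) {
            dos.writeInt(value);
        }
        dos.flush(); // push buffered bytes to the underlying stream
    }

    // Reads the length prefix, then exactly that many ints.
    public static int[] readInts(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(new BufferedInputStream(in));
        int length = dis.readInt();
        int[] array = new int[length];
        for (int i = 0; i < length; i++) {
            array[i] = dis.readInt();
        }
        return array;
    }

    public static void main(String[] args) throws IOException {
        int[] original = {7, -42, 123456789};

        // In-memory stand-in for the socket's output stream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeInts(sink, original);

        // In-memory stand-in for the socket's input stream on the receiving side.
        int[] received = readInts(new ByteArrayInputStream(sink.toByteArray()));
        System.out.println("round trip ok: " + Arrays.equals(original, received));
    }
}
```

With a real connection, the same two methods would be passed senderSocket.getOutputStream() and the receiving socket's getInputStream().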
#4
1
If you're not averse to using a library, you might want to check out Protocol Buffers from Google. It's built for much more complex object serialization, but I'd bet that they worked hard to figure out how to quickly serialize an array of integers in Java.
EDIT: I looked in the Protobuf source code, and it uses something similar to your low-level mask and shift.