Byte Streams and Character Streams

I. Input streams

Byte streams (raw binary bytes)

1. InputStream: the abstract class that represents an input stream of bytes.

2. Concrete stream classes:

AudioInputStream, ByteArrayInputStream, FileInputStream, FilterInputStream, ObjectInputStream, PipedInputStream, SequenceInputStream, StringBufferInputStream

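Each subclass of InputStream represents a stream from a different source. For example, a ByteArrayInputStream reads from a byte array, and a FileInputStream reads from a file.

3. Wrapper streams that add functionality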

BufferedInputStream, CheckedInputStream, CipherInputStream, DataInputStream, DeflaterInputStream, DigestInputStream, InflaterInputStream, LineNumberInputStream, ProgressMonitorInputStream, PushbackInputStream

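These classes are all subclasses of java.io.FilterInputStream. They add special capabilities to reading a stream: for example, LineNumberInputStream keeps track of line numbers while reading (it is deprecated in favor of LineNumberReader), and BufferedInputStream reads through a buffer. For that reason their constructors take another InputStream as an argument, typically one of the concrete stream classes from item 2. As the Javadoc puts it: A FilterInputStream contains some other input stream, which it uses as its basic source of data, possibly transforming the data along the way or providing additional functionality.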
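To make the wrapping idea concrete, here is a minimal sketch that stacks two wrapper streams on top of a concrete source. The class name WrappedStreamDemo and the method readFirstInt are made up for this illustration; the stream classes themselves are the standard java.io ones.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of the JDK.
class WrappedStreamDemo {
    /** Reads one big-endian int from the given bytes through a stack of wrappers. */
    static int readFirstInt(byte[] bytes) throws IOException {
        InputStream source = new ByteArrayInputStream(bytes);       // concrete source (item 2)
        InputStream buffered = new BufferedInputStream(source);     // wrapper: adds buffering
        try (DataInputStream in = new DataInputStream(buffered)) {  // wrapper: adds typed reads
            return in.readInt();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = {0, 0, 1, 2};
        System.out.println(readFirstInt(bytes)); // 0x00000102 = 258
    }
}
```

Each wrapper only knows that it holds some InputStream, so the same stack would work unchanged over a FileInputStream.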

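Character streams

1. java.io.Reader: the abstract base class of all character input streams.

2. Concrete character stream classes: BufferedReader, CharArrayReader, FilterReader, InputStreamReader, PipedReader, StringReader

Each of these classes represents one kind of character source; BufferedReader and InputStreamReader wrap another Reader or InputStream rather than owning a source themselves.

3. Likewise, FilterReader is the base class for wrapper streams that add functionality to a character stream.

Output streams are organized in the same way: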

//////////

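In essence, a character stream is still a byte stream: the underlying data is binary, but a specified character set (charset) maps those bytes to displayable characters, which is why we call it a character stream. The JDK therefore provides a bridge from byte streams to character streams, the InputStreamReader class: given an InputStream and a charset name, it decodes the bytes into characters. After that, the underlying byte stream can be handled through the character-stream API.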
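The bridge described above can be sketched as follows. The class CharsetBridgeDemo and its decode helper are names invented for this example; InputStreamReader and StandardCharsets are standard JDK classes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the JDK.
class CharsetBridgeDemo {
    /** Decodes the given bytes into a String by reading them through an InputStreamReader. */
    static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            char[] chunk = new char[256];
            int n;
            while ((n = reader.read(chunk)) != -1) {
                sb.append(chunk, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is 4 characters but 5 bytes in UTF-8 (the last character takes 2 bytes).
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(utf8, StandardCharsets.UTF_8).length());      // 4
        // The same bytes read with a different charset map to different characters.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).length()); // 5
    }
}
```

The bytes never change; only the charset passed to InputStreamReader decides which characters they become.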

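Character sets are a separate topic in their own right.

Classes in the java.nio package:

The classes in this package form a new I/O architecture whose approach is quite different from the earlier stream-based I/O. Stream-based I/O rests on one basic assumption: for any entity that supports I/O (for example a file, a network socket, or a memory region such as a string or byte array), input and output proceed in the order in which the data is stored. Reading bytes from a stream is also usually irreversible: data is consumed in order until the end of the file. Stream reads and writes are therefore a blocking I/O model, and the stream classes are generally not designed for concurrent use by multiple threads. Take the FileInputStream class as an example: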

private native void open(String name) throws FileNotFoundException;

public native int read() throws IOException;

private native int readBytes(byte b[], int off, int len) throws IOException;

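These core methods are all native. So with unbuffered stream I/O, every read or write call goes directly to the target file (or memory region).

The FileChannel class in java.nio.channels, by contrast, always operates on a ByteBuffer. For example, its read methods: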

abstract int read(ByteBuffer dst)

long read(ByteBuffer[] dsts)

abstract long read(ByteBuffer[] dsts, int offset, int length)

abstract int read(ByteBuffer dst, long position)

Its write methods:

abstract int write(ByteBuffer src)

long write(ByteBuffer[] srcs)

abstract long write(ByteBuffer[] srcs, int offset, int length)

abstract int write(ByteBuffer src, long position)

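So what is a buffer? It is simply a block of memory together with state for managing that block: limit, position, mark, and so on. It is also a data container used to hold data temporarily. Most data handling in nio actually happens in this data buffer: a Channel's read methods fill a buffer and its write methods drain one. Once read has filled a buffer, we can read and modify its contents, which is why the concrete Buffer subclasses provide get and put methods for manipulating the data. After the business processing on the buffered data is done, the Channel's write method can send the whole buffer to the target in one go. This mechanism greatly reduces the number of actual I/O operations and improves I/O efficiency.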
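The read, process, write cycle described above might look like the sketch below. The class ChannelBufferDemo and the upper-casing step are stand-ins chosen for this example (the "business processing" could be anything), and it assumes the file is small enough to fit in a single buffer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class, not part of the JDK.
class ChannelBufferDemo {
    /** Reads the file into a buffer, upper-cases ASCII letters in the buffer, writes it back. */
    static void upperCaseInPlace(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer buf = ByteBuffer.allocate((int) ch.size());
            while (buf.hasRemaining() && ch.read(buf) != -1) {
                // keep filling until the buffer is full or end of file is reached
            }
            buf.flip();                          // switch from filling to draining
            for (int i = 0; i < buf.limit(); i++) {
                byte b = buf.get(i);             // absolute get: position is not moved
                if (b >= 'a' && b <= 'z') {
                    buf.put(i, (byte) (b - 32)); // absolute put: modify data in the buffer
                }
            }
            ch.position(0);
            while (buf.hasRemaining()) {
                ch.write(buf);                   // drain the processed buffer to the file
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("channel-demo", ".txt");
        Files.write(file, "hello nio".getBytes(StandardCharsets.US_ASCII));
        upperCaseInPlace(file);
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.US_ASCII)); // HELLO NIO
        Files.delete(file);
    }
}
```

All the per-byte work happens in memory; the file itself is touched only by the read and write calls on the channel.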

4 MB file, computing CRC32:

Input Stream: fe935a6f 13342 milliseconds

Buffered Input Stream: fe935a6f 111 milliseconds

Random Access File: fe935a6f 19526 milliseconds

Mapped File: fe935a6f 80 milliseconds

8 MB file:

Input Stream: 8b40d037 26994 milliseconds

Buffered Input Stream: 8b40d037 145 milliseconds

Random Access File: 8b40d037 38734 milliseconds

Mapped File: 8b40d037 125 milliseconds

13 MB file:

Input Stream: 13693a09 16884 milliseconds

Buffered Input Stream: 13693a09 217 milliseconds

Random Access File: 13693a09 32616 milliseconds

Mapped File: 13693a09 186 milliseconds

26 MB file:

Input Stream: ee040d90 33103 milliseconds

Buffered Input Stream: ee040d90 447 milliseconds

Random Access File: ee040d90 62611 milliseconds

Mapped File: ee040d90 360 milliseconds

500 MB file:

Buffered Input Stream: 8e60e18e 19057 milliseconds

Mapped File: 8e60e18e 17359 milliseconds

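These results show that Buffered Input Stream and Mapped File perform far better than the unbuffered Input Stream and Random Access File: on the 4 MB to 26 MB files they are on the order of a hundred times faster. On the 500 MB file the two fast variants are close to each other (19057 ms versus 17359 ms).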
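The benchmark program itself is not shown in these notes. The following is a minimal sketch of how the two fast variants could be measured, assuming a byte-at-a-time CRC32 update; the class Crc32Demo and its method names are invented here, and the timings it prints will differ from the figures above.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.CRC32;

// Hypothetical demo class; a sketch, not the original benchmark.
class Crc32Demo {
    /** CRC32 of a file read byte by byte through a BufferedInputStream. */
    static long crcBuffered(Path file) throws IOException {
        CRC32 crc = new CRC32();
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                crc.update(b);
            }
        }
        return crc.getValue();
    }

    /** CRC32 of a file read through a memory-mapped buffer (one mapping, so under 2 GB). */
    static long crcMapped(Path file) throws IOException {
        CRC32 crc = new CRC32();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            while (buf.hasRemaining()) {
                crc.update(buf.get());
            }
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("crc-demo", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[4 * 1024 * 1024]); // 4 MB of zero bytes as test data

        long t0 = System.nanoTime();
        long buffered = crcBuffered(file);
        long t1 = System.nanoTime();
        long mapped = crcMapped(file);
        long t2 = System.nanoTime();

        System.out.printf("Buffered Input Stream: %08x %d milliseconds%n", buffered, (t1 - t0) / 1_000_000);
        System.out.printf("Mapped File: %08x %d milliseconds%n", mapped, (t2 - t1) / 1_000_000);
    }
}
```

Both methods must return the same checksum for the same file; only the elapsed time should differ.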

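2. The java.nio.Buffer class hierarchy:

The Buffer class defines the state of a buffer and the operations on that state:

State fields: capacity, limit, position, mark.

Methods that read or change the state: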

capacity()

position()

position(int)

limit()

limit(int)

mark()

reset()

clear()

flip()

rewind()

remaining()

hasRemaining()

All of these operate on the state fields; together they are the concrete operations that any buffer supports.
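A short trace makes the effect of these methods on position and limit easier to follow. The class BufferStateDemo and its helpers are names made up for this sketch; the Buffer methods are the ones listed above.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the JDK.
class BufferStateDemo {
    static String state(ByteBuffer b) {
        return "pos=" + b.position() + " lim=" + b.limit() + " cap=" + b.capacity();
    }

    /** Runs a fixed sequence of buffer operations and records the state after each step. */
    static List<String> trace() {
        List<String> out = new ArrayList<>();
        ByteBuffer buf = ByteBuffer.allocate(8);
        out.add(state(buf));                 // pos=0 lim=8 cap=8  (fresh buffer)
        buf.put((byte) 1).put((byte) 2).put((byte) 3);
        out.add(state(buf));                 // pos=3 lim=8 cap=8  (three bytes written)
        buf.flip();
        out.add(state(buf));                 // pos=0 lim=3 cap=8  (ready for reading)
        buf.get();
        buf.mark();                          // remember position 1
        buf.get();
        out.add(state(buf));                 // pos=2 lim=3 cap=8
        buf.reset();
        out.add(state(buf));                 // pos=1 lim=3 cap=8  (back to the mark)
        buf.rewind();
        out.add(state(buf));                 // pos=0 lim=3 cap=8  (re-read from the start)
        buf.clear();
        out.add(state(buf));                 // pos=0 lim=8 cap=8  (ready for writing again)
        return out;
    }

    public static void main(String[] args) {
        for (String line : trace()) {
            System.out.println(line);
        }
    }
}
```

Note that clear() only resets the state fields; the bytes already in the buffer are not erased.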

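Concrete subclasses of Buffer, for example ByteBuffer, which is itself an abstract class:

It defines a family of methods for reading and writing bytes: put, putXXX, get, getXXX. These basic operations are also abstract, because at this level the buffer has not yet specified the medium in which it is allocated. The subclasses of ByteBuffer, java.nio.HeapByteBuffer and java.nio.DirectByteBuffer, are what finally abstract the storage medium: HeapByteBuffer allocates the buffer memory on the JVM heap, and DirectByteBuffer allocates it in native memory outside the heap. Each of the three layers abstracts something different: Buffer abstracts the basic operations every buffer has, ByteBuffer abstracts byte-level reads and writes, and HeapByteBuffer abstracts a byte buffer allocated on the JVM heap. Buffers of the other types are implemented in a similar way.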
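HeapByteBuffer and DirectByteBuffer are not public classes, so application code obtains them through the ByteBuffer factory methods. A small sketch, with the class BufferKindDemo and its helpers invented for illustration:

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of the JDK.
class BufferKindDemo {
    /** Writes an int at index 0 and reads it back, using only the ByteBuffer API. */
    static int roundTrip(ByteBuffer buf, int value) {
        buf.putInt(0, value);
        return buf.getInt(0);
    }

    static String kind(ByteBuffer buf) {
        return buf.isDirect() ? "direct" : "heap";
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(16);         // backed by memory on the JVM heap
        ByteBuffer direct = ByteBuffer.allocateDirect(16); // backed by native memory

        System.out.println(kind(heap) + " " + roundTrip(heap, 42));     // heap 42
        System.out.println(kind(direct) + " " + roundTrip(direct, 42)); // direct 42
    }
}
```

The calling code is identical for both buffers, which is the point of the layering: only the allocation call decides where the bytes live.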

3. Channel

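The getChannel method of java.io.FileInputStream: the second argument to FileChannelImpl.open indicates that the channel is readable, and the third indicates that it is not writable.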

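// Excerpt from the JDK source of java.io.FileInputStream
public FileChannel getChannel() {
    synchronized (this) {
        if (channel == null) {
            // second argument: readable = true; third argument: writable = false
            channel = FileChannelImpl.open(fd, true, false, this);
            fd.incrementAndGetUseCount();
        }
        return channel;
    }
}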

The getChannel method of java.io.FileOutputStream: here the arguments indicate that the channel is not readable (the read methods cannot be called) and can only be written to.

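// Excerpt from the JDK source of java.io.FileOutputStream
public FileChannel getChannel() {
    synchronized (this) {
        if (channel == null) {
            // readable = false, writable = true, followed by the stream's append flag
            channel = FileChannelImpl.open(fd, false, true, append, this);
            fd.incrementAndGetUseCount();
        }
        return channel;
    }
}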

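The documentation also states that this method was added in Java 1.4, that is, these two methods were added when java.nio was introduced.

The getChannel method of java.io.RandomAccessFile: the channel is readable, and whether it is writable depends on the rw argument, the read/write flag set when the RandomAccessFile instance was created.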

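// Excerpt from the JDK source of java.io.RandomAccessFile
public final FileChannel getChannel() {
    synchronized (this) {
        if (channel == null) {
            // readable = true; writable only if the file was opened in a read/write mode
            channel = FileChannelImpl.open(fd, true, rw, this);
            fd.incrementAndGetUseCount();
        }
        return channel;
    }
}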
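The readable and writable flags can be observed from application code without looking at JDK internals. The sketch below uses only public API; the class ChannelModeDemo and its method names are made up for this example.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonWritableChannelException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the JDK.
class ChannelModeDemo {
    /** Returns true if writing through a FileInputStream's channel is rejected. */
    static boolean inputChannelRejectsWrite(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile());
             FileChannel ch = in.getChannel()) {
            ch.write(ByteBuffer.wrap(new byte[] {1}));
            return false;
        } catch (NonWritableChannelException e) {
            return true; // the channel was opened with writable = false
        }
    }

    /** Returns true if the channel of a RandomAccessFile opened with the given mode accepts writes. */
    static boolean randomAccessCanWrite(Path file, String mode) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), mode);
             FileChannel ch = raf.getChannel()) {
            ch.write(ByteBuffer.wrap(new byte[] {1}));
            return true;
        } catch (NonWritableChannelException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("mode-demo", ".bin");
        file.toFile().deleteOnExit();
        System.out.println(inputChannelRejectsWrite(file));   // true
        System.out.println(randomAccessCanWrite(file, "r"));  // false
        System.out.println(randomAccessCanWrite(file, "rw")); // true
    }
}
```

The mirror case, calling read on a FileOutputStream's channel, fails with NonReadableChannelException.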
