HBase学习笔记之HFile格式

时间:2023-03-08 16:04:42

主要看Roger的文档,这里作为文档的补充

HFile的格式-HFile的基本结构

HBase学习笔记之HFile格式

  • Trailer通过指针找到Meta index、Data index、File info。
  • Meta index保存每一个元数据在HFile中的位置、大小、元数据的key值。
  • Data index保存每一个数据块在HFile中的位置、大小、块第一个cell的key值。
  • File Info保存HFile相关信息。
  • Meta块保存的是HFile的元数据,比如布隆过滤器。
  • Data块保存的为具体的数据,每个数据块有个Magic头,存储偏移量和首Key。

hfile中索引块的大小默认值是128K,当索引的信息超过128K后,就会新分配一个索引块。hbase对于hfile的访问都是通过索引块来实现的,通过索引来定位所要查的数据到底在哪个数据块里面。

hfile中的索引块可以分成三中,根索引块,枝索引块,叶索引块。根索引块是一定会有的,但是如果hfile中的数据块比较少的话,枝索引块和叶索引块就可能不存在。

当单个的索引块中没有办法存储全部的数据块的信息时,索引块就会分裂,会产生叶索引块和根索引块,根索引块是对叶索引块的索引,如果数据块继续增加就会产生枝索引块,整个索引结果的层次也会加深。

查看一个HFile的内容:

hbase org.apache.hadoop.hbase.io.hfile.HFile -f /hbase/v_dm_user_app_d_201406/c64c6f7f7caf7f6d5a3f6bbc209dd2cb/c/0ebf8900a669462c8a9d9d908622928f -v -m -p
14/07/10 10:23:33 INFO util.ChecksumType: Checksum can use java.util.zip.CRC32
Scanning -> /hbase/v_dm_user_app_d_201406/c64c6f7f7caf7f6d5a3f6bbc209dd2cb/c/0ebf8900a669462c8a9d9d908622928f
14/07/10 10:23:33 INFO hfile.CacheConfig: Allocating LruBlockCache with maximum size 246.9m
14/07/10 10:23:33 ERROR metrics.SchemaMetrics: Inconsistent configuration. Previous configuration for using table name in metrics: true, new configuration: false
K: 13519304818|20140608-1/c:Q/1402368323157/Put/vlen=32/ts=0 V: 20140608\x0913519304818\x09\xE5\x85\xB6\xE4\xBB\x96\x090.38
K: 13519304818|20140608-2/c:Q/1402368323157/Put/vlen=37/ts=0 V: 20140608\x0913519304818\x09\xE5\x85\xB6\xE4\xBB\x96-TCP\x0913.87
K: 13519304818|20140608-32/c:Q/1402368323157/Put/vlen=36/ts=0 V: 20140608\x0913519304818\x09vivo\xE5\xAE\x98\xE7\xBD\x91\x0912.0
K: 13519306776|20140608-23/c:Q/1402368323157/Put/vlen=37/ts=0 V: 20140608\x0913519306776\x09HTTP\xE4\xB8\x8A\xE7\xBD\x91\x09205.0
K: 13519306776|20140608-24/c:Q/1402368323157/Put/vlen=36/ts=0 V: 20140608\x0913519306776\x09360\xE7\xBD\x91\xE7\xAB\x99\x09165.0
K: 13519306776|20140608-25/c:Q/1402368323157/Put/vlen=31/ts=0 V: 20140608\x0913519306776\x09\xE6\xB7\x98\xE5\xAE\x9D\x098.0
K: 13519306776|20140608-29/c:Q/1402368323157/Put/vlen=44/ts=0 V: 20140608\x0913519306776\x09\xE5\x8F\x8B\xE7\x9B\x9F\xE6\x9C\x8D\xE5\x8A\xA1\xE5\xB9\xB3\xE5\x8F\xB0\x0915.0
K: 13519306776|20140608-3/c:Q/1402368323157/Put/vlen=33/ts=0 V: 20140608\x0913519306776\x09\xE5\x85\xB6\xE4\xBB\x96\x0925.89
K: 13519306776|20140608-33/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519306776\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\xA4\xB4\xE5\x83\x8F\x0963.0
K: 13519306776|20140608-34/c:Q/1402368323157/Put/vlen=35/ts=0 V: 20140608\x0913519306776\x09\xE8\x85\xBE\xE8\xAE\xAF\xE7\xBD\x91\x0990.0
K: 13519306776|20140608-39/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519306776\x09\xE5\xA2\xA8\xE8\xBF\xB9\xE5\xA4\xA9\xE6\xB0\x94\x0911.0
K: 13519306776|20140608-4/c:Q/1402368323157/Put/vlen=39/ts=0 V: 20140608\x0913519306776\x09\xE5\x85\xB6\xE4\xBB\x96-TCP\x091442.83
K: 13519306776|20140608-52/c:Q/1402368323157/Put/vlen=39/ts=0 V: 20140608\x0913519306776\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\x9B\xBE\xE7\x89\x87\x09266.0
K: 13519609801|20140608-5/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519609801\x09\xE5\x85\xB6\xE4\xBB\x96-TCP\x09338.97
K: 13519700318|20140608-26/c:Q/1402368323157/Put/vlen=39/ts=0 V: 20140608\x0913519700318\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\xA4\xB4\xE5\x83\x8F\x09521.0
K: 13519700318|20140608-31/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519700318\x09\xE8\x85\xBE\xE8\xAE\xAF\xE5\x9B\xBE\xE7\x89\x87\x0931.0
K: 13519700318|20140608-43/c:Q/1402368323157/Put/vlen=37/ts=0 V: 20140608\x0913519700318\x09\xE8\x85\xBE\xE8\xAE\xAF\xE7\xBD\x91\x091391.0
K: 13519700318|20140608-47/c:Q/1402368323157/Put/vlen=32/ts=0 V: 20140608\x0913519700318\x09\xE6\x90\x9C\xE7\x8B\x90\x0976.0
K: 13519700318|20140608-48/c:Q/1402368323157/Put/vlen=47/ts=0 V: 20140608\x0913519700318\x09\xE4\xB8\xAD\xE5\x9B\xBD\xE8\x81\x94\xE9\x80\x9A\xE6\xB2\x83\xE5\x95\x86\xE5\xBA\x97\x0921.0
K: 13519700318|20140608-57/c:Q/1402368323157/Put/vlen=32/ts=0 V: 20140608\x0913519700318\x09\xE8\xBF\x85\xE9\x9B\xB7\x0965.0
K: 13519700318|20140608-58/c:Q/1402368323157/Put/vlen=37/ts=0 V: 20140608\x0913519700318\x09HTTP\xE4\xB8\x8A\xE7\xBD\x91\x09730.0
K: 13519700318|20140608-59/c:Q/1402368323157/Put/vlen=31/ts=0 V: 20140608\x0913519700318\x09\xE7\x99\xBE\xE5\xBA\xA6\x098.0
K: 13519700318|20140608-6/c:Q/1402368323157/Put/vlen=33/ts=0 V: 20140608\x0913519700318\x09\xE5\x85\xB6\xE4\xBB\x96\x0931.04
K: 13519700318|20140608-62/c:Q/1402368323157/Put/vlen=31/ts=0 V: 20140608\x0913519700318\x0910086\x0925.0
K: 13519700318|20140608-7/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519700318\x09\xE5\x85\xB6\xE4\xBB\x96-TCP\x09685.59
K: 13519700318|20140608-8/c:Q/1402368323157/Put/vlen=33/ts=0 V: 20140608\x0913519700318\x09\xE5\xBD\xA9\xE4\xBF\xA1\x0930.77
K: 13519700638|20140608-10/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519700638\x09\xE5\x85\xB6\xE4\xBB\x96-TCP\x092820.1
K: 13519700638|20140608-11/c:Q/1402368323157/Put/vlen=34/ts=0 V: 20140608\x0913519700638\x09\xE5\xBD\xA9\xE4\xBF\xA1\x09485.94
K: 13519700638|20140608-30/c:Q/1402368323157/Put/vlen=36/ts=0 V: 20140608\x0913519700638\x09\xE8\x85\xBE\xE8\xAE\xAF\xE7\xBD\x91\x09351.0
K: 13519700638|20140608-35/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519700638\x09\xE6\x96\xB0\xE6\xB5\xAA\xE5\xBE\xAE\xE5\x8D\x9A\x0920.0
K: 13519700638|20140608-36/c:Q/1402368323157/Put/vlen=31/ts=0 V: 20140608\x0913519700638\x09\xE6\xB7\x98\xE5\xAE\x9D\x098.0
K: 13519700638|20140608-37/c:Q/1402368323157/Put/vlen=31/ts=0 V: 20140608\x0913519700638\x09\xE7\x82\xB9\xE5\xBF\x83\x093.0
K: 13519700638|20140608-38/c:Q/1402368323157/Put/vlen=35/ts=0 V: 20140608\x0913519700638\x09360\xE7\xBD\x91\xE7\xAB\x99\x0988.0
K: 13519700638|20140608-45/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519700638\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\xA4\xB4\xE5\x83\x8F\x0919.0
K: 13519700638|20140608-49/c:Q/1402368323157/Put/vlen=40/ts=0 V: 20140608\x0913519700638\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\x9B\xBE\xE7\x89\x87\x091251.0
K: 13519700638|20140608-53/c:Q/1402368323157/Put/vlen=32/ts=0 V: 20140608\x0913519700638\x09\xE6\x96\xB0\xE6\xB5\xAA\x0913.0
K: 13519700638|20140608-54/c:Q/1402368323157/Put/vlen=37/ts=0 V: 20140608\x0913519700638\x09HTTP\xE4\xB8\x8A\xE7\xBD\x91\x09122.0
K: 13519700638|20140608-55/c:Q/1402368323157/Put/vlen=33/ts=0 V: 20140608\x0913519700638\x09\xE7\x99\xBE\xE5\xBA\xA6\x09697.0
K: 13519700638|20140608-56/c:Q/1402368323157/Put/vlen=44/ts=0 V: 20140608\x0913519700638\x09\xE5\xAE\x89\xE6\x99\xBA\xE5\xB8\x82\xE5\x9C\xBA\xE4\xB8\x8B\xE8\xBD\xBD\x0926.0
K: 13519700638|20140608-9/c:Q/1402368323157/Put/vlen=33/ts=0 V: 20140608\x0913519700638\x09\xE5\x85\xB6\xE4\xBB\x96\x0975.75
K: 13519701290|20140608-12/c:Q/1402368323157/Put/vlen=33/ts=0 V: 20140608\x0913519701290\x09\xE5\x85\xB6\xE4\xBB\x96\x0933.79
K: 13519701290|20140608-13/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519701290\x09\xE5\x85\xB6\xE4\xBB\x96-TCP\x09703.16
K: 13519701290|20140608-14/c:Q/1402368323157/Put/vlen=32/ts=0 V: 20140608\x0913519701290\x09\xE5\xBD\xA9\xE4\xBF\xA1\x092.59
K: 13519701290|20140608-18/c:Q/1402368323157/Put/vlen=39/ts=0 V: 20140608\x0913519701290\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\x9B\xBE\xE7\x89\x87\x09941.0
K: 13519701290|20140608-19/c:Q/1402368323157/Put/vlen=47/ts=0 V: 20140608\x0913519701290\x09\xE7\x99\xBE\xE5\xBA\xA6\xE5\xBC\x80\xE6\x94\xBE\xE4\xBA\x91\xE5\xB9\xB3\xE5\x8F\xB0\x0929.0
K: 13519701290|20140608-21/c:Q/1402368323157/Put/vlen=37/ts=0 V: 20140608\x0913519701290\x09\xE9\xAB\x98\xE5\xBE\xB7\xE5\x9C\xB0\xE5\x9B\xBE\x098.0
K: 13519701290|20140608-27/c:Q/1402368323157/Put/vlen=37/ts=0 V: 20140608\x0913519701290\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\xA4\xB4\xE5\x83\x8F\x094.0
K: 13519701290|20140608-44/c:Q/1402368323157/Put/vlen=36/ts=0 V: 20140608\x0913519701290\x09\xE8\x85\xBE\xE8\xAE\xAF\xE7\xBD\x91\x09484.0
K: 13519701290|20140608-50/c:Q/1402368323157/Put/vlen=31/ts=0 V: 20140608\x0913519701290\x09\xE6\x90\x9C\xE6\x90\x9C\x092.0
K: 13519701290|20140608-60/c:Q/1402368323157/Put/vlen=36/ts=0 V: 20140608\x0913519701290\x09HTTP\xE4\xB8\x8A\xE7\xBD\x91\x0981.0
K: 13519701290|20140608-61/c:Q/1402368323157/Put/vlen=33/ts=0 V: 20140608\x0913519701290\x09\xE7\x99\xBE\xE5\xBA\xA6\x09255.0
K: 13519702500|20140608-15/c:Q/1402368323157/Put/vlen=33/ts=0 V: 20140608\x0913519702500\x09\xE5\x85\xB6\xE4\xBB\x96\x0917.38
K: 13519702500|20140608-16/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519702500\x09\xE5\x85\xB6\xE4\xBB\x96-TCP\x09174.14
K: 13519702500|20140608-17/c:Q/1402368323157/Put/vlen=34/ts=0 V: 20140608\x0913519702500\x09\xE5\xBD\xA9\xE4\xBF\xA1\x09443.75
K: 13519702500|20140608-20/c:Q/1402368323157/Put/vlen=36/ts=0 V: 20140608\x0913519702500\x09\xE8\x85\xBE\xE8\xAE\xAF\xE7\xBD\x91\x09457.0
K: 13519702500|20140608-22/c:Q/1402368323157/Put/vlen=34/ts=0 V: 20140608\x0913519702500\x09360\xE7\xBD\x91\xE7\xAB\x99\x099.0
K: 13519702500|20140608-28/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519702500\x09\xE8\x85\xBE\xE8\xAE\xAF\xE5\x9B\xBE\xE7\x89\x87\x0951.0
K: 13519702500|20140608-40/c:Q/1402368323157/Put/vlen=38/ts=0 V: 20140608\x0913519702500\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\xA4\xB4\xE5\x83\x8F\x0974.0
K: 13519702500|20140608-41/c:Q/1402368323157/Put/vlen=40/ts=0 V: 20140608\x0913519702500\x09\xE5\xBE\xAE\xE4\xBF\xA1\xE5\x9B\xBE\xE7\x89\x87\x091134.0
K: 13519702500|20140608-46/c:Q/1402368323157/Put/vlen=37/ts=0 V: 20140608\x0913519702500\x09HTTP\xE4\xB8\x8A\xE7\xBD\x91\x09407.0
K: 13519702500|20140608-51/c:Q/1402368323157/Put/vlen=31/ts=0 V: 20140608\x0913519702500\x09\xE6\x97\xBA\xE6\x97\xBA\x091.0
K: 13519702564|20140608-42/c:Q/1402368323157/Put/vlen=36/ts=0 V: 20140608\x0913519702564\x09HTTP\xE4\xB8\x8A\xE7\xBD\x91\x0916.0
Block index size as per heapsize: 424
reader=/hbase/v_dm_user_app_d_201406/c64c6f7f7caf7f6d5a3f6bbc209dd2cb/c/0ebf8900a669462c8a9d9d908622928f,
compression=none,
cacheConf=CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false] [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false] [cacheEvictOnClose=false] [cacheCompressed=false],
firstKey=13519304818|20140608-1/c:Q/1402368323157/Put,
lastKey=13519702564|20140608-42/c:Q/1402368323157/Put,
avgKeyLen=36,
avgValueLen=36,
entries=62,
length=5841
Trailer:
fileinfoOffset=5199,
loadOnOpenDataOffset=5102,
dataIndexCount=1,
metaIndexCount=0,
totalUncomressedBytes=5768,
entryCount=62,
compressionCodec=NONE,
uncompressedDataIndexSize=49,
numDataIndexLevels=1,
firstDataBlockOffset=0,
lastDataBlockOffset=0,
comparatorClassName=org.apache.hadoop.hbase.KeyValue$KeyComparator,
majorVersion=2,
minorVersion=0
Fileinfo:
BULKLOAD_SOURCE_TASK = attempt_201406091028_0019_r_000000_0
BULKLOAD_TIMESTAMP = \x00\x00\x01F\x83\xAA\xD9E
DELETE_FAMILY_COUNT = \x00\x00\x00\x00\x00\x00\x00\x00
EARLIEST_PUT_TS = \x00\x00\x01F\x83\xAAnU
EXCLUDE_FROM_MINOR_COMPACTION = \x00
KEY_VALUE_VERSION = \x00\x00\x00\x01
MAJOR_COMPACTION_KEY = \xFF
MAX_MEMSTORE_TS_KEY = \x00\x00\x00\x00\x00\x00\x00\x00
TIMERANGE = 1402368323157....1402368323157
hfile.AVG_KEY_LEN = 36
hfile.AVG_VALUE_LEN = 36
hfile.LASTKEY = \x00\x1713519702564|20140608-42\x01cQ\x00\x00\x01F\x83\xAAnU\x04
Mid-key: \x00\x1613519304818|20140608-1\x01cQ\x00\x00\x01F\x83\xAAnU\x04
Bloom filter:
Not present
Delete Family Bloom filter:
Not present
Scanned kv count -> 62