Importing local files into HBase with DataX!!! So cool

Date: 2022-09-22 08:23:58

Documentation for the HBase writer's JSON configuration:

https://github.com/alibaba/DataX/blob/master/hbase11xwriter/doc/hbase11xwriter.md

The JSON job configuration file

{
  "job": {
    "setting": {
      "speed": {
        "channel": 5
      }
    },
    "content": [
      {
        "reader": {
          "name": "txtfilereader",
          "parameter": {
            "path": "/home/baoshan/ml-1m/movies.dat",
            "charset": "UTF-8",
            "column": [
              {
                "index": 0,
                "type": "string"
              },
              {
                "index": 1,
                "type": "string"
              },
              {
                "index": 2,
                "type": "string"
              }
            ],
            "fieldDelimiter": ""
          }
        },
        "writer": {
          "name": "hbase11xwriter",
          "parameter": {
            "hbaseConfig": {
              "hbase.rootdir": "xxxxxx",
              "hbase.cluster.distributed": "true",
              "hbase.zookeeper.quorum": "xxxxxx"
            },
            "table": "writer",
            "mode": "normal",
            "rowkeyColumn": [
              {
                "index": 0,
                "type": "string"
              }
            ],
            "column": [
              {
                "index": 1,
                "name": "cf1:q1",
                "type": "string"
              },
              {
                "index": 2,
                "name": "cf1:q2",
                "type": "string"
              }
            ],
            "versionColumn": {
              "index": -1,
              "value": "123456789"
            },
            "encoding": "utf-8"
          }
        }
      }
    ]
  }
}
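
Before running a job like the one above, it can help to confirm that every index used by the writer points at a column the reader actually produces. The following is a minimal Python sketch of such a check, not part of DataX itself. The function name check_indexes is hypothetical, the job dict is a trimmed copy of the config above with the connection settings left out, and the sketch assumes that an index of -1 marks a constant value rather than a source column.

```python
# Sketch: cross-check reader/writer column indexes in a DataX job dict.
# Assumption: index -1 marks a constant value, not a column from the reader.

def check_indexes(job):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for n, content in enumerate(job["job"]["content"]):
        reader_cols = content["reader"]["parameter"]["column"]
        writer_param = content["writer"]["parameter"]
        valid = range(len(reader_cols))

        # Collect every (label, index) pair the writer refers to.
        referenced = [("rowkeyColumn", c["index"])
                      for c in writer_param.get("rowkeyColumn", [])]
        referenced += [("column " + c["name"], c["index"])
                       for c in writer_param.get("column", [])]
        version = writer_param.get("versionColumn")
        if version is not None:
            referenced.append(("versionColumn", version["index"]))

        for label, index in referenced:
            if index == -1:  # constant value, nothing to look up
                continue
            if index not in valid:
                problems.append(
                    f"content[{n}] {label}: index {index} has no reader column"
                )
    return problems


# Trimmed copy of the job above (hbaseConfig and other settings omitted).
job = {
    "job": {
        "content": [
            {
                "reader": {
                    "name": "txtfilereader",
                    "parameter": {
                        "column": [
                            {"index": 0, "type": "string"},
                            {"index": 1, "type": "string"},
                            {"index": 2, "type": "string"},
                        ]
                    },
                },
                "writer": {
                    "name": "hbase11xwriter",
                    "parameter": {
                        "rowkeyColumn": [{"index": 0, "type": "string"}],
                        "column": [
                            {"index": 1, "name": "cf1:q1", "type": "string"},
                            {"index": 2, "name": "cf1:q2", "type": "string"},
                        ],
                        "versionColumn": {"index": -1, "value": "123456789"},
                    },
                },
            }
        ]
    }
}

print(check_indexes(job))
```

An empty list means the row key, both cf1 columns, and the version column all refer to columns that exist in the reader.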

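Multiple local files can be imported in one job by using the wildcard * in the path. For example, if several files have names starting with example, set the path to example* and DataX will import the matching files concurrently.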
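
To preview which files a pattern such as example* would pick up, the wildcard can be tried out with Python's glob module first. This is only an illustration of the wildcard semantics: DataX does its own matching internally, and the file names and the helper matching_files below are made up for the demo.

```python
import glob
import os
import tempfile


def matching_files(pattern):
    """Return the paths matching a shell-style wildcard pattern, sorted."""
    return sorted(glob.glob(pattern))


# Create a throwaway directory with two matching files and one that is not.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("example_a.dat", "example_b.dat", "other.dat"):
        open(os.path.join(tmp, name), "w").close()

    pattern = os.path.join(tmp, "example*")
    matched = [os.path.basename(p) for p in matching_files(pattern)]
    print(matched)
```

Only the two files whose names begin with example are listed; other.dat is skipped.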

Run the job

python bin/datax.py local2hbase.json
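
A malformed job file only fails once DataX has started, so a small wrapper can parse the JSON first and then hand back the same command as above. This is a rough sketch under assumptions: build_datax_command is a hypothetical helper, and the command is assumed to be run from the DataX installation directory, as in the line above.

```python
import json
import os
import tempfile


def build_datax_command(job_path):
    """Parse the job file to catch JSON errors early, then build the command."""
    with open(job_path, encoding="utf-8") as fh:
        json.load(fh)  # raises json.JSONDecodeError if the file is malformed
    return ["python", "bin/datax.py", job_path]


# Demo with a throwaway job file; a real run would pass local2hbase.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "local2hbase.json")
    with open(demo_path, "w", encoding="utf-8") as fh:
        fh.write('{"job": {"content": []}}')
    command = build_datax_command(demo_path)
    print(command)
```

The returned list can be passed to subprocess.run(command, check=True) from the DataX directory to start the import.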

 

Tested it myself: batch import into HBase works perfectly.