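The following steps deploy a ClickHouse cluster on three nodes (192.168.250.140, 192.168.250.146, 192.168.250.147), using ClickHouse Keeper as the distributed coordination service.
Step 1: Install ClickHouse
First, install ClickHouse on every node. The commands below are for Debian-based systems.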
wget -O clickhouse-client_24.4.3.25_amd64.deb /ClickHouse/ClickHouse/releases/download/v24.4.3.25-stable/clickhouse-client_24.4.3.25_amd64.deb
wget -O clickhouse-common-static_24.4.3.25_amd64.deb /ClickHouse/ClickHouse/releases/download/v24.4.3.25-stable/clickhouse-common-static_24.4.3.25_amd64.deb
wget -O clickhouse-server_24.4.3.25_amd64.deb /ClickHouse/ClickHouse/releases/download/v24.4.3.25-stable/clickhouse-server_24.4.3.25_amd64.deb
sudo dpkg -i clickhouse-common-static_24.4.3.25_amd64.deb
sudo dpkg -i clickhouse-client_24.4.3.25_amd64.deb
sudo dpkg -i clickhouse-server_24.4.3.25_amd64.deb
dpkg -l | grep clickhouse   # verify that the packages are installed
Step 2: Configure ClickHouse Keeper
Configure ClickHouse Keeper on every node. Edit the /etc/clickhouse-server/ file and add the following configuration:
<clickhouse>
    <keeper_server>
        <tcp_port>9181</tcp_port>
        <!-- 1, 2, or 3 depending on the node -->
        <server_id>1</server_id>
        <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
        <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
        <coordination_settings>
            <operation_timeout_ms>10000</operation_timeout_ms>
            <session_timeout_ms>30000</session_timeout_ms>
            <raft_logs_level>trace</raft_logs_level>
        </coordination_settings>
        <raft_configuration>
            <server>
                <id>1</id>
                <hostname>192.168.250.140</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>2</id>
                <hostname>192.168.250.146</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>3</id>
                <hostname>192.168.250.147</hostname>
                <port>9234</port>
            </server>
        </raft_configuration>
    </keeper_server>
</clickhouse>
Note: server_id must be unique on each node and must match that node's <id> in raft_configuration. For example:
● server_id is 1 on 192.168.250.140
● server_id is 2 on 192.168.250.146
● server_id is 3 on 192.168.250.147
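One way to avoid hand-editing mistakes is to derive server_id from the node's position in the raft_configuration list. The following is a minimal sketch, assuming the node order used in this guide. The function name keeper_server_id and the KEEPER_NODES variable are illustrative and not part of ClickHouse.

```sh
# Nodes in raft_configuration order; a node's position in the list is its server_id.
KEEPER_NODES="192.168.250.140 192.168.250.146 192.168.250.147"

# Print the server_id for the given node IP, or fail if the IP is not in the list.
keeper_server_id() {
    id=0
    for node in $KEEPER_NODES; do
        id=$((id + 1))
        if [ "$node" = "$1" ]; then
            echo "$id"
            return 0
        fi
    done
    echo "unknown node: $1" >&2
    return 1
}
```

For example, keeper_server_id 192.168.250.146 prints 2, which is the value to put in <server_id> on that node.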
Step 3: Configure shards and replicas
On every node, edit the /etc/clickhouse-server// file to define the shards and replicas.
<clickhouse>
    <remote_servers>
        <log_cluster> <!-- custom cluster name -->
            <shard>
                <replica>
                    <host>192.168.250.140</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>192.168.250.146</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>192.168.250.147</host>
                    <port>9000</port>
                </replica>
            </shard>
        </log_cluster>
    </remote_servers>
</clickhouse>
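The ReplicatedMergeTree table and the ON CLUSTER statements used later in this guide need the server to know where Keeper listens, which ClickHouse reads from a <zookeeper> section. That section does not appear in the configuration shown here, so the following is a minimal sketch in case your config lacks one. It assumes Keeper's tcp_port is 9181 on all three nodes, as in Step 2. The function name render_zookeeper_config and the file name in the usage note are hypothetical.

```sh
KEEPER_NODES="192.168.250.140 192.168.250.146 192.168.250.147"
KEEPER_PORT=9181  # tcp_port from the keeper_server section

# Print a config fragment that points the ClickHouse server at every Keeper node.
render_zookeeper_config() {
    echo "<clickhouse>"
    echo "    <zookeeper>"
    for node in $KEEPER_NODES; do
        echo "        <node>"
        echo "            <host>$node</host>"
        echo "            <port>$KEEPER_PORT</port>"
        echo "        </node>"
    done
    echo "    </zookeeper>"
    echo "</clickhouse>"
}
```

The output could be written to a drop-in file, for example with render_zookeeper_config | sudo tee /etc/clickhouse-server/config.d/use_keeper.xml, or merged by hand into the main config.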
Step 4: Configure macros
Add the matching macros to the configuration on each node.
On 192.168.250.140:
<macros>
    <shard>01</shard>
    <replica>192.168.250.140</replica>
</macros>
On 192.168.250.146:
<macros>
    <shard>02</shard>
    <replica>192.168.250.146</replica>
</macros>
On 192.168.250.147:
<macros>
    <shard>03</shard>
    <replica>192.168.250.147</replica>
</macros>
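The macros matter because the table definition later in this guide uses {shard} and {replica} placeholders, which each server replaces with its own values. The sketch below only imitates that substitution with sed so you can see the resulting Keeper path per node. The function expand_macros is illustrative and is not how ClickHouse performs the expansion internally.

```sh
# Substitute {shard} and {replica} in a template string, the way the macros are applied.
expand_macros() {
    template="$1"
    shard="$2"
    replica="$3"
    printf '%s\n' "$template" | sed -e "s|{shard}|$shard|g" -e "s|{replica}|$replica|g"
}
```

For example, expand_macros '/clickhouse/tables/{shard}/test/local_table' 01 192.168.250.140 yields /clickhouse/tables/01/test/local_table. Nodes that share a shard value share this path and therefore replicate each other, while nodes with different shard values hold independent data.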
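Step 5: Comment out the Keeper section in keeper_config.xml
All of the configuration I use lives in /etc/clickhouse-server/.
So comment out the <keeper_server> tag and everything inside it in keeper_config.xml, so that the Keeper settings are not defined in two places.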
Step 6: Start the ClickHouse service
Start the ClickHouse service on every node:
sudo systemctl start clickhouse-server
Then confirm that the service is running:
sudo systemctl status clickhouse-server
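systemctl only reports on the server process, so it can also help to probe Keeper directly. Keeper answers the four-letter command ruok with imok on its client port. The sketch below assumes nc (netcat) is installed, that port 9181 is reachable, and that ruok is allowed by the four-letter-word whitelist (it is by default). The function names are illustrative.

```sh
# Send "ruok" to one Keeper node; succeed only if it replies "imok".
keeper_is_ok() {
    host="$1"
    port="${2:-9181}"
    reply=$(printf 'ruok' | nc -w 2 "$host" "$port") || return 1
    [ "$reply" = "imok" ]
}

# Probe all three nodes and report each one; return non-zero if any probe fails.
check_all_keepers() {
    status=0
    for node in 192.168.250.140 192.168.250.146 192.168.250.147; do
        if keeper_is_ok "$node"; then
            echo "$node ok"
        else
            echo "$node FAILED"
            status=1
        fi
    done
    return $status
}
```

Running check_all_keepers after startup prints one line per node. An imok reply shows only that the Keeper process is responding; the mntr command gives more detail, such as each node's role.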
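If startup never finishes initializing and the log keeps repeating the message below, try deleting the old Keeper data with the commands that follow the log. This wipes the coordination state, so only do it on a fresh deployment.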
2024.06.26 21:03:29.159411 [ 94688 ] {} <Information> KeeperDispatcher: Server still not initialized, will not apply configuration until initialization finished
2024.06.26 21:03:34.159564 [ 94688 ] {} <Information> KeeperDispatcher: Server still not initialized, will not apply configuration until initialization finished
2024.06.26 21:03:39.159754 [ 94688 ] {} <Information> KeeperDispatcher: Server still not initialized, will not apply configuration until initialization finished
2024.06.26 21:03:44.159931 [ 94688 ] {} <Information> KeeperDispatcher: Server still not initialized, will not apply configuration until initialization finished
2024.06.26 21:03:49.160159 [ 94688 ] {} <Information> KeeperDispatcher: Server still not initialized, will not apply configuration until initialization finished
sudo rm -rf /var/lib/clickhouse/coordination/log/*
sudo rm -rf /var/lib/clickhouse/coordination/snapshots/*
Step 7: Create the distributed table
Connect to any node with the ClickHouse client and create the distributed table.
clickhouse-client -h 192.168.250.140
-- Create the database on every node
CREATE DATABASE IF NOT EXISTS test ON CLUSTER log_cluster;
-- Create the local table
CREATE TABLE test.local_table ON CLUSTER log_cluster (
    id UInt32,
    name String
) ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test/local_table', '{replica}')
ORDER BY id;
-- Create the distributed table
CREATE TABLE test.distributed_table ON CLUSTER log_cluster AS test.local_table
ENGINE = Distributed(log_cluster, test, local_table, rand());
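To see whether rows written through the distributed table actually land on the individual nodes, you can count the rows in the local table on each one. This is a minimal sketch, assuming clickhouse-client is on the PATH and can reach every node with default credentials. The CH_CLIENT variable and the function local_row_counts are illustrative.

```sh
# Client binary; overridable so the function can be exercised without a live cluster.
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"
NODES="192.168.250.140 192.168.250.146 192.168.250.147"

# Print "<host> <rows>" for test.local_table on every node.
local_row_counts() {
    for host in $NODES; do
        rows=$("$CH_CLIENT" --host "$host" --query "SELECT count() FROM test.local_table") || return 1
        echo "$host $rows"
    done
}
```

A typical check is to insert a few rows first, for example clickhouse-client -h 192.168.250.140 --query "INSERT INTO test.distributed_table VALUES (1, 'a'), (2, 'b'), (3, 'c')", and then call local_row_counts. Inserts into a Distributed table are forwarded asynchronously by default, so the counts may take a moment to settle.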
Configuration summary
<clickhouse>
    <listen_host>0.0.0.0</listen_host>
    <remote_servers>
        <log_cluster>
            <shard>
                <replica>
                    <host>192.168.250.140</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>192.168.250.146</host>
                    <port>9000</port>
                </replica>
            </shard>
            <shard>
                <replica>
                    <host>192.168.250.147</host>
                    <port>9000</port>
                </replica>
            </shard>
        </log_cluster>
    </remote_servers>
    <keeper_server>
        <tcp_port>9181</tcp_port>
        <!-- 1, 2, or 3 depending on the node -->
        <server_id>1</server_id>
        <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
        <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
        <coordination_settings>
            <operation_timeout_ms>10000</operation_timeout_ms>
            <session_timeout_ms>30000</session_timeout_ms>
            <raft_logs_level>trace</raft_logs_level>
        </coordination_settings>
        <raft_configuration>
            <server>
                <id>1</id>
                <hostname>192.168.250.140</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>2</id>
                <hostname>192.168.250.146</hostname>
                <port>9234</port>
            </server>
            <server>
                <id>3</id>
                <hostname>192.168.250.147</hostname>
                <port>9234</port>
            </server>
        </raft_configuration>
    </keeper_server>
    <macros>
        <!-- shardNum is 01, 02, or 03 depending on the node; replica is the node's own IP -->
        <shard>shardNum</shard>
        <replica>192.168.250.140</replica>
    </macros>
</clickhouse>