文章目录
- 物化视图
- 概述
- 与普通视图的区别
- 优缺点
- 基本语法
- 案例实操
- 数据表准备
- 创建物化视图方式一
- 创建物化视图
- 向源表插入数据
- 向物化视图插入数据
- 向源表插入同分区数据
- 向源表插入不满足条件数据
- 再次向物化视图导入历史数据
- 查看物化视图存储
- 删除源表数据
- 创建物化视图方式二
- 创建物化视图
- 源表新增数据
- 再次向源表插入数据
- 创建物化视图方式三
- 创建实体目标表
- 创建物化视图
- 向源表导入数据
- 查询物化视图和实体表
- 导入数据到物化视图
- 查看物化视图存储
- 创建物化视图方式四
- AggregatingMergeTree引擎创建物化视图
- 向源表导入数据
- 查询物化视图
- 总结
- 参考
物化视图
/docs/en/sql-reference/statements/create/view/#materialized
物化视图是查询结果的持久化,可以理解为一张时刻在与计算的表,创建过程中用了一个特殊引擎。但是对 更新 与 删除 操作支持并不好,更像是个插入触发器。
概述
与普通视图的区别
普通视图仅仅保存查询语句,查询的时候还是从原表读取数据。物化视图是把查询的结果根据相应的引擎存入到了磁盘或内存中,对数据重新进行了组织,可以理解为完全的一张新表。
优缺点
优点:查询速度快
缺点:写入过程中消耗较多机器资源,比如带宽占满,存储增加等。本质是一个流式数据的使用场景,是累加式的技术,所以要用历史数据做去重、去核的分析操作不太好用。
基本语法
创建一个隐藏的目标表来保存视图数据,表名默认是.inner.物化视图名。如果加了TO表名,将保存到显式的表。
create [materialized] view [if not exists] [db.]table_name [to [db.]name] [engine = engine] [populate] as select ...
限制条件:
- 必须指定物化视图的engine用于数据存储(要么是物化视图,要么是指定的显式表)
- to [db.]table的时候,不得使用populate
- 查询语句可以包含子句:distinct, group by, order by, limit …
案例实操
数据表准备
-- 建表
CREATE TABLE test.tb_mtview_counter (
create_time DateTime DEFAULT now(),
device UInt32,
value Float32
)
ENGINE=MergeTree
PARTITION BY toYYYYMM(create_time)
ORDER BY (device, create_time);
-- 插入数据
INSERT INTO test.tb_mtview_counter
SELECT
toDateTime('2015-01-01 00:00:00') + toInt64(number/10) AS create_time,
(number % 10) + 1 AS device,
(device * 3) + (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;
-- 查询源表
SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
sum(value) as sum,
max(value) as max,
min(value) as min,
avg(value) as avg
from test.tb_mtview_counter
GROUP BY device, day
ORDER BY day, device;
数据表查询结果如下:
┌────────day─┬─device─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 2015-01-01 │ 1 │ 10000 │ 106363.50000333786 │ 18.096 │ 3.021 │ 10.636350000333787 │
│ 2015-01-01 │ 2 │ 10000 │ 136010.90005207062 │ 21.1011 │ 6.1701 │ 13.601090005207062 │
│ 2015-01-01 │ 3 │ 10000 │ 166312.0000295639 │ 24.1942 │ 9.0022 │ 16.63120000295639 │
│ 2015-01-01 │ 4 │ 10000 │ 195977.00000858307 │ 27.1783 │ 12.0003 │ 19.597700000858307 │
│ 2015-01-01 │ 5 │ 10000 │ 226182.19993972778 │ 30.1824 │ 15.0284 │ 22.61821999397278 │
│ 2015-01-01 │ 6 │ 10000 │ 256159.19996261597 │ 33.0795 │ 18.0295 │ 25.615919996261596 │
│ 2015-01-01 │ 7 │ 10000 │ 286113.2000026703 │ 36.1976 │ 21.1016 │ 28.61132000026703 │
│ 2015-01-01 │ 8 │ 10000 │ 315845.9000492096 │ 39.0887 │ 24.1237 │ 31.58459000492096 │
│ 2015-01-01 │ 9 │ 10000 │ 345891.30003738403 │ 42.0428 │ 27.0048 │ 34.589130003738404 │
│ 2015-01-01 │ 10 │ 10000 │ 376212.0999965668 │ 45.0789 │ 30.0009 │ 37.62120999965668 │
└────────────┴────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
创建物化视图方式一
在本方式中,先用我们常见的聚合函数,来看看这种方式到底会出现什么结果。
创建物化视图
-- 创建物化视图 -- 没有to table时必须有engine
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(day) ORDER BY (device, day)
AS SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
sum(value) as sum,
max(value) as max,
min(value) as min,
avg(value) as avg
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2016-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
-- 查询物化视图
SELECT
device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;
查询结果,物化视图中此时没有数据,为何?这是因为未使用 populate 关键字,该关键字并不推荐使用。
思考:为什么建表要加条件 WHERE create_time >= toDate(‘2016-01-01 00:00:00’) ?
在后续的测试中发现其实不加也没有问题,猜测加过滤条件是为了确保建表时数据无法被导入,从而确保数据不会出错。
向源表插入数据
-- 源表插入数据
INSERT INTO test.tb_mtview_counter
SELECT
toDateTime('2017-01-01 00:00:00') + toInt64(number/10) AS create_time,
(number % 10) + 1 AS device,
(device * 3) + (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;
-- 查询物化视图
SELECT
device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;
结果如下:
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2017-01-01 │ 10000 │ 106250.90001893044 │ 18.153 │ 3.122 │ 10.625090001893044 │
│ 2 │ 2017-01-01 │ 10000 │ 135986.09993600845 │ 21.1851 │ 6.1051 │ 13.598609993600846 │
│ 3 │ 2017-01-01 │ 10000 │ 165927.19995212555 │ 24.1862 │ 9.2032 │ 16.592719995212555 │
│ 4 │ 2017-01-01 │ 10000 │ 195965.7000246048 │ 27.1743 │ 12.0193 │ 19.59657000246048 │
│ 5 │ 2017-01-01 │ 10000 │ 226203.90004825592 │ 30.1224 │ 15.0054 │ 22.62039000482559 │
│ 6 │ 2017-01-01 │ 10000 │ 256157.8000125885 │ 33.1915 │ 18.0025 │ 25.61578000125885 │
│ 7 │ 2017-01-01 │ 10000 │ 285870.0000991821 │ 36.1576 │ 21.0036 │ 28.587000009918214 │
│ 8 │ 2017-01-01 │ 10000 │ 315986.59994506836 │ 39.1477 │ 24.0267 │ 31.598659994506836 │
│ 9 │ 2017-01-01 │ 10000 │ 346104.4998226166 │ 42.1978 │ 27.0028 │ 34.610449982261656 │
│ 10 │ 2017-01-01 │ 10000 │ 376108.5000667572 │ 45.1809 │ 30.0059 │ 37.61085000667572 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
向物化视图插入数据
-- 向物化视图插入数据
insert into test.tb_mtview_counter_daily_mv
SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
sum(value) as sum,
max(value) as max,
min(value) as min,
avg(value) as avg
from test.tb_mtview_counter where create_time < toDate('2016-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
-- 查询物化视图
SELECT
device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;
结果如下:
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2017-01-01 │ 10000 │ 106250.90001893044 │ 18.153 │ 3.122 │ 10.625090001893044 │
│ 2 │ 2017-01-01 │ 10000 │ 135986.09993600845 │ 21.1851 │ 6.1051 │ 13.598609993600846 │
│ 3 │ 2017-01-01 │ 10000 │ 165927.19995212555 │ 24.1862 │ 9.2032 │ 16.592719995212555 │
│ 4 │ 2017-01-01 │ 10000 │ 195965.7000246048 │ 27.1743 │ 12.0193 │ 19.59657000246048 │
│ 5 │ 2017-01-01 │ 10000 │ 226203.90004825592 │ 30.1224 │ 15.0054 │ 22.62039000482559 │
│ 6 │ 2017-01-01 │ 10000 │ 256157.8000125885 │ 33.1915 │ 18.0025 │ 25.61578000125885 │
│ 7 │ 2017-01-01 │ 10000 │ 285870.0000991821 │ 36.1576 │ 21.0036 │ 28.587000009918214 │
│ 8 │ 2017-01-01 │ 10000 │ 315986.59994506836 │ 39.1477 │ 24.0267 │ 31.598659994506836 │
│ 9 │ 2017-01-01 │ 10000 │ 346104.4998226166 │ 42.1978 │ 27.0028 │ 34.610449982261656 │
│ 10 │ 2017-01-01 │ 10000 │ 376108.5000667572 │ 45.1809 │ 30.0059 │ 37.61085000667572 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2015-01-01 │ 10000 │ 105885.80000400543 │ 18.166 │ 3.021 │ 10.588580000400544 │
│ 2 │ 2015-01-01 │ 10000 │ 136100.19999217987 │ 21.0361 │ 6.0391 │ 13.610019999217988 │
│ 3 │ 2015-01-01 │ 10000 │ 166083.29994297028 │ 24.0942 │ 9.0432 │ 16.608329994297026 │
│ 4 │ 2015-01-01 │ 10000 │ 196134.90005016327 │ 27.1783 │ 12.0363 │ 19.613490005016327 │
│ 5 │ 2015-01-01 │ 10000 │ 225946.59996700287 │ 30.1994 │ 15.1584 │ 22.594659996700287 │
│ 6 │ 2015-01-01 │ 10000 │ 256283.39991378784 │ 33.1905 │ 18.0135 │ 25.628339991378784 │
│ 7 │ 2015-01-01 │ 10000 │ 286232.70005607605 │ 35.9916 │ 21.0066 │ 28.623270005607605 │
│ 8 │ 2015-01-01 │ 10000 │ 315966.5000553131 │ 39.1537 │ 24.0577 │ 31.596650005531313 │
│ 9 │ 2015-01-01 │ 10000 │ 345799.6998100281 │ 42.1388 │ 27.0538 │ 34.579969981002805 │
│ 10 │ 2015-01-01 │ 10000 │ 375820.0998439789 │ 45.1509 │ 30.0779 │ 37.582009984397885 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
向源表插入同分区数据
插入一个同分区(对物化视图而言),但日期不同的数据。对比查询源表结果,与物化视图的结果有什么区别
-- 源表插入数据
INSERT INTO test.tb_mtview_counter
SELECT
toDateTime('2017-01-02 00:00:00') + toInt64(number/10) AS create_time,
(number % 10) + 1 AS device,
(device * 3) + (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;
-- 查询物化视图
SELECT
device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;
结果如下:
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2015-01-01 │ 10000 │ 105885.80000400543 │ 18.166 │ 3.021 │ 10.588580000400544 │
│ 2 │ 2015-01-01 │ 10000 │ 136100.19999217987 │ 21.0361 │ 6.0391 │ 13.610019999217988 │
│ 3 │ 2015-01-01 │ 10000 │ 166083.29994297028 │ 24.0942 │ 9.0432 │ 16.608329994297026 │
│ 4 │ 2015-01-01 │ 10000 │ 196134.90005016327 │ 27.1783 │ 12.0363 │ 19.613490005016327 │
│ 5 │ 2015-01-01 │ 10000 │ 225946.59996700287 │ 30.1994 │ 15.1584 │ 22.594659996700287 │
│ 6 │ 2015-01-01 │ 10000 │ 256283.39991378784 │ 33.1905 │ 18.0135 │ 25.628339991378784 │
│ 7 │ 2015-01-01 │ 10000 │ 286232.70005607605 │ 35.9916 │ 21.0066 │ 28.623270005607605 │
│ 8 │ 2015-01-01 │ 10000 │ 315966.5000553131 │ 39.1537 │ 24.0577 │ 31.596650005531313 │
│ 9 │ 2015-01-01 │ 10000 │ 345799.6998100281 │ 42.1388 │ 27.0538 │ 34.579969981002805 │
│ 10 │ 2015-01-01 │ 10000 │ 375820.0998439789 │ 45.1509 │ 30.0779 │ 37.582009984397885 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2017-01-01 │ 10000 │ 106250.90001893044 │ 18.153 │ 3.122 │ 10.625090001893044 │
│ 2 │ 2017-01-01 │ 10000 │ 135986.09993600845 │ 21.1851 │ 6.1051 │ 13.598609993600846 │
│ 3 │ 2017-01-01 │ 10000 │ 165927.19995212555 │ 24.1862 │ 9.2032 │ 16.592719995212555 │
│ 4 │ 2017-01-01 │ 10000 │ 195965.7000246048 │ 27.1743 │ 12.0193 │ 19.59657000246048 │
│ 5 │ 2017-01-01 │ 10000 │ 226203.90004825592 │ 30.1224 │ 15.0054 │ 22.62039000482559 │
│ 6 │ 2017-01-01 │ 10000 │ 256157.8000125885 │ 33.1915 │ 18.0025 │ 25.61578000125885 │
│ 7 │ 2017-01-01 │ 10000 │ 285870.0000991821 │ 36.1576 │ 21.0036 │ 28.587000009918214 │
│ 8 │ 2017-01-01 │ 10000 │ 315986.59994506836 │ 39.1477 │ 24.0267 │ 31.598659994506836 │
│ 9 │ 2017-01-01 │ 10000 │ 346104.4998226166 │ 42.1978 │ 27.0028 │ 34.610449982261656 │
│ 10 │ 2017-01-01 │ 10000 │ 376108.5000667572 │ 45.1809 │ 30.0059 │ 37.61085000667572 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2017-01-01 │ 10000 │ 106321.70000195503 │ 18.196 │ 3.007 │ 10.632170000195503 │
│ 2 │ 2017-01-01 │ 10000 │ 135868.9999809265 │ 21.1501 │ 6.0041 │ 13.586899998092651 │
│ 3 │ 2017-01-01 │ 10000 │ 165784.19991016388 │ 24.1572 │ 9.0552 │ 16.57841999101639 │
│ 4 │ 2017-01-01 │ 10000 │ 196065.50004196167 │ 27.1883 │ 12.0763 │ 19.606550004196166 │
│ 5 │ 2017-01-01 │ 10000 │ 225919.39997386932 │ 30.1374 │ 15.0284 │ 22.591939997386934 │
│ 6 │ 2017-01-01 │ 10000 │ 256105.40001869202 │ 33.1865 │ 18.0855 │ 25.6105400018692 │
│ 7 │ 2017-01-01 │ 10000 │ 286283.80012512207 │ 36.1986 │ 21.0996 │ 28.628380012512206 │
│ 8 │ 2017-01-01 │ 10000 │ 316006.7000656128 │ 39.1217 │ 24.0167 │ 31.60067000656128 │
│ 9 │ 2017-01-01 │ 10000 │ 345796.399974823 │ 42.1778 │ 27.0268 │ 34.5796399974823 │
│ 10 │ 2017-01-01 │ 10000 │ 376074.8998966217 │ 45.1629 │ 30.0649 │ 37.60748998966217 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
数据暂时未合并,可以执行 optimize 来手动进行合并(就算不手动执行,未来某个时刻也会自动合并)。
optimize table test.tb_mtview_counter_daily_mv final;
合并后查询结果:
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2015-01-01 │ 10000 │ 105885.80000400543 │ 18.166 │ 3.021 │ 10.588580000400544 │
│ 2 │ 2015-01-01 │ 10000 │ 136100.19999217987 │ 21.0361 │ 6.0391 │ 13.610019999217988 │
│ 3 │ 2015-01-01 │ 10000 │ 166083.29994297028 │ 24.0942 │ 9.0432 │ 16.608329994297026 │
│ 4 │ 2015-01-01 │ 10000 │ 196134.90005016327 │ 27.1783 │ 12.0363 │ 19.613490005016327 │
│ 5 │ 2015-01-01 │ 10000 │ 225946.59996700287 │ 30.1994 │ 15.1584 │ 22.594659996700287 │
│ 6 │ 2015-01-01 │ 10000 │ 256283.39991378784 │ 33.1905 │ 18.0135 │ 25.628339991378784 │
│ 7 │ 2015-01-01 │ 10000 │ 286232.70005607605 │ 35.9916 │ 21.0066 │ 28.623270005607605 │
│ 8 │ 2015-01-01 │ 10000 │ 315966.5000553131 │ 39.1537 │ 24.0577 │ 31.596650005531313 │
│ 9 │ 2015-01-01 │ 10000 │ 345799.6998100281 │ 42.1388 │ 27.0538 │ 34.579969981002805 │
│ 10 │ 2015-01-01 │ 10000 │ 375820.0998439789 │ 45.1509 │ 30.0779 │ 37.582009984397885 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬───────max─┬───────min─┬────────────────avg─┐
│ 1 │ 2017-01-01 │ 20000 │ 212572.60002088547 │ 36.349 │ 6.1289997 │ 21.257260002088547 │
│ 2 │ 2017-01-01 │ 20000 │ 271855.09991693497 │ 42.3352 │ 12.1092 │ 27.185509991693497 │
│ 3 │ 2017-01-01 │ 20000 │ 331711.3998622894 │ 48.3434 │ 18.2584 │ 33.171139986228944 │
│ 4 │ 2017-01-01 │ 20000 │ 392031.20006656647 │ 54.3626 │ 24.0956 │ 39.20312000665665 │
│ 5 │ 2017-01-01 │ 20000 │ 452123.30002212524 │ 60.2598 │ 30.0338 │ 45.212330002212525 │
│ 6 │ 2017-01-01 │ 20000 │ 512263.2000312805 │ 66.378006 │ 36.088 │ 51.22632000312805 │
│ 7 │ 2017-01-01 │ 20000 │ 572153.8002243042 │ 72.3562 │ 42.1032 │ 57.21538002243042 │
│ 8 │ 2017-01-01 │ 20000 │ 631993.3000106812 │ 78.2694 │ 48.0434 │ 63.19933000106812 │
│ 9 │ 2017-01-01 │ 20000 │ 691900.8997974396 │ 84.375595 │ 54.029602 │ 69.19008997974396 │
│ 10 │ 2017-01-01 │ 20000 │ 752183.3999633789 │ 90.343796 │ 60.0708 │ 75.21833999633789 │
└────────┴────────────┴───────┴────────────────────┴───────────┴───────────┴────────────────────┘
从结果可以看到,分区数据执行了合并,而且合并是依据order by字段进行的,除了device和day以外的字段全部自动求和了,这个就是SummingMergeTree的特性。
向源表插入不满足条件数据
再次向源表插入数据,注意插入时间,插入后观察物化视图结果。
-- 源表插入数据
INSERT INTO test.tb_mtview_counter
SELECT
toDateTime('2015-01-02 00:00:00') + toInt64(number/10) AS create_time,
(number % 10) + 1 AS device,
(device * 3) + (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;
-- 查询物化视图
SELECT
device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;
发现该部分数据并未插入到物化视图中,这是为什么呢?
原来,这是因为构建物化视图时,我们为了防止历史数据无法正确导入,设定了一个条件:WHERE create_time >= toDate(‘2016-01-01 00:00:00’)。因此在向源表插入数据时,如果数据不满足物化视图的过滤条件,那么数据将无法自动进入物化视图。
再次向物化视图导入历史数据
再次向物化视图插入数据,对比前后数据有什么不一样
-- 向物化视图插入数据
insert into test.tb_mtview_counter_daily_mv
SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
sum(value) as sum,
max(value) as max,
min(value) as min,
avg(value) as avg
from test.tb_mtview_counter where create_time < toDate('2016-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
-- 查询物化视图
SELECT
device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;
结果显示数据重复插入,2015-01-01的数据又重新导入了一份,以一个新的分区存在(当前数据未合并):
┌─device─┬────────day─┬─count─┬────────────────sum─┬───────max─┬───────min─┬────────────────avg─┐
│ 1 │ 2017-01-01 │ 20000 │ 212572.60002088547 │ 36.349 │ 6.1289997 │ 21.257260002088547 │
│ 2 │ 2017-01-01 │ 20000 │ 271855.09991693497 │ 42.3352 │ 12.1092 │ 27.185509991693497 │
│ 3 │ 2017-01-01 │ 20000 │ 331711.3998622894 │ 48.3434 │ 18.2584 │ 33.171139986228944 │
│ 4 │ 2017-01-01 │ 20000 │ 392031.20006656647 │ 54.3626 │ 24.0956 │ 39.20312000665665 │
│ 5 │ 2017-01-01 │ 20000 │ 452123.30002212524 │ 60.2598 │ 30.0338 │ 45.212330002212525 │
│ 6 │ 2017-01-01 │ 20000 │ 512263.2000312805 │ 66.378006 │ 36.088 │ 51.22632000312805 │
│ 7 │ 2017-01-01 │ 20000 │ 572153.8002243042 │ 72.3562 │ 42.1032 │ 57.21538002243042 │
│ 8 │ 2017-01-01 │ 20000 │ 631993.3000106812 │ 78.2694 │ 48.0434 │ 63.19933000106812 │
│ 9 │ 2017-01-01 │ 20000 │ 691900.8997974396 │ 84.375595 │ 54.029602 │ 69.19008997974396 │
│ 10 │ 2017-01-01 │ 20000 │ 752183.3999633789 │ 90.343796 │ 60.0708 │ 75.21833999633789 │
└────────┴────────────┴───────┴────────────────────┴───────────┴───────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2015-01-01 │ 10000 │ 105885.80000400543 │ 18.166 │ 3.021 │ 10.588580000400544 │
│ 2 │ 2015-01-01 │ 10000 │ 136100.19999217987 │ 21.0361 │ 6.0391 │ 13.610019999217988 │
│ 3 │ 2015-01-01 │ 10000 │ 166083.29994297028 │ 24.0942 │ 9.0432 │ 16.608329994297026 │
│ 4 │ 2015-01-01 │ 10000 │ 196134.90005016327 │ 27.1783 │ 12.0363 │ 19.613490005016327 │
│ 5 │ 2015-01-01 │ 10000 │ 225946.59996700287 │ 30.1994 │ 15.1584 │ 22.594659996700287 │
│ 6 │ 2015-01-01 │ 10000 │ 256283.39991378784 │ 33.1905 │ 18.0135 │ 25.628339991378784 │
│ 7 │ 2015-01-01 │ 10000 │ 286232.70005607605 │ 35.9916 │ 21.0066 │ 28.623270005607605 │
│ 8 │ 2015-01-01 │ 10000 │ 315966.5000553131 │ 39.1537 │ 24.0577 │ 31.596650005531313 │
│ 9 │ 2015-01-01 │ 10000 │ 345799.6998100281 │ 42.1388 │ 27.0538 │ 34.579969981002805 │
│ 10 │ 2015-01-01 │ 10000 │ 375820.0998439789 │ 45.1509 │ 30.0779 │ 37.582009984397885 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬───────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2015-01-01 │ 20000 │ 212017.2000157833 │ 18.166 │ 3.021 │ 10.600860000789165 │
│ 2 │ 2015-01-01 │ 20000 │ 272037.8999552727 │ 21.1301 │ 6.0391 │ 13.601894997763633 │
│ 3 │ 2015-01-01 │ 20000 │ 331975.7998428345 │ 24.1942 │ 9.0432 │ 16.598789992141725 │
│ 4 │ 2015-01-01 │ 20000 │ 392069.6000614166 │ 27.1783 │ 12.0363 │ 19.603480003070832 │
│ 5 │ 2015-01-01 │ 20000 │ 452100.799785614 │ 30.1994 │ 15.0254 │ 22.6050399892807 │
│ 6 │ 2015-01-01 │ 20000 │ 512518.7999534607 │ 33.1905 │ 18.0135 │ 25.625939997673036 │
│ 7 │ 2015-01-01 │ 20000 │ 572203.9001655579 │ 36.1746 │ 21.0066 │ 28.61019500827789 │
│ 8 │ 2015-01-01 │ 20000 │ 631927.8000221252 │ 39.1537 │ 24.0577 │ 31.59639000110626 │
│ 9 │ 2015-01-01 │ 20000 │ 691799.4997425079 │ 42.1388 │ 27.0538 │ 34.589974987125395 │
│ 10 │ 2015-01-01 │ 20000 │ 751651.5999317169 │ 45.1999 │ 30.0089 │ 37.58257999658585 │
└────────┴────────────┴───────┴───────────────────┴─────────┴─────────┴────────────────────┘
执行合并之后再查看:
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2015-01-01 │ 30000 │ 317903.00001978874 │ 36.332 │ 6.042 │ 21.18944000118971 │
│ 2 │ 2015-01-01 │ 30000 │ 408138.09994745255 │ 42.1662 │ 12.0782 │ 27.21191499698162 │
│ 3 │ 2015-01-01 │ 30000 │ 498059.09978580475 │ 48.2884 │ 18.0864 │ 33.207119986438755 │
│ 4 │ 2015-01-01 │ 30000 │ 588204.5001115799 │ 54.3566 │ 24.0726 │ 39.21697000808716 │
│ 5 │ 2015-01-01 │ 30000 │ 678047.3997526169 │ 60.3988 │ 30.1838 │ 45.199699985980985 │
│ 6 │ 2015-01-01 │ 30000 │ 768802.1998672485 │ 66.381 │ 36.027 │ 51.254279989051824 │
│ 7 │ 2015-01-01 │ 30000 │ 858436.6002216339 │ 72.1662 │ 42.0132 │ 57.23346501388549 │
│ 8 │ 2015-01-01 │ 30000 │ 947894.3000774384 │ 78.3074 │ 48.1154 │ 63.193040006637574 │
│ 9 │ 2015-01-01 │ 30000 │ 1037599.199552536 │ 84.2776 │ 54.1076 │ 69.16994496812819 │
│ 10 │ 2015-01-01 │ 30000 │ 1127471.6997756958 │ 90.3508 │ 60.0868 │ 75.16458998098373 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬───────max─┬───────min─┬────────────────avg─┐
│ 1 │ 2017-01-01 │ 20000 │ 212572.60002088547 │ 36.349 │ 6.1289997 │ 21.257260002088547 │
│ 2 │ 2017-01-01 │ 20000 │ 271855.09991693497 │ 42.3352 │ 12.1092 │ 27.185509991693497 │
│ 3 │ 2017-01-01 │ 20000 │ 331711.3998622894 │ 48.3434 │ 18.2584 │ 33.171139986228944 │
│ 4 │ 2017-01-01 │ 20000 │ 392031.20006656647 │ 54.3626 │ 24.0956 │ 39.20312000665665 │
│ 5 │ 2017-01-01 │ 20000 │ 452123.30002212524 │ 60.2598 │ 30.0338 │ 45.212330002212525 │
│ 6 │ 2017-01-01 │ 20000 │ 512263.2000312805 │ 66.378006 │ 36.088 │ 51.22632000312805 │
│ 7 │ 2017-01-01 │ 20000 │ 572153.8002243042 │ 72.3562 │ 42.1032 │ 57.21538002243042 │
│ 8 │ 2017-01-01 │ 20000 │ 631993.3000106812 │ 78.2694 │ 48.0434 │ 63.19933000106812 │
│ 9 │ 2017-01-01 │ 20000 │ 691900.8997974396 │ 84.375595 │ 54.029602 │ 69.19008997974396 │
│ 10 │ 2017-01-01 │ 20000 │ 752183.3999633789 │ 90.343796 │ 60.0708 │ 75.21833999633789 │
└────────┴────────────┴───────┴────────────────────┴───────────┴───────────┴────────────────────┘
查看物化视图存储
物化视图实际上也是占用了disk上的datapart,我们可以看一下私有表和mv表对应的datapart情况:
-- 查看当前私有表和mv表名称
show tables from test;
-- 查看系统表
SELECT
database,
table,
partition,
name,
rows,
bytes_on_disk,
modification_time,
min_date,
max_date,
engine
FROM system.parts
WHERE table in ('.inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e','tb_mtview_counter_daily_mv');
从结果可以知道,实际磁盘上存储数据的是隐藏的目标表,表名通常以.inner
开头。
┌─database─┬─table──────────────────────────────────────────┬─partition─┬─name─────────┬─rows─┬─bytes_on_disk─┬───modification_time─┬───min_date─┬───max_date─┬─engine───────────┐
│ test │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201501 │ 201501_2_2_1 │ 10 │ 754 │ 2021-11-22 09:59:28 │ 2015-01-01 │ 2015-01-01 │ SummingMergeTree │
│ test │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201501 │ 201501_2_4_2 │ 10 │ 754 │ 2021-11-22 11:17:42 │ 2015-01-01 │ 2015-01-01 │ SummingMergeTree │
│ test │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201501 │ 201501_4_4_0 │ 10 │ 754 │ 2021-11-22 11:13:50 │ 2015-01-01 │ 2015-01-01 │ SummingMergeTree │
│ test │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201701 │ 201701_1_3_1 │ 10 │ 754 │ 2021-11-22 09:59:28 │ 2017-01-01 │ 2017-01-01 │ SummingMergeTree │
│ test │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201701 │ 201701_1_3_2 │ 10 │ 754 │ 2021-11-22 11:17:42 │ 2017-01-01 │ 2017-01-01 │ SummingMergeTree │
└──────────┴────────────────────────────────────────────────┴───────────┴──────────────┴──────┴───────────────┴─────────────────────┴────────────┴────────────┴──────────────────┘
删除源表数据
-- 清空源表
truncate table test.tb_mtview_counter;
-- 查询源表
SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
sum(value) as sum,
max(value) as max,
min(value) as min,
avg(value) as avg
from test.tb_mtview_counter
GROUP BY device, day
ORDER BY day, device;
-- 查询物化视图
SELECT
device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;
-- 物化视图删除
truncate table test.tb_mtview_counter_daily_mv;
drop table test.tb_mtview_counter_daily_mv;
drop view test.tb_mtview_counter_daily_mv;
上面的操作我们发现,删除源表数据,物化视图的数据不会受到影响。这说明物化视图不支持同步删除数据,在实际应用中,应避免在经常需要删除或更新的场景中使用物化视图。
创建物化视图方式二
在前一种物化视图中,我们看到,SummingMergeTree()对常规的除了求和以外的聚合函数支持并不好,在本例中,我们采用 聚合函数名称加-State
后缀 的函数形式来获取聚合值,通过例子来了解具体是怎么一回事。
官网介绍:以-State
后缀的函数总是返回AggregateFunction
类型的数据的中间状态。对于SELECT
而言,AggregateFunction
类型总是以特定的二进制形式展现在所有的输出格式中。
参考:/docs/zh/sql-reference/data-types/aggregatefunction/
创建物化视图
-- 没有to table时必须有engine
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv2
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(day) ORDER BY (device, day)
AS SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
sum(value) as sum,
maxState(value) AS max_value_state,
minState(value) AS min_value_state,
avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2016-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
源表新增数据
INSERT INTO test.tb_mtview_counter
SELECT
toDateTime('2016-01-01 00:00:00') + toInt64(number/10) AS create_time,
(number % 10) + 1 AS device,
(device * 3) + (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;
-- 查询源表
SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
sum(value) as sum,
max(value) as max,
min(value) as min,
avg(value) as avg
from test.tb_mtview_counter
GROUP BY device, day
ORDER BY day, device;
-- 查询物化视图
SELECT
day, device,
sum(count) AS count,
sum(sum) as sum,
maxMerge(max_value_state) AS max,
minMerge(min_value_state) AS min,
avgMerge(avg_value_state) AS avg
FROM test.tb_mtview_counter_daily_mv2
GROUP BY device, day
ORDER BY day, device;
-- 显示当前数据库所有表
show tables from test;
二者结果均为:
┌────────day─┬─device─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 2016-01-01 │ 1 │ 10000 │ 105755.79999136925 │ 18.11 │ 3.004 │ 10.575579999136925 │
│ 2016-01-01 │ 2 │ 10000 │ 136052.39994812012 │ 21.1971 │ 6.0411 │ 13.605239994812012 │
│ 2016-01-01 │ 3 │ 10000 │ 165888.80001831055 │ 24.1952 │ 9.0012 │ 16.588880001831054 │
│ 2016-01-01 │ 4 │ 10000 │ 196001.40008163452 │ 27.1563 │ 12.0223 │ 19.600140008163454 │
│ 2016-01-01 │ 5 │ 10000 │ 225971.79990959167 │ 30.0924 │ 15.0074 │ 22.597179990959166 │
│ 2016-01-01 │ 6 │ 10000 │ 256052.80004692078 │ 33.1685 │ 18.0755 │ 25.605280004692077 │
│ 2016-01-01 │ 7 │ 10000 │ 286267.600069046 │ 36.1666 │ 21.0096 │ 28.6267600069046 │
│ 2016-01-01 │ 8 │ 10000 │ 315901.2999572754 │ 39.1117 │ 24.0547 │ 31.59012999572754 │
│ 2016-01-01 │ 9 │ 10000 │ 345774.49993896484 │ 42.1678 │ 27.1348 │ 34.57744999389649 │
│ 2016-01-01 │ 10 │ 10000 │ 375901.99990463257 │ 45.1459 │ 30.0009 │ 37.590199990463255 │
└────────────┴────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
再次向源表插入数据
插入同分区不同日期的数据
-- 源表插入数据
INSERT INTO test.tb_mtview_counter
SELECT
toDateTime('2016-01-02 00:00:00') + toInt64(number/10) AS create_time,
(number % 10) + 1 AS device,
(device * 3) + (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 10000;
先看物化视图原始数据:
select * from test.tb_mtview_counter_daily_mv2 where device=1 ;
┌────────day─┬─device─┬─count─┬────────────────sum─┬─max_value_state─┬─min_value_state─┬─avg_value_state─┐
│ 2016-01-01 │ 1 │ 10000 │ 105755.79999136925 │ H▒A │ ▒A@@ │ ▒▒̼▒▒@▒N │
└────────────┴────────┴───────┴────────────────────┴─────────────────┴─────────────────┴─────────────────┘
┌────────day─┬─device─┬─count─┬────────────────sum─┬─max_value_state─┬─min_value_state─┬─avg_value_state─┐
│ 2016-01-01 │ 1 │ 1000 │ 6048.3999972343445 │ ▒&A │ ▒@@ │ 8ff▒▒@▒ │
└────────────┴────────┴───────┴────────────────────┴─────────────────┴─────────────────┴─────────────────┘
可以看出来数据并未按照分区进行合并,可以执行 :
-- 合并数据
optimize table test.tb_mtview_counter_daily_mv2 final;
再次查询查看结果,数据已经合并:
select * from test.tb_mtview_counter_daily_mv2 where device=1 ;
┌────────day─┬─device─┬─count─┬────────────────sum─┬─max_value_state─┬─min_value_state─┬─avg_value_state─┐
│ 2016-01-01 │ 1 │ 11000 │ 111804.19998860359 │ H▒A │ ▒A@@ │ @'3▒K▒@▒U │
└────────────┴────────┴───────┴────────────────────┴─────────────────┴─────────────────┴─────────────────┘
再次查询物化视图聚合结果:
-- 查询物化视图
SELECT
day, device,
sum(count) AS count,
sum(sum) as sum,
maxMerge(max_value_state) AS max,
minMerge(min_value_state) AS min,
avgMerge(avg_value_state) AS avg
FROM test.tb_mtview_counter_daily_mv2
GROUP BY device, day
ORDER BY day, device;
结果正确:
┌────────day─┬─device─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 2016-01-01 │ 1 │ 11000 │ 111804.19998860359 │ 18.11 │ 3.004 │ 10.164018180782145 │
│ 2016-01-01 │ 2 │ 11000 │ 145148.39994430542 │ 21.1971 │ 6.0411 │ 13.195309085845947 │
│ 2016-01-01 │ 3 │ 11000 │ 178021.30000972748 │ 24.1952 │ 9.0002 │ 16.18375454633886 │
│ 2016-01-01 │ 4 │ 11000 │ 211105.6000881195 │ 27.1563 │ 12.0223 │ 19.191418189829047 │
│ 2016-01-01 │ 5 │ 11000 │ 244015.59992313385 │ 30.0924 │ 15.0074 │ 22.183236356648532 │
│ 2016-01-01 │ 6 │ 11000 │ 277090.6000518799 │ 33.1685 │ 18.0155 │ 25.1900545501709 │
│ 2016-01-01 │ 7 │ 11000 │ 310359.90009117126 │ 36.1666 │ 21.0096 │ 28.21453637192466 │
│ 2016-01-01 │ 8 │ 11000 │ 342972.3999404907 │ 39.1117 │ 24.0547 │ 31.179309085499156 │
│ 2016-01-01 │ 9 │ 11000 │ 375844.5999317169 │ 42.1678 │ 27.0128 │ 34.16769090288336 │
│ 2016-01-01 │ 10 │ 11000 │ 409034.9999103546 │ 45.1459 │ 30.0009 │ 37.18499999185042 │
└────────────┴────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
创建物化视图方式三
在本例中,我们采用使用to 方法,将物化视图的实体表指定为特定名称的表。
创建实体目标表
在这之前,我们先查看一下物化视图的建表语句:
-- 查看建表语句
show create table tb_mtview_counter_daily_mv2;
其中物化视图建表语句:
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv2
(
`day` Date,
`device` UInt32,
`count` UInt64,
`sum` Float64,
`max_value_state` AggregateFunction(max, Float32),
`min_value_state` AggregateFunction(min, Float32),
`avg_value_state` AggregateFunction(avg, Float32)
)
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (device, day)
SETTINGS index_granularity = 8192 AS
SELECT
toStartOfMonth(create_time) AS day,
device,
count(*) AS count,
sum(value) AS sum,
maxState(value) AS max_value_state,
minState(value) AS min_value_state,
avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2016-01-01 00:00:00')
GROUP BY
device,
day
ORDER BY
device ASC,
day ASC
如果我们查看该物化视图对应的实体表,会发现建表语句其实就是物化视图前半部分的语句:
show create table `.inner_id.b72f9396-16ff-4b1b-8234-8ba8a1a8f620`;
我们直接使用该建表语句,将表名修改为tb_mtview_counter_daily,如下所示:
CREATE TABLE test.tb_mtview_counter_daily (
day Date,
device UInt32,
count UInt64,
sum Float64,
max_value_state AggregateFunction(max, Float32),
min_value_state AggregateFunction(min, Float32),
avg_value_state AggregateFunction(avg, Float32)
)
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(day)
ORDER BY (device, day);
AggregateFunction(name, types_of_arguments…):聚合函数的中间状态,可以通过聚合函数名称加-State
后缀的形式得到它。访问该类型的最终状态数据时,以相同的聚合函数名加-Merge
后缀的形式来得到最终状态数据。
参考:/docs/zh/sql-reference/data-types/aggregatefunction/
创建物化视图
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv3
TO test.tb_mtview_counter_daily
AS SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
sum(value) as sum,
maxState(value) AS max_value_state,
minState(value) AS min_value_state,
avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2019-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
向源表导入数据
INSERT INTO test.tb_mtview_counter
SELECT
toDateTime('2020-01-01 00:00:00') + toInt64(number/10) AS create_time,
(number % 10) + 1 AS device,
(device * 3) + (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 10000;
查询物化视图和实体表
-- 查询物化视图实体表
SELECT
device, day,
sum(count) AS count,
sum(sum) as sum,
maxMerge(max_value_state) AS max,
minMerge(min_value_state) AS min,
avgMerge(avg_value_state) AS avg
FROM test.tb_mtview_counter_daily
GROUP BY device, day
ORDER BY day, device;
-- 查询物化视图
SELECT
device, day,
sum(count) AS count,
sum(sum) as sum,
maxMerge(max_value_state) AS max,
minMerge(min_value_state) AS min,
avgMerge(avg_value_state) AS avg
FROM test.tb_mtview_counter_daily_mv3
GROUP BY device, day
ORDER BY day, device;
结果二者查询的结果一致:
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2020-01-01 │ 1000 │ 6121.699999332428 │ 9.165 │ 3.111 │ 6.121699999332428 │
│ 2 │ 2020-01-01 │ 1000 │ 9138.199996948242 │ 12.1741 │ 6.0041 │ 9.138199996948241 │
│ 3 │ 2020-01-01 │ 1000 │ 12132.700001716614 │ 15.1862 │ 9.0142 │ 12.132700001716614 │
│ 4 │ 2020-01-01 │ 1000 │ 15055.100002288818 │ 18.1763 │ 12.0163 │ 15.055100002288818 │
│ 5 │ 2020-01-01 │ 1000 │ 18032.100012779236 │ 21.1454 │ 15.0524 │ 18.032100012779235 │
│ 6 │ 2020-01-01 │ 1000 │ 21153.600038528442 │ 24.1725 │ 18.1265 │ 21.15360003852844 │
│ 7 │ 2020-01-01 │ 1000 │ 24141.300004959106 │ 27.1236 │ 21.1196 │ 24.141300004959106 │
│ 8 │ 2020-01-01 │ 1000 │ 27071.199975967407 │ 30.1157 │ 24.0987 │ 27.071199975967406 │
│ 9 │ 2020-01-01 │ 1000 │ 30123.699993133545 │ 33.1418 │ 27.0528 │ 30.123699993133545 │
│ 10 │ 2020-01-01 │ 1000 │ 33096.899978637695 │ 36.0949 │ 30.0509 │ 33.096899978637694 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
导入数据到物化视图
导入到物化视图实体表:
INSERT INTO test.tb_mtview_counter_daily
SELECT
toStartOfMonth(create_time) as day,
device,
count(*) AS count,
sum(value) as sum,
maxState(value) AS max_value_state,
minState(value) AS min_value_state,
avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time < toDateTime('2017-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
此时查询物化视图和实体表,都能得到结果:
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2016-01-01 │ 11000 │ 111804.19998860359 │ 18.11 │ 3.004 │ 10.164018180782145 │
│ 2 │ 2016-01-01 │ 11000 │ 145148.39994430542 │ 21.1971 │ 6.0411 │ 13.195309085845947 │
│ 3 │ 2016-01-01 │ 11000 │ 178021.30000972748 │ 24.1952 │ 9.0002 │ 16.18375454633886 │
│ 4 │ 2016-01-01 │ 11000 │ 211105.6000881195 │ 27.1563 │ 12.0223 │ 19.191418189829047 │
│ 5 │ 2016-01-01 │ 11000 │ 244015.59992313385 │ 30.0924 │ 15.0074 │ 22.183236356648532 │
│ 6 │ 2016-01-01 │ 11000 │ 277090.6000518799 │ 33.1685 │ 18.0155 │ 25.1900545501709 │
│ 7 │ 2016-01-01 │ 11000 │ 310359.90009117126 │ 36.1666 │ 21.0096 │ 28.21453637192466 │
│ 8 │ 2016-01-01 │ 11000 │ 342972.3999404907 │ 39.1117 │ 24.0547 │ 31.179309085499156 │
│ 9 │ 2016-01-01 │ 11000 │ 375844.5999317169 │ 42.1678 │ 27.0128 │ 34.16769090288336 │
│ 10 │ 2016-01-01 │ 11000 │ 409034.9999103546 │ 45.1459 │ 30.0009 │ 37.18499999185042 │
│ 1 │ 2020-01-01 │ 1000 │ 6121.699999332428 │ 9.165 │ 3.111 │ 6.121699999332428 │
│ 2 │ 2020-01-01 │ 1000 │ 9138.199996948242 │ 12.1741 │ 6.0041 │ 9.138199996948241 │
│ 3 │ 2020-01-01 │ 1000 │ 12132.700001716614 │ 15.1862 │ 9.0142 │ 12.132700001716614 │
│ 4 │ 2020-01-01 │ 1000 │ 15055.100002288818 │ 18.1763 │ 12.0163 │ 15.055100002288818 │
│ 5 │ 2020-01-01 │ 1000 │ 18032.100012779236 │ 21.1454 │ 15.0524 │ 18.032100012779235 │
│ 6 │ 2020-01-01 │ 1000 │ 21153.600038528442 │ 24.1725 │ 18.1265 │ 21.15360003852844 │
│ 7 │ 2020-01-01 │ 1000 │ 24141.300004959106 │ 27.1236 │ 21.1196 │ 24.141300004959106 │
│ 8 │ 2020-01-01 │ 1000 │ 27071.199975967407 │ 30.1157 │ 24.0987 │ 27.071199975967406 │
│ 9 │ 2020-01-01 │ 1000 │ 30123.699993133545 │ 33.1418 │ 27.0528 │ 30.123699993133545 │
│ 10 │ 2020-01-01 │ 1000 │ 33096.899978637695 │ 36.0949 │ 30.0509 │ 33.096899978637694 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
导入数据到物化视图:
INSERT INTO test.tb_mtview_counter_daily_mv3
SELECT
toStartOfMonth(create_time) as day,
device,
count(*) AS count,
sum(value) as sum,
maxState(value) AS max_value_state,
minState(value) AS min_value_state,
avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time < toDateTime('2020-01-01 00:00:00') and create_time >= toDateTime('2017-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
再次查询物化视图和实体表,结果如下:
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2016-01-01 │ 11000 │ 111804.19998860359 │ 18.11 │ 3.004 │ 10.164018180782145 │
│ 2 │ 2016-01-01 │ 11000 │ 145148.39994430542 │ 21.1971 │ 6.0411 │ 13.195309085845947 │
│ 3 │ 2016-01-01 │ 11000 │ 178021.30000972748 │ 24.1952 │ 9.0002 │ 16.18375454633886 │
│ 4 │ 2016-01-01 │ 11000 │ 211105.6000881195 │ 27.1563 │ 12.0223 │ 19.191418189829047 │
│ 5 │ 2016-01-01 │ 11000 │ 244015.59992313385 │ 30.0924 │ 15.0074 │ 22.183236356648532 │
│ 6 │ 2016-01-01 │ 11000 │ 277090.6000518799 │ 33.1685 │ 18.0155 │ 25.1900545501709 │
│ 7 │ 2016-01-01 │ 11000 │ 310359.90009117126 │ 36.1666 │ 21.0096 │ 28.21453637192466 │
│ 8 │ 2016-01-01 │ 11000 │ 342972.3999404907 │ 39.1117 │ 24.0547 │ 31.179309085499156 │
│ 9 │ 2016-01-01 │ 11000 │ 375844.5999317169 │ 42.1678 │ 27.0128 │ 34.16769090288336 │
│ 10 │ 2016-01-01 │ 11000 │ 409034.9999103546 │ 45.1459 │ 30.0009 │ 37.18499999185042 │
│ 1 │ 2019-01-01 │ 1000 │ 6154.999999046326 │ 9.1 │ 3.01 │ 6.154999999046326 │
│ 2 │ 2019-01-01 │ 1000 │ 9118.399992465973 │ 12.1301 │ 6.0011 │ 9.118399992465973 │
│ 3 │ 2019-01-01 │ 1000 │ 12096.499998092651 │ 15.1812 │ 9.1182 │ 12.096499998092652 │
│ 4 │ 2019-01-01 │ 1000 │ 15040.29999256134 │ 18.1233 │ 12.0773 │ 15.04029999256134 │
│ 5 │ 2019-01-01 │ 1000 │ 18125.899991989136 │ 21.1994 │ 15.0644 │ 18.125899991989137 │
│ 6 │ 2019-01-01 │ 1000 │ 21185.09998703003 │ 24.1755 │ 18.0895 │ 21.18509998703003 │
│ 7 │ 2019-01-01 │ 1000 │ 24034.600004196167 │ 27.1696 │ 21.0026 │ 24.034600004196168 │
│ 8 │ 2019-01-01 │ 1000 │ 27155.999990463257 │ 30.1957 │ 24.0117 │ 27.155999990463258 │
│ 9 │ 2019-01-01 │ 1000 │ 30086.500009536743 │ 33.1848 │ 27.0038 │ 30.086500009536742 │
│ 10 │ 2019-01-01 │ 1000 │ 33094.30003166199 │ 36.1629 │ 30.0829 │ 33.09430003166199 │
│ 1 │ 2020-01-01 │ 1000 │ 6121.699999332428 │ 9.165 │ 3.111 │ 6.121699999332428 │
│ 2 │ 2020-01-01 │ 1000 │ 9138.199996948242 │ 12.1741 │ 6.0041 │ 9.138199996948241 │
│ 3 │ 2020-01-01 │ 1000 │ 12132.700001716614 │ 15.1862 │ 9.0142 │ 12.132700001716614 │
│ 4 │ 2020-01-01 │ 1000 │ 15055.100002288818 │ 18.1763 │ 12.0163 │ 15.055100002288818 │
│ 5 │ 2020-01-01 │ 1000 │ 18032.100012779236 │ 21.1454 │ 15.0524 │ 18.032100012779235 │
│ 6 │ 2020-01-01 │ 1000 │ 21153.600038528442 │ 24.1725 │ 18.1265 │ 21.15360003852844 │
│ 7 │ 2020-01-01 │ 1000 │ 24141.300004959106 │ 27.1236 │ 21.1196 │ 24.141300004959106 │
│ 8 │ 2020-01-01 │ 1000 │ 27071.199975967407 │ 30.1157 │ 24.0987 │ 27.071199975967406 │
│ 9 │ 2020-01-01 │ 1000 │ 30123.699993133545 │ 33.1418 │ 27.0528 │ 30.123699993133545 │
│ 10 │ 2020-01-01 │ 1000 │ 33096.899978637695 │ 36.0949 │ 30.0509 │ 33.096899978637694 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
我们会发现,不论是往物化视图还是物化视图实体表插入数据,都可以使数据进入到物化视图实体表中,查询物化视图都可以得到新增的数据。
查看物化视图存储
查看系统表信息:
SELECT
database,
table,
partition,
name,
rows,
bytes_on_disk,
modification_time,
min_date,
max_date,
engine
FROM system.parts
WHERE table in ('tb_mtview_counter_daily_mv3','tb_mtview_counter_daily');
结果如下:
┌─database─┬─table───────────────────┬─partition─┬─name─────────┬─rows─┬─bytes_on_disk─┬───modification_time─┬───min_date─┬───max_date─┬─engine───────────┐
│ test │ tb_mtview_counter_daily │ 201601 │ 201601_2_2_0 │ 10 │ 786 │ 2021-11-23 02:47:05 │ 2016-01-01 │ 2016-01-01 │ SummingMergeTree │
│ test │ tb_mtview_counter_daily │ 201901 │ 201901_3_3_0 │ 10 │ 776 │ 2021-11-23 02:53:26 │ 2019-01-01 │ 2019-01-01 │ SummingMergeTree │
│ test │ tb_mtview_counter_daily │ 202001 │ 202001_1_1_0 │ 10 │ 778 │ 2021-11-23 02:44:33 │ 2020-01-01 │ 2020-01-01 │ SummingMergeTree │
└──────────┴─────────────────────────┴───────────┴──────────────┴──────┴───────────────┴─────────────────────┴────────────┴────────────┴──────────────────┘
创建物化视图方式四
AggregatingMergeTree引擎创建物化视图
参考:/docs/en/engines/table-engines/mergetree-family/aggregatingmergetree/#example-of-an-aggregated-materialized-view
使用AggregatingMergeTree创建物化视图时,规则其实和前面类似,只不过这里所有的聚合函数都需要加上State
后缀,而在使用时需要加上Merge
后缀。
-- 没有to table时必须有engine
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv4
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(day) ORDER BY (device, day)
AS SELECT
toStartOfMonth(create_time) as day,
device,
countState(value) as count,
uniqState(value) as count_uniq,
sumState(value) as sum,
maxState(value) AS max_value_state,
minState(value) AS min_value_state,
avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2021-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
向源表导入数据
INSERT INTO test.tb_mtview_counter
SELECT
toDateTime('2021-01-01 00:00:00') + toInt64(number/10) AS create_time,
(number % 10) + 1 AS device,
(device * 3) + (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 10000;
查询物化视图
-- 查询物化视图
SELECT
device, day,
countMerge(count) as count,
uniqMerge(count_uniq) as count_uniq,
sumMerge(sum) as sum,
maxMerge(max_value_state) as max,
minMerge(min_value_state) as min,
avgMerge(avg_value_state) as avg
FROM test.tb_mtview_counter_daily_mv4
GROUP BY device, day
ORDER BY day, device;
-- 查询源表
SELECT
toStartOfMonth(create_time) as day,
device,
count(*) as count,
uniq(value) as count_uniq,
sum(value) as sum,
max(value) as max,
min(value) as min,
avg(value) as avg
from test.tb_mtview_counter
WHERE create_time >= toDate('2021-01-01 00:00:00')
GROUP BY device, day
ORDER BY day, device;
结果如下:
┌─device─┬────────day─┬─count─┬─count_uniq─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 1 │ 2021-01-01 │ 1000 │ 927 │ 6119.699991703033 │ 9.194 │ 3.028 │ 6.119699991703033 │
│ 2 │ 2021-01-01 │ 1000 │ 908 │ 9103.499992847443 │ 12.1681 │ 6.0101 │ 9.103499992847443 │
│ 3 │ 2021-01-01 │ 1000 │ 932 │ 12109.299996376038 │ 15.1942 │ 9.0042 │ 12.109299996376038 │
│ 4 │ 2021-01-01 │ 1000 │ 919 │ 15160.899991989136 │ 18.1823 │ 12.0713 │ 15.160899991989135 │
│ 5 │ 2021-01-01 │ 1000 │ 918 │ 18105.59998035431 │ 21.1324 │ 15.1554 │ 18.10559998035431 │
│ 6 │ 2021-01-01 │ 1000 │ 920 │ 21167.400003433228 │ 24.1555 │ 18.0185 │ 21.167400003433226 │
│ 7 │ 2021-01-01 │ 1000 │ 907 │ 24069.20000076294 │ 27.0776 │ 21.0096 │ 24.06920000076294 │
│ 8 │ 2021-01-01 │ 1000 │ 925 │ 27040.30001449585 │ 30.0767 │ 24.0157 │ 27.04030001449585 │
│ 9 │ 2021-01-01 │ 1000 │ 928 │ 30125.100023269653 │ 33.1758 │ 27.0588 │ 30.125100023269653 │
│ 10 │ 2021-01-01 │ 1000 │ 926 │ 33178.39996910095 │ 36.1689 │ 30.0159 │ 33.17839996910095 │
└────────┴────────────┴───────┴────────────┴────────────────────┴─────────┴─────────┴────────────────────┘
假设我们想知道如何创建这种物化视图的显示表,使用 show create table test.tb_mtview_counter_daily_mv4
即可得到:
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv4
(
`day` Date,
`device` UInt32,
`count` AggregateFunction(count, Float32),
`count_uniq` AggregateFunction(uniq, Float32),
`sum` AggregateFunction(sum, Float32),
`max_value_state` AggregateFunction(max, Float32),
`min_value_state` AggregateFunction(min, Float32),
`avg_value_state` AggregateFunction(avg, Float32)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (device, day)
SETTINGS index_granularity = 8192
总结
- 源表不可有nullable列,空值尽量用默认字符串或者数值填充,因为在order by中无法指定nullable列。
- 在创建 MV 表时不要使用 POPULATE 关键字,而是在 MV 表建好之后将数据手动导入 MV 表,clickhouse 官方并不推荐使用populated,因为在创建物化视图的过程中同时写入的数据不能被插入物化视图。
- 创建物化视图时,建议指定条件过滤掉历史数据,然后手动将历史数据导入物化视图。如果不指定的话,容易导致无法采用统一条件将历史数据导入物化视图,从而出现数据重复导入的情况。建表中的过滤条件可以显示提醒用户导入历史数据的条件。
- 创建物化视图时,尽量采用MergeTree家族中的聚合引擎,如SummingMergeTree, AggregatingMergeTree。创建物化视图时,聚合统计列,采用State函数,如:maxState, minState, avgState。查询物化视图时,聚合列采用Merge函数获取值,如:maxMerge, minMerge, avgMerge。
- 在使用 MV 的聚合引擎时,也需要按照聚合查询来写sql,因为聚合时机不可控
- 物化视图不支持同步删除,若源表的数据不存在(删除了)则物化视图的数据仍然保留
- 在创建 MV 表时,一定要使用 TO 关键字为 MV 表指定存储位置,否则不支持 嵌套视图(多个物化视图继续聚合一个新的视图)
- 在创建 MV 表时如果用到了多表联查,不能为连接表指定别名,如果多个连接表中存在同名字段,在连接表的查询语句中使用 AS 将字段名区分开
- 在创建 MV 表时如果用到了多表联查,只有当第一个查询的表有数据插入时,这个 MV 才会被触发
参考
/p/390560886
/p/362809994