clickhouse--物化视图

时间:2025-01-24 18:58:01

文章目录

  • 物化视图
    • 概述
      • 与普通视图的区别
      • 优缺点
      • 基本语法
    • 案例实操
      • 数据表准备
      • 创建物化视图方式一
        • 创建物化视图
        • 向源表插入数据
        • 向物化视图插入数据
        • 向源表插入同分区数据
        • 向源表插入不满足条件数据
        • 再次向物化视图导入历史数据
        • 查看物化视图存储
        • 删除源表数据
      • 创建物化视图方式二
        • 创建物化视图
        • 源表新增数据
        • 再次向源表插入数据
      • 创建物化视图方式三
        • 创建实体目标表
        • 创建物化视图
        • 向源表导入数据
        • 查询物化视图和实体表
        • 导入数据到物化视图
        • 查看物化视图存储
      • 创建物化视图方式四
        • AggregatingMergeTree引擎创建物化视图
        • 向源表导入数据
        • 查询物化视图
    • 总结
    • 参考


物化视图

/docs/en/sql-reference/statements/create/view/#materialized

物化视图是查询结果的持久化,可以理解为一张时刻在与计算的表,创建过程中用了一个特殊引擎。但是对 更新删除 操作支持并不好,更像是个插入触发器。

概述

与普通视图的区别

普通视图仅仅保存查询语句,查询的时候还是从原表读取数据。物化视图是把查询的结果根据相应的引擎存入到了磁盘或内存中,对数据重新进行了组织,可以理解为完全的一张新表。

优缺点

优点:查询速度快

缺点:写入过程中消耗较多机器资源,比如带宽占满,存储增加等。本质是一个流式数据的使用场景,是累加式的技术,所以要用历史数据做去重、去核的分析操作不太好用。

基本语法

创建一个隐藏的目标表来保存视图数据,表名默认是.inner.物化视图名。如果加了TO表名,将保存到显式的表。

create [materialized] view [if not exists] [db.]table_name [to [db.]name] [engine = engine] [populate] as select ...

限制条件:

  1. 必须指定物化视图的engine用于数据存储(要么是物化视图,要么是指定的显式表)
  2. to [db.]table的时候,不得使用populate
  3. 查询语句可以包含子句:distinct, group by, order by, limit …

案例实操

数据表准备

-- 建表
CREATE TABLE test.tb_mtview_counter (
      create_time DateTime DEFAULT now(),
      device UInt32,
      value Float32
) 
ENGINE=MergeTree
PARTITION BY toYYYYMM(create_time)
ORDER BY (device, create_time);

-- 插入数据
INSERT INTO test.tb_mtview_counter
SELECT
      toDateTime('2015-01-01 00:00:00') + toInt64(number/10) AS create_time,
      (number % 10) + 1 AS device,
      (device * 3) +  (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;

-- 查询源表
SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    sum(value) as sum,
    max(value) as max,
    min(value) as min,
    avg(value) as avg
from test.tb_mtview_counter 
GROUP BY device, day
ORDER BY day, device;

数据表查询结果如下:

┌────────day─┬─device─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 2015-01-01 │      1 │ 10000 │ 106363.50000333786 │  18.096 │   3.021 │ 10.636350000333787 │
│ 2015-01-01 │      2 │ 10000 │ 136010.90005207062 │ 21.1011 │  6.1701 │ 13.601090005207062 │
│ 2015-01-01 │      3 │ 10000 │  166312.0000295639 │ 24.1942 │  9.0022 │  16.63120000295639 │
│ 2015-01-01 │      4 │ 10000 │ 195977.00000858307 │ 27.1783 │ 12.0003 │ 19.597700000858307 │
│ 2015-01-01 │      5 │ 10000 │ 226182.19993972778 │ 30.1824 │ 15.0284 │  22.61821999397278 │
│ 2015-01-01 │      6 │ 10000 │ 256159.19996261597 │ 33.0795 │ 18.0295 │ 25.615919996261596 │
│ 2015-01-01 │      7 │ 10000 │  286113.2000026703 │ 36.1976 │ 21.1016 │  28.61132000026703 │
│ 2015-01-01 │      8 │ 10000 │  315845.9000492096 │ 39.0887 │ 24.1237 │  31.58459000492096 │
│ 2015-01-01 │      9 │ 10000 │ 345891.30003738403 │ 42.0428 │ 27.0048 │ 34.589130003738404 │
│ 2015-01-01 │     10 │ 10000 │  376212.0999965668 │ 45.0789 │ 30.0009 │  37.62120999965668 │
└────────────┴────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘

创建物化视图方式一

在本方式中,先用我们常见的聚合函数,来看看这种方式到底会出现什么结果。

创建物化视图
-- 创建物化视图 -- 没有to table时必须有engine
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(day) ORDER BY (device, day)
AS SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    sum(value) as sum,
    max(value) as max,
    min(value) as min,
    avg(value) as avg
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2016-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;

-- 查询物化视图
SELECT
  device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;

查询结果,物化视图中此时没有数据,为何?这是因为未使用 populate 关键字,该关键字并不推荐使用。

思考:为什么建表要加条件 WHERE create_time >= toDate(‘2016-01-01 00:00:00’) ?

在后续的测试中发现其实不加也没有问题,猜测加过滤条件是为了确保建表时数据无法被导入,从而确保数据不会出错。

向源表插入数据
-- 源表插入数据
INSERT INTO test.tb_mtview_counter
SELECT
      toDateTime('2017-01-01 00:00:00') + toInt64(number/10) AS create_time,
      (number % 10) + 1 AS device,
      (device * 3) +  (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;

-- 查询物化视图
SELECT
  device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;

结果如下:

┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2017-01-01 │ 10000 │ 106250.90001893044 │  18.153 │   3.122 │ 10.625090001893044 │
│      2 │ 2017-01-01 │ 10000 │ 135986.09993600845 │ 21.1851 │  6.1051 │ 13.598609993600846 │
│      3 │ 2017-01-01 │ 10000 │ 165927.19995212555 │ 24.1862 │  9.2032 │ 16.592719995212555 │
│      4 │ 2017-01-01 │ 10000 │  195965.7000246048 │ 27.1743 │ 12.0193 │  19.59657000246048 │
│      5 │ 2017-01-01 │ 10000 │ 226203.90004825592 │ 30.1224 │ 15.0054 │  22.62039000482559 │
│      6 │ 2017-01-01 │ 10000 │  256157.8000125885 │ 33.1915 │ 18.0025 │  25.61578000125885 │
│      7 │ 2017-01-01 │ 10000 │  285870.0000991821 │ 36.1576 │ 21.0036 │ 28.587000009918214 │
│      8 │ 2017-01-01 │ 10000 │ 315986.59994506836 │ 39.1477 │ 24.0267 │ 31.598659994506836 │
│      9 │ 2017-01-01 │ 10000 │  346104.4998226166 │ 42.1978 │ 27.0028 │ 34.610449982261656 │
│     10 │ 2017-01-01 │ 10000 │  376108.5000667572 │ 45.1809 │ 30.0059 │  37.61085000667572 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
向物化视图插入数据
-- 向物化视图插入数据
insert into test.tb_mtview_counter_daily_mv
SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    sum(value) as sum,
    max(value) as max,
    min(value) as min,
    avg(value) as avg
from test.tb_mtview_counter where create_time < toDate('2016-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;

-- 查询物化视图
SELECT
  device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;

结果如下:

┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2017-01-01 │ 10000 │ 106250.90001893044 │  18.153 │   3.122 │ 10.625090001893044 │
│      2 │ 2017-01-01 │ 10000 │ 135986.09993600845 │ 21.1851 │  6.1051 │ 13.598609993600846 │
│      3 │ 2017-01-01 │ 10000 │ 165927.19995212555 │ 24.1862 │  9.2032 │ 16.592719995212555 │
│      4 │ 2017-01-01 │ 10000 │  195965.7000246048 │ 27.1743 │ 12.0193 │  19.59657000246048 │
│      5 │ 2017-01-01 │ 10000 │ 226203.90004825592 │ 30.1224 │ 15.0054 │  22.62039000482559 │
│      6 │ 2017-01-01 │ 10000 │  256157.8000125885 │ 33.1915 │ 18.0025 │  25.61578000125885 │
│      7 │ 2017-01-01 │ 10000 │  285870.0000991821 │ 36.1576 │ 21.0036 │ 28.587000009918214 │
│      8 │ 2017-01-01 │ 10000 │ 315986.59994506836 │ 39.1477 │ 24.0267 │ 31.598659994506836 │
│      9 │ 2017-01-01 │ 10000 │  346104.4998226166 │ 42.1978 │ 27.0028 │ 34.610449982261656 │
│     10 │ 2017-01-01 │ 10000 │  376108.5000667572 │ 45.1809 │ 30.0059 │  37.61085000667572 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2015-01-01 │ 10000 │ 105885.80000400543 │  18.166 │   3.021 │ 10.588580000400544 │
│      2 │ 2015-01-01 │ 10000 │ 136100.19999217987 │ 21.0361 │  6.0391 │ 13.610019999217988 │
│      3 │ 2015-01-01 │ 10000 │ 166083.29994297028 │ 24.0942 │  9.0432 │ 16.608329994297026 │
│      4 │ 2015-01-01 │ 10000 │ 196134.90005016327 │ 27.1783 │ 12.0363 │ 19.613490005016327 │
│      5 │ 2015-01-01 │ 10000 │ 225946.59996700287 │ 30.1994 │ 15.1584 │ 22.594659996700287 │
│      6 │ 2015-01-01 │ 10000 │ 256283.39991378784 │ 33.1905 │ 18.0135 │ 25.628339991378784 │
│      7 │ 2015-01-01 │ 10000 │ 286232.70005607605 │ 35.9916 │ 21.0066 │ 28.623270005607605 │
│      8 │ 2015-01-01 │ 10000 │  315966.5000553131 │ 39.1537 │ 24.0577 │ 31.596650005531313 │
│      9 │ 2015-01-01 │ 10000 │  345799.6998100281 │ 42.1388 │ 27.0538 │ 34.579969981002805 │
│     10 │ 2015-01-01 │ 10000 │  375820.0998439789 │ 45.1509 │ 30.0779 │ 37.582009984397885 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
向源表插入同分区数据

插入一个同分区(对物化视图而言),但日期不同的数据。对比查询源表结果,与物化视图的结果有什么区别

-- 源表插入数据
INSERT INTO test.tb_mtview_counter
SELECT
      toDateTime('2017-01-02 00:00:00') + toInt64(number/10) AS create_time,
      (number % 10) + 1 AS device,
      (device * 3) +  (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;

-- 查询物化视图
SELECT
  device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;

结果如下:

┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2015-01-01 │ 10000 │ 105885.80000400543 │  18.166 │   3.021 │ 10.588580000400544 │
│      2 │ 2015-01-01 │ 10000 │ 136100.19999217987 │ 21.0361 │  6.0391 │ 13.610019999217988 │
│      3 │ 2015-01-01 │ 10000 │ 166083.29994297028 │ 24.0942 │  9.0432 │ 16.608329994297026 │
│      4 │ 2015-01-01 │ 10000 │ 196134.90005016327 │ 27.1783 │ 12.0363 │ 19.613490005016327 │
│      5 │ 2015-01-01 │ 10000 │ 225946.59996700287 │ 30.1994 │ 15.1584 │ 22.594659996700287 │
│      6 │ 2015-01-01 │ 10000 │ 256283.39991378784 │ 33.1905 │ 18.0135 │ 25.628339991378784 │
│      7 │ 2015-01-01 │ 10000 │ 286232.70005607605 │ 35.9916 │ 21.0066 │ 28.623270005607605 │
│      8 │ 2015-01-01 │ 10000 │  315966.5000553131 │ 39.1537 │ 24.0577 │ 31.596650005531313 │
│      9 │ 2015-01-01 │ 10000 │  345799.6998100281 │ 42.1388 │ 27.0538 │ 34.579969981002805 │
│     10 │ 2015-01-01 │ 10000 │  375820.0998439789 │ 45.1509 │ 30.0779 │ 37.582009984397885 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2017-01-01 │ 10000 │ 106250.90001893044 │  18.153 │   3.122 │ 10.625090001893044 │
│      2 │ 2017-01-01 │ 10000 │ 135986.09993600845 │ 21.1851 │  6.1051 │ 13.598609993600846 │
│      3 │ 2017-01-01 │ 10000 │ 165927.19995212555 │ 24.1862 │  9.2032 │ 16.592719995212555 │
│      4 │ 2017-01-01 │ 10000 │  195965.7000246048 │ 27.1743 │ 12.0193 │  19.59657000246048 │
│      5 │ 2017-01-01 │ 10000 │ 226203.90004825592 │ 30.1224 │ 15.0054 │  22.62039000482559 │
│      6 │ 2017-01-01 │ 10000 │  256157.8000125885 │ 33.1915 │ 18.0025 │  25.61578000125885 │
│      7 │ 2017-01-01 │ 10000 │  285870.0000991821 │ 36.1576 │ 21.0036 │ 28.587000009918214 │
│      8 │ 2017-01-01 │ 10000 │ 315986.59994506836 │ 39.1477 │ 24.0267 │ 31.598659994506836 │
│      9 │ 2017-01-01 │ 10000 │  346104.4998226166 │ 42.1978 │ 27.0028 │ 34.610449982261656 │
│     10 │ 2017-01-01 │ 10000 │  376108.5000667572 │ 45.1809 │ 30.0059 │  37.61085000667572 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2017-01-01 │ 10000 │ 106321.70000195503 │  18.196 │   3.007 │ 10.632170000195503 │
│      2 │ 2017-01-01 │ 10000 │  135868.9999809265 │ 21.1501 │  6.0041 │ 13.586899998092651 │
│      3 │ 2017-01-01 │ 10000 │ 165784.19991016388 │ 24.1572 │  9.0552 │  16.57841999101639 │
│      4 │ 2017-01-01 │ 10000 │ 196065.50004196167 │ 27.1883 │ 12.0763 │ 19.606550004196166 │
│      5 │ 2017-01-01 │ 10000 │ 225919.39997386932 │ 30.1374 │ 15.0284 │ 22.591939997386934 │
│      6 │ 2017-01-01 │ 10000 │ 256105.40001869202 │ 33.1865 │ 18.0855 │   25.6105400018692 │
│      7 │ 2017-01-01 │ 10000 │ 286283.80012512207 │ 36.1986 │ 21.0996 │ 28.628380012512206 │
│      8 │ 2017-01-01 │ 10000 │  316006.7000656128 │ 39.1217 │ 24.0167 │  31.60067000656128 │
│      9 │ 2017-01-01 │ 10000 │   345796.399974823 │ 42.1778 │ 27.0268 │   34.5796399974823 │
│     10 │ 2017-01-01 │ 10000 │  376074.8998966217 │ 45.1629 │ 30.0649 │  37.60748998966217 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘

数据暂时未合并,可以执行 optimize 来手动进行合并(就算不手动执行,未来某个时刻也会自动合并)。

optimize table test.tb_mtview_counter_daily_mv final;

合并后查询结果:

┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2015-01-01 │ 10000 │ 105885.80000400543 │  18.166 │   3.021 │ 10.588580000400544 │
│      2 │ 2015-01-01 │ 10000 │ 136100.19999217987 │ 21.0361 │  6.0391 │ 13.610019999217988 │
│      3 │ 2015-01-01 │ 10000 │ 166083.29994297028 │ 24.0942 │  9.0432 │ 16.608329994297026 │
│      4 │ 2015-01-01 │ 10000 │ 196134.90005016327 │ 27.1783 │ 12.0363 │ 19.613490005016327 │
│      5 │ 2015-01-01 │ 10000 │ 225946.59996700287 │ 30.1994 │ 15.1584 │ 22.594659996700287 │
│      6 │ 2015-01-01 │ 10000 │ 256283.39991378784 │ 33.1905 │ 18.0135 │ 25.628339991378784 │
│      7 │ 2015-01-01 │ 10000 │ 286232.70005607605 │ 35.9916 │ 21.0066 │ 28.623270005607605 │
│      8 │ 2015-01-01 │ 10000 │  315966.5000553131 │ 39.1537 │ 24.0577 │ 31.596650005531313 │
│      9 │ 2015-01-01 │ 10000 │  345799.6998100281 │ 42.1388 │ 27.0538 │ 34.579969981002805 │
│     10 │ 2015-01-01 │ 10000 │  375820.0998439789 │ 45.1509 │ 30.0779 │ 37.582009984397885 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬───────max─┬───────min─┬────────────────avg─┐
│      1 │ 2017-01-01 │ 20000 │ 212572.60002088547 │    36.349 │ 6.1289997 │ 21.257260002088547 │
│      2 │ 2017-01-01 │ 20000 │ 271855.09991693497 │   42.3352 │   12.1092 │ 27.185509991693497 │
│      3 │ 2017-01-01 │ 20000 │  331711.3998622894 │   48.3434 │   18.2584 │ 33.171139986228944 │
│      4 │ 2017-01-01 │ 20000 │ 392031.20006656647 │   54.3626 │   24.0956 │  39.20312000665665 │
│      5 │ 2017-01-01 │ 20000 │ 452123.30002212524 │   60.2598 │   30.0338 │ 45.212330002212525 │
│      6 │ 2017-01-01 │ 20000 │  512263.2000312805 │ 66.378006 │    36.088 │  51.22632000312805 │
│      7 │ 2017-01-01 │ 20000 │  572153.8002243042 │   72.3562 │   42.1032 │  57.21538002243042 │
│      8 │ 2017-01-01 │ 20000 │  631993.3000106812 │   78.2694 │   48.0434 │  63.19933000106812 │
│      9 │ 2017-01-01 │ 20000 │  691900.8997974396 │ 84.375595 │ 54.029602 │  69.19008997974396 │
│     10 │ 2017-01-01 │ 20000 │  752183.3999633789 │ 90.343796 │   60.0708 │  75.21833999633789 │
└────────┴────────────┴───────┴────────────────────┴───────────┴───────────┴────────────────────┘

从结果可以看到,分区数据执行了合并,而且合并是依据order by字段进行的,除了device和day以外的字段全部自动求和了,这个就是SummingMergeTree的特性。

向源表插入不满足条件数据

再次向源表插入数据,注意插入时间,插入后观察物化视图结果。

-- 源表插入数据
INSERT INTO test.tb_mtview_counter
SELECT
      toDateTime('2015-01-02 00:00:00') + toInt64(number/10) AS create_time,
      (number % 10) + 1 AS device,
      (device * 3) +  (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;

-- 查询物化视图
SELECT
  device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;

发现该部分数据并未插入到物化视图中,这是为什么呢?

原来,这是因为构建物化视图时,我们为了防止历史数据无法正确导入,设定了一个条件:WHERE create_time >= toDate(‘2016-01-01 00:00:00’)。因此在向源表插入数据时,如果数据不满足物化视图的过滤条件,那么数据将无法自动进入物化视图。

再次向物化视图导入历史数据

再次向物化视图插入数据,对比前后数据有什么不一样

-- 向物化视图插入数据
insert into test.tb_mtview_counter_daily_mv
SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    sum(value) as sum,
    max(value) as max,
    min(value) as min,
    avg(value) as avg
from test.tb_mtview_counter where create_time < toDate('2016-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;

-- 查询物化视图
SELECT
  device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;

结果显示数据重复插入,2015-01-01的数据又重新导入了一份,以一个新的分区存在(当前数据未合并):

┌─device─┬────────day─┬─count─┬────────────────sum─┬───────max─┬───────min─┬────────────────avg─┐
│      1 │ 2017-01-01 │ 20000 │ 212572.60002088547 │    36.349 │ 6.1289997 │ 21.257260002088547 │
│      2 │ 2017-01-01 │ 20000 │ 271855.09991693497 │   42.3352 │   12.1092 │ 27.185509991693497 │
│      3 │ 2017-01-01 │ 20000 │  331711.3998622894 │   48.3434 │   18.2584 │ 33.171139986228944 │
│      4 │ 2017-01-01 │ 20000 │ 392031.20006656647 │   54.3626 │   24.0956 │  39.20312000665665 │
│      5 │ 2017-01-01 │ 20000 │ 452123.30002212524 │   60.2598 │   30.0338 │ 45.212330002212525 │
│      6 │ 2017-01-01 │ 20000 │  512263.2000312805 │ 66.378006 │    36.088 │  51.22632000312805 │
│      7 │ 2017-01-01 │ 20000 │  572153.8002243042 │   72.3562 │   42.1032 │  57.21538002243042 │
│      8 │ 2017-01-01 │ 20000 │  631993.3000106812 │   78.2694 │   48.0434 │  63.19933000106812 │
│      9 │ 2017-01-01 │ 20000 │  691900.8997974396 │ 84.375595 │ 54.029602 │  69.19008997974396 │
│     10 │ 2017-01-01 │ 20000 │  752183.3999633789 │ 90.343796 │   60.0708 │  75.21833999633789 │
└────────┴────────────┴───────┴────────────────────┴───────────┴───────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2015-01-01 │ 10000 │ 105885.80000400543 │  18.166 │   3.021 │ 10.588580000400544 │
│      2 │ 2015-01-01 │ 10000 │ 136100.19999217987 │ 21.0361 │  6.0391 │ 13.610019999217988 │
│      3 │ 2015-01-01 │ 10000 │ 166083.29994297028 │ 24.0942 │  9.0432 │ 16.608329994297026 │
│      4 │ 2015-01-01 │ 10000 │ 196134.90005016327 │ 27.1783 │ 12.0363 │ 19.613490005016327 │
│      5 │ 2015-01-01 │ 10000 │ 225946.59996700287 │ 30.1994 │ 15.1584 │ 22.594659996700287 │
│      6 │ 2015-01-01 │ 10000 │ 256283.39991378784 │ 33.1905 │ 18.0135 │ 25.628339991378784 │
│      7 │ 2015-01-01 │ 10000 │ 286232.70005607605 │ 35.9916 │ 21.0066 │ 28.623270005607605 │
│      8 │ 2015-01-01 │ 10000 │  315966.5000553131 │ 39.1537 │ 24.0577 │ 31.596650005531313 │
│      9 │ 2015-01-01 │ 10000 │  345799.6998100281 │ 42.1388 │ 27.0538 │ 34.579969981002805 │
│     10 │ 2015-01-01 │ 10000 │  375820.0998439789 │ 45.1509 │ 30.0779 │ 37.582009984397885 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬───────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2015-01-01 │ 20000 │ 212017.2000157833 │  18.166 │   3.021 │ 10.600860000789165 │
│      2 │ 2015-01-01 │ 20000 │ 272037.8999552727 │ 21.1301 │  6.0391 │ 13.601894997763633 │
│      3 │ 2015-01-01 │ 20000 │ 331975.7998428345 │ 24.1942 │  9.0432 │ 16.598789992141725 │
│      4 │ 2015-01-01 │ 20000 │ 392069.6000614166 │ 27.1783 │ 12.0363 │ 19.603480003070832 │
│      5 │ 2015-01-01 │ 20000 │  452100.799785614 │ 30.1994 │ 15.0254 │   22.6050399892807 │
│      6 │ 2015-01-01 │ 20000 │ 512518.7999534607 │ 33.1905 │ 18.0135 │ 25.625939997673036 │
│      7 │ 2015-01-01 │ 20000 │ 572203.9001655579 │ 36.1746 │ 21.0066 │  28.61019500827789 │
│      8 │ 2015-01-01 │ 20000 │ 631927.8000221252 │ 39.1537 │ 24.0577 │  31.59639000110626 │
│      9 │ 2015-01-01 │ 20000 │ 691799.4997425079 │ 42.1388 │ 27.0538 │ 34.589974987125395 │
│     10 │ 2015-01-01 │ 20000 │ 751651.5999317169 │ 45.1999 │ 30.0089 │  37.58257999658585 │
└────────┴────────────┴───────┴───────────────────┴─────────┴─────────┴────────────────────┘

执行合并之后再查看:

┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2015-01-01 │ 30000 │ 317903.00001978874 │  36.332 │   6.042 │  21.18944000118971 │
│      2 │ 2015-01-01 │ 30000 │ 408138.09994745255 │ 42.1662 │ 12.0782 │  27.21191499698162 │
│      3 │ 2015-01-01 │ 30000 │ 498059.09978580475 │ 48.2884 │ 18.0864 │ 33.207119986438755 │
│      4 │ 2015-01-01 │ 30000 │  588204.5001115799 │ 54.3566 │ 24.0726 │  39.21697000808716 │
│      5 │ 2015-01-01 │ 30000 │  678047.3997526169 │ 60.3988 │ 30.1838 │ 45.199699985980985 │
│      6 │ 2015-01-01 │ 30000 │  768802.1998672485 │  66.381 │  36.027 │ 51.254279989051824 │
│      7 │ 2015-01-01 │ 30000 │  858436.6002216339 │ 72.1662 │ 42.0132 │  57.23346501388549 │
│      8 │ 2015-01-01 │ 30000 │  947894.3000774384 │ 78.3074 │ 48.1154 │ 63.193040006637574 │
│      9 │ 2015-01-01 │ 30000 │  1037599.199552536 │ 84.2776 │ 54.1076 │  69.16994496812819 │
│     10 │ 2015-01-01 │ 30000 │ 1127471.6997756958 │ 90.3508 │ 60.0868 │  75.16458998098373 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
┌─device─┬────────day─┬─count─┬────────────────sum─┬───────max─┬───────min─┬────────────────avg─┐
│      1 │ 2017-01-01 │ 20000 │ 212572.60002088547 │    36.349 │ 6.1289997 │ 21.257260002088547 │
│      2 │ 2017-01-01 │ 20000 │ 271855.09991693497 │   42.3352 │   12.1092 │ 27.185509991693497 │
│      3 │ 2017-01-01 │ 20000 │  331711.3998622894 │   48.3434 │   18.2584 │ 33.171139986228944 │
│      4 │ 2017-01-01 │ 20000 │ 392031.20006656647 │   54.3626 │   24.0956 │  39.20312000665665 │
│      5 │ 2017-01-01 │ 20000 │ 452123.30002212524 │   60.2598 │   30.0338 │ 45.212330002212525 │
│      6 │ 2017-01-01 │ 20000 │  512263.2000312805 │ 66.378006 │    36.088 │  51.22632000312805 │
│      7 │ 2017-01-01 │ 20000 │  572153.8002243042 │   72.3562 │   42.1032 │  57.21538002243042 │
│      8 │ 2017-01-01 │ 20000 │  631993.3000106812 │   78.2694 │   48.0434 │  63.19933000106812 │
│      9 │ 2017-01-01 │ 20000 │  691900.8997974396 │ 84.375595 │ 54.029602 │  69.19008997974396 │
│     10 │ 2017-01-01 │ 20000 │  752183.3999633789 │ 90.343796 │   60.0708 │  75.21833999633789 │
└────────┴────────────┴───────┴────────────────────┴───────────┴───────────┴────────────────────┘
查看物化视图存储

物化视图实际上也是占用了disk上的datapart,我们可以看一下私有表和mv表对应的datapart情况:

-- 查看当前私有表和mv表名称
show tables from test;

-- 查看系统表
SELECT
	database,
	table,
	partition,
    name,
    rows,
    bytes_on_disk,
    modification_time,
    min_date,
    max_date,
    engine
FROM system.parts
WHERE table in ('.inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e','tb_mtview_counter_daily_mv');

从结果可以知道,实际磁盘上存储数据的是隐藏的目标表,表名通常以.inner开头。

┌─database─┬─table──────────────────────────────────────────┬─partition─┬─name─────────┬─rows─┬─bytes_on_disk─┬───modification_time─┬───min_date─┬───max_date─┬─engine───────────┐
│ test     │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201501    │ 201501_2_2_1 │   10 │           754 │ 2021-11-22 09:59:28 │ 2015-01-01 │ 2015-01-01 │ SummingMergeTree │
│ test     │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201501    │ 201501_2_4_2 │   10 │           754 │ 2021-11-22 11:17:42 │ 2015-01-01 │ 2015-01-01 │ SummingMergeTree │
│ test     │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201501    │ 201501_4_4_0 │   10 │           754 │ 2021-11-22 11:13:50 │ 2015-01-01 │ 2015-01-01 │ SummingMergeTree │
│ test     │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201701    │ 201701_1_3_1 │   10 │           754 │ 2021-11-22 09:59:28 │ 2017-01-01 │ 2017-01-01 │ SummingMergeTree │
│ test     │ .inner_id.e402237d-2c7e-4706-91f7-bc558ff8392e │ 201701    │ 201701_1_3_2 │   10 │           754 │ 2021-11-22 11:17:42 │ 2017-01-01 │ 2017-01-01 │ SummingMergeTree │
└──────────┴────────────────────────────────────────────────┴───────────┴──────────────┴──────┴───────────────┴─────────────────────┴────────────┴────────────┴──────────────────┘
删除源表数据
-- 清空源表
truncate table test.tb_mtview_counter;

-- 查询源表
SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    sum(value) as sum,
    max(value) as max,
    min(value) as min,
    avg(value) as avg
from test.tb_mtview_counter 
GROUP BY device, day
ORDER BY day, device;

-- 查询物化视图
SELECT
  device, day, count, sum, max, min, avg
FROM test.tb_mtview_counter_daily_mv;

-- 物化视图删除
truncate table test.tb_mtview_counter_daily_mv;
drop table test.tb_mtview_counter_daily_mv;
drop view test.tb_mtview_counter_daily_mv;

上面的操作我们发现,删除源表数据,物化视图的数据不会受到影响。这说明物化视图不支持同步删除数据,在实际应用中,应避免在经常需要删除或更新的场景中使用物化视图。

创建物化视图方式二

在前一种物化视图中,我们看到,SummingMergeTree()对常规的除了求和以外的聚合函数支持并不好,在本例中,我们采用 聚合函数名称加-State后缀 的函数形式来获取聚合值,通过例子来了解具体是怎么一回事。

官网介绍:以-State后缀的函数总是返回AggregateFunction类型的数据的中间状态。对于SELECT而言,AggregateFunction类型总是以特定的二进制形式展现在所有的输出格式中。

参考:/docs/zh/sql-reference/data-types/aggregatefunction/

创建物化视图
-- 没有to table时必须有engine
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv2
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(day) ORDER BY (device, day)
AS SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    sum(value) as sum,
    maxState(value) AS max_value_state,
    minState(value) AS min_value_state,
    avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2016-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
源表新增数据
INSERT INTO test.tb_mtview_counter
SELECT
      toDateTime('2016-01-01 00:00:00') + toInt64(number/10) AS create_time,
      (number % 10) + 1 AS device,
      (device * 3) +  (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 100000;

-- 查询源表
SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    sum(value) as sum,
    max(value) as max,
    min(value) as min,
    avg(value) as avg
from test.tb_mtview_counter 
GROUP BY device, day
ORDER BY day, device;

-- 查询物化视图
SELECT
  day, device, 
  sum(count) AS count,
  sum(sum) as sum, 
  maxMerge(max_value_state) AS max,
  minMerge(min_value_state) AS min,
  avgMerge(avg_value_state) AS avg
FROM test.tb_mtview_counter_daily_mv2
GROUP BY device, day
ORDER BY day, device;

-- 显示当前数据库所有表
show tables from test;

二者结果均为:

┌────────day─┬─device─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 2016-01-01 │      1 │ 10000 │ 105755.79999136925 │   18.11 │   3.004 │ 10.575579999136925 │
│ 2016-01-01 │      2 │ 10000 │ 136052.39994812012 │ 21.1971 │  6.0411 │ 13.605239994812012 │
│ 2016-01-01 │      3 │ 10000 │ 165888.80001831055 │ 24.1952 │  9.0012 │ 16.588880001831054 │
│ 2016-01-01 │      4 │ 10000 │ 196001.40008163452 │ 27.1563 │ 12.0223 │ 19.600140008163454 │
│ 2016-01-01 │      5 │ 10000 │ 225971.79990959167 │ 30.0924 │ 15.0074 │ 22.597179990959166 │
│ 2016-01-01 │      6 │ 10000 │ 256052.80004692078 │ 33.1685 │ 18.0755 │ 25.605280004692077 │
│ 2016-01-01 │      7 │ 10000 │   286267.600069046 │ 36.1666 │ 21.0096 │   28.6267600069046 │
│ 2016-01-01 │      8 │ 10000 │  315901.2999572754 │ 39.1117 │ 24.0547 │  31.59012999572754 │
│ 2016-01-01 │      9 │ 10000 │ 345774.49993896484 │ 42.1678 │ 27.1348 │  34.57744999389649 │
│ 2016-01-01 │     10 │ 10000 │ 375901.99990463257 │ 45.1459 │ 30.0009 │ 37.590199990463255 │
└────────────┴────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
再次向源表插入数据

插入同分区不同日期的数据

-- 源表插入数据
INSERT INTO test.tb_mtview_counter
SELECT
      toDateTime('2016-01-02 00:00:00') + toInt64(number/10) AS create_time,
      (number % 10) + 1 AS device,
      (device * 3) +  (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 10000;

先看物化视图原始数据:

select * from test.tb_mtview_counter_daily_mv2 where device=1 ;
┌────────day─┬─device─┬─count─┬────────────────sum─┬─max_value_state─┬─min_value_state─┬─avg_value_state─┐
│ 2016-01-01 │      1 │ 10000 │ 105755.79999136925 │ H▒A             │ ▒A@@             │ ▒▒̼▒▒@▒N              │
└────────────┴────────┴───────┴────────────────────┴─────────────────┴─────────────────┴─────────────────┘
┌────────day─┬─device─┬─count─┬────────────────sum─┬─max_value_state─┬─min_value_state─┬─avg_value_state─┐
│ 2016-01-01 │      1 │  1000 │ 6048.3999972343445 │ ▒&A              │ ▒@@              │ 8ff▒▒@▒            │
└────────────┴────────┴───────┴────────────────────┴─────────────────┴─────────────────┴─────────────────┘

可以看出来数据并未按照分区进行合并,可以执行 :

-- 合并数据
optimize table test.tb_mtview_counter_daily_mv2 final;

再次查询查看结果,数据已经合并:

select * from test.tb_mtview_counter_daily_mv2 where device=1 ;
┌────────day─┬─device─┬─count─┬────────────────sum─┬─max_value_state─┬─min_value_state─┬─avg_value_state─┐
│ 2016-01-01 │      1 │ 11000 │ 111804.19998860359 │ H▒A             │ ▒A@@             │ @'3▒K▒@▒U          │
└────────────┴────────┴───────┴────────────────────┴─────────────────┴─────────────────┴─────────────────┘

再次查询物化视图聚合结果:

-- 查询物化视图
SELECT
  day, device, 
  sum(count) AS count,
  sum(sum) as sum, 
  maxMerge(max_value_state) AS max,
  minMerge(min_value_state) AS min,
  avgMerge(avg_value_state) AS avg
FROM test.tb_mtview_counter_daily_mv2
GROUP BY device, day
ORDER BY day, device;

结果正确:

┌────────day─┬─device─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│ 2016-01-01 │      1 │ 11000 │ 111804.19998860359 │   18.11 │   3.004 │ 10.164018180782145 │
│ 2016-01-01 │      2 │ 11000 │ 145148.39994430542 │ 21.1971 │  6.0411 │ 13.195309085845947 │
│ 2016-01-01 │      3 │ 11000 │ 178021.30000972748 │ 24.1952 │  9.0002 │  16.18375454633886 │
│ 2016-01-01 │      4 │ 11000 │  211105.6000881195 │ 27.1563 │ 12.0223 │ 19.191418189829047 │
│ 2016-01-01 │      5 │ 11000 │ 244015.59992313385 │ 30.0924 │ 15.0074 │ 22.183236356648532 │
│ 2016-01-01 │      6 │ 11000 │  277090.6000518799 │ 33.1685 │ 18.0155 │   25.1900545501709 │
│ 2016-01-01 │      7 │ 11000 │ 310359.90009117126 │ 36.1666 │ 21.0096 │  28.21453637192466 │
│ 2016-01-01 │      8 │ 11000 │  342972.3999404907 │ 39.1117 │ 24.0547 │ 31.179309085499156 │
│ 2016-01-01 │      9 │ 11000 │  375844.5999317169 │ 42.1678 │ 27.0128 │  34.16769090288336 │
│ 2016-01-01 │     10 │ 11000 │  409034.9999103546 │ 45.1459 │ 30.0009 │  37.18499999185042 │
└────────────┴────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘

创建物化视图方式三

在本例中,我们采用使用to 方法,将物化视图的实体表指定为特定名称的表。

创建实体目标表

在这之前,我们先查看一下物化视图的建表语句:

-- 查看建表语句
show create table tb_mtview_counter_daily_mv2;

其中物化视图建表语句:

CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv2
(
    `day` Date,
    `device` UInt32,
    `count` UInt64,
    `sum` Float64,
    `max_value_state` AggregateFunction(max, Float32),
    `min_value_state` AggregateFunction(min, Float32),
    `avg_value_state` AggregateFunction(avg, Float32)
)
ENGINE = SummingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (device, day)
SETTINGS index_granularity = 8192 AS
SELECT
    toStartOfMonth(create_time) AS day,
    device,
    count(*) AS count,
    sum(value) AS sum,
    maxState(value) AS max_value_state,
    minState(value) AS min_value_state,
    avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2016-01-01 00:00:00')
GROUP BY
    device,
    day
ORDER BY
    device ASC,
    day ASC

如果我们查看该物化视图对应的实体表,会发现建表语句其实就是物化视图前半部分的语句:

show create table `.inner_id.b72f9396-16ff-4b1b-8234-8ba8a1a8f620`;

我们直接使用该建表语句,将表名修改为tb_mtview_counter_daily,如下所示:

CREATE TABLE test.tb_mtview_counter_daily (
	day Date,
	device UInt32,
	count UInt64,
    sum Float64,
    max_value_state AggregateFunction(max, Float32),
    min_value_state AggregateFunction(min, Float32),
    avg_value_state AggregateFunction(avg, Float32)
)
ENGINE = SummingMergeTree()
PARTITION BY toYYYYMM(day)
ORDER BY (device, day);

AggregateFunction(name, types_of_arguments…):聚合函数的中间状态,可以通过聚合函数名称加-State后缀的形式得到它。访问该类型的最终状态数据时,以相同的聚合函数名加-Merge后缀的形式来得到最终状态数据。

参考:/docs/zh/sql-reference/data-types/aggregatefunction/

创建物化视图
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv3
TO test.tb_mtview_counter_daily
AS SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    sum(value) as sum,
    maxState(value) AS max_value_state,
    minState(value) AS min_value_state,
    avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2019-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
向源表导入数据
INSERT INTO test.tb_mtview_counter
SELECT
      toDateTime('2020-01-01 00:00:00') + toInt64(number/10) AS create_time,
      (number % 10) + 1 AS device,
      (device * 3) +  (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 10000;
查询物化视图和实体表
-- 查询物化视图实体表
SELECT
    device, day, 
    sum(count) AS count,
    sum(sum) as sum, 
    maxMerge(max_value_state) AS max,
    minMerge(min_value_state) AS min,
    avgMerge(avg_value_state) AS avg
FROM test.tb_mtview_counter_daily
GROUP BY device, day
ORDER BY day, device;

-- 查询物化视图
SELECT
    device, day, 
    sum(count) AS count,
    sum(sum) as sum, 
    maxMerge(max_value_state) AS max,
    minMerge(min_value_state) AS min,
    avgMerge(avg_value_state) AS avg
FROM test.tb_mtview_counter_daily_mv3
GROUP BY device, day
ORDER BY day, device;

结果二者查询的结果一致:

┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2020-01-01 │  1000 │  6121.699999332428 │   9.165 │   3.111 │  6.121699999332428 │
│      2 │ 2020-01-01 │  1000 │  9138.199996948242 │ 12.1741 │  6.0041 │  9.138199996948241 │
│      3 │ 2020-01-01 │  1000 │ 12132.700001716614 │ 15.1862 │  9.0142 │ 12.132700001716614 │
│      4 │ 2020-01-01 │  1000 │ 15055.100002288818 │ 18.1763 │ 12.0163 │ 15.055100002288818 │
│      5 │ 2020-01-01 │  1000 │ 18032.100012779236 │ 21.1454 │ 15.0524 │ 18.032100012779235 │
│      6 │ 2020-01-01 │  1000 │ 21153.600038528442 │ 24.1725 │ 18.1265 │  21.15360003852844 │
│      7 │ 2020-01-01 │  1000 │ 24141.300004959106 │ 27.1236 │ 21.1196 │ 24.141300004959106 │
│      8 │ 2020-01-01 │  1000 │ 27071.199975967407 │ 30.1157 │ 24.0987 │ 27.071199975967406 │
│      9 │ 2020-01-01 │  1000 │ 30123.699993133545 │ 33.1418 │ 27.0528 │ 30.123699993133545 │
│     10 │ 2020-01-01 │  1000 │ 33096.899978637695 │ 36.0949 │ 30.0509 │ 33.096899978637694 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘
导入数据到物化视图

导入到物化视图实体表:

INSERT INTO test.tb_mtview_counter_daily
SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) AS count,
    sum(value) as sum,
    maxState(value) AS max_value_state,
    minState(value) AS min_value_state,
    avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time < toDateTime('2017-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;

此时查询物化视图和实体表,都能得到结果:

┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2016-01-01 │ 11000 │ 111804.19998860359 │   18.11 │   3.004 │ 10.164018180782145 │
│      2 │ 2016-01-01 │ 11000 │ 145148.39994430542 │ 21.1971 │  6.0411 │ 13.195309085845947 │
│      3 │ 2016-01-01 │ 11000 │ 178021.30000972748 │ 24.1952 │  9.0002 │  16.18375454633886 │
│      4 │ 2016-01-01 │ 11000 │  211105.6000881195 │ 27.1563 │ 12.0223 │ 19.191418189829047 │
│      5 │ 2016-01-01 │ 11000 │ 244015.59992313385 │ 30.0924 │ 15.0074 │ 22.183236356648532 │
│      6 │ 2016-01-01 │ 11000 │  277090.6000518799 │ 33.1685 │ 18.0155 │   25.1900545501709 │
│      7 │ 2016-01-01 │ 11000 │ 310359.90009117126 │ 36.1666 │ 21.0096 │  28.21453637192466 │
│      8 │ 2016-01-01 │ 11000 │  342972.3999404907 │ 39.1117 │ 24.0547 │ 31.179309085499156 │
│      9 │ 2016-01-01 │ 11000 │  375844.5999317169 │ 42.1678 │ 27.0128 │  34.16769090288336 │
│     10 │ 2016-01-01 │ 11000 │  409034.9999103546 │ 45.1459 │ 30.0009 │  37.18499999185042 │
│      1 │ 2020-01-01 │  1000 │  6121.699999332428 │   9.165 │   3.111 │  6.121699999332428 │
│      2 │ 2020-01-01 │  1000 │  9138.199996948242 │ 12.1741 │  6.0041 │  9.138199996948241 │
│      3 │ 2020-01-01 │  1000 │ 12132.700001716614 │ 15.1862 │  9.0142 │ 12.132700001716614 │
│      4 │ 2020-01-01 │  1000 │ 15055.100002288818 │ 18.1763 │ 12.0163 │ 15.055100002288818 │
│      5 │ 2020-01-01 │  1000 │ 18032.100012779236 │ 21.1454 │ 15.0524 │ 18.032100012779235 │
│      6 │ 2020-01-01 │  1000 │ 21153.600038528442 │ 24.1725 │ 18.1265 │  21.15360003852844 │
│      7 │ 2020-01-01 │  1000 │ 24141.300004959106 │ 27.1236 │ 21.1196 │ 24.141300004959106 │
│      8 │ 2020-01-01 │  1000 │ 27071.199975967407 │ 30.1157 │ 24.0987 │ 27.071199975967406 │
│      9 │ 2020-01-01 │  1000 │ 30123.699993133545 │ 33.1418 │ 27.0528 │ 30.123699993133545 │
│     10 │ 2020-01-01 │  1000 │ 33096.899978637695 │ 36.0949 │ 30.0509 │ 33.096899978637694 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘

导入数据到物化视图:

INSERT INTO test.tb_mtview_counter_daily_mv3
SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) AS count,
    sum(value) as sum,
    maxState(value) AS max_value_state,
    minState(value) AS min_value_state,
    avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time < toDateTime('2020-01-01 00:00:00') and create_time >= toDateTime('2017-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;

再次查询物化视图和实体表,结果如下:

┌─device─┬────────day─┬─count─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2016-01-01 │ 11000 │ 111804.19998860359 │   18.11 │   3.004 │ 10.164018180782145 │
│      2 │ 2016-01-01 │ 11000 │ 145148.39994430542 │ 21.1971 │  6.0411 │ 13.195309085845947 │
│      3 │ 2016-01-01 │ 11000 │ 178021.30000972748 │ 24.1952 │  9.0002 │  16.18375454633886 │
│      4 │ 2016-01-01 │ 11000 │  211105.6000881195 │ 27.1563 │ 12.0223 │ 19.191418189829047 │
│      5 │ 2016-01-01 │ 11000 │ 244015.59992313385 │ 30.0924 │ 15.0074 │ 22.183236356648532 │
│      6 │ 2016-01-01 │ 11000 │  277090.6000518799 │ 33.1685 │ 18.0155 │   25.1900545501709 │
│      7 │ 2016-01-01 │ 11000 │ 310359.90009117126 │ 36.1666 │ 21.0096 │  28.21453637192466 │
│      8 │ 2016-01-01 │ 11000 │  342972.3999404907 │ 39.1117 │ 24.0547 │ 31.179309085499156 │
│      9 │ 2016-01-01 │ 11000 │  375844.5999317169 │ 42.1678 │ 27.0128 │  34.16769090288336 │
│     10 │ 2016-01-01 │ 11000 │  409034.9999103546 │ 45.1459 │ 30.0009 │  37.18499999185042 │
│      1 │ 2019-01-01 │  1000 │  6154.999999046326 │     9.1 │    3.01 │  6.154999999046326 │
│      2 │ 2019-01-01 │  1000 │  9118.399992465973 │ 12.1301 │  6.0011 │  9.118399992465973 │
│      3 │ 2019-01-01 │  1000 │ 12096.499998092651 │ 15.1812 │  9.1182 │ 12.096499998092652 │
│      4 │ 2019-01-01 │  1000 │  15040.29999256134 │ 18.1233 │ 12.0773 │  15.04029999256134 │
│      5 │ 2019-01-01 │  1000 │ 18125.899991989136 │ 21.1994 │ 15.0644 │ 18.125899991989137 │
│      6 │ 2019-01-01 │  1000 │  21185.09998703003 │ 24.1755 │ 18.0895 │  21.18509998703003 │
│      7 │ 2019-01-01 │  1000 │ 24034.600004196167 │ 27.1696 │ 21.0026 │ 24.034600004196168 │
│      8 │ 2019-01-01 │  1000 │ 27155.999990463257 │ 30.1957 │ 24.0117 │ 27.155999990463258 │
│      9 │ 2019-01-01 │  1000 │ 30086.500009536743 │ 33.1848 │ 27.0038 │ 30.086500009536742 │
│     10 │ 2019-01-01 │  1000 │  33094.30003166199 │ 36.1629 │ 30.0829 │  33.09430003166199 │
│      1 │ 2020-01-01 │  1000 │  6121.699999332428 │   9.165 │   3.111 │  6.121699999332428 │
│      2 │ 2020-01-01 │  1000 │  9138.199996948242 │ 12.1741 │  6.0041 │  9.138199996948241 │
│      3 │ 2020-01-01 │  1000 │ 12132.700001716614 │ 15.1862 │  9.0142 │ 12.132700001716614 │
│      4 │ 2020-01-01 │  1000 │ 15055.100002288818 │ 18.1763 │ 12.0163 │ 15.055100002288818 │
│      5 │ 2020-01-01 │  1000 │ 18032.100012779236 │ 21.1454 │ 15.0524 │ 18.032100012779235 │
│      6 │ 2020-01-01 │  1000 │ 21153.600038528442 │ 24.1725 │ 18.1265 │  21.15360003852844 │
│      7 │ 2020-01-01 │  1000 │ 24141.300004959106 │ 27.1236 │ 21.1196 │ 24.141300004959106 │
│      8 │ 2020-01-01 │  1000 │ 27071.199975967407 │ 30.1157 │ 24.0987 │ 27.071199975967406 │
│      9 │ 2020-01-01 │  1000 │ 30123.699993133545 │ 33.1418 │ 27.0528 │ 30.123699993133545 │
│     10 │ 2020-01-01 │  1000 │ 33096.899978637695 │ 36.0949 │ 30.0509 │ 33.096899978637694 │
└────────┴────────────┴───────┴────────────────────┴─────────┴─────────┴────────────────────┘

我们会发现,不论是往物化视图还是物化视图实体表插入数据,都可以使数据进入到物化视图实体表中,查询物化视图都可以得到新增的数据。

查看物化视图存储

查看系统表信息:

SELECT
	database,
	table,
	partition,
    name,
    rows,
    bytes_on_disk,
    modification_time,
    min_date,
    max_date,
    engine
FROM system.parts
WHERE table in ('tb_mtview_counter_daily_mv3','tb_mtview_counter_daily');

结果如下:

┌─database─┬─table───────────────────┬─partition─┬─name─────────┬─rows─┬─bytes_on_disk─┬───modification_time─┬───min_date─┬───max_date─┬─engine───────────┐
│ test     │ tb_mtview_counter_daily │ 201601    │ 201601_2_2_0 │   10 │           786 │ 2021-11-23 02:47:05 │ 2016-01-01 │ 2016-01-01 │ SummingMergeTree │
│ test     │ tb_mtview_counter_daily │ 201901    │ 201901_3_3_0 │   10 │           776 │ 2021-11-23 02:53:26 │ 2019-01-01 │ 2019-01-01 │ SummingMergeTree │
│ test     │ tb_mtview_counter_daily │ 202001    │ 202001_1_1_0 │   10 │           778 │ 2021-11-23 02:44:33 │ 2020-01-01 │ 2020-01-01 │ SummingMergeTree │
└──────────┴─────────────────────────┴───────────┴──────────────┴──────┴───────────────┴─────────────────────┴────────────┴────────────┴──────────────────┘

创建物化视图方式四

AggregatingMergeTree引擎创建物化视图

参考:/docs/en/engines/table-engines/mergetree-family/aggregatingmergetree/#example-of-an-aggregated-materialized-view

使用AggregatingMergeTree创建物化视图时,规则其实和前面类似,只不过这里所有的聚合函数都需要加上State后缀,而在使用时需要加上Merge后缀。

-- 没有to table时必须有engine
CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv4
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(day) ORDER BY (device, day)
AS SELECT
    toStartOfMonth(create_time) as day,
    device,
    countState(value) as count,
    uniqState(value) as count_uniq,
    sumState(value) as sum,
    maxState(value) AS max_value_state,
    minState(value) AS min_value_state,
    avgState(value) AS avg_value_state
FROM test.tb_mtview_counter
WHERE create_time >= toDate('2021-01-01 00:00:00')
GROUP BY device, day
ORDER BY device, day;
向源表导入数据
INSERT INTO test.tb_mtview_counter
SELECT
      toDateTime('2021-01-01 00:00:00') + toInt64(number/10) AS create_time,
      (number % 10) + 1 AS device,
      (device * 3) +  (number/10000) + (rand() % 53) * 0.1 AS value
FROM system.numbers LIMIT 10000;
查询物化视图
-- 查询物化视图
SELECT
    device, day, 
    countMerge(count) as count,
    uniqMerge(count_uniq) as count_uniq,
    sumMerge(sum) as sum, 
    maxMerge(max_value_state) as max,
    minMerge(min_value_state) as min,
    avgMerge(avg_value_state) as avg
FROM test.tb_mtview_counter_daily_mv4
GROUP BY device, day
ORDER BY day, device;

-- 查询源表
SELECT
    toStartOfMonth(create_time) as day,
    device,
    count(*) as count,
    uniq(value) as count_uniq,
    sum(value) as sum,
    max(value) as max,
    min(value) as min,
    avg(value) as avg
from test.tb_mtview_counter 
WHERE create_time >= toDate('2021-01-01 00:00:00')
GROUP BY device, day
ORDER BY day, device;

结果如下:

┌─device─┬────────day─┬─count─┬─count_uniq─┬────────────────sum─┬─────max─┬─────min─┬────────────────avg─┐
│      1 │ 2021-01-01 │  1000 │        927 │  6119.699991703033 │   9.194 │   3.028 │  6.119699991703033 │
│      2 │ 2021-01-01 │  1000 │        908 │  9103.499992847443 │ 12.1681 │  6.0101 │  9.103499992847443 │
│      3 │ 2021-01-01 │  1000 │        932 │ 12109.299996376038 │ 15.1942 │  9.0042 │ 12.109299996376038 │
│      4 │ 2021-01-01 │  1000 │        919 │ 15160.899991989136 │ 18.1823 │ 12.0713 │ 15.160899991989135 │
│      5 │ 2021-01-01 │  1000 │        918 │  18105.59998035431 │ 21.1324 │ 15.1554 │  18.10559998035431 │
│      6 │ 2021-01-01 │  1000 │        920 │ 21167.400003433228 │ 24.1555 │ 18.0185 │ 21.167400003433226 │
│      7 │ 2021-01-01 │  1000 │        907 │  24069.20000076294 │ 27.0776 │ 21.0096 │  24.06920000076294 │
│      8 │ 2021-01-01 │  1000 │        925 │  27040.30001449585 │ 30.0767 │ 24.0157 │  27.04030001449585 │
│      9 │ 2021-01-01 │  1000 │        928 │ 30125.100023269653 │ 33.1758 │ 27.0588 │ 30.125100023269653 │
│     10 │ 2021-01-01 │  1000 │        926 │  33178.39996910095 │ 36.1689 │ 30.0159 │  33.17839996910095 │
└────────┴────────────┴───────┴────────────┴────────────────────┴─────────┴─────────┴────────────────────┘

假设我们想知道如何创建这种物化视图的显示表,使用 show create table test.tb_mtview_counter_daily_mv4即可得到:

CREATE MATERIALIZED VIEW test.tb_mtview_counter_daily_mv4
(
    `day` Date,
    `device` UInt32,
    `count` AggregateFunction(count, Float32),
    `count_uniq` AggregateFunction(uniq, Float32),
    `sum` AggregateFunction(sum, Float32),
    `max_value_state` AggregateFunction(max, Float32),
    `min_value_state` AggregateFunction(min, Float32),
    `avg_value_state` AggregateFunction(avg, Float32)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (device, day)
SETTINGS index_granularity = 8192

总结

  1. 源表不可有nullable列,空值尽量用默认字符串或者数值填充,因为在order by中无法指定nullable列。
  2. 在创建 MV 表时不要使用 POPULATE 关键字,而是在 MV 表建好之后将数据手动导入 MV 表,clickhouse 官方并不推荐使用populated,因为在创建物化视图的过程中同时写入的数据不能被插入物化视图。
  3. 创建物化视图时,建议指定条件过滤掉历史数据,然后手动将历史数据导入物化视图。如果不指定的话,容易导致无法采用统一条件将历史数据导入物化视图,从而出现数据重复导入的情况。建表中的过滤条件可以显示提醒用户导入历史数据的条件。
  4. 创建物化视图时,尽量采用MergeTree家族中的聚合引擎,如SummingMergeTree, AggregatingMergeTree。创建物化视图时,聚合统计列,采用State函数,如:maxState, minState, avgState。查询物化视图时,聚合列采用Merge函数获取值,如:maxMerge, minMerge, avgMerge。
  5. 在使用 MV 的聚合引擎时,也需要按照聚合查询来写sql,因为聚合时机不可控
  6. 物化视图不支持同步删除,若源表的数据不存在(删除了)则物化视图的数据仍然保留
  7. 在创建 MV 表时,一定要使用 TO 关键字为 MV 表指定存储位置,否则不支持 嵌套视图(多个物化视图继续聚合一个新的视图)
  8. 在创建 MV 表时如果用到了多表联查,不能为连接表指定别名,如果多个连接表中存在同名字段,在连接表的查询语句中使用 AS 将字段名区分开
  9. 在创建 MV 表时如果用到了多表联查,只有当第一个查询的表有数据插入时,这个 MV 才会被触发

参考

/p/390560886

/p/362809994