Condense multiple rows to single row with counts based on unique values in SQLite

Time: 2021-05-09 13:21:54

I am trying to condense a table which contains multiple rows per event to a smaller table which contains counts of key sub-events within each event. Events are defined based on unique combinations across columns.


As a specific example, say I have the following data involving customer visits to various stores on different dates with different items purchased:


cust date   store   item_type
a    1     Main St  1
a    1     Main St  2
a    1     Main St  2
a    1     Main St  2
b    1     Main St  1
b    1     Main St  2
b    1     Main St  2
c    1     Main St  1
d    2     Elm St   1
d    2     Elm St   3
e    2     Main St  1
e    2     Main St  1
a    3     Main St  1
a    3     Main St  2

I would like to restructure the data to a table that contains a single line per customer visit on a given day, with appropriate counts. I am trying to understand how to use SQLite to condense this to:


Index   cust   date   store     n_items    item1    item2    item3     item4
1         a      1   Main St        4        1        3        0        0
2         b      1   Main St        3        1        2        0        0
3         c      1   Main St        1        1        0        0        0
4         d      2   Elm St         2        1        0        1        0
5         e      2   Main St        2        2        0        0        0
6         a      3   Main St        2        1        1        0        0

I can do this in Excel for this trivial example (begin with sumproduct(customer * date) as suggested here, followed by a cumulative sum on this column to generate Index, then countif and countifs to generate the desired counts).


Excel is poorly suited to doing this for thousands of rows, so I am looking for a solution using SQLite.


Sadly, my SQLite kung-fu is weak.


I think this is the closest I have found, but I am having trouble understanding exactly how to adapt it.


When I tried a more basic approach to begin by generating a unique index:


CREATE UNIQUE INDEX ui ON t(cust, date);

I get:

Error: indexed columns are not unique

I would greatly appreciate any help with where to start. Many thanks in advance!


1 solution



To create one result record for each unique combination of column values, use GROUP BY.


The number of records in the group is available with COUNT.
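
As a minimal sketch of GROUP BY with COUNT, the snippet below loads a few of the question's rows into an in-memory SQLite database through Python's sqlite3 module and returns one row per visit. The table name t comes from the question's CREATE UNIQUE INDEX attempt; the column types and the ORDER BY clause are assumptions made for this illustration.

```python
import sqlite3

# Hypothetical in-memory table t holding a subset of the question's sample rows.
# The column types are assumptions; the question does not state them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cust TEXT, date INTEGER, store TEXT, item_type INTEGER)")
rows = [
    ("a", 1, "Main St", 1),
    ("a", 1, "Main St", 2),
    ("a", 1, "Main St", 2),
    ("a", 1, "Main St", 2),
    ("c", 1, "Main St", 1),
    ("a", 3, "Main St", 1),
    ("a", 3, "Main St", 2),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# One output row per (cust, date, store) combination; COUNT(*) is the group size.
visit_counts = conn.execute(
    """
    SELECT cust, date, store, COUNT(*) AS n_items
    FROM t
    GROUP BY cust, date, store
    ORDER BY date, cust
    """
).fetchall()

for row in visit_counts:
    print(row)
```

Customer a appears twice in the result because the visits on dates 1 and 3 fall into different groups.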


To count specific item types, use a boolean expression like item_type=x, which returns 0 or 1, and sum this over all records in the group:


SELECT cust,
       date,
       store,
       COUNT(*) AS n_items,
       SUM(item_type = 1) AS item1,
       SUM(item_type = 2) AS item2,
       SUM(item_type = 3) AS item3,
       SUM(item_type = 4) AS item4
FROM t
GROUP BY cust,
         date,
         store;
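
To check the approach end to end, here is a hedged sketch that runs the grouped query against the full sample data from the question using Python's sqlite3 module. The column types, the ORDER BY clause, and the use of Python's enumerate to produce the Index column are assumptions of this sketch, not part of the answer.

```python
import sqlite3

# Sample data from the question, loaded into a hypothetical in-memory table t.
rows = [
    ("a", 1, "Main St", 1), ("a", 1, "Main St", 2), ("a", 1, "Main St", 2),
    ("a", 1, "Main St", 2), ("b", 1, "Main St", 1), ("b", 1, "Main St", 2),
    ("b", 1, "Main St", 2), ("c", 1, "Main St", 1), ("d", 2, "Elm St", 1),
    ("d", 2, "Elm St", 3), ("e", 2, "Main St", 1), ("e", 2, "Main St", 1),
    ("a", 3, "Main St", 1), ("a", 3, "Main St", 2),
]
conn = sqlite3.connect(":memory:")
# Column types are assumptions; the question does not state them.
conn.execute("CREATE TABLE t (cust TEXT, date INTEGER, store TEXT, item_type INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# In SQLite a comparison evaluates to 1 (true) or 0 (false), so it can be summed.
flags = conn.execute("SELECT 2 = 2, 2 = 3").fetchone()

query = """
SELECT cust, date, store,
       COUNT(*) AS n_items,
       SUM(item_type = 1) AS item1,
       SUM(item_type = 2) AS item2,
       SUM(item_type = 3) AS item3,
       SUM(item_type = 4) AS item4
FROM t
GROUP BY cust, date, store
ORDER BY date, cust
"""

# Prepend a running Index (starting at 1) to each grouped row on the Python side.
condensed = [(i, *row) for i, row in enumerate(conn.execute(query), start=1)]

for row in condensed:
    print(row)
```

With the ordering assumed here, the rows come out in the same sequence as the desired table in the question. The Index could also be generated inside SQLite with a window function such as ROW_NUMBER(), which needs SQLite 3.25 or later.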


