SQL: grouping two tables as one with JOIN, UNION, and then?

Time: 2022-09-05 15:34:01

I have 5 tables:

customer id - name

p_orders id - id_customer - code - date

p_items id - id_order - description - price

and h_orders and h_items, which are exact copies of p_orders and p_items.

When the p_ tables grow to a large number of rows, I move the oldest rows to the h_ tables, which serve as history.
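The move described above could be sketched as a single transaction. This is only an illustration, assuming PostgreSQL and the table definitions posted further down in the question; the cutoff date is a made-up placeholder, not something stated in the question.

```sql
-- Hypothetical archival step; the cutoff date is an assumption for illustration.
BEGIN;

-- Copy the old orders first, because h_items.id_order references h_orders(id).
INSERT INTO h_orders (id, id_customer, date, code)
SELECT o.id, o.id_customer, o.date, o.code
FROM p_orders o
WHERE o.date < DATE '2009-01-01';

-- Copy the items that belong to those orders.
INSERT INTO h_items (id, id_order, descr, price)
SELECT i.id, i.id_order, i.descr, i.price
FROM p_items i
     INNER JOIN p_orders o ON o.id = i.id_order
WHERE o.date < DATE '2009-01-01';

-- Remove the originals; the ON DELETE CASCADE foreign key on p_items
-- removes the matching items as well.
DELETE FROM p_orders
WHERE date < DATE '2009-01-01';

COMMIT;
```

If rows are moved this way, each order lives in exactly one of the two sets, so the p_ and h_ rows can be combined without worrying about the same order appearing twice.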

So my problem is: how do I retrieve data from both the p_ and the h_ tables, treating them as one single table?

For example, I want to retrieve the number of orders for each customer and the total price of all of that customer's orders, and I use this query:

SELECT
    customer.id,
    customer.name,
    count(DISTINCT p_orders.id) AS num_orders,
    sum(p_items.price) AS total_money
FROM
    customer
    INNER JOIN p_orders ON p_orders.id_customer = customer.id
    INNER JOIN p_items ON p_items.id_order = p_orders.id
GROUP BY
    customer.id,
    customer.name,
    p_orders.id_customer
ORDER BY
    customer.id

It works for just one set of tables (p_ or h_), but I want both.

I tried to use a UNION:

(
    SELECT
        customer.id,
        customer.name,
        count(DISTINCT p_orders.id) AS num_orders,
        sum(p_items.price) AS total_money
    FROM
        customer
        INNER JOIN p_orders ON p_orders.id_customer = customer.id
        INNER JOIN p_items ON p_items.id_order = p_orders.id
    GROUP BY
        customer.id,
        customer.name,
        p_orders.id_customer
)
UNION
(
    SELECT
        customer.id,
        customer.name,
        count(DISTINCT h_orders.id) AS num_orders,
        sum(h_items.price) AS total_money
    FROM
        customer
        INNER JOIN h_orders ON h_orders.id_customer = customer.id
        INNER JOIN h_items ON h_items.id_order = h_orders.id
    GROUP BY
        customer.id,
        customer.name,
        h_orders.id_customer
)
ORDER BY id ASC

This one works, but if a customer has orders in both the p_ tables and the h_ tables, I get 2 rows for that customer, with 2 different num_orders and total_money values (one row coming from the p_ tables and one from the h_ tables).

I tried to add a GROUP BY id outside the union:

(
    --SELECT 2
)
UNION
(
    --SELECT 1
)
GROUP BY id
ORDER BY id ASC

but the query fails with ERROR: syntax error at or near "GROUP" at character 948. It seems that GROUP BY cannot be used that way.

Any suggestions?

EDIT:

For uriDium: yes, all the tables have the id column as primary key, and the referencing fields (e.g. p_orders.id_customer) are foreign keys too. Here is the test db structure dump (I added some indexes and foreign keys after creating the tables, but I don't think that matters):

CREATE TABLE customer (
    id serial NOT NULL,
    name character(50)
);
CREATE TABLE p_orders (
    id serial NOT NULL,
    id_customer integer NOT NULL,
    date date DEFAULT now(),
    code character(5)
);
CREATE TABLE p_items (
    id serial NOT NULL,
    id_order integer NOT NULL,
    descr character(250),
    price money
);
CREATE TABLE h_orders (
    id integer NOT NULL,
    id_customer integer NOT NULL,
    date date,
    code character(5)
);
CREATE TABLE h_items (
    id integer NOT NULL,
    id_order integer NOT NULL,
    descr character(250),
    price money
);
CREATE UNIQUE INDEX id_h_orders ON h_orders USING btree (id);
CREATE INDEX id_h_o_c ON h_orders USING btree (id_customer);
CREATE UNIQUE INDEX id_items_h ON h_items USING btree (id);
CREATE INDEX id_ordinr_dsve ON h_items USING btree (id_order);

ALTER TABLE ONLY customer
    ADD CONSTRAINT customer_pkey PRIMARY KEY (id);
ALTER TABLE ONLY p_orders
    ADD CONSTRAINT p_orders_pkey PRIMARY KEY (id);
ALTER TABLE ONLY p_items
    ADD CONSTRAINT p_items_pkey PRIMARY KEY (id);
ALTER TABLE ONLY stats
    ADD CONSTRAINT stats_pkey PRIMARY KEY (id);
ALTER TABLE ONLY p_orders
    ADD CONSTRAINT "$1" FOREIGN KEY (id_customer) REFERENCES customer(id) ON DELETE CASCADE;
ALTER TABLE ONLY p_items
    ADD CONSTRAINT "$1" FOREIGN KEY (id_order) REFERENCES p_orders(id) ON DELETE CASCADE;
ALTER TABLE ONLY h_orders
    ADD CONSTRAINT "$1" FOREIGN KEY (id_customer) REFERENCES customer(id) ON DELETE CASCADE;
ALTER TABLE ONLY h_items
    ADD CONSTRAINT "$1" FOREIGN KEY (id_order) REFERENCES h_orders(id) ON DELETE CASCADE;

6 answers

#1


4  

You should probably create views over the two tables:

CREATE VIEW All_Orders
AS
     SELECT
          id,
          id_customer,
          code,
          date,
          'H' AS order_type
     FROM
          h_orders
     UNION ALL
     SELECT
          id,
          id_customer,
          code,
          date,
          'P' AS order_type
     FROM
          p_orders;

CREATE VIEW All_Order_Items  -- A table name of "items" is pretty bad in my opinion
AS
     SELECT
          id,
          id_order,
          descr,  -- the posted DDL names this column descr, not description
          price,
          'H' AS order_item_type
     FROM
          h_items
     UNION ALL
     SELECT
          id,
          id_order,
          descr,
          price,
          'P' AS order_item_type
     FROM
          p_items;

Now you can just join to those views. I included the types (P and H) so that you know what the "id" column now refers to. If the ids in your two tables ("h" and "p") can have duplicates, then you will have to join the Orders table right in the All_Order_Items view; otherwise you will have a lot of trouble joining between the two views. Hopefully your id columns are intelligently designed and not just auto-incrementing or identity columns.
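As a rough sketch of how the original per-customer query might look on top of these views (untested, assuming PostgreSQL and the view and column names from this answer; the aliases c, o and i are only for illustration):

```sql
-- Sketch only: per-customer order count and total over both p_ and h_ data.
-- Assumes an order id never appears in both p_orders and h_orders,
-- which holds if rows are moved rather than copied into the h_ tables.
SELECT
    c.id,
    c.name,
    count(DISTINCT o.id) AS num_orders,
    sum(i.price) AS total_money
FROM
    customer c
    INNER JOIN All_Orders o ON o.id_customer = c.id
    INNER JOIN All_Order_Items i
        ON i.id_order = o.id
       AND i.order_item_type = o.order_type  -- keep P items with P orders, H with H
GROUP BY
    c.id,
    c.name
ORDER BY
    c.id;
```

Matching on the type column as well as the id keeps an item from being joined to an order in the other set. If ids could repeat across the two sets, count(DISTINCT o.id) would undercount and the type would have to be part of what is counted.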

#2


2  

You could try this:

SELECT tbl.id,
       tbl.name,
       sum(tbl.num_orders) AS num_orders,
       sum(tbl.total_money) AS total_money
FROM (
      SELECT customer.id,
             customer.name,
             count(DISTINCT p_orders.id) AS num_orders,
             sum(p_items.price) AS total_money
      FROM customer
            INNER JOIN p_orders
                ON p_orders.id_customer = customer.id
            INNER JOIN p_items
                ON p_items.id_order = p_orders.id
      GROUP BY customer.id, customer.name, p_orders.id_customer

      -- UNION ALL rather than UNION: if a customer's p_ and h_ subtotals
      -- happen to be identical, UNION would collapse them into one row
      UNION ALL

      SELECT customer.id,
             customer.name,
             count(DISTINCT h_orders.id) AS num_orders,
             sum(h_items.price) AS total_money
      FROM  customer
             INNER JOIN h_orders
                 ON h_orders.id_customer = customer.id
             INNER JOIN h_items
                 ON h_items.id_order = h_orders.id
      GROUP BY customer.id, customer.name, h_orders.id_customer
    ) tbl
 GROUP BY tbl.id, tbl.name
 ORDER BY tbl.id ASC

#3


1  

Create a view with the union of the two queries but without the aggregate functions. Use UNION ALL, since the same record is not in both tables and you don't need the server to waste time checking for duplicates. You will probably want to access both tables together in other queries as well.

Then write your query using the view.

The view code would be something like this (you may want other fields for other purposes as well):

CREATE VIEW customerOrders
AS
SELECT      customer.id AS CustomerID, customer.name, p_orders.id AS OrderID, p_items.price AS price
FROM        customer
INNER JOIN  p_orders ON p_orders.id_customer = customer.id
INNER JOIN  p_items ON p_items.id_order = p_orders.id
UNION ALL
-- column names come from the first SELECT, so no aliases are needed here
SELECT      customer.id, customer.name, h_orders.id, h_items.price
FROM        customer
INNER JOIN  h_orders ON h_orders.id_customer = customer.id
INNER JOIN  h_items ON h_items.id_order = h_orders.id;

Then the call for your query would be something like this (none of this is tested, so it may need adjustment):

-- the view exposes the column as name; customer.name is not visible here
SELECT    CustomerID,    name,    count(DISTINCT OrderID) AS num_orders,
          sum(price) AS total_money
FROM      customerOrders
GROUP BY  CustomerID,    name
ORDER BY  CustomerID

#4


0  

As far as I know, SQL Server should automatically eliminate duplicates when you use UNION, whereas UNION ALL keeps them. I would imagine that SQL Server uses the primary keys to work out what counts as a duplicate. Are the primary keys on these tables of the same data type, and is ID 1 in your p table also ID 1 in your h table?
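For what it's worth, in standard SQL (and in PostgreSQL, which the question's DDL and error message suggest is the database in use) UNION compares entire result rows rather than primary keys. A small self-contained illustration using literal values instead of the thread's tables:

```sql
-- UNION removes rows only when every column matches.
SELECT 1 AS id, 'a' AS val
UNION
SELECT 1, 'b';        -- 2 rows: same id, different val

SELECT 1 AS id, 'a' AS val
UNION
SELECT 1, 'a';        -- 1 row: the two rows are identical

-- UNION ALL never removes anything.
SELECT 1 AS id, 'a' AS val
UNION ALL
SELECT 1, 'a';        -- 2 rows: duplicates are kept
```

This matters for the per-customer subtotals in this thread: if the p_ and h_ subqueries happened to return identical rows for a customer, UNION would keep only one of them.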

#5


0  

The easiest way to do what you are looking for would be to create views (say "a_orders" and "a_items"). The views would just be defined like:

SELECT * FROM p_orders
UNION
SELECT * FROM h_orders

If you delete rows from p_orders as you insert them into h_orders (so that a given order is never in both tables), it would be quite a bit more efficient to use UNION ALL instead of UNION.

#6


0  

Thanks for all the replies, guys..

Both the 'views way' and the 'subquery way' by Jimmie R. Houts work perfectly; maybe the views are just more convenient to use. Both should take about the same time (or not?).

So I'll mark the first answer, the one about the views, as the best answer.

Anyway, if I may ask: are the structure and indexes I used good, or could they be optimized?
