MySQL COUNT grouped by 4 columns

Basically, this query returns the count() for each group, like this:

Postal code, Office, Device type, Unique device identifier, Number of cases per unique identifier
0001,1002,ORDENADOR,ORD1234,5 INCIDENCIAS
0001,1002,ORDENADOR,ORD3333,2 INCIDENCIAS
0001,1002,ORDENADOR,ORD2222,1 INCIDENCIAS
0001,1002,TECLADO,TECYYYY,2 INCIDENCIAS
0001,1002,TECLADO,TECXXXX,4 INCIDENCIAS
0001,1002,PANTALLA,PAN0000,1 INCIDENCIAS

Select 
        d.dt as 'Direccion Territorial',
        t.centro as 'Oficina',
        nombrelargo,
        if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina) as 'Oficina2',
        p.Tipo_Disp as 'Dispositivo',
        count(p.Tipo_Disp) as 'Nº de partes/Etiqueta',
        p.Etq_Amarilla as 'Etiqueta'
        -- wanted: count of all cases for that device type (TOTAL INC DE ESE DISPOSITIVO)
        -- wanted: count of all cases for that office (TOTAL INC DE ESA OFICINA)

from textcentro t,dtdz d,ppp p
        where 
                t.jcentro03=d.dt and
                t.organizativo='OFIC./AGEN./DELEG.' and
                t.situacion='ABIERTO' and
                t.sociedad='0900' and
                (p.Estado != "Abierto" and p.Estado!= 'Planificado') and
                (month(p.Fecha_y_hora_de_creacion) = 8 and year(Fecha_y_hora_de_creacion)=2013) and
                t.centro=if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina)

                GROUP BY d.dt,t.centro,p.Tipo_Disp,p.Etq_Amarilla

The grouping:

1 - d.dt ----> Postal code

2 - t.centro ----> Office code

3 - p.Tipo_Disp ----> Device type

4 - p.Etq_Amarilla ----> Unique identifier for this device

The tables are:

1 - textcentro ----> Specific information about the offices

2 - dtdz ----> Auxiliary table used to find the postal code of the office

3 - ppp ----> Table that holds all the cases

Now I also want the total number of cases per device type and per office. The result should look like this:

Postal code, Office, Device type, Unique device identifier, Cases per unique identifier, Cases per device type, Cases per office

0001,1002,ORDENADOR,ORD1234,5 INCIDENCIAS,8 INC,15
0001,1002,ORDENADOR,ORD3333,2 INCIDENCIAS,8 INC,15
0001,1002,ORDENADOR,ORD2222,1 INCIDENCIAS,8 INC,15
0001,1002,TECLADO,TECYYYY,2 INCIDENCIAS,6 INC,15
0001,1002,TECLADO,TECXXXX,4 INCIDENCIAS,6 INC,15
0001,1002,PANTALLA,PAN0000,1 INCIDENCIAS,1 INC,15

I have tried the SUM and COUNT functions, but I cannot find a way to get the last two columns. I think I could get these numbers with a subquery in the select list, but performance would suffer too much.

Here is the example, but running it keeps me waiting around 12-13 minutes.

Select 
        d.dt as 'Direccion Territorial',
        t.centro as 'Oficina',
        nombrelargo,
        if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina) as 'Oficina2',
        p.Tipo_Disp as 'Dispositivo',
        count(p.Tipo_Disp) as 'Nº de partes/Etiqueta',
        p.Etq_Amarilla as 'Etiqueta',
        (Select count(*) from People_DB pp where pp.Oficina=p.Oficina and pp.Tipo_Disp=Dispositivo and (month(pp.Fecha_y_hora_de_creacion) = 8 and year(pp.Fecha_y_hora_de_creacion)=2013) and (pp.Estado != "Abierto" and pp.Estado!= 'Planificado') )

from textcentro t,dtdz d,ppp p
        where 
                t.jcentro03=d.dt and
                t.organizativo='OFIC./AGEN./DELEG.' and
                t.situacion='ABIERTO' and
                t.sociedad='0900' and
                (p.Estado != "Abierto" and p.Estado!= 'Planificado') and
                (month(p.Fecha_y_hora_de_creacion) = 8 and year(Fecha_y_hora_de_creacion)=2013) and
                t.centro=if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina)

                GROUP BY d.dt,t.centro,p.Tipo_Disp,p.Etq_Amarilla

Sorry for my poor English; this post may be hard to follow.

1 solution

#1

May I make some suggestions:

First, your choice of tables looks like this:

 from textcentro t,dtdz d,ppp p

For the sake of clarity, I suggest you use explicit JOIN clauses instead. For example:

 FROM textcentro AS t
 JOIN dtdz       AS d      ON t.jcentro03=d.dt
 JOIN ppp        AS p      ON  XXXXXXXXX

You may want to use LEFT JOIN where, for example, there might be no corresponding row in dtdz to go with a row in textcentro.
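
One way to find out whether you need LEFT JOIN is to look for offices that have no partner row. This is only a sketch built from the table and column names in your question; it lists the textcentro rows that an inner JOIN to dtdz would silently drop.

```sql
-- Offices in textcentro with no matching row in dtdz.
-- If this returns rows, an inner JOIN would lose those offices.
SELECT t.centro,
       t.jcentro03
  FROM textcentro AS t
  LEFT JOIN dtdz  AS d ON t.jcentro03 = d.dt
 WHERE d.dt IS NULL;
```

If the result is empty, a plain JOIN and a LEFT JOIN give the same rows for this pair of tables.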

I cannot tell from your sample query what the ON constraint for the JOIN to ppp should be. I have shown that with XXXXXXXXX in my code above. I think your condition is this:

 t.centro=if(length(p.Oficina)=3,concat('0',p.Oficina),p.Oficina)

but that is a nasty expression to compute, and therefore very slow. It looks like your t.centro is a char column containing an integer with leading zeros, and your p.Oficina is the same but without the leading zeros. Instead of adding the leading zero to p.Oficina, try stripping it from the t.centro column.

 CAST(t.centro AS UNSIGNED) = p.Oficina

Keep in mind that without a simple JOIN constraint you get a combinatorial explosion: m times n rows. This makes things slow and possibly wrong.

So, your table selection becomes:

 FROM textcentro AS t
 JOIN dtdz       AS d      ON t.jcentro03=d.dt
 JOIN ppp        AS p      ON CAST(t.centro AS UNSIGNED) = p.Oficina

Second, your date/time search expressions are not built for speed. Try this:

      p.Fecha_y_hora_de_creacion >= '2013-08-01'
  AND p.Fecha_y_hora_de_creacion <  '2013-08-01' + INTERVAL 1 MONTH

If you have an index on your p.Fecha... column, this will permit a range-scan search on that column.

Third, this item in your SELECT list is killing performance.

(Select count(*) 
   from People_DB pp 
  where pp.Oficina=p.Oficina 
    and pp.Tipo_Disp=Dispositivo
    and (month(pp.Fecha_y_hora_de_creacion) = 8 
    and year(pp.Fecha_y_hora_de_creacion)=2013) 
    and (pp.Estado != "Abierto" and pp.Estado!= 'Planificado') )

Refactor this to be a virtual table in your JOIN list, as follows.

 (SELECT COUNT(*) AS NumPersonas,
         Oficina,
         Tipo_Disp
    FROM People_DB
   WHERE Fecha_y_hora_de_creacion >= '2013-08-01'
     AND Fecha_y_hora_de_creacion <  '2013-08-01' + INTERVAL 1 MONTH
     AND Estado != 'Abierto'
     AND Estado != 'Planificado'
   GROUP BY Oficina, Tipo_Disp
 ) AS pp_summary ON (    pp_summary.Oficina=p.Oficina
                     AND pp_summary.Tipo_Disp=p.Tipo_Disp)

So, this is your final list of tables.

 FROM textcentro AS t
 JOIN dtdz       AS d      ON t.jcentro03=d.dt
 JOIN ppp        AS p      ON CAST(t.centro AS UNSIGNED) = p.Oficina
 JOIN  (
         SELECT COUNT(*) AS NumPersonas,
                Oficina,
                Tipo_Disp
           FROM People_DB
          WHERE Fecha_y_hora_de_creacion >= '2013-08-01'
            AND Fecha_y_hora_de_creacion <  '2013-08-01' + INTERVAL 1 MONTH
            AND Estado != 'Abierto'
            AND Estado != 'Planificado'
       GROUP BY Oficina, Tipo_Disp
 ) AS pp_summary ON (    pp_summary.Oficina=p.Oficina
                     AND pp_summary.Tipo_Disp=p.Tipo_Disp)

Three of these tables are "physical" tables, and the fourth is a "virtual" table, constructed as a summary of the physical table called People_DB.

You can include

     pp_summary.NumPersonas

in your SELECT list.
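
The same idea can give you both of the columns you are missing. What follows is a sketch, not a tested query. It assumes that People_DB in your subquery is the same table as ppp, and that each ppp row matches exactly one textcentro/dtdz row (otherwise the per-identifier counts are inflated by the join). The derived-table aliases by_type and by_office and the two new column headings are names I made up for illustration.

```sql
SELECT d.dt               AS 'Direccion Territorial',
       t.centro           AS 'Oficina',
       p.Tipo_Disp        AS 'Dispositivo',
       p.Etq_Amarilla     AS 'Etiqueta',
       COUNT(p.Tipo_Disp) AS 'Nº de partes/Etiqueta',
       by_type.n          AS 'Nº de partes/Dispositivo',  -- total per office and device type
       by_office.n        AS 'Nº de partes/Oficina'       -- total per office
  FROM textcentro AS t
  JOIN dtdz       AS d ON t.jcentro03 = d.dt
  JOIN ppp        AS p ON CAST(t.centro AS UNSIGNED) = p.Oficina
  JOIN (
         -- one row per (office, device type), computed once
         SELECT Oficina, Tipo_Disp, COUNT(*) AS n
           FROM ppp
          WHERE Fecha_y_hora_de_creacion >= '2013-08-01'
            AND Fecha_y_hora_de_creacion <  '2013-08-01' + INTERVAL 1 MONTH
            AND Estado != 'Abierto'
            AND Estado != 'Planificado'
          GROUP BY Oficina, Tipo_Disp
       ) AS by_type   ON by_type.Oficina   = p.Oficina
                     AND by_type.Tipo_Disp = p.Tipo_Disp
  JOIN (
         -- one row per office, computed once
         SELECT Oficina, COUNT(*) AS n
           FROM ppp
          WHERE Fecha_y_hora_de_creacion >= '2013-08-01'
            AND Fecha_y_hora_de_creacion <  '2013-08-01' + INTERVAL 1 MONTH
            AND Estado != 'Abierto'
            AND Estado != 'Planificado'
          GROUP BY Oficina
       ) AS by_office ON by_office.Oficina = p.Oficina
 WHERE t.organizativo = 'OFIC./AGEN./DELEG.'
   AND t.situacion    = 'ABIERTO'
   AND t.sociedad     = '0900'
   AND p.Fecha_y_hora_de_creacion >= '2013-08-01'
   AND p.Fecha_y_hora_de_creacion <  '2013-08-01' + INTERVAL 1 MONTH
   AND p.Estado != 'Abierto'
   AND p.Estado != 'Planificado'
 GROUP BY d.dt, t.centro, p.Tipo_Disp, p.Etq_Amarilla, by_type.n, by_office.n;
```

Each derived table is evaluated once instead of once per output row, which is where the correlated subquery loses its time. If you are on MySQL 8.0 or later, window functions such as SUM(COUNT(p.Tipo_Disp)) OVER (PARTITION BY t.centro, p.Tipo_Disp) could probably replace the two derived tables.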

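Fourth, avoid the nonstandard MySQL extensions to GROUP BY, and use standard SQL: every column in the SELECT list should either be aggregated or appear in the GROUP BY clause. In your query, nombrelargo and Oficina2 do neither. Read this for more information.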

http://dev.mysql.com/doc/refman/5.0/en/group-by-extensions.html
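
As a small illustration of the standard form, here is a sketch that assumes nombrelargo is a column of textcentro holding one value per office (I cannot tell from your query which table it comes from). Wrapping it in MAX() makes it an aggregate, so the query no longer depends on the extension.

```sql
-- Optional: make MySQL reject nonstandard GROUP BY queries in this session.
-- Note that this replaces the session's other sql_mode settings.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

SELECT t.centro,
       MAX(t.nombrelargo) AS nombrelargo,  -- aggregated, so no longer ambiguous
       p.Tipo_Disp,
       COUNT(*)           AS n
  FROM textcentro AS t
  JOIN ppp        AS p ON CAST(t.centro AS UNSIGNED) = p.Oficina
 GROUP BY t.centro, p.Tipo_Disp;
```

Adding t.nombrelargo to the GROUP BY list instead of using MAX() would work just as well under the same assumption.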

Fifth, add appropriate indexes to your tables.
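
I cannot see your table definitions, so the following is only a guess at what "appropriate" might mean here. The index names are made up, and the column choices are assumptions based on the WHERE clauses in your query.

```sql
-- Lets the date filter on ppp use a range scan.
CREATE INDEX idx_ppp_fecha ON ppp (Fecha_y_hora_de_creacion);

-- Helps the constant filters on textcentro.
CREATE INDEX idx_textcentro_filtros ON textcentro (sociedad, situacion);

-- Check whether the optimizer actually picks the index up.
EXPLAIN
SELECT COUNT(*)
  FROM ppp
 WHERE Fecha_y_hora_de_creacion >= '2013-08-01'
   AND Fecha_y_hora_de_creacion <  '2013-08-01' + INTERVAL 1 MONTH;
```

Run EXPLAIN on your full query before and after adding an index, and keep only the indexes that show up in the plan.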
