SQL query for multiple columns - SUM(CASE WHEN x THEN 1 ELSE 0 END)

Question:

I'm looking to see whether there is a better way to write the query below. What I'm trying to do is create a summary report that compiles statistics by date.

SELECT CAST(Detail.ReceiptDate AS DATE) AS 'DATE' 
, SUM(CASE WHEN Detail.Type = 'TotalMailed' THEN 1 ELSE 0 END) AS 'TOTALMAILED' 
, SUM(CASE WHEN Detail.Type = 'TotalReturnMail' THEN 1 ELSE 0 END) AS 'TOTALUNDELINOTICESRECEIVED' 
, SUM(CASE WHEN Detail.Type = 'TraceReturnedMail' THEN 1 ELSE 0 END) AS 'TRACEUNDELNOTICESRECEIVED' 
FROM 
(
select SentDate AS 'ReceiptDate', 'TotalMailed' AS 'Type' 
from MailDataExtract 
where sentdate is not null 
UNION ALL 
select MDE.ReturnMailDate AS 'ReceiptDate', 'TotalReturnMail' AS 'Type' 
from MailDataExtract MDE 
where MDE.ReturnMailDate is not null 
UNION ALL 
select MDE.ReturnMailDate AS 'ReceiptDate', 'TraceReturnedMail' AS 'Type' 
from MailDataExtract MDE 
    inner join DTSharedData.dbo.ScanData SD ON SD.ScanDataID = MDE.ReturnScanDataID 
where MDE.ReturnMailDate is not null AND SD.ReturnMailTypeID = 1 
) AS Detail 
GROUP BY CAST(Detail.ReceiptDate AS DATE) 
ORDER BY 1 

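This is only a sample of the query (which is used in a report); there are a number of other columns, and the logic for the other statistics is far more complicated. Is there a more elegant way to get this kind of information or write this kind of report?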

Is this in a proc or a view, or something else? Basically, can you introduce variables and run multiple statements, or is it just one big 'select' statement?

This is a proc that will be used in an SSRS report, so it will basically be one select statement, since I need to return a single result set (right?) – MickJuice

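Yes, you'll end up with one big 'select' at the end, but since it's in a proc you can break the query into smaller, simpler chunks and assign the results to variables. That can make a big difference in readability. For example, instead of unioning and grouping the three subqueries, you could run three small independent queries up front, store their aggregated results in variables, and then use those variables in the query you return. It would probably be easier to read and understand, and it might perform better.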
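
As a rough illustration of the comment above, here is a minimal sketch of the "smaller chunks" idea. Because the report returns one row per date, it uses table variables rather than scalar variables; that choice is an assumption, not something the comment specifies. `@MailDataExtract`, `@Mailed` and `@Returned` are hypothetical names, and the sample rows are made up so the batch can run on its own.

```sql
-- Hypothetical stand-in for the MailDataExtract table from the question.
DECLARE @MailDataExtract TABLE (SentDate datetime NULL, ReturnMailDate datetime NULL);

INSERT INTO @MailDataExtract (SentDate, ReturnMailDate)
VALUES ('2013-01-01T09:00:00', '2013-01-03T10:00:00'),
       ('2013-01-01T14:30:00', NULL),
       ('2013-01-02T08:15:00', '2013-01-03T16:45:00');

-- Step 1: one small query per statistic, each stored in its own table variable.
DECLARE @Mailed   TABLE (ReceiptDate date PRIMARY KEY, TotalMailed int NOT NULL);
DECLARE @Returned TABLE (ReceiptDate date PRIMARY KEY, TotalReturnMail int NOT NULL);

INSERT INTO @Mailed (ReceiptDate, TotalMailed)
SELECT CAST(SentDate AS date), COUNT(*)
FROM @MailDataExtract
WHERE SentDate IS NOT NULL
GROUP BY CAST(SentDate AS date);

INSERT INTO @Returned (ReceiptDate, TotalReturnMail)
SELECT CAST(ReturnMailDate AS date), COUNT(*)
FROM @MailDataExtract
WHERE ReturnMailDate IS NOT NULL
GROUP BY CAST(ReturnMailDate AS date);

-- Step 2: the single result set that the report consumes.
-- FULL OUTER JOIN keeps dates that appear in only one of the two sets.
SELECT COALESCE(m.ReceiptDate, r.ReceiptDate) AS ReceiptDate,
       COALESCE(m.TotalMailed, 0)             AS TotalMailed,
       COALESCE(r.TotalReturnMail, 0)         AS TotalReturnMail
FROM @Mailed AS m
FULL OUTER JOIN @Returned AS r
    ON r.ReceiptDate = m.ReceiptDate
ORDER BY 1;
```

Whether this performs better than a single statement depends on the data and indexes; the main gain is that each statistic can be read and tested on its own.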

I would change the query in the following ways:

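  1. Do the aggregation in the subqueries. This lets the optimizer use more information about the table when optimizing the group by.
  2. Combine the second and third subqueries. They aggregate on the same column. This requires a left outer join to ensure that all the data is available.
  3. By using count(<fieldname>) you can eliminate the comparisons to is null, because count skips NULL values. This matters for the second and third calculated values.
  4. To combine the second and third queries, the query needs to count an id from the mde table. These versions use mde.mdeid.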
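
Point 3 relies on the fact that count(<fieldname>) ignores NULL values while count(*) counts every row. A minimal sketch, assuming a hypothetical table variable `@Mail` with made-up rows standing in for MailDataExtract:

```sql
-- Hypothetical sample data; @Mail stands in for MailDataExtract.
DECLARE @Mail TABLE (MdeID int, SentDate date, ReturnMailDate date);

INSERT INTO @Mail (MdeID, SentDate, ReturnMailDate)
VALUES (1, '2013-01-01', '2013-01-05'),
       (2, '2013-01-01', NULL),
       (3, '2013-01-02', NULL);

SELECT COUNT(*)              AS AllRows,      -- counts every row: 3
       COUNT(ReturnMailDate) AS ReturnedRows  -- skips NULL ReturnMailDate values: 1
FROM @Mail;
```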

The following version follows your example by using union all:

SELECT CAST(Detail.ReceiptDate AS DATE) AS "Date", 
     SUM(TOTALMAILED) as TotalMailed, 
     SUM(TOTALUNDELINOTICESRECEIVED) as TOTALUNDELINOTICESRECEIVED, 
     SUM(TRACEUNDELNOTICESRECEIVED) as TRACEUNDELNOTICESRECEIVED 
FROM ((select SentDate AS "ReceiptDate", COUNT(*) as TotalMailed, 
       NULL as TOTALUNDELINOTICESRECEIVED, NULL as TRACEUNDELNOTICESRECEIVED 
     from MailDataExtract 
     where SentDate is not null 
     group by SentDate 
    ) union all 
     (select MDE.ReturnMailDate AS ReceiptDate, 0, 
       COUNT(distinct mde.mdeid) as TOTALUNDELINOTICESRECEIVED, 
       SUM(case when sd.ReturnMailTypeId = 1 then 1 else 0 end) as TRACEUNDELNOTICESRECEIVED 
     from MailDataExtract MDE left outer join 
      DTSharedData.dbo.ScanData SD 
      ON SD.ScanDataID = MDE.ReturnScanDataID 
     group by MDE.ReturnMailDate 
    ) 
    ) detail 
GROUP BY CAST(Detail.ReceiptDate AS DATE) 
ORDER BY 1; 

The following does something similar using full outer join:

SELECT coalesce(sd.ReceiptDate, mde.ReceiptDate) AS "Date", 
     sd.TotalMailed, mde.TOTALUNDELINOTICESRECEIVED, 
     mde.TRACEUNDELNOTICESRECEIVED 
FROM (select cast(SentDate as date) AS "ReceiptDate", COUNT(*) as TotalMailed 
     from MailDataExtract 
     where SentDate is not null 
     group by cast(SentDate as date) 
    ) sd full outer join 
    (select cast(MDE.ReturnMailDate as date) AS ReceiptDate, 
      COUNT(distinct mde.mdeID) as TOTALUNDELINOTICESRECEIVED, 
      SUM(case when sd.ReturnMailTypeId = 1 then 1 else 0 end) as TRACEUNDELNOTICESRECEIVED 
    from MailDataExtract MDE left outer join 
      DTSharedData.dbo.ScanData SD 
      ON SD.ScanDataID = MDE.ReturnScanDataID 
    group by cast(MDE.ReturnMailDate as date) 
    ) mde 
    on sd.ReceiptDate = mde.ReceiptDate 
ORDER BY 1; 

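I think you should do the grouping in a subquery. In that case the inner subqueries return only a small number of rows, and no CASE expression is needed. So I think this will be faster: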

select CAST(Detail.ReceiptDate AS DATE) AS 'DATE', 
     SUM(TotalMailed) AS TotalMailed, 
     SUM(TotalReturnMail) AS TotalReturnMail, 
     SUM(TraceReturnedMail) AS TraceReturnedMail 
from 
(
-- each branch pre-aggregates one statistic and pads the other two with 0 
select SentDate AS 'ReceiptDate', 
     count(*) AS TotalMailed, 
     0 as TotalReturnMail, 
     0 as TraceReturnedMail 
from MailDataExtract 
where sentdate is not null 
GROUP BY SentDate 

UNION ALL 

select MDE.ReturnMailDate AS 'ReceiptDate', 
     0 AS TotalMailed, 
     count(*) as TotalReturnMail, 
     0 as TraceReturnedMail 
from MailDataExtract MDE 
where MDE.ReturnMailDate is not null 
GROUP BY MDE.ReturnMailDate 

UNION ALL 

select MDE.ReturnMailDate AS 'ReceiptDate', 
     0 AS TotalMailed, 
     0 as TotalReturnMail, 
     count(*) as TraceReturnedMail 
from MailDataExtract MDE 
    inner join DTSharedData.dbo.ScanData SD 
     ON SD.ScanDataID = MDE.ReturnScanDataID 
where MDE.ReturnMailDate is not null AND SD.ReturnMailTypeID = 1 
GROUP BY MDE.ReturnMailDate 
) as Detail 
-- regroup by calendar day, as in the original query 
GROUP BY CAST(Detail.ReceiptDate AS DATE) 
ORDER BY 1