PostgreSQL large query optimization
Question:
I have a problem in PostgreSQL. This query takes a long time to execute (about 30 seconds uncached). Here is my query:
SELECT d.name, COUNT (*) AS cnt,
'first' AS TYPE
FROM
tableA a
INNER JOIN tableD d ON d.NAME = 'FOO'
AND a.key = d.key
WHERE
a.DATE > '2017-06-01'
AND a.DATE < '2017-07-01'
group by d.name
UNION ALL
SELECT
d.name,
COUNT (*) AS cnt,
'second' AS TYPE
FROM
tableB b
INNER JOIN tableD d ON d.NAME = 'FOO'
AND b.key = d.key
WHERE
b.DATE > '2017-06-01'
AND b.DATE < '2017-07-01'
group by d.name
UNION ALL
SELECT
d.name,
COUNT (*) AS cnt,
'Third' AS TYPE
FROM
tableC c
INNER JOIN tableD d ON d.NAME = 'FOO'
AND c.key = d.key
WHERE
c.date > '2017-06-01'
AND c.date < '2017-07-01'
group by d.name
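I created indexes on tableC.key (B-tree) and tableC.name (hash), and the other tables have B-tree indexes on date and key.
So my query can join through indexes and filter through indexes.
My tableD has a few thousand rows; the others have billions of rows, up to nearly ten billion.
In the execution plan I see that the executor uses nested loops for all my joins (except the join between B and D, which uses a hash join).
Maybe I have found the culprit: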
"Node Type": "Bitmap Heap Scan",
"Parent Relationship": "Inner",
"Relation Name": "tableA",
"Alias": "a",
"Startup Cost": 2469.84,
"Total Cost": 137625.61,
"Plan Rows": 53748,
"Plan Width": 37,
"Recheck Cond": "(((key)::text = (d.key)::text) AND (date > '2017-06-01 00:00:00'::timestamp without time zone) AND (date < '2017-07-01 00:00:00'::timestamp without time zone))",
"Plans": [{
"Node Type": "Bitmap Index Scan",
"Parent Relationship": "Outer",
"Index Name": "\"date + key\"",
"Startup Cost": 0.00,
"Total Cost": 2456.40,
"Plan Rows": 53748,
"Plan Width": 0,
"Index Cond": "(((key)::text = (d.key)::text) AND (date > '2017-06-01 00:00:00'::timestamp without time zone) AND (date < '2017-07-01 00:00:00'::timestamp without time zone))"
}]
Table D:
CREATE TABLE "sch"."tableD" (
"id" int4 NOT NULL,
"key" varchar(36) COLLATE "default",
"name" varchar(255) COLLATE "default",
CREATE INDEX "license_key" ON "sch"."tableD" USING btree ("key");
CREATE INDEX "name" ON "sch"."tableD" USING btree ("name");
Table A:
CREATE TABLE "sch"."tableA" (
"id" int4 DEFAULT nextval('"sch".table'::regclass) NOT NULL,
"key" varchar(255) COLLATE "default",
"date" timestamp(6),
CREATE INDEX "date" ON "sch"."tableA" USING btree ("date");
CREATE INDEX "date + key" ON "sch"."tableA" USING btree ("key", "date");
CREATE INDEX "keyIndex" ON "sch"."tableA" USING btree ("key");
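Tables B and C are similar to table A.
I don't know where I am losing time here. Can you help me solve this? This query should not run for 30 seconds. Thanks.
Answer:
Provide these B-tree indexes (not hash):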
b: (DATE, key)
b: (key, DATE)
d: (NAME, key)
d: (key, NAME)
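The index list above could be written out as DDL roughly as follows. This is a sketch, not the answerer's exact statements: the index names are hypothetical, and it assumes tableB has the same "key" and "date" columns as tableA, as the question states. Presumably the same pair of indexes would apply to tableA and tableC as well (tableA already has the ("key", "date") variant under the name "date + key").

```sql
-- Hypothetical index names; column lists follow the answer's b:/d: entries.
CREATE INDEX "tableB_date_key" ON "sch"."tableB" USING btree ("date", "key");
CREATE INDEX "tableB_key_date" ON "sch"."tableB" USING btree ("key", "date");

-- Composite indexes on tableD so that the name filter and the join key
-- can both be served from one index.
CREATE INDEX "tableD_name_key" ON "sch"."tableD" USING btree ("name", "key");
CREATE INDEX "tableD_key_name" ON "sch"."tableD" USING btree ("key", "name");
```

Having both column orders lets the planner pick whichever leading column is more selective for a given plan; once the plans are stable, the unused variant can be dropped.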
It looks like you want a one-month span, but you exclude the very start of the month. Change > to >=.
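Applied to the first branch of the question's query, the boundary fix might look like this (the other two branches would change the same way). The upper bound stays exclusive, so the range covers all of June 2017 and nothing from July.

```sql
SELECT d.name,
       COUNT(*) AS cnt,
       'first' AS type
FROM tableA a
INNER JOIN tableD d
        ON d.name = 'FOO'
       AND a.key = d.key
WHERE a.date >= '2017-06-01'   -- inclusive: keeps rows at exactly 2017-06-01 00:00:00
  AND a.date <  '2017-07-01'   -- exclusive: stops before July begins
GROUP BY d.name;
```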
Start by measuring how long each subquery takes. Then you can narrow down the performance problem.
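One way to take that measurement is to run each branch on its own under EXPLAIN with the ANALYZE and BUFFERS options, which executes the statement and reports actual timings and buffer usage per plan node. A sketch for the first branch, using the table names from the question:

```sql
-- Runs the query for real and prints the plan with actual times and row counts.
EXPLAIN (ANALYZE, BUFFERS)
SELECT d.name,
       COUNT(*) AS cnt
FROM tableA a
INNER JOIN tableD d
        ON d.name = 'FOO'
       AND a.key = d.key
WHERE a.date > '2017-06-01'
  AND a.date < '2017-07-01'
GROUP BY d.name;
```

Repeating this for the tableB and tableC branches shows which one dominates the 30 seconds, and comparing estimated against actual row counts shows whether the planner's estimates are off.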
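Not sure, but it seems to me we could eliminate the UNION and use window functions to get the counts in a single query, with a CASE expression to set the type, and outer joins. – xQbert
The first subquery takes the longest, but most of the rows are in table A, so I can imagine that this causes the slowdown. If I eliminate the UNION, the executor can choose a hash join (or a merge join, if I use a hash index on the key), but it is even slower (100-120 seconds).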