Help with an SQL query over overlapping time ranges

This topic was created 474 days ago; the information in it may have since evolved or changed.

How should a query like the one below be optimized? I added indexes on open_time and close_time, but it is still fairly slow: the query takes about 500 ms.

select count(1) from switchcover where open_time <= '2024-01-07 23:59:59' and close_time>= '2024-01-07 00:00:00' ;

11 replies • 2024-01-09 14:17:43 +08:00

vanpeisi7

2024-01-08 13:47:18 +08:00

A plain paginated query with limit and offset is fast, a few tens of ms. It is the count that takes too long.

select
*
from
switch.bz_switchcover
where
close_time>= '2023-05-01 00:00:00'
and open_time <= '2023-05-01 23:59:59'
order by open_time desc
limit 15 offset 0
;

whoami9426

2024-01-08 13:52:56 +08:00

Please post the EXPLAIN output.
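
For reference, a plan with actual timings can be obtained by prefixing the statement with EXPLAIN. A minimal sketch using the query from the original post; note that the ANALYZE option really executes the statement:

```sql
-- ANALYZE runs the query and reports actual times and row counts;
-- BUFFERS adds buffer hit/read counts to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(1)
FROM switchcover
WHERE open_time <= '2024-01-07 23:59:59'
  AND close_time >= '2024-01-07 00:00:00';
```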

vanpeisi7

2024-01-08 14:34:47 +08:00

@whoami9426

```
Finalize Aggregate (cost=362425.35..362425.36 rows=1 width=8)
  -> Gather (cost=362425.14..362425.35 rows=2 width=8)
        Workers Planned: 2
        -> Partial Aggregate (cost=361425.14..361425.15 rows=1 width=8)
              -> Parallel Index Only Scan using bz_switchcover_open_close_time_c_idx on bz_switchcover (cost=0.43..358436.32 rows=1195526 width=0)
                    Index Cond: ((open_time <= '2023-05-11 23:59:59'::timestamp without time zone) AND (close_time >= '2023-05-11 00:00:00'::timestamp without time zone))
JIT:
  Functions: 5
  Options: Inlining false, Optimization false, Expressions true, Deforming true

```

vanpeisi7

2024-01-08 14:37:07 +08:00

Detailed EXPLAIN:

Finalize Aggregate (cost=425545.45..425545.46 rows=1 width=8) (actual time=522.040..573.976 rows=1 loops=1)
-> Gather (cost=425545.23..425545.44 rows=2 width=8) (actual time=521.286..573.949 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=424545.23..424545.24 rows=1 width=8) (actual time=494.321..494.322 rows=1 loops=3)
-> Parallel Index Only Scan using bz_switchcover_open_close_time_c_idx on bz_switchcover (cost=0.43..421581.16 rows=1185629 width=0) (actual time=20.590..487.712 rows=9267 loops=3)
Index Cond: ((open_time <= '2023-07-16 23:59:59'::timestamp without time zone) AND (close_time >= '2023-07-16 00:00:00'::timestamp without time zone))
Heap Fetches: 18
Planning Time: 0.184 ms
JIT:
Functions: 11
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 1.365 ms, Inlining 0.000 ms, Optimization 1.067 ms, Emission 17.333 ms, Total 19.766 ms
Execution Time: 574.577 ms

Huelse

2024-01-08 15:30:37 +08:00

You should create a partial index: create index where close_time>= '2024-01-07 00:00:00'::timestamp;
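
A sketch of what such a partial index could look like. The index name and the indexed columns are assumptions, not something stated in the thread. PostgreSQL can use a partial index only when the query's WHERE clause implies the index predicate, so a fixed cutoff like this one helps only queries whose close_time bound is that date or later.

```sql
-- Hypothetical index name; the cutoff literal is the one from the reply above.
-- Only rows with close_time on or after the cutoff are stored in the index,
-- which keeps it much smaller than an index over the whole table.
CREATE INDEX switchcover_recent_idx
    ON switchcover (open_time, close_time)
    WHERE close_time >= '2024-01-07 00:00:00'::timestamp;
```

Because the cutoff is a constant, the index would need to be recreated with a newer cutoff from time to time to stay small.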

whoami9426

2024-01-08 16:24:22 +08:00

Take a look at this article: [Understanding the behavior of PostgreSQL's count function](https://zhuanlan.zhihu.com/p/63379010)
If the table itself holds a lot of data, a slow count is unavoidable.

opengps

2024-01-08 16:27:29 +08:00

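If there is no duration column, it is best to have a composite index on these two time columns.
If there is a duration column, then working with the start time plus the duration is much easier than operating directly on two time-typed columns. That also needs a composite index to back it up.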
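
A sketch of a composite index on the two columns; the index name is hypothetical. In a B-tree, only the range condition on the leading column limits how much of the index is scanned. The condition on the second column is checked inside the index but does not narrow the scan. So it may be worth leading with the column whose condition is more selective for the typical query, for example close_time if most queries target recent days (an assumption about the workload).

```sql
-- Hypothetical name. close_time leads on the assumption that
-- "close_time >= <recent day>" matches far fewer rows than
-- "open_time <= <recent day>" does.
CREATE INDEX switchcover_close_open_idx
    ON switchcover (close_time, open_time);
```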

MoYi123

2024-01-08 18:50:21 +08:00

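What is this SQL supposed to mean? The logic looks odd; normally open_time should be earlier than close_time, right?

In any case, find a way to turn one of the time conditions into a between over a fairly small time window, and performance should improve a lot.

Right now it is the intersection of 2 large sets, so it is bound to be slow.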
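
A sketch of that idea, assuming no record ever stays open longer than 7 days. The 7-day limit is a made-up figure for illustration; if some rows span a longer period, this query undercounts. Under that assumption, any row overlapping the target day must have opened at most 7 days before the day starts, so open_time can be bounded on both sides:

```sql
-- The added lower bound on open_time is valid only if
-- close_time - open_time <= 7 days holds for every row (assumption).
SELECT count(1)
FROM switchcover
WHERE open_time BETWEEN timestamp '2024-01-07 00:00:00' - interval '7 days'
                    AND '2024-01-07 23:59:59'
  AND close_time >= '2024-01-07 00:00:00';
```

With an index that leads with open_time, the scan then covers roughly eight days of entries instead of everything opened before the target day.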

siweipancc

2024-01-09 11:39:24 +08:00 via iPhone

A suggestion: use count(id).

siweipancc

2024-01-09 11:52:55 +08:00 via iPhone

Here is an old answer on the topic:
https://stackoverflow.com/questions/2710621/count-vs-count1-vs-countpk-which-is-better
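
A small self-contained illustration of how the three forms differ in meaning. It runs against an inline VALUES list, so it does not touch any table from the thread:

```sql
-- count(*) and count(1) count every row;
-- count(id) counts only the rows where id is not NULL.
SELECT count(*)  AS all_rows,
       count(1)  AS also_all_rows,
       count(id) AS non_null_ids
FROM (VALUES (1), (2), (NULL)) AS t(id);
-- Returns: all_rows = 3, also_all_rows = 3, non_null_ids = 2
```

Since count(id) has to check each value for NULL, it is not expected to be faster than count(*) in PostgreSQL. The choice is mainly about which rows should be counted.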

Belmode

2024-01-09 14:17:43 +08:00

select count(1) from (select id, close_time from switchcover where open_time <= '2024-01-07 23:59:59') t where t.close_time>= '2024-01-07 00:00:00' ;
How about trying it this way?