TDengine 中 Interval 按查询时间划分窗口

发布于:2025-06-27 ⋅ 阅读:(15) ⋅ 点赞:(0)

在这里插入图片描述

背景

目前 INTERVAL 算子目前支持指定 interval_offset 作为划分时间窗口的起始时间点的偏移量。但发现用户使用时,可能希望窗口划分从 where 条件指定的时间戳开始,就得先自行算出并指定偏移量,这在使用上造成了不便。

行为说明

CREATE STABLE meters (ts timestamp, voltage int) TAGS (group_id int);

CREATE TABLE d1001 USING meters (group_id) TAGS (1); 

INSERT INTO d1001 VALUES
  ('2024-01-01 00:00:01', 1),
  ('2024-01-01 00:00:02', 2),
  ('2024-01-01 00:00:03', 3),
  ('2024-01-01 00:00:05', 5);

这是当前的 INTERVAL 语法:

INTERVAL(interval_val [, interval_offset])

其中,interval_offset 可以是整数类型,也可以是形如 “1s”, “2m”, “3h”, “4d”, “5w” 这些表示时间长度的字符串。在此基础上,额外支持指定 interval_offset 为 “auto”,表示从查询条件中的起始时间自动计算当前 offset。仅当查询条件包含明确可应用的起始时间限制时,auto 才能生效。以下是详细说明和示例:

  1. 当查询条件包含 _rowts >= ${ts},保证从 ${ts} 开始划分时间窗口:
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts >= '2024-01-01 00:00:01' interval (2s, auto);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:01.000 | 2024-01-01 00:00:03.000 |                  2000 |                     2 |
 2024-01-01 00:00:03.000 | 2024-01-01 00:00:05.000 |                  2000 |                     1 |
 2024-01-01 00:00:05.000 | 2024-01-01 00:00:07.000 |                  2000 |                     1 |
Query OK, 3 row(s) in set
  1. 当查询条件包含 _rowts > ${ts},保证从 ${ts} + 1 开始划分时间窗口:
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts > '2024-01-01 00:00:00' interval (2s, auto);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:00.001 | 2024-01-01 00:00:02.001 |                  2000 |                     2 |
 2024-01-01 00:00:02.001 | 2024-01-01 00:00:04.001 |                  2000 |                     1 |
 2024-01-01 00:00:04.001 | 2024-01-01 00:00:06.001 |                  2000 |                     1 |
Query OK, 3 row(s) in set
  1. 当查询条件包含多个条件 _rowts >= ${ts_i} 或者 _rowts > ${ts_j} 时,保证从 MAX{${ts_i}, …, ${ts_j} + 1, …} 开始划分时间窗口:
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts > '2024-01-01 00:00:00' and ts >= '2024-01-01 00:00:01' interval (2s, auto);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:01.000 | 2024-01-01 00:00:03.000 |                  2000 |                     2 |
 2024-01-01 00:00:03.000 | 2024-01-01 00:00:05.000 |                  2000 |                     1 |
 2024-01-01 00:00:05.000 | 2024-01-01 00:00:07.000 |                  2000 |                     1 |
Query OK, 3 row(s) in set
  1. 当查询条件不包含起始时间限制,或者起始时间限制不明确时,auto 不生效:
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts < '2024-01-01 00:00:06.000' interval (2s);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:00.000 | 2024-01-01 00:00:02.000 |                  2000 |                     1 |
 2024-01-01 00:00:02.000 | 2024-01-01 00:00:04.000 |                  2000 |                     2 |
 2024-01-01 00:00:04.000 | 2024-01-01 00:00:06.000 |                  2000 |                     1 |
Query OK, 3 row(s) in set

taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts - voltage >= '2024-01-01 00:00:00' interval (2s);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:00.000 | 2024-01-01 00:00:02.000 |                  2000 |                     1 |
 2024-01-01 00:00:02.000 | 2024-01-01 00:00:04.000 |                  2000 |                     2 |
 2024-01-01 00:00:04.000 | 2024-01-01 00:00:06.000 |                  2000 |                     1 |
Query OK, 3 row(s) in set

使用场景

当使用 INTERVAL 语句时,希望能够自动从 WHERE 指定的起始时间限制开始窗口划分的场景。

约束和限制

  1. 受限于目前解析器实现,当查询条件中时间戳列参与表达式计算时,不能求出起始时间限制,auto 不生效;但比较的值是常量表达式时可以生效:
-- `ts - 1s` 中 ts 参与表达式计算,暂时不支持计算起始时间限制,所以 auto 不生效
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts - 1s >= '2024-01-01 00:00:02' interval (2s);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:02.000 | 2024-01-01 00:00:04.000 |                  2000 |                     1 |
 2024-01-01 00:00:04.000 | 2024-01-01 00:00:06.000 |                  2000 |                     1 |
Query OK, 2 row(s) in set

-- 虽然 `ts >=` 右侧的是表达式,但它的结果是个常量,可以计算起始时间限制,所以 auto 生效
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts >= cast('2024-01-01 00:00:02' as timestamp) + 1s interval (2s, auto);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:03.000 | 2024-01-01 00:00:05.000 |                  2000 |                     1 |
 2024-01-01 00:00:05.000 | 2024-01-01 00:00:07.000 |                  2000 |                     1 |
Query OK, 2 row(s) in set
  1. 受限于目前解析器实现,当查询条件中时间筛选条件过于复杂时,不能求出起始时间限制,auto 不生效:
-- 简单的条件组合,auto 可以生效
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts = '2024-01-01 00:00:01' or ts >= cast('2024-01-01 00:00:02' as timestamp) + 1s interval (2s, auto);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:01.000 | 2024-01-01 00:00:03.000 |                  2000 |                     1 |
 2024-01-01 00:00:03.000 | 2024-01-01 00:00:05.000 |                  2000 |                     1 |
 2024-01-01 00:00:05.000 | 2024-01-01 00:00:07.000 |                  2000 |                     1 |
Query OK, 3 row(s) in set

-- 复杂的条件组合,auto 不生效
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts >= '2024-01-01 00:00:03' or ts in ('2024-01-01 00:00:01', '2024-01-01 00:00:05') interval (2s, auto);
         _wstart         |          _wend          |      _wduration       |       count(*)        |
====================================================================================================
 2024-01-01 00:00:00.000 | 2024-01-01 00:00:02.000 |                  2000 |                     1 |
 2024-01-01 00:00:02.000 | 2024-01-01 00:00:04.000 |                  2000 |                     1 |
 2024-01-01 00:00:04.000 | 2024-01-01 00:00:06.000 |                  2000 |                     1 |
Query OK, 3 row(s) in set
  1. 使用 AUTO 作为窗口偏移量时,如果窗口宽度的单位是 d (天), n (月), w (周), y (年),比如: INTERVAL(1d, AUTO), INTERVAL(3w, AUTO),此时 TSMA 优化无法生效。如果目标表上手动创建了TSMA,语句会报错退出;这种情况下,可以显式指定 Hint SKIP_TSMA 或者不使用 AUTO 作为窗口偏移量。

关于 TDengine

TDengine 是一款专为物联网、工业互联网等场景设计并优化的大数据平台,其核心模块是高性能、集群开源、云原生、极简的时序数据库。它能安全高效地将大量设备每天产生的高达 TB 甚至 PB 级的数据进行汇聚、存储、分析和分发,并提供 AI 智能体对数据进行预测与异常检测,提供实时的商业洞察。


网站公告

今日签到

点亮在社区的每一天
去签到