背景
目前 INTERVAL 算子目前支持指定 interval_offset 作为划分时间窗口的起始时间点的偏移量。但发现用户使用时,可能希望窗口划分从 where 条件指定的时间戳开始,就得先自行算出并指定偏移量,这在使用上造成了不便。
行为说明
CREATE STABLE meters (ts timestamp, voltage int) TAGS (group_id int);
CREATE TABLE d1001 USING meters (group_id) TAGS (1);
INSERT INTO d1001 VALUES
('2024-01-01 00:00:01', 1),
('2024-01-01 00:00:02', 2),
('2024-01-01 00:00:03', 3),
('2024-01-01 00:00:05', 5);
这是当前的 INTERVAL 语法:
INTERVAL(interval_val [, interval_offset])
其中,interval_offset 可以是整数类型,也可以是形如 “1s”, “2m”, “3h”, “4d”, “5w” 这些表示时间长度的字符串。在此基础上,额外支持指定 interval_offset 为 “auto”,表示从查询条件中的起始时间自动计算当前 offset。仅当查询条件包含明确可应用的起始时间限制时,auto 才能生效。以下是详细说明和示例:
- 当查询条件包含
_rowts >= ${ts}
,保证从${ts}
开始划分时间窗口:
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts >= '2024-01-01 00:00:01' interval (2s, auto);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:01.000 | 2024-01-01 00:00:03.000 | 2000 | 2 |
2024-01-01 00:00:03.000 | 2024-01-01 00:00:05.000 | 2000 | 1 |
2024-01-01 00:00:05.000 | 2024-01-01 00:00:07.000 | 2000 | 1 |
Query OK, 3 row(s) in set
- 当查询条件包含
_rowts > ${ts}
,保证从${ts} + 1
开始划分时间窗口:
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts > '2024-01-01 00:00:00' interval (2s, auto);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:00.001 | 2024-01-01 00:00:02.001 | 2000 | 2 |
2024-01-01 00:00:02.001 | 2024-01-01 00:00:04.001 | 2000 | 1 |
2024-01-01 00:00:04.001 | 2024-01-01 00:00:06.001 | 2000 | 1 |
Query OK, 3 row(s) in set
- 当查询条件包含多个条件
_rowts >= ${ts_i}
或者_rowts > ${ts_j}
时,保证从MAX{${ts_i}, …, ${ts_j} + 1, …}
开始划分时间窗口:
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts > '2024-01-01 00:00:00' and ts >= '2024-01-01 00:00:01' interval (2s, auto);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:01.000 | 2024-01-01 00:00:03.000 | 2000 | 2 |
2024-01-01 00:00:03.000 | 2024-01-01 00:00:05.000 | 2000 | 1 |
2024-01-01 00:00:05.000 | 2024-01-01 00:00:07.000 | 2000 | 1 |
Query OK, 3 row(s) in set
- 当查询条件不包含起始时间限制,或者起始时间限制不明确时,auto 不生效:
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts < '2024-01-01 00:00:06.000' interval (2s);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:00.000 | 2024-01-01 00:00:02.000 | 2000 | 1 |
2024-01-01 00:00:02.000 | 2024-01-01 00:00:04.000 | 2000 | 2 |
2024-01-01 00:00:04.000 | 2024-01-01 00:00:06.000 | 2000 | 1 |
Query OK, 3 row(s) in set
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts - voltage >= '2024-01-01 00:00:00' interval (2s);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:00.000 | 2024-01-01 00:00:02.000 | 2000 | 1 |
2024-01-01 00:00:02.000 | 2024-01-01 00:00:04.000 | 2000 | 2 |
2024-01-01 00:00:04.000 | 2024-01-01 00:00:06.000 | 2000 | 1 |
Query OK, 3 row(s) in set
使用场景
当使用 INTERVAL 语句时,希望能够自动从 WHERE 指定的起始时间限制开始窗口划分的场景。
约束和限制
- 受限于目前解析器实现,当查询条件中时间戳列参与表达式计算时,不能求出起始时间限制,auto 不生效;但比较的值是常量表达式时可以生效:
-- `ts - 1s` 中 ts 参与表达式计算,暂时不支持计算起始时间限制,所以 auto 不生效
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts - 1s >= '2024-01-01 00:00:02' interval (2s);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:02.000 | 2024-01-01 00:00:04.000 | 2000 | 1 |
2024-01-01 00:00:04.000 | 2024-01-01 00:00:06.000 | 2000 | 1 |
Query OK, 2 row(s) in set
-- 虽然 `ts >=` 右侧的是表达式,但它的结果是个常量,可以计算起始时间限制,所以 auto 生效
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts >= cast('2024-01-01 00:00:02' as timestamp) + 1s interval (2s, auto);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:03.000 | 2024-01-01 00:00:05.000 | 2000 | 1 |
2024-01-01 00:00:05.000 | 2024-01-01 00:00:07.000 | 2000 | 1 |
Query OK, 2 row(s) in set
- 受限于目前解析器实现,当查询条件中时间筛选条件过于复杂时,不能求出起始时间限制,auto 不生效:
-- 简单的条件组合,auto 可以生效
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts = '2024-01-01 00:00:01' or ts >= cast('2024-01-01 00:00:02' as timestamp) + 1s interval (2s, auto);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:01.000 | 2024-01-01 00:00:03.000 | 2000 | 1 |
2024-01-01 00:00:03.000 | 2024-01-01 00:00:05.000 | 2000 | 1 |
2024-01-01 00:00:05.000 | 2024-01-01 00:00:07.000 | 2000 | 1 |
Query OK, 3 row(s) in set
-- 复杂的条件组合,auto 不生效
taos> select _wstart, _wend, _wduration, count(*) from d1001 where ts >= '2024-01-01 00:00:03' or ts in ('2024-01-01 00:00:01', '2024-01-01 00:00:05') interval (2s, auto);
_wstart | _wend | _wduration | count(*) |
====================================================================================================
2024-01-01 00:00:00.000 | 2024-01-01 00:00:02.000 | 2000 | 1 |
2024-01-01 00:00:02.000 | 2024-01-01 00:00:04.000 | 2000 | 1 |
2024-01-01 00:00:04.000 | 2024-01-01 00:00:06.000 | 2000 | 1 |
Query OK, 3 row(s) in set
- 使用 AUTO 作为窗口偏移量时,如果窗口宽度的单位是 d (天), n (月), w (周), y (年),比如: INTERVAL(1d, AUTO), INTERVAL(3w, AUTO),此时 TSMA 优化无法生效。如果目标表上手动创建了TSMA,语句会报错退出;这种情况下,可以显式指定 Hint SKIP_TSMA 或者不使用 AUTO 作为窗口偏移量。
关于 TDengine
TDengine 是一款专为物联网、工业互联网等场景设计并优化的大数据平台,其核心模块是高性能、集群开源、云原生、极简的时序数据库。它能安全高效地将大量设备每天产生的高达 TB 甚至 PB 级的数据进行汇聚、存储、分析和分发,并提供 AI 智能体对数据进行预测与异常检测,提供实时的商业洞察。