How nuScenes obtains the velocity of target objects

Published: 2024-10-10

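The nuScenes annotation files do not store object velocities. Instead, after the annotations are loaded, the velocity is computed from sample_annotation.json: the object's translation in the neighboring frames is differenced and divided by the time difference, which gives three velocity components (Vx, Vy, Vz) along x, y, and z. In general only Vx and Vy are used. The implementation is the box_velocity() function in nuscenes/nuscenes.py: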

    def box_velocity(self, sample_annotation_token: str, max_time_diff: float = 1.5) -> np.ndarray:
        """
        Estimate the velocity for an annotation.
        If possible, we compute the centered difference between the previous and next frame.
        Otherwise we use the difference between the current and previous/next frame.
        If the velocity cannot be estimated, values are set to np.nan.
        :param sample_annotation_token: Unique sample_annotation identifier.
        :param max_time_diff: Max allowed time diff between consecutive samples that are used to estimate velocities.
        :return: <np.float: 3>. Velocity in x/y/z direction in m/s.
        """

        current = self.get('sample_annotation', sample_annotation_token)
        has_prev = current['prev'] != ''
        has_next = current['next'] != ''

        # Cannot estimate velocity for a single annotation.
        if not has_prev and not has_next:
            return np.array([np.nan, np.nan, np.nan])

        if has_prev:
            first = self.get('sample_annotation', current['prev'])
        else:
            first = current

        if has_next:
            last = self.get('sample_annotation', current['next'])
        else:
            last = current

        pos_last = np.array(last['translation'])
        pos_first = np.array(first['translation'])
        pos_diff = pos_last - pos_first

        time_last = 1e-6 * self.get('sample', last['sample_token'])['timestamp']
        time_first = 1e-6 * self.get('sample', first['sample_token'])['timestamp']
        time_diff = time_last - time_first
        if has_next and has_prev:
            # If doing centered difference, allow for up to double the max_time_diff.
            max_time_diff *= 2

        if time_diff > max_time_diff:
            # If time_diff is too big, don't return an estimate.
            return np.array([np.nan, np.nan, np.nan])
        else:
            return pos_diff / time_diff
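
The same rule can be tried without loading the dataset. The following is a minimal sketch, not part of the nuScenes devkit: `estimate_velocity` is a hypothetical helper that assumes the global-frame positions and microsecond timestamps of one tracked object are already available as plain lists. It uses a centered difference when both neighbors exist, a one-sided difference otherwise, and NaN when no estimate is possible.

```python
import numpy as np


def estimate_velocity(positions, timestamps_us, idx, max_time_diff=1.5):
    """Hypothetical helper that mirrors the logic of box_velocity().

    positions:     (N, 3) box centers of ONE object in the global frame, in meters.
    timestamps_us: N integer timestamps in microseconds.
    idx:           index of the frame whose velocity is wanted.
    """
    positions = np.asarray(positions, dtype=float)
    has_prev = idx > 0
    has_next = idx < len(positions) - 1

    # A single observation gives no velocity information.
    if not has_prev and not has_next:
        return np.full(3, np.nan)

    first = idx - 1 if has_prev else idx
    last = idx + 1 if has_next else idx

    # Subtract the integer timestamps first, then convert microseconds to seconds.
    time_diff = 1e-6 * (timestamps_us[last] - timestamps_us[first])

    # A centered difference spans two frame intervals, so allow twice the gap.
    if has_prev and has_next:
        max_time_diff *= 2

    if time_diff > max_time_diff:
        return np.full(3, np.nan)
    return (positions[last] - positions[first]) / time_diff


# Made-up track: the object moves 1 m in x and 0.5 m in y every 0.5 s.
track = [[0.0, 0.0, 0.0], [1.0, 0.5, 0.0], [2.0, 1.0, 0.0]]
stamps = [1_700_000_000_000_000, 1_700_000_000_500_000, 1_700_000_001_000_000]

v_mid = estimate_velocity(track, stamps, 1)    # centered difference
v_start = estimate_velocity(track, stamps, 0)  # one-sided (current to next)
print(v_mid, v_start)
```

Both calls should give roughly 2 m/s in x and 1 m/s in y, since the made-up object moves at constant velocity.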

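nuScenes can obtain a velocity in m/s by directly subtracting translations and dividing by the time difference (nuScenes timestamps are 16-digit microsecond values, so they are first multiplied by 1e-6 to convert them to seconds) because the translation of every annotated bbox is expressed in the global coordinate frame.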

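Our own datasets, in contrast, are usually annotated in the lidar frame, the camera frame, or the ego-vehicle frame. The bbox translation therefore has to be transformed into the global frame first, using the ego vehicle's localization data, before taking the difference and dividing by the time difference. Directly differencing translations that are expressed in a local frame is clearly wrong in general: the result is the object's motion relative to the moving sensor, which matches the true velocity only when the ego vehicle is stationary.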
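
A minimal sketch of this conversion follows, under stated assumptions: the boxes are labeled in the ego-vehicle frame, and the ego pose at each frame is available as a global translation plus a unit quaternion in (w, x, y, z) order, as in nuScenes' ego_pose records. The helper names `quat_to_rotmat` and `local_to_global` are hypothetical. For boxes labeled in a lidar or camera frame, the sensor-to-ego extrinsic would have to be applied first.

```python
import numpy as np


def quat_to_rotmat(q):
    """Rotation matrix from a quaternion given as (w, x, y, z)."""
    w, x, y, z = np.asarray(q, dtype=float) / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def local_to_global(p_local, ego_translation, ego_rotation):
    """Map a point from the ego frame to the global frame: p_g = R @ p_l + t."""
    rot = quat_to_rotmat(ego_rotation)
    return rot @ np.asarray(p_local, dtype=float) + np.asarray(ego_translation, dtype=float)


# Made-up scenario: a parked object at global (10, 0, 0); the ego vehicle
# drives 1 m along x in 0.5 s without turning (identity rotation).
ego_q = [1.0, 0.0, 0.0, 0.0]
ego_t0, ego_t1 = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
box_local_t0, box_local_t1 = [10.0, 0.0, 0.0], [9.0, 0.0, 0.0]
dt = 0.5  # seconds between the two frames

# Differencing the local labels wrongly reports the parked object as moving.
v_naive = (np.array(box_local_t1) - np.array(box_local_t0)) / dt

# Converting to the global frame first recovers the true (zero) velocity.
v_global = (local_to_global(box_local_t1, ego_t1, ego_q)
            - local_to_global(box_local_t0, ego_t0, ego_q)) / dt
print(v_naive, v_global)
```

In this scenario the naive difference reports about -2 m/s in x for an object that is not moving, while the global-frame difference gives zero.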