A New Automatic Tool for Detecting and Tracking Coronal Mass Ejections (CMEs) with Machine-learning Techniques (ApJS)


A New Automatic Tool for CME Detection and Tracking with Machine-learning Techniques

Abstract

With the accumulation of coronal mass ejection (CME) observations by coronagraphs, automatic detection and tracking of CMEs has proven to be crucial. The excellent performance of the convolutional neural network in image classification, object detection, and other computer vision tasks motivates us to apply it to CME detection and tracking as well. We developed a new tool for CME Automatic detection and tracking with MachinE Learning (CAMEL) techniques. The system is a three-module pipeline. The first stage is a supervised image classification problem, which we solve by training the LeNet neural network with training labels obtained from an existing CME catalog; images containing CME structures are flagged as CME images. Next, to identify the CME region in each CME-flagged image, we use deep descriptor transforming to localize the common object in the image set. A subsequent step applies the graph cut technique to finely tune the detected CME region. To track the CME in an image sequence, the binary images with detected CME pixels are converted from Cartesian to polar coordinates. A CME event is labeled if it moves in at least two frames and reaches the edge of the coronagraph field of view. For each event, a few fundamental parameters are derived. The results of four representative CMEs with various characteristics are presented and compared with those from four existing automatic and manual catalogs. We find that CAMEL can detect more complete and weaker structures and performs better at catching a CME as early as possible.

With the continuous accumulation of coronal mass ejection (CME) observations by coronagraphs, automatic CME detection and tracking has proven to be crucial. The excellent performance of convolutional neural networks in image classification, object detection, and other computer vision tasks motivated us to apply them to CME detection and tracking. We developed CAMEL (CME Automatic detection and tracking with MachinE Learning), a new machine-learning tool for automatic CME detection and tracking.

The system consists of three modules:

First, a supervised image-classification module trains the LeNet neural network on labeled data from an existing CME catalog and flags the images that contain CME structures.

Second, deep descriptor transforming localizes the common object region across the image set, and the graph cut technique then finely tunes the detected CME regions.

Finally, the binary detection images are converted to polar coordinates and CME events are identified from their frame-to-frame motion (an event must be visible in at least two frames and reach the edge of the coronagraph field of view), after which the key parameters of each event are extracted.

Tests on four representative CMEs with different characteristics, compared against four existing automatic and manual catalogs, show that CAMEL detects more complete and weaker structures and catches the initial signal of a CME earlier.

1. Introduction

Observations of coronal mass ejections (CMEs) by space missions date back to the 1970s. The coronagraphs aboard the Solar and Heliospheric Observatory (SOHO) have made tremendous contributions to CME observations. The Large Angle and Spectrometric Coronagraph Experiment (LASCO; Brueckner et al. 1995) can follow CMEs from 1.1 to about 30 RS. Since the launch of the Solar TErrestrial RElations Observatory (STEREO) mission, CMEs can be observed from two different perspectives with the coronagraphs COR 1 and COR 2 in the Sun Earth Connection Coronal and Heliospheric Investigation (SECCHI; Howard et al. 2008) instrument package. With the accumulation of coronagraph images, it becomes more and more important to have the capability to automatically detect and track different features, especially CMEs, and to build corresponding event catalogs. On one hand, they provide much easier access to data for statistical studies of CME key parameters. On the other hand, with automatic detection, the coronagraph images with CMEs flagged can be used immediately for near-real-time space weather predictions.

Space-based observations of coronal mass ejections (CMEs) date back to the 1970s. The coronagraphs aboard the Solar and Heliospheric Observatory (SOHO) have made tremendous contributions to CME observations: the Large Angle and Spectrometric Coronagraph Experiment (LASCO; Brueckner et al. 1995) can follow CMEs from 1.1 to about 30 solar radii (RS). Since the launch of the Solar TErrestrial RElations Observatory (STEREO) mission, the COR 1 and COR 2 coronagraphs of its Sun Earth Connection Coronal and Heliospheric Investigation (SECCHI; Howard et al. 2008) package have enabled CME observations from two perspectives. As coronagraph images accumulate, the need to automatically detect and track features such as CMEs and to build event catalogs becomes increasingly pressing. On one hand, such catalogs provide convenient data access for statistical studies of CME key parameters; on the other hand, with automatic detection, coronagraph images with CMEs flagged can be used immediately for near-real-time space weather predictions.

Different CME catalogs have been developed with the long-running coronagraph observations. They are classified as either manual or automated catalogs. The manual catalog that we mostly use is the CME catalog created for LASCO observations and maintained at the Coordinated Data Analysis Workshops (CDAW) data center (Yashiro et al. 2004). Event movies of observations by LASCO and other related instruments together with key parameters of each CME are provided. Although the CDAW catalog has been widely adopted, the CME detection and tracking are done by human perception and are obviously subjective and time consuming. Depending on the experience of the operator, different detection results and physical parameters may be obtained. When the Sun approaches its activity maximum, the detection and tracking of CMEs require significant manpower.

These disadvantages of manual CME catalogs prompt the development of automatic catalogs. Several methods have been devised and deployed for the LASCO and/or SECCHI coronagraph images, for instance, the Solar Eruptive Event Detection System (SEEDS; Olmedo et al. 2008), Computer-Aided CME Tracking (CACTus; Robbrecht & Berghmans 2004; Robbrecht et al. 2009), CORonal IMage Process (CORIMP; Byrne et al. 2012), and Automatic Recognition of Transient Events and Marseilles Inventory from Synoptic maps (ARTEMIS; Boursier et al. 2009). Thanks to the observations of the two STEREO spacecraft, dual-viewpoint CME catalogs have also been developed, e.g., by Vourlidas et al. (2017) for STEREO/COR 2 observations. For all the aforementioned catalogs, CMEs are detected automatically using different traditional segmentation methods.

Long-running coronagraph observations have produced a number of CME catalogs, which fall into manual and automatic categories. The most widely used manual catalog is the LASCO CME catalog maintained at the CDAW data center (Yashiro et al. 2004). It provides event movies and key parameters for each CME, but it relies on human perception, which is subjective and time consuming. Around solar maximum in particular, when CMEs are frequent, manual processing requires substantial manpower, and operators with different experience may derive inconsistent parameters.

The limitations of manual catalogs have driven the development of automatic ones. For LASCO and/or SECCHI coronagraph images, several automatic detection methods have been deployed, for example the Solar Eruptive Event Detection System (SEEDS; Olmedo et al. 2008), Computer-Aided CME Tracking (CACTus; Robbrecht & Berghmans 2004; Robbrecht et al. 2009), CORonal IMage Process (CORIMP; Byrne et al. 2012), and Automatic Recognition of Transient Events and Marseilles Inventory from Synoptic maps (ARTEMIS; Boursier et al. 2009). In addition, using dual-viewpoint STEREO data, Vourlidas et al. (2017) built a CME catalog for STEREO/COR 2. All of these methods, however, rely on traditional image-segmentation techniques and have certain limitations.

Nowadays, machine-learning techniques are becoming more and more widely used in many different research fields, bringing together a cross-disciplinary research community between computer science and solar and heliospheric physics. There have been quite a few applications of machine-learning techniques to different solar features and space weather purposes. For example, Dhuri et al. (2019) used machine learning to understand the underlying mechanisms governing flares. Huang et al. (2018) applied a deep learning method to flare forecasting. Camporeale et al. (2017) and Delouille et al. (2018) used machine-learning techniques for the classification of the solar wind and of coronal holes, respectively. Very recently, Galvez et al. (2019) even compiled a curated data set for the Solar Dynamics Observatory mission in a format suitable for the booming machine-learning research. A review of the challenges of machine learning in space weather nowcasting and forecasting can be found in Camporeale (2019).

In recent years, machine-learning techniques have shown great power in many fields and fostered cross-disciplinary research between computer science, solar physics, and space weather studies. For example, Dhuri et al. (2019) used machine learning to understand flare mechanisms; Huang et al. (2018) applied deep learning to flare forecasting; Camporeale et al. (2017) and Delouille et al. (2018) used machine learning for solar wind classification and coronal hole identification, respectively. Recently, Galvez et al. (2019) compiled a curated data set of the Solar Dynamics Observatory (SDO) mission in a format suitable for machine-learning research. For the challenges of machine learning in space weather nowcasting and forecasting, see the review by Camporeale (2019).

In the field of computer vision, machine learning has shown excellent performance in image classification, feature detection, and tracking (Krizhevsky et al. 2012; He et al. 2017; Shelhamer et al. 2017). In view of its great success and our need for fast detection and tracking for CME prediction, we developed and validated our machine-learning technique, CME Automatic detection and tracking with MachinE Learning (CAMEL), for automatic CME detection and tracking based on the LASCO C2 data. Section 2 describes the detailed mathematical methodology, including image classification, CME detection, and CME tracking. In Section 3, we compare our results with those derived from the existing SEEDS, CACTus, and CORIMP catalogs for four representative CMEs with different angular widths, velocities, and brightnesses. The method is developed and tested using observations around the solar maximum, during which the CME occurrence rate is much higher than that around the solar minimum. The large number of CMEs around the solar maximum poses a challenge for CME detection and tracking. The last section is dedicated to conclusions and discussions.

In computer vision, machine learning has shown excellent performance in image classification, object detection, and tracking (Krizhevsky et al. 2012; He et al. 2017; Shelhamer et al. 2017). Given this success and the need for fast CME detection and prediction, we developed and validated CAMEL, a tool for automatic CME detection and tracking based on LASCO C2 data. Section 2 details the methodology, including the mathematics of image classification, CME detection, and CME tracking; Section 3 compares CAMEL with the SEEDS, CACTus, and CORIMP catalogs for four CME events with different angular widths, velocities, and brightnesses. The method was developed and tested with observations around solar maximum, when the CME occurrence rate is much higher than around solar minimum and the large number of events makes detection and tracking more challenging. The last section gives conclusions and discussions.

2. Methodology

Our goal is to detect and track pixel-level CME regions in a set of white-light coronagraph images by using machine-learning methods. To this end, we design a three-module algorithm pipeline. In the first module, we use a trained convolutional neural network (CNN) to classify whether a coronagraph image contains CME structures. Images with CME structures are flagged as CME images; the remaining images are flagged as non-CME images. The second module detects pixel-level CME regions in the CME-flagged images using an unsupervised common-object co-localization method, and the detected CME regions are further refined using the graph cut method from computer vision. The final module serves to track a CME in running-difference images.

Our goal is to use machine-learning methods to detect and track pixel-level coronal mass ejection (CME) regions in a set of white-light coronagraph images. To this end, we design a three-module algorithm pipeline. In the first module, a trained convolutional neural network (CNN) classifies whether each coronagraph image contains CME structures; images with CME structures are flagged as CME images and the rest as non-CME images. The second module detects pixel-level CME regions in the CME-flagged images with an unsupervised common-object co-localization method and refines the detections with the graph cut method from computer vision. The final module tracks the CME motion in running-difference images.

2.1. Preprocessing

Before going through the pipeline, all coronagraph data are processed in the following way: the downloaded level 0.5 LASCO C2 FITS files are read with lasco_readfits.pro from the Solar Software (SSW) and are then processed to level 1 data using reduce_level.pro from the SSW. The processing consists of calibrations for dark current, flat field, stray light, distortion, vignetting, and photometry, as well as time and position corrections. After the processing, solar north is rotated to the image north. For CME detection and tracking, we use the running-difference images as inputs to the three-module algorithm pipeline.

As a preprocessing step, all input LASCO C2 images with a 1024 × 1024 resolution are first down-sampled to a 512 × 512 resolution and aligned according to the coordinates of the solar center. Then, all down-sampled images are passed through a noise filter to suppress sharp noise features. In our method, we use a normalized box filter with a sliding window of size 3 × 3. Normalized box filtering is a basic linear image filter that computes the average value of the surrounding pixels. Then, the running-difference images are computed simply by using the following:

$$u_i = n_i - n_{i-1},$$

where u_i is the running-difference image, equal to the current image, n_i, minus the previous image, n_{i-1}. For some of the LASCO images containing missing blocks, we create a missing-block mask from the previous image: if the value of a pixel in the previous image is zero, then the same pixel of the running-difference image is also set to zero. The final running-difference image is multiplied by the missing-block mask.

For the first module of our algorithm pipeline, we need to train a CNN for image classification and rough localization. From the perspective of computational efficiency, our CNN takes 112 × 112 resolution images as the input. After rough localization, the down-sampled images of the CME region will be refined by the graph cut method in the original 512 × 512 running-difference images.

2.1 Preprocessing

Before entering the pipeline, all coronagraph data are preprocessed as follows: the level 0.5 LASCO C2 FITS files are read with lasco_readfits.pro from the Solar Software (SSW) and processed to level 1 with reduce_level.pro. The processing includes corrections for dark current, flat field, stray light, distortion, vignetting, photometry, time, and position. Afterwards, solar north is rotated to the top of the image. For CME detection and tracking, running-difference images serve as the input to the three-module pipeline.

As a preprocessing step, all 1024×1024 LASCO C2 images are first down-sampled to 512×512 and aligned on the solar-center coordinates. Noise is then suppressed with a normalized box filter with a 3×3 sliding window, a basic linear filter that replaces each pixel with the average of its neighborhood. The running-difference images are then generated as u_i = n_i − n_{i−1}, i.e., the difference between the current image n_i and the previous frame n_{i−1}.

For LASCO images containing missing data blocks, a mask is built from the previous frame: if a pixel is zero in the previous image, the corresponding pixel of the running-difference image is also set to zero, and multiplying by this missing-block mask corrects the final difference image.

In the first module of the pipeline, a CNN is trained for image classification and rough localization. For computational efficiency, the network takes 112×112 images as input. After rough localization, the down-sampled CME regions are refined by the graph cut method in the original 512×512 running-difference images.
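To make the preprocessing concrete, here is a minimal NumPy sketch of the filtering and differencing steps. The 2×2 block-averaging downsample, the function names, and the edge padding of the box filter are our own assumptions; only the 3×3 normalized box filter, the running difference, and the missing-block mask follow the text.

```python
import numpy as np

def downsample_2x2(img):
    """1024x1024 -> 512x512 by 2x2 block averaging (the averaging scheme is assumed)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def box_filter_3x3(img):
    """Normalized 3x3 box filter: each pixel becomes the mean of its neighborhood."""
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / 9.0

def running_difference(current, previous):
    """u_i = n_i - n_{i-1}, with pixels zeroed where the previous frame is missing."""
    mask = (previous != 0).astype(float)   # missing-block mask from the previous image
    return (current - previous) * mask

# Two consecutive level-1 frames n_prev, n_curr (1024x1024 arrays):
# n_prev, n_curr = downsample_2x2(n_prev), downsample_2x2(n_curr)
# u = running_difference(box_filter_3x3(n_curr), box_filter_3x3(n_prev))
```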

2.2. Image Classification

Detecting and locating instances of a certain class in an image is a basic problem in computer vision. Each class has its own special features, which can be extracted manually or automatically by a supervised machine-learning method. Recently, CNNs have shown excellent performance in image classification, object detection, and other computer vision tasks. The multi-scale and sliding-window approach can help CNNs learn robust features of a certain class without any human effort or prior knowledge. Before detecting the CME events in each image, we first need a CNN model to tell us whether there are CME structures in each input LASCO C2 running-difference image. To train such a CNN in a supervised fashion, we first collect images and training labels. As a preprocessing step, all input running-difference images with a 1024 × 1024 resolution are down-sampled to a 112 × 112 resolution. The training data are 10 months of LASCO C2 images, from 2011 January to October around the solar maximum, whose category labels are known. Both image categories, flagged with or without CMEs, are obtained from the CDAW catalog. The first step of CME detection can thus be treated as a supervised image classification problem, assigning a given white-light coronagraph image to the CME-detected category or the CME-not-detected category. As a second step, the middle-level features extracted from the well-trained CNN are used for detecting the CME regions in Section 2.3.

The CNN architecture we use is LeNet-5 from LeCun et al. (1998), which has two convolution layers, two down-sampling layers, and two fully connected layers. This classical architecture can be divided into two modules: a feature extractor module and a classifier module. The feature extractor module consists of convolution layers, nonlinear activation layers, and down-sampling layers, while the last two fully connected layers form the classifier module. A convolution layer can be seen as a locally connected network in which each hidden unit connects only to a small contiguous region of the input and obtains different feature activation values at each location. The convolution kernel slides from left to right and from top to bottom on the input feature map of the upper layer. Each time it slides to a position, the kernel is multiplied and summed with the pixel values of the input feature-map block at that position, and the result is passed through the activation function to give one output pixel of the layer's feature map. The jth feature map of layer l is obtained as follows:

$$x_j^{\,l} = f\!\left(\sum_{i=1}^{N} x_i^{\,l-1} * k_{ij}^{\,l} + b_j^{\,l}\right),$$

where N denotes the number of feature maps of layer l − 1, k represents the convolution kernels, and b is a bias term. f represents the nonlinear activation function; we use rectified linear units (ReLUs), which make CNN training several times faster. Down-sampling layers help to enlarge the receptive field and aggregate features at various locations. Such a layer treats each feature map separately: it computes the max value over a neighborhood in each feature map, with the neighborhoods stepped by a stride of two.

Detecting and localizing instances of a given class in an image is a basic problem in computer vision. Each class has characteristic features that can be extracted manually or by a supervised machine-learning method. In recent years, convolutional neural networks (CNNs) have shown excellent performance in image classification, object detection, and other vision tasks; with multi-scale and sliding-window approaches, a CNN can learn robust class features without human effort or prior knowledge. Before detecting CME events in each image, a CNN model must first decide whether a LASCO C2 running-difference image contains CME structures. To train it in a supervised fashion, we first collect images and labels: all 1024×1024 running-difference images are down-sampled to 112×112, and the training set consists of 10 months of LASCO C2 images from 2011 January to October, around solar maximum, with category labels (CME / no CME) taken from the CDAW catalog. The first step of CME detection is thus a supervised image classification problem, assigning each coronagraph image to the "CME present" or "CME absent" category; in the second step, the mid-level features extracted from the well-trained CNN are used to detect CME regions (see Section 2.3).

This work adopts the LeNet-5 architecture of LeCun et al. (1998), which contains two convolution layers, two down-sampling layers, and two fully connected layers. The network divides into a feature-extraction module (convolution, nonlinear activation, and down-sampling layers) and a classification module (the final fully connected layers). A convolution layer is essentially a locally connected network in which each hidden unit connects only to a small region of the input feature map; the kernel slides over the input from left to right and top to bottom, and at each position it is multiplied and summed with the corresponding input block, with the result passed through the activation function to produce one output pixel. The formula above gives the jth feature map of layer l,

where N is the number of feature maps of layer l−1, k the convolution kernels, b a bias term, and f the nonlinear activation function; rectified linear units (ReLUs) are used to speed up training. The down-sampling layers (2×2 neighborhood, stride 2) enlarge the receptive field and aggregate spatial features by taking the maximum over a neighborhood in each feature map independently, reducing the feature-map resolution.
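As an illustration of the architecture just described, the following PyTorch sketch builds a LeNet-style classifier whose two 5×5 convolutions and two 2×2 max-pooling layers reduce a 112×112 input to 25×25 feature maps, the size quoted in the next paragraph. The channel counts (6, 16) and the 120-unit hidden layer are assumptions carried over from the original LeNet-5, not values given in the paper.

```python
import torch
import torch.nn as nn

class LeNetCME(nn.Module):
    """LeNet-5-style CME / non-CME classifier for 112x112 running-difference
    images. Channel counts (6, 16) and the 120-unit hidden layer follow the
    original LeNet-5 and are assumptions, not values given in the paper."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # 112 -> 108
            nn.ReLU(),
            nn.MaxPool2d(2),                   # 108 -> 54
            nn.Conv2d(6, 16, kernel_size=5),   # 54 -> 50
            nn.ReLU(),
            nn.MaxPool2d(2),                   # 50 -> 25, i.e., 25x25 feature maps
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 25 * 25, 120),
            nn.ReLU(),
            nn.Linear(120, 2),                 # two units: CME vs. non-CME
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# A 112x112 single-channel running-difference image in, two class scores out:
# probs = torch.softmax(LeNetCME()(torch.randn(1, 1, 112, 112)), dim=1)
```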

After the convolution and down-sampling layers, the feature maps of each image are down-sampled to a 25 × 25 resolution. Then, high-level semantic knowledge can be obtained via the fully connected layers, which output the final CME occurrence probability. The original LeNet architecture was designed for handwritten digit recognition, and its output layer has 10 units, representing the probability of each class (0–9). We modified the output layer to two units, which represent the probabilities of CME occurrence and non-occurrence. To obtain the probability, we use a two-way softmax function to produce a distribution over the two class labels:

$$p_{\text{CME}} = \frac{e^{x_{\text{CME}}}}{e^{x_{\text{CME}}} + e^{x_{\text{non-CME}}}},$$

where x_CME and x_non-CME are the two output units from the final output layer. An image with an output probability value greater than 0.5 is regarded as a CME-detected image. Figure 1 shows the LeNet architecture we use.

As a machine-learning approach, the CNN model needs to discover its weights and biases automatically from the training data and labels. For a classification problem, the objective loss function can be defined as follows:

$$L = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i^{*}\ln y_i + \left(1 - y_i^{*}\right)\ln\!\left(1 - y_i\right) \right],$$

where N denotes the number of training examples, y_i^{*} is the true label, which equals 0 or 1, and y_i is the CNN output probability, which lies between 0 and 1. We see that L is nonnegative, so the aim of the CNN training process is to minimize L as a function of the weights and biases. We trained our models using stochastic gradient descent with a batch size of 128 examples. The update rules for the weights and biases are the following:

$$w_{i+1} = w_i - \eta\,\frac{\partial L}{\partial w_i}, \qquad b_{i+1} = b_i - \eta\,\frac{\partial L}{\partial b_i},$$

where i is the iteration index and η is the learning rate. The learning rate was initialized at 0.0001 and reduced three times prior to termination. Only a batch of training examples is used for updating the weights and biases in each iteration. The weights in each layer are initialized from a zero-mean Gaussian distribution with a standard deviation of 0.01, and the neuron biases are initialized with a constant value of zero in each convolutional layer and fully connected layer. In the test phase, continuous running-difference images are classified in chronological order. A set of consecutive CME-detected frames can be seen as an image sequence of CME evolution, which is used for CME co-localization and tracking.

After the convolution and down-sampling layers, the feature maps of each image are reduced to 25×25. Fully connected layers then extract high-level semantic features and output the final CME occurrence probability. The original LeNet architecture was designed for handwritten digit recognition, with 10 output units (the probabilities of digits 0–9); here the output layer is modified to two units representing the probabilities of CME presence and absence. A two-way softmax produces the distribution over the two class labels,

where x_CME and x_non-CME are the two units of the output layer. An image with an output probability above 0.5 is classified as a CME image.

As a machine-learning model, the CNN learns its weights and biases automatically from the training data and labels. For this classification problem, the objective loss function is defined as above,

where N is the number of training samples, y^∗_i the binary true label (0 or 1), and y_i the CNN output probability (between 0 and 1). Since L is nonnegative, training amounts to minimizing this loss over the weights and biases. We train with stochastic gradient descent with a batch size of 128, using the update rules given above,

where i is the iteration index and η the learning rate (initialized to 0.0001 and decayed three times during training). The weights of each layer are initialized from a Gaussian distribution with standard deviation 0.01, and the biases of the convolution and fully connected layers are initialized to zero. In the test phase, continuous running-difference images are classified in chronological order, and a run of consecutive CME-flagged frames is treated as an image sequence of CME evolution for the subsequent co-localization and tracking.
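The training recipe can be sketched as follows, reusing the LeNetCME module from the previous listing. nn.CrossEntropyLoss combines the two-way softmax and the cross-entropy loss above in one operation; the epoch count, the learning-rate milestones, and the stand-in data loader are assumptions for illustration.

```python
import torch
import torch.nn as nn

model = LeNetCME()  # from the previous sketch

# Zero-mean Gaussian weight init (std 0.01) and zero biases, as described above.
for m in model.modules():
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.normal_(m.weight, mean=0.0, std=0.01)
        nn.init.zeros_(m.bias)

criterion = nn.CrossEntropyLoss()  # two-way softmax + cross-entropy in one op
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)  # initial rate 0.0001
# The learning rate is reduced three times before termination (milestones assumed).
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[30, 60, 90], gamma=0.1)

# Stand-in loader: one batch of 128 images with 0/1 labels (replace with real data).
loader = [(torch.randn(128, 1, 112, 112), torch.randint(0, 2, (128,)))]

for epoch in range(100):              # epoch count assumed
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()               # gradients of L w.r.t. weights and biases
        optimizer.step()              # w <- w - eta * dL/dw
    scheduler.step()
```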

2.3. CME Detection

2.3.1. CME Region Co-localization

After the classification, the next step is to segment the CME regions in every CME-detected image. However, due to the lack of a set of images with labeled CME regions, we need to solve the problem in an unsupervised fashion. After training the above LeNet neural network, we can extract convolutional feature maps from the last convolution layer of the CNN model. Each feature map can be considered a down-sampled version of the input and contains high-level semantic information. To mine this hidden information for segmenting the CME regions, we use Deep Descriptor Transforming (DDT; Wei et al. 2019), an unsupervised image co-localization method that utilizes Principal Component Analysis (PCA; Pearson 1901) to analyze CNN feature maps and localize the category-consistent regions of each image in an image set. The extracted feature maps can be considered as 25 × 25 cells, each containing one d-dimensional feature vector. PCA uses an orthogonal transformation to convert d correlated variables into a set of linearly uncorrelated variables called principal components via the eigendecomposition of the covariance matrix. The covariance matrix of the input data is calculated by

$$\mathrm{Cov}(x) = \frac{1}{K}\sum_{n=1}^{N}\sum_{i=1}^{h}\sum_{j=1}^{w}\left(x^{n}_{(i,j)} - \bar{x}\right)\left(x^{n}_{(i,j)} - \bar{x}\right)^{\mathsf{T}},$$

where K = h × w × N, \bar{x} is the mean feature vector, N denotes the number of input feature maps with an h × w resolution, and x^n_{(i,j)} represents the d-dimensional CNN feature vector of image n at pixel position (i, j). After the eigendecomposition, we get the eigenvectors \xi_{(1)}, \xi_{(2)}, \dots, \xi_{(d)} of the covariance matrix, corresponding to the eigenvalues sorted in descending order, λ_1 ≥ ⋯ ≥ λ_d ≥ 0. We take the first eigenvector, which corresponds to the largest eigenvalue, as the main projection direction. For a particular position (i, j) of the CNN feature map of image n, its main principal feature is calculated as follows:

$$f^{\,n}_{(i,j)} = \xi_{(1)}^{\mathsf{T}}\left(x^{n}_{(i,j)} - \bar{x}\right).$$

In this way, we reduce the feature dimension from d to 1, and the feature value after the transformation can be treated as the appearance probability of the common object at each pixel position. The values f^n_{(i,j)} at all pixel locations form an indicator matrix whose dimensions are h × w:

$$F^{\,n} = \left[\, f^{\,n}_{(i,j)} \,\right]_{h \times w}.$$

The pipeline of image co-localization can be found in Figure 2. The image sequence of CME evolution obtained from the trained CNN model consists of a set of CME images, which are processed directly by the DDT algorithm for CME region co-localization. The final output of the image co-localization is a set of CME region mask images with the same resolution as the input feature maps. For convenience, we resize the outputs to a common resolution by nearest-neighbor interpolation.

2.3.1 CME Region Co-localization

After classification, the next step is to segment the CME region in each CME image. Lacking a training set with labeled CME regions, the problem must be solved in an unsupervised fashion. From the trained LeNet network, feature maps are extracted from the last convolution layer; they are down-sampled representations of the input image that carry high-level semantic information. To mine this hidden information for segmentation, we use Deep Descriptor Transforming (DDT; Wei et al. 2019), an unsupervised image co-localization method that applies Principal Component Analysis (PCA; Pearson 1901) to the CNN feature maps to localize the category-consistent regions across an image set.

The extracted feature maps can be viewed as a 25×25 grid of cells, each holding a d-dimensional feature vector. PCA applies an orthogonal transformation that converts the d correlated variables into linearly uncorrelated principal components, starting from the covariance matrix of the input data given above,

where K = h×w×N, N is the number of input feature maps of resolution h×w, and x^n_(i,j) is the d-dimensional CNN feature vector of image n at pixel position (i, j). Eigendecomposition yields the eigenvectors \xi_{(1)}, \xi_{(2)}, \dots, \xi_{(d)} corresponding to the eigenvalues sorted as λ_1 ≥ ⋯ ≥ λ_d ≥ 0. The first eigenvector, belonging to the largest eigenvalue, is taken as the main projection direction, and the main principal feature at position (i, j) of image n is computed as above.

This transformation reduces the feature dimension from d to 1, and the transformed value can be read as the probability that the common object appears at each pixel position. The values f(i,j) at all pixel positions form an h×w indicator matrix.

The co-localization pipeline is shown in Figure 2. The CME evolution sequence obtained from the CNN model, i.e., a set of CME images, is fed directly into the DDT algorithm for CME region co-localization. The final output is a set of CME region masks at the same resolution as the input feature maps; for convenience, the masks are resized by nearest-neighbor interpolation.
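A NumPy sketch of the DDT computation just described: stack the d-dimensional descriptors of all N feature maps, form the covariance matrix, project every descriptor onto the leading eigenvector, and reshape the projections into per-image indicator matrices. Thresholding the projections at zero to obtain the CME masks follows the usual DDT convention and is our assumption here.

```python
import numpy as np

def ddt_colocalize(features):
    """features: (N, h, w, d) CNN descriptors for the N CME-flagged frames.
    Returns (N, h, w) indicator matrices (first-principal-component projections)."""
    n, h, w, d = features.shape
    x = features.reshape(-1, d)                 # K = N*h*w descriptors
    mean = x.mean(axis=0)
    cov = (x - mean).T @ (x - mean) / x.shape[0]
    _, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    xi1 = eigvecs[:, -1]                        # eigenvector of the largest eigenvalue
    f = (x - mean) @ xi1                        # main principal feature: d -> 1
    return f.reshape(n, h, w)

# Positive projections mark the category-consistent (CME) region in each frame;
# zero-thresholding follows the usual DDT convention and is assumed here:
# masks = ddt_colocalize(feats) > 0
```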

2.3.2. CME Region Refinement

The outputs of the pipeline in Figure 2 are just images with roughly detected CME regions. To obtain images with finely tuned CME regions, we use the graph cut method (Boykov et al. 2001) from computer vision to smooth the segmented regions. The indicator matrix can only roughly tell the probability that a pixel position belongs to the CME or non-CME class; however, neighboring pixels tend to be class-consistent. To address this, a framework of energy minimization is naturally formulated, in which one seeks the labeling l of image pixels that minimizes the energy:

$$E(l) = \lambda_s\,E_{\text{smooth}}(l) + \lambda_d\,E_{\text{data}}(l),$$

where λ_s and λ_d are nonnegative constants that balance the influence of each term. E_smooth(l) measures the class consistency of l among boundary pixels according to their neighborhood intensity differences, while E_data(l) measures the disagreement between l and the predicted data, based mainly on the probability calculated in Section 2.2. We set E_smooth(l) and E_data(l) as follows:

$$E_{\text{data}}(l) = -\sum_{p}\ln \mathrm{pr}(l_p), \qquad E_{\text{smooth}}(l) = \sum_{(p,q)\in\mathcal{N}} \exp\!\left(-\frac{(I_p - I_q)^2}{2\sigma^2}\right)\delta\!\left(l_p \neq l_q\right),$$

where pr(l_p) denotes the probability of pixel position p being assigned the label l_p (CME or non-CME), I_p denotes the intensity at position p, \mathcal{N} is the set of neighboring pixel pairs, δ(·) equals 1 when its argument holds and 0 otherwise, and σ scales the intensity differences. Graph cut optimization can then be employed to solve the energy minimization problem efficiently: the algorithm builds a graph associated with the labeling problem, and the minimum cut of that graph yields the labeling that minimizes the energy function. More details on the graph cut algorithm can be found in Boykov et al. (2001). Figure 3 shows one example comparing the results before and after optimization.

2.3.2 CME Region Refinement

The images output by the pipeline in Figure 2 contain only roughly detected CME regions. To refine them, we apply the graph cut method from computer vision (Boykov et al. 2001) to smooth the segmented regions. The indicator matrix only roughly reflects the probability that a pixel belongs to the CME or non-CME class, while neighboring pixels tend to share the same class. We therefore formulate an energy-minimization framework and seek the pixel labeling l that minimizes the energy function above,

where λ_s and λ_d are nonnegative weights balancing the two terms: E_smooth(l) measures the class consistency of neighboring pixels (based on intensity differences), and E_data(l) measures the disagreement between l and the predicted data (based on the probabilities of Section 2.2), with the definitions given above,

where pr(l_p) is the probability that pixel p carries its assigned label and I_p is the intensity at pixel p. The graph cut algorithm solves this energy minimization efficiently: the labeling problem is converted into a graph, and the minimum cut of that graph gives the minimum of the energy function. See Boykov et al. (2001) for details. Figure 3 compares the result before and after the optimization.
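The energy minimization can be handed to an off-the-shelf max-flow solver. The sketch below uses the PyMaxflow package, with negative log probabilities as the data term and a uniform smoothness weight between 4-connected neighbors; the specific pairwise weighting and the values of λ_s and λ_d used in CAMEL are not given here, so these choices are illustrative.

```python
import numpy as np
import maxflow  # PyMaxflow

def refine_cme_mask(pr_cme, lambda_s=1.0, lambda_d=1.0):
    """pr_cme: (H, W) per-pixel CME probabilities from the co-localization step.
    Minimizes lambda_d * sum_p(-log pr) + lambda_s * (pairwise Potts term)."""
    eps = 1e-6
    g = maxflow.Graph[float]()
    nodes = g.add_grid_nodes(pr_cme.shape)
    # Smoothness: uniform penalty between 4-connected neighbors (illustrative;
    # an intensity-dependent weight could be used instead).
    g.add_grid_edges(nodes, lambda_s)
    # Data term: cost of labeling each pixel CME vs. non-CME.
    cost_cme = -lambda_d * np.log(pr_cme + eps)
    cost_bg = -lambda_d * np.log(1.0 - pr_cme + eps)
    g.add_grid_tedges(nodes, cost_cme, cost_bg)
    g.maxflow()
    return g.get_grid_segments(nodes)  # True where the cut pays cost_cme, i.e., CME

# refined = refine_cme_mask(prob_map)  # prob_map from Section 2.3.1
```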

2.4. CME Tracking

After CME region detection and refinement, we obtain only the CME regions of each frame independently. Furthermore, there could be more than one CME in the image sequence of CME evolution obtained in Sections 2.2 and 2.3. To track a CME in a series of running-difference images, we define rules to identify a CME similar to those of Olmedo et al. (2008). First, a CME must be seen to move outward in at least two running-difference images. Second, the maximal height of a CME must extend beyond the C2 field of view (FOV). Any tracked candidate that does not satisfy these two rules is abandoned. Moreover, given a set of images with pixel-level CME regions annotated, we aim to recognize each CME in that image set and analyze its key parameters, e.g., the central position angle (CPA), angular width, and median and maximal velocities.

To better track the movement of a CME, all images with annotated CME regions are transformed to a polar coordinate system with a 360 × 360 resolution; the height range at each angular position runs from 2.2 to 6.2 solar radii. As a demonstration, we use the CME event that occurred on 2012 February 4. Figure 4(a) shows the input of our tracking module, an image sequence of CME evolution over a given time range, ordered by observation time; the original CME images are shown in gray, and the detected CME pixels are indicated in pink. The top panel of Figure 4(b) presents, as an example, the coordinate transformation of the frame at 19:35 UT. In practice, we apply the coordinate transformation to each image in the sequence and compute the maximal height of the CME region mask at each position angle within the given time range. All position angles with a maximal height less than half of the FOV are removed, and the remaining position angles are merged according to their connectivity. Again using the frame at 19:35 UT as an illustration, the bottom panel of Figure 4(b) shows the cleaned result, from which we determine the start position angle (start PA) and the end position angle (end PA) of each CME. The central position angle (CPA) is the average of the start PA and end PA, and the angular width is the difference between these two PAs.

The height–time diagram at each position angle between the derived start and end position angles can be retrieved, and the corresponding velocity obtained. In Figure 4(c), as a representative case, we plot the CME height evolution at the position angle with the maximal velocity. To determine the start and end times of each CME, we find all time segments with an increasing CME height in the height–time diagram. Next, we check whether the CME in each time segment meets the two defined criteria: existing in at least two frames and reaching beyond the LASCO C2 FOV. Time segments that do not satisfy these criteria are discarded. For the case in Figure 4(c), the derived final time range of the tracked CME is indicated by the blue dashed line. To derive representative CME velocities, we compute the median and maximal values of the CME velocity distribution over all derived position angles, where the velocity at each position angle is calculated by a linear fit to the data points in the obtained time segment. As in the CACTus catalog (Robbrecht et al. 2009), we use the median velocity as the overall velocity of the detected CME; to compare with the velocity in the CDAW catalog, we also calculate the maximal velocity. In summary, for a tracked CME we provide the following five fundamental parameters: the first appearance time in the LASCO C2 FOV (Tstart), the CPA, the angular width (AW), and the median and maximal velocities (Vmed and Vmax). These fundamental parameters are used for the comparison among different detection and tracking techniques.

2.4 CME Tracking

After CME region detection and refinement, the CME regions in individual frames are still independent results, and the CME evolution sequence obtained in Sections 2.2 and 2.3 may contain more than one CME. To track CMEs in the running-difference images, we adopt two criteria similar to those of Olmedo et al. (2008): first, a CME must be seen to move outward in at least two running-difference frames; second, the maximal height of a CME must extend beyond the C2 field of view (FOV). Tracked candidates failing either criterion are discarded. For the images with pixel-level CME regions annotated, we then identify each CME event and analyze its key parameters, including the central position angle (CPA), angular width, and median and maximal velocities.

To track the CME motion accurately, all images with annotated CME regions are transformed to a polar coordinate system with a 360×360 resolution, with heights from 2.2 to 6.2 solar radii at each angle. Taking the CME event of 2012 February 4 as an example, Figure 4(a) shows the input to the tracking module: a time-ordered image sequence of the CME evolution, with the original images in gray and the detected CME pixels marked in pink. The top panel of Figure 4(b) shows the polar transformation of the 19:35 UT frame. In practice, the coordinate transform is applied to every image in the sequence, and the maximal height of the CME mask is computed at each position angle over the time range; position angles whose maximal height stays below half of the FOV are removed, and the remaining ones are merged by angular connectivity. The bottom panel of Figure 4(b) shows the cleaned result for the 19:35 UT frame, from which the start position angle (start PA) and end position angle (end PA) of each CME are determined. The CPA is the average of the two, and the angular width (AW) is their difference.
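A NumPy sketch of the polar transform and the angular bookkeeping: the 360×360 grid and the 2.2–6.2 solar-radii height range follow the text, while the Sun-center pixel, the plate scale (rsun_pix), and the simplified connectivity handling (no wrap-around at 0°/360°) are assumptions.

```python
import numpy as np

def to_polar(mask, center, rsun_pix, n_theta=360, n_r=360, r_min=2.2, r_max=6.2):
    """Resample a 512x512 CME mask onto a (position angle, height) grid covering
    2.2-6.2 solar radii. rsun_pix is the solar radius in pixels (placeholder)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    r = np.linspace(r_min, r_max, n_r) * rsun_pix
    tt, rr = np.meshgrid(theta, r, indexing="ij")
    x = np.clip(np.rint(center[0] + rr * np.cos(tt)).astype(int), 0, mask.shape[1] - 1)
    y = np.clip(np.rint(center[1] - rr * np.sin(tt)).astype(int), 0, mask.shape[0] - 1)
    return mask[y, x]  # (n_theta, n_r) array, nearest-neighbor sampling

def max_height_index(polar_mask):
    """Largest radial index containing a CME pixel at each position angle (-1 if none)."""
    n_r = polar_mask.shape[1]
    return np.where(polar_mask, np.arange(n_r), -1).max(axis=1)

def angular_extent(polar_masks):
    """Drop position angles whose maximal height stays below half the FOV and
    return the start/end PA of the surviving block (wrap-around ignored)."""
    max_h = np.max([max_height_index(m) for m in polar_masks], axis=0)
    angles = np.flatnonzero(max_h > polar_masks[0].shape[1] // 2)
    return (angles.min(), angles.max()) if angles.size else None

# start_pa, end_pa = angular_extent([to_polar(m, (256, 256), 48.0) for m in masks])
# cpa, width = 0.5 * (start_pa + end_pa), end_pa - start_pa  # 1 degree per bin
```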

Within the interval between the start and end position angles, the height–time diagram at each position angle can be retrieved and the corresponding velocity computed. Figure 4(c) shows the CME height evolution at the position angle with the maximal velocity. To determine the start and end times of a CME, we find all time segments with increasing CME height in the height–time diagram and check whether each segment satisfies the two criteria (at least two frames and extending beyond the C2 FOV); the final accepted time range is marked by the blue dashed line in Figure 4(c). For representative velocities, we compute the median (Vmed) and maximum (Vmax) of the velocity distribution over all derived position angles, where the velocity at each position angle comes from a linear fit to the data points in the accepted time segment. Following the CACTus catalog (Robbrecht et al. 2009), the median velocity serves as the overall CME velocity, while the maximal velocity is also computed for comparison with the CDAW catalog. In summary, each tracked CME is given five fundamental parameters: the first appearance time in the LASCO C2 FOV (Tstart), CPA, AW, Vmed, and Vmax, which are used to compare the different detection and tracking techniques.
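Per-angle velocities then follow from linear fits to the height–time points of each accepted time segment, with Vmed and Vmax taken over all position angles. A sketch, assuming heights_km holds the CME front height per frame and angle (both array names are placeholders):

```python
import numpy as np

def fit_velocities(times_s, heights_km):
    """Linear fit of height vs. time at each position angle within an accepted
    time segment; returns one velocity (km/s) per position angle."""
    return np.array([np.polyfit(times_s, heights_km[:, a], 1)[0]
                     for a in range(heights_km.shape[1])])

# v = fit_velocities(times_s, heights_km)   # heights_km: (n_frames, n_angles)
# v_med, v_max = np.median(v), np.max(v)    # Vmed (overall) and Vmax
```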