Contrastive machine learning reveals species-shared and -specific brain functional architecture | Literature Express: Advances in AI for Medical Imaging

Published: 2025-02-22

Title

Contrastive machine learning reveals species-shared and -specific brain functional architecture

01

Introduction

The macaque, as an animal model of the human, has been widely used to study key aspects of the brain and behavior (Goulas et al., 2014a). This may be attributed to the fact that the two species share, at least to some extent, common brain architecture (Orban et al., 2004; Rilling et al., 2008), for example functional connectomes (Margulies et al., 2009), which reflect the cooperation of brain regions (Sporns, 2011) and are thought to relate to the emergence of brain function and behavior (Shen et al., 2017). However, macaques and humans diverged from their common ancestor roughly 25 million years ago (Kumar and Hedges, 1998). As species branch on the phylogenetic tree, the commonalities change in many respects, such as the disproportionate expansion of the cortex and the rewiring of axons (Semendeferi et al., 2002; Krubitzer and Kaas, 2005; Van Essen and Dierker, 2007a), so translating the macaque brain into the human brain may not simply be a matter of brain scaling. Quantitatively studying cross-species commonalities and differences in brain architecture, including functional connectomes, may therefore provide insight into the conservation and evolution of brain function and behavior, and is essential for better using animal models to translate knowledge into scientific and clinical applications for humans (Kelly et al., 2012).

Current studies typically treat one species, such as the macaque, as the 'background' against which the other species is contrasted (Deacon, 1990). This comparative paradigm works in some settings, for example applying a contrastive variational autoencoder (CVAE) (Abid and Zou, 2019) to capture brain features of autistic patients relative to a normal population that serves as the 'background' (Aglinskas et al., 2022). For cross-species comparison, however, the species evolved independently after diverging from their common ancestor, each developing its own distinctive features. The existing paradigm struggles to attribute the source of differences, i.e., whether a difference arose along the macaque branch or along the human branch of evolution. From this perspective, it is more reasonable to assume that the two species share one 'brain' as the 'background' and to contrast both species against it, so that species-specific features can be isolated in parallel. Moreover, matching one species to a 'background' theoretically presupposes that we already know the factors that need to be matched (Deacon, 1990). Yet there is intrinsic individual variation across subjects, and possibly methodological artifacts as well. Such irrelevant variation can obscure the identification of species-shared commonalities or species-specific variation (Buckner and Krienen, 2013).

To address these issues, we recently developed a new algorithm based on the variational autoencoder (Kingma and Welling, 2013), called the shared-unique variation autoencoder (SU-VAE), to compare connectomes derived from resting-state functional MRI of macaque and human brains. SU-VAE takes unpaired brain functional connectomes of the two species as input and disentangles each species' 'unique' variation from the 'shared' variation, where the latter is initially unknown and must be estimated. In this way, the variations are represented by three distinct groups of latent features, as shown in Fig. 1.
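The data flow this implies can be sketched with toy linear maps standing in for the trained encoders and decoder (all shapes, weights, and helper names here are illustrative assumptions, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: e.g. the upper triangle of a 47x47 connectome matrix.
n_edges, d_latent = 1081, 16

def linear_encoder(d_in, d_out, rng):
    """A toy linear map standing in for a trained (variational) encoder."""
    W = rng.normal(scale=0.01, size=(d_out, d_in))
    return lambda x: W @ x

# Three groups of latent features: macaque-specific, human-specific, shared.
enc_macaque = linear_encoder(n_edges, d_latent, rng)
enc_human = linear_encoder(n_edges, d_latent, rng)
enc_shared = linear_encoder(n_edges, d_latent, rng)

# A decoder re-entangles a (shared, specific) latent pair into a connectome.
W_dec = rng.normal(scale=0.01, size=(n_edges, 2 * d_latent))
def decode(z):
    return W_dec @ z

x_human = rng.normal(size=n_edges)  # one unpaired human connectome vector
z_shared = enc_shared(x_human)      # estimated species-shared variation
z_unique = enc_human(x_human)       # human-specific variation
recon = decode(np.concatenate([z_shared, z_unique]))
assert recon.shape == (n_edges,)
```

Training would additionally impose reconstruction and KL losses so that the shared latents capture only variation common to both species; the sketch above only fixes the shapes and the re-entangling step.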

We developed the method on two large neuroimaging datasets: the Human Connectome Project (HCP) dataset (Van Essen et al., 2012) and the University of Wisconsin-Madison macaque MRI (MUWM) dataset (UW-Madison, 2018). Validation of the disentangled latent features on the human dataset showed that human-specific features are differentially related to cognitive scores, such as language-related scores, whereas features shared with the macaque show stronger associations with sensorimotor scores. We then identified the species-shared connectomes and each species' specific connectomes, results that are supported by previous studies. We further projected the disentangled connectomes onto the cortex, revealing a gradient that reflects connectome divergence between the species. To interpret these results further, we computed graph metrics of the shared and specific connectomes to discern their potential roles in the evolution of the functional connectome, and found that introducing human-specific connections, but not macaque-specific ones, enhances network efficiency. Finally, we explored possible gene-regulatory mechanisms that could relate to evolutionary pressure and the formation of human-specific functional connectomes using the whole-brain gene expression dataset provided by the Allen Institute (Shen et al., 2012), and identified a set of genes enriched for 'axon guidance'. We validated the robustness and generality of our method by replicating the results on an independent large-scale human dataset, the Chinese Human Connectome Project (CHCP) dataset (Ge et al., 2023).

Abstract

A deep comparative analysis of the brain functional connectome across species in primates has the potential to yield valuable insights for both scientific and clinical applications. However, the interspecies commonality and differences are inherently entangled with each other and with other irrelevant factors. Here we develop a novel contrastive machine learning method, called shared-unique variation autoencoder (SU-VAE), to allow disentanglement of the species-shared and species-specific functional connectome variation between macaque and human brains on large-scale resting-state fMRI datasets. The method was validated by confirming that human-specific features are differentially related to cognitive scores, while features shared with macaque better capture sensorimotor ones. The projection of disentangled connectomes to the cortex revealed a gradient that reflected species divergence. In contrast to macaque, the introduction of human-specific connectomes to the shared ones enhanced network efficiency. We identified genes enriched on 'axon guidance' that could be related to the human-specific connectomes.

Method

2.1. Previous work

2.1.1. Disentangled representation learning

Finding representations for the task is fundamental in machine learning. A factor is characterized as 'disentangled' when any intervention on this factor results in a specific change in the generated data (Bengio et al., 2013). Recently, much work has focused on learning disentangled representations with VAEs (Higgins et al., 2016; Locatello et al., 2019b; Tschannen et al., 2018), in which each latent feature learns one semantically meaningful factor of variation while remaining invariant to other factors. Disentangled representation learning has been proposed as an approach to learning general representations even in the absence of, or with limited, supervision (Liu et al., 2022). Although there is no widely accepted definition of disentangled representations yet, the main intention is to separate the main factors of variation present in the provided data distribution (Higgins et al., 2018; Locatello et al., 2019a). However, these methods mainly focus on disentangling the 'factors' of objects and cannot cope with the separation of exclusive and shared content. Some work also focuses on disentangling latent features between two domains, mainly seeking to transfer a classifier or to map an image to a different distribution (Liu et al., 2018; Lin et al., 2019; Ding et al., 2020). These studies demonstrate the potential of using unpaired data for domain-specific feature extraction and inter-domain feature transformation, which are among the abilities required of a model for the problem we are studying. Beyond natural images, disentangled representation learning has a wide range of applications in medical imaging, such as the spatial decomposition network (SDNet) (Chartsias et al., 2019), which factorizes 2D medical images into spatial anatomical factors and non-spatial modality factors. Thermos et al. (2021) proposed a generative model that learns to combine anatomical factors from different input images, re-entangling them with the desired imaging modality (e.g., MRI) to create plausible new cardiac images with specific characteristics. These feature-disentanglement methods, which focus on specific fields and exploit synthesis among other means, suggest that combining and reconstructing different features holds great potential for learning latent embeddings. However, these methods require additional supervised learning (Lin et al., 2019) or additional human guidance (Ding et al., 2020; Chartsias et al., 2019). They also need to know in advance the features to be learned in each image dataset, such as smile and non-smile features in a sketch-and-real-image dataset (Liu et al., 2018). Such methods can be applied to conditional cross-domain image synthesis (Thermos et al., 2021) and translation, but they are not well suited to our problem, which must be completely unsupervised and must disentangle the shared and specific features of the two datasets without knowing any visible features of either.
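For reference, the VAE backbone these methods share regularizes each approximate posterior N(μ, diag(σ²)) toward the prior N(0, I); the closed-form KL penalty can be sketched as follows (a generic textbook formula, not code from any of the cited papers):

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), the VAE regularizer."""
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))

# A posterior that already matches the prior incurs zero penalty...
assert gaussian_kl(np.zeros(4), np.zeros(4)) == 0.0
# ...while shifting one latent mean to 1.0 costs 0.5 nats.
assert abs(gaussian_kl(np.array([1.0, 0.0]), np.zeros(2)) - 0.5) < 1e-12
```

Weighting this term (as in the β-VAE line of work cited above) is one common lever for encouraging each latent dimension to capture a single factor of variation.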


Results

3.1. Overview of experiments

To demonstrate the effectiveness of the proposed model, we initially conducted validation on a synthetic dataset (Supplementary Section 1.2). The synthetic data consist of two types of data, each superimposing a shared 'background' (a sunflower) with its own unique 'foreground' (the digits '0' and '1', respectively). The experimental results show that SU-VAE successfully disentangled the shared and specific features of the two types of data (Supplementary Fig. S1 & Fig. S3). Subsequently, large-scale human and macaque brain functional connectome data were used to train SU-VAE. Since interhemispheric dissimilarity is not of interest here, we have 970 human connectome samples and 880 macaque ones, of which 800 human and 800 macaque samples were used as the training set, while the remaining data were used as the testing set. As shown in Fig. 2, the unpaired data first pass through step 1 to obtain the reconstruction of the species-shared brain functional connectome. Then the three encoders in step 2 respectively learn the latent representations of the macaque-specific, human-specific, and species-shared brain functional connectomes. Finally, validation results on the test set and an out-of-domain dataset show the superior performance of SU-VAE.
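The data handling implied here, flattening each symmetric FC matrix into an edge vector and holding out everything beyond the 800 training samples, might look like the following sketch (the 47-region atlas size and the random matrices are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def vectorize_connectome(fc):
    """Flatten a symmetric FC matrix to its upper triangle (no diagonal)."""
    i, j = np.triu_indices(fc.shape[0], k=1)
    return fc[i, j]

# Sizes mirroring the text: 970 human / 880 macaque samples, 800 for training.
n_regions = 47                       # assumed atlas size, for illustration only
n_human, n_macaque, n_train = 970, 880, 800

fc = rng.uniform(-1, 1, size=(n_regions, n_regions))
fc = (fc + fc.T) / 2                 # symmetric, like a correlation matrix
x = vectorize_connectome(fc)
assert x.size == n_regions * (n_regions - 1) // 2   # 1081 edges for 47 regions

perm = rng.permutation(n_human)
train_idx, test_idx = perm[:n_train], perm[n_train:]
assert len(test_idx) == n_human - n_train           # 170 held-out human samples
```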


Figure

Fig. 1. Data flow and the framework of SU-VAE, aiming at disentangling 'species-shared' and 'species-specific' functional connectomes. On the left is the construction of functional connectomes of the macaque brain and the human brain on a specific atlas (Brodmann areas). On the right is a schematic diagram of the SU-VAE model.


Fig. 2. The detailed architecture of SU-VAE. Step 1 and step 2 are marked by blue and red panels, respectively. Only class 𝑦 is shown in step 2. The '∼' in step 2 means the processes for 𝑥 are similar to those for 𝑦.


Fig. 3. (a) Box plots of the RSA results. The horizontal white line represents the mean and the vertical red line represents a 95% confidence interval. The upper and lower bounds of the boxes represent the third and first quartiles. Stars (*𝑝 < 0.05, **𝑝 < 0.01, ***𝑝 < 0.001) on the 𝑥-axis indicate the 𝑝-values of the paired t-test on the sampling results of the similarity analysis for each corresponding 'shared' and 'specific' representation of the indicators. (b) Average summary of RSA results for the three major categories of behavioral indicators in (a). Abbreviations: EMPE: Episodic Memory, EF: Executive Function/Inhibition, FI: Fluid Intelligence, LRD: Language/Reading Decoding, PS: Processing Speed, SO: Spatial Orientation, SA: Sustained Attention, VEM: Verbal Episodic Memory, WM: Working Memory, CFC: Cognition Fluid Composite, CECC: Cognition Early Childhood Composite, CTCS: Cognition Total Composite Score, CCC: Cognition Crystallized Composite, ER_CRT: Emotion Recognition ER40_CRT, ER_ANG: Emotion Recognition ER40ANG, ER_FEAR: ER40FEAR, ER_HAP: ER40HAP, ER_SAD: ER40SAD, ENDU: Endurance, LOCO: Locomotion, DEX: Dexterity, STR: Strength, ODOR: Olfaction, TAST: Taste.
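The RSA statistic summarized in this figure compares subject-by-subject dissimilarity structure between a set of latent features and a set of behavioral scores. A minimal sketch with toy data (Pearson-based RDMs; not the authors' exact sampling or testing procedure):

```python
import numpy as np

def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson r between subjects."""
    return 1.0 - np.corrcoef(features)

def rsa_score(feat_a, feat_b):
    """Correlate the upper triangles of two RDMs (a basic RSA statistic)."""
    i, j = np.triu_indices(feat_a.shape[0], k=1)
    return np.corrcoef(rdm(feat_a)[i, j], rdm(feat_b)[i, j])[0, 1]

rng = np.random.default_rng(0)
latents = rng.normal(size=(30, 16))   # 30 subjects x 16 latent features
behavior = rng.normal(size=(30, 5))   # 30 subjects x 5 behavioral scores

# A representation compared with itself yields the maximal RSA score of 1;
# against unrelated behavior the score is expected to hover near zero.
assert np.isclose(rsa_score(latents, latents), 1.0)
r = rsa_score(latents, behavior)
```

Comparing such scores between the 'shared' and 'specific' latent groups, indicator by indicator, is what the paired tests in panel (a) summarize.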


Fig. 4. (a) Average matrix form of the inputs, and significant shared and specific connections as outputs. (b) Display of specific and shared connections against the surface of the brain as the background.


Fig. 5. T-SNE results of latent feature comparison among methods on BA data. (a) T-SNE result of SU-VAE. (b) T-SNE result of CVAE. (c) T-SNE result of DIG. (d) T-SNE result of DID.


Fig. 6. Comparison of latent embeddings of different models with representation similarity analysis. (a)–(e) are the results on BA data trained by SU-VAE, CVAE, DMR, DID, and DIG, respectively. Because SU-VAE and CVAE are probabilistic models based on the VAE, we sampled the latent embeddings of each subject in these two models 6 times, whereas DID and DIG are GAN-based models and DMR is a linear model, so each subject in these three models corresponds to only one latent embedding.


Fig. 7. The differences in functional connectomes between species are transferred onto the cortical surface space. Individual matching between species was conducted by leveraging the shared features of the trained SU-VAE. Functional connectome differences (human minus macaque) between species were then calculated based on these matched pairs. Principal component analysis and spectral clustering were adopted to highlight the patterns of the species-difference connectomes. The left panel shows the mean species-difference connectome reordered by the clustering ranks (grayscale bars) on the first two PCs. In the middle panel, BAs were placed in the space of the reordered PCs. The crossing highlights the location of the origin, where the species difference is close to zero. The 2D color map in the background was used to color-code the human cortical surface, shown in the right panel. H: human, M: macaque, Vis: visual, Aud: auditory, Sen/Mo: sensory/motor, VAN: ventral attention network, DAN: dorsal attention network, LS: limbic system, FPN: frontoparietal network, DMN: default mode network.
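The PCA step described in this caption can be sketched with plain SVD on centered species-difference vectors (the subject and edge counts are illustrative, and the spectral clustering stage is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical species-difference connectomes (human minus macaque, after
# matching individuals via the shared latents): subjects x edges.
diff = rng.normal(size=(100, 1081))

# PCA via SVD of the centered data; PC scores order items for display.
centered = diff - diff.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
scores = centered @ Vt[:2].T              # projection onto the first two PCs

explained = S**2 / np.sum(S**2)           # variance ratio per component
assert scores.shape == (100, 2)
assert explained[0] >= explained[1]       # singular values come sorted
```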


Fig. 8. Six graph metrics on the random, macaque, human, and species-shared connectome graphs, displayed as box plots. A paired t-test is used to compare the significance of performance between pairwise networks; *** represents 𝑝 < 0.001, and the 𝑝-values of one-way ANOVA tests on all metrics are less than 0.001.
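One such graph metric, global efficiency, is the mean inverse shortest-path length over node pairs. A self-contained sketch on a toy graph shows how adding a shortcut edge, loosely analogous to layering human-specific connections onto the shared connectome, raises efficiency:

```python
from collections import deque

def global_efficiency(n, edges):
    """Mean inverse shortest-path length over all node pairs (unweighted)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0.0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        q = deque([s])
        while q:                      # breadth-first search from s
            u = q.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(1.0 / d for d in dist if d > 0)
    return total / (n * (n - 1))

# A 4-node path graph gains efficiency when a shortcut edge closes the ring.
path = [(0, 1), (1, 2), (2, 3)]
e_path = global_efficiency(4, path)            # 13/18
e_ring = global_efficiency(4, path + [(0, 3)]) # 5/6
assert e_ring > e_path
```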


Fig. 9. The six genes simultaneously enriched on axon guidance in three databases. Their whole-brain gene expression region correlation matrices were compared, via Pearson correlation, with the species-shared and human-unique functional connectivity matrices, and the results were compared using paired t-tests; *** represents 𝑝 < 0.001.
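The matrix-to-matrix comparison described here, correlating the upper triangles of a gene co-expression matrix and a functional connectivity matrix, can be sketched as follows (toy data; the paired t-test across genes is omitted):

```python
import numpy as np

def upper(m):
    """Upper-triangle (no diagonal) of a square matrix as a vector."""
    i, j = np.triu_indices(m.shape[0], k=1)
    return m[i, j]

def pearson(a, b):
    """Pearson correlation between two 1-D vectors."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

rng = np.random.default_rng(0)
n = 47                                              # illustrative region count
gene_corr = np.corrcoef(rng.normal(size=(n, 200)))  # toy co-expression matrix

# A connectome built from the same regional profile correlates strongly...
fc_related = gene_corr + rng.normal(scale=0.05, size=(n, n))
fc_related = (fc_related + fc_related.T) / 2
# ...while an independently generated one does not.
fc_unrelated = np.corrcoef(rng.normal(size=(n, 200)))

r_rel = pearson(upper(gene_corr), upper(fc_related))
r_unrel = pearson(upper(gene_corr), upper(fc_unrelated))
assert r_rel > r_unrel
```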

