NeuroImage: BrainStat, a toolbox for whole-brain statistics and multimodal feature correlation

Brain-computer interface community
2024/07/26 10:40

The analysis and interpretation of neuroimaging datasets has become a multidisciplinary challenge that relies not only on statistical methods but increasingly on associations with other brain-derived features such as gene expression, histological data, and functional and cognitive architectures. To address this challenge, we introduce BrainStat, a toolbox with the following capabilities: (i) univariate and multivariate linear model analysis of volumetric and surface-based brain imaging datasets; and (ii) multi-domain feature association of the results with spatial maps of post-mortem gene expression and histology, task-based fMRI meta-analysis, and resting-state fMRI motifs. BrainStat integrates statistics and feature associations into a turnkey toolbox, simplifying the analysis process and accelerating cross-modality research. The toolbox is implemented in Python and MATLAB, two programming languages that are widely used in the neuroimaging and neuroinformatics communities. BrainStat is publicly available and comes with expandable documentation.

Introduction

Developments in neuroimaging have allowed us to measure the morphology, microstructure, function, and connectivity of the brain in individuals and in large populations. With the continuous advancement of image processing technology, we can integrate these data into standardized reference frames, such as stereotactic voxel spaces (e.g., MNI152 space), surface-based spaces (e.g., fsaverage, MNI152-CIVET surface, or grayordinates space), as well as various parcellation schemes.

By registering neuroimaging data into a common space, we can apply statistical analyses such as mass-univariate generalized linear and mixed-effects models. These models run the same statistical test in parallel at each measurement unit (voxel, vertex, or parcel). However, this type of analysis often requires multiple tools and procedures, which not only reduces workflow reproducibility but also increases the risk of human error. To address these issues, we introduce BrainStat, a unified toolbox that implements these analyses in a transparent and open-source framework.

As advanced analytical workflows for neuroimaging research increasingly rely on datasets across multiple (non-)imaging modalities, the availability of these datasets becomes critical. When these datasets are mapped to the same frame of reference as the neuroimaging measurements, they can be used to contextualize research findings and to help interpret and validate them.

For example, the results of a study can be contextualized within an established model of the brain's functional architecture. In addition, automated meta-analysis tools such as Neurosynth, NiMARE, or BrainMap can be used to perform ad hoc meta-analyses based on thousands of previously published fMRI studies. Meta-analytic decoding correlates a statistical map with a database of brain activation maps, providing a quantitative method for inferring the cognitive processes associated with a spatial statistical pattern. Mapping post-mortem transcriptomic and histological datasets to the neuroimaging space allows us to correlate neuroimaging findings with gene expression and microstructural patterns, which provides information about the molecular and cellular properties of the brain.

Figure 1 BrainStat workflow

BrainStat provides an integrated decoding engine for performing these multimodal feature associations. Our toolbox is implemented in both Python and MATLAB, both of which are very common in the neuroimaging research community. The two implementations are designed to be as consistent with each other as possible, which makes the tool more accessible and lowers the barrier for users without extensive programming experience. BrainStat relies on a simple object-oriented framework to streamline analysis workflows. The toolbox is publicly available on GitHub, and the documentation is available on ReadTheDocs. We divide the toolbox into two main modules: Statistics and Contextualization. In the remainder of this article, we describe in detail how to perform the analysis shown in Figure 1.

2.1 Statistics module

The statistics module is built on top of SurfStat, a classic MATLAB software package for fitting fixed-effects and mixed-effects linear models. In BrainStat, users create and fit these models by providing a subject-by-region response matrix and a predictor model. The predictor model is built with an intuitive model formula framework that makes it simple to define fixed and random effects, whether as variables of interest or as control covariates, which supports both cross-sectional and longitudinal analyses.

For mixed-effects models, BrainStat employs a response-side specification that allows multiple random effects to act as independent effects. Currently, the model is fitted with restricted maximum likelihood (REML) estimation. To test the effect of a variable of interest, such as health status or age, a contrast needs to be specified. BrainStat can process univariate or multivariate response data and provides two commonly used multiple comparison correction methods: false discovery rate and random field theory. The false discovery rate controls the expected proportion of false positives among the findings declared significant, while random field theory corrects for the probability of reporting any false positive at all.
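To make the false discovery rate correction more concrete, the following is a minimal sketch of the Benjamini-Hochberg procedure applied to a vector of p-values (one per vertex). It is a generic NumPy illustration rather than BrainStat's own implementation; the function name `fdr_bh` and the example p-values are hypothetical.

```python
import numpy as np

def fdr_bh(p_values, alpha=0.05):
    """Benjamini-Hochberg adjusted p-values (q-values) and rejection mask."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                          # indices of ascending p-values
    scaled = p[order] * n / np.arange(1, n + 1)    # p_(k) * n / k
    # Enforce monotonicity, starting from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    q = np.empty(n)
    q[order] = np.clip(scaled, 0.0, 1.0)           # back to the original order
    return q, q <= alpha

# Hypothetical vertex-wise p-values.
p_example = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
q_values, significant = fdr_bh(p_example)
```

A vertex is reported as significant when its q-value falls below the chosen level, so the threshold adapts to how many small p-values the map contains.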

To demonstrate the statistics module, we used cortical thickness and demographic data from participants in the Microstructure-Informed Connectomics (MICA-MICs) dataset, 12 of whom were scanned twice (Figure 2A). We created a linear model with age, sex, and their interaction as fixed effects and subject as a random effect (Figure 2B). We defined the contrast as -age, which tests whether cortical thickness decreases with age. The model was fitted to the cortical thickness data using a one-tailed test. We plotted the t-values, the cluster-wise and peak-wise p-values derived from random field theory, and the vertex-wise p-values corrected with the false discovery rate (Figure 2C). We found that age had an effect on cortical thickness that was significant at the cluster level but not at the vertex level, which suggests that the effect of age on cortical thickness is widespread rather than local. We generally recommend using stricter cluster-defining thresholds, especially when working with data with less spatial smoothness.
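The core of such a mass-univariate analysis can be sketched with plain NumPy. The example below is a simplified illustration under stated assumptions: it uses simulated data instead of the MICA-MICs data, fits fixed effects only with ordinary least squares (no random subject effect, no REML), and stops at the t-values without any multiple comparison correction. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_vertices = 80, 500

# Simulated demographics (hypothetical values).
age = rng.uniform(20, 60, n_subjects)
sex = rng.integers(0, 2, n_subjects).astype(float)    # 0/1 coding

# Simulated cortical thickness: every vertex thins slightly with age.
noise = rng.normal(0.0, 0.2, (n_subjects, n_vertices))
thickness = 3.0 - 0.01 * age[:, None] + noise

# Design matrix: intercept + age + sex + age-by-sex interaction.
X = np.column_stack([np.ones(n_subjects), age, sex, age * sex])
contrast = np.array([0.0, -1.0, 0.0, 0.0])             # the "-age" contrast

# Fit all vertices at once with ordinary least squares.
beta, _, _, _ = np.linalg.lstsq(X, thickness, rcond=None)
residuals = thickness - X @ beta
dof = n_subjects - np.linalg.matrix_rank(X)
sigma2 = (residuals ** 2).sum(axis=0) / dof            # residual variance per vertex

# t-statistic of the contrast at every vertex.
contrast_var = contrast @ np.linalg.inv(X.T @ X) @ contrast
t_values = (contrast @ beta) / np.sqrt(sigma2 * contrast_var)
```

Because the contrast is -age, positive t-values indicate thinner cortex at higher age, which is the direction a one-tailed test would examine.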

Figure 2 Example Python code for fitting a fixed-effect general linear model of the effect of age on cortical thickness using BrainStat

The quality and robustness of the model can be assessed at each vertex or region of the cortex. Our quality control function outputs a histogram of the residuals and a Q-Q plot of the residual quantiles against the theoretical quantiles of the normal distribution. We also map the skewness and kurtosis of the residual distribution across the cortex.
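As an illustration of these two residual measures, the sketch below computes skewness and excess kurtosis at each vertex from a subject-by-vertex residual matrix. It is a generic NumPy example with simulated residuals, not BrainStat's quality control function; the name `residual_shape` is hypothetical.

```python
import numpy as np

def residual_shape(residuals):
    """Skewness and excess kurtosis of the residuals at each vertex.

    residuals: array of shape (n_subjects, n_vertices).
    """
    centered = residuals - residuals.mean(axis=0)
    m2 = (centered ** 2).mean(axis=0)      # second central moment
    m3 = (centered ** 3).mean(axis=0)      # third central moment
    m4 = (centered ** 4).mean(axis=0)      # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0   # 0 for a normal distribution
    return skewness, excess_kurtosis

rng = np.random.default_rng(1)
normal_residuals = rng.normal(size=(5000, 3))        # well-behaved model
skewed_residuals = rng.exponential(size=(5000, 3))   # violated normality
skew_normal, kurt_normal = residual_shape(normal_residuals)
skew_skewed, kurt_skewed = residual_shape(skewed_residuals)
```

Values near zero for both measures are consistent with normally distributed residuals; vertices with large absolute values deserve a closer look.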

2.2 Context module

The context module computes bivariate correlations between statistical maps and multimodal neural features. It links statistical maps to the following (Figure 1):

(i) in vivo task-based functional magnetic resonance imaging (fMRI) meta-analyses

(ii) in vivo functional motifs derived from resting-state fMRI

(iii) post-mortem gene expression

(iv) post-mortem histology/cytoarchitecture

The meta-analysis submodule tests the association between a brain map and task-fMRI meta-analyses related to specific terms. The resting-state submodule contextualizes neuroimaging results with functional gradients, a low-dimensional representation of the functional connectome. The transcriptomics submodule extracts gene expression data from the Allen Human Brain Atlas. The histology submodule retrieves cell-body staining intensity profiles from BigBrain, a three-dimensional reconstruction of the cytoarchitecture of a human brain. These submodules support common surface templates and, where possible, custom parcellations. Overall, they provide the basis for an in-depth analysis of statistical results in terms of micro- and macro-scale brain organization.

2.2.1 Meta-analysis decoding

BrainStat's meta-analytic decoding submodule leverages data from Neurosynth and NiMARE to decode statistical maps, with the goal of revealing their associations with cognitive processes. The submodule uses meta-analytic activation maps for a large number of cognitive terms and correlates them with a given statistical map to identify the terms most closely associated with it. This approach reveals indirect links to cognitive terms used across a wide range of published task-based functional neuroimaging studies, without relying on cognitive task data obtained in the same cohort. Indeed, many research teams have used meta-analytic decoding to assess the association of their neuroimaging findings with cognitive processes. Figure 3 illustrates how to retrieve the meta-analytic terms related to the previously calculated t-statistic map.
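The decoding step described above boils down to correlating one statistical map with many term maps and ranking the terms. The sketch below illustrates this with simulated maps; it assumes Pearson correlation and uses a hypothetical `decode` function and made-up term maps, so it should not be read as BrainStat's decoder or as Neurosynth data.

```python
import numpy as np

def decode(stat_map, term_maps):
    """Rank terms by the correlation of their maps with stat_map.

    stat_map: array (n_vertices,)
    term_maps: dict mapping term -> array (n_vertices,)
    """
    scores = {}
    for term, term_map in term_maps.items():
        # Ignore vertices that are undefined in either map (e.g. medial wall).
        valid = np.isfinite(stat_map) & np.isfinite(term_map)
        scores[term] = np.corrcoef(stat_map[valid], term_map[valid])[0, 1]
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

rng = np.random.default_rng(2)
t_map = rng.normal(size=1000)
toy_terms = {
    "memory": t_map + rng.normal(scale=0.5, size=1000),    # strongly related
    "language": rng.normal(size=1000),                      # unrelated
    "motor": -t_map + rng.normal(scale=0.5, size=1000),     # inversely related
}
ranking = decode(t_map, toy_terms)
```

The first entries of `ranking` are the terms whose meta-analytic maps most resemble the statistical map, and the last entries are the terms with the strongest inverse pattern.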

Figure 3 Meta-analysis decoding

2.2.2 Resting-state fMRI

The functional architecture of the brain at rest can be described by a set of continuous dimensions known as functional gradients. Functional gradients capture gradual transitions between brain regions, and new findings can be embedded into the overall functional architecture of the human brain by assessing their point-wise relationships with these gradients. Previous studies have used functional gradients to explore the links between the brain's functional architecture and higher cognitive function, hippocampal connectivity, amyloid-β expression and aging, microstructural organization, evolutionary changes, and disease-related changes.

The functional gradients included in BrainStat are based on the mean functional connectivity matrix of the Human Connectome Project S1200 release, resampled to fsaverage5 space to reduce computational complexity. The gradients were computed with BrainSpace. In Figure 4, we provide example code that calculates the correlation between the first functional gradient and the t-statistic map. The Spearman correlation between the two maps is relatively low (ρ=0.17). To assess the statistical significance of this correlation, however, the spatial autocorrelation in the data must be taken into account. BrainSpace, which BrainStat depends on, provides three methods for this: spin permutation tests, Moran spectral randomization, and variogram matching. These techniques help to ensure that the reported associations are not inflated by spatial autocorrelation.
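To show why a spatially informed null model matters, here is a heavily simplified sketch of a spin permutation test: the vertices of a sphere are randomly rotated, values are reassigned by nearest neighbour, and the Spearman correlation is recomputed to build a null distribution. It uses a toy point cloud on a single sphere and hypothetical function names; BrainSpace's actual implementation differs (for example in how hemispheres are handled), so this is an illustration of the idea only.

```python
import numpy as np

def rank(values):
    """Average ranks, so tied values share the same rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    upper = np.cumsum(counts)                  # highest rank in each tie group
    return (upper - (counts - 1) / 2.0)[inverse]

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return np.corrcoef(rank(x), rank(y))[0, 1]

def random_rotation(rng):
    """Random 3-D rotation matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))                # fix the sign ambiguity of QR
    if np.linalg.det(q) < 0:                   # turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    return q

def spin_test(x, y, sphere_coords, n_perm=200, seed=0):
    """Two-sided p-value for spearman(x, y) under random rotations of x."""
    rng = np.random.default_rng(seed)
    observed = spearman(x, y)
    null = np.empty(n_perm)
    for i in range(n_perm):
        rotated = sphere_coords @ random_rotation(rng).T
        # Each original vertex takes the value of the nearest rotated vertex.
        nearest = np.argmax(sphere_coords @ rotated.T, axis=1)
        null[i] = spearman(x[nearest], y)
    p_value = (np.sum(np.abs(null) >= abs(observed)) + 1) / (n_perm + 1)
    return observed, p_value

# Toy data: two noisy maps that both vary along the z-axis of a unit sphere.
rng = np.random.default_rng(4)
coords = rng.normal(size=(300, 3))
coords /= np.linalg.norm(coords, axis=1, keepdims=True)
gradient = coords[:, 2] + rng.normal(scale=0.3, size=300)
t_map = coords[:, 2] + rng.normal(scale=0.3, size=300)
rho, p_spin = spin_test(gradient, t_map, coords)
```

Because the rotated maps keep the spatial smoothness of the original map, the resulting p-value is typically more conservative than one obtained by shuffling vertices independently.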

Figure 4 Correlation with functional gradients

2.2.3 Gene expression

The Allen Human Brain Atlas is a database containing microarray gene expression data for over 20,000 genes, measured in post-mortem tissue samples from six adult donors. These data can be used to explore the connections between neuroimaging data and molecular factors, providing insights into the mechanisms underlying anatomical and connectomic markers. For example, they can be used to study the relationship between genetic factors and functional connectivity, anatomical connectivity, and changes in connectivity in disease states.

BrainStat's gene decoding submodule uses the abagen toolbox to process these gene expression data. In Python, BrainStat calls abagen directly, so all of its parameters can be modified. In MATLAB, where abagen is not available, BrainStat provides gene expression matrices that were pre-computed with abagen's default parameters for common parcellation schemes. In Figure 5, we show how to obtain gene expression data for a functional parcellation and correlate it with the t-statistic map. The expression data can also be used for further analysis, such as extracting the principal components of gene expression and comparing them to the map.
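Both analyses mentioned above, correlating each gene with a regional map and comparing the map to the principal components of gene expression, can be sketched with NumPy. The example uses a simulated region-by-gene matrix instead of real abagen output, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_regions, n_genes = 100, 50

# Simulated data: expression and t-values share one spatial axis plus noise.
axis = np.linspace(-1, 1, n_regions)
loadings = rng.normal(size=n_genes)
expression = np.outer(axis, loadings) + rng.normal(scale=0.3, size=(n_regions, n_genes))
t_regional = axis + rng.normal(scale=0.3, size=n_regions)

# Pearson correlation of every gene's regional expression with the t-map.
z_expr = (expression - expression.mean(axis=0)) / expression.std(axis=0)
z_t = (t_regional - t_regional.mean()) / t_regional.std()
gene_r = z_expr.T @ z_t / n_regions

# First principal component of expression via SVD of the centered matrix.
centered = expression - expression.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
pc1 = u[:, 0] * s[0]                          # regional scores on PC1
pc1_r = np.corrcoef(pc1, t_regional)[0, 1]    # the sign of a PC is arbitrary
```

Since the sign of a principal component is arbitrary, it is the magnitude of `pc1_r` rather than its sign that carries information here.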

Figure 5 Gene expression

2.2.4 Histology

The BigBrain Atlas is a detailed 3D reconstruction of the human brain, which is based on sectioning and cell body staining techniques and is presented at a high resolution of 20 μm. As the first whole-brain 3D histology dataset, it is very useful for combining neuroimaging findings with histological features. For example, BigBrain can be used to validate microstructural results from MRI, define histology-based regions of interest, or explore the relationship between connectome markers and microstructures.

The histology submodule in BrainStat is designed to simplify the process of combining neuroimaging findings with the BigBrain dataset. It uses intensity profiles sampled at 50 depths across the cortical mantle of BigBrain. The microstructure profile covariance is computed from these profiles with partial correlation, correcting for the mean intensity profile. Then, using BrainSpace's default parameters, the principal axes of cytoarchitectural variation are calculated from the microstructure profile covariance. An example is shown in Figure 6, where we find that the correlation between the first eigenvector and the t-statistic map is ρ=-0.28.
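The partial correlation step can be written out explicitly. The sketch below computes, for every pair of regions, the correlation between their depth-wise intensity profiles while controlling for the mean profile across regions, using the standard first-order partial correlation formula. It runs on simulated profiles, and the function name is hypothetical; it is not BrainStat's code.

```python
import numpy as np

def microstructure_profile_covariance(profiles):
    """Partial correlation between regional intensity profiles.

    profiles: array (n_depths, n_regions). The mean profile across regions
    is the variable that is controlled for.
    """
    mean_profile = profiles.mean(axis=1)
    r = np.corrcoef(profiles.T)                               # region-by-region
    r_mean = np.array([np.corrcoef(p, mean_profile)[0, 1] for p in profiles.T])
    numerator = r - np.outer(r_mean, r_mean)
    denominator = np.sqrt(np.outer(1 - r_mean ** 2, 1 - r_mean ** 2))
    return numerator / denominator

# Simulated profiles: a shared depth trend plus region-specific noise.
rng = np.random.default_rng(5)
depth_trend = np.linspace(0.0, 1.0, 50)[:, None]              # 50 depths
toy_profiles = depth_trend + rng.normal(scale=0.2, size=(50, 20))
mpc = microstructure_profile_covariance(toy_profiles)
```

Removing the mean profile prevents the depth trend shared by all regions from dominating the similarity matrix, so that the remaining covariance reflects regional differences.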

Figure 6 Association with histological markers

2.2.5 Runtime Assessment

Although the MATLAB and Python implementations are currently single-threaded, BrainStat relies heavily on matrix multiplication, which allows for fast computation even when analyzing larger datasets. To evaluate computation time, we ran a series of experiments on the test data and on simulated datasets that were expanded with bootstrap resampling (Table 1).

Table 1 Running time evaluation

Discussion

BrainStat is a neuroimaging data analysis tool that combines the capabilities of statistical analysis and contextual analysis. It is designed to simplify the analysis process, making it easier for researchers to perform complex neuroimaging analyses while improving the reproducibility and reliability of their studies.

When it comes to statistical analysis, BrainStat provides a flexible framework for building and testing multivariate linear models, which are essential for understanding brain structure and function. It also includes quality control features to help validate the model's assumptions and the accuracy of the fit.

For contextual analysis, BrainStat leverages external datasets such as the high-resolution BigBrain Atlas, Neurosynth's task fMRI study database, and gene expression information from the Allen Human Brain Atlas to enrich and interpret MRI results. These capabilities allow BrainStat to not only analyze neuroimaging data, but also to consider it in a broader biological and cognitive context.

BrainStat's open access and modular design make it easy to use and encourage community contributions that continuously extend its functionality and scope. It is implemented in both Python and MATLAB, which means that it can be adapted to the technical background and needs of different users.

Overall, BrainStat is a powerful toolbox that simplifies multiple analytical approaches to neuroimaging with a unified framework that enables researchers to gain a deeper understanding of the brain's complexity. Its goal is to lower the technical barrier, reduce human error, and accelerate the progress of cross-modal research in the field of neuroimaging.

Impressions after reading

This article focuses on BrainStat as a toolbox that simplifies multiple analysis methods commonly used in neuroimaging through a unified framework, which greatly lowers the technical barrier to entry, reduces human error, and improves the reproducibility and reliability of research. This tool leverages many external data sets for joint analysis, allowing researchers to study, enrich, and interpret MRI results.

Text: Xiao Mengru;

Typesetting|Shi Qiji
