Perspective Article - Imaging in Medicine (2011) Volume 3, Issue 4

Neurological imaging: statistics behind the pictures

Ivo D Dinov*

SOCR Resource and Laboratory of Neuro Imaging, UCLA Statistics, 8125 Mathematical Science Bldg, Los Angeles, CA 90095, USA

Corresponding Author:
Ivo D Dinov
SOCR Resource and Laboratory of Neuro Imaging
UCLA Statistics, 8125 Mathematical Science Bldg
Los Angeles, CA 90095, USA
Tel: +1 310 825 8430
Fax: +1 310 206 5658
E-mail: dinov@stat.ucla.edu

Abstract

Neurological imaging represents a powerful paradigm for investigation of brain structure, physiology and function across different scales. The diverse phenotypes and significant normal and pathological brain variability demand reliable and efficient statistical methodologies to model, analyze and interpret raw neurological images and derived geometric information from these images. The validity, reproducibility and power of any statistical brain map require appropriate inference on large cohorts, significant community validation, and multidisciplinary collaborations between physicians, engineers and statisticians.

Keywords

brain mapping ▪ imaging modalities ▪ neuroimaging ▪ statistics

Scientific challenges

The clinical importance, structural fragility and organizational complexity of the brain require unique skills, powerful technologies and large amounts of data to study its intricate anatomical structure, functional connectivity, metabolic activity and physiology. Neurological imaging, or neuroimaging, along with modern quantitative and visualization techniques, enables diverse means for untangling the secrets of the normal and pathological brain from development through normal aging. There is a broad spectrum of neuroimaging modalities, a significant presence of intrinsic and extrinsic noise, and extensive intra- and inter-subject variability. This explains why many neuroimaging biomarkers may have only marginal power to detect different brain phenotypes. These challenges demand reliable and efficient computational statistics methods for synthesizing, analyzing, modeling and interpreting the vast amounts of neuroimaging data [1]. Indeed, some techniques and computational methods are more susceptible to pathological, morphological and time-dependent variation. For instance, some volume-based structural MRI [2,3], tensor-based [4] and functional imaging [5,6] approaches are sensitive for detecting, monitoring and tracking dementia-driven brain changes from mild cognitive impairment to Alzheimer’s disease.

Statistical methodologies

Many complementary types of statistical techniques exist to cope with the gamut of specific neuroimaging challenges arising from multiple imaging scales, normal imaging variability, high-dimensional data, varying study designs and different a priori assumptions. These include parametric and nonparametric statistical tests [7,8], linear and nonlinear models [9], dimensionality reduction techniques [10], bootstrapping and resampling methods [11,12], and survival analyses [13], among others. The choice of an appropriate and sufficiently powerful statistical technique is paramount in any neuroimaging study, as both false-positive (type I) and false-negative (type II) errors are not only likely, but inevitable [14]. The most common approach to communicating neuroimaging statistical results involves statistical mapping, using diverse arrays of color maps to depict phenotypic effects, correlations, associations, peak outcomes, and morphometric or physiological measurements beyond normally expected noise levels. Table 1 illustrates some examples of common color maps frequently used in structural, functional, diffusion, spectroscopic and tomographic neuroimaging. These color maps may lead to misunderstandings because the range of intensity values mapped onto the RGB colors can be linearly or nonlinearly transformed by researchers and may vary significantly between different scientific reports.
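To illustrate the point about transformed intensity ranges, the following minimal Python sketch maps one and the same statistic onto a toy blue-to-red color map, once with a linear and once with a logarithmic normalization. The function names, the two-color map, the voxel value and the display range are all hypothetical choices made for illustration; they are not taken from Table 1 or from any particular neuroimaging package.

```python
import math


def linear_norm(value, vmin, vmax):
    """Map a value in [vmin, vmax] linearly onto [0, 1]."""
    return (value - vmin) / (vmax - vmin)


def log_norm(value, vmin, vmax):
    """Map a value in [vmin, vmax] (with vmin > 0) logarithmically onto [0, 1]."""
    return (math.log10(value) - math.log10(vmin)) / (
        math.log10(vmax) - math.log10(vmin)
    )


def blue_to_red(fraction):
    """Toy two-color map: 0 -> pure blue, 1 -> pure red, as 8-bit RGB."""
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the displayable range
    red = round(255 * fraction)
    return (red, 0, 255 - red)


# Hypothetical voxel statistic and display range (assumptions, not real data).
voxel_value = 10.0
vmin, vmax = 1.0, 100.0

linear_rgb = blue_to_red(linear_norm(voxel_value, vmin, vmax))
log_rgb = blue_to_red(log_norm(voxel_value, vmin, vmax))

print("linear scaling:", linear_rgb)
print("log scaling:   ", log_rgb)
```

Under the linear scaling the voxel is rendered almost pure blue, whereas under the logarithmic scaling it sits near the middle of the color map, although the underlying value is identical. A reader comparing two published maps therefore needs to know both the intensity range and the transformation behind the colors.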

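[Table 1: examples of common color maps used in structural, functional, diffusion, spectroscopic and tomographic neuroimaging]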

Validity & reproducibility

Nowadays there are many large and publicly accessible databases [15–18] providing storage, management and retrieval of raw and derived neuroimaging data on a large scale (hundreds and thousands of subjects). This greatly facilitates the processes of algorithm development, mathematical modeling and testing of novel computational techniques for analyzing multimodal neuroimaging data. For example, the recent efforts on the human [101] and mouse [102] connectome projects employ diverse MRI protocols and multiparametric approaches to study the structural and functional aspects of brain connectivity [19,20]. Many new and innovative approaches fusing imaging, phenotypic and clinical data are proposed and tested to identify associations, trends and patterns characterizing intricate relations between developmental, cognitive and psychiatric traits and various functional anatomical biomarkers. Validation and reproducibility of the enormous number of new techniques, models, results and findings remain challenging because of the lack of exact data and protocol provenance, significant intrinsic and extrinsic variability within and between different cohorts (even within the same population), and model limitations of the available computational techniques [21,22].

Figure 1 & Table 2 show examples of common neuroimaging modalities, typical statistical maps, applications and imaging resolutions. Space and time resolutions refer to the most common ranges for world-space scaling (space) and possible temporal frequency (time) of image acquisitions for each specific imaging modality. Validating and reproducing different neuroimaging analyses and statistical maps is often difficult because of a number of intrinsic and extrinsic factors. Examples of intrinsic factors include the significant intra- and inter-subject variability, the presence of noise in the imaging data, and variations in study designs, sample sizes and sampling protocols. Extrinsic factors impacting the validation of neuroimaging results include the large number of available mapping techniques, statistical methodologies and computational tools used in the processing of neuroimaging data.


Figure 1: A summary of the most common neurological imaging protocols, their characteristics, applications and examples of computational statistical mapping. See also Table 2. Cho: Choline; Cr: Creatine; Glx: Glutamine; mI: Myo-inositol; NAA: N-acetylaspartate.

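[Table 2: common neuroimaging modalities, typical statistical maps, applications and imaging resolutions]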

Challenges

■ Analysis of imaging, genetics & phenotypic data

The analysis of imaging and nonimaging data is rapidly becoming an important component of most modern neuroimaging studies. Nowadays, many neuroimaging studies include heterogeneous data from hundreds of subjects including multimodal imaging, multiple clinical measurements and diverse subject demographics. In fact, some studies include large genetics datasets (e.g., single nucleotide polymorphisms [SNPs], partial or complete genome mapping, gene-expression). The integration of quantitative and qualitative imaging, phenotypic and genomic data becomes challenging because different types of data are expressed in noncongruent bases and represent correlated (dependent) or orthogonal (independent) dimensions. Yet, the potential for significant health benefits provides strong incentives to design, validate and productize novel computational modeling and statistical analysis techniques that enable efficient, robust and powerful holistic analyses of multimodal neuroimaging data, clinical measurements, phenotypic records and genetic data. Some recent studies are making headway in analyzing such multiform data. Examples include the use of Alzheimer’s Disease Neuroimaging Initiative data [23,24,103] to investigate the relationship between genetic variation and imaging biomarkers via genome-wide association and shape analyses, and a study of schizophrenia using imaging, cognition, genetics and pharmacotherapeutic data [25].

■ Spatial versus geometric modeling

Traditional statistical mapping of neurological images focuses on spatial characterization of anatomical features in 2D, 3D or 4D images. Examples of such spatial neuroimaging modeling include structural analysis [26,27], voxel-based morphometry [28], statistical parametric mapping [29,30] and network analyses [31–33]. Most of these analytic techniques utilize univariate intensity-based measures of brain anatomy or functional activation directly obtained from the tomographically or stereotactically acquired imaging data. New complementary approaches that extract, model and analyze geometric data derived from the raw neuroimaging data are increasingly becoming an integral component of many contemporary neuroimaging studies. Such geometric modeling techniques include shape analyses [34,35], tensor modeling and analyses [36,37], as well as tractography and white matter integrity [38,39]. These geometric techniques rely on sophisticated mathematical models to represent static or dynamic features of brain structure and function as multidimensional curved manifolds (spaces locally homeomorphic to Euclidean spaces of the same dimension with no curvature), higher-order generalizations of scalars, vectors and matrices (tensors), and topologically equivalent canonical spaces [40–42]. Figure 2 demonstrates an example of a canonical brain reference (atlas), the International Consortium for Brain Mapping Brain Atlas [43], where the entire brain is parcellated into disjoint and complementary regions of interest. The volume, geometric properties (e.g., regional surface complexity) and the inter-regional affinities (e.g., relative position or size) of this partition are all important characteristics of anatomical brain integrity. These shape and manifold-based measures can be computed for a large and diverse pool of subjects and then can be compared individually, or as a (sub)group, to an atlas, compared to other cohorts, or used as imaging markers to study the associations between neuroimaging predictors, clinical measurements and subject phenotypes.


Figure 2: (A) Coronal, (B) sagittal, (C) axial and (D) 3D cortical surface views of the International Consortium for Brain Mapping Brain atlas. Geometric models of global and local brain structure provide mechanisms for classifying shape form and size of different regions of interest by measuring various quantitative characteristics, such as shape area, fractal dimension and curvedness, etc. [16].

■ Statistical inference

Statistical power is a quantitative measure of the probability that a computational inference method detects a real effect when one is present, that is, one minus the probability of a false-negative (type II) decision. Power estimates for many neuroimaging studies require knowledge of the approximate effect size being studied, the sample size, the exact statistical model employed in the analysis, estimates of the expected normal data variability, and the investigator-defined false-positive (type I) error rate. Power analysis, sample-size calculations (e.g., numbers needed to treat), and calculations of the minimum effect size can all be used interchangeably, depending on whether the investigator is able to specify a realistic sample size for the experiment, the desired power of the study, or the underlying effect size of interest. Power analyses in brain imaging studies are challenging as they require separate analyses for many brain regions of interest, each of which has a different effect size and variability.
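The relationship between effect size, sample size, type I error rate and power can be sketched for a single region of interest as follows. This is a minimal illustration that assumes a two-sided, two-sample z-test with equal group sizes, known common variance and a standardized effect size; real neuroimaging power analyses would use the actual study model. The function names and the example effect size are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample z-test (normal approximation).

    effect_size is the standardized mean difference (difference / SD).
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def required_n_per_group(effect_size, power=0.80, alpha=0.05):
    """Smallest equal group size reaching the requested power."""
    n = 2
    while two_sample_power(effect_size, n, alpha) < power:
        n += 1
    return n


# Hypothetical standardized effect of 0.5 in one region of interest.
print(round(two_sample_power(0.5, 20), 3))
print(required_n_per_group(0.5))
```

Fixing any two of sample size, power and effect size determines the third, which is why the three calculations are interchangeable. Under these assumptions, a standardized effect of 0.5 needs 63 subjects per group to reach 80% power at a 5% type I error rate, and the calculation would have to be repeated for each region with its own effect size and variability.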

All computational and statistical inference methods require some a priori assumptions. These typically concern the generation of the observed data and specifications of model probability distributions. Examples of such a priori conditions include parametric, nonparametric and semiparametric assumptions. Parametric assumptions require that the data probability distributions can be described by a specific family of distributions (e.g., Poisson, Exponential, General Normal, or Gaussian distributions) involving only a finite number of unknown parameters. Nonparametric assumptions indicate that the data-generating process obeys some more relaxed properties (e.g., the distribution has a well-defined median). Semiparametric assumptions represent an intermediate type of condition; for instance, the data distribution may be required to have a well-defined mean, range or shape while two or more variables are also required to follow a specific linear model relationship. Parametric neuroimaging statistics are applicable for detecting mean differences and are appropriate for identifying between-group (spatial) or within-subject (temporal) differences when the underlying research hypotheses are directly related to central tendency. On the other hand, nonparametric approaches, typically based on data rank-orders, are applicable for studies where the distributions of the parameters of interest are skewed, have heavy tails, exhibit noncontiguous support or are otherwise nonregular [44].
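The practical difference between a mean-based parametric statistic and a rank-based nonparametric one can be sketched with a small example. The code below compares a Welch-type t statistic with a Mann-Whitney rank-sum test (normal approximation, assuming no tied values) on two made-up groups of regional measurements, before and after one extreme value is introduced. All data values and function names are hypothetical and chosen only to illustrate the behavior under a heavy tail.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def welch_t(a, b):
    """Parametric statistic: difference of means in standard-error units."""
    standard_error = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / standard_error


def mann_whitney_p(a, b):
    """Two-sided rank-sum p-value (normal approximation, assumes no ties)."""
    n_a, n_b = len(a), len(b)
    # U counts the pairs in which the value from a exceeds the value from b.
    u = sum(1 for x in a for y in b if x > y)
    expected_u = n_a * n_b / 2
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - expected_u) / sd_u
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical regional volumes (arbitrary units) for two groups.
controls = [5.1, 5.3, 5.2, 5.6, 5.4, 5.8, 5.5, 5.7]
patients = [4.1, 4.3, 4.2, 4.6, 4.4, 4.8, 4.5, 4.7]
# Same patients, but one measurement replaced by an extreme outlier.
patients_outlier = [4.1, 4.3, 4.2, 4.6, 4.4, 40.0, 4.5, 4.7]

for label, group in [("clean", patients), ("outlier", patients_outlier)]:
    print(label, round(welch_t(controls, group), 2),
          round(mann_whitney_p(controls, group), 4))
```

With the clean data both approaches indicate a clear group difference. After a single extreme value is introduced, the mean-based statistic collapses toward zero because the outlier inflates both the mean and the variance, while the rank-based p-value remains below 0.05 since ranks are only mildly affected.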

Multiple comparison problems in neuroimaging studies may occur when investigators conduct a large set of statistical inferences simultaneously, which may lead to inference errors (e.g., intervals that fail to include their corresponding population parameters or hypothesis tests with underestimated false-positive error). Although several alternative solutions to the multiple comparisons problem exist (e.g., Bonferroni correction and other procedures controlling the family-wise error rate, or procedures controlling the false discovery rate), these may either be too conservative or insufficiently corrective [45]. Most brain imaging studies demand multiple comparison corrections, although such post hoc analyses need not be performed on the entire brain. They can instead be localized, using regional masks, to specific (smaller) brain regions identified by previous studies, which increases the power to detect phenotypic and genetic effects on brain structure and function.
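The trade-off between conservative and more permissive corrections can be sketched with two standard procedures: the Bonferroni correction, which controls the family-wise error rate, and the Benjamini-Hochberg step-up procedure, which controls the false discovery rate (assuming independent or positively dependent tests). The p-values below are invented for illustration and the function names are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject test i when its p-value is at most alpha / (number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for false discovery rate control."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * alpha / m:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    rejected = [False] * m
    for index in order[:cutoff_rank]:
        rejected[index] = True
    return rejected


# Hypothetical p-values from six regional tests.
p_values = [0.20, 0.001, 0.041, 0.012, 0.74, 0.008]

print(bonferroni(p_values))          # 2 rejections
print(benjamini_hochberg(p_values))  # 3 rejections
```

On these six values the Bonferroni threshold of 0.05/6 rejects two hypotheses, while the Benjamini-Hochberg procedure rejects three. Restricting the analysis to a regional mask reduces the number of tests, which relaxes both thresholds and is the source of the power gain described above.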

■ Future perspective

Computational and statistical modeling, characterization and inference in future neuroimaging studies are likely to rely on significantly increased volume and heterogeneity of multimodal imaging data across different scales, complex subject phenotypes, integrated individual subject and reference human genomics data, advanced computational infrastructure, as well as powerful new technologies for the management, processing and visualization of these intricate data. Examples of powerful new multimodal imaging protocols include simultaneous PET-CT scanning used in clinical imaging [46,47], joint PET-MRI combining the high spatial resolution and excellent morphologic discrimination of MRI with the exquisite sensitivity of nuclear imaging in both preclinical and clinical settings [48,49], combined fluorescence molecular tomography, near-infrared imaging, CT and MRI [50,51], as well as variants of integrated x-ray, nuclear imaging and optical imaging in an all-in-one tomographic scanner [52–54].

The reproducibility and validity of new findings may be increased if the neuroimaging community embraces open, collaborative and distributed mechanisms for sharing data, disseminating exact data analysis protocols, incorporating modern Grid and Cloud computing infrastructures, and supports the engagement of multidisciplinary investigators in such translational studies. The following activities and resources may be critical for the successful translational application of modern neuroimaging techniques in the near future – open and collaborative communication between multiple disciplines, sharing of imaging data and metadata, as well as wide distribution of methods, software tools, web services, computational infrastructure and detailed analysis protocols.

Financial & competing interests disclosure

This work was supported in part by National Institutes of Health grants U54 RR021813, P41 RR013642, and U24-RR025736, and National Science Foundation grants 0716055 and 1023115. The author has no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

No writing assistance was utilized in the production of this manuscript.


References