Journal of agricultural, biological and environmental statistics / Editor-in-chief, Brian J. Reich.

Material type: Continuing resource
Series: ; V.26, No.2
Publication details: New York, NY : Springer Science+Business Media LLC, June 2021.
Description: 131-327 pages ; 26 cm
ISSN: 1085-7117

Online resources:

Supplementary Materials

Summary:

1. A Statistical Perspective on the Challenges in Molecular Microbial Biology. Pratheepa JEGANATHAN and Susan P. HOLMES. High throughput sequencing (HTS)-based technology enables identifying and quantifying non-culturable microbial organisms in all environments. Microbial sequences have enhanced our understanding of the human microbiome, the soil and plant environment, and the marine environment. All molecular microbial data pose statistical challenges due to contamination sequences from reagents, batch effects, unequal sampling, and undetected taxa. Technical biases and heteroscedasticity have the strongest effects, but different strains across subjects and environments also make direct differential abundance testing unwieldy. We provide an introduction to a few statistical tools that can overcome some of these difficulties and demonstrate those tools on an example. We show how standard statistical methods, such as simple hierarchical mixture and topic models, can facilitate inferences on latent microbial communities. We also review some nonparametric Bayesian approaches that combine visualization and uncertainty quantification. The intersection of molecular microbial biology and statistics is an exciting new venue. Finally, we list some of the important open problems that would benefit from more careful statistical method development. Supplementary materials accompanying this paper appear online. Key Words: Microbial ecology; Bayesian data analysis; Hierarchical mixture models; Latent Dirichlet allocation; Bayesian nonparametric ordination; Sequencing data; Quality control.

2. Testing Independence Between Two Spatial Random Fields. Shih-Hao HUANG, Hsin-Cheng HUANG, Ruey S. TSAY, and Guangming PAN. In this article, we consider testing independence between two spatial Gaussian random fields evaluated, respectively, at p and q locations with sample size n, where both p and q are allowed to be larger than n. We impose no spatial stationarity and no parametric structure for the two random fields. Our approach is based on canonical correlation analysis (CCA). But instead of applying CCA directly to the two random fields, which is not feasible for the high-dimensional testing considered, we adopt a dimension-reduction approach using a special class of multiresolution spline basis functions. These functions are ordered in terms of their degrees of smoothness. By projecting the data to the function space spanned by a few leading basis functions, the spatial variation of the data can be effectively preserved. The test statistic is constructed from the first sample canonical correlation coefficient in the projected space and is shown to have an asymptotic Tracy-Widom distribution under the null hypothesis. Our proposed method automatically detects the signal between the two random fields and is designed to handle irregularly spaced data directly. In addition, we show that our test is consistent under mild conditions and provide three simulation experiments to demonstrate its power. Moreover, we apply our method to investigate whether the precipitation in continental East Africa is related to the sea surface temperature (SST) in the Indian Ocean and whether the precipitation in west Australia is related to the SST in the North Atlantic Ocean. Key Words: Canonical correlation analysis; Dimension reduction; High-dimensional test; Irregularly spaced data; Multiresolution spline basis functions; Teleconnection; Tracy-Widom distribution.

3. Continuous-Time Discrete-State Modeling for Deep Whale Dives. Joshua HEWITT, Robert S. SCHICK, and Alan E. GELFAND. Understanding unexposed/baseline behavior of marine mammals is required to assess the effects of increasing levels of anthropogenic noise exposure in the marine environment. However, quantifying variation in the baseline behavior of whales is challenging because they spend much of their time at depth, and therefore their diving behavior is not directly observable. Data collection employs tags as measurement devices to record vertical movement. We focus here on satellite tags, which have the advantage of collection over a time window of weeks. The type of data we analyze here suffers the disadvantage of being in the form of depths attached to an arbitrarily created set of depth bins and being sparse in time. We provide a multi-stage generative model for deep dives using a continuous-time discrete-space Markov chain. Then, we build a likelihood, incorporating dive-specific random effects, in order to fit this model to a set of satellite tag records, each consisting of a temporally misaligned collection of deep dives with sparse binned depths for each dive. Through simulation, we demonstrate the ability to recover true model parameters. With real satellite tag records, we validate the model out of sample and also provide inference regarding stage behavior, inter-tag record behavior, dive duration, and maximum dive depth. Supplementary materials accompanying this paper appear online. Key Words: Hierarchical model; Markov chain Monte Carlo; Markov process; Misalignment; Model validation; Satellite tags.

4. Correcting Bias in Survival Probabilities for Partially Monitored Populations via Integrated Models. Blanca SARZO, Ruth KING, David CONESA, and Jonas HENTATI-SUNDBERG. We provide an integrated capture-recapture-recovery framework for partially monitored populations. In these studies, live resightings are only observable at a set of monitored locations, so that if an individual leaves these specific locations, they become unavailable for capture. Additional ring-recovery data reduce the corresponding bias obtained in the survival probability estimates from capture-recapture data due to the confounding with colony dispersal. We derive an explicit efficient likelihood expression for the integrated capture-recapture-recovery data, and state the associated sufficient statistics. We demonstrate the significant improvements in the estimation of the survival probabilities using the integrated approach for a colony of guillemots (Uria aalge), where we additionally specify a hierarchical approach to deal with low sample size over the early period of the study. Supplementary materials accompanying this paper appear online. Key Words: Bias; Capture-recapture-recovery data; Hierarchical model; Partial monitoring.

5. A Sample Covariance-Based Approach for Spatial Binary Data. Sahar ZARMEHRI, Ephraim M. HANKS, and Lin LIN. The field of landscape genetics enables the study of infectious disease dynamics by connecting the landscape features with evolutionary changes. Quantifying genetic correlation across space is helpful in providing insight into the rate of spread of an infectious disease. We investigate two genetic patterns in spatially referenced single-nucleotide polymorphisms (SNPs): isolation by distance and isolation by resistance. We model the data using a Generalized Linear Mixed effect Model (GLMM) with spatially referenced random effects and provide a novel approach for estimating parameters in spatial GLMMs. In this approach, we use the links between binary probit models and bivariate normal probabilities to directly compute the model-based covariance function for spatial binary data. Parameter estimation is based on minimizing the sum of squared distances between the elements of the sample covariance and model-based covariance matrices. We analyze data including Brucella abortus SNPs from spatially referenced hosts in the Greater Yellowstone Ecosystem. Key Words: Spatial statistics; Ecology; Landscape genetics.

6. Combining Environmental Area Frame Surveys of a Finite Population. Wilmer PRENTIUS, Xin ZHAO, and Anton GRAFSTRÖM. New ways to combine data from multiple environmental area frame surveys of a finite population are introduced. Environmental surveys often sample finite populations through area frames. However, to combine multiple surveys without risking bias, design components (inclusion probabilities, etc.) are needed at unit level of the finite population. We show how to derive the design components and exemplify this for three commonly used area frame sampling designs. We show how to produce an unbiased estimator using data from multiple surveys, and how to reduce the risk of introducing significant bias in linear combinations of estimators from multiple surveys. If separate estimators and variance estimators are used in linear combinations, there is a risk of introducing negative bias. By using pooled variance estimators, the bias of a linear combination estimator can be reduced. National environmental surveys often provide good estimators at national level, while being too sparse to provide sufficiently good estimators for some domains. With the proposed methods, one can plan extra sampling efforts for such domains, without discarding readily available information from the aggregate/national survey. Through simulation, we show that the proposed methods are either unbiased, or yield low variance with small bias, compared to traditionally used methods. Key Words: Combining data sources; Combining estimators; Environmental monitoring; Linear combination estimator; Sample design properties.

7. Optimizing the Allocation of Trials to Sub-regions in Multi-environment Crop Variety Testing. Maryna PRUS and Hans-Peter PIEPHO. New crop varieties are extensively tested in multi-environment trials in order to obtain a solid empirical basis for recommendations to farmers. When the target population of environments is large and heterogeneous, a division into sub-regions is often advantageous. When designing such trials, the question arises how to allocate trials to the different sub-regions. We consider a solution to this problem assuming a linear mixed model. We propose an analytical approach for computation of optimal designs for best linear unbiased prediction of genotype effects and their pairwise linear contrasts and illustrate


Holdings
Item type	Current library	Call number	Status	Barcode
Continuing Resources	PSAU OLM Periodicals	JO JABE JE2021	Available	JO135

Correction to: Variance Propagation for Density Surface Models. Mark V. Bravington, David L. Miller, and Sharon L. Hedley. Correction to: JABES https://doi.org/10.1007/s132153-021-00438-2.

