--- title: 'sizeMat: An R Package to Estimate Size at Sexual Maturity' author: 'Josymar Torrejon-Magallanes' date: "`r Sys.Date()`" output: html_document: highlight: default toc: yes vignette: > %\VignetteIndexEntry{sizeMat} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo = FALSE, message = FALSE} library(sizeMat) ``` This vignette introduces you to `sizeMat` package and provide a way to estimate **Size at Morphometric and Gonad Maturity**. ### Install package ```{r, eval=FALSE} install.packages("sizeMat") ``` Or the development version from github: ```{r, eval=FALSE} # install.packages("devtools") devtools::install_github("ejosymart/sizeMat") ``` ### 1. Size at morphometric maturity For estimating morphometric maturity use the *crabdata* base. The data set containing allometric measures and other attributes (year, month, sex category) of 223 crabs of the specie *Chionectes tanneri*. ### Load data ```{r echo=TRUE} data(crabdata) head(crabdata) names(crabdata) ``` The estimation of morphometric maturity involves two processes: First the *data classification*, which is referred to the classification of the individuals in two groups. The second is the *estimation*. This process uses the previous classification to estimate the size at morphometric maturity. The details are given below. ## Classify data (print and plot) The `classify_mature` function, classify the individuals in two groups (juveniles = 0 and adult = 1). The classification analysis is based on Principal Components Analysis with two allometric variables (x: independent variable, y: dependent variable) in log base, allowing to distinguish two groups that would represent juveniles and adult. The individuals are assigned to each group using a hierarchical classification procedure (hierarchical cluster with agglomeration method: "Ward.D" and the distance measure: "euclidean"). This method is based on establishing a predetermined number of groups (in this case, two) and assigning individuals to one of the groups according to their loads on the two axes of the PCA (Corgos and Freire, 2006). Using the results of the classification (PCA + cluster), a discriminant analysis (linear or quadratic) is conducted to obtain a discriminating function that permitted any individuals to be classified as a juvenile or an adult on the basis of the X and Y variables. The `classify_mature` function requires a data.frame (e.g. crabdata) with allometric variables and sex category. The argument `varNames` requires the name of two allometric variables only, and `varSex` requires the name of the variable containing sex information. If the argument `selecSex` is `NULL` all the individuals will be used in the classification analysis. Finally the `method` is focus in the discriminant analysis to be used ("ld": linear discriminant analysis, "qd": quadratic discriminant analysis). We recommend begin the analysis with the `method = "ld"`. The `classify_mature` function returns an object of class "classify", with the allometric variables "x" (independent) - "y"" (dependent), and classification of maturity (juveniles = 0, adult = 1). ```{r, echo = TRUE} #For all the individuals classify_data = classify_mature(crabdata, varNames = c("carapace_width", "chela_height"), varSex = "sex_category", selectSex = NULL, method = "ld") #For males only classify_data_males = classify_mature(crabdata, varNames = c("carapace_width", "chela_height"), varSex = "sex_category", selectSex = "m", method = "ld") ``` Print the results of the "classify_data" object. It shows the number of juveniles and adults after classification and the linear regression analysis for juveniles and adults. ```{r, echo = TRUE} print(classify_data) ``` The example shows the arguments that can be used in the plot for "classify_data" object: ```{r, echo = TRUE, fig.width = 10, fig.height = 10, results='hide', warning=FALSE} par(mfrow = c(2,2)) plot(classify_data) plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)", legendPlot = FALSE) legend("topleft", "Put your legend here", bty = "n") plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)", col = c(2, 3), pch = c(5, 6), legendPlot = TRUE) plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)", col = c(2, 3), pch = c(5, 6), lty_lines = c(1, 2), lwd_lines = c(1, 3), cex = c(1, 3), main = "Classification") ``` ## Estimate size at morphometric maturity (maturity ogive estimation) The `morph_mature` function use the logit approach (frequentist or bayesian). The size at 50% maturity ($L_{50}$) was estimated as the length at which a randomly chosen specimen has a 50% chance of being mature (Somerton 1980, Roa et al. 1999). In the regression analysis, $X$ (e.g: carapace width) is considered the explanatory variable and the classification of maturity $CS$ (juveniles: 0, adults: 1) is considered the response variable (binomial). The variables are fitted to a logit function with the form: $$P_{CS} = \frac{1}{1+e^{-(\hat{\beta}_{0} + \hat{\beta}_{1}*X)}}$$ where $P_{CS}$ is the probability of an individual of being mature at a determinate $X$ length. $\hat{\beta}_{0}$ (intercept) and $\hat{\beta}_{1}$ (slope) are parameters estimated. The $L_{50}$ is calculated as: $$L_{50} = -\frac{\hat{\beta}_{0}}{\hat{\beta}_{1}}$$ The `morph_mature` function requires an object of class "classify" with the X, Y (allometric variables) and classification of maturity (juveniles = 0, adults = 1). The argument `method` requires a character string indicating which regression will be used for the test. If `method = "fq"` the logit regression is based on GLM (frequentist) and if `method = "bayes"` a sample from the posterior distribution of a logit regression model using a random walk Metropolis algorithm is generated (see MCMClogit function). The argument `niter` requires a number. For the GLM regression (`method = "fq"`), a non-parametric bootstrap method consists in generate B bootstrap samples, by resampling with replacement the original data. Then all statistics for each parameter can be calculated from each bootstrap sample (median and confidence intervals). For the `method = "bayes"`, the argument `niter` is related to the number of Metropolis iterations for the sampler. The output is an object of class "morphMat". This object contains a dataframe with the allometric variables X - Y and the classification of maturity. Also the fitted values for the logit regression and confidence intervals (95\%). If you print the object, the median of the size at morphometric maturity estimation ($L_{50}$) and parameters are shown. ```{r echo = TRUE} #Frequentist regression my_ogive_fq = morph_mature(classify_data, method = "fq", niter = 1000) print(my_ogive_fq) #Bayesian regression my_ogive_bayes = morph_mature(classify_data, method = "bayes", niter = 1000) print(my_ogive_bayes) ``` ## Plot maturity ogive For plotting the maturity ogive, an object of class "morphMat" is required. The function `plot` generates 4 graphics: 1), 2) and 3) are histograms for the A, B parameters and the size at morphometric maturity ($L_{50}$), the last is the maturity ogive. ```{r echo = TRUE, fig.width = 10, fig.height = 10, warning=FALSE} par(mfrow = c(2,2)) plot(my_ogive_fq, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red")) par(mfrow = c(2,2)) plot(my_ogive_bayes, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red")) ``` If you want the maturity ogive plot only, you have to add the param `onlyOgive = TRUE`. Besides you can modify the axis (size, rotation, etc) and add the legend. ```{r echo = TRUE, fig.width = 10, fig.height = 10, warning=FALSE} plot(my_ogive_fq, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red"), onlyOgive = TRUE) ``` This methodology has been used mainly in the estimation of morphological sexual maturity in crabs, but it can be extended to other taxas as Agostinho (2000) reported. ### 2. Size at gonad maturity For estimating gonadal maturity use the *matFish* database. This database contains two variables: * `total_length`: Total length in cm. * `stage_mat`: The gonadal maturation stages: I, II, III, IV, where I is considered immature. ## Load data ```{r echo=TRUE} data(matFish) head(matFish) ``` The function to be used to estimate gonadal maturity is `gonad_mature`. This function use the logistic approach. The `gonad_mature` function requires a data.frame with allometric variables (e.g: total length, fork length, carapace width, etc) and a variable containing the stages of sexual maturity (gonadal maturation stages). The argument `varNames` requires a character string indicating the name of one allometric and the stage of sexual maturity variable to be used for analysis (e.g `varNames = c("total_length", "stage_mat")`). So the argument `varNames` must contain two character strings only, the first is the allometric variable and the second is the stage of sexual maturity. The arguments `inmName` and `matName` require a character string indicating the name of the stages of sexual maturity in the data.frame. The argument could contain one character string or could be a vector (e.g `inmName = "I"`, `matName = c("II", "III", "IV")`). The variable *stage_mat* in the *matFish* database, contains the stages of the sexual maturity. In this case, stage I is considered immature and II, III, IV are mature. Then the stages of sexual maturity are transformed in a binomial variable where immature = 0 and mature = 1. The argument `method` requires a character string indicating which regression will be used for the test. If `method = "fq"` the logit regression is based on GLM (frequentist) and if `method = "bayes"` a sample from the posterior distribution of a logit regression model using a random walk Metropolis algorithm is generated (see MCMClogit function). The argument `niter` requires a number. For the GLM regression (`method = "fq"`), a non-parametric bootstrap method consists in generate B bootstrap samples, by resampling with replacement the original data. Then all statistics for each parameter can be calculated from each bootstrap sample (median and confidence intervals). For the `method = "bayes"`, the argument `niter` is related to the number of Metropolis iterations for the sampler. The output is an object of class "gonadMat". This object contains a dataframe with the allometric variable "X" and stage of sexual maturity (immature = 0, mature = 1). Also the fitted values for the curve logistic regression and confidence intervals (95\%). If you print the object, the median of the size at gonad maturity estimation ($L_{50}$) and parameters are shown. ## Estimate size at gonad maturity (maturity ogive estimation) ```{r, echo=TRUE} #Frequentist regression my_ogive_fq = gonad_mature(matFish, varNames = c("total_length", "stage_mat"), inmName = "I", matName = c("II", "III", "IV" ), method = "fq", niter = 999) print(my_ogive_fq) #Bayesian regression my_ogive_bayes = gonad_mature(matFish, varNames = c("total_length", "stage_mat"), inmName = "I", matName = c("II", "III", "IV" ), method = "bayes", niter = 999) print(my_ogive_bayes) ``` ## Plot maturity ogive For plotting the maturity ogive the object of class "gonadMat"is required. The function `plot` generates 4 graphics: 1), 2) and 3) are histograms for the A, B parameters and the size at gonadal maturity ($L_{50}$), the last is the maturity ogive. ```{r echo = TRUE, fig.width = 10, fig.height = 10, warning=FALSE} par(mfrow = c(2,2)) plot(my_ogive_fq, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red")) par(mfrow = c(2,2)) plot(my_ogive_bayes, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red")) ``` If you want the maturity ogive plot only, you have to add the param `onlyOgive = TRUE`. ```{r echo = TRUE, fig.width = 10, fig.height = 10, warning=FALSE} plot(my_ogive_fq, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"), onlyOgive = TRUE) ``` If you do not want to modify the position of the, write `legendPosition = "bottomright"` or another position. ```{r echo = TRUE, fig.width = 10, fig.height = 10, warning=FALSE} plot(my_ogive_fq, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"), onlyOgive = TRUE, legendPosition = "bottomright") ``` If you do not want to show the legend, write `showLegend = FALSE`. Besides you can modify the axis (size, rotation, etc) and add the legend. ```{r echo = TRUE, fig.width = 10, fig.height = 10, warning=FALSE} plot(my_ogive_fq, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"), onlyOgive = TRUE, showLegend = FALSE) ``` ## References Agostinho, C. S. (2000). Use of otoliths to estimate size at sexual maturity in fish. Brazilian Archives of Biology and Technology, 43(4). Corgos, A. and Freire, J. (2006). Morphometric and gonad maturity in the spider crab Maja brachydactyla: a comparison of methods for estimating size at maturity in species with determinate growth. ICES Journal of Marine Science: Journal du Conseil, 63(5), 851-859. Roa, R., Ernst, B. and Tapia, F. (1999). Estimation of size at sexual maturity: an evaluation of analytical and resampling procedures. Fishery Bulletin, 97(3), 570-580. Somerton, D. A. (1980). A computer technique for estimating the size of sexual maturity in crabs. Canadian Journal of Fisheries and Aquatic Sciences, 37(10), 1488-1494.