Package 'enmpa'

Title:	Ecological Niche Modeling using Presence-Absence Data
Description:	A set of tools to perform Ecological Niche Modeling with presence-absence data. It includes algorithms for data partitioning, model fitting, calibration, evaluation, selection, and prediction. Other functions help to explore signals of ecological niche using univariate and multivariate analyses, and model features such as variable response curves and variable importance. Unique characteristics of this package are the ability to exclude models with concave quadratic responses, and the option to clamp model predictions to specific variables. These tools are implemented following principles proposed in Cobos et al., (2022) <doi:10.17161/bi.v17i.15985>, Cobos et al., (2019) <doi:10.7717/peerj.6281>, and Peterson et al., (2008) <doi:10.1016/j.ecolmodel.2007.11.008>.
Authors:	Luis F. Arias-Giraldo [aut, cre] , Marlon E. Cobos [aut] , A. Townsend Peterson [ctb]
Maintainer:	Luis F. Arias-Giraldo <[email protected]>
License:	GPL (>= 3)
Version:	0.1.9
Built:	2025-03-31 14:37:11 UTC
Source:	https://github.com/luisagi/enmpa

Help Index

Example of results obtained from GLM calibration using enmpa
GLM calibration with presence-absence data
Example data used to run model calibration exercises
enmpa: Ecological Niche Modeling using Presence-Absence Data
Constructor of S3 objects of class enmpa_calibration
Constructor of S3 objects of class enmpa_fitted_models
Summary of evaluation statistics for candidate models
Fitting selected GLMs models
Get GLM formulas according to defined response types
Evaluate final models using independent data
Jackkniffe test for variable contribution
K-fold data partitioning
Selection of best candidate models considering various criteria
Model validation options
Niche Signal detection using one or multiple variables
Find threshold values to produce three optimal metrics
Plot variable importance
Jackkniffe plot for variable contribution
Plot Niche Signal results
Extension of glm predict to generate predictions of different types
Predictions for the models selected after calibration
Print a short version of elements in 'calibration' and 'fitted models' objects
Partial ROC calculation
Two-Way interaction response plot
Variable response curves for GLMs
Example of selected models fitted
Summary of 'calibration' and 'fitted models'
Example data used to test models
Variable importance for GLMs

Example of results obtained from GLM calibration using enmpa

Description

An object of the class enmpa_calibration storing the results from GLM calibration.

Usage

cal_res
cal_res

Format

An object of class enmpa_calibration with results from the function ⁠\link{calibration_glm}⁠.

Examples

data("cal_res", package = "enmpa")

str(cal_res)
data("cal_res", package = "enmpa")

str(cal_res)

GLM calibration with presence-absence data

Description

Creates candidate models based on distinct parameter settings, evaluates models, and selects the ones that perform the best.

Usage

calibration_glm(data, dependent, independent, weights = NULL,
                response_type = "l", formula_mode = "moderate",
                minvar = 1, maxvar = NULL, user_formulas = NULL,
                cv_kfolds = 5, partition_index = NULL, seed = 1,
                n_threshold = 100, selection_criterion = "TSS",
                exclude_bimodal = FALSE, tolerance = 0.01,
                out_dir = NULL, parallel = FALSE,
                n_cores = NULL, verbose = TRUE)
calibration_glm(data, dependent, independent, weights = NULL,
                response_type = "l", formula_mode = "moderate",
                minvar = 1, maxvar = NULL, user_formulas = NULL,
                cv_kfolds = 5, partition_index = NULL, seed = 1,
                n_threshold = 100, selection_criterion = "TSS",
                exclude_bimodal = FALSE, tolerance = 0.01,
                out_dir = NULL, parallel = FALSE,
                n_cores = NULL, verbose = TRUE)

Arguments

`data`	data.frame or matrix of data to be used in model calibration. Columns represent dependent and independent variables.
`dependent`	(character) name of dependent variable.
`independent`	(character) vector of name(s) of independent variable(s).
`weights`	(numeric) a vector with the weights for observations.
`response_type`	(character) a character string that must contain "l", "p", "q" or a combination of them. l = lineal, q = quadratic, p = interaction between two variables. Default = "l".
`formula_mode`	(character) a character string to indicate the strategy to create the formulas for candidate models. Options are: "light", "moderate", "intensive", or "complex". Default = "moderate". "complex" returns only the most complex formula defined in `response_type`.
`minvar`	(numeric) minimum number of independent variables in formulas.
`maxvar`	(numeric) maximum number of independent variables in formulas.
`user_formulas`	(character) vector with formula(s) to test. Default = NULL.
`cv_kfolds`	(numeric) number of folds to use for k-fold cross-validation exercises. Default = 5. Ignored if `partition_index` is defined.
`partition_index`	list of indices for cross-validation in k-fold. The default, NULL, uses the function `kfold_partition`.
`seed`	(numeric) a seed for k-fold partitioning.
`n_threshold`	(logical) number of threshold values to produce evaluation metrics. Default = 100.
`selection_criterion`	(character) criterion used to select best models, options are "TSS" and "ESS". Default = "TSS".
`exclude_bimodal`	(logical) whether to filter out models with one or more variables presenting concave responses. Default = FALSE.
`tolerance`	(numeric) value to modify the limit value of the metric used to filter models during model selection if none of the models meet initial considerations. Default = 0.01
`out_dir`	(character) output directory name to save the main calibration results. Default = NULL.
`parallel`	(logical) whether to run on parallel or sequential. Default = FALSE.
`n_cores`	(numeric) number of cores to use. Default = number of free processors - 1.
`verbose`	(logical) whether to print messages and show progress bar. Default = TRUE

Details

Model evaluation is done considering the ability to predict presences and absences,as well as model fitting and complexity. Model selection consists of three steps: 1) a first filter to keep the models with ROC AUC >= 0.5 (statistically significant models), 2) a second filter to maintain only models that meet the selection_criterion ("TSS": TSS >= 0.4; or "ESS": maximum Accuracy - tolerance), and 3) from those, pick the ones with delta AIC <= 2.

formula_mode options determine what strategy to iterate the predictors defined in type for creating models:

light.– returns simple iterations of complex formulas.
moderate.– returns a comprehensive number of iterations.
intensive.– returns all possible combination. Very time-consuming for 6 or more independent variables.
complex.– returns only the most complex formula.

Value

An object of the class enmpa_calibration containing: selected models, a summary of statistics for all models, results obtained in cross-validation for all models, original data used, weights, and data-partition indices used.

Examples

# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")
head(enm_data)

# Calibration using linear (l), quadratic (q), products(p) responses.
cal_res <- calibration_glm(data = enm_data, dependent = "Sp",
                           independent = c("bio_1", "bio_12"),
                           response_type = "lpq", formula_mode = "moderate",
                           selection_criterion = "TSS", cv_kfolds = 3,
                           exclude_bimodal = TRUE, verbose = FALSE)
print(cal_res)
summary(cal_res)

head(cal_res$calibration_results)
head(cal_res$summary)
head(cal_res$selected)
head(cal_res$data)

# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")
head(enm_data)

# Calibration using linear (l), quadratic (q), products(p) responses.
cal_res <- calibration_glm(data = enm_data, dependent = "Sp",
                           independent = c("bio_1", "bio_12"),
                           response_type = "lpq", formula_mode = "moderate",
                           selection_criterion = "TSS", cv_kfolds = 3,
                           exclude_bimodal = TRUE, verbose = FALSE)
print(cal_res)
summary(cal_res)

head(cal_res$calibration_results)
head(cal_res$summary)
head(cal_res$selected)
head(cal_res$data)

Example data used to run model calibration exercises

Description

A dataset containing information on presence and absence, and independent variables used to fit GLM models.

Usage

enm_data
enm_data

Format

A data frame with 5627 rows and 3 columns.

Sp: numeric, values of 0 = absence and 1 = presence.
bio_1: numeric, temperature values.
bio_12: numeric, precipitation values.

Examples

data("enm_data", package = "enmpa")
head(enm_data)
data("enm_data", package = "enmpa")
head(enm_data)

enmpa: Ecological Niche Modeling using Presence-Absence Data

Description

enmpa contains a set of tools to perform detailed Ecological Niche Modeling using presence-absence data.

Details

It includes algorithms for data partitioning, model fitting, calibration, evaluation, selection, and prediction. Other functions help to explore model features as such variable response curves and variable importance.

Main functions in `enmpa`

calibration_glm, evaluation_stats, fit_glms, fit_selected, get_formulas, independent_eval1, kfold_partition, model_selection model_validation, niche_signal, optimize_metrics, predict_glm, predict_selected, response_curve, resp2var, var_importance, jackknife, plot_jk

Author(s)

Maintainer: Luis F. Arias-Giraldo [email protected] (ORCID)

Authors:

Marlon E. Cobos [email protected] (ORCID)

Other contributors:

A. Townsend Peterson [email protected] (ORCID) [contributor]

Constructor of S3 objects of class enmpa_calibration

Description

Constructor of S3 objects of class enmpa_calibration

Usage

new_enmpa_calibration(selected, summary, calibration_results, data,
                      partitioned_data, weights = NULL)
new_enmpa_calibration(selected, summary, calibration_results, data,
                      partitioned_data, weights = NULL)

Arguments

`selected`	date.frame with information about selected models.
`summary`	data.frame a summary of statistics for all models.
`calibration_results`	data.frame with results obtained from cross-validation for all models.
`data`	data.frame or matrix with the input data used for calibration.
`partitioned_data`	a list of partition indices.
`weights`	(numeric) a vector with the weights for observations. Default = NULL.

Value

An S3 object of class enmpa_calibration.

Constructor of S3 objects of class enmpa_fitted_models

Description

Constructor of S3 objects of class enmpa_fitted_models

Usage

new_enmpa_fitted_models(glms_fitted, selected, data, weights = NULL)
new_enmpa_fitted_models(glms_fitted, selected, data, weights = NULL)

Arguments

`glms_fitted`	a list of fitted GLMs.
`selected`	date.frame with information about selected models.
`data`	data.frame or matrix with the input data used for calibration.
`weights`	(numeric) a vector with the weights for observations. Default = NULL.

Value

An S3 object of class enmpa_fitted_models.

Summary of evaluation statistics for candidate models

Description

Calculate median and standard deviation of evaluation results for all candidate models considering cross-validation kfolds.

Usage

evaluation_stats(evaluation_results, bimodal_toexclude = FALSE)
evaluation_stats(evaluation_results, bimodal_toexclude = FALSE)

Arguments

`evaluation_results`	data.frame model evaluation results. These results are the output of the function `model_validation`.
`bimodal_toexclude`	(logical) whether models in which binomial variable response curves were detected will be excluded during selection processes.

Value

A data.frame with the mean and standard deviation for all metrics considering cross-validation kfolds.

Examples

# data
data("cal_res", package = "enmpa")
all_res <- cal_res$calibration_results[, -1]

# statistics for all evaluation results
evaluation_stats(all_res, bimodal_toexclude = TRUE)
# data
data("cal_res", package = "enmpa")
all_res <- cal_res$calibration_results[, -1]

# statistics for all evaluation results
evaluation_stats(all_res, bimodal_toexclude = TRUE)

Fitting selected GLMs models

Description

Functions to facilitate fitting multiple GLMs.

Usage

fit_selected(glm_calibration)

fit_glms(formulas, data, weights = NULL, id = NULL)
fit_selected(glm_calibration)

fit_glms(formulas, data, weights = NULL, id = NULL)

Arguments

`glm_calibration`	a list resulting from `calibration_glm`. Models fitted are those in the slot "selected".
`formulas`	(character) a vector containing the formula(s) for GLM(s).
`data`	data.frame with the dependent and independent variables.
`weights`	(numeric) a vector with the weights for observations. Default = NULL.
`id`	(character) id code for models fitted. Default = NULL.

Value

A list of fitted GLMs.

For fit_selected, an enmpa ⁠fitted models⁠ object.

Examples

# GLM calibration results
data(cal_res, package = "enmpa")

# Fitting selected models
sel_fit <- fit_selected(cal_res)

sel_fit

# Custom formulas
forms <- c("Sp ~ bio_1 + I(bio_1^2) + I(bio_12^2)",
           "Sp ~ bio_12 + I(bio_1^2) + I(bio_12^2)")

# Fitting models
fits <- fit_glms(forms, data = cal_res$data)

fits$ModelID_1
# GLM calibration results
data(cal_res, package = "enmpa")

# Fitting selected models
sel_fit <- fit_selected(cal_res)

sel_fit

# Custom formulas
forms <- c("Sp ~ bio_1 + I(bio_1^2) + I(bio_12^2)",
           "Sp ~ bio_12 + I(bio_1^2) + I(bio_12^2)")

# Fitting models
fits <- fit_glms(forms, data = cal_res$data)

fits$ModelID_1

Get GLM formulas according to defined response types

Description

Generate GLM formulas for independent variables predicting a dependent variable, taking into account response types required. All possible combinations of variables can be created using arguments of the function.

Usage

get_formulas(dependent, independent, type = "l", mode = "moderate",
             minvar = 1, maxvar = NULL)

get_formulas_main(dependent, independent, type = "l",
                  complex = FALSE, minvar = 1, maxvar = NULL)

aux_var_comb(var_names, minvar = 2, maxvar = NULL)

aux_string_comb(string)
get_formulas(dependent, independent, type = "l", mode = "moderate",
             minvar = 1, maxvar = NULL)

get_formulas_main(dependent, independent, type = "l",
                  complex = FALSE, minvar = 1, maxvar = NULL)

aux_var_comb(var_names, minvar = 2, maxvar = NULL)

aux_string_comb(string)

Arguments

`dependent`	(character) name of dependent variable.
`independent`	(character) vector of name(s) of independent variable(s).
`type`	(character) a character string that must contain "l", "p", "q" or a combination of them. l = lineal, q = quadratic, p = interaction between two variables. Default = "l".
`mode`	(character) (character) a character string to indicate the strategy to create the formulas for candidate models. Options are: "light", "moderate", "intensive", or "complex". Default = "moderate".
`minvar`	(numeric) minimum number of independent variables in formulas.
`maxvar`	(numeric) maximum number of independent variables in formulas.
`complex`	(logical) whether to return the most complex formula.
`var_names`	sames as `independent`.
`string`	same as `type`.

Details

mode options determine what strategy to iterate the predictors defined in type for creating models:

light.– returns simple iterations of complex formulas.
moderate.– returns a comprehensive number of iterations.
intensive.– returns all possible combination. Very time-consuming for 6 or more independent variables.
complex.– returns only the most complex formula.

Value

A character vector containing the resulting formula(s).

Examples

# example variables
dep <- "sp"
ind <- c("temp", "rain", "slope")

# The most complex formula according to "type"
get_formulas(dep, ind, type = "lqp", mode = "complex")

# mode = 'light', combinations according to type
get_formulas(dep, ind, type = "lqp", mode = "light")

# mode = 'light', combinations according to type
get_formulas(dep, ind, type = "lqp", mode = "intensive")
# example variables
dep <- "sp"
ind <- c("temp", "rain", "slope")

# The most complex formula according to "type"
get_formulas(dep, ind, type = "lqp", mode = "complex")

# mode = 'light', combinations according to type
get_formulas(dep, ind, type = "lqp", mode = "light")

# mode = 'light', combinations according to type
get_formulas(dep, ind, type = "lqp", mode = "intensive")

Evaluate final models using independent data

Description

Final evaluation steps for model predictions using an independent dataset (not used in model calibration).

Usage

independent_eval1(prediction, threshold, test_prediction = NULL,
                  lon_lat = NULL)

independent_eval01(prediction, observation, lon_lat = NULL)
independent_eval1(prediction, threshold, test_prediction = NULL,
                  lon_lat = NULL)

independent_eval01(prediction, observation, lon_lat = NULL)

Arguments

`prediction`	(numeric) vector or `SpatRaster` object. If numeric, predicted values in independent data (for `independent_eval01`), or the entire area of prediction (for `independent_eval1`). If `SpatRaster` prediction over the area of interest.
`threshold`	(numeric) the lowest predicted probability value for an occurrence point. This value must be defined for presences-only data. Default = NULL.
`test_prediction`	(numeric) vector of predictions for independent data. Default = NULL.
`lon_lat`	matrix or data.frame of coordinates (longitude and latitude, in that order) of independent data. Points must be located within the valid area of `prediction`. For `independent_eval01` they must correspond with values in observation.
`observation`	(numeric) vector of observed (known) values of presence or absence to test against `prediction` (if numeric) or values of prediction in `lon_lat`.

Value

A data.frame or list containing evaluation results.

Examples

# Independent test data based on coordinates (lon/lat WGS 84) from presence
# and absences records
data("test", package = "enmpa")
head(test)

# Loading a model prediction
pred <- terra::rast(system.file("extdata", "proj_out_wmean.tif",
                                package = "enmpa"))
terra::plot(pred)

# Evaluation using presence-absence data
independent_eval01(prediction = pred, observation = test$Sp,
                   lon_lat = test[, 2:3])

# Evaluation using presence-only data
test_p_only <- test[test$Sp == 1, ]
th_maxTSS   <- 0.1274123            # threshold based on the maxTSS

independent_eval1(prediction = pred, threshold = th_maxTSS,
                  lon_lat = test_p_only[, 2:3])
# Independent test data based on coordinates (lon/lat WGS 84) from presence
# and absences records
data("test", package = "enmpa")
head(test)

# Loading a model prediction
pred <- terra::rast(system.file("extdata", "proj_out_wmean.tif",
                                package = "enmpa"))
terra::plot(pred)

# Evaluation using presence-absence data
independent_eval01(prediction = pred, observation = test$Sp,
                   lon_lat = test[, 2:3])

# Evaluation using presence-only data
test_p_only <- test[test$Sp == 1, ]
th_maxTSS   <- 0.1274123            # threshold based on the maxTSS

independent_eval1(prediction = pred, threshold = th_maxTSS,
                  lon_lat = test_p_only[, 2:3])

Jackkniffe test for variable contribution

Description

The Jackknife function providing a detailed reflection of the impact of each variable on the overall model, considering four difference measures: ROC-AUC, TSS, AICc, and Deviance.

Usage

jackknife(data, dependent, independent, user_formula = NULL, cv = 3,
          response_type = "l", weights = NULL)
jackknife(data, dependent, independent, user_formula = NULL, cv = 3,
          response_type = "l", weights = NULL)

Arguments

`data`	data.frame or matrix of data to be used in model calibration. Columns represent dependent and independent variables.
`dependent`	(character) name of dependent variable.
`independent`	(character) vector of name(s) of independent variable(s).
`user_formula`	(character) custom formula to test. Default = NULL.
`cv`	(numeric) number of folds to use for k-fold cross-validation exercises. Default = 3.
`response_type`	(character) a character string that must contain "l", "p", "q" or a combination of them. l = lineal, q = quadratic, p = interaction between two variables. Default = "l".
`weights`	(numeric) a vector with the weights for observations.

Value

list including model performance metrics (ROC-AUC, TSS, AICc, and deviance) for the complete model, model performance when excluding a specific predictor, and the independent contribution of that predictor to the model.

Examples


# Load data
data("enm_data", package = "enmpa")

jk <- jackknife(data = enm_data,
                dependent = "Sp",
                independent = c("bio_1", "bio_12"),
                user_formula = NULL,
                cv = 3, response_type = "lpq")
jk

# plot JK's results
plot_jk(jk, metric = "TSS")
plot_jk(jk, metric = "ROC_AUC")
plot_jk(jk, metric = "AIC")
plot_jk(jk, metric = "Residual_deviance")

# Load data
data("enm_data", package = "enmpa")

jk <- jackknife(data = enm_data,
                dependent = "Sp",
                independent = c("bio_1", "bio_12"),
                user_formula = NULL,
                cv = 3, response_type = "lpq")
jk

# plot JK's results
plot_jk(jk, metric = "TSS")
plot_jk(jk, metric = "ROC_AUC")
plot_jk(jk, metric = "AIC")
plot_jk(jk, metric = "Residual_deviance")

K-fold data partitioning

Description

Creates indices to partition available data into k equal-sized subsets or folds, maintaining the global proportion of presence-absences in each fold.

Usage

kfold_partition(data, dependent, k = 2, seed = 1)
kfold_partition(data, dependent, k = 2, seed = 1)

Arguments

`data`	data.frame or matrix containing at least two columns.
`dependent`	(character) name of column that contains the presence-absence records (1-0).
`k`	(numeric) the number of groups that the given data is to be split into.
`seed`	(numeric) integer value to specify an initial seed. Default = 1.

Value

A list of vectors with the indices of rows corresponding to each fold.

Examples

# example data
data <- data.frame(species = c(rep(0, 80), rep (1,20)),
                   variable1 = rnorm(100),
                   variable2 = rpois(100, 2))

# create partition indices
kfolds <- kfold_partition(data, dependent = "species", k = 2)

# data for partition 1
data[kfolds$Fold_1, ]
# example data
data <- data.frame(species = c(rep(0, 80), rep (1,20)),
                   variable1 = rnorm(100),
                   variable2 = rpois(100, 2))

# create partition indices
kfolds <- kfold_partition(data, dependent = "species", k = 2)

# data for partition 1
data[kfolds$Fold_1, ]

Selection of best candidate models considering various criteria

Description

Applies a series of criteria to select best candidate models.

Usage

model_selection(evaluation_stats, criterion = "TSS", exclude_bimodal = FALSE,
                tolerance = 0.01)
model_selection(evaluation_stats, criterion = "TSS", exclude_bimodal = FALSE,
                tolerance = 0.01)

Arguments

`evaluation_stats`	data.frame with the statistics of model evaluation results. These results are the output of the function `evaluation_stats`.
`criterion`	(character) metric used as the predictive criterion for model selection.
`exclude_bimodal`	(logical) whether to exclude models in which binomial variable response curves were detected.
`tolerance`	(numeric)

Value

A data.frame with one or more selected models.

Examples

# data
data("cal_res", package = "enmpa")
eval_stats <- cal_res$summary[, -1]

# selecting best model
selected_mod <- model_selection(eval_stats, exclude_bimodal = TRUE)
# data
data("cal_res", package = "enmpa")
eval_stats <- cal_res$summary[, -1]

# selecting best model
selected_mod <- model_selection(eval_stats, exclude_bimodal = TRUE)

Model validation options

Description

Model evaluation using entire set of data and a k-fold cross validation approach. Models are assessed based on discrimination power (ROC-AUC), classification ability (accuracy, sensitivity, specificity, TSS, etc.), and the balance between fitting and complexity (AIC).

Usage

model_validation(formula, data, family = binomial(link = "logit"),
                 weights = NULL, cv = FALSE, partition_index = NULL,
                 k = NULL, dependent = NULL, n_threshold = 100,
                 keep_coefficients = FALSE, seed = 1)
model_validation(formula, data, family = binomial(link = "logit"),
                 weights = NULL, cv = FALSE, partition_index = NULL,
                 k = NULL, dependent = NULL, n_threshold = 100,
                 keep_coefficients = FALSE, seed = 1)

Arguments

`formula`	(character) `expression` to be used as a glm `formula`.
`data`	data.frame with dependent and independent variables.
`family`	a `family` object for models used by functions such as `glm`. Default = binomial(link = "logit").
`weights`	(numeric) vector with weights for observations. Default = NULL.
`cv`	(logical) whether to use a k-fold cross validation for evaluation. Default = FALSE.
`partition_index`	list of indices for cross validation in k-fold. Obtained with the function `kfold_partition`. Default = NULL.
`k`	(numeric) number of folds for a new k-fold index preparation. Ignored if `partition_index` is defined or if `cv` = FALSE. Default = NULL.
`dependent`	(character) name of dependent variable. Ignore if `cv` = FALSE. Default = NULL.
`n_threshold`	(numeric) number of threshold values to be used for ROC. Default = 100.
`keep_coefficients`	(logical) whether to keep model coefficients. Default = FALSE.
`seed`	(numeric) a seed number. Default = 1.

Value

A data.frame with results from evaluation.

Examples

# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")
head(enm_data)

# Custom formula
form <- c("Sp ~ bio_1 + I(bio_1^2) + I(bio_12^2)")

# Model evaluation using the entire set of records
model_validation(form, data = enm_data)

# Model evaluation using a k-fold cross-validation (k = 3)
model_validation(form, data = enm_data, cv = TRUE, k = 3, dependent = "Sp")
# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")
head(enm_data)

# Custom formula
form <- c("Sp ~ bio_1 + I(bio_1^2) + I(bio_12^2)")

# Model evaluation using the entire set of records
model_validation(form, data = enm_data)

# Model evaluation using a k-fold cross-validation (k = 3)
model_validation(form, data = enm_data, cv = TRUE, k = 3, dependent = "Sp")

Niche Signal detection using one or multiple variables

Description

Identifies whether a signal of niche can be detected using one or multiple variables. This is an implementation of the methods developed by Cobos & Peterson (2022) doi:10.17161/bi.v17i.15985 that focuses on identifying niche signals in presence-absence data.

Usage

niche_signal(data, condition, variables, method = "univariate",
             permanova_method = "mahalanobis", iterations = 1000,
             set_seed = 1, verbose = TRUE, ...)

niche_signal_univariate(data, condition, variable, iterations = 1000,
                        set_seed = 1, verbose = TRUE)

niche_signal_permanova(data, condition, variables, permutations = 999,
                       permanova_method = "mahalanobis", verbose = TRUE, ...)
niche_signal(data, condition, variables, method = "univariate",
             permanova_method = "mahalanobis", iterations = 1000,
             set_seed = 1, verbose = TRUE, ...)

niche_signal_univariate(data, condition, variable, iterations = 1000,
                        set_seed = 1, verbose = TRUE)

niche_signal_permanova(data, condition, variables, permutations = 999,
                       permanova_method = "mahalanobis", verbose = TRUE, ...)

Arguments

`data`	matrix or data.frame containing at least the following information: a column representing `condition` (positive = 1 or negative = 0), and one or more columns representing environmental variables.
`condition`	(character) name of the column with numeric information about detection (positive = 1 or negative = 0).
`variables`	(character) vector of one or more names of columns to be used as environmental variables. If `method` = "univariate", only one variable is used; for `method` = "permanova", multiple variables can be used.
`method`	(character) name of the method to be used for niche comparison. Default = "univariate".
`permanova_method`	(character) name of the dissimilarity index to be used as `method` in `adonis2`. See all options in `vegdist`. Default = "mahalanobis".
`iterations`	(numeric) number of iterations to be used in analysis. Default = 1000. If `method` = "permanova", permutations = iterations - 1.
`set_seed`	(numeric) integer value to specify a initial seed. Default = 1.
`verbose`	(logical) whether or not to print messages about the process. Default = TRUE.
`...`	other arguments to be passed to `adonis2`.
`variable`	(character) name of the column containing data to be used as environmental variable.
`permutations`	number of permutations to be performed.

Value

A list with results from analysis depending on method.

Examples

# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")
head(enm_data)

# Detection of niche signal using an univariate non-parametric test
sn_bio1 <- niche_signal(data = enm_data, variables = "bio_1",
                        condition = "Sp", method = "univariate")
sn_bio1

sn_bio12 <- niche_signal(data = enm_data, variables = "bio_12",
                         condition = "Sp", method = "univariate")
sn_bio12
# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")
head(enm_data)

# Detection of niche signal using an univariate non-parametric test
sn_bio1 <- niche_signal(data = enm_data, variables = "bio_1",
                        condition = "Sp", method = "univariate")
sn_bio1

sn_bio12 <- niche_signal(data = enm_data, variables = "bio_12",
                         condition = "Sp", method = "univariate")
sn_bio12

Find threshold values to produce three optimal metrics

Description

The metrics true skill statistic (TSS), sensitivity, specificity are explored by comparing actual vs predicted values to find threshold values that produce sensitivity = specificity, maximum TSS, and a sensitivity value of 0.9.

Usage

optimize_metrics(actual, predicted, n_threshold = 100)
optimize_metrics(actual, predicted, n_threshold = 100)

Arguments

`actual`	(numeric) vector of actual values (0, 1) to be compared to `predicted` values after applying a threshold.
`predicted`	(numeric) vector of predicted probability values to be thresholded and compared to `actual`.
`n_threshold`	(numeric) number of threshold values to be used. Default = 100.

Value

A list containing a data.frame with the resulting metrics for all threshold values tested, and a second data.frame with the results for the threshold values that produce sensitivity = specificity (ESS), maximum TSS (maxTSS), and a sensitivity value of 0.9 (SEN90).

Examples

# example data
act <- c(rep(1, 20), rep(0, 80))
pred <- c(runif(20, min = 0.4, max = 0.7), runif(80, min = 0, max = 0.5))

# run example
om <- optimize_metrics(actual = act, predicted = pred)
om$optimized
# example data
act <- c(rep(1, 20), rep(0, 80))
pred <- c(runif(20, min = 0.4, max = 0.7), runif(80, min = 0, max = 0.5))

# run example
om <- optimize_metrics(actual = act, predicted = pred)
om$optimized

Plot variable importance

Description

Visualization of the results obtained with the function var_importance.

Usage

plot_importance(x, xlab = NULL, ylab = "Relative contribution",
                main = "Variable importance", extra_info = TRUE, ...)
plot_importance(x, xlab = NULL, ylab = "Relative contribution",
                main = "Variable importance", extra_info = TRUE, ...)

Arguments

`x`	data.frame output from `var_importance`.
`xlab`	(character) a label for the x axis.
`ylab`	(character) a label for the y axis.
`main`	(character) main title for the plot.
`extra_info`	(logical) when results are from more than one model, it adds information about the number of models using each predictor and the mean contribution found.
`...`	additional arguments passed to `barplot` or `boxplot`.

Value

A plot

Examples


# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")

# Custom formulas
forms <- c("Sp ~ bio_1 + I(bio_1^2) + I(bio_12^2)",
           "Sp ~ bio_12 + I(bio_1^2) + I(bio_12^2)")

# Fitting models
fits <- fit_glms(forms, data = enm_data)

# Variable importance for single models
vi_1 <- var_importance(fits$ModelID_1)
plot_importance(x = vi_1)

vi_2 <- var_importance(fits$ModelID_2)
plot_importance(x = vi_2)

# Variable importance for multiple models
vi_c <- var_importance(fits)
plot_importance(x = vi_c)
# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")

# Custom formulas
forms <- c("Sp ~ bio_1 + I(bio_1^2) + I(bio_12^2)",
           "Sp ~ bio_12 + I(bio_1^2) + I(bio_12^2)")

# Fitting models
fits <- fit_glms(forms, data = enm_data)

# Variable importance for single models
vi_1 <- var_importance(fits$ModelID_1)
plot_importance(x = vi_1)

vi_2 <- var_importance(fits$ModelID_2)
plot_importance(x = vi_2)

# Variable importance for multiple models
vi_c <- var_importance(fits)
plot_importance(x = vi_c)

Jackkniffe plot for variable contribution

Description

The jackknife figure shows the impact of each variable on the full model, providing detailed information about the function and significance of each variable. Light blue indicates the impact on the model if the variable is not included, while dark blue indicates the independent contribution of the variable to the model.

Usage

plot_jk(x, metric = "ROC_AUC", legend = TRUE,
        colors = c("cyan", "blue", "red"),
        xlab = NULL, main = NULL)
plot_jk(x, metric = "ROC_AUC", legend = TRUE,
        colors = c("cyan", "blue", "red"),
        xlab = NULL, main = NULL)

Arguments

`x`	list output from `jackknife`.
`metric`	(character) model metric to plot. Default = "ROC_AUC".
`legend`	(logical) whether to add legend. Default = TRUE.
`colors`	(character) vector of colors. Default = c("cyan", "blue", "red").
`xlab`	(character) a label for the x axis.
`main`	(character) main title for the plot.

Plot Niche Signal results

Description

Plots to interpret results from niche_signal tests (Cobos & Peterson (2022) doi:10.17161/bi.v17i.15985).

Usage

plot_niche_signal(niche_signal_list, statistic = "mean",
                  variables = NULL, ellipses = FALSE, level = 0.99,
                  breaks = "Sturges", main = "", xlab = NULL, ylab = NULL,
                  h_col = "lightgray", h_cex = 0.8, lty = 2, lwd = 1,
                  l_col = c("blue", "black"), e_col = c("black", "red"),
                  pch = 19, pt_cex = c(1.3, 0.8), pt_col = c("black", "red"),
                  ...)

plot_niche_signal_univariate(niche_signal_univariate_list, statistic = "mean",
                             breaks = "Sturges", main = "", xlab = NULL,
                             ylab = "Frequency", h_col = "lightgray",
                             h_cex = 0.8, lty = 2, lwd = 1,
                             l_col = c("blue", "black"), ...)

plot_niche_signal_permanova(niche_signal_permanova_list, variables = NULL,
                           ellipses = FALSE, level = 0.99, main = "",
                           xlab = NULL, ylab = NULL,
                           e_col = c("black", "red"), lty = 2, lwd = 1,
                           pch = 19, pt_cex = c(1.3, 0.8),
                           pt_col = c("black", "red"), ...)
plot_niche_signal(niche_signal_list, statistic = "mean",
                  variables = NULL, ellipses = FALSE, level = 0.99,
                  breaks = "Sturges", main = "", xlab = NULL, ylab = NULL,
                  h_col = "lightgray", h_cex = 0.8, lty = 2, lwd = 1,
                  l_col = c("blue", "black"), e_col = c("black", "red"),
                  pch = 19, pt_cex = c(1.3, 0.8), pt_col = c("black", "red"),
                  ...)

plot_niche_signal_univariate(niche_signal_univariate_list, statistic = "mean",
                             breaks = "Sturges", main = "", xlab = NULL,
                             ylab = "Frequency", h_col = "lightgray",
                             h_cex = 0.8, lty = 2, lwd = 1,
                             l_col = c("blue", "black"), ...)

plot_niche_signal_permanova(niche_signal_permanova_list, variables = NULL,
                           ellipses = FALSE, level = 0.99, main = "",
                           xlab = NULL, ylab = NULL,
                           e_col = c("black", "red"), lty = 2, lwd = 1,
                           pch = 19, pt_cex = c(1.3, 0.8),
                           pt_col = c("black", "red"), ...)

Arguments

`niche_signal_list`	list of results from niche_signal.
`statistic`	(character) name of the statistic for which results will be explored when results come for univariate analysis. Default = "mean". Options are: "mean", "median", "SD", and "range".
`variables`	(character) name of variables to used in plots when results come from analysis using the `permanova` method. The default, NULL, uses the first two variables.
`ellipses`	(logical) whether to use ellipses to represent all and positive data when results come from PERMANOVA. The default, FALSE, plots points instead.
`level`	(numeric) value from 0 to 1 representing the limit of the ellipse to be plotted. Default = 0.99.
`breaks`	breaks in the histogram as in `hist`. Default = "Sturges".
`main`	(character) title for plot. Default = "".
`xlab`	(character) x axis label. Default = NULL. For results from PERMANOVA, appropriate variable names are used.
`ylab`	(character) y axis label. Default = NULL. For univariate results, the default turn into "Frequency". For results from PERMANOVA, appropriate variable names are used.
`h_col`	a color to be used to fill the bars of histograms. Default = "lightgray".
`h_cex`	(numeric) value by which plotting text and symbols should be magnified relative to the default in histograms. Default = 0.8.
`lty`	(numeric) line type. See options in `par`. Default = 2. A vector of length = 2 can be used, in order, all and positive.
`lwd`	(numeric) line width. See options in `par`. Default = 1. A vector of length = 2 can be used, in order, all and positive.
`l_col`	line color for observed value of positives and confidence intervals. Default = c("blue", "black").
`e_col`	color of ellipse lines for all and positive data. Default = c("black", "red").
`pch`	point type. See options in `points`. Default = 19. A vector of length = 2 can be used, in order, all and positive.
`pt_cex`	(numeric) value by which points will be magnified. Values for all and positive points are recommended. Default = c(1.3, 0.8).
`pt_col`	color for points. Values for all and positive points are recommended. Default = c("black", "red").
`...`	other plotting arguments to be used.
`niche_signal_univariate_list`	list of results from niche_signal_univariate.
`niche_signal_permanova_list`	list of results from niche_signal_permanova.

Value

A plot.

Examples

# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")
head(enm_data)

# Detection of niche signal using an univariate non-parametric test
sn_bio1 <- niche_signal(data = enm_data,
                        variables = "bio_1",
                        condition = "Sp",
                        method = "univariate")

plot_niche_signal(sn_bio1, variables = "bio_1")

sn_bio12 <- niche_signal(data = enm_data,
                         variables = "bio_12",
                         condition = "Sp",
                         method = "univariate")

plot_niche_signal(sn_bio12, variables = "bio_12")
# Load species occurrences and environmental data.
data("enm_data", package = "enmpa")
head(enm_data)

# Detection of niche signal using an univariate non-parametric test
sn_bio1 <- niche_signal(data = enm_data,
                        variables = "bio_1",
                        condition = "Sp",
                        method = "univariate")

plot_niche_signal(sn_bio1, variables = "bio_1")

sn_bio12 <- niche_signal(data = enm_data,
                         variables = "bio_12",
                         condition = "Sp",
                         method = "univariate")

plot_niche_signal(sn_bio12, variables = "bio_12")

Extension of glm predict to generate predictions of different types

Description

Obtains predictions from a fitted generalized linear model objects. It also allows the clamping option to restrict extrapolation in areas outside the calibration area.

Usage

predict_glm(
  model,
  newdata,
  data = NULL,
  extrapolation_type = "E",
  restricted_vars = NULL,
  type = "response"
)
predict_glm(
  model,
  newdata,
  data = NULL,
  extrapolation_type = "E",
  restricted_vars = NULL,
  type = "response"
)

Arguments

`model`	a `glm` object.
`newdata`	a data.frame or matrix with the new data to project the predictions.
`data`	data.frame or matrix of data used in the model calibration step. Default = NULL.
`extrapolation_type`	(character) to indicate extrapolation type of model. Models can be transferred with three options: free extrapolation ('E'), extrapolation with clamping ('EC'), and no extrapolation ('NE'). Default = 'E'.
`restricted_vars`	(character) a vector containing the names of the variables that will undergo clamping or no extrapolation. For clamping, these variables are set to minimum and maximum values established for the max and min values within calibration values. For no extrapolation, the variables outside calibration limits became NA. If no specific names are provided, the value is set to NULL by default, indicating that clamping (EC) or no extrapolation (NE) will be applied to all variables. Ignore if extrapolation_type = 'E'.
`type`	(character) the type of prediction required. For a default binomial model the default predictions are of log-odds (probabilities on logit scale). The default, "response", returns predicted probabilities.

Value

A SpatRaster object or a vector with predictions.

Examples

# Load fitted model
data("sel_fit", package = "enmpa")

# Load raster layers to be projected
env_vars <- terra::rast(system.file("extdata", "vars.tif", package = "enmpa"))

# Prediction
pred <- predict_glm(sel_fit$glms_fitted$ModelID_7, newdata = env_vars,
                    data = sel_fit$data)
terra::plot(pred)
# Load fitted model
data("sel_fit", package = "enmpa")

# Load raster layers to be projected
env_vars <- terra::rast(system.file("extdata", "vars.tif", package = "enmpa"))

# Prediction
pred <- predict_glm(sel_fit$glms_fitted$ModelID_7, newdata = env_vars,
                    data = sel_fit$data)
terra::plot(pred)

Predictions for the models selected after calibration

Description

Wrapper function that facilitates the prediction of those models selected as the most robust. In addition, it allows the calculation of consensus models, when more than one model are selected.

Usage

predict_selected(fitted, newdata, extrapolation_type = "E",
                 restricted_vars = NULL, type = "response", consensus = TRUE)
predict_selected(fitted, newdata, extrapolation_type = "E",
                 restricted_vars = NULL, type = "response", consensus = TRUE)

Arguments

`fitted`	an enmpa-class`⁠fitted models⁠` object obtained using the functions `fit_selected`.
`newdata`	a `SpatRaster`, data.frame, or matrix with the new data on which to predict.
`extrapolation_type`	(character) to indicate extrapolation type of model. Models can be transferred with three options: free extrapolation ('E'), extrapolation with clamping ('EC'), and no extrapolation ('NE'). Default = 'E'.
`restricted_vars`	(character) a vector containing the names of the variables that will undergo clamping or no extrapolation. For clamping, these variables are set to minimum and maximum values established for the max and min values within calibration values. For no extrapolation, the variables outside calibration limits became NA. If no specific names are provided, the value is set to NULL by default, indicating that clamping (EC) or no extrapolation (NE) will be applied to all variables. Ignore if extrapolation_type = 'E'.
`type`	(character) the type of prediction required. For a default binomial model the default predictions are of log-odds (probabilities on logit scale). The default, "response", returns predicted probabilities.
`consensus`	(logical) valid if `newdata` is a `SpatRaster`, whether to produce consensus results obtained by combining the predictions from the collection of selected models. By default consensuses are calculated using the mean, median, variance, and weighted average using the AIC weights. Default = TRUE.

Value

A list with predictions of selected models on the newdata and fitted selected model(s). Consensus predictions are added if multiple selected models exits and if newdata is a SpatRaster object.

Examples

# Load a fitted selected model
data(sel_fit, package = "enmpa")

# Load raster layers to be projected
env_vars <- terra::rast(system.file("extdata", "vars.tif", package = "enmpa"))

# Predictions (only one selected mode, no consensus required)
preds <- predict_selected(sel_fit, newdata = env_vars, consensus = FALSE)

# Plot prediction
terra::plot(preds$predictions)
# Load a fitted selected model
data(sel_fit, package = "enmpa")

# Load raster layers to be projected
env_vars <- terra::rast(system.file("extdata", "vars.tif", package = "enmpa"))

# Predictions (only one selected mode, no consensus required)
preds <- predict_selected(sel_fit, newdata = env_vars, consensus = FALSE)

# Plot prediction
terra::plot(preds$predictions)

Print a short version of elements in 'calibration' and 'fitted models' objects

Description

Print a short version of elements in 'calibration' and 'fitted models' objects

Usage

## S3 method for class 'enmpa_calibration'
print(x, ...)

## S3 method for class 'enmpa_fitted_models'
print(x, ...)
## S3 method for class 'enmpa_calibration'
print(x, ...)

## S3 method for class 'enmpa_fitted_models'
print(x, ...)

Arguments

`x`	object of enmpa_fitted_models or enmpa_calibration
`...`	additional arguments affecting the summary produced. Ignored in these functions.

Partial ROC calculation

Description

proc applies partial ROC tests to model predictions.

Usage

proc_enm(test_prediction, prediction, threshold = 5, sample_percentage = 50,
         iterations = 500)
proc_enm(test_prediction, prediction, threshold = 5, sample_percentage = 50,
         iterations = 500)

Arguments

`test_prediction`	(numeric) vector of model predictions for testing data.
`prediction`	`SpatRaster` or numeric vector of model predictions to be evaluated.
`threshold`	(numeric) value from 0 to 100 to represent the percentage of potential error (E) that the data could have due to any source of uncertainty. Default = 5.
`sample_percentage`	(numeric) percentage of testing data to be used in each bootstrapped process for calculating the partial ROC. Default = 50.
`iterations`	(numeric) number of bootstrap iterations to be performed; default = 500.

Details

Partial ROC is calculated following Peterson et al. (2008) doi:10.1016/j.ecolmodel.2007.11.008.

Value

A list with the summary of the results and a data.frame containing the AUC values and AUC ratios calculated for all iterations.

Examples

# Loading a model prediction
pred <- terra::rast(system.file("extdata", "proj_out_wmean.tif",
                                package = "enmpa"))

# Simulated data
test <- runif(100, min = 0.3, max = 0.8)

# partial ROC calculation
pr <- proc_enm(test, pred, threshold = 5, sample_percentage = 50,
               iterations = 500)
# Loading a model prediction
pred <- terra::rast(system.file("extdata", "proj_out_wmean.tif",
                                package = "enmpa"))

# Simulated data
test <- runif(100, min = 0.3, max = 0.8)

# partial ROC calculation
pr <- proc_enm(test, pred, threshold = 5, sample_percentage = 50,
               iterations = 500)

Two-Way interaction response plot

Description

A view of the species probability into a two-dimensional environmental space.

Usage

resp2var(model, variable1 , variable2, modelID = NULL, data = NULL, n = 1000,
         new_data = NULL, extrapolate = FALSE, add_bar = TRUE,
         add_limits = FALSE, color.palette	= NULL, xlab = NULL, ylab = NULL,
         ...)
resp2var(model, variable1 , variable2, modelID = NULL, data = NULL, n = 1000,
         new_data = NULL, extrapolate = FALSE, add_bar = TRUE,
         add_limits = FALSE, color.palette	= NULL, xlab = NULL, ylab = NULL,
         ...)

Arguments

`model`	an object of class `glm` or `enmpa_fitted_models`.
`variable1`	(character) name of the variable to be plotted in x axis.
`variable2`	(character) name of the variable to be plotted in y axis.
`modelID`	(character) name of the ModelID if inputed `model` is in the `enmpa_fitted_models` object. Default = NULL.
`data`	data.frame or matrix of data to be used in model calibration. Default = NULL.
`n`	(numeric) an integer guiding the number of breaks. Default = 100
`new_data`	a `SpatRaster`, data.frame, or matrix of variables representing the range of variable values in an area of interest. Default = NULL.
`extrapolate`	(logical) whether to allow extrapolation to study the behavior of the response outside the calibration limits. Ignored if `new_data` is defined. Default = TRUE.
`add_bar`	(logical) whether to add bar legend. Default = TRUE.
`add_limits`	(logical) whether to add calibration limits if `extrapolate = TRUE`. Default = FALSE.
`color.palette`	(function) a color palette function to be used to assign colors in the plot. Default = function(n) rev(hcl.colors(n, "terrain")).
`xlab`	(character) a label for the x axis. The default, NULL, uses the name defined in `variable1`.
`ylab`	(character) a label for the y axis. The default, NULL, uses the name defined in `variable2`.
`...`	additional arguments passed to `image`.

Details

The function calculates probabilities by focusing on each combination of the two supplied environmental variable while keeping all other variables constant at their mean values.

Value

A plot with the response interaction of two environmental dimensions for variable1 and variable2, and don't return anything.

Examples

# Load a fitted selected model
data(sel_fit, package = "enmpa")

# Two-Way interaction response plot in the calibration limits

resp2var(sel_fit, variable1 = "bio_1", variable2 = "bio_12", xlab = "BIO-1",
ylab = "BIO-12", modelID = "ModelID_7")

# Two-Way interaction response plot allowing extrapolation
resp2var(sel_fit, variable1 = "bio_1", variable2 = "bio_12", xlab = "BIO-1",
ylab = "BIO-12", modelID = "ModelID_7", extrapolate = TRUE)
# Load a fitted selected model
data(sel_fit, package = "enmpa")

# Two-Way interaction response plot in the calibration limits

resp2var(sel_fit, variable1 = "bio_1", variable2 = "bio_12", xlab = "BIO-1",
ylab = "BIO-12", modelID = "ModelID_7")

# Two-Way interaction response plot allowing extrapolation
resp2var(sel_fit, variable1 = "bio_1", variable2 = "bio_12", xlab = "BIO-1",
ylab = "BIO-12", modelID = "ModelID_7", extrapolate = TRUE)

Variable response curves for GLMs

Description

A view of variable responses in models. Responses based on single or multiple models can be provided.

Usage

response_curve(fitted, variable, data = NULL, modelID = NULL,  n = 100,
               new_data = NULL,  extrapolate = TRUE, xlab = NULL,
               ylab = "Probability", col = "red", ...)
response_curve(fitted, variable, data = NULL, modelID = NULL,  n = 100,
               new_data = NULL,  extrapolate = TRUE, xlab = NULL,
               ylab = "Probability", col = "red", ...)

Arguments

`fitted`	an object of class `glm`, a list of GLMs obtained using the function `fit_glms`, or an object `enmpa_fitted_models` from `fit_selected` .
`variable`	(character) name of the variables to be plotted.
`data`	data.frame or matrix of data used in the model calibration step. Default = NULL.
`modelID`	(character) vector of ModelID(s) to be considered when the fitted models is an `enmpa_fitted_object`. By default all models are included. Default = NULL.
`n`	(numeric) an integer guiding the number of breaks. Default = 100
`new_data`	a `SpatRaster`, data.frame, or matrix of variables representing the range of variable values in an area of interest. Default = NULL. It must be defined in case the model entered does not explicitly include a data component.
`extrapolate`	(logical) whether to allow extrapolation to study the behavior of the response outside the calibration limits. Ignored if `new_data` is defined. Default = TRUE.
`xlab`	(character) a label for the x axis. The default, NULL, uses the name defined in `variable`.
`ylab`	(character) a label for the y axis. Default = "Probability".
`col`	(character) color for lines. Default = "red".
`...`	additional arguments passed to `plot`.

Details

The function calculates these probabilities by focusing on a single environmental variable while keeping all other variables constant at their mean values.

When responses for multiple models are to be plotted, the mean and confidence intervals for the set of responses are calculated using a GAM.

Value

A plot with the response curve for a variable.

Examples

# Load a fitted selected model
data(sel_fit, package = "enmpa")

# Response curve for single models
response_curve(sel_fit$ModelID_7, variable = "bio_1")

# Response curve when model(s) are in a list (only one model in this one)
response_curve(sel_fit, variable = "bio_12")
# Load a fitted selected model
data(sel_fit, package = "enmpa")

# Response curve for single models
response_curve(sel_fit$ModelID_7, variable = "bio_1")

# Response curve when model(s) are in a list (only one model in this one)
response_curve(sel_fit, variable = "bio_12")

Example of selected models fitted

Description

An object of the class enmpa_fitted_models containing fitted selected model(s) and the information from model evaluation for such model(s).

Usage

sel_fit
sel_fit

Format

A list with four elements.

glms_fitted: a fitted glm (ModelID_7).
selected: a data.frame with results from evaluation of ModelID_7
data: a data.frame containing information on presence and absence, and independent variables used to fit GLM models.
weights: a vector with the weights for observations.

Examples

data("sel_fit", package = "enmpa")
data("sel_fit", package = "enmpa")

Summary of 'calibration' and 'fitted models'

Description

Summary of 'calibration' and 'fitted models'

Usage

## S3 method for class 'enmpa_calibration'
summary(object, ...)

## S3 method for class 'enmpa_fitted_models'
summary(object, ...)
## S3 method for class 'enmpa_calibration'
summary(object, ...)

## S3 method for class 'enmpa_fitted_models'
summary(object, ...)

Arguments

`object`	of class enmpa_calibration or enmpa_fitted_models
`...`	additional arguments affecting the summary produced. Ignored in these functions.

Value

A printed summary.

Example data used to test models

Description

A dataset containing information on presence and absence, and independent variables used to fit GLM models.

Usage

test
test

Format

A data frame with 100 rows and 3 columns.

Sp: numeric, values of 0 = absence and 1 = presence.
lon: numeric, longitude values.
lat: numeric, latitude values.

Examples

data("test", package = "enmpa")
head(test)
data("test", package = "enmpa")
head(test)

Variable importance for GLMs

Description

Calculates the relative importance of predictor variables based on the concept of explained deviance. This is achieved by fitting a GLMs multiple times, each time leaving out a different predictor variable to observe its impact on the model's performance.

Usage

var_importance(fitted, modelID = NULL, data = NULL)
var_importance(fitted, modelID = NULL, data = NULL)

Arguments

`fitted`	an object of class `glm`, a list of GLMs obtained using the function `fit_glms`, or an object `enmpa_fitted_models` from `fit_selected` .
`modelID`	(character) vector of ModelID(s) to be considered when the `fitted` input is of the class `enmpa_fitted_models`. By default all models are included. Default = NULL.
`data`	data.frame or matrix of data used in the model calibration step. It must be defined in case the model entered does not explicitly include a data component. Default = NULL.

Details

The process begins by fitting the full GLM model, which includes all predictor variables. Subsequently, separate GLM models are fitted, excluding one variable at a time to assess the influence of its absence on the model's performance. By systematically evaluating the effect of removing each predictor variable, the function provides valuable insights into their individual contributions to the model's overall performance and explanatory power.

Value

A data.frame containing the relative contribution of each variable. An identification for distinct models is added if fitted contains multiple models.

Examples

# Load a fitted selected model
data(sel_fit, package = "enmpa")

# Variable importance for single models
var_importance(sel_fit, modelID = "ModelID_7")

# Variable importance for multiple models (only one model in this list)
var_importance(sel_fit)
# Load a fitted selected model
data(sel_fit, package = "enmpa")

# Variable importance for single models
var_importance(sel_fit, modelID = "ModelID_7")

# Variable importance for multiple models (only one model in this list)
var_importance(sel_fit)

Package 'enmpa'

Help Index

Example of results obtained from GLM calibration using enmpa

Description

Usage

Format

Examples

GLM calibration with presence-absence data

Description

Usage

Arguments

Details

Value

Examples

Example data used to run model calibration exercises

Description

Usage

Format

Examples

enmpa: Ecological Niche Modeling using Presence-Absence Data

Description

Details

Main functions in enmpa

Author(s)

See Also

Constructor of S3 objects of class enmpa_calibration

Description

Usage

Arguments

Value

Constructor of S3 objects of class enmpa_fitted_models

Description

Usage

Arguments

Value

Summary of evaluation statistics for candidate models

Description

Usage

Arguments

Value

Examples

Fitting selected GLMs models

Description

Usage

Arguments

Value

Examples

Get GLM formulas according to defined response types

Description

Usage

Arguments

Details

Value

Examples

Evaluate final models using independent data

Description

Usage

Arguments

Value

Examples

Jackkniffe test for variable contribution

Description

Usage

Arguments

Value

Examples

K-fold data partitioning

Description

Usage

Arguments

Value

Examples

Selection of best candidate models considering various criteria

Description

Usage

Arguments

Value

Examples

Model validation options

Description

Main functions in `enmpa`