Fits a Generalized Additive Model (GAM) in which each smooth term is modeled by a Keras neural network. In addition to point predictions, the model can optionally estimate uncertainty bands via Monte Carlo (MC) Dropout across multiple forward passes.

neuralGAM(
  formula,
  data,
  family = "gaussian",
  num_units = 64,
  learning_rate = 0.001,
  activation = "relu",
  kernel_initializer = "glorot_normal",
  kernel_regularizer = NULL,
  bias_regularizer = NULL,
  bias_initializer = "zeros",
  activity_regularizer = NULL,
  loss = "mse",
  uncertainty_method = c("none", "epistemic"),
  alpha = 0.05,
  forward_passes = 100,
  dropout_rate = 0.1,
  validation_split = NULL,
  w_train = NULL,
  bf_threshold = 0.001,
  ls_threshold = 0.1,
  max_iter_backfitting = 10,
  max_iter_ls = 10,
  seed = NULL,
  verbose = 1,
  ...
)

Arguments

formula

Model formula. Smooth terms must be wrapped in s(...). You can specify per-term NN settings, e.g.: y ~ s(x1, num_units = 1024) + s(x3, num_units = c(1024, 512)).

data

Data frame containing the variables.

family

Response distribution: one of "gaussian", "binomial", or "poisson".

num_units

Default hidden-layer sizes for the smooth terms (a single integer or a vector with one size per hidden layer). Required unless every s(...) specifies its own num_units.
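
For example, a global default can be combined with a per-term override in the formula (a sketch; train, x1 and x2 are placeholder data):

# Sketch: a global default of one 64-unit hidden layer, overridden for
# x1 with two hidden layers of 128 and 64 units.
ngam <- neuralGAM(
  y ~ s(x1, num_units = c(128, 64)) + s(x2),
  data = train,
  num_units = 64
)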

learning_rate

Learning rate for the Adam optimizer.

activation

Activation function for hidden layers. Either a string understood by tf$keras$activations$get() or a function.
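
For instance, a custom activation can be passed as an R function (a sketch; keras::k_relu() is the Keras backend ReLU, and train/x1 are placeholders):

# Sketch: a leaky-ReLU activation supplied as a function.
leaky_relu <- function(x) keras::k_relu(x, alpha = 0.1)
ngam <- neuralGAM(y ~ s(x1), data = train, num_units = 64,
                  activation = leaky_relu)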

kernel_initializer, bias_initializer

Initializers for weights and biases.

kernel_regularizer, bias_regularizer, activity_regularizer

Optional Keras regularizers.
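
For example, an L2 weight penalty can be supplied with keras::regularizer_l2() (a sketch; train/x1 are placeholders):

# Sketch: L2 penalty on the hidden-layer weights of every smooth term.
ngam <- neuralGAM(y ~ s(x1), data = train, num_units = 64,
                  kernel_regularizer = keras::regularizer_l2(0.01))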

loss

Loss function to use. Can be any Keras built-in (e.g., "mse", "mae", "huber", "logcosh") or a custom function, passed directly to keras::compile().
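
A custom loss follows the usual Keras signature, taking y_true and y_pred tensors (a sketch; keras::k_mean() and keras::k_abs() are Keras backend ops, and train/x1 are placeholders):

# Sketch: a custom mean-absolute-error loss with the Keras signature.
mae_loss <- function(y_true, y_pred) {
  keras::k_mean(keras::k_abs(y_true - y_pred))
}
ngam <- neuralGAM(y ~ s(x1), data = train, num_units = 64,
                  loss = mae_loss)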

uncertainty_method

Character string indicating the type of uncertainty to estimate. One of:

  • "none" (default): no uncertainty estimation.

  • "epistemic": MC Dropout for mean uncertainty (CIs)

alpha

Significance level for confidence intervals, e.g. 0.05 for 95% coverage.

forward_passes

Integer. Number of MC Dropout forward passes used when uncertainty_method = "epistemic".

dropout_rate

Dropout probability for the smooth-term neural networks; must lie in (0, 1).

  • During training: acts as a regularizer.

  • During prediction (if uncertainty_method is "epistemic"): enables MC Dropout sampling.
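
Schematically, MC Dropout keeps dropout active at prediction time and aggregates the stochastic predictions (illustrative only, not the package's internal code; stochastic_forward is a hypothetical stand-in for a Keras model called with dropout enabled):

# Illustrative sketch of MC Dropout aggregation: each call to
# stochastic_forward(x) returns one noisy prediction per observation.
mc_dropout <- function(stochastic_forward, x, passes = 100) {
  draws <- replicate(passes, stochastic_forward(x))  # n x passes matrix
  list(mean = rowMeans(draws),        # point prediction
       var  = apply(draws, 1, var))   # epistemic variance per observation
}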

validation_split

Optional fraction of training data used for validation.

w_train

Optional training weights.

bf_threshold

Convergence criterion of the backfitting algorithm. Defaults to 0.001.

ls_threshold

Convergence criterion of the local scoring algorithm. Defaults to 0.1.

max_iter_backfitting

An integer with the maximum number of iterations of the backfitting algorithm. Defaults to 10.

max_iter_ls

An integer with the maximum number of iterations of the local scoring algorithm. Defaults to 10.
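
Schematically, each backfitting iteration refits every smooth term's network on the partial residuals of the remaining terms, stopping once the relative change in error drops below the threshold (illustrative pseudocode, not the package internals; fit_nn is a hypothetical stand-in for training one term's network):

# Illustrative backfitting loop: fs is a list of current fitted values,
# one numeric vector per smooth term; fit_nn(j, r) trains term j's
# network on partial residuals r and returns its new fitted values.
backfit <- function(y, fs, fit_nn, threshold = 0.001, max_iter = 10) {
  eta0 <- mean(y)
  err <- mean((y - eta0 - Reduce(`+`, fs, 0))^2)
  for (it in seq_len(max_iter)) {
    for (j in seq_along(fs)) {
      r <- y - eta0 - Reduce(`+`, fs[-j], 0)  # residuals without term j
      fs[[j]] <- fit_nn(j, r)
    }
    err_new <- mean((y - eta0 - Reduce(`+`, fs, 0))^2)
    if (abs(err - err_new) / err < threshold) break
    err <- err_new
  }
  fs
}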

seed

Random seed.

verbose

Verbosity: 0 silent, 1 progress messages.

...

Additional arguments passed to keras::optimizer_adam().

Value

An object of class "neuralGAM", a list with elements including:

muhat

Numeric vector of fitted mean predictions (training data).

partial

Data frame of partial contributions \(g_j(x_j)\) per smooth term.

y

Observed response values.

eta

Linear predictor \(\eta = \eta_0 + \sum_j g_j(x_j)\).

lwr,upr

Lower and upper confidence interval bounds (on the response scale).

x

Training covariates (inputs).

model

List of fitted Keras models, one per smooth term (+ "linear" if present).

eta0

Intercept estimate \(\eta_0\).

family

Model family.

stats

Data frame of training/validation losses per backfitting iteration.

mse

Training mean squared error.

formula

Parsed model formula (via get_formula_elements()).

history

List of Keras training histories per term.

globals

Global hyperparameter defaults.

alpha

Significance level used for the confidence/prediction intervals (if the model was trained with uncertainty estimation).

build_pi

Logical; whether the model was trained with uncertainty estimation enabled.

uncertainty_method

Type of predictive uncertainty used: "none" or "epistemic".

var_epistemic

Matrix of per-term epistemic variances (if computed).

Author

Ines Ortega-Fernandez, Marta Sestelo.

Examples

# \dontrun{

library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test  <- dat$test

# Per-term architecture and confidence intervals
ngam <- neuralGAM(
  y ~ s(x1, num_units = c(128, 64), activation = "tanh") +
      s(x2, num_units = 256),
  data = train,
  uncertainty_method = "epistemic",
  forward_passes = 10,
  alpha = 0.05
)
#> [1] "Initializing neuralGAM..."
#> [1] "BACKFITTING Iteration 1 - Current Err =  13.8805248246851 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 2 - Current Err =  0.037037398577531 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 3 - Current Err =  0.0545957826695001 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 4 - Current Err =  0.0129818514562638 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 5 - Current Err =  0.00476547682079603 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 6 - Current Err =  0.00696293896481255 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 7 - Current Err =  0.00481330785418773 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 8 - Current Err =  0.00112587671341654 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 9 - Current Err =  0.00356620263337224 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 10 - Current Err =  0.00101421226619585 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "Computing CI/PI using uncertainty_method =  epistemic  at alpha =  0.05"
ngam
#> Class: neuralGAM 
#> Family             : gaussian
#> Formula            : y ~ s(x1, num_units = c(128, 64), activation = "tanh") + s(x2,     num_units = 256)
#> Observations       : 1400
#> Intercept (eta0)   : 4.54772
#> Deviance explained : 86.20%
#> Train MSE          : 1.82927
#> Pred. / Conf. Int. : ENABLED (alpha = 0.05, method = epistemic)
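
# Score the held-out test set (a sketch; assumes the package's predict()
# method accepts `newdata` and a `type` argument):
preds <- predict(ngam, newdata = test, type = "response")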
# }