Fits a neuralGAM model by building a neural network to attend to each covariate.

neuralGAM(
  formula,
  data,
  num_units,
  family = "gaussian",
  learning_rate = 0.001,
  activation = "relu",
  kernel_initializer = "glorot_normal",
  kernel_regularizer = NULL,
  bias_regularizer = NULL,
  bias_initializer = "zeros",
  activity_regularizer = NULL,
  loss = "mse",
  w_train = NULL,
  bf_threshold = 0.001,
  ls_threshold = 0.1,
  max_iter_backfitting = 10,
  max_iter_ls = 10,
  seed = NULL,
  verbose = 1,
  ...
)

Arguments

formula

An object of class "formula": a description of the model to be fitted. You can add smooth terms using s().
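For instance, a formula mixing smooth and linear terms could look as follows (covariate names are illustrative):

y ~ s(x1) + x2 + s(x3)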

data

A data frame containing the model response variable and the covariates required by the formula. Additional columns not referenced in the formula are ignored.

num_units

Defines the architecture of each per-term neural network. If a scalar is provided, each network has a single hidden layer with that number of units. If a vector is provided, a multi-layer network is built, with each element of the vector giving the number of units in the corresponding hidden layer.
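For example (layer sizes are illustrative):

num_units = 64           # one hidden layer with 64 units per term
num_units = c(128, 64)   # two hidden layers with 128 and 64 units per term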

family

A family object specifying the distribution and link function to use for fitting. Defaults to "gaussian"; set it to "binomial" for logistic regression.
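A minimal sketch of a logistic fit, assuming a binary response y in data (the formula and num_units are illustrative):

ngam_bin <- neuralGAM(y ~ s(x1), data = data, num_units = 64,
                      family = "binomial")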

learning_rate

Learning rate for the neural network optimizer.

activation

Activation function of the neural network. Defaults to "relu".

kernel_initializer

Kernel initializer for the Dense layers. Defaults to the Xavier initializer ("glorot_normal").

kernel_regularizer

Optional regularizer function applied to the kernel weights matrix.

bias_regularizer

Optional regularizer function applied to the bias vector.

bias_initializer

Optional initializer for the bias vector.

activity_regularizer

Optional regularizer function applied to the output of the layer.
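The three regularizer arguments above accept keras regularizer objects. A minimal sketch, assuming the keras R package is available (penalty values are illustrative):

library(keras)
ngam_reg <- neuralGAM(y ~ s(x1), data = data, num_units = 64,
                      kernel_regularizer = regularizer_l2(1e-4),
                      bias_regularizer = regularizer_l2(1e-4))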

loss

Loss function to use during neural network training. Defaults to the mean squared error.

w_train

Optional sample weights.

bf_threshold

Convergence criterion of the backfitting algorithm. Defaults to 0.001.

ls_threshold

Convergence criterion of the local scoring algorithm. Defaults to 0.1.

max_iter_backfitting

An integer with the maximum number of iterations of the backfitting algorithm. Defaults to 10.

max_iter_ls

An integer with the maximum number of iterations of the local scoring algorithm. Defaults to 10.

seed

A positive integer which specifies the random number generator seed for algorithms dependent on randomization.

verbose

Verbosity mode (0 = silent, 1 = print messages). Defaults to 1.

...

Additional parameters for the Adam optimizer (see ?keras::optimizer_adam).
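For instance, Adam's momentum parameters can be passed through directly (a sketch; the values shown are the keras defaults):

ngam_adam <- neuralGAM(y ~ s(x1), data = data, num_units = 64,
                       learning_rate = 0.001,
                       beta_1 = 0.9, beta_2 = 0.999)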

Value

A trained neuralGAM object. Use summary(ngam) to see details.

Details

The function builds one neural network per feature in x and combines them via the backfitting and local scoring algorithms, fitting a weighted additive model with neural networks as the function approximators. The adjusted dependent variable and the observation weights are determined by the distribution of the response y, specified through the family parameter.
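To illustrate the backfitting idea only (this is not the package's implementation), the following sketch uses loess smoothers in place of the per-term neural networks:

set.seed(1)
n <- 500
x1 <- runif(n)
x2 <- runif(n)
y <- sin(2 * pi * x1) + x2^2 + rnorm(n, sd = 0.1)

alpha <- mean(y)                 # model intercept
f1 <- f2 <- rep(0, n)            # initialize component functions
for (it in 1:10) {
  # Smooth the partial residuals of each term against its covariate
  f1 <- fitted(loess((y - alpha - f2) ~ x1))
  f1 <- f1 - mean(f1)            # center each component
  f2 <- fitted(loess((y - alpha - f1) ~ x2))
  f2 <- f2 - mean(f2)
}
mean((y - (alpha + f1 + f2))^2)  # residual MSE after backfitting

Each pass re-estimates one component against the partial residuals of the others until the fit stabilizes; neuralGAM applies the same scheme with a neural network per term.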

References

Hastie, T., & Tibshirani, R. (1990). Generalized Additive Models. London: Chapman and Hall.

Author

Ines Ortega-Fernandez, Marta Sestelo.

Examples

# \dontrun{
# Simulate covariates and an additive signal
n <- 24500

seed <- 42
set.seed(seed)

x1 <- runif(n, -2.5, 2.5)
x2 <- runif(n, -2.5, 2.5)
x3 <- runif(n, -2.5, 2.5)

# Component functions, centered to have zero mean
f1 <- x1^2
f2 <- 2 * x2
f3 <- sin(x3)
f1 <- f1 - mean(f1)
f2 <- f2 - mean(f2)
f3 <- f3 - mean(f3)

eta0 <- 2 + f1 + f2 + f3
epsilon <- rnorm(n, 0.25) # Gaussian noise with mean 0.25 and sd 1
y <- eta0 + epsilon
train <- data.frame(x1, x2, x3, y)

library(neuralGAM)
# Fit smooth terms for x1 and x3 and a linear term for x2
ngam <- neuralGAM(y ~ s(x1) + x2 + s(x3), data = train,
                  num_units = 1024, family = "gaussian",
                  activation = "relu",
                  learning_rate = 0.001, bf_threshold = 0.001,
                  max_iter_backfitting = 10, max_iter_ls = 10,
                  seed = seed
                  )
#> [1] "Initializing neuralGAM..."
#> [1] "BACKFITTING Iteration 1 - Current Err =  0.32048205436 BF Threshold =  0.001 Converged =  FALSE"
#> [1] "BACKFITTING Iteration 2 - Current Err =  0.000623513488713123 BF Threshold =  0.001 Converged =  TRUE"

ngam
#> Class: neuralGAM 
#> 
#> Distribution Family:  gaussian
#> Formula:  y ~ s(x1) + x2 + s(x3)
#> Intercept: 2.2215
#> MSE: 1.0049
#> Sample size: 24500
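
# A possible follow-up, assuming the package provides a predict() method
# for neuralGAM objects (not shown on this page):
# preds <- predict(ngam, train)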
# }