Generate a synthetic dataset for demonstrating and testing neuralGAM. The response is the sum of an intercept, a quadratic effect of x1, a linear effect of x2, and a sinusoidal effect of x3, plus Gaussian noise.

sim_neuralGAM_data(n = 2000, seed = 42, test_prop = 0.3)

Arguments

n

Integer. Number of observations to generate. Default 2000.

seed

Integer. Random seed for reproducibility. Default 42.

test_prop

Numeric in \([0,1]\). Proportion of data to reserve for the test set. Default 0.3.

Value

A list with two elements:

  • train: data.frame with training data.

  • test: data.frame with test data.

Details

The data generating process is: $$y = 2 + x1^2 + 2 x2 + \sin(x3) + \varepsilon,$$ where \(\varepsilon \sim N(0, 0.25^2)\).

Covariates \(x1\), \(x2\), \(x3\) are drawn independently from \(U(-2.5, 2.5)\).
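The data generating process above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package source: the function name `sim_sketch`, the helpers `f1`, `f2`, `f3`, the column names, the use of `round()` for the test-set size, and the random (rather than sequential) split are all hypothetical choices.

```r
# Hypothetical sketch of the documented data generating process.
# Names, column layout, and the splitting strategy are assumptions.

# True component functions from the formula in Details
f1 <- function(x1) x1^2      # quadratic effect
f2 <- function(x2) 2 * x2    # linear effect
f3 <- function(x3) sin(x3)   # sinusoidal effect

sim_sketch <- function(n = 2000, seed = 42, test_prop = 0.3) {
  stopifnot(n >= 1, test_prop >= 0, test_prop <= 1)
  set.seed(seed)

  # Covariates drawn independently from U(-2.5, 2.5)
  x1 <- runif(n, min = -2.5, max = 2.5)
  x2 <- runif(n, min = -2.5, max = 2.5)
  x3 <- runif(n, min = -2.5, max = 2.5)

  # Gaussian noise with standard deviation 0.25
  eps <- rnorm(n, mean = 0, sd = 0.25)

  # Response: intercept 2 plus the three additive effects plus noise
  y <- 2 + f1(x1) + f2(x2) + f3(x3) + eps
  dat <- data.frame(x1 = x1, x2 = x2, x3 = x3, y = y)

  # Assumed split: a random subset of round(test_prop * n) rows is held out.
  # A logical mask keeps the split correct even when the test set is empty.
  n_test <- round(test_prop * n)
  is_test <- seq_len(n) %in% sample.int(n, n_test)

  list(train = dat[!is_test, , drop = FALSE],
       test  = dat[is_test, , drop = FALSE])
}
```

Because the true components are known, partial effects estimated by a fitted model can be compared with `f1`, `f2`, and `f3` evaluated on a grid over \([-2.5, 2.5]\), up to the usual centering of additive terms. Under this process the expected response is \(2 + 2.5^2/3 \approx 4.08\), since \(x2\) and \(\sin(x3)\) have mean zero on a symmetric interval.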

Author

Ines Ortega-Fernandez, Marta Sestelo.

Examples

# \dontrun{
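# Simulate 500 observations and hold out 20% of them as a test set.
# Note that sim_neuralGAM_data() also takes its own seed argument (default 42).
set.seed(123)
dat <- sim_neuralGAM_data(n = 500, test_prop = 0.2)

# The result is a list with two data frames
train <- dat$train
test  <- dat$test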

# }