Package 'fastglm' reference manual

Title:	Fast and Stable Fitting of Generalized Linear Models using 'RcppEigen'
Description:	Fits generalized linear models efficiently using 'RcppEigen'. The iteratively reweighted least squares implementation utilizes the step-halving approach of Marschner (2011) <doi:10.32614/RJ-2011-012> to help safeguard against convergence issues.
Authors:	Jared Huling [aut, cre], Douglas Bates [cph], Dirk Eddelbuettel [cph], Romain Francois [cph], Yixuan Qiu [cph]
Maintainer:	Jared Huling <[email protected]>
License:	GPL (>= 2)
Version:	0.0.3
Built:	2025-02-24 04:18:52 UTC
Source:	https://github.com/jaredhuling/fastglm

big.matrix prod

Description

big.matrix prod

Usage

## S4 method for signature 'big.matrix,vector'
x %*% y

## S4 method for signature 'vector,big.matrix'
x %*% y
## S4 method for signature 'big.matrix,vector'
x %*% y

## S4 method for signature 'vector,big.matrix'
x %*% y

Arguments

`x`	big.matrix
`y`	numeric vector

deviance method for fastglm fitted objects

Description

deviance method for fastglm fitted objects

Usage

## S3 method for class 'fastglm'
deviance(object, ...)
## S3 method for class 'fastglm'
deviance(object, ...)

Arguments

`object`	fastglm fitted object
`...`	not used

Value

The value of the deviance extracted from the object

family method for fastglm fitted objects

Description

family method for fastglm fitted objects

Usage

## S3 method for class 'fastglm'
family(object, ...)
## S3 method for class 'fastglm'
family(object, ...)

Arguments

`object`	fastglm fitted object
`...`	not used

Value

returns the family of the fitted object

fast generalized linear model fitting

Description

fast generalized linear model fitting

bigLm default

Usage

fastglm(x, ...)

## Default S3 method:
fastglm(
  x,
  y,
  family = gaussian(),
  weights = NULL,
  offset = NULL,
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  method = 0L,
  tol = 1e-08,
  maxit = 100L,
  ...
)
fastglm(x, ...)

## Default S3 method:
fastglm(
  x,
  y,
  family = gaussian(),
  weights = NULL,
  offset = NULL,
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  method = 0L,
  tol = 1e-08,
  maxit = 100L,
  ...
)

Arguments

`x`	input model matrix. Must be a matrix object
`...`	not used
`y`	numeric response vector of length nobs.
`family`	a description of the error distribution and link function to be used in the model. For `fastglm` this can be a character string naming a family function, a family function or the result of a call to a family function. For `fastglmPure` only the third option is supported. (See `family` for details of family functions.)
`weights`	an optional vector of 'prior weights' to be used in the fitting process. Should be a numeric vector.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be a numeric vector of length equal to the number of cases
`start`	starting values for the parameters in the linear predictor.
`etastart`	starting values for the linear predictor.
`mustart`	values for the vector of means.
`method`	an integer scalar with value 0 for the column-pivoted QR decomposition, 1 for the unpivoted QR decomposition, 2 for the LLT Cholesky, or 3 for the LDLT Cholesky
`tol`	threshold tolerance for convergence. Should be a positive real number
`maxit`	maximum number of IRLS iterations. Should be an integer

Value

A list with the elements

`coefficients`	a vector of coefficients
`se`	a vector of the standard errors of the coefficient estimates
`rank`	a scalar denoting the computed rank of the model matrix
`df.residual`	a scalar denoting the degrees of freedom in the model
`residuals`	the vector of residuals
`s`	a numeric scalar - the root mean square for residuals
`fitted.values`	the vector of fitted values

Examples


x <- matrix(rnorm(10000 * 100), ncol = 100)
y <- 1 * (0.25 * x[,1] - 0.25 * x[,3] > rnorm(10000))

system.time(gl1 <- glm.fit(x, y, family = binomial()))

system.time(gf1 <- fastglm(x, y, family = binomial()))

system.time(gf2 <- fastglm(x, y, family = binomial(), method = 1))

system.time(gf3 <- fastglm(x, y, family = binomial(), method = 2))

system.time(gf4 <- fastglm(x, y, family = binomial(), method = 3))

max(abs(coef(gl1) - gf1$coef))
max(abs(coef(gl1) - gf2$coef))
max(abs(coef(gl1) - gf3$coef))
max(abs(coef(gl1) - gf4$coef))


## Not run: 
nrows <- 50000
ncols <- 50
bkFile <- "bigmat2.bk"
descFile <- "bigmatk2.desc"
bigmat <- filebacked.big.matrix(nrow=nrows, ncol=ncols, type="double",
                                backingfile=bkFile, backingpath=".",
                                descriptorfile=descFile,
                                dimnames=c(NULL,NULL))
for (i in 1:ncols) bigmat[,i] = rnorm(nrows)*i
y <- 1*(rnorm(nrows) + bigmat[,1] > 0)

system.time(gfb1 <- fastglm(bigmat, y, family = binomial(), method = 3))

## End(Not run)

x <- matrix(rnorm(10000 * 100), ncol = 100)
y <- 1 * (0.25 * x[,1] - 0.25 * x[,3] > rnorm(10000))

system.time(gl1 <- glm.fit(x, y, family = binomial()))

system.time(gf1 <- fastglm(x, y, family = binomial()))

system.time(gf2 <- fastglm(x, y, family = binomial(), method = 1))

system.time(gf3 <- fastglm(x, y, family = binomial(), method = 2))

system.time(gf4 <- fastglm(x, y, family = binomial(), method = 3))

max(abs(coef(gl1) - gf1$coef))
max(abs(coef(gl1) - gf2$coef))
max(abs(coef(gl1) - gf3$coef))
max(abs(coef(gl1) - gf4$coef))


## Not run: 
nrows <- 50000
ncols <- 50
bkFile <- "bigmat2.bk"
descFile <- "bigmatk2.desc"
bigmat <- filebacked.big.matrix(nrow=nrows, ncol=ncols, type="double",
                                backingfile=bkFile, backingpath=".",
                                descriptorfile=descFile,
                                dimnames=c(NULL,NULL))
for (i in 1:ncols) bigmat[,i] = rnorm(nrows)*i
y <- 1*(rnorm(nrows) + bigmat[,1] > 0)

system.time(gfb1 <- fastglm(bigmat, y, family = binomial(), method = 3))

## End(Not run)

fast generalized linear model fitting

Description

fast generalized linear model fitting

Usage

fastglmPure(
  x,
  y,
  family = gaussian(),
  weights = rep(1, NROW(y)),
  offset = rep(0, NROW(y)),
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  method = 0L,
  tol = 1e-07,
  maxit = 100L
)
fastglmPure(
  x,
  y,
  family = gaussian(),
  weights = rep(1, NROW(y)),
  offset = rep(0, NROW(y)),
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  method = 0L,
  tol = 1e-07,
  maxit = 100L
)

Arguments

`x`	input model matrix. Must be a matrix object
`y`	numeric response vector of length nobs.
`family`	a description of the error distribution and link function to be used in the model. For `fastglmPure` this can only be the result of a call to a family function. (See `family` for details of family functions.)
`weights`	an optional vector of 'prior weights' to be used in the fitting process. Should be a numeric vector.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be a numeric vector of length equal to the number of cases
`start`	starting values for the parameters in the linear predictor.
`etastart`	starting values for the linear predictor.
`mustart`	values for the vector of means.
`method`	an integer scalar with value 0 for the column-pivoted QR decomposition, 1 for the unpivoted QR decomposition, 2 for the LLT Cholesky, 3 for the LDLT Cholesky, 4 for the full pivoted QR decomposition, 5 for the Bidiagonal Divide and Conquer SVD
`tol`	threshold tolerance for convergence. Should be a positive real number
`maxit`	maximum number of IRLS iterations. Should be an integer

Value

A list with the elements

`coefficients`	a vector of coefficients
`se`	a vector of the standard errors of the coefficient estimates
`rank`	a scalar denoting the computed rank of the model matrix
`df.residual`	a scalar denoting the degrees of freedom in the model
`residuals`	the vector of residuals
`s`	a numeric scalar - the root mean square for residuals
`fitted.values`	the vector of fitted values

Examples


set.seed(1)
x <- matrix(rnorm(1000 * 25), ncol = 25)
eta <- 0.1 + 0.25 * x[,1] - 0.25 * x[,3] + 0.75 * x[,5] -0.35 * x[,6] #0.25 * x[,1] - 0.25 * x[,3]
y <- 1 * (eta > rnorm(1000))

yp <- rpois(1000, eta ^ 2)
yg <- rgamma(1000, exp(eta) * 1.75, 1.75)

# binomial
system.time(gl1 <- glm.fit(x, y, family = binomial()))

system.time(gf1 <- fastglmPure(x, y, family = binomial(), tol = 1e-8))

system.time(gf2 <- fastglmPure(x, y, family = binomial(), method = 1, tol = 1e-8))

system.time(gf3 <- fastglmPure(x, y, family = binomial(), method = 2, tol = 1e-8))

system.time(gf4 <- fastglmPure(x, y, family = binomial(), method = 3, tol = 1e-8))

max(abs(coef(gl1) - gf1$coef))
max(abs(coef(gl1) - gf2$coef))
max(abs(coef(gl1) - gf3$coef))
max(abs(coef(gl1) - gf4$coef))

# poisson
system.time(gl1 <- glm.fit(x, yp, family = poisson(link = "log")))

system.time(gf1 <- fastglmPure(x, yp, family = poisson(link = "log"), tol = 1e-8))

system.time(gf2 <- fastglmPure(x, yp, family = poisson(link = "log"), method = 1, tol = 1e-8))

system.time(gf3 <- fastglmPure(x, yp, family = poisson(link = "log"), method = 2, tol = 1e-8))

system.time(gf4 <- fastglmPure(x, yp, family = poisson(link = "log"), method = 3, tol = 1e-8))

max(abs(coef(gl1) - gf1$coef))
max(abs(coef(gl1) - gf2$coef))
max(abs(coef(gl1) - gf3$coef))
max(abs(coef(gl1) - gf4$coef))

# gamma
system.time(gl1 <- glm.fit(x, yg, family = Gamma(link = "log")))

system.time(gf1 <- fastglmPure(x, yg, family = Gamma(link = "log"), tol = 1e-8))

system.time(gf2 <- fastglmPure(x, yg, family = Gamma(link = "log"), method = 1, tol = 1e-8))

system.time(gf3 <- fastglmPure(x, yg, family = Gamma(link = "log"), method = 2, tol = 1e-8))

system.time(gf4 <- fastglmPure(x, yg, family = Gamma(link = "log"), method = 3, tol = 1e-8))

max(abs(coef(gl1) - gf1$coef))
max(abs(coef(gl1) - gf2$coef))
max(abs(coef(gl1) - gf3$coef))
max(abs(coef(gl1) - gf4$coef))

set.seed(1)
x <- matrix(rnorm(1000 * 25), ncol = 25)
eta <- 0.1 + 0.25 * x[,1] - 0.25 * x[,3] + 0.75 * x[,5] -0.35 * x[,6] #0.25 * x[,1] - 0.25 * x[,3]
y <- 1 * (eta > rnorm(1000))

yp <- rpois(1000, eta ^ 2)
yg <- rgamma(1000, exp(eta) * 1.75, 1.75)

# binomial
system.time(gl1 <- glm.fit(x, y, family = binomial()))

system.time(gf1 <- fastglmPure(x, y, family = binomial(), tol = 1e-8))

system.time(gf2 <- fastglmPure(x, y, family = binomial(), method = 1, tol = 1e-8))

system.time(gf3 <- fastglmPure(x, y, family = binomial(), method = 2, tol = 1e-8))

system.time(gf4 <- fastglmPure(x, y, family = binomial(), method = 3, tol = 1e-8))

max(abs(coef(gl1) - gf1$coef))
max(abs(coef(gl1) - gf2$coef))
max(abs(coef(gl1) - gf3$coef))
max(abs(coef(gl1) - gf4$coef))

# poisson
system.time(gl1 <- glm.fit(x, yp, family = poisson(link = "log")))

system.time(gf1 <- fastglmPure(x, yp, family = poisson(link = "log"), tol = 1e-8))

system.time(gf2 <- fastglmPure(x, yp, family = poisson(link = "log"), method = 1, tol = 1e-8))

system.time(gf3 <- fastglmPure(x, yp, family = poisson(link = "log"), method = 2, tol = 1e-8))

system.time(gf4 <- fastglmPure(x, yp, family = poisson(link = "log"), method = 3, tol = 1e-8))

max(abs(coef(gl1) - gf1$coef))
max(abs(coef(gl1) - gf2$coef))
max(abs(coef(gl1) - gf3$coef))
max(abs(coef(gl1) - gf4$coef))

# gamma
system.time(gl1 <- glm.fit(x, yg, family = Gamma(link = "log")))

system.time(gf1 <- fastglmPure(x, yg, family = Gamma(link = "log"), tol = 1e-8))

system.time(gf2 <- fastglmPure(x, yg, family = Gamma(link = "log"), method = 1, tol = 1e-8))

system.time(gf3 <- fastglmPure(x, yg, family = Gamma(link = "log"), method = 2, tol = 1e-8))

system.time(gf4 <- fastglmPure(x, yg, family = Gamma(link = "log"), method = 3, tol = 1e-8))

max(abs(coef(gl1) - gf1$coef))
max(abs(coef(gl1) - gf2$coef))
max(abs(coef(gl1) - gf3$coef))
max(abs(coef(gl1) - gf4$coef))

logLik method for fastglm fitted objects

Description

logLik method for fastglm fitted objects

Usage

## S3 method for class 'fastglm'
logLik(object, ...)
## S3 method for class 'fastglm'
logLik(object, ...)

Arguments

`object`	fastglm fitted object
`...`	not used

Value

Returns an object of class logLik

Obtains predictions and optionally estimates standard errors of those predictions from a fitted generalized linear model object.

Description

Obtains predictions and optionally estimates standard errors of those predictions from a fitted generalized linear model object.

Usage

## S3 method for class 'fastglm'
predict(
  object,
  newdata = NULL,
  type = c("link", "response"),
  se.fit = FALSE,
  dispersion = NULL,
  ...
)
## S3 method for class 'fastglm'
predict(
  object,
  newdata = NULL,
  type = c("link", "response"),
  se.fit = FALSE,
  dispersion = NULL,
  ...
)

Arguments

`object`	a fitted object of class inheriting from "`fastglm`".
`newdata`	a matrix to be used for prediction
`type`	the type of prediction required. The default is on the scale of the linear predictors; the alternative "`response`" is on the scale of the response variable. Thus for a default binomial model the default predictions are of log-odds (probabilities on logit scale) and `type = "response"` gives the predicted probabilities. The "`terms`" option returns a matrix giving the fitted values of each term in the model formula on the linear predictor scale. The value of this argument can be abbreviated.
`se.fit`	logical switch indicating if standard errors are required.
`dispersion`	the dispersion of the GLM fit to be assumed in computing the standard errors. If omitted, that returned by `summary` applied to the object is used.
`...`	further arguments passed to or from other methods.

print method for fastglm objects

Description

print method for fastglm objects

Usage

## S3 method for class 'fastglm'
print(x, ...)
## S3 method for class 'fastglm'
print(x, ...)

Arguments

`x`	object to print
`...`	not used

residuals method for fastglm fitted objects

Description

residuals method for fastglm fitted objects

Usage

## S3 method for class 'fastglm'
residuals(
  object,
  type = c("deviance", "pearson", "working", "response", "partial"),
  ...
)
## S3 method for class 'fastglm'
residuals(
  object,
  type = c("deviance", "pearson", "working", "response", "partial"),
  ...
)

Arguments

`object`	fastglm fitted object
`type`	type of residual to be returned
`...`	not used

Value

a vector of residuals

summary method for fastglm fitted objects

Description

summary method for fastglm fitted objects

Usage

## S3 method for class 'fastglm'
summary(object, dispersion = NULL, ...)
## S3 method for class 'fastglm'
summary(object, dispersion = NULL, ...)

Arguments

`object`	fastglm fitted object
`dispersion`	the dispersion parameter for the family used. Either a single numerical value or `NULL` (the default), when it is inferred from `object`.
`...`	not used

Value

a summary.fastglm object

Examples


x <- matrix(rnorm(10000 * 10), ncol = 10)
y <- 1 * (0.25 * x[,1] - 0.25 * x[,3] > rnorm(10000))

fit <- fastglm(x, y, family = binomial())

summary(fit)


x <- matrix(rnorm(10000 * 10), ncol = 10)
y <- 1 * (0.25 * x[,1] - 0.25 * x[,3] > rnorm(10000))

fit <- fastglm(x, y, family = binomial())

summary(fit)

Package 'fastglm'

Help Index

big.matrix prod

Description

Usage

Arguments

deviance method for fastglm fitted objects

Description

Usage

Arguments

Value

family method for fastglm fitted objects

Description

Usage

Arguments

Value

fast generalized linear model fitting

Description

Usage

Arguments

Value

Examples

fast generalized linear model fitting

Description

Usage

Arguments

Value

Examples

logLik method for fastglm fitted objects

Description

Usage

Arguments

Value

Obtains predictions and optionally estimates standard errors of those predictions from a fitted generalized linear model object.

Description

Usage

Arguments

print method for fastglm objects

Description

Usage

Arguments

residuals method for fastglm fitted objects

Description

Usage

Arguments

Value

summary method for fastglm fitted objects

Description

Usage

Arguments

Value

Examples