lm
Getting Warning: "'newdata' had 1 row but variables found have 32 rows" on predict.lm
This problem comes from using different variable names in your data and your newdata, not from using vectors versus data frames. When you fit a model with lm and then call predict, predict looks for the same variable names in newdata. In your first case name x …
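A minimal sketch of the name mismatch, assuming the built-in mtcars data set (32 rows) as a stand-in, since the original question's data is not shown in this excerpt. The object names fit_bad, fit_ok, and new_car are hypothetical.

```r
# Assumed example data: the built-in mtcars data set (32 rows).
# Fitting with `mtcars$` prefixes makes the model terms literally
# `mtcars$mpg` and `mtcars$wt`, so predict() cannot match a column named `wt`.
fit_bad <- lm(mtcars$mpg ~ mtcars$wt)
new_car <- data.frame(wt = 3)

# predict() falls back to the original 32 values of mtcars$wt and warns;
# the warning is suppressed here only to keep the sketch quiet.
pred_bad <- suppressWarnings(predict(fit_bad, newdata = new_car))
length(pred_bad)

# Fitting with bare column names plus `data =` lets predict() find `wt`
# in newdata and return a single prediction.
fit_ok <- lm(mpg ~ wt, data = mtcars)
pred_ok <- predict(fit_ok, newdata = new_car)
length(pred_ok)
```

Fitting with `data =` and bare column names is what keeps the names in the model aligned with the names in newdata.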
Get coefficients estimated by maximum likelihood into a stargazer table
I was just having this problem and overcame it by using the coef, se, and omit arguments of stargazer, e.g. stargazer(regressions, … coef = list(… list of coefs…), se = list(… list of standard errors…), omit = c(sequence), covariate.labels = c("new names"), dep.var.labels.include = FALSE, notes.append = FALSE, file = "")
Extract regression coefficient values
A summary.lm object stores these values in a matrix called 'coefficients'. So the value you are after can be accessed with: a2Pval <- summary(mg)$coefficients[2, 4] Or, more generally and readably: coef(summary(mg))["a2", "Pr(>|t|)"]. See here for why this method is preferred.
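A short runnable sketch of both access styles. The model name mg follows the excerpt, but the mtcars data and the predictors wt and hp are assumptions made for illustration, as the original model is not shown.

```r
# Assumed stand-in model; the original `mg` and its term `a2` are not shown.
mg <- lm(mpg ~ wt + hp, data = mtcars)

# The coefficient table is a numeric matrix: one row per term, with columns
# "Estimate", "Std. Error", "t value" and "Pr(>|t|)".
coef_table <- coef(summary(mg))

# Positional access: row 2 is the first predictor, column 4 is the p-value.
p_by_position <- coef_table[2, 4]

# Name-based access: same value, but it survives reordering of the terms.
p_by_name <- coef_table["wt", "Pr(>|t|)"]
```

Indexing by name makes the intent visible and does not silently pick a different term if the formula changes.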
How do I extract just the number from a named number (without the name)?
For a single element like this, use [[ rather than [. Compare:
    coefficients(out)["newx"]
    # newx
    #    1
    coefficients(out)[["newx"]]
    # [1] 1
More generally, use unname():
    unname(coefficients(out)[c("newx", "(Intercept)")])
    # [1] 1.0 1.5
    head(unname(mtcars))
    # NA NA NA NA NA NA NA NA NA NA NA
    # Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 …
Linear Regression and group by in R
dplyr, released since 2009, provides a very nice way to do this kind of grouping, closely resembling what SAS does.
    library(dplyr)
    d <- data.frame(state = rep(c('NY', 'CA'), c(10, 10)),
                    year = rep(1:10, 2),
                    response = c(rnorm(10), rnorm(10)))
    fitted_models = d %>% group_by(state) %>% do(model = lm(response ~ year, data = .))
    # Source: local data frame [2 …
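For comparison, here is a dependency-free sketch of the same per-group fit using base R's split() and lapply(). This is a swapped-in alternative, not the dplyr approach from the answer. The seed and the names models and coefs are assumptions added for illustration.

```r
# Same toy data as in the excerpt; the seed is an assumption added so the
# sketch is reproducible.
set.seed(1)
d <- data.frame(state = rep(c("NY", "CA"), c(10, 10)),
                year = rep(1:10, 2),
                response = rnorm(20))

# split() yields one data frame per state; lapply() fits one model to each.
models <- lapply(split(d, d$state), function(g) lm(response ~ year, data = g))

# Collect the coefficients into a matrix with one row per state.
coefs <- t(sapply(models, coef))
coefs
```

Each element of models is an ordinary lm object, so summary(models$NY) or predict(models$CA, ...) work as usual.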
How to succinctly write a formula with many variables from a data frame?
A formula can use a special identifier, ., to mean all the other variables in the data frame.
    y <- c(1,4,6)
    d <- data.frame(y = y, x1 = c(4,-1,3), x2 = c(3,9,8), x3 = c(4,-4,-2))
    mod <- lm(y ~ ., data = d)
You can also do things like this, to …
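A small sketch of what the dot expands to, assuming the built-in mtcars data set rather than the three-row toy data frame above. It also shows subtracting one term with -, which is standard formula syntax but not necessarily what the truncated sentence goes on to describe.

```r
# Assumed example data: mtcars, with mpg as the response.
fit_all <- lm(mpg ~ ., data = mtcars)

# The stored terms show the dot expanded to every column except the response.
all_terms <- attr(terms(fit_all), "term.labels")

# `. - cyl` means "every other column except cyl".
fit_no_cyl <- lm(mpg ~ . - cyl, data = mtcars)
no_cyl_terms <- attr(terms(fit_no_cyl), "term.labels")
```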