## predict method for felm from lfe package

rstudio lfe package

lfe-package r

felm multiple endogenous variables

error in backsolve ch b badvars transpose true invalid k argument

felm c fitted values

predict function in r

felm output

Does anyone have a nice clean way to get `predict`

behavior for `felm`

models?

library(lfe) model1 <- lm(data = iris, Sepal.Length ~ Sepal.Width + Species) predict(model1, newdata = data.frame(Sepal.Width = 3, Species = "virginica")) # Works model2 <- felm(data = iris, Sepal.Length ~ Sepal.Width | Species) predict(model2, newdata = data.frame(Sepal.Width = 3, Species = "virginica")) # Does not work

UPDATE (2020-04-02): The answer from Grant below using the new package `fixest`

provides a more parsimonious solution.

As a workaround, you could combine `felm`

, `getfe`

, and `demeanlist`

as follows:

library(lfe) lm.model <- lm(data=demeanlist(iris[, 1:2], list(iris$Species)), Sepal.Length ~ Sepal.Width) fe <- getfe(felm(data = iris, Sepal.Length ~ Sepal.Width | Species)) predict(lm.model, newdata = data.frame(Sepal.Width = 3)) + fe$effect[fe$idx=="virginica"]

The idea is that you use `demeanlist`

to center the variables, then `lm`

to estimate the coefficient on `Sepal.Width`

using the centered variables, giving you an `lm`

object over which you can run `predict`

. Then run `felm`

+`getfe`

to get the conditional mean for the fixed effect, and add that to the output of `predict`

.

**[PDF] Package 'lfe',** levels, lm or other packages are probably better suited. lfe is designed to produce the same results First the other coefficients are estimated with the function felm Reduced residuals, i.e. the residuals resulting from predict-. Setting exactDOF='rM' computes the exact degrees of freedom with rankMatrix () in package Matrix. For the iv-part of the formula, it is only necessary to include the instruments on the right hand side. The other explanatory covariates, from the first and second part of formula, are added automatically in the first stage regression.

This might not be the answer that you are looking for, but it seems that the author did not add any functionality to the `lfe`

package in order to make predictions on external data by using the fitted `felm`

model. The primary focus seems to be on the analysis of the group fixed effects. However, it's interesting to note that in the documentation of the package the following is mentioned:

The object has some resemblance to an 'lm' object, and some postprocessing methods designed for lm may happen to work. It may however be necessary to coerce the object to succeed with this.

Hence, it might be possible to coerce the `felm`

object to an `lm`

object in order to obtain some additional `lm`

functionality (if all the required info is present in the object to perform the necessary computations).

The lfe package is intended to be run on very large datasets and effort was made to conserve memory: As a direct result of this, the `felm`

object does not use/contain a qr decomposition, as opposed to the `lm`

object. Unfortunately, the `lm`

`predict`

procedure relies on this information in order to compute the predictions. Hence, coercing the `felm`

object and executing the predict method will fail:

> model2 <- felm(data = iris, Sepal.Length ~ Sepal.Width | Species) > class(model2) <- c("lm","felm") # coerce to lm object > predict(model2, newdata = data.frame(Sepal.Width = 3, Species = "virginica")) Error in qr.lm(object) : lm object does not have a proper 'qr' component. Rank zero or should not have used lm(.., qr=FALSE).

If you really must use this package to perform the predictions then you could maybe write your own simplified version of this functionality by using the information that you have available in the `felm`

object. For example, the OLS regression coëfficients are available via `model2$coefficients`

.

**predict method for felm from lfe package,** Does anyone have a nice clean way to get predict behavior for felm models? library(lfe) model1 <- lm(data = iris, Sepal.Length ~ Sepal.Width +� Regardless, your answer doesn't answer the question, which specifically asked for the predict method for felm models. – kennyB Jul 3 '15 at 19:51 | show 1 more comment up vote 0 down vote This should work for cases where you wish to ignore the group effects in the prediction, are predicting for new X's, and and only want confidence intervals.

Late to the party, but the new **fixest** package (link) has a predict method. It supports high-dimensional fixed effects (and clustering, etc.) using a very similar syntax to lfe. Somewhat remarkably, it is also considerably *faster* than lfe for the benchmark cases that I've tested.

library(fixest) model_feols <- feols(data = iris, Sepal.Length ~ Sepal.Width | Species) predict(model_feols, newdata = data.frame(Sepal.Width = 3, Species = "virginica")) # Works

**felm: Fit a linear model with multiple group fixed effects in lfe: Linear ,** 'felm' is used to fit linear models with multiple group fixed effects, similarly to lm. It uses the Method of Alternating projections to sweep out multiple group effects computes the exact degrees of freedom with rankMatrix() in package Matrix. i.e. when predicting with the predicted endogenous variables from the 1st stage. 4 lfe-package In the exec-directory there is a perl-script lfescriptwhich is used at the author’s site for automated creation of R-scripts from a simple speciﬁcation ﬁle. The format is documented in doc/lfeguide.txt. lfe is similar in function, though not in method, to the Stata modules a2reg and felsdvreg. The

To extend the answer from pbaylis, I created a slightly longwinded function that extends nicely to allow for more than one fixed effect. Note that you have to manually enter the original dataset used in the felm model. The function returns a list with two items: the vector of predictions, and a dataframe based on the new_data that includes the predictions and fixed effects as columns.

predict_felm <- function(model, data, new_data) { require(dplyr) # Get the names of all the variables y <- model$lhs x <- rownames(model$beta) fe <- names(model$fe) # Demean according to fixed effects data_demeaned <- demeanlist(data[c(y, x)], as.list(data[fe]), na.rm = T) # Create formula for LM and run prediction lm_formula <- as.formula( paste(y, "~", paste(x, collapse = "+")) ) lm_model <- lm(lm_formula, data = data_demeaned) lm_predict <- predict(lm_model, newdata = new_data) # Collect coefficients for fe fe_coeffs <- getfe(model) %>% select(fixed_effect = effect, fe_type = fe, idx) # For each fixed effect, merge estimated fixed effect back into new_data new_data_merge <- new_data for (i in fe) { fe_i <- fe_coeffs %>% filter(fe_type == i) by_cols <- c("idx") names(by_cols) <- i new_data_merge <- left_join(new_data_merge, fe_i, by = by_cols) %>% select(-matches("^idx")) } if (length(lm_predict) != nrow(new_data_merge)) stop("unmatching number of rows") # Sum all the fixed effects all_fixed_effects <- base::rowSums(select(new_data_merge, matches("^fixed_effect"))) # Create dataframe with predictions new_data_predict <- new_data_merge %>% mutate(lm_predict = lm_predict, felm_predict = all_fixed_effects + lm_predict) return(list(predict = new_data_predict$felm_predict, data = new_data_predict)) } model2 <- felm(data = iris, Sepal.Length ~ Sepal.Width | Species) predict_felm(model = model2, data = iris, new_data = data.frame(Sepal.Width = 3, Species = "virginica")) # Returns prediction and data frame

**Predict.felm not working � Issue #3 � skranz/regtools � GitHub,** There probably were some updates with the lfe package that broke my predict. felm function. I uploaded a new version (that does some brute� Enter lfe The package lfe is designed to handle the above estimation problem. It also contains some methods for solving identiﬁcation problems, though not completely in all cases. Since lfe handles quite ordinary linear models which conceptually could be handled by lm(), we do not go into detail about the

This should work for cases where you wish to ignore the group effects in the prediction, are predicting for new X's, and only want confidence intervals. It first looks for a `clustervcv`

attribute, then `robustvcv`

, then `vcv`

.

predict.felm <- function(object, newdata, se.fit = FALSE, interval = "none", level = 0.95){ if(missing(newdata)){ stop("predict.felm requires newdata and predicts for all group effects = 0.") } tt <- terms(object) Terms <- delete.response(tt) attr(Terms, "intercept") <- 0 m.mat <- model.matrix(Terms, data = newdata) m.coef <- as.numeric(object$coef) fit <- as.vector(m.mat %*% object$coef) fit <- data.frame(fit = fit) if(se.fit | interval != "none"){ if(!is.null(object$clustervcv)){ vcov_mat <- object$clustervcv } else if (!is.null(object$robustvcv)) { vcov_mat <- object$robustvcv } else if (!is.null(object$vcv)){ vcov_mat <- object$vcv } else { stop("No vcv attached to felm object.") } se.fit_mat <- sqrt(diag(m.mat %*% vcov_mat %*% t(m.mat))) } if(interval == "confidence"){ t_val <- qt((1 - level) / 2 + level, df = object$df.residual) fit$lwr <- fit$fit - t_val * se.fit_mat fit$upr <- fit$fit + t_val * se.fit_mat } else if (interval == "prediction"){ stop("interval = \"prediction\" not yet implemented") } if(se.fit){ return(list(fit=fit, se.fit=se.fit_mat)) } else { return(fit) } }

**r: predict method for felm from lfe package,** Does anyone have a nice clean way to get predict behavior for felm models? library(lfe)model1 <- lm(data = iris, Sepal.Length ~ Sepal.Width +� Predictions from an lme Object Description. The predictions at level i are obtained by adding together the population predictions (based only on the fixed effects estimates) and the estimated contributions of the random effects to the predictions at grouping levels less or equal to i.

**Regressions,** The package lfe implements models with high dimensional fixed effects or/and Errors reported by felm are similar to the ones given by areg and not xtivreg / xtivreg2 . An estimation function returns a list that contains the estimates, the in a lot of cases, the residuals, the predicted values, or the original variables used in� You can get the actual predictions, based on the probability matrix and the values of 𝑝(𝑥), with .predict(): >>> model . predict ( x ) array([0, 0, 0, 1, 1, 1, 1, 1, 1, 1]) This function returns the predicted output values as a one-dimensional array.

**Add predict.lfe method - lfe,** Ask questionsAdd predict.lfe method SO threads on this issue, e.g. https:// stackoverflow.com/questions/30491545/predict-method-for-felm-from-lfe- package� Your package needs to have Depends: pls in the DESCRIPTION in order for the correct predict method to be available to R. There's nothing in pls that you can specifically import. You also need to import the predict generic from the stats namespace, so add #' @importFrom stats predict as that will import the generic in you packages namespace.

**Diagnostics for fixed effects panel models in R,** For large numbers of fixed effects, the function can be intractable and return A more detailed introduction to these packages can be found in [1] A basic panel model can be with using lfe with the provided felm() method. the predict() function in R doesn't work out-of-the-box with felm() or plm() models. Loess Regression is the most common method used to smoothen a volatile time series. It is a non-parametric methods where least squares regression is performed in localized subsets, which makes it a suitable candidate for smoothing any numerical vector.

##### Comments

- predict doesnt work because it creates felm class object and predict wont work for it
- Just a note, you don't have to say
`data(iris)`

, iris data is lazyloaded already. - as for adding predict to include to felm create a request to r-proj-c > methods("predict") [1] predict.ar* predict.Arima* predict.arima0* [4] predict.glm predict.HoltWinters* predict.lm [7] predict.loess* predict.mlm* predict.nls* [10] predict.poly* predict.ppr* predict.prcomp* [13] predict.princomp* predict.smooth.spline* predict.smooth.spline.fit* [16] predict.StructTS*
- I think quite a bit of re-engineering the
`felm()`

function (and the functions it calls) would be necessary as the current implementation does not store the fixed effect coefficients, or even apparently the intercept -- see this answer on a question that is at least a near duplicate of this one. - How do you do this for multiple fe?
- You add the other FE to the demeanlist and getfe commands, then add another term onto the final sum.
- This answer should get more attention, getfe is a very useful command and it's obvious how to predict once you have that. Furthermore it seems to be the only answer that actually answers the question in a general, correct way
- Well, it's not as general as I would like. You couldn't use my code to construct standard errors on yhat, or the confidence or prediction interval. I don't know how to do that, so I posted a similar question to this one to see if anyone else had thoughts. stackoverflow.com/questions/48634449/…
- No, we want to use the original value, since the coefficients we estimate still represent the same thing they would have in the uncentered model. You can double check by running predict on the
`lm`

equivalent:`lm2 <- lm(data = iris, Sepal.Length ~ Sepal.Width + factor(Species)) predict(lm2, newdata = data.frame(Sepal.Width = 3, Species = "virginica"))`

- Helpful comments. Thank you.
- This doesn't seem to replicate the example above and has results2 where it should have model2.
- Fixed the results2 (typo). The difference I'm seeing between the two answers is .001, which could easily just come from slight differences between how the two models are implemented.
- Still doesn't seem to be working on my machine. I get this error
`Error: sum(nb) == q is not TRUE`

- I updated with the complete code (loading in the library and data), and it works on both my Mac and PC. I'm using R 3.1.1 on my Mac. I'm not sure why it's not working for you - my original thought would be that it's due to NA, but we're only predicting on one observation so that shouldn't be a problem.
- lmer implements RANDOM effects. lfe implements fixed effects. fixed effects are not shrunken, because the goal is typically inference about marginal effects, rather than prediction. If you want to fit a fixed effects model, don't use
`lmer`

.