statsmodels – Tarik Billa

OLS Regression: Scikit vs. Statsmodels? [closed]

January 7, 2024 by Tarik

It sounds like you are not feeding the same matrix of regressors X to both procedures (but see below). Here’s an example to show you which options you need to use for sklearn and statsmodels to produce identical results.

import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import LinearRegression

# Generate artificial data … Read more

Why am I getting “LinAlgError: Singular matrix” from grangercausalitytests?

January 3, 2024 by Tarik

The problem arises from the perfect correlation between the two series in your data. From the traceback, you can see that a Wald test is used internally to compute the maximum likelihood estimates for the parameters of the lag-time series. To do this, an estimate of the parameters’ covariance matrix (which is then near-zero) … Read more
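One way to see the degeneracy without calling grangercausalitytests is to build a lagged regressor matrix by hand and check its rank. This is a NumPy-only sketch; the helper lagged_design is hypothetical and only approximates the kind of matrix such a test works with.

```python
import numpy as np

def lagged_design(y, x, maxlag):
    """Stack lags 1..maxlag of y and of x, plus a constant column (hypothetical helper)."""
    n = len(y)
    cols = [y[maxlag - k : n - k] for k in range(1, maxlag + 1)]
    cols += [x[maxlag - k : n - k] for k in range(1, maxlag + 1)]
    cols.append(np.ones(n - maxlag))
    return np.column_stack(cols)

rng = np.random.default_rng(0)
y = rng.normal(size=100)

# x is an exact multiple of y: the two series are perfectly correlated
x_dup = 2.0 * y
rank_dup = np.linalg.matrix_rank(lagged_design(y, x_dup, maxlag=2))

# A little independent noise breaks the exact linear dependence
x_noisy = 2.0 * y + rng.normal(scale=0.1, size=100)
rank_noisy = np.linalg.matrix_rank(lagged_design(y, x_noisy, maxlag=2))

print(rank_dup, rank_noisy)  # 5 columns in both cases; only the noisy one has full rank
```

When the rank is lower than the number of columns, the normal-equations matrix cannot be inverted, which is the kind of situation that surfaces as a singular-matrix error.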

Confidence interval for LOWESS in Python

December 27, 2023 by Tarik

LOESS doesn’t have an explicit concept for standard error. It just doesn’t mean anything in this context. Since that’s out, you’re stuck with the brute-force approach. Bootstrap your data. You’re going to fit a LOESS curve to the bootstrapped data. See the middle of this page to find a pretty picture of what you’re doing. … Read more
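The bootstrap idea can be sketched roughly as follows, using statsmodels' lowess and a percentile band. The function name lowess_bootstrap_band, the synthetic data, and the choices of frac, n_boot and alpha are all assumptions for illustration.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

def lowess_bootstrap_band(x, y, grid, n_boot=200, frac=0.5, alpha=0.05, seed=0):
    """Percentile band for a LOWESS curve, evaluated on a fixed grid (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    curves = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        # Resample (x, y) pairs with replacement
        idx = rng.integers(0, len(x), size=len(x))
        fit = lowess(y[idx], x[idx], frac=frac)  # columns: sorted x, fitted y
        # Drop repeated x values so the curve can be interpolated onto the grid
        xs, first = np.unique(fit[:, 0], return_index=True)
        curves[b] = np.interp(grid, xs, fit[first, 1])
    lower = np.nanpercentile(curves, 100 * alpha / 2, axis=0)
    upper = np.nanpercentile(curves, 100 * (1 - alpha / 2), axis=0)
    return lower, upper

# Synthetic example: a noisy sine curve
rng = np.random.default_rng(1)
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)
grid = np.linspace(0.5, 5.5, 25)
lower, upper = lowess_bootstrap_band(x, y, grid)
```

The grid is kept inside the range of the data because np.interp simply holds the end values constant outside it.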

How to silence statsmodels.fit() in python

December 14, 2023 by Tarik

Use the disp argument to fit. It controls the verbosity of the optimizers in scipy.

mod.fit(disp=0)

See the documentation for fit.

What are the pitfalls of using Dill to serialise scikit-learn/statsmodels models?

December 13, 2023 by Tarik

I’m the dill author. dill was built to do exactly what you are doing… (to persist numerical fits within class instances for statistics) where these objects can then be distributed to different resources and run in an embarrassingly parallel fashion. So, the answer is yes — I have run code like yours, using mystic and/or … Read more
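A minimal sketch of persisting a fit held inside a class instance might look like this. The LineFit class is hypothetical; it stores a lambda, which the standard pickle module cannot serialise but dill can. It assumes dill is installed.

```python
import dill
import numpy as np

class LineFit:
    """Toy 'fitted model' that keeps a closure as an attribute (hypothetical example)."""
    def __init__(self, x, y):
        slope, intercept = np.polyfit(x, y, deg=1)
        self.coef = (slope, intercept)
        # A lambda closing over the fitted values
        self.predict = lambda x_new: slope * np.asarray(x_new) + intercept

x = np.arange(10.0)
y = 3.0 * x + 1.0
fit = LineFit(x, y)

# Serialise to bytes and restore; the bytes could equally be written to a file
payload = dill.dumps(fit)
restored = dill.loads(payload)
print(restored.predict([0.0, 1.0]))
```

Whether a restored object behaves the same on another machine still depends on matching library versions there, which this sketch does not check.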

Pandas rolling regression: alternatives to looping

September 23, 2023 by Tarik

I created an ols module designed to mimic pandas’ deprecated MovingOLS; it is here. It has three core classes:

OLS: static (single-window) ordinary least-squares regression. The outputs are NumPy arrays.
RollingOLS: rolling (multi-window) ordinary least-squares regression. The outputs are higher-dimension NumPy arrays.
PandasRollingOLS: wraps the results of RollingOLS in pandas Series & … Read more
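To illustrate how a rolling regression can avoid an explicit Python loop, here is a NumPy-only sketch that solves the normal equations for every window at once. It is not the module described above; the function name rolling_ols and the synthetic data are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_ols(X, y, window):
    """Return an (n_windows, k) array of OLS coefficients, one row per window."""
    # Xw: (n_windows, window, k), yw: (n_windows, window)
    Xw = sliding_window_view(X, window, axis=0).transpose(0, 2, 1)
    yw = sliding_window_view(y, window)
    # Batched X'X and X'y for all windows
    XtX = np.einsum("nwi,nwj->nij", Xw, Xw)
    Xty = np.einsum("nwi,nw->ni", Xw, yw)
    # Solve each k-by-k system; the trailing axis keeps the batched shape explicit
    return np.linalg.solve(XtX, Xty[..., None])[..., 0]

# Synthetic example: constant plus one regressor
rng = np.random.default_rng(0)
n = 120
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1.0 + 2.0 * X[:, 1] + rng.normal(scale=0.2, size=n)
coefs = rolling_ols(X, y, window=20)
print(coefs[-1])  # coefficients for the most recent window
```

Solving the normal equations directly is less numerically robust than a QR or SVD based solver, so this form suits well-conditioned windows.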

Using statsmodel estimations with scikit-learn cross validation, is it possible?

August 30, 2023 by Tarik

Indeed, you cannot use cross_val_score directly on statsmodels objects, because of the different interface. In statsmodels:

training data is passed directly into the constructor
a separate object contains the result of model estimation

However, you can write a simple wrapper to make statsmodels objects look like sklearn estimators:

import statsmodels.api as sm
from sklearn.base import BaseEstimator, … Read more

ANOVA in python using pandas dataframe with statsmodels or scipy?

August 25, 2023 by Tarik

I set up a direct comparison to test them, found that their assumptions can differ slightly, got a hint from a statistician, and here is an example of ANOVA on a pandas dataframe matching R’s results:

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# R code on R sample … Read more

ImportError: No module named statsmodels

August 8, 2023 by Tarik

You shouldn’t untar it to /usr/local/lib/python2.7/dist-packages (you could use any temporary directory).
You might have used a different python executable by mistake, e.g., /usr/bin/python instead of the one corresponding to /usr/local/lib/python2.7.
You should use the pip corresponding to the desired python version (use python -V to check the version) to install it:

$ python -m pip … Read more
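To check which interpreter is actually running and whether it can see the package, a short diagnostic like the following may help. The helper describe_install is hypothetical; it relies only on the standard library.

```python
import importlib.util
import sys

def describe_install(module_name):
    """Report the running interpreter and where (if anywhere) it finds the module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "python": sys.executable,
        "version": sys.version.split()[0],
        "location": spec.origin if spec else None,
    }

info = describe_install("statsmodels")
print(info)
if info["location"] is None:
    # Installing through this exact interpreter avoids mixing up Python versions
    print(f"Try: {info['python']} -m pip install statsmodels")
```

Running the script with the same executable that raises the ImportError shows whether the package was installed for a different Python.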

Converting statsmodels summary object to Pandas Dataframe

August 4, 2023 by Tarik

The answer from @Michael B works well, but requires “recreating” the table. The table itself is actually directly available from the summary().tables attribute. Each table in this attribute (which is a list of tables) is a SimpleTable, which has methods for outputting different formats. We can then read any of those formats back as a … Read more