OLS Regression: Scikit vs. Statsmodels? [closed]

It sounds like you are not feeding the same matrix of regressors X to both procedures (but see below). Here’s an example to show you which options you need to use for sklearn and statsmodels to produce identical results. import numpy as np import statsmodels.api as sm from sklearn.linear_model import LinearRegression # Generate artificial data … Read more

Why am I getting “LinAlgError: Singular matrix” from grangercausalitytests?

The problem arises from the perfect correlation between the two series in your data. From the traceback, you can see that a Wald test is used internally to compute the maximum likelihood estimates for the parameters of the lag-time series. To do this an estimate of the parameter covariance matrix (which is then near-zero) … Read more
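
As a rough illustration of why perfect correlation is a problem, the sketch below uses plain NumPy rather than the internals of `grangercausalitytests` (which are not shown in the excerpt). The two series and all variable names are invented; the point is only that perfectly collinear columns give a rank-deficient cross-product matrix, which cannot be inverted reliably.

```python
import numpy as np

# Two perfectly correlated series (illustrative data): y is an exact multiple of x.
x = np.arange(1.0, 11.0)
y = 2.0 * x

# Stack them as regressors and form the cross-product matrix X'X.
X = np.column_stack([x, y])
gram = X.T @ X

# Full rank would be 2; perfect collinearity leaves only one independent direction.
rank = np.linalg.matrix_rank(gram)
print("rank of X'X:", rank)

# Inverting a singular matrix is where a LinAlgError can surface.
try:
    np.linalg.inv(gram)
    print("inverse computed (numerically unreliable for a singular matrix)")
except np.linalg.LinAlgError as exc:
    print("LinAlgError:", exc)
```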

What are the pitfalls of using Dill to serialise scikit-learn/statsmodels models?

I’m the dill author. dill was built to do exactly what you are doing… (to persist numerical fits within class instances for statistics) where these objects can then be distributed to different resources and run in an embarrassingly parallel fashion. So, the answer is yes — I have run code like yours, using mystic and/or … Read more

Pandas rolling regression: alternatives to looping

I created an ols module designed to mimic pandas’ deprecated MovingOLS; it is here. It has three core classes: OLS: static (single-window) ordinary least-squares regression; the outputs are NumPy arrays. RollingOLS: rolling (multi-window) ordinary least-squares regression; the outputs are higher-dimensional NumPy arrays. PandasRollingOLS: wraps the results of RollingOLS in pandas Series & … Read more

Using statsmodel estimations with scikit-learn cross validation, is it possible?

Indeed, you cannot use cross_val_score directly on statsmodels objects, because the interfaces differ: in statsmodels, training data is passed directly into the constructor, and a separate object contains the result of model estimation. However, you can write a simple wrapper to make statsmodels objects look like sklearn estimators: import statsmodels.api as sm from sklearn.base import BaseEstimator, … Read more

ANOVA in python using pandas dataframe with statsmodels or scipy?

I set up a direct comparison to test them, found that their assumptions can differ slightly, got a hint from a statistician, and here is an example of ANOVA on a pandas DataFrame matching R’s results: import pandas as pd import statsmodels.api as sm from statsmodels.formula.api import ols # R code on R sample … Read more

ImportError: No module named statsmodels

You shouldn’t untar it to /usr/local/lib/python2.7/dist-packages (you could use any temporary directory). You might have mistakenly used a different python executable, e.g., /usr/bin/python instead of the one corresponding to /usr/local/lib/python2.7. You should use the pip corresponding to the desired python version (use python -V to check the version) to install it: $ python -m pip … Read more

Converting statsmodels summary object to Pandas Dataframe

The answer from @Michael B works well, but requires “recreating” the table. The table itself is actually directly available from the summary().tables attribute. Each table in this attribute (which is a list of tables) is a SimpleTable, which has methods for outputting different formats. We can then read any of those formats back as a … Read more
