Poisson-regression
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

Khaki Chinos

df = pd.read_csv('../data/poisson_regression.csv')
df.head()
   ID  Visits  Income  Sex   Age  Size
0   1       0   11.38    1  3.87     2
1   2       5    9.77    1  4.04     1
2   3       0   11.08    0  3.33     2
3   4       0   10.92    1  3.95     3
4   5       0   10.92    1  2.83     3

Vanilla Poisson Regression

model = smf.glm(formula='Visits ~ Income + Sex + Age + Size', 
                data=df, 
                family=sm.families.Poisson()).fit()

summary = model.summary()
summary
Generalized Linear Model Regression Results

Dep. Variable:                 Visits   No. Observations:                 2728
Model:                            GLM   Df Residuals:                     2723
Model Family:                 Poisson   Df Model:                            4
Link Function:                    log   Scale:                          1.0000
Method:                          IRLS   Log-Likelihood:                -6291.4
Date:                Sun, 22 Mar 2020   Deviance:                       10745.
Time:                        17:39:02   Pearson chi2:                 4.10e+04
No. Iterations:                     6
Covariance Type:            nonrobust

                 coef    std err          z      P>|z|      [0.025      0.975]
Intercept     -3.1221      0.406     -7.697      0.000      -3.917      -2.327
Income         0.0931      0.034      2.710      0.007       0.026       0.160
Sex            0.0043      0.041      0.105      0.916      -0.076       0.084
Age            0.5893      0.055     10.756      0.000       0.482       0.697
Size          -0.0358      0.015     -2.340      0.019      -0.066      -0.006
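
A quick way to read this summary is to check for overdispersion and to convert the log-scale coefficients into rate ratios. The sketch below is only illustrative: it hard-codes the numbers printed in the summary above rather than refitting the model, and the variable names (`dispersion`, `rate_ratios`) are introduced here for the example.

```python
import math

# Values copied from the GLM summary above (not recomputed from the data).
pearson_chi2 = 4.10e4   # "Pearson chi2"
df_resid = 2723         # "Df Residuals"

# For a well-specified Poisson model this ratio should be close to 1;
# values far above 1 suggest overdispersion.
dispersion = pearson_chi2 / df_resid

# Poisson coefficients are on the log scale, so exp(coef) is the
# multiplicative change in expected Visits per one-unit increase.
coefs = {"Income": 0.0931, "Sex": 0.0043, "Age": 0.5893, "Size": -0.0358}
rate_ratios = {name: math.exp(b) for name, b in coefs.items()}

print(f"Pearson dispersion: {dispersion:.2f}")
for name, rr in rate_ratios.items():
    print(f"{name}: rate ratio {rr:.3f}")
```

The dispersion ratio comes out at roughly 15, far above 1, which is one reason to look at zero-inflated and negative binomial alternatives. On the fitted object itself the same quantities should be available as `model.pearson_chi2 / model.df_resid` and `np.exp(model.params)`.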

Zero-Inflated Poisson Regression

from statsmodels.discrete.count_model import ZeroInflatedPoisson
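Y = df['Visits']  # assumed: Y and X are not defined in the original; inferred from the summary below
X = df[['Income', 'Sex', 'Age', 'Size']].assign(intercept=1.0)  # assumed: constant column named 'intercept'
ZeroInflatedPoisson(Y, X).fit().summary()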
Optimization terminated successfully.
         Current function value: 1.575301
         Iterations: 27
         Function evaluations: 30
         Gradient evaluations: 30
/Users/kailu/.pyenv/versions/3.6.5/lib/python3.6/site-packages/statsmodels/base/model.py:512: ConvergenceWarning: Maximum Likelihood optimization failed to converge. Check mle_retvals
  "Check mle_retvals", ConvergenceWarning)
ZeroInflatedPoisson Regression Results

Dep. Variable:                 Visits   No. Observations:                 2728
Model:            ZeroInflatedPoisson   Df Residuals:                     2723
Method:                           MLE   Df Model:                            4
Date:                Sun, 22 Mar 2020   Pseudo R-squ.:                0.001982
Time:                        18:15:05   Log-Likelihood:                -4297.4
converged:                       True   LL-Null:                       -4306.0
Covariance Type:            nonrobust   LLR p-value:                  0.001877

                    coef    std err          z      P>|z|      [0.025      0.975]
inflate_const     1.0631      0.045     23.829      0.000       0.976       1.151
Income           -0.0898      0.036     -2.461      0.014      -0.161      -0.018
Sex              -0.1327      0.043     -3.096      0.002      -0.217      -0.049
Age               0.1144      0.063      1.809      0.070      -0.010       0.238
Size              0.0196      0.015      1.303      0.193      -0.010       0.049
intercept         1.8964      0.433      4.377      0.000       1.047       2.746
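
To interpret `inflate_const`, it helps to map it back to a probability. The sketch below assumes the model was fit with statsmodels' default logit inflation and a constant-only inflation part (no `exog_infl` was passed in the call above). It also compares AIC values using the log-likelihoods printed in the two summaries. All numbers are copied from the output, not recomputed, and the variable names are introduced here for illustration.

```python
import math

# "inflate_const" from the ZeroInflatedPoisson summary above.
inflate_const = 1.0631

# Assuming logit inflation, the inverse logit gives the estimated
# probability that an observation is a structural (excess) zero.
p_structural_zero = 1.0 / (1.0 + math.exp(-inflate_const))

# AIC = 2k - 2*LL, with k the number of estimated parameters.
ll_poisson, k_poisson = -6291.4, 5   # Poisson GLM: intercept + 4 slopes
ll_zip, k_zip = -4297.4, 6           # ZIP: 5 count parameters + inflate_const
aic_poisson = 2 * k_poisson - 2 * ll_poisson
aic_zip = 2 * k_zip - 2 * ll_zip

print(f"P(structural zero): {p_structural_zero:.3f}")
print(f"AIC Poisson: {aic_poisson:.1f}, AIC ZIP: {aic_zip:.1f}")
```

Under these assumptions, roughly three quarters of observations are estimated to be structural zeros, and the ZIP model has a much lower AIC than the plain Poisson fit. The ConvergenceWarning in the output suggests treating the ZIP estimates with some caution.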

NBD Regression