Notation to specify a mixed effects model with crossed random effects in Python (statsmodels)

by StreetHawk   Last Updated May 16, 2019 00:19 AM

I'm trying to model a binary response variable based on some continuous variables (fixed effects) and categorical variables (random effects). Here is the model equation: $Y_{ijk}=\beta X_i + Z_j + Z_k + \epsilon$.

$Y_{ijk}$: whether session $i$ got a click (1) or not (0)

$X_i$: fixed-effect features of the customer in session $i$: num_purchases_cat1, num_purchases_cat2, num_purchases_cat3

$Z_j$: random effect: ad_category (100 categories)

$Z_k$: random effect: ad_price (5 buckets)
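
Written out with the distributional assumptions I have in mind (these are the usual linear mixed model assumptions, not something I have verified for this data; the variance symbols $\sigma_{\text{cat}}^2$, $\sigma_{\text{price}}^2$ and $\sigma^2$ are names introduced here for illustration):

$$
Y_{ijk} = \beta X_i + Z_j + Z_k + \epsilon, \qquad
Z_j \sim N(0, \sigma_{\text{cat}}^2), \quad
Z_k \sim N(0, \sigma_{\text{price}}^2), \quad
\epsilon \sim N(0, \sigma^2)
$$

with $Z_j$, $Z_k$ and $\epsilon$ mutually independent. "Crossed" here means that every ad_category level can occur together with every ad_price bucket, so neither factor is nested inside the other.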

Say my data_train contains all of these columns: [clicked, num_purchases_cat1, num_purchases_cat2, num_purchases_cat3, ad_category, ad_price]. The two random effects, ad_category and ad_price, are independent of each other, so I'd like to fit a mixed effects model with crossed random effects.

The statsmodels documentation states that, to include crossed random effects, I need to treat the entire dataset as a single group, so here's what I'm trying:

import numpy as np
import statsmodels.regression.mixed_linear_model as mlm
# groups is all ones, so every row belongs to the same (single) group
lmm = mlm.MixedLM(data_train.clicked, data_train[['num_purchases_cat1', 'num_purchases_cat2', 'num_purchases_cat3']], groups=np.ones(data_train.shape[0]))

Now I'm struggling with how to specify exog_re and exog_vc. Do I simply pass data_train[['ad_category','ad_price']], or do I need to one-hot encode those columns first? And how does the specification change if I want random slopes rather than random intercepts only?
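
For comparison, here is a minimal sketch of the formula-based route as I currently understand it, where each crossed factor becomes a variance component passed through `vc_formula` and patsy's `C(...)` handles the one-hot encoding. The data below is synthetic and purely hypothetical (8 ad categories instead of 100, made-up effect sizes), the constant column `whole_dataset` is a name I invented for the single group, and whether this is the intended way to express the model above is an assumption on my part.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for data_train (hypothetical values, same column names).
rng = np.random.default_rng(0)
n = 400
n_cat, n_price = 8, 5  # 8 categories here instead of 100, to keep it small
data_train = pd.DataFrame({
    "num_purchases_cat1": rng.poisson(2.0, n),
    "num_purchases_cat2": rng.poisson(1.0, n),
    "num_purchases_cat3": rng.poisson(0.5, n),
    "ad_category": rng.integers(0, n_cat, n),
    "ad_price": rng.integers(0, n_price, n),
})

# Simulate clicks with one random shift per category and per price bucket.
cat_effect = rng.normal(0.0, 0.15, n_cat)
price_effect = rng.normal(0.0, 0.15, n_price)
p = (0.3 + 0.03 * data_train["num_purchases_cat1"].to_numpy()
     + cat_effect[data_train["ad_category"].to_numpy()]
     + price_effect[data_train["ad_price"].to_numpy()])
data_train["clicked"] = (rng.random(n) < np.clip(p, 0.05, 0.95)).astype(int)

# Treat the entire dataset as a single group.
data_train["whole_dataset"] = 1

# One variance component per crossed factor; "0 + C(x)" asks patsy for
# one indicator column per level of x, without an intercept column.
vc_formula = {
    "ad_category": "0 + C(ad_category)",
    "ad_price": "0 + C(ad_price)",
}

model = smf.mixedlm(
    "clicked ~ num_purchases_cat1 + num_purchases_cat2 + num_purchases_cat3",
    data=data_train,
    groups="whole_dataset",
    vc_formula=vc_formula,
)
result = model.fit()
print(result.summary())
```

Two caveats: MixedLM is a linear model, so with a binary `clicked` this amounts to a linear probability model rather than a logistic one, and with a single group the fit works with an n-by-n covariance structure, which may be slow on a large data_train. My guess is that a random slope would be one more `vc_formula` entry such as `"0 + C(ad_category):num_purchases_cat1"`, but I have not verified that.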
