When applying LIME to simulated data with a binary outcome, the LIME results do not always match the data generating process. This arises because the `family` argument in the call to `glmnet` is left at its default, `gaussian`, and does not reflect the model type (classification versus regression). See, e.g.,
https://github.com/thomasp85/lime/blob/0281c56e6da697c686e2d7761dfc7a658decb3ca/R/lime.R#L48
and
https://github.com/thomasp85/lime/blob/0281c56e6da697c686e2d7761dfc7a658decb3ca/R/lime.R#L56
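To illustrate the kind of mismatch I mean, here is a minimal sketch in base R. It does not call lime itself. It assumes a logistic data generating process with made-up coefficients and fits the same binary outcome with `glm.fit` under both families. The variable names and the simulated setup are my own assumptions for illustration.

```r
# Simulate a binary outcome from a known logistic data generating process.
# The coefficients below are arbitrary values chosen for illustration.
set.seed(1)
n <- 5000
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
true_beta <- c(x1 = 2, x2 = -1)
y <- rbinom(n, size = 1, prob = plogis(drop(X %*% true_beta)))

# Design matrix with an explicit intercept column, as glm.fit expects.
design <- cbind(`(Intercept)` = 1, X)

# Same data, two families: gaussian (the default) versus binomial.
coef_gaussian <- glm.fit(x = design, y = y, family = gaussian())$coefficients
coef_binomial <- glm.fit(x = design, y = y, family = binomial())$coefficients

# Compare the slopes with the coefficients used in the simulation.
print(rbind(
  true     = true_beta,
  gaussian = coef_gaussian[c("x1", "x2")],
  binomial = coef_binomial[c("x1", "x2")]
))
```

The gaussian fit is on the probability scale and the binomial fit is on the log-odds scale, so only the latter is directly comparable to the coefficients of a logistic data generating process.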
I was wondering whether this is intentional or documented somewhere. As one possible fix, a `family` argument could be added to the `model_permutations` function and passed on to the `glm.fit` and `glmnet` calls. If you'd be open to a corresponding PR, I could prepare one.
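A rough sketch of what I have in mind, using only base R. `fit_local_model` is a hypothetical stand-in and not lime's actual `model_permutations`. It only shows a `family` argument being forwarded to `glm.fit`, with `gaussian()` kept as the default so existing behaviour would not change.

```r
# Hypothetical helper, NOT lime's actual model_permutations(): it only shows
# how a `family` argument could be threaded through to stats::glm.fit().
fit_local_model <- function(x, y, weights = rep(1, nrow(x)),
                            family = gaussian()) {
  design <- cbind(`(Intercept)` = 1, x)
  fit <- glm.fit(x = design, y = y, weights = weights, family = family)
  fit$coefficients
}

# Simulated binary outcome; the coefficients are assumed for illustration.
set.seed(42)
n <- 2000
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))
y <- rbinom(n, size = 1, prob = plogis(1.5 * x[, "x1"] - 0.5 * x[, "x2"]))

# Default call keeps the current gaussian behaviour.
coef_default <- fit_local_model(x, y)

# Caller opts in to a binomial fit for a classification model.
# Note: with binomial(), non-integer weights make glm.fit emit a warning.
coef_logit <- fit_local_model(x, y, family = binomial())

print(rbind(default = coef_default, binomial = coef_logit))
```

The same argument could presumably be passed along to the `glmnet` call as well, since `glmnet` also takes a `family` argument.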