Offsets in predict.glm(), given that the fitted poisson model already considered offsets - R

Question

I want to predict a fitted poisson glm on newdata, given that it was fitted using offset=log(Exposure), but I get confused with the inclusion of the term "offset" inside of predict.glm(). I do not understand if it is necessary to include it and, if I want to include it, how should I do that with an equal impact as the offset used in fitting?

To clear my doubts, I created a dataset from scratch and I decided to compare three possible approaches:

predict(glm.model, data=newdata, type="response", offset=log(Exposure));
predict(glm.model, data=newdata, type="response", offset=Exposure);
predict(glm.model, data=newdata, type="response");

However, when comparing the results, I was surprised to find all three approaches resulted in the same predictions. Conclusion: offset term in predict.glm() is ignored? I still ask this, because I am a new to R and my procedure to check it may be wrong.

Also, even though similar questions may have been posted, answers are scarce and inconsistent with this result.

set.seed(1)
exp<-c(1:100)/100
dummy<-rep(c(0,1), each=50)
y<-floor(rpois(100,dummy*exp*10)) #count data
DTtest<-as.data.frame(exp)
DTtest$D<-dummy
DTtest$y<-y
gtest<-glm(y~D, offset=log(exp), family=poisson(link="log"), data=DTtest, trace=TRUE)
DTtest1<-predict(gtest, newdata=DTtest, type="response", offset=log(exp)) #predict1
DTtest2<-predict(gtest, newdata=DTtest, type="response", offset=exp) #predict2
DTtest3<-predict(gtest, newdata=DTtest, type="response") #predict3
sum(DTtest1==DTtest2)
sum(DTtest2==DTtest3)
sum(DTtest1==DTtest3)

As a result for the sums I got: 100, 100, 100, which is equal to the length of the data used to predict (I also used DTtest as newdata). This means all predictions have the same value between the 3 approaches.

I was expecting the values to differ due to different impact of the offset explicit in predict(), but the result says otherwise.

Ben Bolker · Accepted Answer · 2024-09-10 20:25:15Z

2

Yes, offset is ignored.

The help package for ?predict.glm says only:

...: further arguments passed to or from other methods.

This is not very helpful. However, if we look at the code for predict.glm we don't see ... used anywhere (it would be passed to another function). (If you insist on a code-based solution, we can search like this:

any(grepl("...", deparse(body(stats:::predict.glm)), fixed = TRUE))
## FALSE

If you wanted to predict with other values of the offset you would set up a newdata data frame with different values of exp. (By the way, it's probably a bad idea to use exp as a variable name since it's the same as the name of the built-in exponential function ... most of the time you can get away with this but occasionally it will cause confusing errors.)

answered Sep 10, 2024 at 20:25

Ben Bolker

230k26 gold badges405 silver badges497 bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

rmcm22 Over a year ago

Thank you! I only used "exp" for this trial, ofc. So, just to clarify, predict.glm() ignores the argument "offset" inside it and uses the collumn in newdata which corresponds to the one used in fitting the model? So, if I used offset=log(Exposure) in fitting, then in predict.glm() my "Exposure" collumn will be used with a log scale? Thanks in advance!

Ben Bolker Over a year ago

yes. (You can always run a short experiment to check ...)

Collectives™ on Stack Overflow

Offsets in predict.glm(), given that the fitted poisson model already considered offsets - R

1 Answer 1

2 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

2 Comments

Your Answer

Sign up or log in

Post as a guest

Related