Yesterday, I came across a paper titled “What are the best covariates for predicting Y?”¹ Having a question as a title is tricky; sometimes it’s good, sometimes it isn’t. Joshua Schimel wrote an excellent guideline² on this topic. Based on that guideline, I would like to analyse this title in this post.
As I read this title, I imagined myself as a professor (my aspiration, after all) whose student came up to me and asked this exact question. I would think to myself, “Best in terms of what?” The question is not specific enough. No single set of covariates can be the best in all situations. So, I would have to ask the student to be more specific. I might have to list several situations and say which set of covariates is generally considered best for each one and why. I would then have to conclude that this is the state of the art at the moment, and that more research is needed whenever a new situation comes up.
When a paper’s title asks a question, the reader expects the body of the paper to answer it. But in this case, most readers may react much as I did. They may think that the paper had better qualify its claims (as above). This means the reader is prompted to be careful and skeptical. In other words, readers know they will be disappointed: the original question will not be answered, only a more specific one.
So, why not ask a more specific question from the beginning? Judging from the abstract, a more suitable question would be: which covariate is good to use in which circumstance? That is what the authors did in the paper. They considered 6 covariates in 3 classes, tried different combinations, and assessed goodness of fit. They concluded that one class of covariates was the best for one type of model, another class was the best for another type of model, and the third class was never the best. They did not give a best overall combination.
- Out of respect for the authors, I did not show the actual title, but a paraphrase. Schimel also uses this trick in his book Writing Science.