The growing number of papers finding cases of the Clever Hans effect raises important questions for NLP research, the most obvious one being how the effect can be prevented.
To prevent the Clever Hans effect, we need to aim for datasets without spurious patterns, and we need to assume that a well-performing model hasn't learned anything useful until proven otherwise. In what follows, we first make suggestions for improving datasets, and then for improving models.
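As a toy illustration of what "spurious patterns" can look like, the sketch below audits a tiny, made-up NLI-style dataset for single tokens that always co-occur with the same label (the data, cue words, and threshold are all hypothetical, not from any real dataset):

```python
from collections import Counter, defaultdict

# Toy hypothesis/label pairs. In a real audit you would load an
# actual dataset; these examples are purely illustrative.
data = [
    ("the man is not sleeping", "contradiction"),
    ("nobody is outside", "contradiction"),
    ("a dog is running", "entailment"),
    ("a person is outdoors", "entailment"),
    ("the animal is moving", "neutral"),
    ("someone is not happy", "contradiction"),
]

def cue_label_stats(data):
    """Count, for each token, how often it co-occurs with each label."""
    stats = defaultdict(Counter)
    for text, label in data:
        for token in set(text.split()):
            stats[token][label] += 1
    return stats

stats = cue_label_stats(data)

# Flag tokens that occur at least twice and always with the same label.
for token, counts in sorted(stats.items()):
    total = sum(counts.values())
    label, freq = counts.most_common(1)[0]
    if total >= 2 and freq == total:
        print(f"'{token}' -> always '{label}' ({total} occurrences)")
```

On this toy data the check flags "not" as a perfect predictor of the contradiction label; a model that merely learns such a cue would score well without doing any real inference, which is exactly the failure mode to guard against.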
