In other posts, I have argued (based on what other, smarter people have said) that it is important for us to recruit larger samples, to pre-register hypothesis-driven work, and to openly share study materials wherever possible.
However, elsewhere I (and some other folks) have argued that all of this is a bit moot if we don’t get measurement right. That is, “there is little value in running a high-powered study, in which data are analysed in a transparent, reproducible manner, if we have not effectively measured the variables we are interested in” (Flake & Fried, 2020).
To give an example, let’s imagine we are running a study looking at the link between corpus callosum volume and daily frequency of hallucinations. And let’s say that we have a huge sample, and have ticked off all of the other ‘open science’ boxes. That stuff is a bit pointless if our measure of hallucinations is very imprecise and can’t reliably tell us whether someone hallucinated zero times on a particular day, or multiple times.
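To make that concrete: under classical test theory, an unreliable measure puts a hard ceiling on the correlation we can observe, however big our sample is. Here’s a quick back-of-the-envelope sketch in Python using Spearman’s attenuation formula (all of the numbers are made up for illustration):

```python
import math

# Spearman's (1904) attenuation formula: the correlation we can observe
# between two measures is the true correlation, shrunk by the square root
# of the product of the two measures' reliabilities:
#   r_observed = r_true * sqrt(rel_x * rel_y)

r_true = 0.40              # hypothetical true correlation between the constructs
rel_volume = 0.95          # hypothetical reliability of the volume measure
rel_hallucinations = 0.40  # hypothetical reliability of a noisy hallucination measure

r_observed = r_true * math.sqrt(rel_volume * rel_hallucinations)
print(f"Ceiling on the observable correlation: {r_observed:.2f}")  # ~0.25
```

So even in a huge, perfectly pre-registered study, the noisy hallucination measure shrinks a true correlation of .40 down to roughly .25, and no amount of statistical power can buy that back.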
The measurement problems we have in hallucinations research probably aren’t as stark as that, but hallucinations are inherently difficult to measure (they are private experiences that we can usually only access via self-report), so we do face real measurement challenges. E.g., does it make sense for us to include items about vivid imagery and/or intrusive thoughts on measures of hallucinatory experiences? I don’t really think so, but some people do. So, I think that we may need to look again at the validity of some of our measures.
A separate issue relates to how reliably our tasks measure the variables we are interested in. This is something we typically don’t think about when using a task, but we should. In other parts of psychiatry/clinical psychology, there’s evidence that some tasks do not measure their variable of interest reliably, and it is possible that this is also true in hallucinations research (we provided some evidence on that in this – https://www.tandfonline.com/doi/full/10.1080/13546805.2021.1999224 – article).
To learn more about optimal versus suboptimal measurement practices and to read some suggestions for good practice, read this – https://journals.sagepub.com/doi/full/10.1177/2515245920952393 – paper by Flake and Fried. And to learn more about estimating the reliability of measurements made by cognitive tasks, have a look at this – https://journals.sagepub.com/doi/full/10.1177/2515245919879695 – paper from Parsons and colleagues. That paper also describes how to generate reliability estimates using an R package. But if you (like me) struggle with R, then have a look at this – https://www.sciencedirect.com/science/article/pii/S2590260120300102 – paper, which describes an Excel-based tool that serves the same purpose.
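If you’re curious what’s going on under the hood of those tools, the core idea (permutation-based split-half reliability with a Spearman-Brown correction) is simple enough to sketch. Below is my own rough Python illustration on simulated data; it is not the actual code from either tool, and the data and function name are made up:

```python
import numpy as np

rng = np.random.default_rng(42)

def splithalf_reliability(trials, n_splits=5000):
    """Permutation-based split-half reliability for one task condition.

    trials: array of shape (n_participants, n_trials) holding per-trial
    scores (e.g. reaction times). Assumes complete data, for simplicity.
    Returns the mean Spearman-Brown-corrected split-half correlation.
    """
    n_participants, n_trials = trials.shape
    estimates = np.empty(n_splits)
    for i in range(n_splits):
        # Randomly split each participant's trials into two halves
        order = rng.permutation(n_trials)
        half1 = trials[:, order[:n_trials // 2]].mean(axis=1)
        half2 = trials[:, order[n_trials // 2:]].mean(axis=1)
        r = np.corrcoef(half1, half2)[0, 1]
        # Spearman-Brown correction, because each half has only half the trials
        estimates[i] = (2 * r) / (1 + r)
    return estimates.mean()

# Simulated data: 50 participants x 80 trials = stable true score + trial noise
true_scores = rng.normal(500, 50, size=(50, 1))
data = true_scores + rng.normal(0, 100, size=(50, 80))
print(f"Estimated split-half reliability: {splithalf_reliability(data):.2f}")
```

The permutation step matters because a single arbitrary split (e.g. odd vs. even trials) can over- or under-estimate reliability just by chance; averaging over many random splits gives a much more stable estimate, which is the logic the Parsons et al. paper advocates.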
