Tuesday 10 October 2017

ethics - How to argue against questionable research practices such as P-hacking and Harking?


I have come into conflict with co-authors when being asked to do things that I consider to be questionable.




  1. Once I was told to try every possible specification of a dependent variable (count, proportion, binary indicator, you name it) in a regression until I find a significant relationship. That is it, no justification for choosing one specification over another besides finding significance. The famous fishing expedition for starfish (also known as P-Hacking).





  2. On another occasion I was asked to rewrite the theory section of a paper to reflect an incidental finding from our analysis, so that it would look as if we had been asking a question about that finding all along and had come up with the supported hypothesis a priori. The famous hypothesising after the results are known (HARKing).




In both cases I refused to comply and explained my reasoning, which led to conflicts with the other party. I tried my best not to sound accusatory (not to give the impression that I doubted the ethics of the other party), but it nonetheless led to friction and a worsening of the working relationship. In the long argument that followed, I was told that 'social science is not done as the natural sciences,' that I was 'too inflexible' and 'too positivist,' and that everybody does the things I was being asked to do. The argument culminated with me being asked to 'stop obstructing the progress of the paper,' which left me very frustrated.


Since then I have seen several cases of what I suspect to be this type of research practice. For example, PhD students coming to me to ask about what they should change in their models so that their results come out significant, and people working at the same computer lab as me asking me for the same type of help.


I do consider these things to be seriously questionable from an ethical point of view, and would like to be able to argue against them effectively. However, the other parties are usually experienced researchers or students under the supervision of an experienced researcher. As a young researcher, I feel that I'm at a disadvantage when arguing against them. It is often the case that I'm arguing against the instructions of someone who has more experience, more publications, and, supposedly, more knowledge than I do.


Is this one of those cases where we can't do much but try to be the 'change that we want to bring about,' shut up, and just make sure that we are doing the right things ourselves? Should we speak up more often? If so, are there any good strategies for being more effective and convincing?


p.s. The tag is social sciences because of my field, but I reckon that this happens in other areas as well, and I welcome input from other fields.





EDIT 1: In example 2), at no point did anyone suggest that we would confirm the new hypothesis in a new set of data. The intention was to pretend that we got it right from the outset, which is why I objected.


EDIT 2: Just to be clear, I am aware of the right way of doing these things (e.g. cross-validation, confirmatory analysis in a new dataset, penalising for multiple statistical tests, etc.). This is a question about how to argue that p-hacking and HARKing are not the way to go.


EDIT 3: I was unaware of the strong connotations of the word misconduct. I have edited the question and replaced it with 'questionable research practices'.



Answer



Kenji, for the last few years I have given a continuing education course called Common Mistakes in Using Statistics: Spotting Them and Avoiding Them. I hope that some of the approaches I have taken might be helpful to you in convincing your colleagues that changes are needed.


First, I don't start out saying that things are unethical (although I might get to that eventually). I talk instead about mistakes, misunderstandings, and confusions. I also at some point introduce the idea that "That's the way we've always done things" doesn't make that way correct.


I also use the metaphor of "the game of telephone" that many people played as children: people sit in a circle; one person whispers something into the ear of the person next to them; that person whispers what they hear to the next person, and so on around the circle. The last person says what they heard out loud, and the first person reveals the original phrase. Usually the two are so different that it's funny. Applying the metaphor to statistics teaching: someone genuinely is trying to understand the complex ideas of frequentist statistics; they finally believe they get it, and pass their perceived (but somewhat flawed) understanding on to others; some of the recipients (with good intentions) make more oversimplifications or misinterpretations and pass them on to more people -- and so on down the line. Eventually a seriously flawed version appears in textbooks and becomes standard practice.


The notes for my continuing ed course are freely available at http://www.ma.utexas.edu/users/mks/CommonMistakes2015/commonmistakeshome2015.html. Feel free to use them in any way -- e.g., having an informal discussion seminar using them (or some of them) as background reading might help communicate the ideas. You will note that the first "Common mistake" discussed is "Expecting too much certainty." Indeed that is a fundamental mistake that underlies a lot of what has gone wrong in using statistics. The recommendations given there are a good starting point for helping colleagues begin to see the point of all the other mistakes.


The course website also has links to some online demos that are helpful to some in understanding problems that are often glossed over.
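If a live demonstration would help in a discussion with colleagues, the "try specifications until one is significant" problem can also be simulated in a few lines. The following is only a minimal sketch and is not one of the course demos: the function names, sample sizes and seed are hypothetical, and it assumes each specification is an independent two-group comparison with known variance, where no real effect exists. Real specifications of the same dependent variable are correlated, so the inflation would typically be smaller than shown here, though still above the nominal 5%.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def one_null_test(rng, n):
    """Compare two groups of size n drawn from the SAME N(0, 1) distribution."""
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    diff = sum(a) / n - sum(b) / n
    z = diff / math.sqrt(2 / n)  # assumption: variance known to be 1 in each group
    return two_sided_p(z)


def false_positive_rate(n_specs, n_studies=2000, n=30, alpha=0.05, seed=1):
    """Share of simulated studies where at least one of n_specs null tests has p < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        # The "researcher" stops at the first specification that reaches significance.
        if any(one_null_test(rng, n) < alpha for _ in range(n_specs)):
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    for k in (1, 5, 10):
        simulated = false_positive_rate(k)
        theoretical = 1 - (1 - 0.05) ** k  # holds for independent tests only
        print(f"{k:2d} specifications: simulated {simulated:.3f}, theoretical {theoretical:.3f}")
```

With a single pre-chosen specification the rate of spurious "findings" stays near 5%, while under these assumptions ten tries push it to roughly 40% (1 - 0.95^10), even though there is nothing to find.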



I've also done some blogging on the general theme at http://www.ma.utexas.edu/blogs/mks/. Some of the June 2014 entries are especially relevant.


I hope these suggestions and resources are helpful. Feel free to contact me if you have any questions.

