inf-428-data-analytics-online

Discussion 2 Statistics

Initial post due February 5th

Replies due February 12th

In your initial post

Answer: Why do we use t-tests and p-values? What is the null hypothesis? How do we reject the null hypothesis?

Then read the following

What is p-hacking? What is the reproducibility crisis? Do some research on the reproducibility crisis and post a link to another article on it.

In your reply

Discuss the articles that the other students have posted in their initial posts.

My thoughts on the ‘reproducibility’ crisis

Just recently I got the following question: “If researcher actually can’t replicate data, causing reproducibility crisis what part of science do we even know?”

Good question.

Even though there is a reproducibility crisis, the scientific literature is still a good place to find information. Acknowledging the reproducibility crisis does not mean you trust other sources over the scientific literature. It means you are skeptical about everything, including the scientific literature, and even more skeptical about data and experiments that have not been peer reviewed.

For example, scientists say there is global warming. Can you trust them? Most likely. Many, many scientists have looked at this and reviewed each other's work. Besides that, the actions scientists recommend (use less fossil fuel) will have to be taken anyway. I think it is OK to be skeptical of everything, as long as you are genuinely skeptical of everything. Don’t fall into the trap of trusting “alternative” sources. For example, say a small group of scientists don’t believe in global warming, and have even published papers in some journals claiming to prove there is no global warming. Should you trust these guys? If there is a reproducibility crisis, it affects them too, and their work may not be reproducible.

A good rule of thumb is that the more people who have looked at a problem, and the longer they have been looking at it, the more trustworthy the literature. Results on a problem that many people have studied for many years are more trustworthy because they have, in fact, been replicated. Results on a new problem, or a niche problem that not many people have looked at, are less trustworthy and more susceptible to replication issues.
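The rule of thumb above can be illustrated with a small simulation. This is only a rough sketch under stated assumptions, not a model of any real field: every simulated study tests an effect that does not exist, uses a simple two-sided z-test with a known standard deviation of 1 (a stand-in for the t-test, chosen so the example needs only the Python standard library), and calls a result significant at the conventional 0.05 threshold. The function names, sample size, and trial count are made up for illustration. The sketch estimates how often a non-existent effect looks significant in a single study, and how often it still looks significant in every one of several independent replications.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05  # conventional significance threshold (an assumption of this sketch)


def p_value_under_null(rng, n=30):
    """Two-sided z-test p-value for one simulated study with no real effect."""
    # Draw n observations from a population whose true mean is 0.
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Standard deviation is known to be 1, so z = sample mean * sqrt(n).
    z = mean(sample) * n ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def false_positive_rate(replications, trials=10000, seed=0):
    """Share of non-existent effects that look significant in every one of
    `replications` independent studies."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if all(p_value_under_null(rng) < ALPHA for _ in range(replications)):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, "independent studies:", false_positive_rate(k))
```

In this idealized setup a single study reports a false positive about 5% of the time, while a false positive that survives two or three independent replications is far rarer. Real research adds complications the sketch ignores, such as p-hacking and studies that are not truly independent.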

BN