Any statisticians out there?
johngti
Posts: 2,508
Just in case, what are your thoughts on this summary data (it’s from a piece of coursework)?
Pearson’s PMCC at 0.52 (moderate correlation according to the textbook, but only just!)
Spearman’s at 0.38 so poor correlation there
My view - there’s no correlation. Even though Pearson’s is moderate, it’s only just moderate and Spearman’s is bad enough to say that there’s no correlation in the data.
Colleague’s view - moderate correlation in the data but no correlation in the ranks, so the relationship is more likely to be linear, so go ahead with a regression line etc.
I’m worried that the second approach just comes across as desperately looking for a correlation that isn’t there for the sake of jumping through a hoop.
Thoughts?
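For anyone wanting to see how the two coefficients can disagree, here is a minimal sketch in plain Python. The wage and goal figures are invented for illustration (they are not the coursework data), and the function names are my own. It shows one possible mechanism only: a single extreme player can pull Pearson's coefficient up a long way while Spearman's, which only looks at ranks, stays low.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: nine ordinary players and one very highly paid top scorer.
wages = [10, 12, 15, 18, 20, 22, 25, 30, 35, 300]
goals = [5, 1, 4, 0, 3, 6, 2, 1, 3, 30]

r = pearson(wages, goals)
rho = spearman(wages, goals)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

With these made-up numbers Pearson comes out far higher than Spearman, purely because of the one outlier. Whether something similar is happening in the real data can only be judged from a scatter plot of that data.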
Comments
Thoughts?
I'm glad I didn't do statistics.
The older I get, the better I was.
capt_slog said: I'm glad I didn't do statistics.
Can’t blame you!
I think, for reference, that the answer is as follows. The student is looking to see whether more highly paid footballers score more goals, so Pearson’s is the wrong correlation to look at. Because you’re comparing ranks, i.e. the highest paid should score the most goals, it makes more sense to use Spearman’s rank correlation coefficient.
I suspect that’s the approach needed anyway.
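A small sketch of what "comparing ranks" means in practice, again with invented wages and a hypothetical helper name. Spearman's coefficient only ever sees the ordering of the values, so rescaling the wages in any order-preserving way (here, taking logs) leaves the ranks, and therefore the coefficient, unchanged.

```python
from math import log


def ranks(values):
    """Rank distinct values from 1 (smallest) to n (largest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


# Hypothetical wages with no ties; this simple helper does not handle ties.
wages = [35, 10, 300, 22, 18]
log_wages = [log(w) for w in wages]

print(ranks(wages))      # ranks of the raw wages
print(ranks(log_wages))  # ranks after a monotonic transformation
```

Both lines print the same list of ranks, which is why the question "do the higher paid tend to score more?" fits a rank-based measure.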
Excellent. I think.
Pretty much.
The test choice depends on the data distribution (amongst other things).
Neither of your variables has a normal distribution (they are highly skewed), so the data do not satisfy at least one of the underlying assumptions of the Pearson test. Think of the p value as a calibrated system: if the assumptions are not met then the significance value is flawed.
You can test for normality with Shapiro-Wilk, although the bleeding obvious test tells you that neither wages nor goals are normally distributed.
Spearman’s does not make an assumption about the distribution, so it is appropriate, although it is less powerful than Pearson with smaller samples (provided the underlying assumptions ARE satisfied).
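As a rough, informal companion to the "bleeding obvious test", the sketch below computes the sample skewness of a made-up wage list. To be clear, this is not the Shapiro-Wilk test, just a quick descriptive check; the data and the function name are assumptions for illustration. A value near 0 suggests a roughly symmetric distribution, while a large positive value indicates a long right tail.

```python
def skewness(values):
    """Sample skewness g1 = m3 / m2**1.5, using central moments m2 and m3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical wages: most players are bunched together, one earns far more.
wages = [10, 12, 15, 18, 20, 22, 25, 30, 35, 300]
symmetric = [1, 2, 3, 4, 5]

print(f"skewness of wages:     {skewness(wages):.2f}")
print(f"skewness of symmetric: {skewness(symmetric):.2f}")
```

A strongly positive result for the wages would back up the point above: with data like that, the p value from a Pearson test should not be trusted at face value.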
Thanks for that. I hadn’t thought of the lack of a normal distribution being a problem (I’m not a natural statistician - avoided it in my degree!)
I can tell you from simply watching football that there's no correlation.
Ben
Bikes: Donhou DSS4 Custom | Condor Italia RC | Gios Megalite | Dolan Preffisio | Giant Bowery '76
Instagram: https://www.instagram.com/ben_h_ppcc/
Flickr: https://www.flickr.com/photos/143173475@N05/
Well exactly.