# Any statisticians out there?

johngti
Posts:

**2,481**
Just in case, what are your thoughts on this summary data? It’s from a piece of coursework.

Pearson’s PMCC at 0.52 (moderate correlation according to the textbook, but only just!)

Spearman’s at 0.38 so poor correlation there

My view - there’s no correlation. Even though Pearson’s is moderate, it’s only just moderate and Spearman’s is bad enough to say that there’s no correlation in the data.

Colleague’s view - moderate correlation in the data but no correlation in the ranks so more likely to have a linear relationship so go ahead with regression line etc.

I’m worried that the second approach just comes across as desperately looking for a correlation that isn’t there for the sake of jumping through a hoop.

Thoughts?


## Posts

**3,872**

I'm glad I didn't do statistics.

The older I get, the better I was.

**2,481**

I think, for reference, that the answer is as follows. The student is looking to see whether more highly paid footballers score more goals, so Pearson’s is the wrong correlation to look at. Because you’re comparing ranks, i.e. the highest paid should score the most goals, it makes more sense to use Spearman’s rank correlation coefficient.

I suspect that’s the approach needed anyway.
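For anyone who wants to see the two coefficients side by side, here is a rough sketch in plain Python. The `wages` and `goals` lists are made-up numbers, not the student's data, and the function names are just for illustration. It treats Spearman's as Pearson's calculated on the ranks, with tied values sharing an average rank.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank from 1 (smallest); tied values get the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: weekly wage (thousands) and goals for eight players.
wages = [10, 12, 15, 18, 20, 25, 30, 200]
goals = [5, 2, 8, 1, 6, 3, 4, 30]

print(f"Pearson r = {pearson(wages, goals):.2f}")
print(f"Spearman rho = {spearman(wages, goals):.2f}")
```

With these invented numbers, one very highly paid, high-scoring player pushes Pearson's above 0.9 while Spearman's is only about 0.29, so a gap between the two can come from a few extreme points and not from a genuinely linear relationship.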

**25,582**

😶

We're in danger of confusing passion with incompetence- @ddraver

**4,844**

The test choice depends on the data distribution (amongst other things).

Neither of your variables has a normal distribution; both are highly skewed, so the data fail at least one of the underlying assumptions of the Pearson test. Think of the p-value as a calibrated system: if the assumptions are not met, the significance value is flawed.

You can test for normality with Shapiro-Wilk, although the bleeding obvious test tells you that neither wages nor goals are normally distributed.
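A quick way to put a number on "highly skewed" is the sample skewness. To be clear, this is not Shapiro-Wilk, just a rough sketch of the eyeball check, and the wage figures are made up for illustration. Symmetric data gives a value near 0; a long right tail gives a clearly positive value.

```python
def sample_skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical weekly wages (thousands): most modest, one very large.
wages = [10, 12, 15, 18, 20, 25, 30, 200]
symmetric = [1, 2, 3, 4, 5]

print(f"wages skewness = {sample_skewness(wages):.2f}")
print(f"symmetric skewness = {sample_skewness(symmetric):.2f}")
```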

Spearman’s does not make an assumption about the distribution, so is appropriate, although less powerful than Pearson with smaller samples (provided the underlying assumptions ARE satisfied).
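If you want a significance value for Spearman’s that does not lean on any distribution assumption, one option is a permutation test: shuffle one variable many times and see how often the shuffled rho is at least as extreme as the observed one. The sketch below is only an illustration; the function names, the 5000 shuffles, the fixed seed and the data are all my own assumptions.

```python
import random
from math import sqrt


def ranks(values):
    """Ranks from 1 upward; tied values share the mean of their ranks."""
    ordered = sorted(values)
    # First position plus half the tie count gives the average rank.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def pearson(x, y):
    """Pearson's r, used here on ranks to get Spearman's rho."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_permutation_p(x, y, n_perm=5000, seed=0):
    """Return (rho, two-sided permutation p-value) for Spearman's rho."""
    rx, ry = ranks(x), ranks(y)
    observed = pearson(rx, ry)
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    shuffled = ry[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise in ties.
        if abs(pearson(rx, shuffled)) >= abs(observed) - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data, not the student's: wages (thousands) and goals.
wages = [10, 12, 15, 18, 20, 25, 30, 200]
goals = [5, 2, 8, 1, 6, 3, 4, 30]

rho, p = spearman_permutation_p(wages, goals)
print(f"rho = {rho:.2f}, permutation p = {p:.3f}")
```

The p-value here is an estimate, so it will shift slightly with a different seed or number of shuffles.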
