Any statisticians out there?

Just in case, what are your thoughts on this summary data? (It’s from a piece of coursework.)

Pearson’s PMCC at 0.52 (moderate correlation according to the textbook, but only just!)

Spearman’s at 0.38 so poor correlation there

My view - there’s no correlation. Even though Pearson’s is moderate, it’s only just moderate and Spearman’s is bad enough to say that there’s no correlation in the data.

Colleague’s view - moderate correlation in the data but no correlation in the ranks, so a linear relationship is more likely, so go ahead with the regression line etc.

I’m worried that the second approach just comes across as desperately looking for a correlation that isn’t there for the sake of jumping through a hoop.

Thoughts?

Comments

  • capt_slog
    capt_slog Posts: 3,952
    Thoughts?

    I'm glad I didn't do statistics.


    The older I get, the better I was.

  • johngti
    johngti Posts: 2,508
    capt_slog said:

    Thoughts?

    I'm glad I didn't do statistics.

    Can’t blame you!

    I think, for reference, that the answer is as follows. The student is looking to see if more highly paid footballers score more goals. So Pearson’s is the wrong correlation to look at. Because you’re comparing ranks, i.e. the highest paid should score more goals, it makes more sense to use Spearman’s rank correlation coefficient.

    I suspect that’s the approach needed anyway.
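    For anyone wanting to see what "comparing ranks" means in practice, here is a minimal sketch, assuming SciPy is available. The wage and goal figures are invented purely for illustration and are not the coursework data. It shows that Spearman's coefficient is just Pearson's coefficient calculated on the ranks of the data rather than on the raw values.

```python
import numpy as np
from scipy import stats

# Hypothetical figures, invented for illustration (NOT the coursework data).
wages = np.array([20, 35, 50, 80, 120, 200, 350, 600], dtype=float)  # assumed weekly wage, in thousands
goals = np.array([1, 3, 2, 5, 4, 8, 7, 12], dtype=float)             # assumed goals in a season

# Pearson's PMCC works on the raw values and measures linear association.
pearson_r, _ = stats.pearsonr(wages, goals)

# Spearman's coefficient works on the ranks and measures monotonic association.
spearman_rho, _ = stats.spearmanr(wages, goals)

# Spearman's is the same as Pearson's applied to the ranked data.
rank_r, _ = stats.pearsonr(stats.rankdata(wages), stats.rankdata(goals))

print(f"Pearson r on raw values: {pearson_r:.3f}")
print(f"Spearman rho:            {spearman_rho:.3f}")
print(f"Pearson r on the ranks:  {rank_r:.3f}")
```

    The last two printed values should agree, which is why a question phrased as "do the highest paid score the most?" points towards Spearman's.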
  • ddraver
    ddraver Posts: 26,405
    Yeah I agree...

    😶
    We're in danger of confusing passion with incompetence
    - @ddraver
  • johngti
    johngti Posts: 2,508
    Excellent. I think.
  • Mad_Malx
    Mad_Malx Posts: 5,018
    johngti said:

    capt_slog said:

    Thoughts?

    I'm glad I didn't do statistics.

    Can’t blame you!

    I think, for reference, that the answer is as follows. The student is looking to see if more highly paid footballers score more goals. So Pearson’s is the wrong correlation to look at. Because you’re comparing ranks, i.e. the highest paid should score more goals, it makes more sense to use Spearman’s rank correlation coefficient.

    I suspect that’s the approach needed anyway.

    Pretty much.

    The test choice depends on the data distribution (amongst other things).
    Neither of your variables has a normal distribution - they are highly skewed, so the data do not satisfy at least one of the underlying assumptions of the Pearson test. Think of the p value as a calibrated system - if the assumptions are not met then the significance value is flawed.
    You can test for normality with Shapiro-Wilk, although the bleeding obvious test tells you that neither wages nor goals are normally distributed.

    Spearman’s does not make an assumption about the distribution, so it is appropriate here, although it is less powerful than Pearson’s with smaller samples (provided Pearson’s underlying assumptions ARE satisfied).
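    A rough sketch of that workflow, assuming SciPy and using simulated data: the lognormal wages, Poisson goals and sample size of 100 are all assumptions made up for the example, not the student's figures. It runs Shapiro-Wilk on each variable first, then reports Spearman's coefficient with its p value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100  # assumed sample size, not taken from the coursework

# Simulated stand-ins for the real data (assumptions for illustration only):
# wages drawn from a right-skewed lognormal, goals from a Poisson count.
wages = rng.lognormal(mean=3.0, sigma=1.0, size=n)
goals = rng.poisson(lam=3.0, size=n).astype(float)

# Shapiro-Wilk: a small p value is evidence against normality.
shapiro_p = {}
for name, values in (("wages", wages), ("goals", goals)):
    w_stat, p_value = stats.shapiro(values)
    shapiro_p[name] = p_value
    print(f"Shapiro-Wilk for {name}: W = {w_stat:.3f}, p = {p_value:.3g}")

# Spearman's makes no normality assumption, so it is the safer choice here.
rho, rho_p = stats.spearmanr(wages, goals)
print(f"Spearman rho = {rho:.3f}, p = {rho_p:.3g}")
```

    If Shapiro-Wilk had given no evidence against normality for both variables, `stats.pearsonr` would be the more powerful option.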
  • johngti
    johngti Posts: 2,508
    Thanks for that. I hadn’t thought of the lack of a normal distribution being a problem (I’m not a natural statistician - avoided it in my degree!)
  • Ben6899
    Ben6899 Posts: 9,686
    I can tell you from simply watching football that there's no correlation.
    Ben

  • johngti
    johngti Posts: 2,508
    Well exactly