
Any statisticians out there?

Just in case there are, what are your thoughts on this summary data (it’s from a piece of coursework)?

Pearson’s PMCC at 0.52 (moderate correlation according to the textbook, but only just!)

Spearman’s at 0.38 so poor correlation there

My view - there’s no correlation. Even though Pearson’s is moderate, it’s only just moderate and Spearman’s is bad enough to say that there’s no correlation in the data.

Colleague’s view - moderate correlation in the data but no correlation in the ranks so more likely to have a linear relationship so go ahead with regression line etc.

I’m worried that the second approach just comes across as desperately looking for a correlation that isn’t there for the sake of jumping through a hoop.

Thoughts?

Posts

  • capt_slog Posts: 3,872
    Thoughts?

    I'm glad I didn't do statistics.


    The older I get, the better I was.

  • johngti Posts: 2,481
    capt_slog said:

    Thoughts?

    I'm glad I didn't do statistics.

    Can’t blame you!

    I think, for reference, that the answer is as follows. The student is looking to see if more highly paid footballers score more goals, so Pearson’s is the wrong correlation to look at. Because you’re comparing ranks, i.e. the highest paid should score the most goals, it makes more sense to use Spearman’s rank correlation coefficient.

    I suspect that’s the approach needed anyway.
  • ddraver Posts: 25,582
    Yeah I agree...

    😶
    We're in danger of confusing passion with incompetence
    - @ddraver
  • johngti Posts: 2,481
    Excellent. I think.
  • Mad_Malx Posts: 4,844
    johngti said:

    capt_slog said:

    Thoughts?

    I'm glad I didn't do statistics.

    Can’t blame you!

    I think, for reference, that the answer is as follows. The student is looking to see if more highly paid footballers score more goals, so Pearson’s is the wrong correlation to look at. Because you’re comparing ranks, i.e. the highest paid should score the most goals, it makes more sense to use Spearman’s rank correlation coefficient.

    I suspect that’s the approach needed anyway.
    Pretty much.

    The test choice depends on the data distribution (amongst other things).
    Neither of your variables has a normal distribution: both are highly skewed, so the data fail at least one of the underlying assumptions of the Pearson test. Think of the p-value as a calibrated system: if the assumptions are not met, then the significance value is flawed.
    You can test for normality with Shapiro-Wilk, although the bleeding obvious test tells you that neither wages nor goals are normally distributed.

    Spearman’s does not make an assumption about the distribution, so is appropriate, although less powerful than Pearson with smaller samples (provided the underlying assumptions ARE satisfied).
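
    For anyone who wants to see the difference in practice, here is a rough sketch in plain Python (standard library only). The wages and goals figures are invented purely for illustration and are NOT the coursework data, and the function names are made up for this example. It computes Pearson's coefficient on the raw values and Spearman's as Pearson's applied to the ranks, which shows one way a skewed variable with a single extreme value can push Pearson well above Spearman.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upwards; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented numbers for illustration only (hypothetical weekly wage in
# thousands, and goals scored). One star player dominates both lists.
wages = [20, 25, 30, 35, 40, 45, 50, 300]
goals = [5, 2, 8, 1, 7, 3, 4, 30]

print(f"Pearson:  {pearson(wages, goals):.2f}")
print(f"Spearman: {spearman(wages, goals):.2f}")
```

    With these made-up numbers Pearson comes out at roughly 0.96 and Spearman at roughly 0.29: the one outlier drags the raw-value coefficient up, while the ranks show little agreement among the other seven players. That is only one possible explanation for a Pearson value sitting above the Spearman value, not a diagnosis of the coursework data.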
  • johngti Posts: 2,481
    Thanks for that. I hadn’t thought of the lack of a normal distribution being a problem (I’m not a natural statistician - avoided it in my degree!)
  • Ben6899 Posts: 9,686
    I can tell you from simply watching football that there's no correlation.
    Ben

    Bikes: Donhou DSS4 Custom | Condor Italia RC | Gios Megalite | Dolan Preffisio | Giant Bowery '76
    Instagram: https://www.instagram.com/ben_h_ppcc/
    Flickr: https://www.flickr.com/photos/[email protected]/
  • johngti Posts: 2,481
    Well exactly