this post was submitted on 11 Oct 2023

The R Project for Statistical Computing


P values?
Do they account solely for sampling error (and are therefore irrelevant when population data is available), OR do they serve to assess the likelihood of something being due to chance in other ways (and are therefore relevant for studies with population data)?

Any links or literature are welcome :)

@rstats @phdstudents @datascience @socialscience @org_studies

[–] [email protected] 2 points 1 year ago (6 children)

It really depends on what you mean by "population data". If you mean that you have data on every person (or object, or whatever your research is about) in the population you are interested in, then there is no need for p-values. The mean you calculate IS the actual population mean and there is no room for sampling error (assuming each measurement is correct). If you just mean "a big dataset from the population", then inferential statistics can still make sense.

One thing to consider is that mathematically a t- or z-test always assumes that the population is infinitely large (the width of the confidence interval only reaches zero as the sample size goes to infinity), while in reality, as described above, your confidence interval should already have zero width when your sample size is equal to the actual population size.
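To make that concrete, here is a minimal sketch in Python comparing the usual standard error of a mean with the version that applies the finite population correction for sampling without replacement. The function name, the population size and the standard deviation are made-up values for illustration, not something from a real dataset.

```python
import math

def standard_error(sample_sd, n, population_size=None):
    """Standard error of a sample mean.

    With population_size=None this is the usual infinite-population
    formula s / sqrt(n). With a finite population of size N, sampled
    without replacement, it is multiplied by the finite population
    correction sqrt((N - n) / (N - 1)).
    """
    se = sample_sd / math.sqrt(n)
    if population_size is None:
        return se
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    return se * fpc

N = 1000    # hypothetical population size
sd = 10.0   # hypothetical standard deviation

for n in (10, 100, 500, 1000):
    uncorrected = standard_error(sd, n)
    corrected = standard_error(sd, n, N)
    print(n, round(uncorrected, 3), round(corrected, 3))
```

The uncorrected value keeps shrinking but never reaches zero, while the corrected one is exactly zero once n equals N, which matches the intuition that a full census leaves no sampling error.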

Hope that helps. ;)

[–] [email protected] 0 points 1 year ago (5 children)

@arandomthought
I read some similar comments online, and there were also contrary positions, but I think this makes sense.

And I didn't know about the infinite population thing, that is interesting.

If I may ask a follow-up: p-values aside, regression models and correlation tests can still be interesting to apply to census data to measure effect sizes and such, right?

[–] [email protected] 1 points 1 year ago (1 children)

Sure, even if you had all the data on your whole population (and therefore p-values "wouldn't make sense"), a regression could still tell you something useful about that population. It can, for example, let you estimate how strongly variable X influences variable Y (or at least how strongly they are related; causality is a separate issue), or what value of Y we would expect for someone new in the population with a certain value of X.
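As a rough sketch of that idea in Python: fit a least-squares line to a complete (here invented) population and read the slope and correlation as plain descriptions of that population, with no p-value involved. The data and the helper name `fit_line` are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical "whole population": every unit is observed.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r = fit_line(x, y)
print("slope:", slope)          # change in Y per unit of X in this population
print("correlation:", r)        # strength of the linear relationship

# Expected Y for a new unit with X = 6, assuming the same relationship holds.
print("predicted Y at X=6:", intercept + slope * 6)
```

The slope and correlation here are effect sizes for the population itself. The prediction for a new unit is the one place where some uncertainty comes back in, since that unit was not part of the observed data.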

[–] [email protected] 1 points 1 year ago

@arandomthought that's very helpful, thank you so much!
