Talk:Chi-squared test


Some examples please?

The explanation is completely in theoretical terms. I'm trying to understand an article better, and this piece is absolutely no help in doing so. Depaderico 04:11, 22 October 2007 (UTC)[reply]

The specifics of the examples are in the articles linked to, rather than in this article. Certainly more of those could be added, and this article remains somewhat stubby. But if you follow those links, you'll find some examples. Michael Hardy 20:21, 22 October 2007 (UTC)[reply]
I repeat my earlier assertion that this article would be aided by an example. It is, of course, fine to link to examples; however, the article should contain some examples. We are talking about an important statistical method, and this is what people will find when they google it. We should really try to explain it as clearly as possible, so that your average Joe (who has a less than 50% chance of having a good background in statistics) can walk away from this article knowing what a Chi-squared test is. Although I'm not saying that the article does a poor job of explaining the test--no, in fact the article is brilliantly written--, a concrete example is most useful in understanding something of this nature. --Depaderico (talk) 00:20, 1 April 2011 (UTC)[reply]

Accuracy?

Whew. That last paragraph (after the table) is a blow-out non sequitur. Where did the p=0.5485 come from? —Preceding unsigned comment added by 70.79.12.65 (talk) 17:53, 13 January 2008 (UTC)[reply]

I'd like to know that too. I kinda forgot. --RoSeeker (talk) 17:12, 27 January 2008 (UTC)[reply]

Yeah, citing the equation that is used to calculate the 54.85% would be very beneficial. —Preceding unsigned comment added by 132.189.76.18 (talk) 19:46, 22 February 2008 (UTC)[reply]
The above editors are right. Also, I can't make sense of the statement that there is a 54 per cent probability of seeing "this data" if the coin is fair. We all know that if we toss a fair coin 100 times the result could easily be 47-53, but it is not more likely than not that we get exactly 47-53. 48-52 would also come up pretty often. Itsmejudith (talk) 15:43, 12 March 2008 (UTC)[reply]
I fixed the wording so that the interpretation of the ~54% is correct. Baccyak4H (Yak!) 18:22, 12 March 2008 (UTC)[reply]
Clearer now, thanks. Itsmejudith (talk) 08:47, 13 March 2008 (UTC)[reply]
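
For reference, the ~54% figure can be reproduced with a short calculation. This is only a minimal sketch: it assumes the table under discussion showed 47 heads and 53 tails in 100 tosses (the split mentioned above) and a fair-coin null hypothesis, and the variable names are illustrative. With one degree of freedom, the upper-tail probability of the chi-squared distribution reduces to the complementary error function.

```python
import math

# Assumed observed counts (47 heads, 53 tails), taken from the split mentioned in this thread.
observed = [47, 53]
# Expected counts under the null hypothesis of a fair coin.
expected = [50.0, 50.0]

# Pearson statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Under these assumptions the statistic is 0.36 and the tail probability is about 0.5485, so the figure is the probability of a split at least this far from 50-50, not of exactly 47-53.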

Calculating P from Chi2 and DoF

This article seems to be a good introduction, but could use a lot more detail, such as:

  • Another example with >1 DoF (Degrees of Freedom)
  • DoF = (r-1)(c-1)
  • Where do the P-values come from? (how are they computed?)

If anyone knows a formula/algorithm for calculating a P-value from the Chi2 and degrees of freedom, please let me know.

--Karuna8 (talk) 18:14, 17 March 2008 (UTC)[reply]

Thank you. An example taking us through every step of calculating the Chi square of a 2x2 contingency table would seem to be a basic requirement. Itsmejudith (talk) 18:39, 17 March 2008 (UTC)[reply]

Re the above, there's more info at Pearson's chi-squared test. At one point this page (i.e. Chi-squared test) was just a disambiguation page, but it has slowly expanded, I think because it wasn't clear it was just meant to be a disambig page. We could just revert it to being a disambig page, but I've been thinking that, to prevent us going around the same slow circle again, it might be better to move the page currently at Pearson's chi-squared test to Chi-squared test with a note at the top along the lines of:

This article is about Pearson's chi-squared test, the oldest and most common chi-squared test used with contingency tables. For other tests that also make use of the chi-squared distribution, see Chi-squared test (disambiguation).

After all, I don't think there's any question that the vast majority of users who type in "chi-squared test" are looking for Pearson's. Nor is there any historical question that Pearson's paper introduced the use of the symbol chi in this context, so calling it simply "the chi-squared test" seems quite reasonable.

On Karuna8's last point, calculating a P-value from the Chi2 and degrees of freedom requires calculation of the cumulative distribution function of the chi-squared distribution, so more details are at chi-squared distribution, but in a nutshell you need to calculate the incomplete gamma function, which isn't simple. In the past people looked it up in a table, but most people these days use statistical software that has the chi-squared distribution's cdf programmed into it. I wouldn't know how to calculate it or program it from scratch. I'm sure that it's built into various software libraries based on algorithms in the relevant standard textbooks, but I'm afraid I'm equally sure it's not available on Wikipedia, nor would I see adding it as a high priority. Qwfp (talk) 18:47, 17 March 2008 (UTC)[reply]
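
As a rough illustration of the incomplete-gamma route described above, here is a minimal sketch in plain Python. The function name `chi2_sf` and its parameters are hypothetical, not taken from any library. It uses the textbook power series for the regularized lower incomplete gamma function, which is not necessarily what statistical packages do, and it loses accuracy far out in the tail, where the result is a difference of two nearly equal numbers.

```python
import math

def chi2_sf(x, df, tol=1e-12, max_terms=10_000):
    """Upper-tail probability P(X >= x) for a chi-squared variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a = df / 2.0  # shape parameter of the underlying gamma distribution
    z = x / 2.0   # rescaled argument
    # Series for the regularized lower incomplete gamma function:
    #   P(a, z) = z^a * e^(-z) / Gamma(a + 1) * sum_{n >= 0} z^n / ((a + 1) ... (a + n))
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < tol * total:
            break
    # Work in logs for the prefactor to avoid overflow in z^a or Gamma(a + 1).
    log_prefactor = a * math.log(z) - z - math.lgamma(a + 1.0)
    lower = math.exp(log_prefactor) * total
    return max(0.0, 1.0 - lower)

# Compare with familiar 5% table values: 3.841 for 1 DoF and 5.991 for 2 DoF.
print(chi2_sf(3.841, 1), chi2_sf(5.991, 2))
```

Both calls should come out close to 0.05, which is a quick sanity check against a printed table.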

Thanks, I don't think moving is necessary, I just added a 'see also' to the Pearson's page for 'more detail'. That should suffice. Also thanks for the leads, if anyone knows of a written algorithm I could follow, let me know. --Karuna8 (talk) 18:55, 17 March 2008 (UTC)[reply]
Agree. I changed my mind; I didn't quite understand the difference before. Since 'chi2 test' really means Pearson's, I think the two should be merged. This page provides a good introduction and the Pearson's page has the more detailed parts. No disambig page is necessary since Yates is really just a modified Pearson's (my books call it Yates correction, not a different test). --Karuna8 (talk) 15:40, 18 March 2008 (UTC)[reply]
It could do with sorting out. Simplest would seem to be to make Chi-squared test a disambiguation page again and to move all the material currently here that provides an introduction to the Pearson chi-squared to the Pearson's chi-squared test article. As Karuna says, this can stand as the introduction and the more detailed material in the Pearson's article can simply follow. Itsmejudith (talk) 20:49, 18 March 2008 (UTC)[reply]

It's ridiculous to say that "chi-squared test" really means Pearson's. Such a merger is the opposite of what we need to do. Michael Hardy (talk) 16:41, 22 March 2008 (UTC)[reply]

Can you explain the difference then, because I don't see it. --Karuna8 (talk) 17:20, 22 March 2008 (UTC)[reply]

The difference is that Pearson's chi-squared test is used only for testing a null hypothesis about the sizes of the subsets that a population has been partitioned into. If you throw a die, you can get any of six outcomes; a null hypothesis may say the die is "fair", meaning all six happen equally frequently, or, in the language of the previous sentence, that all six of those subsets of the population are of equal sizes. A chi-squared test generally is any statistical test in which the probability distribution of the test statistic, assuming the null hypothesis is true, is a chi-squared distribution. A simple example would be a table in which the null hypothesis says just that rows and columns are independent. That's not Pearson's chi-squared test, but it is a chi-squared test. There are many other chi-squared tests besides Pearson's. Michael Hardy (talk) 17:52, 22 March 2008 (UTC)[reply]
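
The die case described above could be sketched as follows. The counts are made up purely for illustration, and the comparison uses the usual table value for 5 degrees of freedom at the 5% level (11.070) rather than computing a P-value.

```python
# Hypothetical counts for faces 1..6 from 60 throws (made up for illustration).
observed = [8, 12, 9, 11, 13, 7]
n = sum(observed)

# Under the null hypothesis of a fair die, each face is expected n / 6 times.
expected = [n / 6.0] * 6

# Pearson statistic: sum of (observed - expected)^2 / expected.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Six categories with a fixed total leave 5 degrees of freedom.
df = len(observed) - 1

# Table value of the chi-squared distribution, 5 degrees of freedom, upper 5% point.
CRITICAL_5PCT_DF5 = 11.070
reject_fairness = chi2 > CRITICAL_5PCT_DF5

print(chi2, df, reject_fairness)
```

With these invented counts the statistic is about 2.8, well below the critical value, so fairness would not be rejected.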

So would it be better to use a 2x2 contingency table, as I have seen done in introductory stats texts? For example, a company has managerial and non-managerial staff, male and female. The null hypothesis is that men and women are equally likely to be managers. We draw up a 2x2 grid with the numbers of workers actually found in the categories, compare with the expected and calculate the Pearson's chi-squared to see if the null can be excluded. Itsmejudith (talk) 11:40, 1 April 2008 (UTC)[reply]
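
A sketch of the 2x2 calculation described above, step by step. The staff counts are hypothetical (invented for this illustration), no continuity correction such as Yates's is applied, and the P-value uses the one-degree-of-freedom shortcut via the complementary error function.

```python
import math

# Hypothetical staff counts (made up for illustration).
# Rows: men, women. Columns: managers, non-managers.
table = [[30, 70],
         [20, 80]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count in each cell if sex and managerial status were independent:
# (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Sum of (observed - expected)^2 / expected over all four cells.
chi2 = sum(
    (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

# Degrees of freedom: (rows - 1) * (columns - 1) = 1 for a 2x2 table.
df = (len(table) - 1) * (len(table[0]) - 1)

# For 1 degree of freedom, P(X >= chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2.0))

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

For these invented counts the expected cells are 25, 75, 25 and 75, the statistic is about 2.67, and the P-value is roughly 0.10, so the null would not be excluded at the 5% level.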

What more?

Given that this article has been classed as high priority and is still classed as a stub, can people add discussion here of what needs to be done to improve things? Melcombe (talk) 14:13, 21 April 2008 (UTC)[reply]

Enough exemplar material to take a reader who has only vaguely heard of the test through to being able to use it in a real situation. That's my priority, anyway. Itsmejudith (talk) 14:28, 21 April 2008 (UTC)[reply]
And could we use the same 2x2 contingency table as in the contingency table article? Itsmejudith (talk) 14:29, 21 April 2008 (UTC)[reply]
I think the discussion in the section above concludes that there are many different tests that can reasonably be called chi-squared tests, so that "the test" is not quite appropriate. Potentially what is needed here is a set of simple examples of the different tests to help readers to distinguish between them, but leaving most details to other articles. Melcombe (talk) 11:50, 22 April 2008 (UTC)[reply]

I have added a section on testing of the variance of a normal population, and I think that there is now enough to move this article out of the stub class ... so I have done so. Melcombe (talk) 16:21, 4 August 2008 (UTC)[reply]

First Line

A chi-squared test (also chi-squared or χ² test) is any statistical hypothesis test in which the test statistic has a chi-squared distribution when the null hypothesis is true, or any in which the probability distribution of the test statistic (assuming the null hypothesis is true) can be made to approximate a chi-squared distribution as closely as desired by making the sample size large enough.

The first line says that the null hypothesis is true, but that isn't always the case; it is more that it has not been proved otherwise. —Preceding unsigned comment added by 202.89.166.150 (talk) 22:48, 16 November 2008 (UTC)[reply]

Chi-squared test for variance in a normal population

This entire section must be removed / deleted as the author is confusing a normal distribution with that of a chi square distribution in his/her explanation. The normal variance of the Chi-Square Test is based upon that of a chi square distributive function but the explanation points to a normal distribution when the end user clicks on the hyperlink. To tell you guys the absolute truth, this entire wikipage should be re-written from scratch because the definitions and explanations seem to be written from the perspective of someone who is barely familiar with the subject matter shiznaw (talk) 19:31, 2 December 2012 (UTC)shiznaw@gmail.com John Allen Shaw, Econometrics, MA Univ of Utah[reply]


The first sentence

The first sentence is ridiculously complicated! Statistics is very poorly explained on wikipedia, and this is one of the worst examples. — Preceding unsigned comment added by 137.43.182.222 (talk) 17:17, 12 December 2012 (UTC)[reply]

I agree. It needs a simple explanation in one paragraph so the reader gets the point, before giving the technical definition and any further explanation. And yes, examples would be good. I looked up this article to find out about the topic, and it sank so quickly into detail that I had to use what I already know to understand it. (On first impression, I had no idea what it was talking about.)

I also agree ... and does it REALLY mean 'when the null hypothesis is true', i.e. when the non-null hypothesis is false? It is a terrible article for a non-statistician to understand! — Preceding unsigned comment added by 193.34.187.245 (talk) 08:51, 25 September 2018 (UTC)[reply]

Title article

Why is it that the article is titled "chi-squared" test when the initial sentence describes this name as "infrequently used" and talks about the "chi-square" test? Rakimmitt (talk) 15:57, 21 November 2014 (UTC)[reply]

I made the change, but don't know how to change the title since I'm a newbie. "Chi-squared" is not correct or at least is infrequent. I changed it to "chi-square" throughout. juanTamad 09:25, 22 November 2014 (UTC) I made the title change by "moving" the article. juanTamad 09:50, 22 November 2014 (UTC)

chi-squared not chi-square

An anonymous user has made a good point. It should be chi-squared, not chi-square, i.e. chi that has been squared. This page's title has bugged me for a long time. Can we move it to chi-squared test? Tayste (edits) 21:46, 20 March 2015 (UTC)[reply]

This is incorrect. The correct name as per AMA is chi-square. This page name should be changed.--118.238.204.132 (talk) 06:21, 24 May 2016 (UTC)[reply]
The AMA does not have global jurisdiction; it represents just one country. Tayste (edits) 19:33, 26 February 2018 (UTC)[reply]

@Tayste, Jtamad, and Rakimmitt: This issue keeps coming up; now user @Tdivala: has again changed all occurrences of "chi-squared" back to "chi-square". We need a consensus on this, rather than just flipping it around on a whim. -- intgr [talk] 13:04, 26 February 2018 (UTC)[reply]

Thanks for the alert. I've reverted that. Tayste (edits) 19:33, 26 February 2018 (UTC)[reply]
The article title should be 𝜒² test. That's what Pearson called it in his 1900 paper (doi:10.1080/14786440009463897). How you pronounce 𝜒² is entirely up to you. Qwfp (talk) 19:53, 26 February 2018 (UTC)[reply]
If that's what Pearson called it, and because chi has been squared, I agree with squared. I sent a tweet to https://twitter.com/AMAManual?lang=en, see what they say. JuanTamad (talk) 10:13, 30 March 2018 (UTC)[reply]
In the documentation for an R package, it's referred to as chi-squared: "statistic: the value of Friedman's chi-squared statistic."

I think that's typical, so I don't know what is the basis for chi-square with the AMA.JuanTamad (talk) 02:55, 31 March 2018 (UTC)[reply]

The APA style -- which is widely used as a reference for scholarly writing -- appears to be "chi-square test". The logic is probably that the statistic on which the test rests is typically called "chi square" (or the "chi-square statistic"), not "chi squared". (Note the Wikipedia rule WP:OR). Statistics books seem to use it more often without the d. Google Scholar has >1 million hits for "chi-square test" and only 215,000 for "chi-squared test". Calling it 𝜒² test, with appropriate redirections, seems like a fair solution. Strasburger (talk) 15:48, 24 May 2018 (UTC)[reply]

As a holder of a master's degree in Statistics, I can attest that the standard term is "chi-square", not "chi-squared". Every single textbook I used throughout my undergraduate and graduate studies that covered this topic did so without the final "d". This seriously needs to be changed. Bubbha (talk) 03:08, 14 May 2020 (UTC)[reply]

I am also a holder of a master's degree in Statistics. As such, I assert that your particular view from your particular corner of the world does not, in fact, represent the entire world view on the matter. Please put it back the way it was. Tayste (edits) 08:37, 25 May 2020 (UTC)[reply]
I can't find any evidence to suggest that 'chi-squared' is used more than 'chi-square', in any part of the world. Google Trends, Google ngrams, and StackExchange all point to chi-square being used more frequently. The only semi-authoritative source in favor of 'chi-squared' seems to be Wikipedia itself, which is quite misleading. I can't see any good reason for using chi-squared over chi-square. I strongly urge switching to chi-square. Phlaxyr (talk) 03:38, 9 November 2021 (UTC)[reply]
Well I have a PhD in statistics so does that mean I have more authority than you two? Hopefully the fallacy that a degree bestows authority to declare the correct name is obvious enough? That's like a PhD in anatomy saying, "I'm sorry, you're wrong: that's called 'the thigh' not 'the leg,' and since I have a degree in anatomy, I'm right." Similarly, my son has a silly science teacher who rails whenever someone uses the term "dirt" instead of "soil." Fortunately, the anatomist and the science teacher, though they may have degrees in science, are not experts on whether their preferred terms are being applied correctly in general usage. Chafe66 (talk) 22:17, 23 August 2020 (UTC)[reply]

Example chi-squared test for categorical data

In the section "Example chi-squared test for categorical data", one formula reads

(3-1)(4-1)=5

which seems to be wrong, doesn't it?
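
A quick check of the arithmetic, assuming the formula refers to the degrees of freedom of a contingency table with 3 rows and 4 columns (or the reverse); the helper name `contingency_dof` is made up for this sketch.

```python
def contingency_dof(rows: int, cols: int) -> int:
    """Degrees of freedom for an r x c contingency table: (r - 1)(c - 1)."""
    return (rows - 1) * (cols - 1)

# (3 - 1) * (4 - 1) = 2 * 3
print(contingency_dof(3, 4))
```

The product is 6, not 5, so the formula as quoted does look like a slip.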

Needs intro for general audience

The beginning of the article reads like a statistics textbook. Should be simplified/made more accessible so that somebody without background in statistics can understand it. (In a few words/one sentence, what in layman's terms is a "null hypothesis," what is a "chi-squared distribution," etc. I shouldn't have to immediately read 3 or 4 other articles just to get a general idea what this article is about.) 172.58.46.133 (talk) 04:30, 22 June 2019 (UTC)[reply]

The beginning of the article is stupid! The usual meaning of "chi square test" is not "a test statistic where one uses the chi square distribution". Richard Gill (talk) 15:53, 2 January 2020 (UTC)[reply]