"정규 분포"의 두 판 사이의 차이

수학노트
둘러보기로 가기 검색하러 가기
 
(사용자 2명의 중간 판 57개는 보이지 않습니다)
1번째 줄: 1번째 줄:
<h5>간단한 소개</h5>
+
==개요==
  
* 고딩과정의 통계에서는 정규분포의 기본적인 성질과 정규분포표 읽는 방법을 배움.
+
* 고교 과정의 통계에서는 정규분포의 기본적인 성질과 정규분포표 읽는 방법을 배움.
*  정규분포의 확률밀도 함수는 다음과 같음이 알려져 있음.<br><math>\frac{1}{\sigma \sqrt{2\pi} } \exp \left(-\frac{(x-\mu)^2}{2\sigma ^2} \right)</math><br>
+
평균이 <math>\mu</math>, 표준편차가 <math>\sigma</math>인 정규분포의 <math>N(\mu,\sigma^2)</math>의 확률밀도함수, 즉 가우시안은 다음과 같음이 알려져 있음.:<math>\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)</math>
* 이 확률밀도함수가 어떻게 해서 얻어지는가 하는 것은 일반적인 고등학교 수준에서는 약간 무리가 있지만, 호기심이 있는 학생들을 한번 도전해 보는 것이 괜찮아 보임.
+
* 아래에서는 이 확률밀도함수가 어떻게 해서 얻어지는가를 보임.(기본적으로는 가우스의 증명)
* 방법1. 이항분포에 대한 중심극한정리를 통한 방법<br>
+
* 가우시안의 형태를 얻는 또다른 방법으로 [[드무아브르-라플라스 중심극한정리]] 를 참조.
** [http://bomber0.byus.net/index.php/2008/07/06/680 드무아브르의 중심극한정리(i)]
+
 
** [http://bomber0.byus.net/index.php/2008/07/12/686 드무아브르의 중심극한정리(ii) : 스털링이 가져간 영광]
+
 
** [http://bomber0.byus.net/index.php/2008/07/12/687 드무아브르의 중심극한정리(iii) : 숫자 파이와 동전던지기][http://bomber0.byus.net/index.php/2008/07/14/688 ]
+
=='오차의 법칙'을 통한 가우시안의 유도==
** [http://bomber0.byus.net/index.php/2008/07/14/688 드무아브르의 중심극한정리(iv) : 가우시안의 눈부신 등장]
+
 
*  방법2. 가우스의 '오차의 법칙' 을 통한 방법[[1950958/attachments/870482|]]<br>
+
* 오차 = 관측하려는 실제값 - 관측에서 얻어지는 값
** [[1950958/attachments/870482|The law of errors]] ('Excursions in calculus' 206~216p 에서 가져옴)
+
* 오차의 분포를 기술하는 확률밀도함수 <math>\Phi</math>는 다음과 같은 성질을 만족시켜야 함. 1) <math>\Phi(x)=\Phi(-x)</math> 2)작은 오차가 큰 오차보다 더 나타날 확률이 커야한다. 그리고 매우 큰 오차는 나타날 확률이 매우 작아야 한다. 3) <math>\int_{-\infty}^{\infty} \Phi(x)\,dx=1</math> 4) 관측하려는 실제값이 <math>\mu</math> 이고, n 번의 관측을 통해 <math>x_ 1, x_ 2, \cdots, x_n</math> 을 얻을 확률 <math>\Phi(\mu-x_ 1)\Phi(\mu-x_ 2)\cdots\Phi(\mu-x_n)</math>의 최대값은 <math>\mu=\frac{x_ 1+x_ 2+ \cdots+ x_n}{n}</math>에서 얻어진다.
 +
* 4번 조건을 가우스의 산술평균의 법칙이라 부르며, 관측에 있어 실제값이 될 개연성이 가장 높은 값은 관측된 값들의 산술평균이라는 가정을 하는 것임.
 +
 
 +
 
 +
;정리 (가우스)
 +
이 조건들을 만족시키는 확률밀도함수는 <math>\Phi(x)=\frac{h}{\sqrt{\pi}}e^{-h^2x^2}</math> 형태로 주어진다. 여기서 <math>h</math>는 확률의 정확도와 관련된 값임. (실제로는 표준편차와 연관되는 값)
 +
 +
 
 +
;증명
 +
 
 +
<math>n=3</math>인 경우에 4번 조건을 만족시키는 함수를 찾아보자.
 +
 
 +
<math>\Phi(x-x_ 1)\Phi(x-x_ 2)\Phi(x-x_ 3)</math>의 최대값은 <math>x=\frac{x_ 1+x_ 2+ x_ 3}{3}</math> 에서 얻어진다.
 +
 
 +
따라서 <math>\ln \Phi(x-x_ 1)\Phi(x-x_ 2)\Phi(x-x_ 3)</math> 의 최대값도 <math>x=\frac{x_ 1+x_ 2+ x_ 3}{3}</math> 에서 얻어진다.
 +
 
 +
미분적분학의 결과에 의해,  <math>x=\frac{x_ 1+x_ 2+ x_ 3}{3}</math> 이면,  <math>\frac{\Phi'(x-x_ 1)}{\Phi(x-x_ 1)}+\frac{\Phi'(x-x_ 2)}{\Phi(x-x_ 2)}+\frac{\Phi'(x-x_ 3)}{\Phi(x-x_ 3)}=0</math> 이어야 한다.  
 +
 
 +
<math>F(x)=\frac{\Phi'(x)}{\Phi(x)}</math> 으로 두자.
 +
 
 +
<math>x+y+z=0</math> 이면, <math>F(x)+F(y)+F(z)=0</math> 이어야 한다.
 +
 
 +
1번 조건에 의해, <math>F</math> 는 기함수이다.  
 +
 
 +
따라서 모든 <math>x,y</math> 에 의해서, <math>F(x+y)=F(x)+F(y)</math> 가 성립한다. 그러므로 <math>F(x)=Ax</math> 형태로 쓸수 있다.
 +
 
 +
이제 적당한 상수 <math>B, h</math> 에 의해 <math>\Phi(x)=Be^{-h^2x^2}</math> 꼴로 쓸 수 있다.
  
 
+
모든 <math>n</math>에 대하여 4번조건이 만족됨은 쉽게 확인할 수 있다. (증명끝)
  
<h5>재미있는 사실</h5>
+
  
 
+
==역사==
  
 
+
* 중심극한정리는 여러 과정을 거쳐 발전
 +
* 이항분포의 중심극한 정리
 +
** 라플라스의 19세기 초기 버전
 +
확률변수 X가 이항분포 B(n,p)를 따를 때, n이 충분히 크면 X의 분포는 근사적으로 정규분포 N(np,npq)를 따른다
 +
** 드무아브르가 18세기에 발견한 것은 이항분포에서 확률이 1/2인 경우
 +
** [[드무아브르-라플라스 중심극한정리]] 의 유도는 해당 항목을 참조.
 +
* [[수학사 연표]]
  
<h5>관련된 단원</h5>
 
  
 
+
==재미있는 사실==
  
 
+
* 정규분포와 중심극한정리에 대한 이해는 교양인이 알아야 할 수학 주제의 하나
 +
*  Galton's quincunx
 +
** 정규분포의 밀도함수 형태를 물리적으로 얻을 수 있는 장치.
 +
** http://ptrow.com/articles/Galton_June_07.htm
 +
*  예전 독일 마르크화에는 가우스의 발견을 기려 정규분포곡선이 새겨짐[[파일:1950958-Gauss-detail2.jpg]]
  
<h5>관련된 다른 주제들</h5>
+
  
 
 
  
 
+
==관련된 항목들==
 +
* [[열방정식]]
 +
* [[월리스 곱 (Wallis product formula)]]
 +
* [[스털링 공식]]
 +
* [[파이가 아니라 2파이다?]]
 +
* [[벤포드의 법칙]]
 +
* [[최소자승법]]
 +
  
<h5>관련도서 및 추천도서</h5>
+
==계산 리소스==
 +
* [http://www.ruf.rice.edu/%7Elane/stat_sim/normal_approx/index.html 동전던지기 시뮬레이션]
 +
** 자바애플릿
 +
  
 
+
==관련도서==
  
 
+
* Fischer, Hans , History of the Central Limit Theorem : From Laplace to Donsker
 +
* [http://www.amazon.com/History-Statistics-Measurement-Uncertainty-before/dp/067440341X/ref=sr_1_7?ie=UTF8&s=books&qid=1246720061&sr=1-7 The History of Statistics: The Measurement of Uncertainty before 1900]
 +
*  Excursions in calculus
 +
** 206~216p, The law of errors
  
<h5>관련된 고교수학 또는 대학수학</h5>
+
  
 
+
==사전형태의 자료==
 +
* [http://ko.wikipedia.org/wiki/%EC%A4%91%EC%8B%AC%EA%B7%B9%ED%95%9C%EC%A0%95%EB%A6%AC http://ko.wikipedia.org/wiki/중심극한정리]
 +
* [http://ko.wikipedia.org/wiki/%EC%A0%95%EA%B7%9C%EB%B6%84%ED%8F%AC http://ko.wikipedia.org/wiki/정규분포]
 +
* http://en.wikipedia.org/wiki/normal_distribution
 +
* http://en.wikipedia.org/wiki/Central_limit_theorem
 +
* http://viswiki.com/en/central_limit_theorem
 +
* 다음백과사전 [http://enc.daum.net/dic100/search.do?q=%EC%98%A4%EC%B0%A8%EC%9D%98%EB%B2%95%EC%B9%99 http://enc.daum.net/dic100/search.do?q=오차의법칙]
  
 
 
  
<h5>참고할만한 자료</h5>
+
==에세이==
 +
* http://math.stackexchange.com/questions/28558/what-do-pi-and-e-stand-for-in-the-normal-distribution-formula
 +
* Pearson, Karl. "Historical note on the origin of the normal curve of errors." Biometrika (1924): 402-404. http://biomet.oxfordjournals.org/cgi/reprint/16/3-4/402.pdf
 +
  
 
+
==관련기사==
  
* [http://news.naver.com/main/read.nhn?mode=LPOD&mid=etc&oid=042&aid=0000010241 [재미있는 과학이야기] 통계의 기본원리 ② 가우스 분포]<br>
+
* [http://www.hani.co.kr/arti/science/kistiscience/315218.html 과학자들의 진실게임 - 그 법칙은 내꺼야!]
 +
**  과학에서 최초의 발견자와 크레딧 논쟁 사례
 +
**  한겨레, 2008-10-10
 +
* [http://news.naver.com/main/read.nhn?mode=LPOD&mid=etc&oid=042&aid=0000010241 [재미있는 과학이야기] 통계의 기본원리 ② 가우스 분포]
 
** 주간한국, 2008-01-07
 
** 주간한국, 2008-01-07
 +
*  기사 검색 (키워드 수정)
 +
** [http://news.search.naver.com/search.naver?where=news&x=0&y=0&sm=tab_hty&query=%EC%A0%95%EA%B7%9C%EB%B6%84%ED%8F%AC http://news.search.naver.com/search.naver?where=news&x=0&y=0&sm=tab_hty&query=정규분포]
 +
 +
 +
 +
 +
==블로그==
 +
*  피타고라스의 창
 +
** [http://bomber0.byus.net/index.php/2008/07/06/680 드무아브르의 중심극한정리(i)]
 +
** [http://bomber0.byus.net/index.php/2008/07/12/686 드무아브르의 중심극한정리(ii) : 스털링이 가져간 영광]
 +
** [http://bomber0.byus.net/index.php/2008/07/12/687 드무아브르의 중심극한정리(iii) : 숫자 파이와 동전던지기]
 +
** [http://bomber0.byus.net/index.php/2008/07/14/688 드무아브르의 중심극한정리(iv) : 가우시안의 눈부신 등장]
 +
 +
 +
 +
 +
==동영상==
 +
* [http://www.youtube.com/watch?v=9tTHST1sLV8 Quincunx - The Probability Machine]
 +
 +
 +
== 노트 ==
 +
 +
===위키데이터===
 +
* ID :  [https://www.wikidata.org/wiki/Q133871 Q133871]
 +
===말뭉치===
 +
# Normal distribution, also called Gaussian distribution, the most common distribution function for independent, randomly generated variables.<ref name="ref_9e8bfb64">[https://www.britannica.com/topic/normal-distribution normal distribution | Definition, Examples, Graph, & Facts]</ref>
 +
# Read More on This Topic statistics: The normal distribution The most widely used continuous probability distribution in statistics is the normal probability distribution.<ref name="ref_9e8bfb64" />
 +
# normal distribution , sometimes called the bell curve, is a distribution that occurs naturally in many situations.<ref name="ref_a4fee010">[https://www.statisticshowto.com/probability-and-statistics/normal-distributions/ Normal Distributions (Bell Curve): Definition, Word Problems]</ref>
 +
# For example, the bell curve is seen in tests like the SAT and GRE.<ref name="ref_a4fee010" />
 +
# A smaller standard deviation indicates that the data is tightly clustered around the mean; the normal distribution will be taller.<ref name="ref_a4fee010" />
 +
# If the data is evenly distributed, you may come up with a bell curve.<ref name="ref_a4fee010" />
 +
# Everything we do, or almost everything we do in inferential statistics, which is essentially making inferences based on data points, is to some degree based on the normal distribution.<ref name="ref_86114ede">[https://www.khanacademy.org/math/statistics-probability/modeling-distributions-of-data/more-on-normal-distributions/v/introduction-to-the-normal-distribution Normal distribution (Gaussian distribution) (video)]</ref>
 +
# And so what I want to do in this video and in this spreadsheet is to essentially give you as deep an understanding of the normal distribution as possible.<ref name="ref_86114ede" />
 +
# And it actually turns out, for the normal distribution, this isn't an easy thing to evaluate analytically.<ref name="ref_86114ede" />
 +
# and if you were to take the sum of them, as you approach an infinite number of flips, you approach the normal distribution.<ref name="ref_86114ede" />
 +
# You can see a normal distribution being created by random chance!<ref name="ref_d98c6d84">[https://www.mathsisfun.com/data/standard-normal-distribution.html Normal Distribution]</ref>
 +
# From the big bell curve above we see that 0.1% are less.<ref name="ref_d98c6d84" />
 +
# Use the Standard Normal Distribution Table when you want more accurate values.<ref name="ref_d98c6d84" />
 +
# The normal distribution is the most common type of distribution assumed in technical stock market analysis and in other types of statistical analyses.<ref name="ref_6e423141">[https://www.investopedia.com/terms/n/normaldistribution.asp Normal Distribution]</ref>
 +
# The normal distribution model is motivated by the Central Limit Theorem.<ref name="ref_6e423141" />
 +
# Normal distribution is sometimes confused with symmetrical distribution.<ref name="ref_6e423141" />
 +
# The skewness and kurtosis coefficients measure how different a given distribution is from a normal distribution.<ref name="ref_6e423141" />
 +
# The case where μ = 0 and σ = 1 is called the standard normal distribution.<ref name="ref_84962c75">[https://www.itl.nist.gov/div898/handbook/eda/section3/eda3661.htm 1.3.6.6.1. Normal Distribution]</ref>
 +
# The normal distribution is the most important probability distribution in statistics because it fits many natural phenomena.<ref name="ref_5a30d4a5">[https://statisticsbyjim.com/basics/normal-distribution/ Normal Distribution in Statistics]</ref>
 +
# For example, heights, blood pressure, measurement error, and IQ scores follow the normal distribution.<ref name="ref_5a30d4a5" />
 +
# The normal distribution is a probability function that describes how the values of a variable are distributed.<ref name="ref_5a30d4a5" />
 +
# As with any probability distribution, the parameters for the normal distribution define its shape and probabilities entirely.<ref name="ref_5a30d4a5" />
 +
# If a dataset follows a normal distribution, then about 68% of the observations will fall within of the mean , which in this case is with the interval (-1,1).<ref name="ref_b8780e50">[http://www.stat.yale.edu/Courses/1997-98/101/normal.htm The Normal Distribution]</ref>
 +
# Although it may appear as if a normal distribution does not include any values beyond a certain interval, the density is actually positive for all values, .<ref name="ref_b8780e50" />
 +
# The standardized values in the second column and the corresponding normal quantile scores are very similar, indicating that the temperature data seem to fit a normal distribution.<ref name="ref_b8780e50" />
 +
# Let us find the mean and variance of the standard normal distribution.<ref name="ref_8e76b796">[https://www.probabilitycourse.com/chapter4/4_2_3_normal.php Normal random variables]</ref>
 +
# To find the CDF of the standard normal distribution, we need to integrate the PDF function.<ref name="ref_8e76b796" />
 +
# Most of the continuous data values in a normal distribution tend to cluster around the mean, and the further a value is from the mean, the less likely it is to occur.<ref name="ref_46815ddf">[https://www.simplypsychology.org/normal-distribution.html Normal Distribution (Bell Curve)]</ref>
 +
# The normal distribution is often called the bell curve because the graph of its probability density looks like a bell.<ref name="ref_46815ddf" />
 +
# The normal distribution is the most important probability distribution in statistics because many continuous data in nature and psychology displays this bell-shaped curve when compiled and graphed.<ref name="ref_46815ddf" />
 +
# Converting the raw scores of a normal distribution to z-scores We can standardized the values (raw scores) of a normal distribution by converting them into z-scores.<ref name="ref_46815ddf" />
 +
# The diagram above shows the bell shaped curve of a normal (Gaussian) distribution superimposed on a histogram of a sample from a normal distribution.<ref name="ref_1fbead32">[https://www.statsdirect.com/help/distributions/normal.htm Normal Distribution and Standard Normal (Gaussian)]</ref>
 +
# The tail area of the normal distribution is evaluated to 15 decimal places of accuracy using the complement of the error function (Abramowitz and Stegun, 1964; Johnson and Kotz, 1970).<ref name="ref_1fbead32" />
 +
# This guide will show you how to calculate the probability (area under the curve) of a standard normal distribution.<ref name="ref_5f182710">[https://statistics.laerd.com/statistical-guides/normal-distribution-calculations.php How to do Normal Distributions Calculations]</ref>
 +
# It will first show you how to interpret a Standard Normal Distribution Table.<ref name="ref_5f182710" />
 +
# As explained above, the standard normal distribution table only provides the probability for values less than a positive z-value (i.e., z-values on the right-hand side of the mean).<ref name="ref_5f182710" />
 +
# We start by remembering that the standard normal distribution has a total area (probability) equal to 1 and it is also symmetrical about the mean.<ref name="ref_5f182710" />
 +
# The "normal distribution" is the most commonly used distribution in statistics.<ref name="ref_46543148">[https://dietassessmentprimer.cancer.gov/learn/distribution.html Dietary Assessment Primer]</ref>
 +
# To choose the best Box-Cox transformation—the one that best approximates a normal distribution - Box and Cox suggested using the maximum likelihood method.<ref name="ref_46543148" />
 +
# The graph of the normal distribution depends on two factors - the mean and the standard deviation.<ref name="ref_c5c2b6a6">[https://stattrek.com/probability-distributions/normal.aspx Normal Distribution]</ref>
 +
# To find the probability associated with a normal random variable, use a graphing calculator, an online normal distribution calculator, or a normal distribution table.<ref name="ref_c5c2b6a6" />
 +
# In the examples below, we illustrate the use of Stat Trek's Normal Distribution Calculator, a free tool available on this site.<ref name="ref_c5c2b6a6" />
 +
# The normal distribution calculator solves common statistical problems, based on the normal distribution.<ref name="ref_c5c2b6a6" />
 +
# In a normal distribution, data is symmetrically distributed with no skew.<ref name="ref_ae505594">[https://www.scribbr.com/statistics/normal-distribution/ Examples, Formulas, & Uses]</ref>
 +
# Example: Using the empirical rule in a normal distribution You collect SAT scores from students in a new test preparation course.<ref name="ref_ae505594" />
 +
# The data follows a normal distribution with a mean score (M) of 1150 and a standard deviation (SD) of 150.<ref name="ref_ae505594" />
 +
# A random variable with the standard Normal distribution, commonly denoted by \(Z\), has mean zero and standard deviation one.<ref name="ref_92889984">[http://amsi.org.au/ESA_Senior_Years/SeniorTopic4/4f/4f_2content_3.html Normal distribution]</ref>
 +
# The probabilities for any Normal distribution can be reduced to probabilities for the standard Normal distribution, using the device of standardisation.<ref name="ref_92889984" />
 +
# Crowd size Suppose that crowd size at home games for a particular football club follows a Normal distribution with mean \(26\ 000\) and standard deviation 5000.<ref name="ref_92889984" />
 +
# The cdf of any Normal distribution can also be found, using technology, without first standardising.<ref name="ref_92889984" />
 +
# The normal distribution is also useful when sampling data out of a non-normal data set.<ref name="ref_2c3aba79">[https://radiopaedia.org/articles/normal-distribution Radiology Reference Article]</ref>
 +
# A truncated NORMAL distribution can be defined for a variable by setting the desired minimum and/or maximum values for the variable.<ref name="ref_81a3889f">[https://www.rocscience.com/help/slide2/slide_model/probability/Normal_Distribution.htm Normal Distribution]</ref>
 +
# For practical purposes, minimum and maximum values that are at least 3 standard deviations away from the mean generate a complete normal distribution.<ref name="ref_81a3889f" />
 +
# For a Normal distribution, 99.73 % of all samples, will fall within 3 Standard Deviations of the mean value.<ref name="ref_81a3889f" />
 +
# Many other common distributions become like the normal distribution in special cases.<ref name="ref_b6faa27f">[https://www.sciencedirect.com/topics/computer-science/normal-distribution Normal Distribution - an overview]</ref>
 +
# Look at the histograms of lifetimes given in Figure 21.3 and of resistances given in Figure 21.4 and you will see that they resemble the normal distribution.<ref name="ref_b6faa27f" />
 +
# If you were to get a large group of students to measure the diameter of a washer to the nearest 0.1 mm, then a histogram of the results would give an approximately normal distribution.<ref name="ref_b6faa27f" />
 +
# However, there is a problem with the normal distribution function in that is not easy to integrate!<ref name="ref_b6faa27f" />
 +
# The normal distribution is also referred to as Gaussian or Gauss distribution.<ref name="ref_7b422dd0">[https://corporatefinanceinstitute.com/resources/knowledge/other/normal-distribution/ Overview, Parameters, and Properties]</ref>
 +
# In a normal distribution graph, the mean defines the location of the peak, and most of the data points are clustered around the mean.<ref name="ref_7b422dd0" />
 +
# A normal distribution comes with a perfectly symmetrical shape.<ref name="ref_7b422dd0" />
 +
# The middle point of a normal distribution is the point with the maximum frequency, which means that it possesses the most observations of the variable.<ref name="ref_7b422dd0" />
 +
# We will get a normal distribution if there is a true answer for the distance, but as we shoot for this distance, since, to err is human, we are likely to miss the target.<ref name="ref_a00b502e">[https://www.sonoma.edu/users/w/wilsonst/Papers/Normal/default.html The Normal Distribution]</ref>
 +
# We can use the fact that the normal distribution is a probability distribution, and the total area under the curve is 1.<ref name="ref_a00b502e" />
 +
# If you use the normal distribution, the probability comes of to be about 0.728668.<ref name="ref_a00b502e" />
 +
# The minimum variance unbiased estimator (MVUE) is commonly used to estimate the parameters of the normal distribution.<ref name="ref_316bf57b">[https://www.mathworks.com/help/stats/normal-distribution.html Normal Distribution]</ref>
 +
# For an example, see Fit Normal Distribution Object.<ref name="ref_316bf57b" />
 +
# The normal distribution is the most well-known distribution and the most frequently used in statistical theory and applications.<ref name="ref_a1608f17">[https://www.frontiersin.org/articles/267072 Non-normal Distributions Commonly Used in Health, Education, and Social Sciences: A Systematic Review]</ref>
 +
# Any articles that did not specify the type of distribution or which referred to the normal distribution were likewise excluded.<ref name="ref_a1608f17" />
 +
# In stage 2 we eliminated a further 292 abstracts that made no mention of the type of distribution and one which referred to a normal distribution.<ref name="ref_a1608f17" />
 +
# Before introducing the normal distribution, we first look at two important concepts: the Central limit theorem, and the concept of independence.<ref name="ref_de83d748">[https://learnche.org/pid/univariate-review/normal-distribution-and-checking-for-normality 2.8. Normal distribution — Process Improvement using Data]</ref>
 +
# The Central limit theorem plays an important role in the theory of probability and in the derivation of the normal distribution.<ref name="ref_de83d748" />
 +
# As one sees from the above figures, the distribution from these averages quickly takes the shape of the so-called normal distribution.<ref name="ref_de83d748" />
 +
# You might still find yourself having to refer to tables of cumulative area under the normal distribution, instead of using the pnorm() function (for example in a test or exam).<ref name="ref_de83d748" />
 +
# The normal distribution is a continuous, univariate, symmetric, unbounded, unimodal and bell-shaped probability distribution.<ref name="ref_34c908a6">[https://wiki.analytica.com/index.php?title=Normal_distribution Normal distribution]</ref>
 +
# Use this to describe a quantity that has a normal normal distribution with the given «mean» and standard deviation «stddev».<ref name="ref_34c908a6" />
 +
# Suppose you want to fit a Normal distribution to historical data.<ref name="ref_34c908a6" />
 +
# The normal distribution holds an honored role in probability and statistics, mostly because of the central limit theorem, one of the fundamental theorems that forms a bridge between the two subjects.<ref name="ref_3ac6d8c6">[https://www.randomservices.org/random/special/Normal.html The Normal Distribution]</ref>
 +
# In addition, as we will see, the normal distribution has many nice mathematical properties.<ref name="ref_3ac6d8c6" />
 +
# In the Special Distribution Simulator, select the normal distribution and keep the default settings.<ref name="ref_3ac6d8c6" />
 +
# In the special distribution calculator, select the normal distribution and keep the default settings.<ref name="ref_3ac6d8c6" />
 +
# Indeed it is so common, that people often know it as the normal curve or normal distribution, shown in Figure \(\PageIndex{1}\).<ref name="ref_7f395a29">[https://stats.libretexts.org/Bookshelves/Introductory_Statistics/Book%3A_OpenIntro_Statistics_(Diez_et_al)./03%3A_Distributions_of_Random_Variables/3.01%3A_Normal_Distribution 3.1: Normal Distribution]</ref>
 +
# It is also known as the Gaussian distribution after Frederic Gauss, the first person to formalize its mathematical expression.<ref name="ref_7f395a29" />
 +
# The normal distribution model always describes a symmetric, unimodal, bell shaped curve.<ref name="ref_7f395a29" />
 +
# Specifically, the normal distribution model can be adjusted using two parameters: mean and standard deviation.<ref name="ref_7f395a29" />
 +
# The normal or Gaussian distribution is extremely important in statistics, in part because it shows up all the time in nature.<ref name="ref_e54d5f95">[https://bookdown.org/cquirk/LetsExploreStatistics/lets-explore-the-normal-distribution.html 1 Let’s Explore the Normal Distribution]</ref>
 +
# The standard normal is defined as a normal distribution with μ = 0 and σ = 1.<ref name="ref_e54d5f95" />
 +
# You probably have explored the normal distribution before.<ref name="ref_e54d5f95" />
 +
# Below, you can adjust the parameters of the normal distribution and compare it to the standard normal.<ref name="ref_e54d5f95" />
 +
# For normally distributed vectors, see Multivariate normal distribution .<ref name="ref_413e2467">[https://en.wikipedia.org/wiki/Normal_distribution Normal distribution]</ref>
 +
# The simplest case of a normal distribution is known as the standard normal distribution.<ref name="ref_413e2467" />
 +
# Authors differ on which normal distribution should be called the "standard" one.<ref name="ref_413e2467" />
 +
# σ Z + μ {\displaystyle X=\sigma Z+\mu } will have a normal distribution with expected value μ {\displaystyle \mu } and standard deviation σ {\displaystyle \sigma } .<ref name="ref_413e2467" />
 +
# The so-called "standard normal distribution" is given by taking and in a general normal distribution.<ref name="ref_2c4ffbaa">[https://mathworld.wolfram.com/NormalDistribution.html Normal Distribution -- from Wolfram MathWorld]</ref>
 +
# This theorem states that the mean of any set of variates with any distribution having a finite mean and variance tends to the normal distribution.<ref name="ref_2c4ffbaa" />
 +
===소스===
 +
<references />
 +
 +
==메타데이터==
 +
===위키데이터===
 +
* ID :  [https://www.wikidata.org/wiki/Q133871 Q133871]
 +
===Spacy 패턴 목록===
 +
* [{'LOWER': 'normal'}, {'LEMMA': 'distribution'}]
 +
* [{'LOWER': 'gaussian'}, {'LEMMA': 'distribution'}]
 +
* [{'LOWER': 'bell'}, {'LEMMA': 'curve'}]

2021년 2월 17일 (수) 04:58 기준 최신판

개요

  • 고교 과정의 통계에서는 정규분포의 기본적인 성질과 정규분포표 읽는 방법을 배움.
  • 평균이 \(\mu\), 표준편차가 \(\sigma\)인 정규분포의 \(N(\mu,\sigma^2)\)의 확률밀도함수, 즉 가우시안은 다음과 같음이 알려져 있음.\[\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\]
  • 아래에서는 이 확률밀도함수가 어떻게 해서 얻어지는가를 보임.(기본적으로는 가우스의 증명)
  • 가우시안의 형태를 얻는 또다른 방법으로 드무아브르-라플라스 중심극한정리 를 참조.


'오차의 법칙'을 통한 가우시안의 유도

  • 오차 = 관측하려는 실제값 - 관측에서 얻어지는 값
  • 오차의 분포를 기술하는 확률밀도함수 \(\Phi\)는 다음과 같은 성질을 만족시켜야 함. 1) \(\Phi(x)=\Phi(-x)\) 2)작은 오차가 큰 오차보다 더 나타날 확률이 커야한다. 그리고 매우 큰 오차는 나타날 확률이 매우 작아야 한다. 3) \(\int_{-\infty}^{\infty} \Phi(x)\,dx=1\) 4) 관측하려는 실제값이 \(\mu\) 이고, n 번의 관측을 통해 \(x_ 1, x_ 2, \cdots, x_n\) 을 얻을 확률 \(\Phi(\mu-x_ 1)\Phi(\mu-x_ 2)\cdots\Phi(\mu-x_n)\)의 최대값은 \(\mu=\frac{x_ 1+x_ 2+ \cdots+ x_n}{n}\)에서 얻어진다.
  • 4번 조건을 가우스의 산술평균의 법칙이라 부르며, 관측에 있어 실제값이 될 개연성이 가장 높은 값은 관측된 값들의 산술평균이라는 가정을 하는 것임.


정리 (가우스)

이 조건들을 만족시키는 확률밀도함수는 \(\Phi(x)=\frac{h}{\sqrt{\pi}}e^{-h^2x^2}\) 형태로 주어진다. 여기서 \(h\)는 확률의 정확도와 관련된 값임. (실제로는 표준편차와 연관되는 값)


증명

\(n=3\)인 경우에 4번 조건을 만족시키는 함수를 찾아보자.

\(\Phi(x-x_ 1)\Phi(x-x_ 2)\Phi(x-x_ 3)\)의 최대값은 \(x=\frac{x_ 1+x_ 2+ x_ 3}{3}\) 에서 얻어진다.

따라서 \(\ln \Phi(x-x_ 1)\Phi(x-x_ 2)\Phi(x-x_ 3)\) 의 최대값도 \(x=\frac{x_ 1+x_ 2+ x_ 3}{3}\) 에서 얻어진다.

미분적분학의 결과에 의해, \(x=\frac{x_ 1+x_ 2+ x_ 3}{3}\) 이면, \(\frac{\Phi'(x-x_ 1)}{\Phi(x-x_ 1)}+\frac{\Phi'(x-x_ 2)}{\Phi(x-x_ 2)}+\frac{\Phi'(x-x_ 3)}{\Phi(x-x_ 3)}=0\) 이어야 한다.

\(F(x)=\frac{\Phi'(x)}{\Phi(x)}\) 으로 두자.

\(x+y+z=0\) 이면, \(F(x)+F(y)+F(z)=0\) 이어야 한다.

1번 조건에 의해, \(F\) 는 기함수이다.

따라서 모든 \(x,y\) 에 의해서, \(F(x+y)=F(x)+F(y)\) 가 성립한다. 그러므로 \(F(x)=Ax\) 형태로 쓸수 있다.

이제 적당한 상수 \(B, h\) 에 의해 \(\Phi(x)=Be^{-h^2x^2}\) 꼴로 쓸 수 있다.

모든 \(n\)에 대하여 4번조건이 만족됨은 쉽게 확인할 수 있다. (증명끝)


역사

  • 중심극한정리는 여러 과정을 거쳐 발전
  • 이항분포의 중심극한 정리
    • 라플라스의 19세기 초기 버전

확률변수 X가 이항분포 B(n,p)를 따를 때, n이 충분히 크면 X의 분포는 근사적으로 정규분포 N(np,npq)를 따른다


재미있는 사실

  • 정규분포와 중심극한정리에 대한 이해는 교양인이 알아야 할 수학 주제의 하나
  • Galton's quincunx
  • 예전 독일 마르크화에는 가우스의 발견을 기려 정규분포곡선이 새겨짐1950958-Gauss-detail2.jpg



관련된 항목들


계산 리소스


관련도서


사전형태의 자료


에세이


관련기사



블로그



동영상


노트

위키데이터

말뭉치

  1. Normal distribution, also called Gaussian distribution, the most common distribution function for independent, randomly generated variables.[1]
  2. Read More on This Topic statistics: The normal distribution The most widely used continuous probability distribution in statistics is the normal probability distribution.[1]
  3. normal distribution , sometimes called the bell curve, is a distribution that occurs naturally in many situations.[2]
  4. For example, the bell curve is seen in tests like the SAT and GRE.[2]
  5. A smaller standard deviation indicates that the data is tightly clustered around the mean; the normal distribution will be taller.[2]
  6. If the data is evenly distributed, you may come up with a bell curve.[2]
  7. Everything we do, or almost everything we do in inferential statistics, which is essentially making inferences based on data points, is to some degree based on the normal distribution.[3]
  8. And so what I want to do in this video and in this spreadsheet is to essentially give you as deep an understanding of the normal distribution as possible.[3]
  9. And it actually turns out, for the normal distribution, this isn't an easy thing to evaluate analytically.[3]
  10. and if you were to take the sum of them, as you approach an infinite number of flips, you approach the normal distribution.[3]
  11. You can see a normal distribution being created by random chance![4]
  12. From the big bell curve above we see that 0.1% are less.[4]
  13. Use the Standard Normal Distribution Table when you want more accurate values.[4]
  14. The normal distribution is the most common type of distribution assumed in technical stock market analysis and in other types of statistical analyses.[5]
  15. The normal distribution model is motivated by the Central Limit Theorem.[5]
  16. Normal distribution is sometimes confused with symmetrical distribution.[5]
  17. The skewness and kurtosis coefficients measure how different a given distribution is from a normal distribution.[5]
  18. The case where μ = 0 and σ = 1 is called the standard normal distribution.[6]
  19. The normal distribution is the most important probability distribution in statistics because it fits many natural phenomena.[7]
  20. For example, heights, blood pressure, measurement error, and IQ scores follow the normal distribution.[7]
  21. The normal distribution is a probability function that describes how the values of a variable are distributed.[7]
  22. As with any probability distribution, the parameters for the normal distribution define its shape and probabilities entirely.[7]
  23. If a dataset follows a normal distribution, then about 68% of the observations will fall within of the mean , which in this case is with the interval (-1,1).[8]
  24. Although it may appear as if a normal distribution does not include any values beyond a certain interval, the density is actually positive for all values, .[8]
  25. The standardized values in the second column and the corresponding normal quantile scores are very similar, indicating that the temperature data seem to fit a normal distribution.[8]
  26. Let us find the mean and variance of the standard normal distribution.[9]
  27. To find the CDF of the standard normal distribution, we need to integrate the PDF function.[9]
  28. Most of the continuous data values in a normal distribution tend to cluster around the mean, and the further a value is from the mean, the less likely it is to occur.[10]
  29. The normal distribution is often called the bell curve because the graph of its probability density looks like a bell.[10]
  30. The normal distribution is the most important probability distribution in statistics because many continuous data in nature and psychology displays this bell-shaped curve when compiled and graphed.[10]
  31. Converting the raw scores of a normal distribution to z-scores We can standardized the values (raw scores) of a normal distribution by converting them into z-scores.[10]
  32. The diagram above shows the bell shaped curve of a normal (Gaussian) distribution superimposed on a histogram of a sample from a normal distribution.[11]
  33. The tail area of the normal distribution is evaluated to 15 decimal places of accuracy using the complement of the error function (Abramowitz and Stegun, 1964; Johnson and Kotz, 1970).[11]
  34. This guide will show you how to calculate the probability (area under the curve) of a standard normal distribution.[12]
  35. It will first show you how to interpret a Standard Normal Distribution Table.[12]
  36. As explained above, the standard normal distribution table only provides the probability for values less than a positive z-value (i.e., z-values on the right-hand side of the mean).[12]
  37. We start by remembering that the standard normal distribution has a total area (probability) equal to 1 and it is also symmetrical about the mean.[12]
  38. The "normal distribution" is the most commonly used distribution in statistics.[13]
  39. To choose the best Box-Cox transformation—the one that best approximates a normal distribution - Box and Cox suggested using the maximum likelihood method.[13]
  40. The graph of the normal distribution depends on two factors - the mean and the standard deviation.[14]
  41. To find the probability associated with a normal random variable, use a graphing calculator, an online normal distribution calculator, or a normal distribution table.[14]
  42. In the examples below, we illustrate the use of Stat Trek's Normal Distribution Calculator, a free tool available on this site.[14]
  43. The normal distribution calculator solves common statistical problems, based on the normal distribution.[14]
  44. In a normal distribution, data is symmetrically distributed with no skew.[15]
  45. Example: Using the empirical rule in a normal distribution You collect SAT scores from students in a new test preparation course.[15]
  46. The data follows a normal distribution with a mean score (M) of 1150 and a standard deviation (SD) of 150.[15]
  47. A random variable with the standard Normal distribution, commonly denoted by \(Z\), has mean zero and standard deviation one.[16]
  48. The probabilities for any Normal distribution can be reduced to probabilities for the standard Normal distribution, using the device of standardisation.[16]
  49. Crowd size Suppose that crowd size at home games for a particular football club follows a Normal distribution with mean \(26\ 000\) and standard deviation 5000.[16]
  50. The cdf of any Normal distribution can also be found, using technology, without first standardising.[16]
  51. The normal distribution is also useful when sampling data out of a non-normal data set.[17]
  52. A truncated NORMAL distribution can be defined for a variable by setting the desired minimum and/or maximum values for the variable.[18]
  53. For practical purposes, minimum and maximum values that are at least 3 standard deviations away from the mean generate a complete normal distribution.[18]
  54. For a Normal distribution, 99.73 % of all samples, will fall within 3 Standard Deviations of the mean value.[18]
  55. Many other common distributions become like the normal distribution in special cases.[19]
  56. Look at the histograms of lifetimes given in Figure 21.3 and of resistances given in Figure 21.4 and you will see that they resemble the normal distribution.[19]
  57. If you were to get a large group of students to measure the diameter of a washer to the nearest 0.1 mm, then a histogram of the results would give an approximately normal distribution.[19]
  58. However, there is a problem with the normal distribution function in that is not easy to integrate![19]
  59. The normal distribution is also referred to as Gaussian or Gauss distribution.[20]
  60. In a normal distribution graph, the mean defines the location of the peak, and most of the data points are clustered around the mean.[20]
  61. A normal distribution comes with a perfectly symmetrical shape.[20]
  62. The middle point of a normal distribution is the point with the maximum frequency, which means that it possesses the most observations of the variable.[20]
  63. We will get a normal distribution if there is a true answer for the distance, but as we shoot for this distance, since, to err is human, we are likely to miss the target.[21]
  64. We can use the fact that the normal distribution is a probability distribution, and the total area under the curve is 1.[21]
  65. If you use the normal distribution, the probability comes of to be about 0.728668.[21]
  66. The minimum variance unbiased estimator (MVUE) is commonly used to estimate the parameters of the normal distribution.[22]
  67. For an example, see Fit Normal Distribution Object.[22]
  68. The normal distribution is the most well-known distribution and the most frequently used in statistical theory and applications.[23]
  69. Any articles that did not specify the type of distribution or which referred to the normal distribution were likewise excluded.[23]
  70. In stage 2 we eliminated a further 292 abstracts that made no mention of the type of distribution and one which referred to a normal distribution.[23]
  71. Before introducing the normal distribution, we first look at two important concepts: the Central limit theorem, and the concept of independence.[24]
  72. The Central limit theorem plays an important role in the theory of probability and in the derivation of the normal distribution.[24]
  73. As one sees from the above figures, the distribution from these averages quickly takes the shape of the so-called normal distribution.[24]
  74. You might still find yourself having to refer to tables of cumulative area under the normal distribution, instead of using the pnorm() function (for example in a test or exam).[24]
  75. The normal distribution is a continuous, univariate, symmetric, unbounded, unimodal and bell-shaped probability distribution.[25]
  76. Use this to describe a quantity that has a normal normal distribution with the given «mean» and standard deviation «stddev».[25]
  77. Suppose you want to fit a Normal distribution to historical data.[25]
  78. The normal distribution holds an honored role in probability and statistics, mostly because of the central limit theorem, one of the fundamental theorems that forms a bridge between the two subjects.[26]
  79. In addition, as we will see, the normal distribution has many nice mathematical properties.[26]
  80. In the Special Distribution Simulator, select the normal distribution and keep the default settings.[26]
  81. In the special distribution calculator, select the normal distribution and keep the default settings.[26]
  82. Indeed it is so common, that people often know it as the normal curve or normal distribution, shown in Figure \(\PageIndex{1}\).[27]
  83. It is also known as the Gaussian distribution after Frederic Gauss, the first person to formalize its mathematical expression.[27]
  84. The normal distribution model always describes a symmetric, unimodal, bell shaped curve.[27]
  85. Specifically, the normal distribution model can be adjusted using two parameters: mean and standard deviation.[27]
  86. The normal or Gaussian distribution is extremely important in statistics, in part because it shows up all the time in nature.[28]
  87. The standard normal is defined as a normal distribution with μ = 0 and σ = 1.[28]
  88. You probably have explored the normal distribution before.[28]
  89. Below, you can adjust the parameters of the normal distribution and compare it to the standard normal.[28]
  90. For normally distributed vectors, see Multivariate normal distribution .[29]
  91. The simplest case of a normal distribution is known as the standard normal distribution.[29]
  92. Authors differ on which normal distribution should be called the "standard" one.[29]
  93. σ Z + μ {\displaystyle X=\sigma Z+\mu } will have a normal distribution with expected value μ {\displaystyle \mu } and standard deviation σ {\displaystyle \sigma } .[29]
  94. The so-called "standard normal distribution" is given by taking and in a general normal distribution.[30]
  95. This theorem states that the mean of any set of variates with any distribution having a finite mean and variance tends to the normal distribution.[30]

소스

  1. 1.0 1.1 normal distribution | Definition, Examples, Graph, & Facts
  2. 2.0 2.1 2.2 2.3 Normal Distributions (Bell Curve): Definition, Word Problems
  3. 3.0 3.1 3.2 3.3 Normal distribution (Gaussian distribution) (video)
  4. 4.0 4.1 4.2 Normal Distribution
  5. 5.0 5.1 5.2 5.3 Normal Distribution
  6. 1.3.6.6.1. Normal Distribution
  7. 7.0 7.1 7.2 7.3 Normal Distribution in Statistics
  8. 8.0 8.1 8.2 The Normal Distribution
  9. 9.0 9.1 Normal random variables
  10. 10.0 10.1 10.2 10.3 Normal Distribution (Bell Curve)
  11. 11.0 11.1 Normal Distribution and Standard Normal (Gaussian)
  12. 12.0 12.1 12.2 12.3 How to do Normal Distributions Calculations
  13. 13.0 13.1 Dietary Assessment Primer
  14. 14.0 14.1 14.2 14.3 Normal Distribution
  15. 15.0 15.1 15.2 Examples, Formulas, & Uses
  16. 16.0 16.1 16.2 16.3 Normal distribution
  17. Radiology Reference Article
  18. 18.0 18.1 18.2 Normal Distribution
  19. 19.0 19.1 19.2 19.3 Normal Distribution - an overview
  20. 20.0 20.1 20.2 20.3 Overview, Parameters, and Properties
  21. 21.0 21.1 21.2 The Normal Distribution
  22. 22.0 22.1 Normal Distribution
  23. 23.0 23.1 23.2 Non-normal Distributions Commonly Used in Health, Education, and Social Sciences: A Systematic Review
  24. 24.0 24.1 24.2 24.3 2.8. Normal distribution — Process Improvement using Data
  25. 25.0 25.1 25.2 Normal distribution
  26. 26.0 26.1 26.2 26.3 The Normal Distribution
  27. 27.0 27.1 27.2 27.3 3.1: Normal Distribution
  28. 28.0 28.1 28.2 28.3 1 Let’s Explore the Normal Distribution
  29. 29.0 29.1 29.2 29.3 Normal distribution
  30. 30.0 30.1 Normal Distribution -- from Wolfram MathWorld

메타데이터

위키데이터

Spacy 패턴 목록

  • [{'LOWER': 'normal'}, {'LEMMA': 'distribution'}]
  • [{'LOWER': 'gaussian'}, {'LEMMA': 'distribution'}]
  • [{'LOWER': 'bell'}, {'LEMMA': 'curve'}]