# Normal distribution


## Overview

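• High-school statistics covers the basic properties of the normal distribution and how to read a normal distribution table.
• The probability density function of the normal distribution $$N(\mu,\sigma^2)$$ with mean $$\mu$$ and standard deviation $$\sigma$$, i.e. the Gaussian, is known to be $$\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$.
• The sections below show how this density function is obtained (essentially Gauss's proof).
• For another way to arrive at the Gaussian form, see the de Moivre-Laplace central limit theorem.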
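The density formula can be sanity-checked numerically. The sketch below uses only the Python standard library; the helper names `normal_pdf` and `integrate` and the values of `mu` and `sigma` are illustrative assumptions, not part of the original text. It checks that the total area is close to 1 and that roughly 68% of the mass lies within one standard deviation of the mean.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(f, a, b, steps=20000):
    """Composite trapezoidal rule for f on [a, b]."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * width)
    return total * width

# Illustrative parameters (assumed for this example only).
mu, sigma = 1.0, 2.0

def density(x):
    return normal_pdf(x, mu, sigma)

# Area over mu +/- 10 sigma stands in for the integral over the whole real line.
total_area = integrate(density, mu - 10 * sigma, mu + 10 * sigma)
# Probability mass within one standard deviation of the mean.
one_sigma_mass = integrate(density, mu - sigma, mu + sigma)

print(total_area, one_sigma_mass)
```

Truncating the integral at ten standard deviations is a practical shortcut; the tails beyond that range contribute a negligible amount.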

## Deriving the Gaussian from the 'law of errors'

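• Error = true value being measured - value obtained from the measurement
• The probability density function $$\Phi$$ describing the distribution of errors must satisfy the following conditions:
  1) $$\Phi(x)=\Phi(-x)$$.
  2) Small errors must be more likely than large errors, and very large errors must be very unlikely.
  3) $$\int_{-\infty}^{\infty} \Phi(x)\,dx=1$$.
  4) If the true value is $$\mu$$, the probability $$\Phi(\mu-x_1)\Phi(\mu-x_2)\cdots\Phi(\mu-x_n)$$ of obtaining $$x_1, x_2, \cdots, x_n$$ in $$n$$ observations attains its maximum at $$\mu=\frac{x_1+x_2+\cdots+x_n}{n}$$.
• Condition 4 is called Gauss's law of the arithmetic mean: it assumes that the most plausible value for the true quantity is the arithmetic mean of the observed values.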
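Condition 4 can be illustrated numerically for the Gaussian error density $$\frac{h}{\sqrt{\pi}}e^{-h^2x^2}$$. The sketch below is a brute-force grid search, not a proof; the observation values, the grid, and the function names are hypothetical choices made for this example.

```python
import math

def gauss_error_density(x, h=1.0):
    """Error density (h / sqrt(pi)) * exp(-h^2 x^2)."""
    return h / math.sqrt(math.pi) * math.exp(-(h * x) ** 2)

def likelihood(mu, observations, h=1.0):
    """Product of the error densities Phi(mu - x_i) over all observations."""
    product = 1.0
    for x in observations:
        product *= gauss_error_density(mu - x, h)
    return product

# Hypothetical measurements of one quantity.
observations = [2.1, 1.9, 2.4, 2.0]
mean = sum(observations) / len(observations)

# Candidate true values from 1.0 to 3.0 in steps of 0.001.
grid = [1.0 + 0.001 * k for k in range(2001)]
best_mu = max(grid, key=lambda m: likelihood(m, observations))

print(mean, best_mu)
```

Up to the grid resolution, the maximizing candidate coincides with the arithmetic mean of the observations.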

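Theorem (Gauss)

A probability density function satisfying these conditions has the form $$\Phi(x)=\frac{h}{\sqrt{\pi}}e^{-h^2x^2}$$. Here $$h$$ measures the precision of the observations (in practice it is tied to the standard deviation).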
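Matching Gauss's form against the density of $$N(0,\sigma^2)$$ makes the link between $$h$$ and the standard deviation explicit. The comparison below is a worked check, with $$\sigma$$ denoting the standard deviation as in the overview.

$$
\frac{h}{\sqrt{\pi}}e^{-h^2x^2}=\frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{x^2}{2\sigma^2}}
\quad\Longleftrightarrow\quad
h^2=\frac{1}{2\sigma^2}
\quad\Longleftrightarrow\quad
h=\frac{1}{\sigma\sqrt{2}}
$$

With this value of $$h$$ the prefactors agree as well, since $$\frac{h}{\sqrt{\pi}}=\frac{1}{\sigma\sqrt{2}\sqrt{\pi}}=\frac{1}{\sigma\sqrt{2\pi}}$$. A larger $$h$$ therefore corresponds to a smaller standard deviation, that is, to more precise observations.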

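Proof

Let us look for functions satisfying condition 4 in the case $$n=3$$.

The maximum of $$\Phi(x-x_1)\Phi(x-x_2)\Phi(x-x_3)$$ is attained at $$x=\frac{x_1+x_2+x_3}{3}$$.

Since the logarithm is increasing, the maximum of $$\ln\left(\Phi(x-x_1)\Phi(x-x_2)\Phi(x-x_3)\right)$$ is also attained at $$x=\frac{x_1+x_2+x_3}{3}$$.

Setting the derivative to zero at the maximum, if $$x=\frac{x_1+x_2+x_3}{3}$$ then $$\frac{\Phi'(x-x_1)}{\Phi(x-x_1)}+\frac{\Phi'(x-x_2)}{\Phi(x-x_2)}+\frac{\Phi'(x-x_3)}{\Phi(x-x_3)}=0$$ must hold.

Let $$F(x)=\frac{\Phi'(x)}{\Phi(x)}$$.

At the mean, the three arguments $$x-x_1$$, $$x-x_2$$, $$x-x_3$$ sum to zero, and the $$x_i$$ are arbitrary. Hence whenever $$x+y+z=0$$, we must have $$F(x)+F(y)+F(z)=0$$.

By condition 1, $$\Phi$$ is even, so $$\Phi'$$ is odd and $$F$$ is an odd function.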

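Therefore, taking $$z=-(x+y)$$, we get $$F(x+y)=-F(-(x+y))=F(x)+F(y)$$ for all $$x,y$$. Assuming $$F$$ is continuous, this additivity implies that $$F(x)=Ax$$ for some constant $$A$$.

Integrating $$\frac{\Phi'(x)}{\Phi(x)}=Ax$$ gives $$\ln\Phi(x)=\frac{A}{2}x^2+C$$. Conditions 2 and 3 force $$A<0$$, so writing $$A=-2h^2$$ we obtain $$\Phi(x)=Be^{-h^2x^2}$$ for suitable constants $$B, h$$. Condition 3, together with $$\int_{-\infty}^{\infty}e^{-h^2x^2}\,dx=\frac{\sqrt{\pi}}{h}$$, gives $$B=\frac{h}{\sqrt{\pi}}$$.

It is easy to check that condition 4 then holds for every $$n$$. (End of proof)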
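The final check for general $$n$$ can be written out as follows. This is a short worked verification using only $$\Phi(x)=Be^{-h^2x^2}$$ from the proof.

$$
\ln\prod_{i=1}^{n}\Phi(x-x_i)=n\ln B-h^2\sum_{i=1}^{n}(x-x_i)^2
$$

$$
\frac{d}{dx}\left(n\ln B-h^2\sum_{i=1}^{n}(x-x_i)^2\right)=-2h^2\left(nx-\sum_{i=1}^{n}x_i\right)=0
\quad\Longleftrightarrow\quad
x=\frac{x_1+x_2+\cdots+x_n}{n}
$$

The second derivative is $$-2h^2n<0$$, so this critical point is the unique maximum, and it is the arithmetic mean of the observations.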

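## History

• The central limit theorem developed through several stages
• The central limit theorem for the binomial distribution
• Laplace's version from the early 19th century

If a random variable $$X$$ follows the binomial distribution $$B(n,p)$$, then for sufficiently large $$n$$ the distribution of $$X$$ is approximately the normal distribution $$N(np,npq)$$, where $$q=1-p$$.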
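The binomial approximation can be illustrated numerically by comparing the binomial probability of each value $$k$$ with the density of $$N(np,npq)$$ at $$k$$, where $$q=1-p$$. The sketch below uses only the Python standard library; the values of `n` and `p` and the function names are illustrative assumptions.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Illustrative parameters (assumed for this example only).
n, p = 100, 0.3
q = 1 - p
mu = n * p
sigma = math.sqrt(n * p * q)

# Largest pointwise gap between the binomial probabilities and the normal density.
max_gap = max(
    abs(binomial_pmf(k, n, p) - normal_pdf(k, mu, sigma))
    for k in range(n + 1)
)

print(mu, sigma, max_gap)
```

For these parameters the gap is small compared with the peak height of the density; increasing `n` with `p` fixed shrinks it further.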

• Pythagoras's Window (피타고라스의 창)

## Notes

### Corpus

1. Normal distribution, also called Gaussian distribution, the most common distribution function for independent, randomly generated variables.
2. The most widely used continuous probability distribution in statistics is the normal probability distribution.
3. A normal distribution, sometimes called the bell curve, is a distribution that occurs naturally in many situations.
4. For example, the bell curve is seen in tests like the SAT and GRE.
5. A smaller standard deviation indicates that the data is tightly clustered around the mean; the normal distribution will be taller.
6. If the data is evenly distributed, you may come up with a bell curve.
7. Everything we do, or almost everything we do in inferential statistics, which is essentially making inferences based on data points, is to some degree based on the normal distribution.
8. And so what I want to do in this video and in this spreadsheet is to essentially give you as deep an understanding of the normal distribution as possible.
9. And it actually turns out, for the normal distribution, this isn't an easy thing to evaluate analytically.
10. and if you were to take the sum of them, as you approach an infinite number of flips, you approach the normal distribution.
11. You can see a normal distribution being created by random chance!
12. From the big bell curve above we see that 0.1% are less.
13. Use the Standard Normal Distribution Table when you want more accurate values.
14. The normal distribution is the most common type of distribution assumed in technical stock market analysis and in other types of statistical analyses.
15. The normal distribution model is motivated by the Central Limit Theorem.
16. Normal distribution is sometimes confused with symmetrical distribution.
17. The skewness and kurtosis coefficients measure how different a given distribution is from a normal distribution.
18. The case where μ = 0 and σ = 1 is called the standard normal distribution.
19. The normal distribution is the most important probability distribution in statistics because it fits many natural phenomena.
20. For example, heights, blood pressure, measurement error, and IQ scores follow the normal distribution.
21. The normal distribution is a probability function that describes how the values of a variable are distributed.
22. As with any probability distribution, the parameters for the normal distribution define its shape and probabilities entirely.
23. If a dataset follows a normal distribution, then about 68% of the observations will fall within one standard deviation of the mean, which in this case is the interval (-1,1).
24. Although it may appear as if a normal distribution does not include any values beyond a certain interval, the density is actually positive for all values.
25. The standardized values in the second column and the corresponding normal quantile scores are very similar, indicating that the temperature data seem to fit a normal distribution.
26. Let us find the mean and variance of the standard normal distribution.
27. To find the CDF of the standard normal distribution, we need to integrate the PDF function.
28. Most of the continuous data values in a normal distribution tend to cluster around the mean, and the further a value is from the mean, the less likely it is to occur.
29. The normal distribution is often called the bell curve because the graph of its probability density looks like a bell.
30. The normal distribution is the most important probability distribution in statistics because many continuous data in nature and psychology display this bell-shaped curve when compiled and graphed.
31. We can standardize the values (raw scores) of a normal distribution by converting them into z-scores.
32. The diagram above shows the bell shaped curve of a normal (Gaussian) distribution superimposed on a histogram of a sample from a normal distribution.
33. The tail area of the normal distribution is evaluated to 15 decimal places of accuracy using the complement of the error function (Abramowitz and Stegun, 1964; Johnson and Kotz, 1970).
34. This guide will show you how to calculate the probability (area under the curve) of a standard normal distribution.
35. It will first show you how to interpret a Standard Normal Distribution Table.
36. As explained above, the standard normal distribution table only provides the probability for values less than a positive z-value (i.e., z-values on the right-hand side of the mean).
37. We start by remembering that the standard normal distribution has a total area (probability) equal to 1 and it is also symmetrical about the mean.
38. The "normal distribution" is the most commonly used distribution in statistics.
39. To choose the best Box-Cox transformation—the one that best approximates a normal distribution - Box and Cox suggested using the maximum likelihood method.
40. The graph of the normal distribution depends on two factors - the mean and the standard deviation.
41. To find the probability associated with a normal random variable, use a graphing calculator, an online normal distribution calculator, or a normal distribution table.
42. In the examples below, we illustrate the use of Stat Trek's Normal Distribution Calculator, a free tool available on this site.
43. The normal distribution calculator solves common statistical problems, based on the normal distribution.
44. In a normal distribution, data is symmetrically distributed with no skew.
45. Example: Using the empirical rule in a normal distribution You collect SAT scores from students in a new test preparation course.
46. The data follows a normal distribution with a mean score (M) of 1150 and a standard deviation (SD) of 150.
47. A random variable with the standard Normal distribution, commonly denoted by $$Z$$, has mean zero and standard deviation one.
48. The probabilities for any Normal distribution can be reduced to probabilities for the standard Normal distribution, using the device of standardisation.
49. Crowd size Suppose that crowd size at home games for a particular football club follows a Normal distribution with mean $$26\ 000$$ and standard deviation 5000.
50. The cdf of any Normal distribution can also be found, using technology, without first standardising.
51. The normal distribution is also useful when sampling data out of a non-normal data set.
52. A truncated NORMAL distribution can be defined for a variable by setting the desired minimum and/or maximum values for the variable.
53. For practical purposes, minimum and maximum values that are at least 3 standard deviations away from the mean generate a complete normal distribution.
54. For a Normal distribution, 99.73 % of all samples, will fall within 3 Standard Deviations of the mean value.
55. Many other common distributions become like the normal distribution in special cases.
56. Look at the histograms of lifetimes given in Figure 21.3 and of resistances given in Figure 21.4 and you will see that they resemble the normal distribution.
57. If you were to get a large group of students to measure the diameter of a washer to the nearest 0.1 mm, then a histogram of the results would give an approximately normal distribution.
58. However, there is a problem with the normal distribution function in that it is not easy to integrate!
59. The normal distribution is also referred to as Gaussian or Gauss distribution.
60. In a normal distribution graph, the mean defines the location of the peak, and most of the data points are clustered around the mean.
61. A normal distribution comes with a perfectly symmetrical shape.
62. The middle point of a normal distribution is the point with the maximum frequency, which means that it possesses the most observations of the variable.
63. We will get a normal distribution if there is a true answer for the distance, but as we shoot for this distance, since, to err is human, we are likely to miss the target.
64. We can use the fact that the normal distribution is a probability distribution, and the total area under the curve is 1.
65. If you use the normal distribution, the probability comes out to be about 0.728668.
66. The minimum variance unbiased estimator (MVUE) is commonly used to estimate the parameters of the normal distribution.
67. For an example, see Fit Normal Distribution Object.
68. The normal distribution is the most well-known distribution and the most frequently used in statistical theory and applications.
69. Any articles that did not specify the type of distribution or which referred to the normal distribution were likewise excluded.
70. In stage 2 we eliminated a further 292 abstracts that made no mention of the type of distribution and one which referred to a normal distribution.
71. Before introducing the normal distribution, we first look at two important concepts: the Central limit theorem, and the concept of independence.
72. The Central limit theorem plays an important role in the theory of probability and in the derivation of the normal distribution.
73. As one sees from the above figures, the distribution from these averages quickly takes the shape of the so-called normal distribution.
74. You might still find yourself having to refer to tables of cumulative area under the normal distribution, instead of using the pnorm() function (for example in a test or exam).
75. The normal distribution is a continuous, univariate, symmetric, unbounded, unimodal and bell-shaped probability distribution.
76. Use this to describe a quantity that has a normal distribution with the given «mean» and standard deviation «stddev».
77. Suppose you want to fit a Normal distribution to historical data.
78. The normal distribution holds an honored role in probability and statistics, mostly because of the central limit theorem, one of the fundamental theorems that forms a bridge between the two subjects.
79. In addition, as we will see, the normal distribution has many nice mathematical properties.
80. In the Special Distribution Simulator, select the normal distribution and keep the default settings.
81. In the special distribution calculator, select the normal distribution and keep the default settings.
82. Indeed it is so common, that people often know it as the normal curve or normal distribution, shown in Figure $$\PageIndex{1}$$.
83. It is also known as the Gaussian distribution after Carl Friedrich Gauss, the first person to formalize its mathematical expression.
84. The normal distribution model always describes a symmetric, unimodal, bell shaped curve.
85. Specifically, the normal distribution model can be adjusted using two parameters: mean and standard deviation.
86. The normal or Gaussian distribution is extremely important in statistics, in part because it shows up all the time in nature.
87. The standard normal is defined as a normal distribution with μ = 0 and σ = 1.
88. You probably have explored the normal distribution before.
89. Below, you can adjust the parameters of the normal distribution and compare it to the standard normal.
90. For normally distributed vectors, see Multivariate normal distribution.
91. The simplest case of a normal distribution is known as the standard normal distribution.
92. Authors differ on which normal distribution should be called the "standard" one.
93. $$X=\sigma Z+\mu$$ will have a normal distribution with expected value $$\mu$$ and standard deviation $$\sigma$$.
94. The so-called "standard normal distribution" is given by taking $$\mu=0$$ and $$\sigma^2=1$$ in a general normal distribution.
95. This theorem states that the mean of any set of variates with any distribution having a finite mean and variance tends to the normal distribution.

## Metadata

### spaCy pattern list

• [{'LOWER': 'normal'}, {'LEMMA': 'distribution'}]
• [{'LOWER': 'gaussian'}, {'LEMMA': 'distribution'}]
• [{'LOWER': 'bell'}, {'LEMMA': 'curve'}]