# Kullback-Leibler Divergence

## Notes

- One important thing to note is that the KL Divergence is an asymmetric measure, i.e. KL(P, Q) != KL(Q, P) (demonstrated in the sketch after this list). ^{[1]}
- As expected, we see a smaller KL Divergence for distributions 1 & 2 than for 1 & 3. ^{[1]}
- And we also see the KL Divergence of a distribution with itself is 0. ^{[1]}
- Finally, we comment on recent applications of KL divergence in the neural coding literature and highlight its natural application. ^{[2]}
- Proposition: Let p and q be two probability density functions such that their KL divergence is well-defined (the standard definition is given after this list). ^{[3]}
- This study also investigates a variety of applications of KL divergence in medical diagnostics. ^{[4]}
- Graphically, KL divergence is depicted through the information graph. ^{[4]}
- It describes an application of the KL divergence for discrete biomarkers. ^{[4]}
- Section 2 describes preliminaries, including mathematical details of the KL divergence. ^{[4]}
- Optimal encoding of information is a very interesting topic, but not necessary for understanding KL divergence. ^{[5]}
- With KL divergence we can calculate exactly how much information is lost when we approximate one distribution with another. ^{[5]}
- Now we can go ahead and calculate the KL divergence for our two approximating distributions. ^{[5]}
- We can double-check our work by looking at the way the KL Divergence changes as we change our values for this parameter. ^{[5]}
- It is a great post explaining the KL divergence, but I felt some of the intricacies in the explanation could be covered in more detail. ^{[6]}
- Let us now compute the KL divergence for each of the approximate distributions we came up with. ^{[6]}
- First we will see how the KL divergence changes when the success probability of the binomial distribution changes (see the parameter-sweep sketch after this list). ^{[6]}
- You can see that as we move away from our choice (red dot), the KL divergence rapidly increases. ^{[6]}
- The SciPy library provides the kl_div() function for calculating the KL divergence, although with a different definition than the one used here (compared in the sketch after this list). ^{[7]}
- It also provides the rel_entr() function for calculating the relative entropy, which matches the definition of KL divergence used here. ^{[7]}
- The article's code example opens with the comment `# example of calculating the kl divergence (relative entropy) with scipy`, followed by a truncated SciPy import (presumably `from scipy.special import rel_entr`, given the previous note). ^{[7]}
- It uses the KL divergence to calculate a normalized score that is symmetrical (the Jensen-Shannon construction, sketched after this list). ^{[7]}
- Relative entropy relates to the "rate function" in the theory of large deviations. ^{[8]}
- Relative entropy remains well-defined for continuous distributions, and furthermore is invariant under parameter transformations. ^{[8]}
- Relative entropy is directly related to the Fisher information metric (the second-order expansion is given after this list). ^{[8]}
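
For reference, since the notes quote results about the KL divergence without stating its definition: for distributions P and Q (with densities p and q in the continuous case), the divergence of Q from P is

$$
D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{x} P(x) \log \frac{P(x)}{Q(x)}
\qquad \text{or} \qquad
D_{\mathrm{KL}}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, \mathrm{d}x,
$$

which is well-defined under the usual condition that Q(x) = 0 implies P(x) = 0 (absolute continuity).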
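
A minimal sketch of the asymmetry and self-divergence notes, assuming two small hand-picked discrete distributions (the numbers are illustrative, not taken from the cited sources):

```python
import numpy as np

# Discrete KL divergence: KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats.
def kl_divergence(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

p = [0.10, 0.40, 0.50]  # illustrative distributions (assumed for this sketch)
q = [0.80, 0.15, 0.05]

print(kl_divergence(p, q))  # ~1.336 nats
print(kl_divergence(q, p))  # ~1.401 nats: KL(P, Q) != KL(Q, P), i.e. asymmetric
print(kl_divergence(p, p))  # 0.0: the divergence of a distribution with itself
```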
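
The parameter sweep described in the Intuitive Guide notes can be sketched as follows, with made-up stand-in data (the sample size, n = 10, and the generating probability 0.57 are assumptions, not the post's actual example):

```python
import numpy as np
from scipy.stats import binom
from scipy.special import rel_entr

# Stand-in "observed" distribution: empirical frequencies of Binomial(10, 0.57) draws.
rng = np.random.default_rng(0)
n = 10
counts = np.bincount(rng.binomial(n, 0.57, size=10_000), minlength=n + 1)
observed = counts / counts.sum()

# Sweep the success probability and compute KL(observed || Binomial(n, p)) for each p.
ps = np.linspace(0.01, 0.99, 99)
support = np.arange(n + 1)
kls = [float(sum(rel_entr(observed, binom.pmf(support, n, p)))) for p in ps]

# The divergence rises steeply on both sides of the minimizer, which sits
# near the generating probability 0.57, matching the "red dot" observation.
print(ps[int(np.argmin(kls))])
```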
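
The two SciPy helpers mentioned in the notes can be compared directly. rel_entr(p, q) computes p * log(p / q) element-wise, while kl_div(p, q) computes p * log(p / q) - p + q; the extra terms cancel on summation when both inputs are proper (normalized) distributions, which is the sense in which kl_div uses "a different definition":

```python
from scipy.special import rel_entr, kl_div

p = [0.10, 0.40, 0.50]  # same illustrative distributions as above
q = [0.80, 0.15, 0.05]

# Sum of rel_entr terms is KL(P || Q) as defined in these notes.
print(sum(rel_entr(p, q)))  # ~1.336 nats

# kl_div adds -p + q to each term, so element-wise values differ,
# but the total agrees for normalized p and q.
print(sum(kl_div(p, q)))    # ~1.336 nats as well
```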
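
The normalized, symmetric score built from the KL divergence is the Jensen-Shannon divergence, which averages two KL terms against the mixture M = (P + Q) / 2. A sketch, cross-checked against SciPy's jensenshannon (which returns the square root of the JS divergence):

```python
import numpy as np
from scipy.special import rel_entr
from scipy.spatial.distance import jensenshannon

# JS(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), where M = (P + Q) / 2.
def js_divergence(p, q):
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = 0.5 * (p + q)
    return float(sum(rel_entr(p, m)) + sum(rel_entr(q, m))) / 2

p = [0.10, 0.40, 0.50]
q = [0.80, 0.15, 0.05]

print(js_divergence(p, q))       # symmetric: equals js_divergence(q, p)
print(jensenshannon(p, q) ** 2)  # SciPy's JS distance squared matches (base e)
```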
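
The relation to the Fisher information metric mentioned in the last note is the standard second-order expansion of the KL divergence between nearby members of a parametric family:

$$
D_{\mathrm{KL}}\left( p_{\theta} \,\|\, p_{\theta + \mathrm{d}\theta} \right)
= \frac{1}{2} \, \mathrm{d}\theta^{\top} \mathcal{I}(\theta) \, \mathrm{d}\theta
+ O\left( \lVert \mathrm{d}\theta \rVert^{3} \right),
$$

where $\mathcal{I}(\theta)$ is the Fisher information matrix; infinitesimally, the KL divergence is half the squared length of the parameter step in the Fisher metric.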

### Sources

1. Kullback-Leibler (KL) Divergence — Apache MXNet documentation
2. Notes on Kullback-Leibler Divergence and Likelihood
3. Kullback-Leibler divergence
4. Kullback-Leibler Divergence for Medical Diagnostics Accuracy and Cut-point Selection Criterion: How it is related to the Youden Index
5. Kullback-Leibler Divergence Explained — Count Bayesie
6. Intuitive Guide to Understanding KL Divergence
7. How to Calculate the KL Divergence for Machine Learning
8. Relative entropy

## Metadata

### Wikidata

- ID: Q255166

### Spacy Pattern List

- [{'LOWER': 'kullback'}, {'OP': '*'}, {'LOWER': 'leibler'}, {'LEMMA': 'divergence'}]
- [{'LOWER': 'information'}, {'LEMMA': 'divergence'}]
- [{'LOWER': 'information'}, {'LEMMA': 'gain'}]
- [{'LOWER': 'relative'}, {'LEMMA': 'entropy'}]
- [{'LOWER': 'kl'}, {'LEMMA': 'divergence'}]
- [{'LEMMA': 'KLIC'}]