<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="ko">
	<id>https://wiki.mathnt.net/index.php?action=history&amp;feed=atom&amp;title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98</id>
	<title>손실 함수 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.mathnt.net/index.php?action=history&amp;feed=atom&amp;title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98"/>
	<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;action=history"/>
	<updated>2026-04-05T04:48:44Z</updated>
	<subtitle>Revision history for this page</subtitle>
	<generator>MediaWiki 1.35.0</generator>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=51640&amp;oldid=prev</id>
		<title>Edit by Pythagoras0 at 08:59, 17 February 2021 (Wed)</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=51640&amp;oldid=prev"/>
		<updated>2021-02-17T08:59:06Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;ko&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 08:59, 17 February 2021&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l99&quot; &gt;Line 99:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 99:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  &amp;lt;references /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  &amp;lt;references /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== 메타데이터 ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==메타데이터==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===위키데이터===&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===위키데이터===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* ID :  [https://www.wikidata.org/wiki/Q1036748 Q1036748]&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* ID :  [https://www.wikidata.org/wiki/Q1036748 Q1036748]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===Spacy 패턴 목록===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [{&amp;#039;LOWER&amp;#039;: &amp;#039;loss&amp;#039;}, {&amp;#039;LEMMA&amp;#039;: &amp;#039;function&amp;#039;}]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [{&amp;#039;LOWER&amp;#039;: &amp;#039;loss&amp;#039;}, {&amp;#039;LEMMA&amp;#039;: &amp;#039;function&amp;#039;}]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [{&amp;#039;LOWER&amp;#039;: &amp;#039;error&amp;#039;}, {&amp;#039;LEMMA&amp;#039;: &amp;#039;function&amp;#039;}]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [{&amp;#039;LOWER&amp;#039;: &amp;#039;cost&amp;#039;}, {&amp;#039;LEMMA&amp;#039;: &amp;#039;function&amp;#039;}]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=47447&amp;oldid=prev</id>
		<title>Pythagoras0: /* 메타데이터 */ new section</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=47447&amp;oldid=prev"/>
		<updated>2020-12-26T13:25:41Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;메타데이터: &lt;/span&gt; 새 문단&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;ko&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 13:25, 26 December 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l98&quot; &gt;Line 98:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 98:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===소스===&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===소스===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  &amp;lt;references /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  &amp;lt;references /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;== 메타데이터 ==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===위키데이터===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* ID :  [https://www.wikidata.org/wiki/Q1036748 Q1036748]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45797&amp;oldid=prev</id>
		<title>Pythagoras0: /* 노트 */ new section</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45797&amp;oldid=prev"/>
		<updated>2020-12-16T07:39:01Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;노트: &lt;/span&gt; 새 문단&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;ko&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:39, 16 December 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;== 노트 ==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function is used to measure how good or bad the model is performing.&amp;lt;ref name=&amp;quot;ref_937a&amp;quot;&amp;gt;[https://www.analyticssteps.com/blogs/what-are-different-loss-functions-used-optimizers-neural-networks What Are Different Loss Functions Used as Optimizers in Neural Networks?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Also, there is no fixed loss function that can be used in all places.&amp;lt;ref name=&amp;quot;ref_937a&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are mainly classified into two categories: Classification loss and Regression loss.&amp;lt;ref name=&amp;quot;ref_937a&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We implement this mechanism in the form of losses and loss functions.&amp;lt;ref name=&amp;quot;ref_b293&amp;quot;&amp;gt;[https://data-flair.training/blogs/keras-loss-functions/ Keras Loss Functions]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Neural networks are trained using an optimizer and we are required to choose a loss function while configuring our model.&amp;lt;ref name=&amp;quot;ref_b293&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Different loss functions play slightly different roles in training neural nets.&amp;lt;ref name=&amp;quot;ref_b293&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This article will explain the role of Keras loss functions in training deep neural nets.&amp;lt;ref name=&amp;quot;ref_b293&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* At its core, a loss function is incredibly simple: it’s a method of evaluating how well your algorithm models your dataset.&amp;lt;ref name=&amp;quot;ref_ffc8&amp;quot;&amp;gt;[https://iq.opengenus.org/types-of-loss-function/ Types of Loss Function]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* If your predictions are totally off, your loss function will output a higher number.&amp;lt;ref name=&amp;quot;ref_ffc8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There are a variety of packages which support these loss functions.&amp;lt;ref name=&amp;quot;ref_ffc8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This paper studies a variety of loss functions and output layer regularization strategies on image classification tasks.&amp;lt;ref name=&amp;quot;ref_3230&amp;quot;&amp;gt;[https://paperswithcode.com/paper/what-s-in-a-loss-function-for-image What&amp;#039;s in a Loss Function for Image Classification?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We’ll be discussing what a loss function is and how it’s used in an artificial neural network.&amp;lt;ref name=&amp;quot;ref_e14f&amp;quot;&amp;gt;[https://deeplizard.com/learn/video/Skc8nqJirJg Loss in a Neural Network explained]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Recall that we’ve already introduced the idea of a loss function in our post on training a neural network.&amp;lt;ref name=&amp;quot;ref_e14f&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function is what SGD is attempting to minimize by iteratively updating the weights in the network.&amp;lt;ref name=&amp;quot;ref_e14f&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This was just illustrating the math behind how one loss function, MSE, works.&amp;lt;ref name=&amp;quot;ref_e14f&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* However, there is no universally accepted definition for other loss functions.&amp;lt;ref name=&amp;quot;ref_2c78&amp;quot;&amp;gt;[https://link.springer.com/article/10.1023/A:1022899518027 Variance and Bias for General Loss Functions]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Most approaches have focused solely on 0-1 loss functions and have produced significantly different definitions.&amp;lt;ref name=&amp;quot;ref_2c78&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Using this framework, bias and variance definitions are produced which generalize to any symmetric loss function.&amp;lt;ref name=&amp;quot;ref_2c78&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We illustrate these statistics on several loss functions with particular emphasis on 0-1 loss.&amp;lt;ref name=&amp;quot;ref_2c78&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The results obtained with their bi-temperature loss function was then compared to the vanilla logistic loss function.&amp;lt;ref name=&amp;quot;ref_b4d0&amp;quot;&amp;gt;[https://www.kdnuggets.com/2019/11/research-guide-advanced-loss-functions-machine-learning-models.html Research Guide: Advanced Loss Functions for Machine Learning Models]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This loss function is adopted for the discriminator.&amp;lt;ref name=&amp;quot;ref_b4d0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* As a result of this, GANs using this loss function are able to generate higher quality images than regular GANs.&amp;lt;ref name=&amp;quot;ref_b4d0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This loss function is used when images that look similar are being compared.&amp;lt;ref name=&amp;quot;ref_b4d0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We will use the term loss function for a single training example and cost function for the entire training dataset.&amp;lt;ref name=&amp;quot;ref_6b69&amp;quot;&amp;gt;[https://analyticsindiamag.com/hands-on-guide-to-loss-functions-used-to-evaluate-a-ml-algorithm/ Hands-On Guide To Loss Functions Used To Evaluate A ML Algorithm]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Depending on the output variable, we need to choose a loss function for our model.&amp;lt;ref name=&amp;quot;ref_6b69&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* MSE loss is a popularly used loss function for regression problems.&amp;lt;ref name=&amp;quot;ref_6b69&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The args and kwargs will be passed to loss_cls during the initialization to instantiate a loss function.&amp;lt;ref name=&amp;quot;ref_aa20&amp;quot;&amp;gt;[https://docs.fast.ai/losses.html Loss Functions]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* c(e) = 1(e ≥ 0)c_1(e) + 1(e &amp;lt; 0)c_2(e) will be a loss function.&amp;lt;ref name=&amp;quot;ref_6adc&amp;quot;&amp;gt;[https://www.encyclopedia.com/social-sciences/applied-and-social-sciences-magazines/loss-functions Encyclopedia.com]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Optimal forecasting of a time series model depends extensively on the specification of the loss function.&amp;lt;ref name=&amp;quot;ref_6adc&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Suppose the loss functions c_1(·), c_2(·) are used for forecasting Y_{t+h} and for forecasting h(Y_{t+h}), respectively.&amp;lt;ref name=&amp;quot;ref_6adc&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Granger (1999) remarks that it would be strange behavior to use the same loss function for Y and h(Y).&amp;lt;ref name=&amp;quot;ref_6adc&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are used to train neural networks and to compute the difference between the output and the target variable.&amp;lt;ref name=&amp;quot;ref_e490&amp;quot;&amp;gt;[https://mxnet.apache.org/versions/1.7/api/python/docs/tutorials/packages/gluon/loss/loss.html Loss functions — Apache MXNet documentation]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A critical component of training neural networks is the loss function.&amp;lt;ref name=&amp;quot;ref_e490&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A loss function is a quantitative measure of how bad the predictions of the network are when compared to ground truth labels.&amp;lt;ref name=&amp;quot;ref_e490&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Some tasks use a combination of multiple loss functions, but often you’ll just use one.&amp;lt;ref name=&amp;quot;ref_e490&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are to be supplied in the loss parameter of the compile.keras.engine.training.&amp;lt;ref name=&amp;quot;ref_e4cb&amp;quot;&amp;gt;[https://keras.rstudio.com/reference/loss_mean_squared_error.html Model loss functions — loss_mean_squared_error]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* How do you capture the difference between two distributions in GAN loss functions?&amp;lt;ref name=&amp;quot;ref_5206&amp;quot;&amp;gt;[https://developers.google.com/machine-learning/gan/loss Generative Adversarial Networks]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function used in the paper that introduced GANs.&amp;lt;ref name=&amp;quot;ref_5206&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A GAN can have two loss functions: one for generator training and one for discriminator training.&amp;lt;ref name=&amp;quot;ref_5206&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There are several ways to define the details of the loss function.&amp;lt;ref name=&amp;quot;ref_dcd8&amp;quot;&amp;gt;[https://cs231n.github.io/linear-classify/ CS231n Convolutional Neural Networks for Visual Recognition]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There is one bug with the loss function we presented above.&amp;lt;ref name=&amp;quot;ref_dcd8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We can do so by extending the loss function with a regularization penalty \(R(W)\).&amp;lt;ref name=&amp;quot;ref_dcd8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The demo visualizes the loss functions discussed in this section using a toy 3-way classification on 2D data.&amp;lt;ref name=&amp;quot;ref_dcd8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In SLF, a generic loss function is formulated as a joint optimization problem of network weights and loss parameters.&amp;lt;ref name=&amp;quot;ref_8a58&amp;quot;&amp;gt;[https://aaai.org/ojs/index.php/AAAI/article/view/5925 Stochastic Loss Function]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function for linear regression is squared loss.&amp;lt;ref name=&amp;quot;ref_21cb&amp;quot;&amp;gt;[https://developers.google.com/machine-learning/crash-course/logistic-regression/model-training Logistic Regression: Loss and Regularization]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The way you configure your loss functions can make or break the performance of your algorithm.&amp;lt;ref name=&amp;quot;ref_e8ab&amp;quot;&amp;gt;[https://neptune.ai/blog/pytorch-loss-functions PyTorch Loss Functions: The Ultimate Guide]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In this article, we’ll talk about popular loss functions in PyTorch, and about building custom loss functions.&amp;lt;ref name=&amp;quot;ref_e8ab&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are used to gauge the error between the prediction output and the provided target value.&amp;lt;ref name=&amp;quot;ref_e8ab&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A loss function tells us how far the model is from realizing the expected outcome.&amp;lt;ref name=&amp;quot;ref_e8ab&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In fact, we can design our own (very) basic loss function to further explain how it works.&amp;lt;ref name=&amp;quot;ref_8213&amp;quot;&amp;gt;[https://algorithmia.com/blog/introduction-to-loss-functions Introduction to Loss Functions]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* For each prediction that we make, our loss function will simply measure the absolute difference between our prediction and the actual value.&amp;lt;ref name=&amp;quot;ref_8213&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Notice how in the loss function we defined, it doesn’t matter if our predictions were too high or too low.&amp;lt;ref name=&amp;quot;ref_8213&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A lot of the loss functions that you see implemented in machine learning can get complex and confusing.&amp;lt;ref name=&amp;quot;ref_8213&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* An optimization problem seeks to minimize a loss function.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot;&amp;gt;[https://en.wikipedia.org/wiki/Loss_function Loss function]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The use of a quadratic loss function is common, for example when using least squares techniques.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The quadratic loss function is also used in linear-quadratic optimal control problems.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* One of these algorithmic changes was the replacement of mean squared error with the cross-entropy family of loss functions.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot;&amp;gt;[https://machinelearningmastery.com/loss-and-loss-functions-for-training-deep-learning-neural-networks/ Loss and Loss Functions for Training Deep Learning Neural Networks]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Importantly, the choice of loss function is directly related to the activation function used in the output layer of your neural network.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The choice of cost function is tightly coupled with the choice of output unit.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The model can be updated to use the &#039;mean_squared_logarithmic_error&#039; loss function and keep the same configuration for the output layer.&amp;lt;ref name=&amp;quot;ref_ead3&amp;quot;&amp;gt;[https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/ How to Choose Loss Functions When Training Deep Learning Neural Networks]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are used to determine the error (aka “the loss”) between the output of our algorithms and the given target value.&amp;lt;ref name=&amp;quot;ref_39e5&amp;quot;&amp;gt;[https://deepai.org/machine-learning-glossary-and-terms/loss-function Loss Function]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The quadratic loss is a commonly used symmetric loss function.&amp;lt;ref name=&amp;quot;ref_39e5&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The cost function and the loss function refer to the same concept.&amp;lt;ref name=&amp;quot;ref_e085&amp;quot;&amp;gt;[https://dev.to/imsparsh/most-common-loss-functions-in-machine-learning-57p7 Most Common Loss Functions in Machine Learning]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The cost function is a function that is calculated as the average of all loss function values.&amp;lt;ref name=&amp;quot;ref_e085&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The Loss function is directly related to the predictions of your model that you have built.&amp;lt;ref name=&amp;quot;ref_e085&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This is the most common Loss function used in Classification problems.&amp;lt;ref name=&amp;quot;ref_e085&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The group of functions that are minimized is called “loss functions”.&amp;lt;ref name=&amp;quot;ref_faa0&amp;quot;&amp;gt;[https://medium.com/@phuctrt/loss-functions-why-what-where-or-when-189815343d3f Loss functions: Why, what, where or when?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A loss function is used as a measure of how well a prediction model is able to predict the expected outcome.&amp;lt;ref name=&amp;quot;ref_faa0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A loss function is a mathematical function commonly used in statistics.&amp;lt;ref name=&amp;quot;ref_cb5b&amp;quot;&amp;gt;[https://radiopaedia.org/articles/loss-function Radiology Reference Article]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There are many types of loss functions including mean absolute loss, mean squared error and mean bias error.&amp;lt;ref name=&amp;quot;ref_cb5b&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are at the heart of the machine learning algorithms we love to use.&amp;lt;ref name=&amp;quot;ref_2088&amp;quot;&amp;gt;[https://www.analyticsvidhya.com/blog/2019/08/detailed-guide-7-loss-functions-machine-learning-python-code/ Loss Function In Machine Learning]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In this article, I will discuss 7 common loss functions used in machine learning and explain where each of them is used.&amp;lt;ref name=&amp;quot;ref_2088&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are one part of the entire machine learning journey you will take.&amp;lt;ref name=&amp;quot;ref_2088&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Here, theta_j is the weight to be updated, alpha is the learning rate and J is the cost function.&amp;lt;ref name=&amp;quot;ref_2088&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Machines learn by means of a loss function.&amp;lt;ref name=&amp;quot;ref_8f8d&amp;quot;&amp;gt;[https://towardsdatascience.com/common-loss-functions-in-machine-learning-46af0ffc4d23 Common Loss functions in machine learning]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* If predictions deviate too much from actual results, the loss function will produce a very large number.&amp;lt;ref name=&amp;quot;ref_8f8d&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Gradually, with the help of some optimization function, the model learns to reduce the error in prediction.&amp;lt;ref name=&amp;quot;ref_8f8d&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There is no one-size-fits-all loss function for algorithms in machine learning.&amp;lt;ref name=&amp;quot;ref_8f8d&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function is the function that computes the distance between the current output of the algorithm and the expected output.&amp;lt;ref name=&amp;quot;ref_7ee5&amp;quot;&amp;gt;[https://towardsdatascience.com/what-is-loss-function-1e2605aeb904 What are Loss Functions?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This loss function is convex and grows linearly for negative values (less sensitive to outliers).&amp;lt;ref name=&amp;quot;ref_7ee5&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The Hinge loss function was developed to correct the hyperplane of SVM algorithm in the task of classification.&amp;lt;ref name=&amp;quot;ref_7ee5&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Unlike the previous loss function, the square is replaced by an absolute value.&amp;lt;ref name=&amp;quot;ref_7ee5&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Mean Squared Error (MSE) is the most commonly used regression loss function.&amp;lt;ref name=&amp;quot;ref_a0e0&amp;quot;&amp;gt;[https://heartbeat.fritz.ai/5-regression-loss-functions-all-machine-learners-should-know-4fb140e9d4b0 5 Regression Loss Functions All Machine Learners Should Know]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Whenever we train a machine learning model, our goal is to find the point that minimizes loss function.&amp;lt;ref name=&amp;quot;ref_a0e0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Problems with both: There can be cases where neither loss function gives desirable predictions.&amp;lt;ref name=&amp;quot;ref_a0e0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Another way is to try a different loss function.&amp;lt;ref name=&amp;quot;ref_a0e0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Generally, cost and loss functions are synonymous, but a cost function can contain regularization terms in addition to the loss function.&amp;lt;ref name=&amp;quot;ref_d4f7&amp;quot;&amp;gt;[https://medium.com/@zeeshanmulla/cost-activation-loss-function-neural-network-deep-learning-what-are-these-91167825a4de Cost, Activation, Loss Function|| Neural Network|| Deep Learning. What are these?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss function is a method of evaluating “how well your algorithm models your dataset”.&amp;lt;ref name=&amp;quot;ref_d4f7&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Cost Function quantifies the error between predicted values and expected values and presents it in the form of a single real number.&amp;lt;ref name=&amp;quot;ref_d4f7&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Depending on the problem Cost Function can be formed in many different ways.&amp;lt;ref name=&amp;quot;ref_d4f7&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In this example, we’re defining the loss function by creating an instance of the loss class.&amp;lt;ref name=&amp;quot;ref_477b&amp;quot;&amp;gt;[https://neptune.ai/blog/keras-loss-functions Keras Loss Functions: Everything You Need To Know]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Problems involving the prediction of more than one class use different loss functions.&amp;lt;ref name=&amp;quot;ref_477b&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* During the training process, one can weigh the loss function by observations or samples.&amp;lt;ref name=&amp;quot;ref_477b&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* It is usually a good idea to monitor the loss function, on the training and validation set as the model is training.&amp;lt;ref name=&amp;quot;ref_477b&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are typically created by instantiating a loss class (e.g. from keras.losses).&amp;lt;ref name=&amp;quot;ref_3d67&amp;quot;&amp;gt;[https://keras.io/api/losses/ Losses]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===소스===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &amp;lt;references /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
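The weight-update rule quoted in these notes (theta_j updated using learning rate alpha on cost function J) can be sketched as follows; this is an illustrative example only, with a made-up one-parameter quadratic cost, not code from the cited sources:

```python
def gradient_descent(theta, alpha, grad_J, steps=100):
    # theta_j := theta_j - alpha * dJ/dtheta_j, repeated for a fixed number of steps
    for _ in range(steps):
        theta = theta - alpha * grad_J(theta)
    return theta

# Hypothetical cost J(theta) = (theta - 3)**2, whose gradient is 2*(theta - 3);
# gradient descent converges toward the minimizer theta = 3.
theta_min = gradient_descent(theta=0.0, alpha=0.1, grad_J=lambda th: 2.0 * (th - 3.0))
```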
&lt;/table&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45796&amp;oldid=prev</id>
		<title>Pythagoras0: 문서를 비움</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45796&amp;oldid=prev"/>
		<updated>2020-12-16T07:38:49Z</updated>

		<summary type="html">&lt;p&gt;문서를 비움&lt;/p&gt;
&lt;a href=&quot;https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;amp;diff=45796&amp;amp;oldid=45761&quot;&gt;차이 보기&lt;/a&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45761&amp;oldid=prev</id>
		<title>Pythagoras0: /* 노트 */ 새 문단</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45761&amp;oldid=prev"/>
		<updated>2020-12-16T05:31:16Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;노트: &lt;/span&gt; 새 문단&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;새 문서&lt;/b&gt;&lt;/p&gt;&lt;div&gt;== 노트 ==&lt;br /&gt;
&lt;br /&gt;
* The following points highlight the three main types of cost functions.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot;&amp;gt;[http://www.economicsdiscussion.net/cost/3-main-types-of-cost-functions/19976 3 Main Types of Cost Functions]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Statistical cost functions will have a bias towards linearity.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot; /&amp;gt;&lt;br /&gt;
* We have noted that if the cost function is linear, the equation used in preparing the total cost curve in Fig.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot; /&amp;gt;&lt;br /&gt;
* Most economists agree that linear cost functions are valid over the relevant range of output for the firm.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot; /&amp;gt;&lt;br /&gt;
* In traditional economics, we must make use of the cubic cost function as illustrated in Fig. 15.5.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot; /&amp;gt;&lt;br /&gt;
* However, there are cost functions which cannot be decomposed using a loss function.&amp;lt;ref name=&amp;quot;ref_2c4d&amp;quot;&amp;gt;[http://image.diku.dk/shark/sphinx_pages/build/html/rest_sources/tutorials/concepts/library_design/losses.html Loss and Cost Functions — Shark 3.0a documentation]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* In other words, all loss functions generate a cost function, but not all cost functions must be based on a loss function.&amp;lt;ref name=&amp;quot;ref_2c4d&amp;quot; /&amp;gt;&lt;br /&gt;
* This allows embarrassingly parallelizable gradient descent on the cost function.&amp;lt;ref name=&amp;quot;ref_2c4d&amp;quot; /&amp;gt;&lt;br /&gt;
* hasFirstDerivative Can the cost function calculate its first derivative?&amp;lt;ref name=&amp;quot;ref_2c4d&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost function describes how the firm’s total costs vary with its output, the number of cars that it produces.&amp;lt;ref name=&amp;quot;ref_d624&amp;quot;&amp;gt;[https://www.core-econ.org/the-economy/book/text/leibniz-07-03-01.html The Economy: Leibniz: Average and marginal cost functions]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Now think about the shape of the average cost function.&amp;lt;ref name=&amp;quot;ref_d624&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function is a MATLAB® function that evaluates your design requirements using design variable values.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot;&amp;gt;[https://www.mathworks.com/help/sldo/ug/writing-a-custom-cost-function.html Write a Cost Function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* When you optimize or estimate model parameters, you provide the saved cost function as an input to sdo.optimize .&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* To understand the parts of a cost function, consider the following sample function myCostFunc .&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* Compute the requirements (objective and constraint violations) and assign them to vals, the output of the cost function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* Specifies the inputs of the cost function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function must have as input, params , a vector of the design variables to be estimated, optimized, or used for sensitivity analysis.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* For more information, see Specify Inputs of the Cost Function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* In this sample cost function, the requirements are based on the design variable x, a model parameter.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost function first extracts the current values of the design variables and then computes the requirements.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* Specifies the requirement values as outputs, vals and derivs , of the cost function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function must return vals , a structure with one or more fields that specify the values of the objective and constraint violations.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* For more information, see Specify Outputs of the Cost Function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* However, sdo.optimize and sdo.evaluate accept a cost function with only one input argument.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* To use a cost function that accepts more than one input argument, you use an anonymous function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* Suppose that the myCostFunc_multi_inputs.m file specifies a cost function that takes params and arg1 as inputs.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* For example, you can make the model name an input argument, arg1 , and configure the cost function to be used for multiple models.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* You create convenience objects once and pass them as an input to the cost function to reduce code redundancy and computation cost.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* We will conclude that the T-policy optimum N and D policies depend on the employed cost function.&amp;lt;ref name=&amp;quot;ref_a167&amp;quot;&amp;gt;[https://link.springer.com/article/10.1007/BF02888260 A unified cost function for M/G/1 queueing systems with removable server]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* What we need is a cost function so we can start optimizing our weights.&amp;lt;ref name=&amp;quot;ref_a976&amp;quot;&amp;gt;[https://ml-cheatsheet.readthedocs.io/en/latest/linear_regression.html Linear Regression — ML Glossary documentation]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Let’s use MSE (L2) as our cost function.&amp;lt;ref name=&amp;quot;ref_a976&amp;quot; /&amp;gt;&lt;br /&gt;
* To minimize MSE we use Gradient Descent to calculate the gradient of our cost function.&amp;lt;ref name=&amp;quot;ref_a976&amp;quot; /&amp;gt;&lt;br /&gt;
* There are two parameters (coefficients) in our cost function we can control: weight \(m\) and bias \(b\).&amp;lt;ref name=&amp;quot;ref_a976&amp;quot; /&amp;gt;&lt;br /&gt;
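The MSE cost and its minimization by gradient descent over a weight \(m\) and bias \(b\), as described in these notes, can be sketched as follows; the data and learning rate are made-up illustration values, not taken from the cited sources:

```python
import numpy as np

# Toy data roughly following y = 2x + 1 (made-up example values)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

def mse(m, b):
    # Mean Squared Error between predictions m*x + b and targets y
    return np.mean((y - (m * x + b)) ** 2)

def gradient_step(m, b, lr=0.05):
    # Analytic gradients of MSE with respect to m and b
    err = y - (m * x + b)
    dm = -2.0 * np.mean(x * err)
    db = -2.0 * np.mean(err)
    return m - lr * dm, b - lr * db

m, b = 0.0, 0.0
for _ in range(500):
    m, b = gradient_step(m, b)
# m and b end up close to the generating values 2 and 1
```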
* This applet will allow you to graph a cost function, tangent line to the cost function and the marginal cost function.&amp;lt;ref name=&amp;quot;ref_8500&amp;quot;&amp;gt;[https://www.geogebra.org/m/Rva9PED2 Cost Functions and Marginal Cost Functions]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The cost is the quadratic cost function, \(C\), introduced back in Chapter 1.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot;&amp;gt;[https://eng.libretexts.org/Bookshelves/Computer_Science/Book%3A_Neural_Networks_and_Deep_Learning_(Nielsen)/03%3A_Improving_the_way_neural_networks_learn/3.01%3A_The_cross-entropy_cost_function 3.1: The cross-entropy cost function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* I&amp;#039;ll remind you of the exact form of the cost function shortly, so there&amp;#039;s no need to go and dig up the definition.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* Introducing the cross-entropy cost function How can we address the learning slowdown?&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* It turns out that we can solve the problem by replacing the quadratic cost with a different cost function, known as the cross-entropy.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* In fact, frankly, it&amp;#039;s not even obvious that it makes sense to call this a cost function!&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* Before addressing the learning slowdown, let&amp;#039;s see in what sense the cross-entropy can be interpreted as a cost function.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* Two properties in particular make it reasonable to interpret the cross-entropy as a cost function.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* These are both properties we&amp;#039;d intuitively expect for a cost function.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* But the cross-entropy cost function has the benefit that, unlike the quadratic cost, it avoids the problem of learning slowing down.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* This cancellation is the special miracle ensured by the cross-entropy cost function.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* For both cost functions I simply experimented to find a learning rate that made it possible to see what is going on.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* As discussed above, it&amp;#039;s not possible to say precisely what it means to use the &amp;quot;same&amp;quot; learning rate when the cost function is changed.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* Part of the reason is that the cross-entropy is a widely-used cost function, and so is worth understanding well.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* So the log-likelihood cost behaves as we&amp;#039;d expect a cost function to behave.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
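The cross-entropy cost discussed above, and the reason it avoids the learning slowdown for a sigmoid output (the gradient with respect to the weighted input reduces to a minus y, with no sigmoid-derivative factor), can be sketched as a minimal NumPy example; the sample values are made up:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(a, y):
    # C = -[y ln a + (1 - y) ln(1 - a)], averaged over examples;
    # non-negative, and zero only when a matches y exactly.
    return -np.mean(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

# For a sigmoid output a = sigmoid(z), dC/dz simplifies to (a - y):
# the flat tails of the sigmoid no longer slow learning down.
z = np.array([-4.0, 0.0, 4.0])
y = np.array([1.0, 1.0, 0.0])
a = sigmoid(z)
grad_z = a - y
```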
* The average cost function is formed by dividing the cost by the quantity.&amp;lt;ref name=&amp;quot;ref_db24&amp;quot;&amp;gt;[https://scholarlyoa.com/what-is-an-average-cost-function/ What is an average cost function?]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Cost functions are also known as Loss functions.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot;&amp;gt;[https://machinelearningknowledge.ai/cost-functions-in-machine-learning/ Dummies guide to Cost Functions in Machine Learning [with Animation]]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* This is where cost function comes into the picture.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* Weights are updated for the next iteration on the training data so that the error given by the cost function is further reduced.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost functions for regression are calculated on distance-based error.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* This is also known as distance-based error, and it forms the basis of the cost functions used in regression models.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* In this cost function, the error for each training data is calculated and then the mean value of all these errors is derived.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* So Mean Error is not a recommended cost function for regression.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* Cost functions used in classification problems are different than what we saw in the regression problem above.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* So how does cross entropy help in the cost function for classification?&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* We could have used regression cost function MAE/MSE even for classification problems.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* Hinge loss is another cost function that is mostly used in Support Vector Machines (SVM) for classification.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* There are many cost functions to choose from and the choice depends on type of data and type of problem (regression or classification).&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* Mean Squared Error (MSE) and Mean Absolute Error (MAE) are popular cost functions used in regression problems.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
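The hinge loss mentioned above for SVM classification can be sketched as a small NumPy function; the scores and labels below are made-up illustration values:

```python
import numpy as np

def hinge_loss(scores, labels):
    # labels are in {-1, +1}; margins smaller than 1 are penalized linearly,
    # which is what pushes the SVM hyperplane to separate classes with a margin.
    margins = labels * scores
    return np.mean(np.maximum(0.0, 1.0 - margins))

# Only the second example (margin 0.5) incurs a penalty here.
loss = hinge_loss(np.array([2.0, 0.5, -1.0]), np.array([1.0, 1.0, -1.0]))
```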
* We will illustrate the impact of partial updates on the cost function J M ( k ) with two numerical examples.&amp;lt;ref name=&amp;quot;ref_fcc2&amp;quot;&amp;gt;[https://www.sciencedirect.com/topics/engineering/cost-function-contour Cost Function Contour - an overview]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The cost functions of the averaged systems have been computed to shed some light on the observed differences in convergence rates.&amp;lt;ref name=&amp;quot;ref_fcc2&amp;quot; /&amp;gt;&lt;br /&gt;
* This indicates that the cost function gets gradually flatter for M -max and is the flattest for sequential partial updates.&amp;lt;ref name=&amp;quot;ref_fcc2&amp;quot; /&amp;gt;&lt;br /&gt;
* Then given this class definition, the auto differentiated cost function for it can be constructed as follows.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot;&amp;gt;[http://ceres-solver.org/nnls_modeling.html Modeling Non-linear Least Squares — Ceres Solver]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The algorithm exhibits considerably higher accuracy, but does so by additional evaluations of the cost function.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This class allows you to apply different conditioning to the residual values of a wrapped cost function.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This class compares the Jacobians returned by a cost function against derivatives estimated using finite differencing.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Using a robust loss function, the cost for large residuals is reduced.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Here the convention is that the contribution of a term to the cost function is given by \(\frac{1}{2}\rho(s)\), where \(s =\|f_i\|^2\).&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Ceres includes a number of predefined loss functions.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Sometimes after the optimization problem has been constructed, we wish to mutate the scale of the loss function.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This can have better convergence behavior than just using a loss function with a small scale.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost function carries with it information about the sizes of the parameter blocks it expects.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This option controls whether the Problem object owns the cost functions.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* If set to TAKE_OWNERSHIP, then the problem object will delete the cost functions on destruction.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* The destructor is careful to delete the pointers only once, since sharing cost functions is allowed.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This option controls whether the Problem object owns the loss functions.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* If set to TAKE_OWNERSHIP, then the problem object will delete the loss functions on destruction.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* The destructor is careful to delete the pointers only once, since sharing loss functions is allowed.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Problem::AddResidualBlock(...) adds a residual block to the overall cost function.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* apply_loss_function as the name implies allows the user to switch the application of the loss function on and off.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Users must provide access to pre-computed shared data to their cost functions behind the scenes; this all happens without Ceres knowing.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* I think it would be useful to have a list of common cost functions, alongside a few ways that they have been used in practice.&amp;lt;ref name=&amp;quot;ref_4a94&amp;quot;&amp;gt;[https://stats.stackexchange.com/questions/154879/a-list-of-cost-functions-used-in-neural-networks-alongside-applications A list of cost functions used in neural networks, alongside applications]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* A cost function is the performance measure you want to minimize.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot;&amp;gt;[https://zone.ni.com/reference/en-XX/help/371894J-01/lvsimconcepts/sim_c_costfunc/ Defining a Cost Function (Control Design and Simulation Module)]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The cost function is a functional equation, which maps a set of points in a time series to a single scalar value.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* Use the Cost type parameter of the SIM Optimal Design VI to specify the type of cost function you want this VI to minimize.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the absolute value of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the square of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the time multiplied by the absolute value of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the time multiplied by the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the time multiplied by the square of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the square of the time multiplied by the square of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* After you define these parameters, you can write LabVIEW block diagram code to manipulate the parameters according to the cost function.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
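The integral error criteria listed above (integrals of the error, its absolute value, its square, and time-weighted variants) can be approximated from a sampled error signal; this is a rough sketch with a hypothetical helper name and a uniform time grid assumed:

```python
import numpy as np

def error_criteria(t, e):
    # Discrete approximations of the classic integral error criteria,
    # using a simple rectangle rule; assumes a uniform time grid t.
    dt = t[1] - t[0]
    iae = np.sum(np.abs(e)) * dt        # integral of |e|
    ise = np.sum(e ** 2) * dt           # integral of e^2
    itae = np.sum(t * np.abs(e)) * dt   # integral of t * |e|
    itse = np.sum(t * e ** 2) * dt      # integral of t * e^2
    return iae, ise, itae, itse
```

The time-weighted criteria (ITAE, ITSE) penalize errors that persist late in the response more heavily, which is why they are often preferred for tuning settling behavior.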
* However, the reward associated with each reach (i.e., cost function) is experimentally imposed in most work of this sort.&amp;lt;ref name=&amp;quot;ref_d68d&amp;quot;&amp;gt;[https://jov.arvojournals.org/article.aspx?articleid=2130788 Statistical decision theory for everyday tasks: A natural cost function for human reach and grasp]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* We are interested in deriving natural cost functions that may be used to predict people&amp;#039;s actions in everyday tasks.&amp;lt;ref name=&amp;quot;ref_d68d&amp;quot; /&amp;gt;&lt;br /&gt;
* Our results indicate that people are reaching in a manner that maximizes their expected reward for a natural cost function.&amp;lt;ref name=&amp;quot;ref_d68d&amp;quot; /&amp;gt;&lt;br /&gt;
* Y*, one of the parameters of the cost-minimization story, must be included in the cost function.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot;&amp;gt;[https://cruel.org/econthought/essays/product/cost.html The Cost Function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Property (6), the concavity of the cost function, can be understood via the use of Figure 8.2.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* We have drawn two cost functions, C*(w, y) and C(w, y), where total costs are mapped with respect to one factor price, w i .&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* The corresponding cost function is shown in Figure 8.2 by C*(w, y).&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* , the cost function C(w, y) will lie below the Leontief cost function C*(w, y).&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* Now, recall that one of the properties of cost functions is their concavity with respect to individual factor prices.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* Now, as we saw, ∂C/∂y ≥ 0 by the properties of the cost function.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* As we have demonstrated, the cost function C(w, y) is positively related to the scale of output.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* One ought to imagine that the cost function would thus also capture these different returns to scale in one way or another.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost function C(w 0 , y) drawn in Figure 8.5 is merely a &amp;quot;stretched mirror image&amp;quot; of the production function in Figure 3.1.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* The resulting shape would be similar to the cost function in Figure 8.5.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* We can continue exploiting the relationship between cost functions and production functions by turning to factor price frontiers.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* Relying on the observation of flexible cost functions is pivotal to successful business planning in regards to market expenses.&amp;lt;ref name=&amp;quot;ref_bff8&amp;quot;&amp;gt;[https://www.thoughtco.com/cost-function-definition-1147988 What is a Cost Function?]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* One of these algorithmic changes was the replacement of mean squared error with the cross-entropy family of loss functions.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot;&amp;gt;[https://machinelearningmastery.com/loss-and-loss-functions-for-training-deep-learning-neural-networks/ Loss and Loss Functions for Training Deep Learning Neural Networks]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Importantly, the choice of loss function is directly related to the activation function used in the output layer of your neural network.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot; /&amp;gt;&lt;br /&gt;
* The choice of cost function is tightly coupled with the choice of output unit.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function is a mathematical formula used to chart how production expenses will change at different output levels.&amp;lt;ref name=&amp;quot;ref_4afc&amp;quot;&amp;gt;[https://www.myaccountingcourse.com/accounting-dictionary/cost-function What is a Cost Function? - Definition]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Gradient descent is an iterative optimization algorithm used in machine learning to minimize a loss function.&amp;lt;ref name=&amp;quot;ref_3e49&amp;quot;&amp;gt;[https://www.kdnuggets.com/2020/05/5-concepts-gradient-descent-cost-function.html 5 Concepts You Should Know About Gradient Descent and Cost Function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Let’s use a supervised learning problem, linear regression, to introduce the model, cost function, and gradient descent.&amp;lt;ref name=&amp;quot;ref_983b&amp;quot;&amp;gt;[https://medium.com/@dhartidhami/machine-learning-basics-model-cost-function-and-gradient-descent-79b69ff28091 Machine Learning Basics: Model, Cost function and Gradient Descent]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Also, as it turns out, the cost function for linear regression is a convex function.&amp;lt;ref name=&amp;quot;ref_983b&amp;quot; /&amp;gt;&lt;br /&gt;
* An optimization problem seeks to minimize a loss function.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot;&amp;gt;[https://en.wikipedia.org/wiki/Loss_function Loss function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The use of a quadratic loss function is common, for example when using least squares techniques.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot; /&amp;gt;&lt;br /&gt;
* The quadratic loss function is also used in linear-quadratic optimal control problems.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot; /&amp;gt;&lt;br /&gt;
* In ML, cost functions are used to estimate how badly models are performing.&amp;lt;ref name=&amp;quot;ref_3099&amp;quot;&amp;gt;[https://towardsdatascience.com/machine-learning-fundamentals-via-linear-regression-41a5d11f5220 Machine learning fundamentals (I): Cost functions and gradient descent]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* At this point the model has optimized the weights such that they minimize the cost function.&amp;lt;ref name=&amp;quot;ref_3099&amp;quot; /&amp;gt;&lt;br /&gt;
* Cost Function quantifies the error between predicted values and expected values and presents it in the form of a single real number.&amp;lt;ref name=&amp;quot;ref_0625&amp;quot;&amp;gt;[https://towardsdatascience.com/coding-deep-learning-for-beginners-linear-regression-part-2-cost-function-49545303d29f Coding Deep Learning for Beginners — Linear Regression (Part 2): Cost Function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Depending on the problem Cost Function can be formed in many different ways.&amp;lt;ref name=&amp;quot;ref_0625&amp;quot; /&amp;gt;&lt;br /&gt;
* The goal is to find the values of model parameters for which the Cost Function returns as small a number as possible.&amp;lt;ref name=&amp;quot;ref_0625&amp;quot; /&amp;gt;&lt;br /&gt;
* Let’s try picking a smaller weight now and see if the created Cost Function works.&amp;lt;ref name=&amp;quot;ref_0625&amp;quot; /&amp;gt;&lt;br /&gt;
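The excerpts above describe a cost function as a map from data and parameters to a single scalar, and gradient descent as the iterative procedure that minimizes it. A minimal sketch of both ideas, using a one-parameter linear model y = w·x with a mean-squared-error cost (the example data and learning rate are illustrative, not taken from the quoted sources):&lt;br /&gt;

```python
def mse_cost(w, xs, ys):
    # Cost function: maps the data and a parameter to a single scalar,
    # here the mean of the squared errors.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, w=0.0, lr=0.1, steps=100):
    # Iteratively step opposite the gradient of the cost with respect to w.
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]          # generated with true weight w = 2
w_hat = gradient_descent(xs, ys)
```

Because the MSE cost for linear regression is convex (as noted above), this descent converges to the single global minimum regardless of the starting weight.&lt;br /&gt;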
===Sources===&lt;br /&gt;
 &amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
</feed>