<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="ko">
	<id>https://wiki.mathnt.net/index.php?action=history&amp;feed=atom&amp;title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98</id>
	<title>손실 함수 - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://wiki.mathnt.net/index.php?action=history&amp;feed=atom&amp;title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98"/>
	<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;action=history"/>
	<updated>2026-04-05T04:48:44Z</updated>
	<subtitle>Revision history for this page</subtitle>
	<generator>MediaWiki 1.35.0</generator>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=51640&amp;oldid=prev</id>
		<title>Edit by Pythagoras0 at 08:59, 17 February 2021 (Wed)</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=51640&amp;oldid=prev"/>
		<updated>2021-02-17T08:59:06Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;ko&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 08:59, 17 February 2021&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l99&quot; &gt;Line 99:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 99:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  &amp;lt;references /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  &amp;lt;references /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== 메타데이터 ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==메타데이터==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt;−&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===위키데이터===&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===위키데이터===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* ID :  [https://www.wikidata.org/wiki/Q1036748 Q1036748]&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* ID :  [https://www.wikidata.org/wiki/Q1036748 Q1036748]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===Spacy 패턴 목록===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [{&amp;#039;LOWER&amp;#039;: &amp;#039;loss&amp;#039;}, {&amp;#039;LEMMA&amp;#039;: &amp;#039;function&amp;#039;}]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [{&amp;#039;LOWER&amp;#039;: &amp;#039;loss&amp;#039;}, {&amp;#039;LEMMA&amp;#039;: &amp;#039;function&amp;#039;}]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [{&amp;#039;LOWER&amp;#039;: &amp;#039;error&amp;#039;}, {&amp;#039;LEMMA&amp;#039;: &amp;#039;function&amp;#039;}]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* [{&amp;#039;LOWER&amp;#039;: &amp;#039;cost&amp;#039;}, {&amp;#039;LEMMA&amp;#039;: &amp;#039;function&amp;#039;}]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=47447&amp;oldid=prev</id>
		<title>Pythagoras0: /* 메타데이터 */ new section</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=47447&amp;oldid=prev"/>
		<updated>2020-12-26T13:25:41Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;메타데이터: &lt;/span&gt; 새 문단&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;ko&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 13:25, 26 December 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l98&quot; &gt;Line 98:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 98:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===소스===&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===소스===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  &amp;lt;references /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  &amp;lt;references /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;== 메타데이터 ==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===위키데이터===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* ID :  [https://www.wikidata.org/wiki/Q1036748 Q1036748]&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45797&amp;oldid=prev</id>
		<title>Pythagoras0: /* 노트 */ new section</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45797&amp;oldid=prev"/>
		<updated>2020-12-16T07:39:01Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;노트: &lt;/span&gt; 새 문단&lt;/span&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left diff-editfont-monospace&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;ko&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:39, 16 December 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;== 노트 ==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function is used to measure how good or bad the model is performing.&amp;lt;ref name=&amp;quot;ref_937a&amp;quot;&amp;gt;[https://www.analyticssteps.com/blogs/what-are-different-loss-functions-used-optimizers-neural-networks What Are Different Loss Functions Used as Optimizers in Neural Networks?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Also, there is no fixed loss function that can be used in all places.&amp;lt;ref name=&amp;quot;ref_937a&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are mainly classified into two categories: Classification loss and Regression loss.&amp;lt;ref name=&amp;quot;ref_937a&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We implement this mechanism in the form of losses and loss functions.&amp;lt;ref name=&amp;quot;ref_b293&amp;quot;&amp;gt;[https://data-flair.training/blogs/keras-loss-functions/ Keras Loss Functions]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Neural networks are trained using an optimizer and we are required to choose a loss function while configuring our model.&amp;lt;ref name=&amp;quot;ref_b293&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Different loss functions play slightly different roles in training neural nets.&amp;lt;ref name=&amp;quot;ref_b293&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This article will explain the role of Keras loss functions in training deep neural nets.&amp;lt;ref name=&amp;quot;ref_b293&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* At its core, a loss function is incredibly simple: it’s a method of evaluating how well your algorithm models your dataset.&amp;lt;ref name=&amp;quot;ref_ffc8&amp;quot;&amp;gt;[https://iq.opengenus.org/types-of-loss-function/ Types of Loss Function]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* If your predictions are totally off, your loss function will output a higher number.&amp;lt;ref name=&amp;quot;ref_ffc8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There are a variety of packages which support these loss functions.&amp;lt;ref name=&amp;quot;ref_ffc8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This paper studies a variety of loss functions and output layer regularization strategies on image classification tasks.&amp;lt;ref name=&amp;quot;ref_3230&amp;quot;&amp;gt;[https://paperswithcode.com/paper/what-s-in-a-loss-function-for-image What&amp;#039;s in a Loss Function for Image Classification?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We’ll be discussing what a loss function is and how it’s used in an artificial neural network.&amp;lt;ref name=&amp;quot;ref_e14f&amp;quot;&amp;gt;[https://deeplizard.com/learn/video/Skc8nqJirJg Loss in a Neural Network explained]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Recall that we’ve already introduced the idea of a loss function in our post on training a neural network.&amp;lt;ref name=&amp;quot;ref_e14f&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function is what SGD is attempting to minimize by iteratively updating the weights in the network.&amp;lt;ref name=&amp;quot;ref_e14f&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This was just illustrating the math behind how one loss function, MSE, works.&amp;lt;ref name=&amp;quot;ref_e14f&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* However, there is no universally accepted definition for other loss functions.&amp;lt;ref name=&amp;quot;ref_2c78&amp;quot;&amp;gt;[https://link.springer.com/article/10.1023/A:1022899518027 Variance and Bias for General Loss Functions]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Most approaches have focused solely on 0-1 loss functions and have produced significantly different definitions.&amp;lt;ref name=&amp;quot;ref_2c78&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Using this framework, bias and variance definitions are produced which generalize to any symmetric loss function.&amp;lt;ref name=&amp;quot;ref_2c78&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We illustrate these statistics on several loss functions with particular emphasis on 0-1 loss.&amp;lt;ref name=&amp;quot;ref_2c78&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The results obtained with their bi-temperature loss function was then compared to the vanilla logistic loss function.&amp;lt;ref name=&amp;quot;ref_b4d0&amp;quot;&amp;gt;[https://www.kdnuggets.com/2019/11/research-guide-advanced-loss-functions-machine-learning-models.html Research Guide: Advanced Loss Functions for Machine Learning Models]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This loss function is adopted for the discriminator.&amp;lt;ref name=&amp;quot;ref_b4d0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* As a result of this, GANs using this loss function are able to generate higher quality images than regular GANs.&amp;lt;ref name=&amp;quot;ref_b4d0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This loss function is used when images that look similar are being compared.&amp;lt;ref name=&amp;quot;ref_b4d0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We will use the term loss function for a single training example and cost function for the entire training dataset.&amp;lt;ref name=&amp;quot;ref_6b69&amp;quot;&amp;gt;[https://analyticsindiamag.com/hands-on-guide-to-loss-functions-used-to-evaluate-a-ml-algorithm/ Hands-On Guide To Loss Functions Used To Evaluate A ML Algorithm]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Depending on the output variable, we need to choose a loss function for our model.&amp;lt;ref name=&amp;quot;ref_6b69&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* MSE loss is a popularly used loss function for regression problems.&amp;lt;ref name=&amp;quot;ref_6b69&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The args and kwargs will be passed to loss_cls during the initialization to instantiate a loss function.&amp;lt;ref name=&amp;quot;ref_aa20&amp;quot;&amp;gt;[https://docs.fast.ai/losses.html Loss Functions]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* c(e) = 1(e ≥ 0)c_1(e) + 1(e &amp;lt; 0)c_2(e) will be a loss function.&amp;lt;ref name=&amp;quot;ref_6adc&amp;quot;&amp;gt;[https://www.encyclopedia.com/social-sciences/applied-and-social-sciences-magazines/loss-functions Encyclopedia.com]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Optimal forecasting of a time series model depends extensively on the specification of the loss function.&amp;lt;ref name=&amp;quot;ref_6adc&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Suppose the loss functions c_1(·), c_2(·) are used for forecasting Y_{t+h} and for forecasting h(Y_{t+h}), respectively.&amp;lt;ref name=&amp;quot;ref_6adc&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Granger (1999) remarks that it would be strange behavior to use the same loss function for Y and h(Y).&amp;lt;ref name=&amp;quot;ref_6adc&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are used to train neural networks and to compute the difference between the output and the target variable.&amp;lt;ref name=&amp;quot;ref_e490&amp;quot;&amp;gt;[https://mxnet.apache.org/versions/1.7/api/python/docs/tutorials/packages/gluon/loss/loss.html Loss functions — Apache MXNet documentation]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A critical component of training neural networks is the loss function.&amp;lt;ref name=&amp;quot;ref_e490&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A loss function is a quantitative measure of how bad the predictions of the network are when compared to ground truth labels.&amp;lt;ref name=&amp;quot;ref_e490&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Some tasks use a combination of multiple loss functions, but often you’ll just use one.&amp;lt;ref name=&amp;quot;ref_e490&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are to be supplied in the loss parameter of the compile.keras.engine.training.&amp;lt;ref name=&amp;quot;ref_e4cb&amp;quot;&amp;gt;[https://keras.rstudio.com/reference/loss_mean_squared_error.html Model loss functions — loss_mean_squared_error]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* How do you capture the difference between two distributions in GAN loss functions?&amp;lt;ref name=&amp;quot;ref_5206&amp;quot;&amp;gt;[https://developers.google.com/machine-learning/gan/loss Generative Adversarial Networks]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function used in the paper that introduced GANs.&amp;lt;ref name=&amp;quot;ref_5206&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A GAN can have two loss functions: one for generator training and one for discriminator training.&amp;lt;ref name=&amp;quot;ref_5206&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There are several ways to define the details of the loss function.&amp;lt;ref name=&amp;quot;ref_dcd8&amp;quot;&amp;gt;[https://cs231n.github.io/linear-classify/ CS231n Convolutional Neural Networks for Visual Recognition]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There is one bug with the loss function we presented above.&amp;lt;ref name=&amp;quot;ref_dcd8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* We can do so by extending the loss function with a regularization penalty \(R(W)\).&amp;lt;ref name=&amp;quot;ref_dcd8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The demo visualizes the loss functions discussed in this section using a toy 3-way classification on 2D data.&amp;lt;ref name=&amp;quot;ref_dcd8&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In SLF, a generic loss function is formulated as a joint optimization problem of network weights and loss parameters.&amp;lt;ref name=&amp;quot;ref_8a58&amp;quot;&amp;gt;[https://aaai.org/ojs/index.php/AAAI/article/view/5925 Stochastic Loss Function]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function for linear regression is squared loss.&amp;lt;ref name=&amp;quot;ref_21cb&amp;quot;&amp;gt;[https://developers.google.com/machine-learning/crash-course/logistic-regression/model-training Logistic Regression: Loss and Regularization]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The way you configure your loss functions can make or break the performance of your algorithm.&amp;lt;ref name=&amp;quot;ref_e8ab&amp;quot;&amp;gt;[https://neptune.ai/blog/pytorch-loss-functions PyTorch Loss Functions: The Ultimate Guide]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In this article, we’ll talk about popular loss functions in PyTorch, and about building custom loss functions.&amp;lt;ref name=&amp;quot;ref_e8ab&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are used to gauge the error between the prediction output and the provided target value.&amp;lt;ref name=&amp;quot;ref_e8ab&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A loss function tells us how far the model is from realizing the expected outcome.&amp;lt;ref name=&amp;quot;ref_e8ab&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In fact, we can design our own (very) basic loss function to further explain how it works.&amp;lt;ref name=&amp;quot;ref_8213&amp;quot;&amp;gt;[https://algorithmia.com/blog/introduction-to-loss-functions Introduction to Loss Functions]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* For each prediction that we make, our loss function will simply measure the absolute difference between our prediction and the actual value.&amp;lt;ref name=&amp;quot;ref_8213&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Notice how in the loss function we defined, it doesn’t matter if our predictions were too high or too low.&amp;lt;ref name=&amp;quot;ref_8213&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A lot of the loss functions that you see implemented in machine learning can get complex and confusing.&amp;lt;ref name=&amp;quot;ref_8213&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* An optimization problem seeks to minimize a loss function.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot;&amp;gt;[https://en.wikipedia.org/wiki/Loss_function Loss function]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The use of a quadratic loss function is common, for example when using least squares techniques.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The quadratic loss function is also used in linear-quadratic optimal control problems.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* One of these algorithmic changes was the replacement of mean squared error with the cross-entropy family of loss functions.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot;&amp;gt;[https://machinelearningmastery.com/loss-and-loss-functions-for-training-deep-learning-neural-networks/ Loss and Loss Functions for Training Deep Learning Neural Networks]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Importantly, the choice of loss function is directly related to the activation function used in the output layer of your neural network.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The choice of cost function is tightly coupled with the choice of output unit.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The model can be updated to use the &#039;mean_squared_logarithmic_error&#039; loss function and keep the same configuration for the output layer.&amp;lt;ref name=&amp;quot;ref_ead3&amp;quot;&amp;gt;[https://machinelearningmastery.com/how-to-choose-loss-functions-when-training-deep-learning-neural-networks/ How to Choose Loss Functions When Training Deep Learning Neural Networks]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are used to determine the error (aka “the loss”) between the output of our algorithms and the given target value.&amp;lt;ref name=&amp;quot;ref_39e5&amp;quot;&amp;gt;[https://deepai.org/machine-learning-glossary-and-terms/loss-function Loss Function]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The quadratic loss is a commonly used symmetric loss function.&amp;lt;ref name=&amp;quot;ref_39e5&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The cost function and the loss function refer to the same concept.&amp;lt;ref name=&amp;quot;ref_e085&amp;quot;&amp;gt;[https://dev.to/imsparsh/most-common-loss-functions-in-machine-learning-57p7 Most Common Loss Functions in Machine Learning]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The cost function is a function that is calculated as the average of all loss function values.&amp;lt;ref name=&amp;quot;ref_e085&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The Loss function is directly related to the predictions of your model that you have built.&amp;lt;ref name=&amp;quot;ref_e085&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This is the most common Loss function used in Classification problems.&amp;lt;ref name=&amp;quot;ref_e085&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The group of functions that are minimized is called “loss functions”.&amp;lt;ref name=&amp;quot;ref_faa0&amp;quot;&amp;gt;[https://medium.com/@phuctrt/loss-functions-why-what-where-or-when-189815343d3f Loss functions: Why, what, where or when?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A loss function is used as a measure of how well a prediction model is able to predict the expected outcome.&amp;lt;ref name=&amp;quot;ref_faa0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* A loss function is a mathematical function commonly used in statistics.&amp;lt;ref name=&amp;quot;ref_cb5b&amp;quot;&amp;gt;[https://radiopaedia.org/articles/loss-function Radiology Reference Article]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There are many types of loss functions including mean absolute loss, mean squared error and mean bias error.&amp;lt;ref name=&amp;quot;ref_cb5b&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are at the heart of the machine learning algorithms we love to use.&amp;lt;ref name=&amp;quot;ref_2088&amp;quot;&amp;gt;[https://www.analyticsvidhya.com/blog/2019/08/detailed-guide-7-loss-functions-machine-learning-python-code/ Loss Function In Machine Learning]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In this article, I will discuss 7 common loss functions used in machine learning and explain where each of them is used.&amp;lt;ref name=&amp;quot;ref_2088&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are one part of the entire machine learning journey you will take.&amp;lt;ref name=&amp;quot;ref_2088&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Here, theta_j is the weight to be updated, alpha is the learning rate and J is the cost function.&amp;lt;ref name=&amp;quot;ref_2088&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Machines learn by means of a loss function.&amp;lt;ref name=&amp;quot;ref_8f8d&amp;quot;&amp;gt;[https://towardsdatascience.com/common-loss-functions-in-machine-learning-46af0ffc4d23 Common Loss functions in machine learning]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* If predictions deviate too much from actual results, the loss function will produce a very large number.&amp;lt;ref name=&amp;quot;ref_8f8d&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Gradually, with the help of some optimization function, the model learns to reduce the error in prediction.&amp;lt;ref name=&amp;quot;ref_8f8d&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* There is no one-size-fits-all loss function for algorithms in machine learning.&amp;lt;ref name=&amp;quot;ref_8f8d&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The loss function is the function that computes the distance between the current output of the algorithm and the expected output.&amp;lt;ref name=&amp;quot;ref_7ee5&amp;quot;&amp;gt;[https://towardsdatascience.com/what-is-loss-function-1e2605aeb904 What are Loss Functions?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* This loss function is convex and grows linearly for negative values (less sensitive to outliers).&amp;lt;ref name=&amp;quot;ref_7ee5&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* The Hinge loss function was developed to correct the hyperplane of SVM algorithm in the task of classification.&amp;lt;ref name=&amp;quot;ref_7ee5&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Unlike the previous loss function, the square is replaced by an absolute value.&amp;lt;ref name=&amp;quot;ref_7ee5&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Mean Squared Error (MSE) is the most commonly used regression loss function.&amp;lt;ref name=&amp;quot;ref_a0e0&amp;quot;&amp;gt;[https://heartbeat.fritz.ai/5-regression-loss-functions-all-machine-learners-should-know-4fb140e9d4b0 5 Regression Loss Functions All Machine Learners Should Know]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Whenever we train a machine learning model, our goal is to find the point that minimizes loss function.&amp;lt;ref name=&amp;quot;ref_a0e0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Problems with both: There can be cases where neither loss function gives desirable predictions.&amp;lt;ref name=&amp;quot;ref_a0e0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Another way is to try a different loss function.&amp;lt;ref name=&amp;quot;ref_a0e0&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Generally, cost and loss functions are synonymous, but a cost function can contain regularization terms in addition to the loss function.&amp;lt;ref name=&amp;quot;ref_d4f7&amp;quot;&amp;gt;[https://medium.com/@zeeshanmulla/cost-activation-loss-function-neural-network-deep-learning-what-are-these-91167825a4de Cost, Activation, Loss Function|| Neural Network|| Deep Learning. What are these?]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss function is a method of evaluating “how well your algorithm models your dataset”.&amp;lt;ref name=&amp;quot;ref_d4f7&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Cost Function quantifies the error between predicted values and expected values and presents it in the form of a single real number.&amp;lt;ref name=&amp;quot;ref_d4f7&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Depending on the problem Cost Function can be formed in many different ways.&amp;lt;ref name=&amp;quot;ref_d4f7&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* In this example, we’re defining the loss function by creating an instance of the loss class.&amp;lt;ref name=&amp;quot;ref_477b&amp;quot;&amp;gt;[https://neptune.ai/blog/keras-loss-functions Keras Loss Functions: Everything You Need To Know]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Problems involving the prediction of more than one class use different loss functions.&amp;lt;ref name=&amp;quot;ref_477b&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* During the training process, one can weigh the loss function by observations or samples.&amp;lt;ref name=&amp;quot;ref_477b&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* It is usually a good idea to monitor the loss function, on the training and validation set as the model is training.&amp;lt;ref name=&amp;quot;ref_477b&amp;quot; /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;* Loss functions are typically created by instantiating a loss class (e.g. from keras.losses).&amp;lt;ref name=&amp;quot;ref_3d67&amp;quot;&amp;gt;[https://keras.io/api/losses/ Losses]&amp;lt;/ref&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===소스===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot;&gt; &lt;/td&gt;&lt;td class=&#039;diff-marker&#039;&gt;+&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &amp;lt;references /&amp;gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
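The weight-update rule quoted in these notes (theta_j updated using learning rate alpha on cost function J) can be sketched as follows; this is an illustrative example only, with a made-up one-parameter quadratic cost, not code from the cited sources:

```python
def gradient_descent(theta, alpha, grad_J, steps=100):
    # theta_j := theta_j - alpha * dJ/dtheta_j, repeated for a fixed number of steps
    for _ in range(steps):
        theta = theta - alpha * grad_J(theta)
    return theta

# Hypothetical cost J(theta) = (theta - 3)**2, whose gradient is 2*(theta - 3);
# gradient descent converges toward the minimizer theta = 3.
theta_min = gradient_descent(theta=0.0, alpha=0.1, grad_J=lambda th: 2.0 * (th - 3.0))
```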
&lt;/table&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45796&amp;oldid=prev</id>
		<title>Pythagoras0: 문서를 비움</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45796&amp;oldid=prev"/>
		<updated>2020-12-16T07:38:49Z</updated>

		<summary type="html">&lt;p&gt;문서를 비움&lt;/p&gt;
&lt;a href=&quot;https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;amp;diff=45796&amp;amp;oldid=45761&quot;&gt;차이 보기&lt;/a&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
	<entry>
		<id>https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45761&amp;oldid=prev</id>
		<title>Pythagoras0: /* 노트 */ 새 문단</title>
		<link rel="alternate" type="text/html" href="https://wiki.mathnt.net/index.php?title=%EC%86%90%EC%8B%A4_%ED%95%A8%EC%88%98&amp;diff=45761&amp;oldid=prev"/>
		<updated>2020-12-16T05:31:16Z</updated>

		<summary type="html">&lt;p&gt;&lt;span dir=&quot;auto&quot;&gt;&lt;span class=&quot;autocomment&quot;&gt;노트: &lt;/span&gt; 새 문단&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;새 문서&lt;/b&gt;&lt;/p&gt;&lt;div&gt;== 노트 ==&lt;br /&gt;
&lt;br /&gt;
* The following points highlight the three main types of cost functions.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot;&amp;gt;[http://www.economicsdiscussion.net/cost/3-main-types-of-cost-functions/19976 3 Main Types of Cost Functions]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Statistical cost functions will have a bias towards linearity.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot; /&amp;gt;&lt;br /&gt;
* We have noted that if the cost function is linear, the equation used in preparing the total cost curve in Fig.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot; /&amp;gt;&lt;br /&gt;
* Most economists agree that linear cost functions are valid over the relevant range of output for the firm.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot; /&amp;gt;&lt;br /&gt;
* In traditional economics, we must make use of the cubic cost function as illustrated in Fig. 15.5.&amp;lt;ref name=&amp;quot;ref_fabe&amp;quot; /&amp;gt;&lt;br /&gt;
* However, there are cost functions which cannot be decomposed using a loss function.&amp;lt;ref name=&amp;quot;ref_2c4d&amp;quot;&amp;gt;[http://image.diku.dk/shark/sphinx_pages/build/html/rest_sources/tutorials/concepts/library_design/losses.html Loss and Cost Functions — Shark 3.0a documentation]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* In other words, all loss functions generate a cost function, but not all cost functions must be based on a loss function.&amp;lt;ref name=&amp;quot;ref_2c4d&amp;quot; /&amp;gt;&lt;br /&gt;
* This allows embarrassingly parallelizable gradient descent on the cost function.&amp;lt;ref name=&amp;quot;ref_2c4d&amp;quot; /&amp;gt;&lt;br /&gt;
* hasFirstDerivative Can the cost function calculate its first derivative?&amp;lt;ref name=&amp;quot;ref_2c4d&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost function describes how the firm’s total costs vary with its output, the number of cars that it produces.&amp;lt;ref name=&amp;quot;ref_d624&amp;quot;&amp;gt;[https://www.core-econ.org/the-economy/book/text/leibniz-07-03-01.html The Economy: Leibniz: Average and marginal cost functions]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Now think about the shape of the average cost function.&amp;lt;ref name=&amp;quot;ref_d624&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function is a MATLAB® function that evaluates your design requirements using design variable values.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot;&amp;gt;[https://www.mathworks.com/help/sldo/ug/writing-a-custom-cost-function.html Write a Cost Function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* When you optimize or estimate model parameters, you provide the saved cost function as an input to sdo.optimize .&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* To understand the parts of a cost function, consider the following sample function myCostFunc .&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* Compute the requirements (objective and constraint violations) and assign them to vals, the output of the cost function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* Specifies the inputs of the cost function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function must have as input, params , a vector of the design variables to be estimated, optimized, or used for sensitivity analysis.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* For more information, see Specify Inputs of the Cost Function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* In this sample cost function, the requirements are based on the design variable x, a model parameter.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost function first extracts the current values of the design variables and then computes the requirements.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* Specifies the requirement values as outputs, vals and derivs , of the cost function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function must return vals , a structure with one or more fields that specify the values of the objective and constraint violations.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* For more information, see Specify Outputs of the Cost Function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* However, sdo.optimize and sdo.evaluate accept a cost function with only one input argument.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* To use a cost function that accepts more than one input argument, you use an anonymous function.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* Suppose that the myCostFunc_multi_inputs.m file specifies a cost function that takes params and arg1 as inputs.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* For example, you can make the model name an input argument, arg1 , and configure the cost function to be used for multiple models.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* You create convenience objects once and pass them as an input to the cost function to reduce code redundancy and computation cost.&amp;lt;ref name=&amp;quot;ref_ead5&amp;quot; /&amp;gt;&lt;br /&gt;
* We will conclude that the T-policy optimum N and D policies depend on the employed cost function.&amp;lt;ref name=&amp;quot;ref_a167&amp;quot;&amp;gt;[https://link.springer.com/article/10.1007/BF02888260 A unified cost function for M/G/1 queueing systems with removable server]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* What we need is a cost function so we can start optimizing our weights.&amp;lt;ref name=&amp;quot;ref_a976&amp;quot;&amp;gt;[https://ml-cheatsheet.readthedocs.io/en/latest/linear_regression.html Linear Regression — ML Glossary documentation]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Let’s use MSE (L2) as our cost function.&amp;lt;ref name=&amp;quot;ref_a976&amp;quot; /&amp;gt;&lt;br /&gt;
* To minimize MSE we use Gradient Descent to calculate the gradient of our cost function.&amp;lt;ref name=&amp;quot;ref_a976&amp;quot; /&amp;gt;&lt;br /&gt;
* There are two parameters (coefficients) in our cost function we can control: weight \(m\) and bias \(b\).&amp;lt;ref name=&amp;quot;ref_a976&amp;quot; /&amp;gt;&lt;br /&gt;
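The MSE cost and its minimization by gradient descent over a weight \(m\) and bias \(b\), as described in these notes, can be sketched as follows; the data and learning rate are made-up illustration values, not taken from the cited sources:

```python
import numpy as np

# Toy data roughly following y = 2x + 1 (made-up example values)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

def mse(m, b):
    # Mean Squared Error between predictions m*x + b and targets y
    return np.mean((y - (m * x + b)) ** 2)

def gradient_step(m, b, lr=0.05):
    # Analytic gradients of MSE with respect to m and b
    err = y - (m * x + b)
    dm = -2.0 * np.mean(x * err)
    db = -2.0 * np.mean(err)
    return m - lr * dm, b - lr * db

m, b = 0.0, 0.0
for _ in range(500):
    m, b = gradient_step(m, b)
# m and b end up close to the generating values 2 and 1
```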
* This applet will allow you to graph a cost function, tangent line to the cost function and the marginal cost function.&amp;lt;ref name=&amp;quot;ref_8500&amp;quot;&amp;gt;[https://www.geogebra.org/m/Rva9PED2 Cost Functions and Marginal Cost Functions]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The cost is the quadratic cost function, \(C\), introduced back in Chapter 1.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot;&amp;gt;[https://eng.libretexts.org/Bookshelves/Computer_Science/Book%3A_Neural_Networks_and_Deep_Learning_(Nielsen)/03%3A_Improving_the_way_neural_networks_learn/3.01%3A_The_cross-entropy_cost_function 3.1: The cross-entropy cost function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* I&amp;#039;ll remind you of the exact form of the cost function shortly, so there&amp;#039;s no need to go and dig up the definition.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* Introducing the cross-entropy cost function How can we address the learning slowdown?&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* It turns out that we can solve the problem by replacing the quadratic cost with a different cost function, known as the cross-entropy.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* In fact, frankly, it&amp;#039;s not even obvious that it makes sense to call this a cost function!&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* Before addressing the learning slowdown, let&amp;#039;s see in what sense the cross-entropy can be interpreted as a cost function.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* Two properties in particular make it reasonable to interpret the cross-entropy as a cost function.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* These are both properties we&amp;#039;d intuitively expect for a cost function.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* But the cross-entropy cost function has the benefit that, unlike the quadratic cost, it avoids the problem of learning slowing down.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* This cancellation is the special miracle ensured by the cross-entropy cost function.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* For both cost functions I simply experimented to find a learning rate that made it possible to see what is going on.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* As discussed above, it&amp;#039;s not possible to say precisely what it means to use the &amp;quot;same&amp;quot; learning rate when the cost function is changed.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* Part of the reason is that the cross-entropy is a widely-used cost function, and so is worth understanding well.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
* So the log-likelihood cost behaves as we&amp;#039;d expect a cost function to behave.&amp;lt;ref name=&amp;quot;ref_83c2&amp;quot; /&amp;gt;&lt;br /&gt;
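The cross-entropy cost discussed above, and the reason it avoids the learning slowdown for a sigmoid output (the gradient with respect to the weighted input reduces to a minus y, with no sigmoid-derivative factor), can be sketched as a minimal NumPy example; the sample values are made up:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(a, y):
    # C = -[y ln a + (1 - y) ln(1 - a)], averaged over examples;
    # non-negative, and zero only when a matches y exactly.
    return -np.mean(y * np.log(a) + (1.0 - y) * np.log(1.0 - a))

# For a sigmoid output a = sigmoid(z), dC/dz simplifies to (a - y):
# the flat tails of the sigmoid no longer slow learning down.
z = np.array([-4.0, 0.0, 4.0])
y = np.array([1.0, 1.0, 0.0])
a = sigmoid(z)
grad_z = a - y
```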
* The average cost function is formed by dividing the cost by the quantity.&amp;lt;ref name=&amp;quot;ref_db24&amp;quot;&amp;gt;[https://scholarlyoa.com/what-is-an-average-cost-function/ What is an average cost function?]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Cost functions are also known as Loss functions.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot;&amp;gt;[https://machinelearningknowledge.ai/cost-functions-in-machine-learning/ Dummies guide to Cost Functions in Machine Learning [with Animation]]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* This is where cost function comes into the picture.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* Weights are updated for the next iteration on the training data so that the error given by the cost function is further reduced.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost functions for regression are calculated on distance-based error.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* This is also known as distance-based error, and it forms the basis of the cost functions used in regression models.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* In this cost function, the error for each training data is calculated and then the mean value of all these errors is derived.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* So Mean Error is not a recommended cost function for regression.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* Cost functions used in classification problems are different than what we saw in the regression problem above.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* So how does cross entropy help in the cost function for classification?&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* We could have used regression cost function MAE/MSE even for classification problems.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* Hinge loss is another cost function that is mostly used in Support Vector Machines (SVM) for classification.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* There are many cost functions to choose from and the choice depends on type of data and type of problem (regression or classification).&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
* Mean Squared Error (MSE) and Mean Absolute Error (MAE) are popular cost functions used in regression problems.&amp;lt;ref name=&amp;quot;ref_aeab&amp;quot; /&amp;gt;&lt;br /&gt;
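The hinge loss mentioned above for SVM classification can be sketched as a small NumPy function; the scores and labels below are made-up illustration values:

```python
import numpy as np

def hinge_loss(scores, labels):
    # labels are in {-1, +1}; margins smaller than 1 are penalized linearly,
    # which is what pushes the SVM hyperplane to separate classes with a margin.
    margins = labels * scores
    return np.mean(np.maximum(0.0, 1.0 - margins))

# Only the second example (margin 0.5) incurs a penalty here.
loss = hinge_loss(np.array([2.0, 0.5, -1.0]), np.array([1.0, 1.0, -1.0]))
```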
* We will illustrate the impact of partial updates on the cost function J M ( k ) with two numerical examples.&amp;lt;ref name=&amp;quot;ref_fcc2&amp;quot;&amp;gt;[https://www.sciencedirect.com/topics/engineering/cost-function-contour Cost Function Contour - an overview]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The cost functions of the averaged systems have been computed to shed some light on the observed differences in convergence rates.&amp;lt;ref name=&amp;quot;ref_fcc2&amp;quot; /&amp;gt;&lt;br /&gt;
* This indicates that the cost function gets gradually flatter for M -max and is the flattest for sequential partial updates.&amp;lt;ref name=&amp;quot;ref_fcc2&amp;quot; /&amp;gt;&lt;br /&gt;
* Then given this class definition, the auto differentiated cost function for it can be constructed as follows.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot;&amp;gt;[http://ceres-solver.org/nnls_modeling.html Modeling Non-linear Least Squares — Ceres Solver]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The algorithm exhibits considerably higher accuracy, but does so by additional evaluations of the cost function.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This class allows you to apply different conditioning to the residual values of a wrapped cost function.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This class compares the Jacobians returned by a cost function against derivatives estimated using finite differencing.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Using a robust loss function, the cost for large residuals is reduced.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Here the convention is that the contribution of a term to the cost function is given by \(\frac{1}{2}\rho(s)\), where \(s =\|f_i\|^2\).&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Ceres includes a number of predefined loss functions.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Sometimes after the optimization problem has been constructed, we wish to mutate the scale of the loss function.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This can have better convergence behavior than just using a loss function with a small scale.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost function carries with it information about the sizes of the parameter blocks it expects.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This option controls whether the Problem object owns the cost functions.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* If set to TAKE_OWNERSHIP, then the problem object will delete the cost functions on destruction.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* The destructor is careful to delete the pointers only once, since sharing cost functions is allowed.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* This option controls whether the Problem object owns the loss functions.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* If set to TAKE_OWNERSHIP, then the problem object will delete the loss functions on destruction.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* The destructor is careful to delete the pointers only once, since sharing loss functions is allowed.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Problem::AddResidualBlock(...) adds a residual block to the overall cost function.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* apply_loss_function as the name implies allows the user to switch the application of the loss function on and off.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* Users must provide access to pre-computed shared data to their cost functions behind the scenes; this all happens without Ceres knowing.&amp;lt;ref name=&amp;quot;ref_965b&amp;quot; /&amp;gt;&lt;br /&gt;
* I think it would be useful to have a list of common cost functions, alongside a few ways that they have been used in practice.&amp;lt;ref name=&amp;quot;ref_4a94&amp;quot;&amp;gt;[https://stats.stackexchange.com/questions/154879/a-list-of-cost-functions-used-in-neural-networks-alongside-applications A list of cost functions used in neural networks, alongside applications]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* A cost function is the performance measure you want to minimize.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot;&amp;gt;[https://zone.ni.com/reference/en-XX/help/371894J-01/lvsimconcepts/sim_c_costfunc/ Defining a Cost Function (Control Design and Simulation Module)]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The cost function is a functional equation, which maps a set of points in a time series to a single scalar value.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* Use the Cost type parameter of the SIM Optimal Design VI to specify the type of cost function you want this VI to minimize.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the absolute value of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the square of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the time multiplied by the absolute value of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the time multiplied by the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the time multiplied by the square of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function that integrates the square of the time multiplied by the square of the error.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
* After you define these parameters, you can write LabVIEW block diagram code to manipulate the parameters according to the cost function.&amp;lt;ref name=&amp;quot;ref_0df0&amp;quot; /&amp;gt;&lt;br /&gt;
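The integral error criteria listed above (integrals of the error, its absolute value, its square, and time-weighted variants) can be approximated from a sampled error signal; this is a rough sketch with a hypothetical helper name and a uniform time grid assumed:

```python
import numpy as np

def error_criteria(t, e):
    # Discrete approximations of the classic integral error criteria,
    # using a simple rectangle rule; assumes a uniform time grid t.
    dt = t[1] - t[0]
    iae = np.sum(np.abs(e)) * dt        # integral of |e|
    ise = np.sum(e ** 2) * dt           # integral of e^2
    itae = np.sum(t * np.abs(e)) * dt   # integral of t * |e|
    itse = np.sum(t * e ** 2) * dt      # integral of t * e^2
    return iae, ise, itae, itse
```

The time-weighted criteria (ITAE, ITSE) penalize errors that persist late in the response more heavily, which is why they are often preferred for tuning settling behavior.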
* However, the reward associated with each reach (i.e., cost function) is experimentally imposed in most work of this sort.&amp;lt;ref name=&amp;quot;ref_d68d&amp;quot;&amp;gt;[https://jov.arvojournals.org/article.aspx?articleid=2130788 Statistical decision theory for everyday tasks: A natural cost function for human reach and grasp]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* We are interested in deriving natural cost functions that may be used to predict people&amp;#039;s actions in everyday tasks.&amp;lt;ref name=&amp;quot;ref_d68d&amp;quot; /&amp;gt;&lt;br /&gt;
* Our results indicate that people are reaching in a manner that maximizes their expected reward for a natural cost function.&amp;lt;ref name=&amp;quot;ref_d68d&amp;quot; /&amp;gt;&lt;br /&gt;
* Y*, one of the parameters of the cost-minimization story, must be included in the cost function.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot;&amp;gt;[https://cruel.org/econthought/essays/product/cost.html The Cost Function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Property (6), the concavity of the cost function, can be understood via the use of Figure 8.2.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* We have drawn two cost functions, C*(w, y) and C(w, y), where total costs are mapped with respect to one factor price, w i .&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* The corresponding cost function is shown in Figure 8.2 by C*(w, y).&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* , the cost function C(w, y) will lie below the Leontief cost function C*(w, y).&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* Now, recall that one of the properties of cost functions is their concavity with respect to individual factor prices.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* Now, as we saw, ∂C/∂y ≥ 0 by the properties of the cost function.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* As we have demonstrated, the cost function C(w, y) is positively related to the scale of output.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* One ought to imagine that the cost function would thus also capture these different returns to scale in one way or another.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* The cost function C(w 0 , y) drawn in Figure 8.5 is merely a &amp;quot;stretched mirror image&amp;quot; of the production function in Figure 3.1.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* The resulting shape would be similar to the cost function in Figure 8.5.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* We can continue exploiting the relationship between cost functions and production functions by turning to factor price frontiers.&amp;lt;ref name=&amp;quot;ref_6f05&amp;quot; /&amp;gt;&lt;br /&gt;
* Relying on the observation of flexible cost functions is pivotal to successful business planning in regards to market expenses.&amp;lt;ref name=&amp;quot;ref_bff8&amp;quot;&amp;gt;[https://www.thoughtco.com/cost-function-definition-1147988 What is a Cost Function?]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* One of these algorithmic changes was the replacement of mean squared error with the cross-entropy family of loss functions.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot;&amp;gt;[https://machinelearningmastery.com/loss-and-loss-functions-for-training-deep-learning-neural-networks/ Loss and Loss Functions for Training Deep Learning Neural Networks]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Importantly, the choice of loss function is directly related to the activation function used in the output layer of your neural network.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot; /&amp;gt;&lt;br /&gt;
* The choice of cost function is tightly coupled with the choice of output unit.&amp;lt;ref name=&amp;quot;ref_8699&amp;quot; /&amp;gt;&lt;br /&gt;
* A cost function is a mathematical formula used to chart how production expenses will change at different output levels.&amp;lt;ref name=&amp;quot;ref_4afc&amp;quot;&amp;gt;[https://www.myaccountingcourse.com/accounting-dictionary/cost-function What is a Cost Function? - Definition]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Gradient descent is an iterative optimization algorithm used in machine learning to minimize a loss function.&amp;lt;ref name=&amp;quot;ref_3e49&amp;quot;&amp;gt;[https://www.kdnuggets.com/2020/05/5-concepts-gradient-descent-cost-function.html 5 Concepts You Should Know About Gradient Descent and Cost Function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Let’s use a supervised learning problem, linear regression, to introduce the model, cost function, and gradient descent.&amp;lt;ref name=&amp;quot;ref_983b&amp;quot;&amp;gt;[https://medium.com/@dhartidhami/machine-learning-basics-model-cost-function-and-gradient-descent-79b69ff28091 Machine Learning Basics: Model, Cost function and Gradient Descent]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Also, as it turns out, the cost function for linear regression is a convex function.&amp;lt;ref name=&amp;quot;ref_983b&amp;quot; /&amp;gt;&lt;br /&gt;
* An optimization problem seeks to minimize a loss function.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot;&amp;gt;[https://en.wikipedia.org/wiki/Loss_function Loss function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* The use of a quadratic loss function is common, for example when using least squares techniques.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot; /&amp;gt;&lt;br /&gt;
* The quadratic loss function is also used in linear-quadratic optimal control problems.&amp;lt;ref name=&amp;quot;ref_7bae&amp;quot; /&amp;gt;&lt;br /&gt;
* In ML, cost functions are used to estimate how badly models are performing.&amp;lt;ref name=&amp;quot;ref_3099&amp;quot;&amp;gt;[https://towardsdatascience.com/machine-learning-fundamentals-via-linear-regression-41a5d11f5220 Machine learning fundamentals (I): Cost functions and gradient descent]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* At this point the model has optimized the weights such that they minimize the cost function.&amp;lt;ref name=&amp;quot;ref_3099&amp;quot; /&amp;gt;&lt;br /&gt;
* Cost Function quantifies the error between predicted values and expected values and presents it in the form of a single real number.&amp;lt;ref name=&amp;quot;ref_0625&amp;quot;&amp;gt;[https://towardsdatascience.com/coding-deep-learning-for-beginners-linear-regression-part-2-cost-function-49545303d29f Coding Deep Learning for Beginners — Linear Regression (Part 2): Cost Function]&amp;lt;/ref&amp;gt;&lt;br /&gt;
* Depending on the problem Cost Function can be formed in many different ways.&amp;lt;ref name=&amp;quot;ref_0625&amp;quot; /&amp;gt;&lt;br /&gt;
* The goal is to find the values of model parameters for which the Cost Function returns as small a number as possible.&amp;lt;ref name=&amp;quot;ref_0625&amp;quot; /&amp;gt;&lt;br /&gt;
* Let’s try picking a smaller weight now and see if the created Cost Function works.&amp;lt;ref name=&amp;quot;ref_0625&amp;quot; /&amp;gt;&lt;br /&gt;
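The excerpts above describe a cost function as a map from data and parameters to a single scalar, and gradient descent as the iterative procedure that minimizes it. A minimal sketch of both ideas, using a one-parameter linear model y = w·x with a mean-squared-error cost (the example data and learning rate are illustrative, not taken from the quoted sources):&lt;br /&gt;

```python
def mse_cost(w, xs, ys):
    # Cost function: maps the data and a parameter to a single scalar,
    # here the mean of the squared errors.
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient_descent(xs, ys, w=0.0, lr=0.1, steps=100):
    # Iteratively step opposite the gradient of the cost with respect to w.
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]          # generated with true weight w = 2
w_hat = gradient_descent(xs, ys)
```

Because the MSE cost for linear regression is convex (as noted above), this descent converges to the single global minimum regardless of the starting weight.&lt;br /&gt;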
===Sources===&lt;br /&gt;
 &amp;lt;references /&amp;gt;&lt;/div&gt;</summary>
		<author><name>Pythagoras0</name></author>
	</entry>
</feed>