# ReLU

## Notes

- Rectified Linear Unit, otherwise known as ReLU, is an activation function used in neural networks. ^{[1]}
- It suffers from the problem of dying ReLUs. ^{[1]}
- Does the Rectified Linear Unit (ReLU) function meet this criterion? ^{[2]}
- Because ReLU doesn't change any non-negative value. ^{[3]}
- So for (sigmoid, ReLU) in the last two layers, the model is not able to learn, i.e. the gradients are not backpropagated well. ^{[4]}
- The rectified linear unit, more widely known as ReLU, has become popular over the past several years because of its performance and speed. ^{[5]}
- Moreover, ReLU overcomes the vanishing gradient problem. ^{[5]}
- That's why experiments show ReLU to be about six times faster than other well-known activation functions. ^{[5]}
- If you input an x-value that is greater than zero, then it's the same as with ReLU – the result will be a y-value equal to the x-value. ^{[6]}
- SNNs cannot be derived with (scaled) rectified linear units (ReLUs), sigmoid units, tanh units, and leaky ReLUs. ^{[6]}
- ReLU is very simple to calculate, as it involves only a comparison between its input and the value 0. ^{[7]}
- As a consequence, using ReLU helps to prevent exponential growth in the computation required to operate the neural network. ^{[7]}
- While sigmoidal functions have derivatives that tend to 0 as they approach positive infinity, the derivative of ReLU remains a constant 1 for all positive inputs. ^{[7]}
- This flowchart shows a typical architecture for a CNN with a ReLU and a Dropout layer. ^{[7]}
- …regularization to the inputs of the ReLU can be reduced. ^{[8]}
- Instead of sigmoids, most recent deep learning networks use rectified linear units (ReLUs) for the hidden layers. ^{[9]}
- ReLU is the simplest non-linear activation function you can use. ^{[9]}
- Research has shown that ReLUs result in much faster training for large networks. ^{[9]}
- That is, ReLU units can irreversibly die during training since they can get knocked off the data manifold. ^{[9]}
- Neural networks (NN) with rectified linear units (ReLU) have been widely implemented since 2012. ^{[10]}
- In this paper, we describe an activation function called the biased ReLU neuron (BReLU), which is similar to the ReLU. ^{[10]}
- ReLU is a non-linear activation function that is used in multi-layer neural networks and deep neural networks. ^{[11]}
- According to equation 1, the output of ReLU is the maximum of zero and the input value (see the first code sketch after this list). ^{[11]}
- ReLU stands for rectified linear activation unit and is considered one of the few milestones in the deep learning revolution. ^{[12]}
- The activation functions mostly used before ReLU, such as sigmoid and tanh, saturate. ^{[12]}
- ReLU, on the other hand, does not face this problem, as its slope doesn't plateau, or "saturate," when the input gets large. ^{[12]}
- Because the slope of ReLU in the negative range is also 0, once a neuron's input stays negative, it is unlikely to recover. ^{[12]}
- ReLU stands for Rectified Linear Unit. ^{[13]}
- Leaky ReLU is another variant of ReLU that aims to solve the problem of the gradient becoming zero for the left half of the axis. ^{[13]}
- The parameterised ReLU, as the name suggests, introduces a new parameter as the slope of the negative part of the function. ^{[13]}
- Unlike the leaky ReLU and parametric ReLU functions, instead of a straight line, ELU uses an exponential curve for defining the negative values (see the variants sketch after this list). ^{[13]}
- One way ReLUs improve neural networks is by speeding up training. ^{[14]}
- The Rectified Linear Unit has become very popular in the last few years. ^{[15]}
- (-) Unfortunately, ReLU units can be fragile during training and can "die". ^{[15]}
- Leaky ReLUs are one attempt to fix the "dying ReLU" problem. ^{[15]}
- Instead of the function being zero when x < 0, a leaky ReLU has a small negative slope (of 0.01, or so). ^{[15]}
- Since ReLU is zero for all negative inputs, it is possible for any given unit not to activate at all. ^{[16]}
- As long as not all of the inputs are negative, we can still get a slope out of ReLU. ^{[16]}
- If not, leaky ReLU and ELU are also good alternatives to try. ^{[16]}
- ReLU stands for rectified linear unit, and is a type of activation function. ^{[17]}
- Concatenated ReLU has two outputs, one ReLU and one negative ReLU, concatenated together. ^{[17]}
- You may run into ReLU-6 in some libraries, which is ReLU capped at 6 (see the last code sketch after this list). ^{[17]}
- On the other hand, ELU becomes smooth slowly until its output equals −α, whereas ReLU smooths sharply. ^{[18]}
- ReLU is less computationally expensive than tanh and sigmoid because it involves simpler mathematical operations. ^{[18]}
- Further reading: Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, Kaiming He et al. ^{[18]}
- A node or unit that implements this activation function is referred to as a rectified linear activation unit, or ReLU for short. ^{[19]}
- The idea is to use rectified linear units to produce the code layer. ^{[19]}
- Most papers that achieve state-of-the-art results will describe a network using ReLU. ^{[19]}
- … we propose a new generalization of ReLU, which we call Parametric Rectified Linear Unit (PReLU). ^{[19]}
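
Several notes above describe ReLU as the maximum of zero and the input, computable with a single comparison, with a derivative that is a constant 1 for positive inputs. A minimal NumPy sketch of that definition and its piecewise derivative (the names `relu` and `relu_grad` are illustrative, not taken from the cited sources):

```python
import numpy as np

def relu(x):
    # ReLU returns max(0, x) elementwise: a single comparison per input.
    return np.maximum(0.0, x)

def relu_grad(x):
    # The derivative is a constant 1 for positive inputs and 0 for negative
    # ones; ReLU is not differentiable at x = 0, so a subgradient is picked
    # by convention (0 here).
    return (x > 0).astype(x.dtype)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))       # zero for negative inputs, identity for positive ones
print(relu_grad(x))  # [0. 0. 0. 1. 1.]
```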
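The leaky, parametric, and exponential (ELU) variants quoted above differ only in how they handle negative inputs. A sketch under the usual textbook definitions; the defaults (alpha=0.01 for leaky ReLU, alpha=1.0 for ELU) are the commonly quoted choices, not values taken from these sources:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # A small fixed slope (0.01 or so) keeps a nonzero gradient for x < 0.
    return np.where(x > 0, x, alpha * x)

def prelu(x, a):
    # Parametric ReLU: the negative-side slope `a` is a parameter learned
    # during training rather than fixed in advance.
    return np.where(x > 0, x, a * x)

def elu(x, alpha=1.0):
    # ELU follows the exponential curve alpha * (exp(x) - 1) for x < 0,
    # smoothly saturating toward -alpha for very negative inputs.
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

x = np.array([-3.0, -1.0, 0.0, 2.0])
print(leaky_relu(x))     # [-0.03 -0.01  0.    2.  ]
print(prelu(x, a=0.25))  # [-0.75 -0.25  0.    2.  ]
print(elu(x))            # roughly [-0.95, -0.632, 0, 2]
```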
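For the capped and concatenated variants mentioned above: ReLU-6 is simply ReLU clipped at 6, and concatenated ReLU (CReLU) stacks a ReLU of the input with a ReLU of its negation, doubling the output dimension. A brief sketch, again with illustrative names:

```python
import numpy as np

def relu6(x):
    # ReLU capped at 6: equivalent to min(max(x, 0), 6).
    return np.clip(x, 0.0, 6.0)

def crelu(x):
    # Concatenated ReLU: ReLU(x) and ReLU(-x) joined along the last axis,
    # so the output has twice as many features as the input.
    return np.concatenate([np.maximum(0.0, x), np.maximum(0.0, -x)], axis=-1)

x = np.array([-2.0, 3.0, 8.0])
print(relu6(x))  # [0. 3. 6.]
print(crelu(x))  # [0. 3. 8. 2. 0. 0.]
```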

### Sources

1. ReLU as an Activation Function in Neural Networks
2. Why is the ReLU function not differentiable at x=0?
3. Can relu be used at the last layer of a neural network?
4. Is ReLU After Sigmoid Bad?
5. ReLU as Neural Networks Activation Function
6. Activation Functions Explained - GELU, SELU, ELU, ReLU and more
7. How ReLU and Dropout Layers Work in CNNs
8. (PDF) Improvement of learning for CNN with ReLU activation by sparse regularization
9. ReLU and Softmax Activation Functions · Kulbear/deep-learning-nano-foundation Wiki · GitHub
10. Biased ReLU neural networks
11. ReLu
12. An Introduction to Rectified Linear Unit (ReLU)
13. Fundamentals Of Deep Learning
14. Why do we use ReLU in neural networks and how do we use it?
15. CS231n Convolutional Neural Networks for Visual Recognition
16. A Practical Guide to ReLU
17. ReLU — Most popular Activation Function for Deep Neural Networks
18. Activation Functions — ML Glossary documentation
19. A Gentle Introduction to the Rectified Linear Unit (ReLU)

## Metadata

### Wikidata

- ID: Q7303176