Recurrent Neural Networks and Long Short-Term Memory

Recurrent Neural Network

In traditional neural networks, all inputs and outputs are independent of each other. But in tasks such as predicting the next word of a sentence, the previous words are required, so the network needs a way to remember them. The Recurrent Neural Network (RNN) was created for this purpose. RNNs are networks with loops in them, allowing information to persist.
Recurrent Neural Network Architecture
In the above diagram, the network takes $x_t$ as input and outputs $y_t$. The hidden layer applies a formula to the current input as well as the previous state to compute the current state $h_t$.
The formula for the current state can be written like this:
$h_t = \tanh(l_1(x_t) + r_1(h_{t-1}))$
The output is then calculated as:
$y_t = l_2(h_t)$
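These two formulas can be sketched as a minimal RNN step in plain NumPy. This is an illustrative sketch, not code from any framework: the weight matrices `W_l1`, `W_r1`, `W_l2` and the sizes (input 3, hidden 4, output 2) are assumptions chosen to match the names $l_1$, $r_1$, $l_2$ above.

```python
import numpy as np

# Illustrative weights for the linear maps l1, r1, l2 (sizes are assumptions).
rng = np.random.default_rng(0)
W_l1 = rng.normal(scale=0.1, size=(4, 3))   # l1: input -> hidden
W_r1 = rng.normal(scale=0.1, size=(4, 4))   # r1: previous hidden -> hidden
W_l2 = rng.normal(scale=0.1, size=(2, 4))   # l2: hidden -> output

def rnn_step(x_t, h_prev):
    """One time step: h_t = tanh(l1(x_t) + r1(h_{t-1})), y_t = l2(h_t)."""
    h_t = np.tanh(W_l1 @ x_t + W_r1 @ h_prev)
    y_t = W_l2 @ h_t
    return h_t, y_t

# Run over a short sequence, carrying the hidden state forward each step.
h = np.zeros(4)
for x in rng.normal(size=(5, 3)):
    h, y = rnn_step(x, h)
```

Note that the same weights are reused at every time step; only the hidden state $h_t$ changes as it is carried along the sequence.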
This chain-like nature reveals that recurrent neural networks are intimately related to sequences and lists. They are the natural neural network architecture to use for such data.

Recurrent Neural Networks suffer from short-term memory: if a sequence is long enough, they have a hard time carrying information from earlier time steps to later ones. So if you are trying to process a paragraph of text to make predictions, RNNs may leave out important information from the beginning.

Long Short-Term Memory


During backpropagation, recurrent neural networks suffer from the vanishing gradient problem. Gradients are the values used to update a neural network's weights. The vanishing gradient problem occurs when the gradient shrinks as it is backpropagated through time. If a gradient value becomes extremely small, it contributes very little to learning.
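The shrinking can be seen with a toy calculation. Backpropagating through $T$ time steps multiplies the gradient by the recurrent Jacobian once per step; with a small recurrent weight and tanh derivatives at most 1, the product collapses toward zero. The scalar weight `w = 0.5` below is an assumed value for illustration only.

```python
import numpy as np

# Toy illustration of the vanishing gradient (scalar recurrent weight assumed).
w = 0.5             # recurrent weight, chosen < 1 for the example
grad = 1.0
for t in range(50): # backpropagate through 50 time steps
    grad *= w * (1 - np.tanh(0.0) ** 2)  # tanh'(0) = 1, so each factor is w
print(grad)  # 0.5**50 ~ 8.9e-16: almost no learning signal reaches early steps
```

After 50 steps the gradient is around $10^{-15}$, so the earliest inputs effectively stop influencing the weight updates.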

Later, the LSTM (Long Short-Term Memory) network was invented to solve this issue by explicitly introducing a memory unit, called the cell, into the network. This is the diagram of the LSTM building blocks.
The repeating module in an LSTM contains four interacting layers.


  • Forget gate: how much information from the previous cell state will be forgotten.

  • Input gate: how much information from the input will be kept.

  • Candidate cell state: temporary cell information computed from the current input.

  • Cell state: the current cell information carried through the whole sequence.

  • Output gate: how much of the cell information will be used for the output.


Gated Recurrent Unit



Various types


Attention



