Deep Learning

Deep learning is a branch of machine learning that uses neural networks. Traditional machine learning works well when the data is small and simple, but when we have a lot of data, deep learning does a better job of finding patterns.
Neural networks are designed to act like the human brain. Just like our brain uses many connected neurons to recognize things, a neural network uses layers of nodes to figure things out.
In traditional machine learning, we usually hand the model explicit features (like “height” or “color”) so it can learn. With neural networks, the computer can learn the features by itself.
For example, if we give the computer a picture of a cat, we don’t have to tell it “look for whiskers” or “look for ears.” The network learns step by step:
  • First layers find simple shapes (lines, edges).
  • Middle layers combine them into parts (ears, eyes, tail).
  • Final layers put everything together to say, “This is a cat.”
The network learns through trial and error. If it makes a mistake, it adjusts its “weights” (connections) until it gets better at making the right choice.
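To make the trial-and-error idea concrete, here is a minimal NumPy sketch (the data and learning rate are made-up values, not from any real model) of a single neuron nudging its weight after every mistake:

```python
import numpy as np

# Made-up toy data: the "right answer" follows the rule y = 2 * x.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

w = 0.0             # the neuron's single weight, starting from a bad guess
learning_rate = 0.01

for step in range(200):
    prediction = w * x                  # the neuron's current guess
    error = prediction - y              # how wrong the guess is
    gradient = (2 * error * x).mean()   # which direction reduces the error
    w -= learning_rate * gradient       # adjust the weight a little

print(round(w, 3))  # close to 2.0 -- it "found" the right weight
```

Each pass measures how wrong the prediction is and moves the weight slightly in the direction that shrinks the error; a real network does the same thing across millions of weights at once (this is what backpropagation automates).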
 
[Image: a neural network with two hidden layers classifying a photo as Cat or Dog]
This picture shows how a neural network can tell if an image is a cat or a dog.
  1. Input Layer (left side): The photo of the animal (a cat in this case) goes into the network.
  2. Hidden Layer 1: The network looks for small parts, like the cat’s ears, eyes, legs, or tail. It also checks for dog parts, like a dog’s ears or tail.
  3. Hidden Layer 2: These small parts are combined into bigger features, like the cat’s head and body or the dog’s head and body.
  4. Output Layer (right side): Finally, the network makes a decision:
    • If the features look more like a cat → it outputs Cat.
    • If they look more like a dog → it outputs Dog.
The green and orange lines show how information flows forward from one layer to the next. The network learns by adjusting these connections until it gets good at telling cats and dogs apart.
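That forward flow can be sketched in a few lines of NumPy. Everything here is illustrative: the weights are random rather than learned, and the layer sizes are invented:

```python
import numpy as np

def sigmoid(z):
    # squashes each value into the range (0, 1)
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(seed=0)

x = rng.random(64)                   # input layer: a tiny flattened "photo"
W1 = rng.standard_normal((16, 64))   # connections: input -> hidden layer 1
W2 = rng.standard_normal((8, 16))    # connections: hidden 1 -> hidden 2
W3 = rng.standard_normal((2, 8))     # connections: hidden 2 -> output

h1 = sigmoid(W1 @ x)    # hidden layer 1: detects small parts (ears, eyes, ...)
h2 = sigmoid(W2 @ h1)   # hidden layer 2: combines parts into bigger features
out = sigmoid(W3 @ h2)  # output layer: one score for "cat", one for "dog"

print("Cat" if out[0] > out[1] else "Dog")
```

With random weights the answer is meaningless; training adjusts W1, W2, and W3 until the output scores line up with the correct labels.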
 
Here is a comparison between Statistical Machine Learning (ML) and Deep Learning (DL):
| Aspect | Statistical Machine Learning | Deep Learning |
| --- | --- | --- |
| Data Requirement | Works well with small to medium-sized datasets. | Requires very large datasets to perform well. |
| Feature Engineering | Relies heavily on human-designed features (domain knowledge is important). | Learns features automatically from raw data (minimal manual feature engineering). |
| Complexity of Patterns | Suitable for simpler or structured data patterns (e.g., tabular data). | Excels at learning highly complex, nonlinear, and unstructured patterns (e.g., images, audio, text). |
| Interpretability | Models are usually easier to interpret and explain. | Models are often “black boxes” and harder to interpret. |
| Examples of Algorithms | Linear/Logistic Regression, SVM, Random Forest, k-NN. | CNNs (Convolutional Neural Networks), RNNs (Recurrent Neural Networks), Transformers. |
 
Different types of deep learning architectures:

| Architecture | How It Works | Best For | Example Applications |
| --- | --- | --- | --- |
| Feed Forward Neural Network (FNN) | Information moves in one direction (input → hidden layers → output). No loops. | General-purpose tasks with structured/tabular data. | Credit scoring, basic regression/classification, simple recommendation systems. |
| Recurrent Neural Network (RNN) | Has loops, so it can remember previous inputs and process sequences over time. | Sequential data (time-dependent). | Text generation, speech recognition, stock price prediction. |
| Convolutional Neural Network (CNN) | Uses convolution filters to detect patterns (edges, textures, shapes) in data. | Image, video, and spatial data. | Image recognition (cats vs dogs), medical imaging, object detection, facial recognition. |
| Transformers | Uses self-attention to understand relationships in data without needing sequence-by-sequence processing. | Very large-scale sequence and language tasks. | Large Language Models (ChatGPT, BERT), machine translation, document summarization. |
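To make one row of the table concrete, here is a minimal NumPy sketch of the convolution idea behind CNNs: a small filter slid along the data, firing wherever it finds the pattern it encodes. The signal and filter values are made up for illustration:

```python
import numpy as np

# A made-up 1-D "signal" with an edge (a jump from 0 to 1) at index 3.
signal = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# A tiny filter that responds to a jump between neighbouring values.
edge_filter = np.array([-1.0, 1.0])

# Slide the filter across the signal. A CNN layer does the same thing,
# just in 2-D and with many filters whose values are learned from data.
response = np.correlate(signal, edge_filter, mode="valid")

print(response)  # [0. 0. 1. 0. 0.] -- the filter fires exactly at the edge
```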
 

Activation Functions in Neural Networks

  • It’s like a switch that decides how much a neuron should “fire.”
  • Without it, a neural network is just doing straight-line math.
  • With it, the network can bend and learn more complex patterns.
An activation function introduces nonlinearity, which is what lets the network model problems that straight-line math can’t solve (see the quick check below).
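Here is a quick NumPy check of that point, using random made-up weights: two linear layers with no activation in between collapse into a single linear layer, so stacking them adds no power.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

x = rng.random(4)                    # a made-up input
W1 = rng.standard_normal((3, 4))     # "layer 1" weights
W2 = rng.standard_normal((2, 3))     # "layer 2" weights

# Two layers with no activation in between...
two_layers = W2 @ (W1 @ x)
# ...behave exactly like one layer whose weights are the merged matrix.
one_layer = (W2 @ W1) @ x

print(np.allclose(two_layers, one_layer))  # True: still straight-line math
```

Putting a nonlinear function (like the sigmoid below) between the layers breaks this collapse, which is what gives depth its value.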

🔹 What is the sigmoid function?

The sigmoid takes any number (big, small, positive, or negative) and squeezes it between 0 and 1:
sigmoid(x) = 1 / (1 + e^(−x))
[Graph: the S-shaped sigmoid curve, rising from 0 to 1]
  • If the input is very negative → output is close to 0
  • If the input is very positive → output is close to 1
  • If the input is around 0 → output is 0.5
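A minimal NumPy version of the sigmoid confirms those three cases:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

print(sigmoid(-10))  # ~0.00005 -> very negative input, output close to 0
print(sigmoid(10))   # ~0.99995 -> very positive input, output close to 1
print(sigmoid(0))    # 0.5      -> input of exactly 0, output of exactly 0.5
```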

🔹 Why use sigmoid for binary classification?

Because binary problems are just yes/no, 0/1, true/false.
  • Sigmoid makes the model’s output look like a probability.
    • 0.9 → 90% chance it’s “yes”
    • 0.1 → 10% chance it’s “yes” (so likely “no”)
  • That’s why it’s perfect for binary classification.
The sigmoid is like a squasher. No matter what numbers the model calculates, sigmoid squeezes them into a range between 0 and 1, so we can treat the output as the probability of something being “yes” (class 1) or “no” (class 0).
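Putting it all together, here is a hedged sketch of how a binary classifier turns raw scores into yes/no answers; the scores below are invented for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# Made-up raw model outputs ("logits") for three examples.
raw_scores = np.array([2.2, -2.2, 0.4])

probabilities = sigmoid(raw_scores)  # squashed into the range (0, 1)

for p in probabilities:
    label = "yes (class 1)" if p >= 0.5 else "no (class 0)"
    print(f"{p:.2f} -> {label}")

# Output:
# 0.90 -> yes (class 1)
# 0.10 -> no (class 0)
# 0.60 -> yes (class 1)
```

The 0.5 cutoff is the usual default for turning a probability into a hard decision, though it can be moved if one kind of mistake is more costly than the other.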