Comparing Shallow and Deep Neural Networks: A Detailed Analysis

Subin Alex
Aug 3, 2024


When designing neural networks, one common decision is whether to use a shallow or deep architecture. Both have their advantages and disadvantages, depending on the task at hand. In this post, we’ll compare two neural networks that map a scalar input x to a scalar output y: a shallow network with one hidden layer and a deep network with multiple hidden layers.

The Neural Networks

Shallow Network

  • Structure: Single hidden layer
  • Hidden Units: 95

Deep Network

  • Structure: 10 hidden layers
  • Hidden Units per Layer: 5
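To make the two architectures concrete, here is a minimal pure-Python sketch of both networks and a forward pass. The post does not name an activation function, so ReLU is an assumption here (it matches the later discussion of linear regions). The helper names `init_layers` and `forward` and the random initialization are illustrative only.

```python
import random

def relu(values):
    """Element-wise ReLU (assumed activation; the post does not specify one)."""
    return [max(0.0, v) for v in values]

def init_layers(widths, seed=0):
    """Build random (weights, biases) pairs, one per layer-to-layer connection."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(widths[:-1], widths[1:]):
        weights = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.gauss(0.0, 1.0) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """Map a scalar input x to a scalar output y."""
    h = [x]
    for k, (weights, biases) in enumerate(layers):
        h = [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(weights, biases)]
        if k < len(layers) - 1:  # no activation on the output layer
            h = relu(h)
    return h[0]

SHALLOW_WIDTHS = [1, 95, 1]           # input, one hidden layer of 95 units, output
DEEP_WIDTHS = [1] + [5] * 10 + [1]    # input, ten hidden layers of 5 units, output

shallow = init_layers(SHALLOW_WIDTHS)
deep = init_layers(DEEP_WIDTHS)
print(forward(shallow, 0.5), forward(deep, 0.5))
```

Describing each network as a list of layer widths keeps the two architectures directly comparable: only the list changes, not the code.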

Calculating Parameters

Shallow Network

  1. Input to Hidden Layer:
     • Weights: 1×95=95
     • Biases: 95
     • Total: 95+95=190
  2. Hidden Layer to Output Layer:
     • Weights: 95×1=95
     • Biases: 1
     • Total: 95+1=96
  3. Total Parameters:
     190+96=286

Deep Network

  1. Input to First Hidden Layer:
     • Weights: 1×5=5
     • Biases: 5
     • Total: 5+5=10
  2. Hidden Layers to Hidden Layers:
     • For 9 connections (between 10 layers):
     • Weights: 5×5=25
     • Biases: 5
     • Total per connection: 25+5=30
     • Total for all connections: 9×30=270
  3. Last Hidden Layer to Output Layer:
     • Weights: 5×1=5
     • Biases: 1
     • Total: 5+1=6
  4. Total Parameters:
     10+270+6=286

Both networks have the same number of parameters: 286.
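The arithmetic above can be checked with a short sketch. It assumes fully connected layers with one bias per unit, as in the counts above; the function name `count_parameters` is just an illustrative choice.

```python
def count_parameters(widths):
    """Sum weights (n_in * n_out) and biases (n_out) over consecutive layer pairs."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths[:-1], widths[1:]))

shallow_widths = [1, 95, 1]           # scalar input, 95 hidden units, scalar output
deep_widths = [1] + [5] * 10 + [1]    # scalar input, 10 hidden layers of 5, scalar output

print(count_parameters(shallow_widths))  # 286
print(count_parameters(deep_widths))     # 286
```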

Linear Regions

Shallow Network

  • Number of Linear Regions: at most 95+1=96, assuming ReLU activations (with a scalar input, each hidden unit adds at most one kink to the function)

Deep Network

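  • Number of Linear Regions: at most (5+1)^10=60,466,176, assuming ReLU activations (each layer can split every existing region into at most 5+1 pieces)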

The deep network, despite having fewer units per layer and the same number of parameters, can create far more linear regions because each layer can subdivide the regions produced by the layers before it.
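As a rough sketch of where such counts come from, assume ReLU activations and a scalar input (the post does not state the activation, so this is an assumption). Under that assumption, a shallow network with D hidden units has at most D+1 linear regions, because each unit switches on or off at a single input value, and a deep network with K layers of D units has at most (D+1)^K. These are upper bounds, not the number a trained network will actually use. The function names below are illustrative.

```python
import random

def shallow_max_regions(hidden_units):
    """Upper bound for a one-hidden-layer ReLU network with scalar input."""
    return hidden_units + 1

def deep_max_regions(hidden_units, layers):
    """Upper bound for a deep ReLU network with scalar input."""
    return (hidden_units + 1) ** layers

def shallow_breakpoints(weights, biases):
    """Distinct inputs x where some hidden unit's pre-activation w*x + b crosses zero."""
    return sorted({-b / w for w, b in zip(weights, biases) if w != 0.0})

# Randomly initialized shallow network with 95 hidden units (illustrative values).
rng = random.Random(0)
w = [rng.gauss(0.0, 1.0) for _ in range(95)]
b = [rng.gauss(0.0, 1.0) for _ in range(95)]
kinks = shallow_breakpoints(w, b)

print(len(kinks) + 1)            # regions of this particular shallow network
print(shallow_max_regions(95))   # 96
print(deep_max_regions(5, 10))   # 60466176
```

Counting breakpoints works for the shallow case because every kink in the output comes from exactly one hidden unit; in the deep case, later layers can create kinks at different inputs in each existing region, which is why the bound multiplies across layers.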

Runtime Performance

Shallow Network

  • Likely to run faster: a forward pass needs only 2 sequential layer computations and 95+95=190 weight multiplications, and all 95 hidden units can be evaluated in parallel.

Deep Network

  • May be slower: a forward pass needs 5+9×25+5=235 weight multiplications spread over 11 layer computations that must run one after another, despite having the same number of parameters.
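One way to make this comparison concrete is to count the work in a single forward pass. The sketch below counts weight multiplications and the number of layer computations that have to happen in sequence. It is a back-of-the-envelope model, not a benchmark: real runtime also depends on hardware, batching, and the framework used. The name `forward_pass_cost` is illustrative.

```python
def forward_pass_cost(widths):
    """Return (weight multiplications, sequential layer computations) for one input."""
    pairs = list(zip(widths[:-1], widths[1:]))
    multiplies = sum(n_in * n_out for n_in, n_out in pairs)
    sequential_steps = len(pairs)  # each layer must wait for the previous one
    return multiplies, sequential_steps

shallow_cost = forward_pass_cost([1, 95, 1])
deep_cost = forward_pass_cost([1] + [5] * 10 + [1])

print(shallow_cost)  # (190, 2)
print(deep_cost)     # (235, 11)
```

The multiplication counts are close, so the main difference is depth: the shallow network's work can be done in two wide, parallel steps, while the deep network needs eleven narrow steps in a row.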

Conclusion

  • Shallow Network: Runs faster but may be limited in modeling complex functions.
  • Deep Network: More powerful in terms of creating linear regions and modeling complexity but may run slower.

In summary, the choice between a shallow and deep neural network depends on the specific requirements of your task. If speed is crucial, a shallow network might be preferable. If modeling complex patterns is more important, a deep network could be the better choice.
