in Machine Learning by (3.4k points)
What is the difference between 1D, 2D and 3D convolutions in CNN? Please explain with examples.

2 Answers

by (10.9k points)

1D convolution -

  • Only 1 direction to calculate the convolution
  • input = [W], filter = [k], output = [W]
  • input = [1,1,1,1,1], filter = [0.25,0.5,0.25], output = [1,1,1,1,1] (interior values; with zero padding the two edge values come out as 0.75)
  • output shape is a 1D array
  • Ex: graph/signal smoothing

Ex -

import tensorflow as tf  # TensorFlow 1.x API
import numpy as np

ses = tf.Session()

ones_1d = np.ones(5)
weight_1d = np.ones(3)
stride_1d = 1

in_1d = tf.constant(ones_1d, dtype=tf.float32)
fil_1d = tf.constant(weight_1d, dtype=tf.float32)

in_w = int(in_1d.shape[0])
fil_w = int(fil_1d.shape[0])

# conv1d expects input [batch, width, channels] and kernel [width, in_ch, out_ch]
input_1d = tf.reshape(in_1d, [1, in_w, 1])
kernel_1d = tf.reshape(fil_1d, [fil_w, 1, 1])
result_1d = tf.squeeze(tf.nn.conv1d(input_1d, kernel_1d, stride_1d, padding='SAME'))
print(ses.run(result_1d))
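The same 1-D smoothing can be sanity-checked without TensorFlow using plain NumPy; np.convolve with mode='same' zero-pads so the output keeps the input length, matching padding='SAME' at stride 1 (the all-ones kernel is symmetric, so convolution and cross-correlation agree):

```python
import numpy as np

signal = np.ones(5)
kernel = np.ones(3)

# 'same' zero-pads the signal so the output length equals the input length
smoothed = np.convolve(signal, kernel, mode='same')
print(smoothed)  # [2. 3. 3. 3. 2.]
```

The interior values are 3 (the kernel sees three ones) while the edges drop to 2 because of the zero padding.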

2D convolution -

  • 2 directions (x, y) to calculate the convolution
  • input = [W, H], filter = [k, k], output = [W, H]
  • output shape is a 2D matrix
  • Ex: Sobel edge filter

Ex -

ones_2d = np.ones((5, 5))
weight_2d = np.ones((3, 3))
stride_2d = [1, 1, 1, 1]

in_2d = tf.constant(ones_2d, dtype=tf.float32)
fil_2d = tf.constant(weight_2d, dtype=tf.float32)

in_h = int(in_2d.shape[0])
in_w = int(in_2d.shape[1])
fil_h = int(fil_2d.shape[0])
fil_w = int(fil_2d.shape[1])

# conv2d expects input [batch, height, width, channels] and kernel [h, w, in_ch, out_ch]
input_2d = tf.reshape(in_2d, [1, in_h, in_w, 1])
kernel_2d = tf.reshape(fil_2d, [fil_h, fil_w, 1, 1])
result_2d = tf.squeeze(tf.nn.conv2d(input_2d, kernel_2d, strides=stride_2d, padding='SAME'))
print(ses.run(result_2d))
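The 2-D result can be cross-checked by hand: with a 3x3 all-ones kernel over a 5x5 all-ones input and 'SAME' zero padding, each interior output is 9 while a corner only sees a 2x2 patch of ones. A minimal pure-NumPy sliding-window version (conv2d_same is a hypothetical helper written here for illustration):

```python
import numpy as np

def conv2d_same(image, kernel):
    """Direct 2-D cross-correlation with 'SAME' zero padding, stride 1."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # sum of the element-wise product over the window at (i, j)
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

result = conv2d_same(np.ones((5, 5)), np.ones((3, 3)))
print(result[2, 2], result[0, 0])  # 9.0 4.0  (interior vs corner)
```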

3D convolution -

  • 3 directions (x, y, z) to calculate the convolution
  • input = [W, H, L], filter = [k, k, d], output = [W, H, M]
  • output shape is a 3D volume
  • d < L is important for producing a volume output (otherwise the depth collapses to 1)

Ex -

ones_3d = np.ones((5, 5, 5))
weight_3d = np.ones((3, 3, 3))
stride_3d = [1, 1, 1, 1, 1]

in_3d = tf.constant(ones_3d, dtype=tf.float32)
fil_3d = tf.constant(weight_3d, dtype=tf.float32)

in_w = int(in_3d.shape[0])
in_h = int(in_3d.shape[1])
in_d = int(in_3d.shape[2])
fil_w = int(fil_3d.shape[0])
fil_h = int(fil_3d.shape[1])
fil_d = int(fil_3d.shape[2])

# conv3d expects input [batch, depth, height, width, channels]
# and kernel [depth, height, width, in_ch, out_ch]
input_3d = tf.reshape(in_3d, [1, in_d, in_h, in_w, 1])
kernel_3d = tf.reshape(fil_3d, [fil_d, fil_h, fil_w, 1, 1])
result_3d = tf.squeeze(tf.nn.conv3d(input_3d, kernel_3d, strides=stride_3d, padding='SAME'))
print(ses.run(result_3d))
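The 3-D shape arithmetic can be verified the same way: with 'SAME' padding and stride 1 the output keeps the 5x5x5 input shape, the center voxel sums all 27 ones under the 3x3x3 kernel, and a corner voxel only sees a 2x2x2 patch. A pure-NumPy sketch (conv3d_same is a hypothetical helper, assuming single channels and a cubic kernel with odd side):

```python
import numpy as np

def conv3d_same(volume, kernel):
    """Direct 3-D cross-correlation with 'SAME' zero padding, stride 1."""
    k = kernel.shape[0]  # assumes a cubic kernel with odd side length
    p = k // 2
    padded = np.pad(volume, p)
    out = np.zeros_like(volume, dtype=float)
    for i in range(volume.shape[0]):
        for j in range(volume.shape[1]):
            for m in range(volume.shape[2]):
                out[i, j, m] = np.sum(padded[i:i + k, j:j + k, m:m + k] * kernel)
    return out

res = conv3d_same(np.ones((5, 5, 5)), np.ones((3, 3, 3)))
print(res.shape, res[2, 2, 2], res[0, 0, 0])  # (5, 5, 5) 27.0 8.0
```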

Hope this helps! 

If you want to know more about neural networks, visit this Neural Network Tutorial.

by (47.2k points)
The 2d conv with 3d input is a nice touch. I would suggest an edit to include 1d conv with 2d input (e.g. a multi-channel array) and compare the difference thereof with a 2d conv with 2d input.
by (90.8k points)

Whether a CNN is 1D, 2D, or 3D refers to the direction of the convolution, not to the dimensionality of the input or the filter.

For a 1-channel input, a 2D convolution is equivalent to a 1D convolution if the kernel length equals the input length along one axis, since only one direction is left to convolve over.
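A minimal NumPy sketch of that equivalence, assuming a single channel, stride 1, and 'VALID' padding: when the kernel spans the full input height, a 2-D convolution can only slide horizontally, so the result is one value per horizontal position, i.e. a 1-D output.

```python
import numpy as np

# Single-channel 4x6 input; the kernel spans the full input height (4),
# so a 'VALID' 2-D convolution can only slide along the width.
x = np.arange(24, dtype=float).reshape(4, 6)
k = np.ones((4, 3))

# One dot product per horizontal offset: output width = 6 - 3 + 1 = 4
out = np.array([np.sum(x[:, j:j + 3] * k) for j in range(6 - 3 + 1)])
print(out.shape)  # (4,) -- a 1-D sequence, exactly what a 1-D conv would give
```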


