# Intuitive understanding of 1D, 2D, and 3D Convolutions in Convolutional Neural Networks

What is the difference between 1D, 2D, and 3D convolutions in a CNN? Please explain with examples.


1D convolution -

• The kernel slides in only 1 direction to compute the convolution
• input = [W], filter = [k], output = [W]
• input = [1,1,1,1,1], filter = [0.25,0.5,0.25], output = [0.75,1,1,1,0.75] (with SAME zero padding, the edge values drop below 1)
• output shape is a 1D array
• Ex: graph/signal smoothing

Ex -

```python
import tensorflow as tf
import numpy as np

ses = tf.Session()

ones_1d = np.ones(5)
weight_1d = np.ones(3)
stride_1d = 1

in_1d = tf.constant(ones_1d, dtype=tf.float32)
fil_1d = tf.constant(weight_1d, dtype=tf.float32)

in_w = int(in_1d.shape[0])
fil_w = int(fil_1d.shape[0])

# conv1d expects input [batch, width, channels] and filter [width, in_ch, out_ch]
input_1d = tf.reshape(in_1d, [1, in_w, 1])
kernel_1d = tf.reshape(fil_1d, [fil_w, 1, 1])
result_1d = tf.squeeze(tf.nn.conv1d(input_1d, kernel_1d, stride_1d, padding='SAME'))
print(ses.run(result_1d))
```
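Under the hood, a SAME-padded 1D convolution is just a sliding dot product over a zero-padded signal. Here is a minimal pure-Python sketch (no TensorFlow; `conv1d_same` is a hypothetical helper name, odd kernel lengths assumed) that reproduces the smoothing computation — note the edge values differ from the interior because of the zero padding:

```python
def conv1d_same(signal, kernel):
    """Cross-correlation with SAME zero padding, the way tf.nn.conv1d computes it.

    Assumes an odd kernel length so k // 2 zeros on each side keep the
    output the same length as the input.
    """
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [sum(padded[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal))]

# Smoothing example: interior values stay 1, edges shrink due to zero padding
print(conv1d_same([1, 1, 1, 1, 1], [0.25, 0.5, 0.25]))
# [0.75, 1.0, 1.0, 1.0, 0.75]
```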

2D convolution -

• The kernel slides in 2 directions (x, y) to compute the convolution
• input = [W, H], filter = [k, k], output = [W, H]
• output shape is a 2D matrix
• Ex: Sobel edge filter

Ex -

```python
ones_2d = np.ones((5, 5))
weight_2d = np.ones((3, 3))
stride_2d = [1, 1, 1, 1]

in_2d = tf.constant(ones_2d, dtype=tf.float32)
fil_2d = tf.constant(weight_2d, dtype=tf.float32)

in_h = int(in_2d.shape[0])
in_w = int(in_2d.shape[1])
fil_h = int(fil_2d.shape[0])
fil_w = int(fil_2d.shape[1])

# conv2d expects input [batch, height, width, channels] and filter [h, w, in_ch, out_ch]
input_2d = tf.reshape(in_2d, [1, in_h, in_w, 1])
kernel_2d = tf.reshape(fil_2d, [fil_h, fil_w, 1, 1])
result_2d = tf.squeeze(tf.nn.conv2d(input_2d, kernel_2d, strides=stride_2d, padding='SAME'))
print(ses.run(result_2d))
```
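The same computation can be written out by hand. A minimal pure-Python sketch of a SAME-padded 2D convolution (`conv2d_same` is a hypothetical helper, odd kernel sizes assumed) shows why an all-ones 5×5 input with an all-ones 3×3 kernel gives 9 in the interior but smaller values at the border, where some kernel taps fall on zero padding:

```python
def conv2d_same(img, ker):
    """Cross-correlation with SAME zero padding (odd kernel sizes assumed)."""
    H, W = len(img), len(img[0])
    kh, kw = len(ker), len(ker[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * W for _ in range(H)]
    for y in range(H):
        for x in range(W):
            s = 0.0
            for dy in range(kh):
                for dx in range(kw):
                    yy, xx = y + dy - ph, x + dx - pw
                    if 0 <= yy < H and 0 <= xx < W:  # out-of-bounds taps read 0
                        s += img[yy][xx] * ker[dy][dx]
            out[y][x] = s
    return out

res = conv2d_same([[1.0] * 5 for _ in range(5)], [[1.0] * 3 for _ in range(3)])
print(res[2][2], res[2][0], res[0][0])  # 9.0 6.0 4.0 (interior, edge, corner)
```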

3D convolution -

• The kernel slides in 3 directions (x, y, z) to compute the convolution
• input = [W, H, L], filter = [k, k, d], output = [W, H, M]
• output shape is a 3D volume
• d < L is important: only when the kernel depth is smaller than the input depth does the kernel actually slide along depth, producing a volume output (if d = L, the output collapses to a 2D map)

Ex -

```python
ones_3d = np.ones((5, 5, 5))
weight_3d = np.ones((3, 3, 3))
stride_3d = [1, 1, 1, 1, 1]

in_3d = tf.constant(ones_3d, dtype=tf.float32)
fil_3d = tf.constant(weight_3d, dtype=tf.float32)

in_d = int(in_3d.shape[0])
in_h = int(in_3d.shape[1])
in_w = int(in_3d.shape[2])
fil_d = int(fil_3d.shape[0])
fil_h = int(fil_3d.shape[1])
fil_w = int(fil_3d.shape[2])

# conv3d expects input [batch, depth, height, width, channels]
# and filter [depth, height, width, in_ch, out_ch]
input_3d = tf.reshape(in_3d, [1, in_d, in_h, in_w, 1])
kernel_3d = tf.reshape(fil_3d, [fil_d, fil_h, fil_w, 1, 1])
result_3d = tf.squeeze(tf.nn.conv3d(input_3d, kernel_3d, strides=stride_3d, padding='SAME'))
print(ses.run(result_3d))
```
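The 3D case is the same sliding dot product extended to a third axis. A minimal pure-Python sketch (`conv3d_same` is a hypothetical helper, odd kernel sizes assumed) makes the "3 directions" explicit — note the three spatial loops over z, y, and x:

```python
def conv3d_same(vol, ker):
    """Cross-correlation over a volume with SAME zero padding (odd kernel sizes assumed)."""
    D, H, W = len(vol), len(vol[0]), len(vol[0][0])
    kd, kh, kw = len(ker), len(ker[0]), len(ker[0][0])
    pd, ph, pw = kd // 2, kh // 2, kw // 2
    out = [[[0.0] * W for _ in range(H)] for _ in range(D)]
    for z in range(D):          # the kernel slides along depth...
        for y in range(H):      # ...height...
            for x in range(W):  # ...and width: 3 conv directions
                s = 0.0
                for dz in range(kd):
                    for dy in range(kh):
                        for dx in range(kw):
                            zz, yy, xx = z + dz - pd, y + dy - ph, x + dx - pw
                            if 0 <= zz < D and 0 <= yy < H and 0 <= xx < W:
                                s += vol[zz][yy][xx] * ker[dz][dy][dx]
                out[z][y][x] = s
    return out

vol = [[[1.0] * 5 for _ in range(5)] for _ in range(5)]
ker = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
res3 = conv3d_same(vol, ker)
print(res3[2][2][2], res3[0][0][0])  # 27.0 8.0 (interior vs corner)
```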

Hope this helps!

If you want to know more about neural networks, check out this Neural Network Tutorial.


Whether a CNN is 1D, 2D, or 3D refers to the direction(s) in which the kernel slides, rather than the dimensionality of the input or filter tensors.

For a 1-channel input, a 2D convolution is equivalent to a 1D convolution when the kernel height equals the input height: only one direction is left for the kernel to slide.
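This equivalence can be checked directly. A pure-Python sketch (hypothetical helpers, VALID/no padding, single output channel): a 2D convolution whose kernel height equals the input height produces a single output row, and that row matches a multi-channel 1D convolution that treats the input rows as channels:

```python
def conv2d_valid(img, ker):
    """Plain 2D cross-correlation, no padding."""
    H, W, kh, kw = len(img), len(img[0]), len(ker), len(ker[0])
    return [[sum(img[y + dy][x + dx] * ker[dy][dx]
                 for dy in range(kh) for dx in range(kw))
             for x in range(W - kw + 1)]
            for y in range(H - kh + 1)]

def conv1d_multichannel(x, w):
    """1D cross-correlation; x is [width][channels], w is [k][channels]."""
    width, k = len(x), len(w)
    return [sum(x[i + j][c] * w[j][c]
                for j in range(k) for c in range(len(x[0])))
            for i in range(width - k + 1)]

img = [[1, 2, 3, 4, 5],   # 3 rows x 5 columns, single channel
       [0, 1, 0, 1, 0],
       [2, 2, 2, 2, 2]]
ker = [[1, 0, -1],        # kernel height == input height (3)
       [1, 1, 1],
       [0, 2, 0]]

# Reinterpret the 3 rows as 3 channels of a length-5 1D signal
rows_as_channels = [[img[c][i] for c in range(3)] for i in range(5)]  # [W][C]
w = [[ker[c][j] for c in range(3)] for j in range(3)]                 # [k][C]

print(conv2d_valid(img, ker))                     # one output row
print(conv1d_multichannel(rows_as_channels, w))   # identical values
```

Because the kernel spans the full input height, it can no longer slide vertically, so the "2D" convolution is computing exactly the sliding dot product of a 3-channel 1D convolution.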