in Machine Learning by (3.4k points)
What is the difference between 1D, 2D and 3D convolutions in CNN? Please explain with examples.

2 Answers

by (10.9k points)

1D convolution -

  • Only 1 direction to calculate the convolution
  • input = [W], filter = [k], output = [W]
  • input = [1,1,1,1,1], filter = [0.25,0.5,0.25], output = [1,1,1,1,1] (interior values; with zero padding the two edge values come out as 0.75)
  • output shape is a 1D array
  • Ex: graph/signal smoothing

Ex -

import tensorflow as tf  # TensorFlow 1.x API
import numpy as np

ses = tf.Session()

ones_1d = np.ones(5)
weight_1d = np.ones(3)
stride_1d = 1

in_1d = tf.constant(ones_1d, dtype=tf.float32)
fil_1d = tf.constant(weight_1d, dtype=tf.float32)

in_w = int(in_1d.shape[0])
fil_w = int(fil_1d.shape[0])

# conv1d expects input [batch, width, channels] and kernel [width, in_ch, out_ch]
input_1d = tf.reshape(in_1d, [1, in_w, 1])
kernel_1d = tf.reshape(fil_1d, [fil_w, 1, 1])
result_1d = tf.squeeze(tf.nn.conv1d(input_1d, kernel_1d, stride_1d, padding='SAME'))
print(ses.run(result_1d))
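The same 1-D smoothing can be sanity-checked without TensorFlow using plain NumPy; np.convolve with mode='same' zero-pads so the output keeps the input length, matching padding='SAME' at stride 1 (the all-ones kernel is symmetric, so convolution and cross-correlation agree):

```python
import numpy as np

signal = np.ones(5)
kernel = np.ones(3)

# 'same' zero-pads the signal so the output length equals the input length
smoothed = np.convolve(signal, kernel, mode='same')
print(smoothed)  # [2. 3. 3. 3. 2.]
```

The interior values are 3 (the kernel sees three ones) while the edges drop to 2 because of the zero padding.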

2D convolution -

  • 2 directions (x, y) to calculate the convolution
  • input = [W, H], filter = [k, k], output = [W, H]
  • output shape is a 2D matrix
  • Ex: Sobel edge filter

Ex -

ones_2d = np.ones((5, 5))
weight_2d = np.ones((3, 3))
stride_2d = [1, 1, 1, 1]

in_2d = tf.constant(ones_2d, dtype=tf.float32)
fil_2d = tf.constant(weight_2d, dtype=tf.float32)

in_h = int(in_2d.shape[0])
in_w = int(in_2d.shape[1])
fil_h = int(fil_2d.shape[0])
fil_w = int(fil_2d.shape[1])

# conv2d expects input [batch, height, width, channels] and kernel [h, w, in_ch, out_ch]
input_2d = tf.reshape(in_2d, [1, in_h, in_w, 1])
kernel_2d = tf.reshape(fil_2d, [fil_h, fil_w, 1, 1])
result_2d = tf.squeeze(tf.nn.conv2d(input_2d, kernel_2d, strides=stride_2d, padding='SAME'))
print(ses.run(result_2d))
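The 2-D result can be cross-checked by hand: with a 3x3 all-ones kernel over a 5x5 all-ones input and 'SAME' zero padding, each interior output is 9 while a corner only sees a 2x2 patch of ones. A minimal pure-NumPy sliding-window version (conv2d_same is a hypothetical helper written here for illustration):

```python
import numpy as np

def conv2d_same(image, kernel):
    """Direct 2-D cross-correlation with 'SAME' zero padding, stride 1."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            # sum of the element-wise product over the window at (i, j)
            out[i, j] = np.sum(padded[i:i + kh, j:j + kw] * kernel)
    return out

result = conv2d_same(np.ones((5, 5)), np.ones((3, 3)))
print(result[2, 2], result[0, 0])  # 9.0 4.0  (interior vs corner)
```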

3D convolution -

  • 3 directions (x, y, z) to calculate the convolution
  • input = [W, H, L], filter = [k, k, d], output = [W, H, M]
  • output shape is a 3D volume
  • d < L is important for producing a volume output (otherwise the depth collapses to 1)

Ex -

ones_3d = np.ones((5, 5, 5))
weight_3d = np.ones((3, 3, 3))
stride_3d = [1, 1, 1, 1, 1]

in_3d = tf.constant(ones_3d, dtype=tf.float32)
fil_3d = tf.constant(weight_3d, dtype=tf.float32)

in_w = int(in_3d.shape[0])
in_h = int(in_3d.shape[1])
in_d = int(in_3d.shape[2])
fil_w = int(fil_3d.shape[0])
fil_h = int(fil_3d.shape[1])
fil_d = int(fil_3d.shape[2])

# conv3d expects input [batch, depth, height, width, channels]
# and kernel [depth, height, width, in_ch, out_ch]
input_3d = tf.reshape(in_3d, [1, in_d, in_h, in_w, 1])
kernel_3d = tf.reshape(fil_3d, [fil_d, fil_h, fil_w, 1, 1])
result_3d = tf.squeeze(tf.nn.conv3d(input_3d, kernel_3d, strides=stride_3d, padding='SAME'))
print(ses.run(result_3d))
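The 3-D shape arithmetic can be verified the same way: with 'SAME' padding and stride 1 the output keeps the 5x5x5 input shape, the center voxel sums all 27 ones under the 3x3x3 kernel, and a corner voxel only sees a 2x2x2 patch. A pure-NumPy sketch (conv3d_same is a hypothetical helper, assuming single channels and a cubic kernel with odd side):

```python
import numpy as np

def conv3d_same(volume, kernel):
    """Direct 3-D cross-correlation with 'SAME' zero padding, stride 1."""
    k = kernel.shape[0]  # assumes a cubic kernel with odd side length
    p = k // 2
    padded = np.pad(volume, p)
    out = np.zeros_like(volume, dtype=float)
    for i in range(volume.shape[0]):
        for j in range(volume.shape[1]):
            for m in range(volume.shape[2]):
                out[i, j, m] = np.sum(padded[i:i + k, j:j + k, m:m + k] * kernel)
    return out

res = conv3d_same(np.ones((5, 5, 5)), np.ones((3, 3, 3)))
print(res.shape, res[2, 2, 2], res[0, 0, 0])  # (5, 5, 5) 27.0 8.0
```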

Hope this helps! 

If you want to know more about neural networks, visit this Neural Network Tutorial.

by (47.2k points)
The 2d conv with 3d input is a nice touch. I would suggest an edit to include 1d conv with 2d input (e.g. a multi-channel array) and compare the difference thereof with a 2d conv with 2d input.
by (90.8k points)

Whether a CNN is 1D, 2D, or 3D refers to the direction of the convolution, not to the dimensionality of the input or the filter.

For a 1-channel input, a 2D convolution is equivalent to a 1D convolution if the kernel length equals the input length along one axis, since only one direction is left to convolve over.
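A minimal NumPy sketch of that equivalence, assuming a single channel, stride 1, and 'VALID' padding: when the kernel spans the full input height, a 2-D convolution can only slide horizontally, so the result is one value per horizontal position, i.e. a 1-D output.

```python
import numpy as np

# Single-channel 4x6 input; the kernel spans the full input height (4),
# so a 'VALID' 2-D convolution can only slide along the width.
x = np.arange(24, dtype=float).reshape(4, 6)
k = np.ones((4, 3))

# One dot product per horizontal offset: output width = 6 - 3 + 1 = 4
out = np.array([np.sum(x[:, j:j + 3] * k) for j in range(6 - 3 + 1)])
print(out.shape)  # (4,) -- a 1-D sequence, exactly what a 1-D conv would give
```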


