
I am curious about the TensorFlow implementation of tf.nn.conv2d(...). To call it, one simply runs tf.nn.conv2d(...). However, I'm going down the rabbit hole trying to see where it is executed. The call chain is as follows (each function ultimately calls the one listed below it):

tf.nn.conv2d(...)

tf.nn_ops.conv2d(...)

tf.gen_nn_ops.conv2d(...)

_op_def_lib.apply_op("Conv2D", ...)

I am familiar with TensorFlow's implementation of LSTMs and the ability to easily manipulate them as one sees fit. Is the function that performs the conv2d() calculation written in Python, and if so, where is it? Can I see where and how the strides are executed?

1 Answer


The implementation of tf.nn.conv2d() is written in C++, and it invokes optimized code from Eigen (on CPU) or the cuDNN library (on GPU).

The functions in the chain starting at tf.nn.conv2d() are Python functions for building a TensorFlow graph, but they do not themselves invoke the implementation. In TensorFlow, you first build a symbolic graph, then execute it.

The implementation of tf.nn.conv2d() is only executed when you call Session.run(), passing a Tensor whose value depends on the result of some convolution.

For example:

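import tensorflow as tf  # TensorFlow 1.x graph-mode API

input = tf.placeholder(tf.float32)

filter = tf.Variable(tf.truncated_normal([5, 5, 3, 32], stddev=0.1))

conv = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')

sess = tf.Session()

sess.run(tf.global_variables_initializer())  # initialize filter before use

result = sess.run(conv, feed_dict={input: ...})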

When you invoke sess.run(...), TensorFlow runs all the ops that are needed to compute the value of conv, including the convolution itself. The path from here to the implementation is quite complicated, but it goes through the following steps:

1. sess.run() calls the TensorFlow backend to fetch the value of conv.

2. The backend prunes the computation graph to work out which nodes must be executed, and places those nodes on the appropriate devices.

3. Each device is instructed to execute its subgraph, using an executor.

4. The executor eventually invokes the tensorflow::OpKernel that corresponds to the convolution operator, by calling its Compute() method.
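To make the build-then-execute idea more concrete, here is a toy sketch of those steps in plain Python. It is not TensorFlow code: the names Node, prune, and run are hypothetical, placement on devices is left out, and the "kernel" is just a Python callable standing in for an OpKernel's Compute() method.

```python
# Toy model of deferred execution. All names here (Node, prune, run) are
# hypothetical illustrations, not TensorFlow APIs.

class Node:
    def __init__(self, name, kernel, inputs=()):
        self.name = name
        self.kernel = kernel          # callable standing in for Compute()
        self.inputs = list(inputs)


def prune(fetch):
    """Return only the nodes needed to compute `fetch`, dependencies first."""
    order, seen = [], set()

    def visit(node):
        if node.name in seen:
            return
        seen.add(node.name)
        for dep in node.inputs:
            visit(dep)
        order.append(node)

    visit(fetch)
    return order


def run(fetch, feed_dict):
    """Execute the pruned subgraph; return the fetched value and the ops run."""
    values = dict(feed_dict)          # fed values, keyed by node name
    executed = []
    for node in prune(fetch):
        if node.name in values:       # fed placeholder: nothing to compute
            continue
        args = [values[dep.name] for dep in node.inputs]
        values[node.name] = node.kernel(*args)
        executed.append(node.name)
    return values[fetch.name], executed


# Building the graph runs no kernels.
x = Node("x", kernel=None)                        # plays the role of a placeholder
doubled = Node("doubled", lambda v: v * 2, [x])
unused = Node("unused", lambda v: v + 100, [x])   # never fetched, so never run

result, executed = run(doubled, {"x": 21})
print(result, executed)
```

Only the node that the fetch depends on is executed; the unused node is pruned away, which is the same reason ops you never fetch are never run in a real session.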

The "Conv2D" OpKernel is implemented here, and its Compute() method is here. Because this op is performance-critical for many workloads, the implementation is quite complicated, but the basic idea is that the computation is offloaded to either the Eigen Tensor library (on CPU) or cuDNN's optimized GPU implementation.
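As for how strides work, the real kernels are heavily optimized, so the following is only a naive reference sketch in plain Python and not how TensorFlow implements it. It assumes a single channel, no batch dimension, and 'VALID' padding; the function name conv2d_valid is hypothetical. Like tf.nn.conv2d, it does not flip the filter, and the two stride arguments play the role of the middle two entries of strides=[1, stride_h, stride_w, 1] in the default NHWC layout.

```python
def conv2d_valid(image, kernel, stride_h=1, stride_w=1):
    """Naive single-channel 2-D cross-correlation with 'VALID' padding."""
    in_h, in_w = len(image), len(image[0])
    k_h, k_w = len(kernel), len(kernel[0])
    # Number of positions at which the filter fits entirely inside the image.
    out_h = (in_h - k_h) // stride_h + 1
    out_w = (in_w - k_w) // stride_w + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # The stride sets how far the window moves between outputs.
            top, left = i * stride_h, j * stride_w
            acc = 0
            for di in range(k_h):
                for dj in range(k_w):
                    acc += image[top + di][left + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, 1]]

stride1 = conv2d_valid(image, kernel)                          # 3x3 output
stride2 = conv2d_valid(image, kernel, stride_h=2, stride_w=2)  # 2x2 output
print(stride1)
print(stride2)
```

With a stride of 2 the window skips every other position, so the output shrinks from 3x3 to 2x2 and contains a subset of the stride-1 results.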

Hope this answer helps you!
