I'm using Inception v3 and TensorFlow to identify objects within an image. However, it only produces a list of possible objects, and I need it to report their positions in the image.
I'm following the flowers tutorial: https://www.tensorflow.org/versions/r0.9/how_tos/image_retraining/index.html
bazel-bin/tensorflow/examples/image_retraining/retrain --image_dir ~/flower_photos
You need another architecture to predict the bounding boxes, like R-CNN and its newer (and faster) variants (Fast R-CNN, Faster R-CNN).
R-CNN extracts a set of candidate regions from the given image using selective search and then checks whether any of these boxes contains an object. The first step is to extract these regions; then, for each region, a CNN is used to extract features. Finally, these features are used to classify and detect objects.
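The pipeline above can be sketched in plain Python. This is only an illustration of the last stage: the boxes and scores below are made-up stand-ins for what a real system would get from selective search (proposals) and a CNN classifier (per-region scores); the final step, non-maximum suppression, merges overlapping detections of the same object into one bounding box.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw, ih = max(0, ix2 - ix1), max(0, iy2 - iy1)
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring box, drop overlapping duplicates."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_threshold for j in keep):
            keep.append(i)
    return keep

# Hypothetical proposals for one image: two overlapping boxes on the
# same object, plus one box on a different object.
boxes = [(10, 10, 60, 60), (12, 12, 58, 58), (100, 100, 150, 150)]
scores = [0.9, 0.75, 0.8]
kept = non_max_suppression(boxes, scores)
print(kept)  # indices of the surviving bounding boxes
```

In a real Faster R-CNN the proposals come from a learned region proposal network rather than selective search, but the suppression step at the end works the same way.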
For a practical implementation of the Faster R-CNN algorithm for object detection, refer to the following link: https://medium.com/analytics-vidhya/a-practical-implementation-of-the-faster-r-cnn-algorithm-for-object-detection-part-2-with-cac45dada619