I'm trying to parse a docx file using python-docx. The file contains images and text. Basically, I need a way to take an image (an InlineShape object) from the file and save it as a separate image file (like "smth.jpg"). Is there a way to do that? From reading the API docs it doesn't seem like it, but maybe I'm missing something.

1 Answer

You can use the docx2python Python module for this.

Firstly, install the module by running the following command:

pip install docx2python

Then run the following code, passing the path to your document and the directory where the images should be written:

from docx2python import docx2python

# The second argument is the directory that receives the extracted images.
content = docx2python('my_document.docx', 'output_image_directory')

The extracted images will be stored in the directory you passed in as the second argument.
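
If installing an extra package is not an option, here is a minimal sketch of a different approach that uses only the standard library. It relies on the fact that a .docx file is a ZIP archive in which Word normally stores embedded images under word/media/. The function name extract_docx_images and the file names in the usage note are hypothetical, chosen for illustration only. This is not how docx2python or python-docx work internally.

```python
import zipfile
from pathlib import Path


def extract_docx_images(docx_path, output_dir):
    """Copy every file under word/media/ in the .docx archive to output_dir.

    Returns the list of paths that were written.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Skip directory entries; keep only real files in word/media/.
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(archive.read(name))
                saved.append(target)
    return saved
```

Calling extract_docx_images('my_document.docx', 'output_image_directory') would write the embedded files under the names and formats they have inside the archive (for example image1.png), so nothing is converted to .jpg. It also extracts every image at once and does not tell you which InlineShape each file belongs to.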
