ubuntu - Issue reading text with python-docx when document contains images
I am having issues parsing text from a document that contains images.
I am using version 0.7.0 of python-docx on an Ubuntu Linux machine running Ubuntu 12.04.4 LTS (GNU/Linux 3.2.0-60-generic x86_64).
I am using this logic:
```
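from docx import Document


# The original snippet was the body of a function; the name read_text is illustrative.
def read_text(path):
    document = Document(path)
    # Get all paragraphs
    paras = document.paragraphs
    text = ""
    # Push the text of each paragraph onto a single string
    for para in paras:
        # Don't forget the line break
        text += "\n" + para.text
    return text.strip()
```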
When there is an image in the document, the process fails.
Is there something I am doing wrong?
python-docx should support what you're trying to do here. If you'll provide the stack trace from when the error is raised, I'll take a look.
By the way, you can write the code a little more elegantly as:
    document = Document(path)
    text = '\n'.join([para.text for para in document.paragraphs])
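Since the answer asks for the stack trace, here is a minimal sketch of one way to capture it as a string that can be pasted into a follow-up. The helper names `docx_text` and `run_with_trace` are made up for this example and are not part of python-docx; only `Document` and `.paragraphs` come from the library.

```python
import traceback


def docx_text(path):
    """Return the paragraph text of a .docx file, one paragraph per line."""
    # Imported here (an illustrative choice) so run_with_trace below can be
    # tried on its own even where python-docx is not installed.
    from docx import Document

    return "\n".join(para.text for para in Document(path).paragraphs)


def run_with_trace(func, *args):
    """Call func(*args) and return (result, None) on success,
    or (None, formatted_traceback) if any exception is raised."""
    try:
        return func(*args), None
    except Exception:
        # format_exc() returns the same text Python would print to stderr.
        return None, traceback.format_exc()
```

Calling `run_with_trace(docx_text, path)` on the document that contains the image should return the full traceback as the second element if the read fails; that string is what to post.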