Reading and writing Word documents in Python: examples with the docx and docx2txt packages

Source: 动视网; editor: 小OO; date: 2025-10-08 00:12:46

Introduction

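doc is Microsoft's proprietary file format. docx is the format used by Microsoft Office 2007 and later; it is a compressed file format based on the Office Open XML standard and generally takes up less space than the equivalent doc file. A docx file is essentially a ZIP archive, so you can rename a .docx file to .zip and extract it. Inside, word/document.xml holds most of the document's content, and image files are stored under word/media.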
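The ZIP structure described above can be inspected without renaming the file, using only the standard library. This is a minimal sketch; the helper names `list_docx_parts` and `read_document_xml` are made up for illustration and are not part of any package discussed here.

```python
import zipfile


def list_docx_parts(path):
    """Return the names of all files stored inside a .docx (ZIP) archive."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def read_document_xml(path):
    """Return the raw XML of the main document part as a string."""
    with zipfile.ZipFile(path) as archive:
        # word/document.xml holds most of the document's content
        return archive.read("word/document.xml").decode("utf-8")
```

Calling `list_docx_parts("demo.docx")` on a real document should show entries such as word/document.xml, plus word/media/... entries if the document contains images.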

The docx package (python-docx)

python-docx does not support .doc files. An indirect workaround is to convert the .doc file to .docx in code first.
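The original does not say how to do the conversion. One possible approach, assuming LibreOffice is installed and its `soffice` executable is on the PATH, is to call its headless converter from Python. The function names below are hypothetical, and the executable name may differ on your system.

```python
import subprocess
from pathlib import Path


def build_convert_command(doc_path, out_dir, soffice="soffice"):
    """Build the LibreOffice command line that converts one .doc file to .docx."""
    return [
        soffice,
        "--headless",            # run without opening a window
        "--convert-to", "docx",  # target format
        "--outdir", str(out_dir),
        str(doc_path),
    ]


def convert_doc_to_docx(doc_path, out_dir="."):
    """Convert doc_path to .docx and return the expected path of the result.

    Assumes LibreOffice is installed; raises CalledProcessError on failure.
    """
    subprocess.run(build_convert_command(doc_path, out_dir), check=True)
    return Path(out_dir) / (Path(doc_path).stem + ".docx")
```

The returned path can then be passed to `Document(...)` as with any other .docx file.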

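The python-docx module treats the paragraphs, text, fonts, and so on in a Word document as objects, and you work with the document by manipulating those objects.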

∙Document object: a Word document

∙Paragraph object: one paragraph in the Word document

∙The text attribute of a Paragraph object: the text content of the paragraph

Installation

pip install python-docx

Example 1: reading

from docx import Document


def readDocx(fileName):
    doc = Document(fileName)
    # Write the output as UTF-8 explicitly. With a default codec such as gbk, writing can fail with
    # UnicodeEncodeError: 'gbk' codec can't encode character '\xef' in posi...
    with open("a." + fileName + ".txt", "w", encoding="utf-8") as outFile:
        # for para in doc.paragraphs:
        #     print(para.text)
        # Index and text of every paragraph
        for i, para in enumerate(doc.paragraphs):
            outFile.write(str(i) + " " + para.text + "\n")
        # Tables
        for tb in doc.tables:
            # Rows
            for row in tb.rows:
                # Cells (columns) in the row, separated by tabs
                for cell in row.cells:
                    outFile.write(cell.text + "\t")
                    # Alternatively, build the cell text from its paragraphs:
                    # text = ''
                    # for p in cell.paragraphs:
                    #     text += p.text
                    # print(text)
                outFile.write("\n")

Example 2: writing

from docx import Document
from docx.shared import Inches


def createDocx():
    document = Document()
    # Add a heading and set its level (range 0-9, default 1)
    document.add_heading("Title", 0)
    p = document.add_paragraph("a plain paragraph lalalal")
    # Append runs of text directly to the end of the paragraph and style each run
    p.add_run("bold").bold = True
    p.add_run(" test ")
    p.add_run("italic.").italic = True
    # One heading for each level from 0 to 9
    for i in range(10):
        document.add_heading("heading, level " + str(i), level=i)
    document.add_paragraph("intense quote", style="Intense Quote")
    # Unordered (bulleted) list
    document.add_paragraph("first item in unordered list", style="List Bullet")
    document.add_paragraph("second item in unordered list", style="List Bullet")
    # Ordered (numbered) list
    document.add_paragraph('first item in ordered list', style='List Number')
    document.add_paragraph('second item in ordered list', style='List Number')
    # Add a picture (test.PNG must exist in the working directory)
    document.add_picture('test.PNG', width=Inches(1.25))
    records = (
        (3, '101', 'Spam'),
        (7, '422', 'Eggs'),
        (4, '631', 'Spam, spam, eggs, and spam')
    )
    # Add a table with one row and three columns.
    # Some of the available table styles:
    #   Normal Table
    #   Table Grid
    #   Light Shading, Light Shading Accent 1 through Light Shading Accent 6
    #   Light List, Light List Accent 1 through Light List Accent 6
    #   Light Grid, Light Grid Accent 1 through Light Grid Accent 6
    #   (there are many more; the rest are omitted)
    table = document.add_table(rows=1, cols=3, style='Light Shading Accent 1')
    # Get the list of cells in the first row
    hdr_cells = table.rows[0].cells
    # Set the text of the three header cells
    hdr_cells[0].text = 'Qty'
    hdr_cells[1].text = 'Id'
    hdr_cells[2].text = 'Desc'
    for qty, item_id, desc in records:
        # Append a row to the table and get its list of cells
        row_cells = table.add_row().cells
        row_cells[0].text = str(qty)
        row_cells[1].text = item_id
        row_cells[2].text = desc
    document.add_page_break()
    # Save the .docx document
    document.save('demo.docx')

The docx2txt package

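docx2txt is used here because python-docx, as used above, did not return the text of hyperlinks (a known limitation of older python-docx releases). docx2txt extracts text from the document's XML directly, so hyperlink text is included in its output.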

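# pip install docx2txt
import docx2txt


def read_docx(fileName):
    # Extract all text from the .docx file as a single string
    text = docx2txt.process(fileName)
    with open("b." + fileName + ".txt", "w", encoding="utf-8") as outFile:
        outFile.write(text)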
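To see why an XML-level reader picks up hyperlink text, it helps to look at the markup: in word/document.xml, a hyperlink's runs sit inside a w:hyperlink element rather than directly under the paragraph. The sketch below is not how docx2txt is implemented; it is a standard-library illustration with hypothetical function names that collects every w:t text node under each paragraph, however deeply nested. It is deliberately simple and does not handle tabs, line breaks, or paragraphs nested inside other paragraphs (for example in text boxes).

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_texts_from_xml(document_xml):
    """Return the text of each w:p element, including text nested in w:hyperlink."""
    root = ET.fromstring(document_xml)
    texts = []
    for para in root.iter("{%s}p" % W_NS):
        # iter() walks all descendants, so w:t inside w:hyperlink is included
        texts.append("".join(t.text or "" for t in para.iter("{%s}t" % W_NS)))
    return texts


def paragraph_texts_from_docx(path):
    """Read word/document.xml from a .docx file and return its paragraph texts."""
    with zipfile.ZipFile(path) as archive:
        return paragraph_texts_from_xml(archive.read("word/document.xml"))
```

For everyday use, `docx2txt.process` remains the simpler choice; this sketch is mainly useful for checking what is actually stored in a given file.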
