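The key step is to run: sudo yum install poppler-utils.x86_64
This installs the pdftohtml utility, which ships in the poppler-utils package.
Note the parts shown in bold black below (the typed commands and the y confirmation); you have to enter those yourself.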
[root@iZ2zegrzsxl1644zcx9rffZ zhanwang.tnvk.com]# sudo yum install poppler-utils.x86_64
Loaded plugins: fastestmirror
Determining fastest mirrors
base | 3.6 kB 00:00:00
epel | 4.7 kB 00:00:00
extras | 2.9 kB 00:00:00
updates | 2.9 kB 00:00:00
(1/4): extras/7/x86_64/primary_db | 246 kB 00:00:00
(2/4): epel/x86_64/updateinfo | 1.0 MB 00:00:00
(3/4): epel/x86_64/primary_db | 7.0 MB 00:00:00
(4/4): updates/7/x86_64/primary_db | 14 MB 00:00:00
Resolving Dependencies
--> Running transaction check
---> Package poppler-utils.x86_64 0:0.26.5-43.el7.1 will be installed
--> Processing Dependency: poppler(x86-64) = 0.26.5-43.el7.1 for package: poppler-utils-0.26.5-43.el7.1.x86_64
--> Processing Dependency: libpoppler.so.46()(64bit) for package: poppler-utils-0.26.5-43.el7.1.x86_64
--> Processing Dependency: libopenjpeg.so.1()(64bit) for package: poppler-utils-0.26.5-43.el7.1.x86_64
--> Running transaction check
---> Package openjpeg-libs.x86_64 0:1.5.1-18.el7 will be installed
---> Package poppler.x86_64 0:0.26.5-43.el7.1 will be installed
--> Processing Dependency: poppler-data >= 0.4.0 for package: poppler-0.26.5-43.el7.1.x86_64
--> Running transaction check
---> Package poppler-data.noarch 0:0.4.6-3.el7 will be installed
--> Finished Dependency Resolution
Dependencies Resolved
====================================================================================================
Package Arch Version Repository Size
====================================================================================================
Installing:
poppler-utils x86_64 0.26.5-43.el7.1 updates 170 k
Installing for dependencies:
openjpeg-libs x86_64 1.5.1-18.el7 base 86 k
poppler x86_64 0.26.5-43.el7.1 updates 787 k
poppler-data noarch 0.4.6-3.el7 base 2.2 M
Transaction Summary
====================================================================================================
Install 1 Package (+3 Dependent packages)
Total download size: 3.2 M
Installed size: 15 M
Is this ok [y/d/N]: y
Downloading packages:
(1/4): openjpeg-libs-1.5.1-18.el7.x86_64.rpm | 86 kB 00:00:00
(2/4): poppler-utils-0.26.5-43.el7.1.x86_64.rpm | 170 kB 00:00:00
(3/4): poppler-0.26.5-43.el7.1.x86_64.rpm | 787 kB 00:00:00
(4/4): poppler-data-0.4.6-3.el7.noarch.rpm | 2.2 MB 00:00:00
----------------------------------------------------------------------------------------------------
Total 9.3 MB/s | 3.2 MB 00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : openjpeg-libs-1.5.1-18.el7.x86_64 1/4
Installing : poppler-data-0.4.6-3.el7.noarch 2/4
Installing : poppler-0.26.5-43.el7.1.x86_64 3/4
Installing : poppler-utils-0.26.5-43.el7.1.x86_64 4/4
Verifying : poppler-0.26.5-43.el7.1.x86_64 1/4
Verifying : openjpeg-libs-1.5.1-18.el7.x86_64 2/4
Verifying : poppler-data-0.4.6-3.el7.noarch 3/4
Verifying : poppler-utils-0.26.5-43.el7.1.x86_64 4/4
Installed:
poppler-utils.x86_64 0:0.26.5-43.el7.1
Dependency Installed:
openjpeg-libs.x86_64 0:1.5.1-18.el7 poppler.x86_64 0:0.26.5-43.el7.1
poppler-data.noarch 0:0.4.6-3.el7
Complete!
[root@iZ2zegrzsxl1644zcx9rffZ zhanwang.tnvk.com]# pdftohtml --help
pdftohtml version 0.26.5
Copyright 2005-2014 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1999-2003 Gueorgui Ovtcharov and Rainer Dorsch
Copyright 1996-2011 Glyph & Cog, LLC
Usage: pdftohtml [options] <PDF-file> [<html-file> <xml-file>]
-f <int> : first page to convert
-l <int> : last page to convert
-q : don't print any messages or errors
-h : print usage information
-? : print usage information
-help : print usage information
--help : print usage information
-p : exchange .pdf links by .html
-c : generate complex document
-s : generate single document that includes all pages
-i : ignore images
-noframes : generate no frames
-stdout : use standard output
-zoom <fp> : zoom the pdf document (default 1.5)
-xml : output for XML post-processing
-hidden : output hidden text
-nomerge : do not merge paragraphs
-enc <string> : output text encoding name
-fmt <string> : image file format for Splash output (png or jpg)
-v : print copyright and version info
-opw <string> : owner password (for encrypted files)
-upw <string> : user password (for encrypted files)
-nodrm : override document DRM settings
-wbt <fp> : word break threshold (default 10 percent)
-fontfullname : outputs font full name
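
The help output above lists -f and -l for choosing the first and last page to convert. As a rough sketch of how they combine with the -c and -s flags used in this walkthrough, the helper below only prints the command line instead of running it. The function name page_range_cmd and the file name sample.pdf are made up for illustration; they are not part of poppler.

```shell
# Hypothetical helper: print (do not run) a pdftohtml command for a page range.
# It uses only flags that appear in the help output: -f, -l, -c, -s.
# Assumption: the PDF path contains no spaces; quote it yourself otherwise.
page_range_cmd() {
    first="$1"
    last="$2"
    pdf="$3"
    printf 'pdftohtml -f %s -l %s -c -s %s\n' "$first" "$last" "$pdf"
}

# Example: build the command for pages 1 to 3 of sample.pdf.
page_range_cmd 1 3 sample.pdf
```

Paste the printed line into the shell once it looks right.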
[root@iZ2zegrzsxl1644zcx9rffZ zhanwang.tnvk.com]# ls
application composer.json cpadmin.php index.php static system vendor
backup composer.lock data lib svg.php tt.php
[root@iZ2zegrzsxl1644zcx9rffZ zhanwang.tnvk.com]# pdftohtml -c -s 1.pdf
Page-1
Page-2
Page-3
Page-4
Page-5
Page-6
[root@iZ2zegrzsxl1644zcx9rffZ zhanwang.tnvk.com]# pdftohtml -c -s data/doc/0f9c2fb33a846aef27200a5f92789d4b.pdf
Page-1
Page-2
Page-3
Page-4
Page-5
Page-6
Page-7
Page-8
Page-9
Page-10
Page-11
Page-12
Page-13
Page-14
Page-15
Page-16
Page-17
Page-18
Page-19
Page-20
Page-21
Page-22
Page-23
Page-24
Page-25
Page-26
General form: pdftohtml -c -s <absolute path to the PDF file>
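
The note above asks for an absolute path. A minimal sketch of one way to do that is below. The function names abs_path and convert_pdf are hypothetical, and the reason given is an assumption: an absolute path presumably keeps the command working no matter which directory it is launched from. abs_path only prefixes the current directory; it does not resolve symlinks or ".." segments.

```shell
# Hypothetical helper: turn a path into an absolute one.
# Paths that already start with / are returned unchanged;
# anything else is prefixed with the current working directory.
abs_path() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "$PWD" "$1" ;;
    esac
}

# Hypothetical wrapper: run pdftohtml with the same -c -s flags as in the
# transcript, always passing an absolute path to the PDF.
convert_pdf() {
    pdftohtml -c -s "$(abs_path "$1")"
}

# Usage (requires poppler-utils to be installed):
#   convert_pdf data/doc/0f9c2fb33a846aef27200a5f92789d4b.pdf
```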