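1 Converting with documents4j + LibreOffice (flawed)
Approach:
1 - On Windows, use documents4j to convert Word to PDF. Under the hood this dependency drives the Microsoft Office APIs to perform the conversion, so it only works on Windows.
2 - On Linux there is no Microsoft Office, so LibreOffice has to be installed manually and used to convert the document instead.
-- This program runs correctly on Windows, but the conversion fails on Linux.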
1.1 Install LibreOffice
yum install libreoffice
1.2 LibreOffice conversion command
/usr/bin/libreoffice --headless --convert-to pdf srcUrl --outdir destUrl
1.3 Java code details
/**
 * Converts a Word source file to PDF.
 * @param inputStream the Word document to convert
 * @param type 0 for .doc, 1 for .docx
 * @return a stream over the resulting PDF
 */
private InputStream wordToPdf(InputStream inputStream, int type) {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    String os = System.getProperty("os.name").toLowerCase();
    if (os.contains("windows")) {
        // documents4j drives the locally installed Microsoft Office
        IConverter converter = LocalConverter.builder().build();
        DocumentType sourceType = (type == 0) ? DocumentType.DOC : DocumentType.DOCX;
        converter.convert(inputStream).as(sourceType).to(outputStream).as(DocumentType.PDF).execute();
    } else if (os.contains("linux") || os.contains("unix") || os.contains("mac")) {
        // Write the stream to a temporary file
        String fileExtension = (type == 0) ? ".doc" : ".docx";
        // baseDir must be an absolute path
        String srcUrl = baseDir + "/tmp/tmpDoc-" + IdUtils.randomUUID() + fileExtension;
        String destUrl = baseDir + "/tmp/tmpPdf-" + IdUtils.randomUUID() + ".pdf";
        try {
            writeToFile(inputStream, srcUrl);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        // Build the LibreOffice command.
        // NOTE: Runtime.exec(String) splits on whitespace and does not interpret quotes, so the
        // single quotes reach LibreOffice as part of each path. --outdir also expects a directory,
        // while destUrl is a file path. Both are likely contributors to the failure on Linux.
        String command = String.format(
                "/usr/bin/libreoffice --headless --convert-to pdf '%s' --outdir '%s'",
                srcUrl, destUrl);
        // Run the LibreOffice command
        try {
            Process process = Runtime.getRuntime().exec(command);
            process.waitFor(); // wait for LibreOffice to finish converting
            // Read the converted file back into a stream
            outputStream = readFromFile(destUrl);
            // Clean up the temporary files
            new File(srcUrl).delete();
            new File(destUrl).delete();
        } catch (IOException | InterruptedException e) {
            throw new RuntimeException(e);
        }
    } else {
        throw new RuntimeException("Unsupported operating system");
    }
    return new ByteArrayInputStream(outputStream.toByteArray());
}

/** Copies the whole input stream into the file at filePath. */
private void writeToFile(InputStream inputStream, String filePath) throws IOException {
    try (FileOutputStream fileOutputStream = new FileOutputStream(filePath)) {
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            fileOutputStream.write(buffer, 0, bytesRead);
        }
    }
}

/** Reads the whole file at filePath into an in-memory stream. */
private ByteArrayOutputStream readFromFile(String filePath) throws IOException {
    try (FileInputStream fileInputStream = new FileInputStream(filePath)) {
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = fileInputStream.read(buffer)) != -1) {
            outputStream.write(buffer, 0, bytesRead);
        }
        return outputStream;
    }
}
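
On Linux, two details of the LibreOffice call are likely to matter: Runtime.exec(String) splits the command on whitespace without interpreting the single quotes, and --outdir takes a directory, with LibreOffice naming the output after the source file. The following is a minimal sketch under those assumptions, not the code from the original program. LibreOfficeConverter, buildCommand, expectedPdf, convert, and the two-minute timeout are hypothetical names and values chosen for illustration, and the sketch has not been checked against the environment in which the failure occurred.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only. */
final class LibreOfficeConverter {

    /** Builds the argument list; each element is passed to LibreOffice verbatim, so no quoting is needed. */
    static List<String> buildCommand(Path source, Path outDir) {
        return List.of(
                "/usr/bin/libreoffice", "--headless",
                "--convert-to", "pdf",
                source.toString(),
                "--outdir", outDir.toString());
    }

    /** LibreOffice writes <source base name>.pdf into the output directory. */
    static Path expectedPdf(Path source, Path outDir) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = (dot > 0) ? name.substring(0, dot) : name;
        return outDir.resolve(base + ".pdf");
    }

    /** Runs the conversion and returns the PDF bytes; the timeout value is an arbitrary assumption. */
    static byte[] convert(Path source, Path outDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(source, outDir))
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD) // avoid blocking on a full pipe
                .start();
        if (!process.waitFor(2, TimeUnit.MINUTES)) {
            process.destroyForcibly();
            throw new IOException("LibreOffice timed out");
        }
        if (process.exitValue() != 0) {
            throw new IOException("LibreOffice exited with code " + process.exitValue());
        }
        Path pdf = expectedPdf(source, outDir);
        try {
            return Files.readAllBytes(pdf); // throws if no PDF was produced
        } finally {
            Files.deleteIfExists(pdf);
        }
    }
}
```

With this shape, a caller would write the upload to something like baseDir/tmp/tmpDoc-<uuid>.docx, call convert(source, source.getParent()), and delete the source file afterwards.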
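2 Converting documents with aspose-words.jar
The approach above has problems on Linux and relies on third-party software installed on the system, so it is inconvenient and adds a deployment burden. Aspose seems to work without problems, apart from being a paid product; a little "magic" here should not cause much trouble.
Official jar repository: https://releases.aspose.com/java/repo/com/aspose/aspose-words/
Implementation details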
/**
 * Converts a Word source file to PDF.
 * @param inputStream the Word document to convert
 * @return a stream over the resulting PDF
 */
private InputStream wordToPdf(InputStream inputStream) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Load the license from the classpath so that it also resolves inside a packaged jar;
    // new FileInputStream("src/main/resources/license.xml") only works from the source tree.
    try (InputStream fis = new ClassPathResource("license.xml").getInputStream()) {
        License license = new License();
        license.setLicense(fis);
        // Start of the conversion
        Document doc = new Document(inputStream); // load the Word document
        doc.save(out, SaveFormat.PDF);            // write the PDF to the output stream
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
    return new ByteArrayInputStream(out.toByteArray());
}
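This is indeed simpler and more effective than the approach above. Paid software really is different.
The jar has to be added to the project manually, and a few details need attention:
- Add the jar dependency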
<dependency>
    <groupId>com.aspose</groupId>
    <artifactId>aspose-words</artifactId>
    <version>21.11</version>
    <classifier>jdk17</classifier>
    <scope>system</scope>
    <systemPath>${project.basedir}/src/main/resources/lib/aspose-words-21.11-jdk17.jar</systemPath>
</dependency>
- Packaging plugin (includeSystemScope makes the system-scoped jar part of the packaged application)
<plugin>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-maven-plugin</artifactId>
    <configuration>
        <includeSystemScope>true</includeSystemScope>
    </configuration>
</plugin>
- Resource paths inside the jar (read license.xml from the classpath, not from a file system path)
ClassPathResource classPathResource = new ClassPathResource("license.xml");
InputStream fis = classPathResource.getInputStream();
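
The path issue arises because, once the application is packaged, license.xml lives inside the jar and not on the file system, so a path such as src/main/resources/license.xml no longer resolves. For readers not using Spring, the same idea can be sketched with the plain JDK class loader. ClasspathResources and open are hypothetical names introduced here for illustration; this is a sketch, not how ClassPathResource is implemented.

```java
import java.io.FileNotFoundException;
import java.io.InputStream;

/** Hypothetical helper for illustration only. */
final class ClasspathResources {

    /**
     * Opens a classpath resource as a stream. This works both from the IDE
     * and from inside a packaged jar, because no file system path is involved.
     */
    static InputStream open(String name) throws FileNotFoundException {
        InputStream in = ClasspathResources.class.getClassLoader().getResourceAsStream(name);
        if (in == null) {
            // getResourceAsStream returns null when the resource is absent
            throw new FileNotFoundException("Not found on classpath: " + name);
        }
        return in;
    }
}
```

A caller would then use try (InputStream fis = ClasspathResources.open("license.xml")) { ... } so the stream is closed once the license has been read.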
From: https://www.cnblogs.com/yuqiu2004/p/18377701