
Sample code for reading doc, docx, ppt, pptx, xls, xlsx, txt, and csv files in Java with Apache POI

Date: 2022-12-06 15:02:35


1. Add the Maven dependencies

Add the following dependencies to the pom file:

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>4.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>4.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml-schemas</artifactId>
    <version>4.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>4.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>ooxml-schemas</artifactId>
    <version>1.4</version>
</dependency>

2. File reading code examples
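
Each snippet below handles exactly one format, so calling code has to decide which one to run. One possible way to do that is to branch on the file extension, as in the following minimal sketch. The names `FormatDispatcher`, `Format`, `extensionOf`, and `detect` are hypothetical and not part of Apache POI, and the extension is only a hint: a renamed file will be misclassified.

```java
import java.util.Locale;

/** Hypothetical helper (not part of Apache POI): picks a reader branch from a file name. */
public final class FormatDispatcher {

    /** The formats covered in this article; txt and csv share the plain-text branch. */
    public enum Format { DOC, DOCX, PPT, PPTX, XLS, XLSX, TEXT, UNKNOWN }

    private FormatDispatcher() {
    }

    /** Returns the lower-cased extension without the dot, or "" if there is none. */
    public static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        int sep = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        // Ignore dots in directory names, leading dots (".hidden") and trailing dots ("a.")
        if (dot <= sep + 1 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** Maps the extension to the format whose snippet should be used. */
    public static Format detect(String fileName) {
        switch (extensionOf(fileName)) {
            case "doc":  return Format.DOC;
            case "docx": return Format.DOCX;
            case "ppt":  return Format.PPT;
            case "pptx": return Format.PPTX;
            case "xls":  return Format.XLS;
            case "xlsx": return Format.XLSX;
            case "txt":
            case "csv":  return Format.TEXT;
            default:     return Format.UNKNOWN;
        }
    }
}
```

A caller could then `switch` on `FormatDispatcher.detect(file.getName())` and run the matching snippet, treating `UNKNOWN` as an unsupported file.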

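doc files

// --------- doc -----------
import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

// `log` is assumed to be an SLF4J-style logger (for example from Lombok's @Slf4j)
File file = new File("E:\\search-file\\22.doc");
// try-with-resources closes the stream and the document even if extraction fails
try (FileInputStream fis = new FileInputStream(file);
     HWPFDocument document = new HWPFDocument(fis)) {
    WordExtractor extractor = new WordExtractor(document);
    log.info("extractor.getText:{}", extractor.getText());
} catch (Exception e) {
    log.error("Failed to read {}", file, e);
}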

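docx files

// --------- docx -----------
import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

// `log` is assumed to be an SLF4J-style logger (for example from Lombok's @Slf4j)
File file = new File("E:\\search-file\\11.docx");
// try-with-resources closes the stream and the document even if extraction fails
try (FileInputStream fis = new FileInputStream(file);
     XWPFDocument document = new XWPFDocument(fis)) {
    XWPFWordExtractor extractor = new XWPFWordExtractor(document);
    log.info("extractor.getText:{}", extractor.getText());
} catch (Exception e) {
    log.error("Failed to read {}", file, e);
}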

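pptx files

// --------- pptx -----------
import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.sl.extractor.SlideShowExtractor;
import org.apache.poi.xslf.usermodel.XMLSlideShow;
import org.apache.poi.xslf.usermodel.XSLFShape;
import org.apache.poi.xslf.usermodel.XSLFTextParagraph;

// `log` is assumed to be an SLF4J-style logger (for example from Lombok's @Slf4j)
File file = new File("E:\\search-file\\33.pptx");
// try-with-resources closes the stream and the slide show even if extraction fails
try (FileInputStream fis = new FileInputStream(file);
     XMLSlideShow document = new XMLSlideShow(fis)) {
    // Type arguments replace the raw SlideShowExtractor type used before
    SlideShowExtractor<XSLFShape, XSLFTextParagraph> extractor = new SlideShowExtractor<>(document);
    log.info("extractor.getText:{}", extractor.getText());
} catch (Exception e) {
    log.error("Failed to read {}", file, e);
}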

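ppt files

// --------- ppt -----------
import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.hslf.usermodel.HSLFShape;
import org.apache.poi.hslf.usermodel.HSLFSlideShow;
import org.apache.poi.hslf.usermodel.HSLFTextParagraph;
import org.apache.poi.sl.extractor.SlideShowExtractor;

// `log` is assumed to be an SLF4J-style logger (for example from Lombok's @Slf4j)
File file = new File("E:\\search-file\\44.ppt");
// try-with-resources closes the stream and the slide show even if extraction fails
try (FileInputStream fis = new FileInputStream(file);
     HSLFSlideShow document = new HSLFSlideShow(fis)) {
    // Type arguments replace the raw SlideShowExtractor type used before
    SlideShowExtractor<HSLFShape, HSLFTextParagraph> extractor = new SlideShowExtractor<>(document);
    log.info("extractor.getText:{}", extractor.getText());
} catch (Exception e) {
    log.error("Failed to read {}", file, e);
}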

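xlsx files

// --------- xlsx -----------
import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.xssf.extractor.XSSFExcelExtractor;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

// `log` is assumed to be an SLF4J-style logger (for example from Lombok's @Slf4j)
File file = new File("E:\\search-file\\55.xlsx");
// try-with-resources closes the stream and the workbook even if extraction fails
try (FileInputStream fis = new FileInputStream(file);
     XSSFWorkbook document = new XSSFWorkbook(fis)) {
    XSSFExcelExtractor extractor = new XSSFExcelExtractor(document);
    log.info("extractor.getText:{}", extractor.getText());
} catch (Exception e) {
    log.error("Failed to read {}", file, e);
}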

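xls files

// --------- xls -----------
import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.hssf.extractor.ExcelExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

// `log` is assumed to be an SLF4J-style logger (for example from Lombok's @Slf4j)
File file = new File("E:\\search-file\\66.xls");
// try-with-resources closes the stream and the workbook even if extraction fails
try (FileInputStream fis = new FileInputStream(file);
     HSSFWorkbook document = new HSSFWorkbook(fis)) {
    ExcelExtractor extractor = new ExcelExtractor(document);
    log.info("extractor.getText:{}", extractor.getText());
} catch (Exception e) {
    log.error("Failed to read {}", file, e);
}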

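txt and csv files

// --------- txt,csv -----------
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// `log` is assumed to be an SLF4J-style logger (for example from Lombok's @Slf4j)
File file = new File("E:\\search-file\\77.txt");
// StringBuilder is enough here because the buffer is not shared between threads
StringBuilder buffer = new StringBuilder();
try (BufferedReader reader = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
    String line;
    // readLine() strips the line terminator, so each line is re-joined with '\n'
    while ((line = reader.readLine()) != null) {
        buffer.append(line).append('\n');
    }
} catch (Exception e) {
    log.error("Failed to read {}", file, e);
}
log.info("txt-context:{}", buffer);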
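
The snippet above treats a csv file as plain text, so the fields of each row are still joined by commas. If the individual values are needed, each line can be split afterwards. The following is a minimal sketch using only the JDK. The class name `CsvLineSplitter` and its `split` method are hypothetical. It assumes comma-separated fields, double quotes around fields that contain commas, and `""` as an escaped quote. It does not handle quoted fields that span several lines, so a dedicated csv library may be the better choice for real data.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits one csv line into its fields. */
public final class CsvLineSplitter {

    private CsvLineSplitter() {
    }

    /** Splits a single line on commas, honouring double-quoted fields. */
    public static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // commas inside quotes are data
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);        // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field (may be empty)
        return fields;
    }
}
```

Inside the `while` loop of the snippet above, `CsvLineSplitter.split(line)` could be called on each line instead of appending it to the buffer.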

From: https://www.cnblogs.com/xiangningdeguang/p/16955251.html
