[Java] Parsing a Word Question Bank, Part 2

Date: 2024-09-08 19:36:54
Tags: Java, String, text, poi, static, import, Word, public, question bank

 

For the first draft, see: https://www.cnblogs.com/mindzone/p/18362194

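1. New requirements

Besides the original question bank, we now also need to generate a version with the questions only plus a separate answer key.

The answers go at the top of the document, and the answers are removed from the questions themselves.

While checking the question types I also found that some entries are formatted slightly differently:

So the check for whether a paragraph is an answer has to accept this variant as well.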
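A minimal sketch of such a tolerant answer check, using only the JDK. The two prefixes are the ones that appear in the utility class in this post; the class name `AnswerPrefixSketch`, the helper names, and the `trim()` call are my own assumptions for illustration and are not part of the original code.

```java
import java.util.Arrays;
import java.util.List;

final class AnswerPrefixSketch {
    /* Prefixes taken from the utility class in this post; add more variants here as they turn up */
    static final List<String> ANSWER_PREFIXES = Arrays.asList("参考答案:", "答案:");

    /* True when the paragraph text starts with any known answer prefix */
    static boolean isAnswer(String text) {
        String trimmed = text.trim();
        return ANSWER_PREFIXES.stream().anyMatch(trimmed::startsWith);
    }

    /* Returns the answer without its prefix, or "" when the text is not an answer line */
    static String stripPrefix(String text) {
        String trimmed = text.trim();
        for (String prefix : ANSWER_PREFIXES) {
            if (trimmed.startsWith(prefix)) {
                return trimmed.substring(prefix.length()).trim();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(isAnswer("答案:A"));
        System.out.println(stripPrefix("参考答案:B\r"));
        System.out.println(isAnswer("1.题目"));
    }
}
```

Keeping the prefixes in one list means a newly discovered variant only needs one extra entry, instead of another `||` branch in the check.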

2. Supporting the legacy format

Legacy .doc files (the Word 2000 format) only work once the additional poi-scratchpad library is added.

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>5.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>5.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>5.0.0</version>
</dependency>

  

Classes to import:

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterProperties;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;

  

3. Utility class implementation

package cn.cloud9.word;

import com.alibaba.druid.util.StringUtils;
import lombok.*;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterProperties;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

import java.io.File;
import java.io.FileInputStream;
import java.util.*;
import java.util.stream.Collectors;

public class ExamUtil {
    private static final List<String> ANSWER_PREFIX = Arrays.asList("答案:", "参考答案:");
    private static final List<String> OPTIONS = Arrays.asList("A", "B", "C", "D", "E", "F", "G");
    private static final String NUMBER_REGEXP = "^[1-9]\\d*";
    private static final String SPLIT_IDENTIFY = "\\.";

    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    @Builder
    @ToString
    public static final class RoughItem {
        public int serial;
        public String exCode;
        public String content;
    }

    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    @Builder
    @ToString
    public static final class ExamItem {
        public String no;
        public String title;
        public String type;
        public String answer;
        public String explain;
    }

    /* Opens a .docx file; try-with-resources closes the stream even if parsing fails */
    @SneakyThrows
    public static XWPFDocument getWordFileDocxType(String path) {
        try (FileInputStream fileInputStream = new FileInputStream(path)) {
            return new XWPFDocument(fileInputStream);
        }
    }

    /* Opens a legacy .doc file (needs poi-scratchpad) */
    @SneakyThrows
    public static HWPFDocument getWordFileDocType(String path) {
        try (FileInputStream fileInputStream = new FileInputStream(path)) {
            return new HWPFDocument(fileInputStream);
        }
    }


    @SneakyThrows
    public static void main(String[] args) {
        int examCount = 0;
        String exCode = "";
        List<RoughItem> roughItems = new ArrayList<>();
        CharacterProperties props = new CharacterProperties();
        props.setFontSize(32);

        String filePath = "C:\\Users\\Administrator\\Documents\\Tencent Files\\1791255334\\FileRecv\\答案  (增加 1301-2100共 800)中级保育师增加题库 .doc";
        String newFilePath = "C:\\Users\\Administrator\\Documents\\Tencent Files\\1791255334\\FileRecv\\答案  (增加 1301-2100共 800)中级保育师增加题库 " + new Date().getTime() + ".doc";
        HWPFDocument wordFile = getWordFileDocType(filePath);
        Range range = wordFile.getRange();
        int numParagraphs = range.numParagraphs();


        for (int i = 0; i < numParagraphs; i++) {
            Paragraph paragraph = range.getParagraph(i);
            String text = paragraph.text();
            if (StringUtils.isEmpty(text)) continue;
            /* Split the text on the period */
            String[] split = text.split(SPLIT_IDENTIFY);
            /* Does the first segment look like a numeric question number? */
            boolean isExamNo = split[0].matches(NUMBER_REGEXP);
            /* Is this an answer line? */
            boolean isAnswer = ANSWER_PREFIX.stream().anyMatch(text::startsWith);
            /* Is this an option line? */
            boolean isOptions = OPTIONS.contains(split[0]);
            /* A question number marks the start of a new question, so bump the counter */
            if (isExamNo) {
                ++ examCount;
                exCode = split[0];
                ExamUtil.RoughItem roughItem = ExamUtil.RoughItem.builder()
                        .serial(examCount)
                        .content(text)
                        .exCode(exCode)
                        .build() ;
                roughItems.add(roughItem);
            } else if (isAnswer || isOptions) {
                /* Otherwise this is an option or an answer line: store it under the current question */
                RoughItem roughItem = RoughItem.builder()
                        .serial(examCount)
                        .content(text)
                        .exCode(exCode)
                        .build() ;
                roughItems.add(roughItem);
            }
            /* The answer occupies a whole paragraph of its own, so deleting that paragraph is enough */
            if (isAnswer) paragraph.delete();
        }

        List<ExamItem> examItems = new ArrayList<>();
        /* Once collected, group the items by question serial; a TreeMap keeps the questions in ascending order */
        Map<Integer, List<RoughItem>> listMap = roughItems.stream().collect(Collectors.groupingBy(RoughItem::getSerial, TreeMap::new, Collectors.toList()));
        listMap.forEach((k, v) -> {
            /* The first item is always the question itself */
            RoughItem titleItem = v.get(0);
            String content = titleItem.getContent();
            content = content.replaceAll("\r", "");
            /* Pull the answer out of the group; it may be missing, so orElse falls back to an empty string */
            String answer = v.stream()
                    .map(RoughItem::getContent)
                    .filter(xContent -> xContent.startsWith(ANSWER_PREFIX.get(0)) || xContent.startsWith(ANSWER_PREFIX.get(1)))
                    .map(x -> x.replaceAll(ANSWER_PREFIX.get(1), "").replaceAll(ANSWER_PREFIX.get(0), ""))
                    .findFirst()
                    .orElse("");
            answer = answer.replaceAll("\r", "");
            /* Wrap everything in an ExamItem for the caller to consume */
            ExamItem build = ExamItem
                    .builder()
                    .no(titleItem.getExCode())
                    .title(content)
                    .type(null)
                    .answer(answer)
                    .explain(null)
                    .build();
            examItems.add(build);
        });

        examItems.forEach(System.out::println);

        /* Build the answer-key rows, N answers per row (rowSize = N) */
        int examTotal = examItems.size();
        int rowSize = 10;
        boolean isComplete = examTotal % rowSize == 0;
        int totalRow = examTotal / rowSize;
        totalRow = isComplete ? totalRow : totalRow + 1;
        /* insertBefore always inserts at the start of the range, so the rows are written from last to first */
        for (int currentRow = totalRow; currentRow >= 1; currentRow--) {
            int begin = (currentRow - 1) * rowSize;
            int end = (currentRow * rowSize) - 1;
            StringBuilder rowText = new StringBuilder();
            for (int exIdx = begin; exIdx <= end; exIdx++) {
                if (exIdx < 0) break;
                else if (exIdx >= examTotal) break;
                ExamItem examItem = examItems.get(exIdx);
                String no = examItem.getNo();
                String answer = examItem.getAnswer();
                rowText.append(no).append(".").append(answer).append(" ");
            }
            rowText.append("\r");
            CharacterRun characterRun = range.insertBefore(rowText.toString());
        }

        wordFile.write(new File(newFilePath));
    }
}
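The row layout at the end of `main` can be tried out without a Word file. Below is a rough sketch under my own assumptions: the class `AnswerKeyRowsSketch` and both method names are hypothetical, a plain list stands in for the document, and "insert before" is simulated by inserting at index 0. Unlike the original, it joins entries with single spaces rather than appending a trailing space.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AnswerKeyRowsSketch {
    /* Groups "no.answer" entries into rows of rowSize entries, joined by single spaces */
    static List<String> buildRows(List<String> entries, int rowSize) {
        List<String> rows = new ArrayList<>();
        for (int begin = 0; begin < entries.size(); begin += rowSize) {
            int end = Math.min(begin + rowSize, entries.size());
            rows.add(String.join(" ", entries.subList(begin, end)));
        }
        return rows;
    }

    /* Simulates repeated insert-before-start: every row lands at index 0, so iterate last to first */
    static List<String> prependInReverse(List<String> rows, List<String> body) {
        List<String> document = new ArrayList<>(body);
        for (int i = rows.size() - 1; i >= 0; i--) {
            document.add(0, rows.get(i));
        }
        return document;
    }

    public static void main(String[] args) {
        List<String> entries = Arrays.asList("1.A", "2.B", "3.C", "4.D", "5.A");
        List<String> rows = buildRows(entries, 2);
        /* Made-up body line standing in for the question paragraphs */
        prependInReverse(rows, Arrays.asList("1. first question ...")).forEach(System.out::println);
    }
}
```

Stepping by `rowSize` and clamping the end index with `Math.min` covers the last, partially filled row without a separate remainder check.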

  

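4. Handling answers embedded in the questions

In this kind of document the answer is embedded in an option or in the question text itself, so additional detection logic is needed.

To handle this type of question-bank document, I wrote a separate utility class.

See the code for the details.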

package cn.cloud9.word;

import com.alibaba.druid.util.StringUtils;
import lombok.*;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterProperties;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

import java.io.File;
import java.io.FileInputStream;
import java.util.*;
import java.util.stream.Collectors;

public class ExamUtil2 {
    // private static final List<String> ANSWER_PREFIX = Arrays.asList("答案:", "参考答案:");
    private static final List<String> ANSWER_IDENT = Arrays.asList("(正确答案)", "【正确答案】");
    private static final List<String> ANSWER_IDENT2 = Arrays.asList("×", "√");
    private static final List<String> ANSWER_IDENT3 = Arrays.asList("A", "B", "C", "D", "E", "F", "G");
    private static final List<String> OPTIONS = Arrays.asList("A", "B", "C", "D", "E", "F", "G");
    private static final List<String> OPTIONS2 = Arrays.asList("A、", "B、", "C、", "D、", "E、", "F、", "G、");
    private static final String NUMBER_REGEXP = "^[1-9]\\d*";
    private static final String SPLIT_IDENTIFY = "\\.";

    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    @Builder
    @ToString
    public static final class RoughItem {
        public int serial;
        public String exCode;
        public String content;
    }

    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    @Builder
    @ToString
    public static final class ExamItem {
        public String no;
        public String title;
        public String type;
        public String answer;
        public String explain;
    }

    /* Opens a .docx file; try-with-resources closes the stream even if parsing fails */
    @SneakyThrows
    public static XWPFDocument getWordFileDocxType(String path) {
        try (FileInputStream fileInputStream = new FileInputStream(path)) {
            return new XWPFDocument(fileInputStream);
        }
    }

    /* Opens a legacy .doc file (needs poi-scratchpad) */
    @SneakyThrows
    public static HWPFDocument getWordFileDocType(String path) {
        try (FileInputStream fileInputStream = new FileInputStream(path)) {
            return new HWPFDocument(fileInputStream);
        }
    }


    @SneakyThrows
    public static void main(String[] args) {
        int examCount = 0;
        String exCode = "";
        List<RoughItem> roughItems = new ArrayList<>();
        CharacterProperties props = new CharacterProperties();
        props.setFontSize(32);

        String filePath = "C:\\Users\\Administrator\\Documents\\Tencent Files\\1791255334\\FileRecv\\11 (   )高级保育师理论题库增加.doc";
        String newFilePath = "C:\\Users\\Administrator\\Documents\\Tencent Files\\1791255334\\FileRecv\\11 (   )高级保育师理论题库增加- " + new Date().getTime() + ".doc";
        HWPFDocument wordFile = getWordFileDocType(filePath);
        Range range = wordFile.getRange();
        int numParagraphs = range.numParagraphs();


        for (int i = 0; i < numParagraphs; i++) {
            Paragraph paragraph = range.getParagraph(i);
            String text = paragraph.text();
            if (StringUtils.isEmpty(text)) continue;

            /* Split the text on the period */
            String[] split = text.split(SPLIT_IDENTIFY);
            /* Does the first segment look like a numeric question number? */
            boolean isExamNo = split[0].matches(NUMBER_REGEXP);
            /* Is this an option line? */
            boolean isOptions = OPTIONS.contains(split[0]) || OPTIONS2.stream().anyMatch(text::contains);
            /* Does this line carry an answer? */
            boolean rightOption = ANSWER_IDENT.stream().anyMatch(text::contains) && isOptions; /* answer marked on an option */
            boolean rightOption2 = ANSWER_IDENT2.stream().anyMatch(text::contains) && isExamNo; /* true/false mark inside the question */
            boolean rightOption3 = ANSWER_IDENT3.stream().anyMatch(text::contains) && isExamNo; /* option letter inside the question */
            boolean isAnswer = rightOption || rightOption2 || rightOption3;


            /* A question number marks the start of a new question, so bump the counter */
            if (isExamNo) {
                ++ examCount;
                exCode = split[0];
                ExamUtil2.RoughItem roughItem = ExamUtil2.RoughItem.builder()
                        .serial(examCount)
                        .content(text)
                        .exCode(exCode)
                        .build() ;
                roughItems.add(roughItem);
            }
            if (isAnswer) {
                String correctOption = "";
                if (rightOption) {
                    /* Literal replace: with replaceAll the parentheses in the marker would be read as a regex group */
                    for (String answer : ANSWER_IDENT) text = text.replace(answer, "");
                    paragraph.replaceText(text, false);
                    correctOption = String.valueOf(text.charAt(0));
                }
                if (rightOption2) {
                    correctOption = text.contains(ANSWER_IDENT2.get(0)) ? ANSWER_IDENT2.get(0) : ANSWER_IDENT2.get(1);
                    for (String answer : ANSWER_IDENT2)  text = text.replace(answer, "");
                    paragraph.replaceText(text, false);
                }
                if (rightOption3) {
                    for (String option : ANSWER_IDENT3) {
                        if (text.contains(option)) {
                            correctOption = option;
                            text = text.replace(option, "");
                            break;
                        }
                    }
                    paragraph.replaceText(text, false);
                }
                RoughItem roughItem = RoughItem.builder()
                        .serial(examCount)
                        .content(correctOption)
                        .exCode(exCode)
                        .build() ;
                roughItems.add(roughItem);
            }
        }

        List<ExamItem> examItems = new ArrayList<>();
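        /* Once collected, group the items by question serial; a TreeMap keeps the questions in ascending order */
        Map<Integer, List<RoughItem>> listMap = roughItems.stream().collect(Collectors.groupingBy(RoughItem::getSerial, TreeMap::new, Collectors.toList()));
        listMap.forEach((k, v) -> {
            /* Skip questions for which no answer was found */
            if (v.size() == 1) return;
            /* The first item is always the question itself */
            RoughItem titleItem = v.get(0);
            String content = titleItem.getContent();
            content = content.replaceAll("\r", "");
            /* The second item holds the extracted answer */
            String answer = v.get(1).content;
            answer = answer.replaceAll("\r", "");
            /* Wrap everything in an ExamItem for the caller to consume */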
            ExamItem build = ExamItem
                    .builder()
                    .no(titleItem.getExCode())
                    .title(content)
                    .type(null)
                    .answer(answer)
                    .explain(null)
                    .build();
            examItems.add(build);
        });

        examItems.forEach(System.out::println);

        /* Build the answer-key rows, 10 answers per row */
        int examTotal = examItems.size();
        int rowSize = 10;
        boolean isComplete = examTotal % rowSize == 0;
        int totalRow = examTotal / rowSize;
        totalRow = isComplete ? totalRow : totalRow + 1;
        for (int currentRow = totalRow; currentRow >= 1; currentRow--) {
            int begin = (currentRow - 1) * rowSize;
            int end = (currentRow * rowSize) - 1;
            StringBuilder rowText = new StringBuilder();
            for (int exIdx = begin; exIdx <= end; exIdx++) {
                if (exIdx < 0) break;
                else if (exIdx >= examTotal) break;
                ExamItem examItem = examItems.get(exIdx);
                String no = examItem.getNo();
                String answer = examItem.getAnswer();
                rowText.append(no).append(".").append(answer).append(" ");
            }
            rowText.append("\r");
            CharacterRun characterRun = range.insertBefore(rowText.toString());
        }

        wordFile.write(new File(newFilePath));
    }
}
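The three extraction cases handled above can also be exercised on plain strings, without POI. The sketch below is only an approximation under stated assumptions: `EmbeddedAnswerSketch`, `Extracted`, and the method names are hypothetical, the sample lines in `main` are made up, and for the third case I assume the option letter sits inside brackets such as "( C )". That bracket rule is stricter than the plain `contains` check in `ExamUtil2`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class EmbeddedAnswerSketch {
    static final List<String> MARKERS = Arrays.asList("(正确答案)", "【正确答案】");
    static final List<String> JUDGE_MARKS = Arrays.asList("×", "√");
    /* Assumption: a choice answer inside a question sits in ASCII or full-width brackets, e.g. "( C )" */
    static final Pattern BRACKETED_LETTER = Pattern.compile("[((]\\s*([A-G])\\s*[))]");

    /* The extracted answer ("" if none) plus the text with the answer removed */
    static final class Extracted {
        final String answer;
        final String cleaned;
        Extracted(String answer, String cleaned) { this.answer = answer; this.cleaned = cleaned; }
    }

    /* Case 1: the correct option carries a marker; the answer is the option's leading letter */
    static Extracted fromOptionLine(String text) {
        for (String marker : MARKERS) {
            if (text.contains(marker)) {
                String cleaned = text.replace(marker, ""); /* literal replace, no regex involved */
                return new Extracted(cleaned.isEmpty() ? "" : cleaned.substring(0, 1), cleaned);
            }
        }
        return new Extracted("", text);
    }

    /* Cases 2 and 3: a true/false mark or a bracketed option letter inside the question text */
    static Extracted fromQuestionLine(String text) {
        for (String mark : JUDGE_MARKS) {
            if (text.contains(mark)) return new Extracted(mark, text.replace(mark, ""));
        }
        Matcher m = BRACKETED_LETTER.matcher(text);
        if (m.find()) {
            /* Remove only the letter and keep the brackets as the blank to fill in */
            String cleaned = text.substring(0, m.start(1)) + text.substring(m.end(1));
            return new Extracted(m.group(1), cleaned);
        }
        return new Extracted("", text);
    }

    public static void main(String[] args) {
        System.out.println(fromOptionLine("B、勤洗手(正确答案)").answer);
        System.out.println(fromQuestionLine("12.饭前应洗手。(√)").answer);
        System.out.println(fromQuestionLine("3.下列做法正确的是( C )").cleaned);
    }
}
```

Matching the letter only inside brackets avoids stripping every occurrence of that letter from the question, which a plain `contains` plus replace would do for text containing capital letters elsewhere.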

  

 

From: https://www.cnblogs.com/mindzone/p/18403308
