Chunked File Upload and Download

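1. Introduction to RandomAccessFile

RandomAccessFile is a Java class for reading and writing a file at arbitrary positions, which makes it a common choice for chunked upload and download. Its read methods work much like those of InputStream; the main difference lies in the constructor, which takes a mode argument. There are four modes: "r", "rw", "rwd", and "rws".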

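r: read-only mode. Calling any write method throws an IOException.
rw: read-write mode. Writes are handed to the operating system, which may keep them in its cache; they are only guaranteed to reach the storage device after an explicit sync, for example getFD().sync().
rws: synchronous read-write mode. Every update to the file's content or metadata is written synchronously to the storage device.
rwd: synchronous read-write mode. Every update to the file's content is written synchronously to the storage device; metadata is not included.
rws and rwd are safer than rw (if the operating system crashes or the machine loses power, rw updates that are still in the OS cache can be lost), but they are slower, because every write has to wait for the device.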

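Note: a file consists of two parts, its actual data (the content) and its metadata. The actual data is simply what the file stores. The metadata describes the file itself: name, type, size, inode number, permissions, owner, group, link count, timestamps, and so on.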
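As a rough illustration of the modes above, the following sketch writes at an arbitrary offset in "rw" mode, asks the OS to flush with getFD().sync(), and reads the bytes back in "r" mode. The class name RandomAccessFileDemo and its two helper methods are made up for this example.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class RandomAccessFileDemo {

    // Writes data at the given offset, forces it to disk, and returns the file length.
    static long writeAt(File file, long offset, byte[] data) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.seek(offset);      // move the file pointer; nothing is read or written yet
            raf.write(data);       // hand the bytes to the operating system
            raf.getFD().sync();    // ask the OS to flush them to the storage device
            return raf.length();
        }
    }

    // Reads exactly `length` bytes starting at `offset`, in read-only mode.
    static byte[] readAt(File file, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            raf.seek(offset);
            byte[] buffer = new byte[length];
            raf.readFully(buffer); // throws EOFException if the file is too short
            return buffer;
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("raf-demo", ".bin");
        file.deleteOnExit();
        writeAt(file, 0, "hello world".getBytes(StandardCharsets.US_ASCII));
        writeAt(file, 6, "WORLD".getBytes(StandardCharsets.US_ASCII)); // overwrite in place
        System.out.println(new String(readAt(file, 0, 11), StandardCharsets.US_ASCII));
    }
}
```

With "rws" or "rwd" the explicit sync() call would be unnecessary, at the cost of a device write on every update.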

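2. Chunked File Download

First determine the total size of the file to download, then decide how many chunks to split it into. Once every chunk has been downloaded, merge the chunks into one complete file. If some chunks fail along the way, only those chunks need to be retried.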
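The chunk boundaries follow from the total size alone. A minimal sketch, assuming a fixed chunk size; ChunkPlanner and plan are hypothetical names, and the offsets are inclusive at both ends so that they can be used directly in a byte-range request:

```java
import java.util.ArrayList;
import java.util.List;

class ChunkPlanner {

    // Returns one {start, end} pair per chunk; both offsets are zero-based and inclusive.
    static List<long[]> plan(long totalSize, long chunkSize) {
        if (totalSize < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("totalSize must be >= 0 and chunkSize > 0");
        }
        List<long[]> ranges = new ArrayList<>();
        for (long start = 0; start < totalSize; start += chunkSize) {
            // The last chunk is shorter when totalSize is not a multiple of chunkSize.
            long end = Math.min(start + chunkSize, totalSize) - 1;
            ranges.add(new long[] {start, end});
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] range : plan(10, 4)) {
            System.out.println(range[0] + "-" + range[1]);
        }
    }
}
```

For a 10-byte file and 4-byte chunks this yields the ranges 0-3, 4-7, and 8-9.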

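2.1 Pros and Cons

Pros: a. Different chunks can be fetched concurrently from different backend servers, which can raise the download rate when bandwidth is not the limiting factor.

b. If one chunk fails, the chunks that already succeeded do not need to be downloaded again; only the failed chunk is retried.

Cons: a. The code becomes more complex.

b. If the chunk download endpoint is called from a single thread, the extra requests plus the merge step can make the download slower than fetching the file in one request.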
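The two advantages can be sketched together: chunks are fetched in parallel, and a failed chunk is retried on its own. This is only an in-memory illustration under stated assumptions; ChunkFetcher is a hypothetical stand-in for whatever call actually fetches bytes from a backend, and the result is assembled in a byte array rather than in a file.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelChunkDownloader {

    // Hypothetical abstraction: fetch the bytes at offsets [start, end], both inclusive.
    interface ChunkFetcher {
        byte[] fetch(int start, int end) throws Exception;
    }

    static byte[] download(ChunkFetcher fetcher, int totalSize, int chunkSize,
                           int threads, int maxAttempts) throws Exception {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be > 0");
        byte[] result = new byte[totalSize];
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int s = 0; s < totalSize; s += chunkSize) {
                final int start = s;
                final int end = Math.min(start + chunkSize, totalSize) - 1;
                pending.add(pool.submit(() -> {
                    byte[] chunk = fetchWithRetry(fetcher, start, end, maxAttempts);
                    if (chunk.length != end - start + 1) {
                        throw new IOException("chunk " + start + "-" + end + " has wrong length");
                    }
                    // Each task writes to its own region, so the tasks do not overlap.
                    System.arraycopy(chunk, 0, result, start, chunk.length);
                    return null;
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // rethrows a chunk's failure once its retries are exhausted
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    // Retries only the given chunk; chunks that already succeeded are untouched.
    static byte[] fetchWithRetry(ChunkFetcher fetcher, int start, int end, int maxAttempts)
            throws Exception {
        Exception last = new IllegalArgumentException("maxAttempts must be at least 1");
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return fetcher.fetch(start, end);
            } catch (Exception e) {
                last = e;
            }
        }
        throw last;
    }
}
```

A real client would likely write each chunk to disk and add a delay between attempts; both are left out to keep the sketch short.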

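2.2 Request and Response Headers

As noted above, chunked download requires knowing the total file size before the chunk boundaries can be chosen, so client and server have to agree on a few parameters. The common approach is to use the standard HTTP range headers:

Request header:

Range: the value has the form bytes={start}-{end}. Offsets are zero-based (0 is the first byte) and both ends are inclusive. For example, bytes=10000-20000 requests the bytes at offsets 10000 through 20000. {end} may be omitted: bytes=10000- requests everything from offset 10000 to the end of the file. {start} may also be omitted, in which case the number is a suffix length: bytes=-20000 requests the last 20000 bytes of the file.

Response headers:

Content-Range: the value has the form bytes {range}/{size}, for example bytes 1000-2000/12000, where {range} is the {start}-{end} range being returned and {size} is the total size of the file.

Content-Length: the size of the returned chunk in bytes, that is end - start + 1. For the range 1000-2000 this is 1001.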
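A worked example may make the header arithmetic concrete. The sketch below resolves a single-range Range value against a known file size and derives the matching Content-Range and Content-Length values. RangeHeader and its methods are hypothetical names, and multi-range requests (comma-separated) are deliberately rejected to keep the example small.

```java
class RangeHeader {

    // Resolves "bytes=0-99", "bytes=100-" or "bytes=-50" to inclusive {start, end} offsets.
    static long[] resolve(String header, long fileSize) {
        if (header == null || !header.startsWith("bytes=")) {
            throw new IllegalArgumentException("unsupported Range value: " + header);
        }
        String spec = header.substring("bytes=".length()).trim();
        int dash = spec.indexOf('-');
        if (dash < 0 || spec.indexOf(',') >= 0) {
            throw new IllegalArgumentException("only a single range is supported");
        }
        String first = spec.substring(0, dash).trim();
        String last = spec.substring(dash + 1).trim();
        long start;
        long end;
        if (first.isEmpty()) {
            // Suffix form: the number is how many bytes to take from the end of the file.
            long suffixLength = Long.parseLong(last);
            start = Math.max(0, fileSize - suffixLength);
            end = fileSize - 1;
        } else {
            start = Long.parseLong(first);
            // An open-ended or overlong range is clamped to the last byte of the file.
            end = last.isEmpty() ? fileSize - 1 : Math.min(Long.parseLong(last), fileSize - 1);
        }
        if (start > end) {
            throw new IllegalArgumentException("range not satisfiable: " + header);
        }
        return new long[] {start, end};
    }

    static String contentRange(long start, long end, long fileSize) {
        return "bytes " + start + "-" + end + "/" + fileSize;
    }

    static long contentLength(long start, long end) {
        return end - start + 1; // both ends are inclusive
    }

    public static void main(String[] args) {
        long[] range = resolve("bytes=1000-2000", 12000);
        System.out.println(contentRange(range[0], range[1], 12000));
        System.out.println(contentLength(range[0], range[1]));
    }
}
```

For bytes=1000-2000 on a 12000-byte file, this gives Content-Range bytes 1000-2000/12000 and Content-Length 1001.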

2.3 Code Implementation

// Requires: java.io.File, java.io.IOException, java.io.OutputStream, java.io.RandomAccessFile
private void randomReadBytes(File file, long start, long length, OutputStream os) throws IOException {

        // Open the file in read-only mode to fetch one chunk
        try (RandomAccessFile rf = new RandomAccessFile(file, "r")) {

            // Move to the first byte to read; the byte at this offset is included
            rf.seek(start);

            // Buffer for the data read from the file
            byte[] bytes = new byte[2 * 1024];

            // Number of bytes of the chunk that still have to be copied
            long remaining = length;

            // read() may return fewer bytes than requested, so write only what was actually read
            while (remaining > 0) {
                int readCount = rf.read(bytes, 0, (int) Math.min(bytes.length, remaining));
                if (readCount == -1) {
                    // End of file reached before the requested length was copied
                    break;
                }
                os.write(bytes, 0, readCount);
                remaining -= readCount;
            }
        }
    }

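3. Chunked File Upload

The file is split into multiple chunks that are uploaded separately. Once the backend has received every chunk, it merges them. This also makes resumable upload possible.

3.1 Pros and Cons

Pros: a. Different chunks can be uploaded concurrently to different servers, which can raise the upload rate when bandwidth is not the limiting factor.

b. If one chunk fails, the chunks that were already uploaded do not need to be sent again; only the failed chunk is retried. This is exactly what resumable upload amounts to.

Cons: a. The code becomes more complex.

b. In a distributed system, concurrency has to be handled, and once the chunks have reached the server they must be read again and merged, which adds server I/O.

3.2 Implementation Approach

1. Create a chunked upload session: tell the server the file name, the file size, and the total number of chunks. The server returns a unique ID for this upload and an expiration time.

2. Upload each chunk, sending the unique ID from step 1 and the sequence number of the current chunk.

3. Merge the chunks, sending the unique ID from step 1.

4. Abort the chunked upload, sending the unique ID from step 1.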
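The four steps above can be sketched as a small in-memory service. This is an illustration only: ChunkUploadService, its method names, and the choice to keep chunks in memory are all assumptions made for the example. A real server would persist chunks to disk or object storage, and would return the expiration time to the client along with the ID.

```java
import java.io.ByteArrayOutputStream;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

class ChunkUploadService {

    private static final class Session {
        final long fileSize;
        final int totalParts;
        final long expiresAtMillis;
        final Map<Integer, byte[]> parts = new TreeMap<>(); // sorted by part number

        Session(long fileSize, int totalParts, long expiresAtMillis) {
            this.fileSize = fileSize;
            this.totalParts = totalParts;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();

    // Step 1: create the upload session and return its unique ID.
    String create(String fileName, long fileSize, int totalParts, long ttlMillis) {
        String uploadId = UUID.randomUUID().toString();
        sessions.put(uploadId, new Session(fileSize, totalParts, System.currentTimeMillis() + ttlMillis));
        return uploadId;
    }

    // Step 2: store one chunk. Re-sending a chunk overwrites it, so retries are harmless.
    void uploadPart(String uploadId, int partNumber, byte[] data) {
        Session session = require(uploadId);
        if (partNumber < 1 || partNumber > session.totalParts) {
            throw new IllegalArgumentException("part number out of range: " + partNumber);
        }
        synchronized (session) {
            session.parts.put(partNumber, data.clone());
        }
    }

    // Step 3: merge the chunks in part-number order and close the session.
    byte[] complete(String uploadId) {
        Session session = require(uploadId);
        synchronized (session) {
            if (session.parts.size() != session.totalParts) {
                throw new IllegalStateException("some chunks are still missing");
            }
            ByteArrayOutputStream merged = new ByteArrayOutputStream();
            for (byte[] part : session.parts.values()) {
                merged.write(part, 0, part.length);
            }
            if (merged.size() != session.fileSize) {
                throw new IllegalStateException("merged size does not match the declared file size");
            }
            sessions.remove(uploadId);
            return merged.toByteArray();
        }
    }

    // Step 4: abort the upload and discard whatever was received.
    void abort(String uploadId) {
        sessions.remove(uploadId);
    }

    private Session require(String uploadId) {
        Session session = sessions.get(uploadId);
        if (session == null) {
            throw new IllegalArgumentException("unknown upload id");
        }
        if (System.currentTimeMillis() > session.expiresAtMillis) {
            sessions.remove(uploadId);
            throw new IllegalStateException("upload session expired");
        }
        return session;
    }
}
```

Because chunks are keyed by sequence number, they can arrive in any order and from any number of concurrent requests.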

[Sequence diagram of the steps above; image not included]

3.3 Merge Code

Merge the files in order using byte streams:

private static void mergeFileByFileStream(List<String> sourceFilePaths, File targetFile) throws IOException {
        try (OutputStream os = Files.newOutputStream(targetFile.toPath())) {
            for (String filePath : sourceFilePaths) {
                try (InputStream is = Files.newInputStream(new File(filePath).toPath(), StandardOpenOption.READ)) {
                    byte[] bytes = new byte[2048];
                    int readCount;
                    while ((readCount = is.read(bytes)) > 0) {
                        os.write(bytes, 0, readCount);
                    }
                }
            }
        }
    }

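For character (text) files, the chunks can also be merged with a buffered reader and writer. This only works when every chunk boundary falls between characters, because each chunk is decoded on its own: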

private static void mergeFileByBuffer(List<String> sourceFilePaths, File targetFile) throws IOException {
        // With no open options, newBufferedWriter creates the target or truncates an existing one
        try (BufferedWriter writer = Files.newBufferedWriter(targetFile.toPath())) {
            for (String filePath : sourceFilePaths) {
                File file = new File(filePath);
                if (file.exists() && file.isFile()) {
                    try (BufferedReader reader = Files.newBufferedReader(file.toPath())) {
                        // Copy raw characters; readLine() would drop the line terminators
                        char[] chars = new char[2048];
                        int readCount;
                        while ((readCount = reader.read(chars)) != -1) {
                            writer.write(chars, 0, readCount);
                        }
                    }
                }
            }
        }
    }

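With NIO channels: transferFrom and transferTo are essentially equivalent, but transferTo is limited in how much a single call can move (about 2 GB for files and 8 MB for socket channels). Either method may also transfer fewer bytes than requested, so the return value should be checked and the call repeated until everything has been copied.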

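private static void mergeFileByChannel(List<String> sourceFilePaths, File targetFile) throws IOException {
        try (FileChannel oc = new FileOutputStream(targetFile).getChannel()) {
            for (String filePath : sourceFilePaths) {
                File file = new File(filePath);
                if (file.exists() && file.isFile()) {
                    try (FileChannel ic = new FileInputStream(file).getChannel()) {
                        // transferFrom may copy fewer bytes than requested, so loop until the source is drained
                        long size = ic.size();
                        long copied = 0;
                        while (copied < size) {
                            long count = oc.transferFrom(ic, oc.size(), size - copied);
                            if (count <= 0) {
                                break;
                            }
                            copied += count;
                        }
                    }
                }
            }
        }
    }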
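For comparison, the same append step can be written with transferTo, driven from the source channel instead of the target. A minimal sketch; ChannelAppend and append are hypothetical names, and the loop exists because a single transferTo call is not guaranteed to move everything:

```java
import java.io.File;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;

class ChannelAppend {

    // Appends the whole content of source to the end of target.
    static void append(File source, File target) throws IOException {
        try (FileChannel in = FileChannel.open(source.toPath(), StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target.toPath(),
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // transferTo writes at the target channel's current position, so move it to the end.
            out.position(out.size());
            long position = 0;
            long size = in.size();
            while (position < size) {
                long count = in.transferTo(position, size - position, out);
                if (count <= 0) {
                    break; // nothing more could be transferred
                }
                position += count;
            }
        }
    }
}
```

Calling append once per chunk, in chunk order, produces the merged file.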

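With RandomAccessFile: if the start and end byte offsets of every chunk within the merged file are known, multiple threads can write into the merged file at the same time, which speeds up the merge.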

private static void mergeFileByRandomAccessFile(List<String> sourceFilePaths, File targetFile) throws IOException {
        // Sort by name (lexicographic order, so part-10 sorts before part-2 unless the indices are zero-padded)
        sourceFilePaths = sourceFilePaths.stream().sorted(String::compareTo).collect(Collectors.toList());
        for (String filePath : sourceFilePaths) {
            File file = new File(filePath);
            if (file.exists() && file.isFile()) {
                try (FileInputStream inputStream = new FileInputStream(file);
                     RandomAccessFile accessFile = new RandomAccessFile(targetFile, "rw")) {
                    byte[] bytes = new byte[2048];
                    accessFile.seek(accessFile.length());
                    int readCount;
                    while ((readCount = inputStream.read(bytes)) != -1) {
                        accessFile.write(bytes, 0, readCount);
                    }
                }
            }
        }
    }
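The multi-threaded variant mentioned above might look like the following. It is a sketch under stated assumptions: the chunk files are passed in their final order, each chunk's offset is taken as the sum of the sizes of the chunks before it, and ParallelMerger is a hypothetical name.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelMerger {

    static void merge(List<File> parts, File target, int threads) throws Exception {
        // Each chunk starts where the chunks before it end.
        long[] offsets = new long[parts.size()];
        long total = 0;
        for (int i = 0; i < parts.size(); i++) {
            offsets[i] = total;
            total += parts.get(i).length();
        }
        // Pre-size the target so every worker can seek to its own region.
        try (RandomAccessFile raf = new RandomAccessFile(target, "rw")) {
            raf.setLength(total);
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < parts.size(); i++) {
                final File part = parts.get(i);
                final long offset = offsets[i];
                pending.add(pool.submit(() -> {
                    copyTo(part, target, offset);
                    return null;
                }));
            }
            for (Future<?> task : pending) {
                task.get(); // rethrows any I/O failure from a worker
            }
        } finally {
            pool.shutdown();
        }
    }

    // Every worker opens its own RandomAccessFile, so file pointers are never shared.
    private static void copyTo(File part, File target, long offset) throws IOException {
        try (InputStream in = Files.newInputStream(part.toPath());
             RandomAccessFile out = new RandomAccessFile(target, "rw")) {
            out.seek(offset);
            byte[] buffer = new byte[8192];
            int readCount;
            while ((readCount = in.read(buffer)) != -1) {
                out.write(buffer, 0, readCount);
            }
        }
    }
}
```

Whether this is faster than a sequential merge depends on the storage device; on a single spinning disk the extra seeking may cancel out the gain.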

From: https://www.cnblogs.com/zhaodalei/p/17137081.html