需求背景

想实现一个导出功能，可以很简单，也可以很复杂。简单做法，无脑用 EasyExcel + 注解，实现起来简便快捷。

easyexcel能大大减少占用内存的主要原因是：在解析Excel时没有将文件数据一次性全部加载到内存中，而是从磁盘上一行行读取数据，逐个解析。

但是如果想要指定导出字段呢？如果要一次性导出上百万条数据呢？

那可能会面临下面的问题：

如果同步导数据，接口很容易超时。
如果把所有数据一次性装载到内存，很容易引起OOM。
数据量太大sql语句必定很慢。
相同商品编号的数据要放到一起。
如果走异步，如何通知用户导出结果？
如果excel文件太大，目标用户打不开怎么办？

异步处理

要把导出这一功能异步，意味着需要有一张异步任务表来存储这些导出任务。

导入导出任务表如下：

CREATE TABLE `excel_import_process` (
  `id` bigint(20) NOT NULL COMMENT '主键id',
  `import_name` varchar(255) DEFAULT NULL COMMENT '导出名称',
  `import_detail` varchar(255) DEFAULT NULL COMMENT '导出详情',
  `import_status` tinyint(1) DEFAULT NULL COMMENT '导出状态 1-处理中 2-处理完成 3-处理失败',
  `error_file_url` varchar(255) DEFAULT NULL COMMENT '错误文件链接',
  `source_url` varchar(1000) DEFAULT NULL COMMENT '源文件地址',
  `import_total` int(10) DEFAULT '0' COMMENT '导出成功数量',
  `task_type` tinyint(1) DEFAULT '1' COMMENT '任务类型（1：导入 2：导出）',
  `import_type` tinyint(1) DEFAULT NULL COMMENT '导入类型',
  `status` tinyint(1) DEFAULT '1' COMMENT '数据状态 1-启用 2-停用',
  `tenant_id` bigint(20) DEFAULT NULL COMMENT '租户id',
  `create_id` int(11) DEFAULT NULL COMMENT '创建者',
  `create_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT '创建时间',
  `update_id` int(11) DEFAULT NULL COMMENT '更新者',
  `update_time` datetime DEFAULT NULL COMMENT '更新时间',
  PRIMARY KEY (`id`),
  KEY `idx_tenant_id` (`tenant_id`) USING BTREE COMMENT '租户ID'
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COMMENT='导出进度表';

一般来讲，导出接口会调用一个导出方法，简单的异步处理可以开启一个线程去实现导出

1
2
3

ThreadPoolManager.getInstance().execute(new Thread(() -> {
               export(fileName, exportParams, null, null, exportFieldList, null, importType, sysUser.getId(), sysUser.getTenantId(), excelImportProcess, getPayrollDetailsMap);
           }));

完整的异步导出代码如下：

    @PostConstruct
    public void init() {
        // 静态字段，无法直接依赖注入
        ExcelExportUtil.sysDictDataService = applicationContext.getBean(ISysDictDataService.class);
        ExcelExportUtil.contextAwareExecutor = applicationContext.getBean("contextAwareExecutor", ThreadPoolTaskExecutor.class);
    } 

public static void exportDynamicExcel(List<ExcelExportEntity> exportFieldList, Class<?> serviceImpl, Method method, Object dto, String fileName, Integer importType) {
        if (CollUtil.isEmpty(exportFieldList)) {
            throw new BaseException("请选择需要导出的字段");
        }
        SysUser sysUser = UserCacheUtils.getSysUser();
        Class<?> exportClass = getMethodType(method);

        List<String> fieldNames = Arrays.stream(FieldUtils.getAllFields(exportClass)).map(Field::getName).collect(Collectors.toList());
        for (ExcelExportEntity excelExportEntity : exportFieldList) {
            if (!fieldNames.contains((String) excelExportEntity.getKey())) {
                throw new BaseException("需要导出的字段不存在: " + (String) excelExportEntity.getKey());
            }
        }

        ExportParams exportParams = new ExportParams();
        ExportParams params = new ExportParams(fileName, fileName, ExcelType.XSSF);

        contextAwareExecutor.execute(() -> {
            ExcelImportProcess excelImportProcess = createExcelImportProcess(exportParams.getTitle(), sysUser.getTenantId(), importType, sysUser.getId());
            try {
                cacheDict(exportClass);
                export(fileName, exportParams, serviceImpl, dto, exportFieldList, method, importType, sysUser.getId(), sysUser.getTenantId(), excelImportProcess, null);
            } catch (Exception e) {
                e.printStackTrace();
                String fullStackTrace = ExceptionUtils.getFullStackTrace(e);
                excelImportProcess.setImportStatus(ExcelProcessStatusEnum.PROCESS_FAILED.getCode());
                excelImportProcess.setImportDetail(fullStackTrace.substring(0, 150));
                excelImportProcess.updateById();
                log.error("导出发生错误: {}", fullStackTrace);
            }
        });
    }

分页查询

我们调用的导出方法示例代码如下：

private static void export(String fileName, ExportParams exportParams, Class<?> excelExportTaskImpl, Object queryParams,
                              List<ExcelExportEntity> excelExportEntityList, Method method, Integer importType, Integer userId,
                              Long tenantId, ExcelImportProcess excelImportProcess, List<Map<String, Object>> dataMap) {
       Field tenantIdField = null;
       if (!Objects.isNull(queryParams)) {
           tenantIdField = ReflectionUtils.findField(queryParams.getClass(), "tenantId");
       }
       if (!Objects.isNull(tenantIdField)) {
           tenantIdField.setAccessible(true);
           ReflectionUtils.setField(tenantIdField, queryParams, tenantId);
       }

       Workbook workbook;
       StopWatch stopWatch = new StopWatch();
       try {
           stopWatch.start();
           log.info("---开始执行excel导出---");
           if (Objects.isNull(dataMap)) {
               Object excelExportTask = SpringUtils.getBean(excelExportTaskImpl);
               workbook = cn.afterturn.easypoi.excel.ExcelExportUtil.exportBigExcel(exportParams, excelExportEntityList, (queryParam, page) -> dataList(excelExportTask, method, excelImportProcess, queryParam, page), queryParams);  // 导出-分页查询
           } else {
               excelImportProcess.setImportTotal(dataMap.size());
               ExportParams exportParams2 = new ExportParams(null, fileName);
               exportParams2.setType(ExcelType.XSSF);
               workbook = cn.afterturn.easypoi.excel.ExcelExportUtil.exportExcel(exportParams2, excelExportEntityList, dataMap);
           }
           downloadExcel(fileName, workbook, excelImportProcess);
           stopWatch.stop();
           log.info("---excel导出结束---, 成功导出:{}行, 总共耗时:{}秒, 导出记录id: {}", excelImportProcess.getImportDetail(), stopWatch.getTotalTimeSeconds(), excelImportProcess.getId());
       } catch (Exception e) {
           String fullStackTrace = ExceptionUtils.getFullStackTrace(e);
           excelImportProcess.setImportStatus(ExcelProcessStatusEnum.PROCESS_FAILED.getCode());
           excelImportProcess.setImportDetail(fullStackTrace.substring(0, 150));
           excelImportProcess.updateById();
           log.error("导出发生错误: {}", fullStackTrace);
       }
   }

可以看到，exportBigExcel 方法可以指定表头，只需要传入excelExportEntityList即可，其中的dataList方法为分页方法，在这里指定了分页和总数信息，通过反射+动态代理查询。

private static List<Object> dataList(Object excelExportTask, Method method, ExcelImportProcess excelImportProcess, Object queryParam, int page) {
      if (!(queryParam instanceof PageParam)) {
          throw new BaseException("参数不是PageParam类型");
      }
      // 设置分页和租户id信息
      ((PageParam) queryParam).setPageNum(page);
      ((PageParam) queryParam).setPageSize(sliceSize); // 默认为 20000 条
      log.info("pageNum: {}", page);
      List<Object> data = (List<Object>) ReflectionUtils.invokeMethod(method, excelExportTask, queryParam);

      if (!Objects.isNull(data)) {
          data = serializeDataList(data);
          Integer importTotal = excelImportProcess.getImportTotal();
          if (Objects.isNull(importTotal)) {
              importTotal = 0;
          }
          excelImportProcess.setImportTotal(importTotal + data.size());
      }
      return data;
  }

exportBigExcel的遍历分页代码如下：

public Workbook exportBigExcel(IExcelExportServer server, Object queryParams) {
       int page = 1;
       List<Object> list = server
               .selectListForExcelExport(queryParams, page++);
       while (list != null && list.size() > 0) {
           write(list);
           list = server.selectListForExcelExport(queryParams, page++);
       }
       return close();
   }

看到这里你肯定会有个疑问，都支持导出百万级数据了，这样随便把查询条件放到 while 里去循环查询，数据量越大，偏移量越多，不会有深分页问题吗？

但是再仔细看看，这个方法只传了个页数，实际上，它不关心你怎么分页的。深分页这个问题其实是需要自己在业务代码里解决，easypoi只负责我们告诉分页以后的数据。

多个 sheet

我们知道，excel对一个sheet存放的最大数据量，是有做限制的，一个sheet最多可以保存1048576行数据。否则在保存数据时会直接报错：

1	invalid row number (1048576) outside allowable range (0..1048575)

如果你想导出一百万以上的数据，excel的一个sheet肯定是存放不下的，因此我们需要把数据保存到多个sheet中。

之前说过，我们一般是通过limit语句来实现分页查询功能的，其中的pagetStart参数，是通过pageNo和pageSize动态计算出来的。如果只有一个sheet可以这么玩，但如果有多个sheet就会有问题。因此，我们需要重新计算limit的起始位置。

ExcelWriter excelWriter = EasyExcelFactory.write(out).build();
int totalPage = searchUserTotalPage(searchModel);

if(totalPage > 0) {
   Page<User> page = Page.create(searchModel);
   int sheet = (totalPage % maxSheetCount == 0) ? totalPage / maxSheetCount: (totalPage / maxSheetCount) + 1;
   for(int i=0;i<sheet;i++) {
      WriterSheet writeSheet = buildSheet(i,"sheet"+i);
      int startPageNo = i*(maxSheetCount/pageSize)+1;
      int endPageNo = (i+1)*(maxSheetCount/pageSize);
      while(page.getPageNo()>=startPageNo && page.getPageNo()<=endPageNo) {
        page = searchUser(searchModel);
        if(CollectionUtils.isEmpty(page.getList())) {
            break;
        }
        
        excelWriter.write(page.getList(),writeSheet);
        page.setPageNo(page.getPageNo()+1);
     }
   }
}

流式查询

可以使用MapReduce思想并利用流式API来处理

mybatis有提供cursor流式接口，然后就不用数据的join和order by那些耗性能的操作了，流式1 条条的取出数据在内存中去关联和排序，最终组合数据写到一个文档里面

文件上传

由于现在我们导出excel数据的方案改成了异步，所以没法直接将excel文件，同步返回给用户。

因此我们需要先将excel文件存放到一个地方，当用户有需要时，可以访问到。

这时，我们可以直接将文件上传到COS文件服务器上。

所以你会发现，我们在异步任务表的设计上添加上了文件链接地址（细节！）

private static void downloadExcel(String fileName, Workbook workbook, ExcelImportProcess excelImportProcess) throws IOException {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    workbook.write(outputStream);
    fileName = fileName + ToolUtil.serialNumber();
    String url = FileUploadUtil.upload(outputStream.toByteArray(), fileName + ".xlsx", FileUploadConstant.EXPORT);  // 上传到 cos
    excelImportProcess.setSourceUrl(url);

    excelImportProcess.setImportName(fileName);
    excelImportProcess.setImportDetail("成功导出" + excelImportProcess.getImportTotal() + "条");
    excelImportProcess.setImportStatus(ExcelProcessStatusEnum.PROCESS_COMPLETE.getCode());
    excelImportProcess.updateById();
}

功能优化

写过一些 crud 代码就清楚，导出方法一般和查询的接口一样，需要在导出前调用查询的方法把 List 查出来，然后再进行后续的导出操作，把文件输出流传给前端。

但是如果需要导出一些带有字典值的字段呢？查询接口一般都是把表里的枚举值查出来，前端再通过查询字典表把这些枚举值替换成对应的中文字段。但是导出就没办法通过前端来转换了，这种转换字典的操作只能让后端做。

但是更加高级的做法可以使用注解，只需要在返回的VO里给某些需要转换的字段上加一个注解，就可以省去繁琐的替换操作了。优雅，太优雅了。

导出时检测字典类型的注解

private static void cacheDict(Class<?> exportClass) {
       List<String> dictTypeList = Lists.newArrayList();
       for (Field field : FieldUtils.getAllFields(exportClass)) {
           DictCode dictCode = field.getAnnotation(DictCode.class);
           if (ToolUtil.isNotEmpty(dictCode)) {
               String type = dictCode.type();
               String value = dictCode.value();
               if (ToolUtil.isNotEmpty(value) && "dict".equals(type)) {
                   dictTypeList.add(value);
               }
           }
       }
       Map<String, Map<String, String>> dictMap = new HashMap<>();
       if (ToolUtil.isNotEmpty(dictTypeList)) {
           dictMap = sysDictDataService.selectEmpDictData(dictTypeList);
       }
       RedisUtil.hmSet("sys:dict:", dictMap);
   }

通过查询字典表，替换掉字典键值

@Override
   public Map<String, Map<String, String>> selectEmpDictData(List<String> dictTypeList) {
       Long tenantId = UserCacheUtils.getByTenantId();
       List<SysDictDataVO> list = dictDataMapper.selectEmpDictData(dictTypeList, tenantId);
       return list.stream().collect(Collectors.groupingBy(SysDictDataVO::getDictType, TreeMap::new, Collectors.toMap(SysDictDataVO::getDictValue, SysDictDataVO::getDictLabel, (existing, replacement) -> existing, // 如果遇到相同的键，保留现有的
               LinkedHashMap::new)));
   }