springbatch---->springbatch的使用（二）

　　这里我们对springbatch做一个比较深入的学习例子，解压文件，读取文件内容过滤写入到数据库中。如果你掉进了黑暗里，你能做的，不过是静心等待，直到你的双眼适应黑暗。

springbatch的使用案例

　　首先，我们来列举一下spring batch里面所涉及到的概念。

1、Job repository
    An infrastructure component that persists job execution metadata
2、Job launcher 
    An infrastructure component that starts job executions
3、Job 
    An application component that represents a batch process
4、Step 
    A phase in a job; a job is a sequence of steps
5、Tasklet 
    A transactional, potentially repeatable process occurring in a step
6、Item 
    A record read from or written to a data source
7、Chunk 
    A list of items of a given size
8、Item reader 
    A component responsible for reading items from a data source
9、Item processor 
    A component responsible for processing (transforming, validating, or filtering) a read item before it’s written
10、Item writer 
    A component responsible for writing a chunk to a data source

　　我们的项目是基于上篇博客的，具体的可以参考博客：springbatch---->springbatch的使用（一）。流程：将得到的zip文件解压成txt文件，根据里面的信息。我们过滤到年龄大于30的数据，并把过滤后的数据按格式存放到数据库中。以下我们只列出新增或者修改的文件内容。

一、关于xml文件的修改，增加一个job.xml文件和修改batch.xml文件

batch.xml文件增加内容：

<!-- 读取文件写入到数据库的job -->
<job id="readFlatFileJob">
    <step id="decompress" next="readWriter">
        <tasklet ref="decompressTasklet"/>
    </step>
    <step id="readWriter" next="clean">
        <tasklet>
            <chunk reader="reader" writer="writer" commit-interval="100" processor="processor"/>
        </tasklet>
    </step>
    <step id="clean">
        <tasklet ref="cleanTasklet"/>
    </step>
</job>

　　关于commit-interval：Number of items to process before issuing a commit. When the number of items read reaches the commit interval number, the entire corresponding chunk is written out through the item writer and the transaction is committed.

新增的job.xml文件内容如下：

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd">

    <!-- 对压缩文件进行解压 -->
    <bean id="decompressTasklet" class="spring.batch.readFile.DecompressTasklet">
        <property name="inputResource" value="file:file/file.zip"/>
        <property name="targetDirectory" value="file"/>
        <property name="targetFile" value="file.txt"/>
    </bean>

    <!-- 读取文本文件 -->
    <bean id="reader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="lineMapper" ref="lineMapper"/>
        <property name="linesToSkip" value="1"/> <!-- 跳过第一行，也就是从第二行开始读 -->
        <property name="resource" value="file:file/file.txt"/>
    </bean>

    <!-- 对读取的内容做过滤处理 -->
    <bean id="processor" class="spring.batch.readFile.FileProcessor"/>

    <!-- 将读取的内容写到数据库中 -->
    <bean id="writer" class="spring.batch.readFile.FileWriter">
        <constructor-arg ref="dataSource"/>
    </bean>

    <!-- 清除压缩文件-->
    <bean id="cleanTasklet" class="spring.batch.readFile.FileCleanTasklet">
        <property name="resource" value="file:file/file.zip"/>
    </bean>

    <!-- 对文件里面的数据进行|分割，映射成一个类 -->
    <bean id="lineMapper" class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
        <property name="lineTokenizer" ref="delimitedLineTokenizer"/>
        <property name="fieldSetMapper">
            <bean class="spring.batch.readFile.ReadFileMapper"/>
        </property>
    </bean>
    <bean id="delimitedLineTokenizer" class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
        <constructor-arg name="delimiter" value="|"/>
    </bean>

    <!-- 简单的数据源配置 -->
    <bean id="dataSource" class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
        <property name="url" value="jdbc:mysql://127.0.0.1:3306/csiilearn"/>
        <property name="username" value="root"/>
        <property name="password" value="*****"/>
    </bean>
</beans>

其实关于有格式的文件的读写，springbatch为我们提供了FlatFileItemWriter和FlatFileItemReader类。我们可以直接使用，只需要配置它的属性就行。

二、新增加的java类，如下所示

DecompressTasklet：对zip文件进行解压

package spring.batch.readFile;

import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.core.io.Resource;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.util.zip.ZipInputStream;

/**
 * @Author: huhx
 * @Date: 2017-11-01 上午 9:39
 */
public class DecompressTasklet implements Tasklet {
    private Resource inputResource;
    private String targetDirectory;
    private String targetFile;

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        System.out.println("file path: " + inputResource.getFile().getAbsolutePath());
        ZipInputStream zis = new ZipInputStream(
                new BufferedInputStream(
                       inputResource.getInputStream()));
        File targetDirectoryAsFile = new File(targetDirectory);
        if (!targetDirectoryAsFile.exists()) {
            FileUtils.forceMkdir(targetDirectoryAsFile);
        }
        File target = new File(targetDirectory, targetFile);
        BufferedOutputStream dest = null;
        while (zis.getNextEntry() != null) {
            if (!target.exists()) {
                target.createNewFile();
            }
            FileOutputStream fos = new FileOutputStream(target);
            dest = new BufferedOutputStream(fos);
            IOUtils.copy(zis, dest);
            dest.flush();
            dest.close();
        }
        zis.close();
        if (!target.exists()) {
            throw new IllegalStateException("Could not decompress anything from the archive!");
        }
        return RepeatStatus.FINISHED;
    }

    public void setInputResource(Resource inputResource) {
        this.inputResource = inputResource;
    }

    public void setTargetDirectory(String targetDirectory) {
        this.targetDirectory = targetDirectory;
    }

    public void setTargetFile(String targetFile) {
        this.targetFile = targetFile;
    }
}

People：数据库映射的javaBean类

public class People implements Serializable {
    private String username;
    private int age;
    private String address;
    private Date birthday;
    
    ..get...set...
}

ReadFileMapper：读取文件字段的映射

package spring.batch.readFile;

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

/**
 * @Author: huhx
 * @Date: 2017-11-01 上午 10:11
 */
public class ReadFileMapper implements FieldSetMapper<People> {

    @Override
    public People mapFieldSet(FieldSet fieldSet) throws BindException {
        People people = new People();
        people.setUsername(fieldSet.readString(0));
        people.setAge(fieldSet.readInt(1));
        people.setAddress(fieldSet.readString(2));
        people.setBirthday(fieldSet.readDate(3));
        return people;
    }
}

FileProcessor：对读取的数据做进一步的处理，这里我们是过滤操作

package spring.batch.readFile;

import org.springframework.batch.item.ItemProcessor;

/**
 * @Author: huhx
 * @Date: 2017-11-01 上午 9:43
 */
public class FileProcessor implements ItemProcessor<People, People> {
    @Override
    public People process(People item) throws Exception {
        return needsToBeFiltered(item) ? null : item;
    }

    private boolean needsToBeFiltered(People item) {
        int age = item.getAge();
        return age > 30 ? true : false;
    }
}

FileWriter：对最终过滤后的数据写入数据库中

package spring.batch.readFile;

import org.springframework.batch.item.ItemWriter;
import org.springframework.jdbc.core.JdbcTemplate;

import javax.sql.DataSource;
import java.util.List;

/**
 * @Author: huhx
 * @Date: 2017-11-01 上午 9:42
 */
public class FileWriter implements ItemWriter<People> {
    private static final String INSERT_PRODUCT = "INSERT INTO batch_user (user_name, age, address, birthday) VALUES(?, ?, ?, ?)";

    private JdbcTemplate jdbcTemplate;

    public FileWriter(DataSource dataSource) {
        this.jdbcTemplate = new JdbcTemplate(dataSource);
    }

    @Override
    public void write(List<? extends People> items) throws Exception {
        for (People people : items) {
            jdbcTemplate.update(INSERT_PRODUCT, people.getUsername(), people.getAge(), people.getAddress(), people.getBirthday());
        }
    }
}

FileCleanTasklet：做流程最后的清理处理，删除zip文件

package spring.batch.readFile;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.core.io.Resource;

import java.io.File;

/**
 * @Author: huhx
 * @Date: 2017-11-01 上午 9:44
 */
public class FileCleanTasklet implements Tasklet {
    private Resource resource;

    public void setResource(Resource resource) {
        this.resource = resource;
    }

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        File file = resource.getFile();
        file.deleteOnExit();
        return RepeatStatus.FINISHED;
    }
}

三、修改JobLaunch.java文件里面的内容

Job job = (Job) context.getBean("readFlatFileJob");

zip解压后的文件内容：

姓名|年龄|地址|生日
李元芳|32|黄冈|1985-12-15
王昭君|23|武汉|1995-10-15
狄仁杰|21|天津|1958-12-12
孙悟空|34|黄冈|1985-12-25
牛魔王|26|武汉|1999-09-12
孙尚香|27|天津|1969-12-12

运行后的数据库batch_user表的数据：

友情链接

相关阅读:
模型驱动架构探索之游戏引擎设计（二）开始建模
 模型驱动架构探索之游戏引擎设计（二）粒度统一
 模型驱动架构探索之游戏引擎设计（一）
模型驱动架构探索之游戏引擎设计（序）
时隔几年，再写传统的简单问题算法，又有何不同？
【自学笔记】0基础自学机器学习（第三天）
【自学笔记】0基础自学机器学习（第二天）
JavaSE-知识点总结
 Java框架之SpringBoot 09-Web构建-yml-模块-注解
 Java框架之SpringSecurity 08-权限系统
原文地址：https://www.cnblogs.com/huhx/p/baseusespringbatch2.html