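1. The project structure is as follows:
2. File descriptions:
2.1. CreditBill: the domain object representing a credit card transaction record
2.2. CreditBillProcessor: the record processing class; in this scenario it only prints the record
2.3. credit-card-bill-201910.csv: the raw bill data
2.4. job.xml: the job definition file
2.5. job-context.xml: the basic infrastructure that a Spring Batch job needs
2.6. JobLaunch: the class that launches the batch job
2.7. JobLaunchTest: the JUnit unit test class, which uses the test framework classes provided by Spring
2.8. pom.xml: declares the required jar dependencies
3. File contents: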
3.1. CreditBill: the entity class
/**
 * @author miaosj
 * @version 1.0
 * @date 2019/10/8
 */
public class CreditBill {
    /** Bank account number */
    private String accountID;
    /** Account holder name */
    private String name;
    /** Amount spent */
    private double amount;
    /** Transaction date */
    private String date;
    /** Place of the transaction */
    private String address;

    public String getAccountID() {
        return accountID;
    }

    public void setAccountID(String accountID) {
        this.accountID = accountID;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public double getAmount() {
        return amount;
    }

    public void setAmount(double amount) {
        this.amount = amount;
    }

    public String getDate() {
        return date;
    }

    public void setDate(String date) {
        this.date = date;
    }

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }
}
3.2. credit-card-bill-201910.csv: the raw bill data
4047390012345678,tom,100.00,2013-2-2 12:00:08,Lu Jia Zui road
4047390012345678,tom,320.00,2013-2-3 12:00:08,Lu Jia Zui road
4047390012345678,tom,674.00,2013-2-6 12:00:08,Lu Jia road
4047390012345678,tom,793.00,2013-2-9 12:00:08,Lu Jia Zui road
4047390012345678,tom,360.00,2013-2-11 12:00:08,Lu Jia Zui road
4047390012345678,tom,893.00,2013-2-28 12:00:08,Lu Jia Zui road
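To make the column layout concrete, here is a minimal plain-Java sketch of how one row splits into the five named fields of CreditBill. It is only an illustration: the class name BillLineParser and its parse method are hypothetical and not part of the project, and in the actual job this work is done by the line tokenizer and field set mapper configured in job.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; it is not part of the project.
class BillLineParser {
    // Column names in the order they appear in each CSV row.
    static final String[] NAMES = {"accountID", "name", "amount", "date", "address"};

    // Splits one comma-separated row and pairs each token with its column name.
    static Map<String, String> parse(String line) {
        // The -1 limit keeps trailing empty columns instead of dropping them.
        String[] tokens = line.split(",", -1);
        if (tokens.length != NAMES.length) {
            throw new IllegalArgumentException(
                    "Expected " + NAMES.length + " fields but found " + tokens.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            fields.put(NAMES[i], tokens[i].trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String row = "4047390012345678,tom,100.00,2013-2-2 12:00:08,Lu Jia Zui road";
        System.out.println(parse(row));
    }
}
```

Note that every value is still a String at this point; converting "100.00" into the double amount field is a separate mapping step.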
3.3. job-context.xml: job infrastructure
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd"
       default-autowire="byName">

    <!-- Job repository. Spring Batch provides two kinds of job repository for recording
         the information produced while a job runs: in-memory and database-backed.
         The in-memory one is used here. -->
    <bean id="jobRepository"
          class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
    </bean>

    <!-- Job launcher, used to start a job -->
    <bean id="jobLauncher"
          class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>

    <!-- Transaction manager, which provides transaction support while Spring Batch operates on data -->
    <bean id="transactionManager"
          class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
</beans>
3.4. job.xml: the job definition
<?xml version="1.0" encoding="UTF-8"?>
<bean:beans xmlns="http://www.springframework.org/schema/batch"
            xmlns:bean="http://www.springframework.org/schema/beans"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xsi:schemaLocation="http://www.springframework.org/schema/beans
                http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
                http://www.springframework.org/schema/batch
                http://www.springframework.org/schema/batch/spring-batch-2.2.xsd">

    <!-- Import the job-context.xml configuration file -->
    <bean:import resource="classpath:job-context.xml"/>

    <!-- Define billJob; billStep covers reading, processing, and writing the data -->
    <job id="billJob">
        <step id="billStep">
            <tasklet transaction-manager="transactionManager">
                <!-- commit-interval="2" sets the commit interval:
                     here, one write is performed for every 2 items processed -->
                <chunk reader="csvItemReader" writer="csvItemWriter"
                       processor="creditBillProcessor" commit-interval="2">
                </chunk>
            </tasklet>
        </step>
    </job>

    <!-- Reads the credit card bill file (CSV format) -->
    <bean:bean id="csvItemReader"
               class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">
        <!-- The file resource to read -->
        <bean:property name="resource" value="classpath:data/credit-card-bill-201910.csv"/>
        <!-- Converts each line of the text into a domain object -->
        <bean:property name="lineMapper">
            <bean:bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <!-- Reference to lineTokenizer -->
                <bean:property name="lineTokenizer" ref="lineTokenizer"/>
                <!-- fieldSetMapper maps the fields onto the domain object
                     using the names defined in lineTokenizer -->
                <bean:property name="fieldSetMapper">
                    <bean:bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                        <bean:property name="prototypeBeanName" value="creditBill">
                        </bean:property>
                    </bean:bean>
                </bean:property>
            </bean:bean>
        </bean:property>
    </bean:bean>

    <!-- lineTokenizer defines the delimiter for each line and the list of names
         used once the line has been mapped to a FieldSet -->
    <bean:bean id="lineTokenizer"
               class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
        <bean:property name="delimiter" value=","/>
        <bean:property name="names">
            <bean:list>
                <bean:value>accountID</bean:value>
                <bean:value>name</bean:value>
                <bean:value>amount</bean:value>
                <bean:value>date</bean:value>
                <bean:value>address</bean:value>
            </bean:list>
        </bean:property>
    </bean:bean>

    <!-- Writes the credit card bill file (CSV format) -->
    <bean:bean id="csvItemWriter"
               class="org.springframework.batch.item.file.FlatFileItemWriter" scope="step">
        <!--<bean:property name="resource" value="file:target/data/outputFile.csv"/>-->
        <!--<bean:property name="resource" value="file:target/outputFile.csv"/>-->
        <bean:property name="resource" value="classpath:data/outputFile.csv"/>
        <bean:property name="lineAggregator">
            <bean:bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
                <bean:property name="delimiter" value=","></bean:property>
                <bean:property name="fieldExtractor">
                    <bean:bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                        <bean:property name="names" value="accountID,name,amount,date,address">
                        </bean:property>
                    </bean:bean>
                </bean:property>
            </bean:bean>
        </bean:property>
    </bean:bean>

    <!-- Domain object, declared with prototype scope -->
    <bean:bean id="creditBill" scope="prototype" class="CreditBill">
    </bean:bean>

    <!-- Handles the business processing of the data -->
    <bean:bean id="creditBillProcessor" scope="step" class="CreditBillProcessor">
    </bean:bean>
</bean:beans>
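The creditBillProcessor bean above points at the CreditBillProcessor class, whose source is not listed in this article. Section 2.2 only says that it prints the record, so the following is a sketch of what such a class might look like, not the original code. To keep the sketch compilable on its own, it declares a stand-in ItemProcessor interface and a cut-down CreditBill; in the real project you would import org.springframework.batch.item.ItemProcessor and use the full CreditBill class from section 3.1 instead. The exact message format is an assumption.

```java
// Stand-in for org.springframework.batch.item.ItemProcessor so that this sketch
// compiles without Spring Batch. In the project, import the real interface instead.
interface ItemProcessor<I, O> {
    O process(I item) throws Exception;
}

// Cut-down stand-in for the CreditBill class from section 3.1.
class CreditBill {
    private String accountID;
    private String name;
    private double amount;

    public String getAccountID() { return accountID; }
    public void setAccountID(String accountID) { this.accountID = accountID; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public double getAmount() { return amount; }
    public void setAmount(double amount) { this.amount = amount; }
}

// Assumed shape of the processor: print the record and pass it through unchanged.
class CreditBillProcessor implements ItemProcessor<CreditBill, CreditBill> {
    @Override
    public CreditBill process(CreditBill bill) {
        // The message format is an assumption; the article only says the record is printed.
        System.out.println("Processing bill: accountID=" + bill.getAccountID()
                + ", name=" + bill.getName()
                + ", amount=" + bill.getAmount());
        // Returning the item hands it on to the writer.
        return bill;
    }

    public static void main(String[] args) {
        CreditBill bill = new CreditBill();
        bill.setAccountID("4047390012345678");
        bill.setName("tom");
        bill.setAmount(100.00);
        new CreditBillProcessor().process(bill);
    }
}
```

Because the processor returns the same object it receives, the output file written by csvItemWriter would contain the same records as the input file.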
3.5. JobLaunch: launching the job from Java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

/**
 * @author E101206
 * @version 1.0
 * @date 2019/10/8
 */
public class JobLaunch {
    @SuppressWarnings("resource")
    public static void main(String[] args) {
        // Initialize the application context
        ApplicationContext context = new ClassPathXmlApplicationContext("job/job.xml");
        // Look up the job launcher by bean name in the Spring context
        JobLauncher launcher = (JobLauncher) context.getBean("jobLauncher");
        // Look up the job
        Job job = (Job) context.getBean("billJob");
        try {
            JobExecution result = launcher.run(job, new JobParameters());
            System.out.println(result.toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
3.6. JobLaunchTest: unit test
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

/**
 * @author E101206
 * @version 1.0
 * @date 2019/10/8
 */
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = {"/job/job.xml"})
public class JobLaunchTest {
    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    @Qualifier("billJob")
    private Job job;

    @Before
    public void setUp() throws Exception {
    }

    @After
    public void tearDown() throws Exception {
    }

    @Test
    public void billJob() throws Exception {
        JobExecution result = jobLauncher.run(job, new JobParameters());
        System.out.println(result.toString());
    }
}
3.7. pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.msj</groupId>
    <artifactId>spring-batch-example</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <jdk.version>1.8</jdk.version>
        <spring.version>4.3.8.RELEASE</spring.version>
        <spring.batch.version>3.0.7.RELEASE</spring.batch.version>
        <junit.version>4.11</junit.version>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>${jdk.version}</source>
                    <target>${jdk.version}</target>
                </configuration>
            </plugin>
        </plugins>
    </build>

    <dependencies>
        <!--<!– Spring Core –>-->
        <!--<dependency>-->
        <!--<groupId>org.springframework</groupId>-->
        <!--<artifactId>spring-core</artifactId>-->
        <!--<version>${spring.version}</version>-->
        <!--</dependency>-->

        <!--<!– Spring jdbc, for database –>-->
        <!--<dependency>-->
        <!--<groupId>org.springframework</groupId>-->
        <!--<artifactId>spring-jdbc</artifactId>-->
        <!--<version>${spring.version}</version>-->
        <!--</dependency>-->

        <!--<!– Spring XML to/back object –>-->
        <!--<dependency>-->
        <!--<groupId>org.springframework</groupId>-->
        <!--<artifactId>spring-oxm</artifactId>-->
        <!--<version>${spring.version}</version>-->
        <!--</dependency>-->

        <!-- Spring Batch dependencies -->
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-core</artifactId>
            <version>${spring.batch.version}</version>
        </dependency>
        <!--<dependency>-->
        <!--<groupId>org.springframework.batch</groupId>-->
        <!--<artifactId>spring-batch-infrastructure</artifactId>-->
        <!--<version>${spring.batch.version}</version>-->
        <!--</dependency>-->

        <!-- Spring Batch unit test -->
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-test</artifactId>
            <version>${spring.batch.version}</version>
        </dependency>

        <!-- Junit -->
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>${junit.version}</version>
            <scope>test</scope>
        </dependency>
    </dependencies>
</project>
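4. Concepts
4.1. Job Repository: the job repository, which stores the state of jobs and steps while they run
4.2. Job Launcher: the job launcher, which provides the entry point for running a job
4.3. Job: a job, made up of one or more steps, which encapsulates the whole batch operation
4.4. Step: a job step, one phase in the execution of a job
4.5. Tasklet: the operation that carries out a step's actual logic; it can be executed repeatedly and can be configured to run synchronously or asynchronously
4.6. Chunk: a collection of a given number of items; the read, process, and write operations and the commit interval can all be defined for a chunk
4.7. Item: a single record
4.8. ItemReader: reads items from a data source
4.9. ItemProcessor: processes an item before it is written to the data source, for example data cleansing, conversion, filtering, or validation
4.10. ItemWriter: writes items to the data source in batches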
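The relationship between Chunk, ItemReader, ItemProcessor, and ItemWriter can be sketched as a simple loop in plain Java. This is a deliberately simplified model and not Spring Batch's actual implementation: it leaves out transactions, restart, skip, and retry, and the class ChunkLoopSketch with its run method is hypothetical. It only shows how a commit interval of 2 groups six records into three write calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical, simplified model of chunk-oriented processing; not Spring Batch code.
class ChunkLoopSketch {
    // Runs a read-process-write loop and returns the chunks that were "written".
    static <I, O> List<List<O>> run(Iterator<I> reader, Function<I, O> processor, int commitInterval) {
        List<List<O>> written = new ArrayList<>();
        while (reader.hasNext()) {
            // Read phase: pull items one at a time until the chunk is full or input ends.
            List<I> items = new ArrayList<>();
            while (items.size() < commitInterval && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process phase: transform each item individually.
            List<O> outputs = new ArrayList<>();
            for (I item : items) {
                outputs.add(processor.apply(item));
            }
            // Write phase: the whole chunk goes to the writer in a single call.
            // In a real step this is also where the transaction would be committed.
            written.add(outputs);
        }
        return written;
    }

    public static void main(String[] args) {
        List<String> amounts = Arrays.asList("100.00", "320.00", "674.00", "793.00", "360.00", "893.00");
        List<List<String>> chunks = run(amounts.iterator(), amount -> amount, 2);
        System.out.println(chunks.size() + " chunks: " + chunks);
    }
}
```

With an odd number of records, the last chunk is simply smaller than the commit interval, for example five records with an interval of 2 give chunks of 2, 2, and 1.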