This post explains how batch processing works in Spring Boot. Spring Batch is a framework for processing large volumes of data in automated jobs that run without user intervention. The sections below walk through the configuration of Spring Boot Batch step by step and end with a working sample application.
How Spring Boot Batch works
Spring Boot Batch is built from a small set of interfaces and classes. The JobLauncher starts a job. The Job describes the batch job that runs each time the batch is invoked, and it is made up of one or more steps. Each Step defines one stage of the job, and a typical step reads, processes, and writes data. These three phases are implemented with the ItemReader, ItemProcessor, and ItemWriter interfaces.
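The read, process, write flow described above can be sketched in plain Java. This is only an illustration of the idea: SimpleReader, SimpleProcessor, and SimpleWriter are hypothetical stand-ins, not the Spring Batch interfaces, and the sketch ignores chunking, transactions, and restart handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java sketch of the read -> process -> write flow of a step.
// The three nested interfaces are hypothetical stand-ins, not Spring Batch types.
class StepFlowSketch {

    interface SimpleReader<T> { T read(); }                  // returns null when the input is exhausted
    interface SimpleProcessor<I, O> { O process(I item); }   // transforms one item
    interface SimpleWriter<T> { void write(List<T> items); } // receives a list of processed items

    // Reads until the reader returns null, processes every item, then writes the results once.
    static <I, O> void runStep(SimpleReader<I> reader,
                               SimpleProcessor<I, O> processor,
                               SimpleWriter<O> writer) {
        List<O> processed = new ArrayList<>();
        I item;
        while ((item = reader.read()) != null) {
            processed.add(processor.process(item));
        }
        writer.write(processed);
    }

    // Convenience wrapper: uppercases every string and returns what the writer received.
    static List<String> upperCaseAll(List<String> source) {
        Iterator<String> it = source.iterator();
        List<String> written = new ArrayList<>();
        StepFlowSketch.<String, String>runStep(
                () -> it.hasNext() ? it.next() : null,
                item -> item.toUpperCase(),
                items -> written.addAll(items));
        return written;
    }

    public static void main(String[] args) {
        System.out.println(upperCaseAll(Arrays.asList("a", "b", "c"))); // [A, B, C]
    }
}
```

The reader signals the end of the input by returning null, which is the same convention the ItemReader in this post relies on.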
ItemReader Interface
The ItemReader interface retrieves the input data and is the first phase of a step. It defines a single read method that takes no arguments and returns the next item, or null when there is no more data. Each item returned by the reader is passed on to the processor.
In the example below, the reader takes its data from a string array. Each call to the read method returns the next string from the array, prefixed with its index. Once the index reaches the end of the array, read returns null, which tells Spring Batch that the input is exhausted and reading stops.
MyBatchReader.java
    package com.test;

    import org.springframework.batch.item.ItemReader;

    public class MyBatchReader implements ItemReader<String> {

        private String[] stringArray = { "0", "1", "2", "3" };
        private int index = 0;

        // Returns the next item, or null once the array is exhausted.
        @Override
        public String read() throws Exception {
            if (index >= stringArray.length) {
                System.out.println("MyBatchReader : No More data to process");
                return null;
            }
            String data = index + " " + stringArray[index];
            index++;
            System.out.println("MyBatchReader : Reading data : " + data);
            return data;
        }
    }
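To make the read contract concrete, here is a minimal sketch that copies the reader logic into a class with no Spring Batch dependency and drains it the way a caller would, by calling read until it returns null. The class name and the drain helper are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-alone copy of the reader logic, without the Spring Batch dependency,
// showing how a caller keeps invoking read() until it returns null.
class ReaderDrainSketch {

    private final String[] stringArray = { "0", "1", "2", "3" };
    private int index = 0;

    // Same logic as MyBatchReader.read(): one item per call, null at the end.
    String read() {
        if (index >= stringArray.length) {
            return null;
        }
        String data = index + " " + stringArray[index];
        index++;
        return data;
    }

    // Hypothetical helper: collects everything the reader still has to offer.
    static List<String> drain(ReaderDrainSketch reader) {
        List<String> items = new ArrayList<>();
        String item;
        while ((item = reader.read()) != null) {
            items.add(item);
        }
        return items;
    }

    public static void main(String[] args) {
        ReaderDrainSketch reader = new ReaderDrainSketch();
        System.out.println(drain(reader)); // [0 0, 1 1, 2 2, 3 3]
        System.out.println(drain(reader)); // [] because the index is never reset
    }
}
```

The second call returns nothing because the reader keeps its index as instance state. A reader written this way yields its data only once per instance.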
ItemProcessor Interface
The ItemProcessor interface processes the data returned by the ItemReader. It defines a process method that takes one item as its argument, applies the required processing, and returns the result. The processed item is then passed to the ItemWriter.
In the example below, the process method prints the incoming data to the console and converts it to uppercase before it is handed to the ItemWriter.
MyBatchProcessor.java
    package com.test;

    import org.springframework.batch.item.ItemProcessor;

    public class MyBatchProcessor implements ItemProcessor<String, String> {

        // Receives one item from the reader and returns the transformed item.
        @Override
        public String process(String data) throws Exception {
            System.out.println("MyBatchProcessor : Processing data : " + data);
            data = data.toUpperCase();
            return data;
        }
    }
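A processor can also drop items: in Spring Batch, returning null from process means the item is not passed on to the writer. The following plain-Java sketch illustrates that behaviour under simplifying assumptions. The blank-input rule and the processAll driver are hypothetical additions for this illustration and are not part of the example application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of a processor that transforms items and filters some of them out.
class ProcessorFilterSketch {

    // Mirrors MyBatchProcessor, plus a hypothetical filter: blank input is dropped by returning null.
    static String process(String data) {
        if (data.trim().isEmpty()) {
            return null;
        }
        return data.toUpperCase();
    }

    // Hypothetical driver standing in for the step: null results never reach the output.
    static List<String> processAll(List<String> items) {
        List<String> output = new ArrayList<>();
        for (String item : items) {
            String result = process(item);
            if (result != null) {
                output.add(result);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(processAll(Arrays.asList("0 a", " ", "1 b"))); // [0 A, 1 B]
    }
}
```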
ItemWriter Interface
The ItemWriter interface stores the processed data, for example in a database or a file. It receives the items returned by the ItemProcessor. The interface defines a write method that takes a list of items as its argument, and writing that list to the target storage is the final phase of the step.
In the example below, the write method receives the list of processed strings and prints each one to the console.
MyBatchWriter.java
    package com.test;

    import java.util.List;

    import org.springframework.batch.item.ItemWriter;

    public class MyBatchWriter implements ItemWriter<String> {

        // Receives the processed items of one chunk as a list.
        @Override
        public void write(List<? extends String> list) throws Exception {
            for (String data : list) {
                System.out.println("MyBatchWriter : Writing data : " + data);
            }
        }
    }
JobExecutionListenerSupport class
The JobExecutionListenerSupport class is a convenience base class for job execution listeners in Spring Batch. It provides two methods: beforeJob, which is called before the job starts, and afterJob, which is called when the job execution has finished. Override these two methods to react to the job status, for example to log it or to run custom logic when the job completes or fails.
JobCompletionListener.java
    package com.test;

    import org.springframework.batch.core.JobExecution;
    import org.springframework.batch.core.listener.JobExecutionListenerSupport;

    public class JobCompletionListener extends JobExecutionListenerSupport {

        // Called before the job starts.
        @Override
        public void beforeJob(JobExecution jobExecution) {
            System.out.println("Batch is starting now. Status=" + jobExecution.getStatus());
        }

        // Called after the job has finished.
        @Override
        public void afterJob(JobExecution jobExecution) {
            System.out.println("Batch is completed Successfully. Status=" + jobExecution.getStatus());
        }
    }
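The listener above prints the same message whatever the final status is. As a rough sketch of how the two callbacks wrap a job and how afterJob could choose its message from the status, consider the plain-Java model below. The Status enum, runJob, and afterJobMessage are hypothetical names for this illustration. Spring Batch's own status type has more values, and its job runner works differently in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a job wrapped by beforeJob/afterJob style callbacks.
class ListenerLifecycleSketch {

    // Hypothetical, reduced set of status values.
    enum Status { STARTED, COMPLETED, FAILED }

    // Runs the job body between the two callbacks and returns the messages they produce.
    static List<String> runJob(Runnable jobBody) {
        List<String> messages = new ArrayList<>();
        Status status = Status.STARTED;
        messages.add("beforeJob: Status=" + status);
        try {
            jobBody.run();
            status = Status.COMPLETED;
        } catch (RuntimeException e) {
            status = Status.FAILED;
        }
        messages.add(afterJobMessage(status));
        return messages;
    }

    // Picks the message from the final status instead of always reporting success.
    static String afterJobMessage(Status status) {
        return status == Status.COMPLETED
                ? "afterJob: completed successfully. Status=" + status
                : "afterJob: did not complete successfully. Status=" + status;
    }

    public static void main(String[] args) {
        System.out.println(runJob(() -> { }));
        System.out.println(runJob(() -> { throw new IllegalStateException("boom"); }));
    }
}
```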
Spring Boot Batch Configuration
The batch configuration class declares the Spring Boot Batch beans of the application. Annotate it with @EnableBatchProcessing, which sets up the batch infrastructure, including the JobBuilderFactory and StepBuilderFactory beans. The annotation can be placed on a dedicated configuration class, as in this example, or on the main application class.
To run the steps of a job, create a Job bean and a Step bean. To monitor the job execution, also create a JobExecutionListener bean. The Job bean is built with the JobBuilderFactory, which registers the JobExecutionListener on the job and attaches the step to it. The Step bean is built with the StepBuilderFactory, which wires in the implementations of the ItemReader, ItemProcessor, and ItemWriter interfaces. Both factories belong to the Spring Batch 4.x API used in this example and are deprecated as of Spring Batch 5.0.
SpringBootBatchConfiguration.java
    package com.test;

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobExecutionListener;
    import org.springframework.batch.core.Step;
    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
    import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
    import org.springframework.batch.core.launch.support.RunIdIncrementer;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    @EnableBatchProcessing
    public class SpringBootBatchConfiguration {

        @Autowired
        public JobBuilderFactory jobBuilderFactory;

        @Autowired
        public StepBuilderFactory stepBuilderFactory;

        // The job: one step, a run id incrementer, and the execution listener.
        @Bean
        public Job createJob() {
            return jobBuilderFactory.get("MyJob")
                    .incrementer(new RunIdIncrementer())
                    .listener(listener())
                    .flow(buildBatchSteps())
                    .end()
                    .build();
        }

        // The step: reader, processor, and writer with a chunk size of 1.
        @Bean
        public Step buildBatchSteps() {
            return stepBuilderFactory.get("MySteps")
                    .<String, String> chunk(1)
                    .reader(new MyBatchReader())
                    .processor(new MyBatchProcessor())
                    .writer(new MyBatchWriter())
                    .build();
        }

        @Bean
        public JobExecutionListener listener() {
            return new JobCompletionListener();
        }
    }
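The step in this configuration uses chunk(1), so every item is read, processed, and written before the next one is read. The sketch below models how the chunk size changes the order of those calls. It assumes a simplified chunk-oriented loop that reads a whole chunk, processes it, and then writes it as one list. ChunkOrderSketch and trace are hypothetical names, and the model leaves out transactions and error handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Models the order of read/process/write calls for a given chunk size.
class ChunkOrderSketch {

    // Returns a trace of the calls a simplified chunk-oriented step would make.
    static List<String> trace(List<String> items, int chunkSize) {
        List<String> log = new ArrayList<>();
        int next = 0;
        while (next < items.size()) {
            int end = Math.min(next + chunkSize, items.size());
            List<String> chunk = items.subList(next, end);
            for (String item : chunk) {
                log.add("read " + item);      // the whole chunk is read first
            }
            for (String item : chunk) {
                log.add("process " + item);   // then each item is processed
            }
            log.add("write " + chunk);        // finally the chunk is written as one list
            next = end;
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        // [read a, process a, write [a], read b, process b, write [b], read c, process c, write [c]]
        System.out.println(trace(items, 1));
        // [read a, read b, process a, process b, write [a, b], read c, process c, write [c]]
        System.out.println(trace(items, 2));
    }
}
```

With a chunk size of 1 the writer always receives a single-element list, which is why the reader, processor, and writer messages alternate item by item when the example application runs.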
pom.xml file configuration
The pom.xml file must contain all dependencies that Spring Boot Batch needs. The spring-boot-starter-batch dependency is required. The example also includes spring-boot-starter-web, which is only needed if the batch runs inside a web application, and an H2 database, which Spring Batch can use to store its job metadata. The listing below shows the full set of dependencies used to test the example.
pom.xml
    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-batch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>com.h2database</groupId>
            <artifactId>h2</artifactId>
            <scope>runtime</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.batch</groupId>
            <artifactId>spring-batch-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>
How to run the Spring Boot Batch
Start the Spring Boot Batch application. The batch job is launched automatically at startup, and the console shows the Spring Boot Batch logs for each processing step. In this example the input comes from a fixed string array, so the job runs once, processes the four items, and ends. A job that reads from an external source works the same way: it reads whatever data is available each time the job is launched. The console log of the example run is shown below.
    2023-04-29 11:29:52.019 INFO 11468 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [FlowJob: [name=MyJob]] launched with the following parameters: [{run.id=1}]
    Batch is starting now. Status=STARTED
    2023-04-29 11:29:52.081 INFO 11468 --- [ main] o.s.batch.core.job.SimpleStepHandler : Executing step: [MySteps]
    MyBatchReader : Reading data : 0 0
    MyBatchProcessor : Processing data : 0 0
    MyBatchWriter : Writing data : 0 0
    MyBatchReader : Reading data : 1 1
    MyBatchProcessor : Processing data : 1 1
    MyBatchWriter : Writing data : 1 1
    MyBatchReader : Reading data : 2 2
    MyBatchProcessor : Processing data : 2 2
    MyBatchWriter : Writing data : 2 2
    MyBatchReader : Reading data : 3 3
    MyBatchProcessor : Processing data : 3 3
    MyBatchWriter : Writing data : 3 3
    MyBatchReader : No More data to process
    2023-04-29 11:29:52.131 INFO 11468 --- [ main] o.s.batch.core.step.AbstractStep : Step: [MySteps] executed in 49ms
    Batch is completed Successfully. Status=COMPLETED
    2023-04-29 11:29:52.153 INFO 11468 --- [ main] o.s.b.c.l.support.SimpleJobLauncher : Job: [FlowJob: [name=MyJob]] completed with the following parameters: [{run.id=1}] and the following status: [COMPLETED] in 87ms