Spring Batch Concepts Chapter

时间:2024-11-15 14:34:31

Spring Batch Concepts Chapter

The below figure shows two kinds of Spring Batch components:infrastructure components and application components. The infrastructure componentsare the job repository and the job launcher. Spring Batch provides implementations for both – and you do need to configure these components – but there’slittle chance you’ll have to create your own implementations.

Spring Batch Concepts Chapter

The main Spring Batch components. The framework provides ajob repository to store job metadata and a job launcher to launch jobs, and theapplication developer configures and implements jobs. The infrastructure components– provide by Spring Batch – are in gray, and application components –implemented by the developer – are in white.

Table ‘The main components of a Spring Batch application’

Component

Description

Job repository

An infrastructure component that persists job execution metadata

Job launcher

An infrastructure component that starts job executions

Job

An application component that represents a batch process

Step

A phase in a job; a job is a sequence of steps

Tasklet

A transactional, potentially repeatable process occurring in a step

Item

A record read from or written to a data source

Chunk

A list of items of a given size

Item reader

A component responsible for reading items from a data source

Item processor

A component responsible for processing(transforming, validating, or filtering) a read item before it’s written

Item writer

A component responsible for writing a chunk to a data source

How Spring Batch interacts with the outside world

A job starts in response to an event. You’ll always use theJobLauncher interface and JobParameters class, but the event can come fromanywhere; a system scheduler like cron that runs periodically, a script thatlauncher a Spring Batch process, an HTTP request to a web container thatlaunches the job, and so on.

Batch jobs are about processing data, and that’s why belowfigure shows Spring Batch communicating with data sources. These data sourcescan be of any kind, the file system and a database being the most common, but a job can read and write messages to Java Message Service(JMS) queues as well.

Jobs can communicate with data sources, but so does the job repository. In fact, the job repository stores job execution metadata in a databaseto provide Spring Batch reliable monitoring and restart features.

Spring Batch can run anywhere the Spring Framework can run:in its own Java process, in a web container, in an application, or even in anOpen Services Gateway initiative(OSGi) container. The container depends on your requirements, and Spring batch is flexible in this regard.

Spring Batch Concepts Chapter

Spring Batch infrastructure need to deal mainly with two components: the job launcher and the job repository. These concepts match two straightforwardjava interfaces: JobLauncher and JobRepository.

The JobLauncher interface have run function, which acceptstwo parameters: Job, which is typically a Spring bean configured in Spring BatchXML, and Jobparameters, which is usually created on the fly by the launchingmechanism.

Who calls the job launcher? Your own java program can usethe job launcher to launch a job, but so can command-line programs orschedulers(like cron or the Java-based Quartz scheduler).

The job launcher encapsulates launching strategies such asexecuting a job synchronously or asynchronously. Spring Batch provides oneimplementation of the JobLauncher interface: SimpleJobLauncher.

The job repository maintains all metadata related to jobexecutions.

The JobRepository interface provides all the services tomanage the batch job lifecycle: creation,update, and so on.

What constitutes runtime metadata? It includes the list ofexecuted steps: how many items Spring Batch read, wrote, or skipped; theduration of each step: and so forth.

Spring Batch provides two implementations of theJobRepositorty interface: one stores metadata in memory, which is useful fortesting or when you don’t want monitoring or restart capabilities; the otherstores metadata in a relational database.