Persistence

Ratchet persists all job state in the selected store backend. SQL stores use JPA entities and DDL-backed tables; the MongoDB store maps the same model to documents and collections. The shared persistence layer is built around a composable SPI interface, UUIDv7 identifiers, and dialect-specific constraint detection where the backend needs it.

Entity Model

The core logical model is JobEntity. In SQL stores it is split across a cold metadata table (scheduler_job) and a hot executable queue table (scheduler_job_queue). The cold table owns immutable job shape and terminal history; the hot table exists only while a job is live and owns claim/poll state. MongoDB maps the same logical model to collections. Supporting entities handle batches, executions, workflow conditions, locks, nodes, and archived jobs.

┌─────────────────────────────────────────┐
│ scheduler_job │
│ (cold metadata + terminal state) │
├─────────────────────────────────────────┤
│ job_id (UUIDv7 PK) │
│ priority, job_type │
│ payload, params, tags │
│ max_retries, backoff_policy │
│ cron_expr, zone_id, next_fire │
│ idempotency_key, business_key │
│ depends_on, superseded_by │
│ caller_principal │
│ resource_name │
│ terminal status/error/timing/result │
└─────────────────────────────────────────┘
│ 1:0/1 while live

┌─────────────────────────────────────────┐
│ scheduler_job_queue │
│ (hot claim/poll state) │
├─────────────────────────────────────────┤
│ job_id (UUIDv7 PK/FK) │
│ status, scheduled_time │
│ attempts, picked_by, picked_at │
│ paused_from_status, last_error │
│ version, updated_at │
└─────────────────────────────────────────┘

│ 1:N

┌─────────────────────┐ ┌─────────────────────────┐
│ scheduler_job_tag │ │ scheduler_job_execution │
│ (tags per job) │ │ (JobExecutionEntity) │
└─────────────────────┘ └─────────────────────────┘

┌─────────────────────┐ ┌─────────────────────────┐
│ scheduler_batch │ │ scheduler_batch_metrics │
│ (BatchEntity) │ │ (BatchMetricsEntity) │
└─────────────────────┘ └─────────────────────────┘

┌─────────────────────────────┐ ┌─────────────────────┐
│ scheduler_workflow_condition│ │ scheduler_lock │
│ (WorkflowConditionEntity) │ │ (LockEntity) │
└─────────────────────────────┘ └─────────────────────┘

┌─────────────────────┐ ┌─────────────────────────┐
│ scheduler_node │ │ scheduler_job_archive │
│ (NodeEntity) │ │ (ArchivedJobEntity) │
└─────────────────────┘ └─────────────────────────┘

┌──────────────────────┐ ┌──────────────────────────┐
│ scheduler_resource_ │ │ scheduler_dlq_alert │
│ limit / permit │ │ (DlqAlertEntity) │
└──────────────────────┘ └──────────────────────────┘

SQL Job Tables

The SQL stores denormalize a few immutable fields into scheduler_job_queue (job_type, priority, business_key, timeout_sec, max_retries) so the claim path can populate lightweight claim DTOs from one hot table.

| Column | Type | Purpose |
|---|---|---|
| job_id | BINARY(16)/uuid (UUIDv7) | Primary key, time-ordered |
| scheduler_job.job_type | VARCHAR(16) | Internal execution type (SINGLE, BATCH_CHILD, etc.) |
| scheduler_job.priority | INT | Priority ordinal (0=LOWEST to 4=CRITICAL) |
| scheduler_job.payload | JSON | Serialized job definition (target, method, args) |
| scheduler_job.params | JSON | Key-value parameters accessible via JobContext |
| idempotency_key | VARCHAR(36) UNIQUE | Globally unique deduplication key |
| business_key | VARCHAR | Active-unique key for concurrent execution prevention |
| depends_on | BINARY(16)/uuid | FK to parent job for chains |
| superseded_by | BINARY(16)/uuid | FK to replacement job |
| caller_principal | VARCHAR(255) | Captured Jakarta Security caller principal, if available |
| resource_name | VARCHAR(100) | Resource pool for permit acquisition |
| terminal_status / terminal_error | VARCHAR / TEXT | Cold survivor fields set at terminal transition |
| scheduler_job_queue.status | VARCHAR(16) | Live lifecycle state (PENDING, RUNNING, PAUSED) |
| scheduler_job_queue.scheduled_time | TIMESTAMP | When the job becomes eligible for polling |
| scheduler_job_queue.attempts | INT | Current attempt count while live |
| scheduler_job_queue.picked_by / picked_at | VARCHAR(64) / TIMESTAMP | Node claim ownership and claim time |
| scheduler_job_queue.version | INT | Optimistic locking version for live queue mutations |

The scheduler_business_key_reservation table owns active business-key uniqueness. Terminal rows keep their business_key for audit/search, but they do not block a future active job from using the same key.

Indexes

SQL stores define hot queue indexes for the Poller and supporting cold-table indexes for traversal/search. MongoDB defines analogous collection indexes in ratchet-store-mongodb:

| Index | Columns | Purpose |
|---|---|---|
| idx_claim_executable | scheduler_job_queue(job_type, scheduled_time, priority, job_id) for pending rows | Executable claim filter |
| idx_queue_orphan | scheduler_job_queue(status, picked_at, picked_by) | Orphan recovery by node |
| pk_scheduler_business_key_reservation | scheduler_business_key_reservation(business_key) | Active business-key uniqueness and lookup |
| idx_job_depends_on | scheduler_job(depends_on) | Chain/workflow traversal |
| idx_job_superseded_by | scheduler_job(superseded_by) | Replacement lookup |
| idx_job_created_at | scheduler_job(created_at) | Operational search and retention |
| idx_job_recurring_pending | recurring state (job_type, rec_status, next_fire) | Transitional recurring-master scheduling |

UUIDv7 Identifiers

Ratchet uses RFC 9562 (§5.7) UUIDv7 for primary keys: 128-bit identifiers that are time-ordered, coordination-free, and globally unique.

Layout

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
┌──────────────────────────────┬─────┬────────┬─────┬────────────┐
│     48 bits: unix_ts_ms      │ ver │ rand_a │ var │   rand_b   │
│   Unix epoch milliseconds    │  7  │ 12 bit │ 10  │  62 bits   │
└──────────────────────────────┴─────┴────────┴─────┴────────────┘

| Field | Bits | Purpose |
|---|---|---|
| unix_ts_ms | 48 | Wall-clock millisecond timestamp |
| ver | 4 | Version constant 7 |
| rand_a | 12 | Per-millisecond monotonic counter |
| var | 2 | RFC 9562 variant constant 10 |
| rand_b | 62 | Cryptographic random (SecureRandom) |

Properties

  • Time-ordered: The 48-bit timestamp prefix preserves B-tree locality — inserts cluster at the right edge, range scans by time work directly.
  • Monotonic within a millisecond: rand_a is used as a per-ms counter; on overflow inside a single ms, generation busy-spins via Thread.onSpinWait until the wall clock advances (RFC 9562 §6.2 wait-for-tick). The timestamp is never advanced past wall-clock time.
  • Coordination-free: 62 bits of randomness in rand_b make collisions vanishingly unlikely without inter-node coordination.
  • 128-bit java.util.UUID: Standard Java type, no special storage adapter on PostgreSQL (native uuid). MySQL stores as BINARY(16) and uses the MySQL store's META-INF/orm-mysql.xml mapping plus UuidByteArrayConverter so non-Hibernate JPA providers bind UUID fields as 16 bytes. MongoDB stores BSON UUID subtype 4 (UuidRepresentation.STANDARD).
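The bit layout above can be sketched in plain Java. This is an illustrative generator, not Ratchet's UuidV7Factory: rand_a is filled with random bits here rather than the monotonic per-millisecond counter the real factory uses, and the helper names are hypothetical.

```java
import java.security.SecureRandom;
import java.util.UUID;

// Illustrative UUIDv7 construction (not Ratchet's UuidV7Factory):
// 48-bit ms timestamp | 4-bit version 7 | 12-bit rand_a | 2-bit variant | 62-bit rand_b.
// The real factory uses rand_a as a monotonic per-millisecond counter.
public final class UuidV7Sketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    public static UUID create() {
        long ts = System.currentTimeMillis();
        long randA = RANDOM.nextLong() & 0x0FFFL;              // 12 random bits
        long msb = (ts << 16) | 0x7000L | randA;               // ts(48) | ver=7 | rand_a
        long lsb = (RANDOM.nextLong() & 0x3FFFFFFFFFFFFFFFL)   // 62 random bits
                 | 0x8000000000000000L;                        // variant bits 10
        return new UUID(msb, lsb);
    }

    // Recover the millisecond timestamp from the high 48 bits.
    public static long timestampMillisOf(UUID id) {
        return id.getMostSignificantBits() >>> 16;
    }
}
```

Because the timestamp occupies the most significant bits, sorting generated UUIDs lexicographically also sorts them by creation time.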

Utility Methods

// Generate a new UUIDv7
UUID id = UuidV7Factory.create();

// Extract creation timestamp (high 48 bits)
Instant created = UuidV7Factory.timestampOf(id);

Why UUIDv7 Instead of TSID or Auto-Increment

| Concern | Auto-Increment | TSID | UUIDv7 |
|---|---|---|---|
| Multi-node generation | Requires coordination | Manual node-id slot (10 bits = 1024 nodes) | Coordination-free |
| Concurrent generators before collisions | n/a | ~38 (birthday paradox on 10-bit node + 12-bit seq) | Effectively unbounded (62 random bits) |
| Insert contention | B-tree hotspot | Distributed | Distributed (timestamp prefix only) |
| Temporal ordering | Needs created_at | Embedded | Embedded |
| Range scan by time | Needs index | Use ID | Use ID |
| Migration / merge | Conflicts | Risk if node ids reused | Globally unique |

JobStore SPI

The JobStore interface is a marker that composes the store SPIs used by the RI. Store implementations (MySQL, PostgreSQL, MongoDB, or your own backend) implement that full surface through one CDI bean.

public interface JobStore
        extends JobCrudStore,
                JobClaimStore,
                JobTerminalStore,
                JobRetryStore,
                JobPauseStore,
                JobBatchStatusStore,
                JobStatusStore,        // Deprecated compatibility marker for one alpha release
                JobBulkStore,
                BatchStore,
                LockStore,
                NodeStore,
                ArchiveStore,
                ExecutionStore,
                JobLogStore,
                TagStore,
                WorkflowConditionStore,
                BatchMetricsStore,
                DlqAlertStore,
                ResourcePermitStore {
    // Marker interface — all methods inherited from sub-interfaces
}

Sub-Interface Responsibilities

| Interface | Responsibility |
|---|---|
| JobCrudStore | Create, read, update, delete individual jobs |
| JobClaimStore | Atomic batch claiming (SKIP LOCKED for SQL stores, atomic updates for MongoDB) |
| JobTerminalStore | Terminal success, failure, and cancellation transitions |
| JobRetryStore | Retry scheduling and attempt-state updates |
| JobPauseStore | Pause and resume transitions |
| JobBatchStatusStore | Non-terminal status, pickup, orphan, and recurring-cancel operations |
| JobStatusStore | Deprecated compatibility marker that composes the four status-focused SPIs above |
| JobBulkStore | Bulk operations (DLQ purge, batch insert) |
| BatchStore | Batch parent/child management, progress tracking |
| LockStore | Distributed lock acquisition and release |
| NodeStore | Node registration and heartbeat |
| ArchiveStore | Job archival to the archive table/collection |
| ExecutionStore | Execution history recording |
| JobLogStore | Structured job log storage |
| TagStore | Tag-based job queries |
| WorkflowConditionStore | Workflow condition persistence and retrieval |
| BatchMetricsStore | Batch-level metrics and progress |
| DlqAlertStore | DLQ alert audit trail and deduplication |
| ResourcePermitStore | Resource permit acquisition and release |

Why Sub-Interfaces?

The decomposition serves multiple purposes:

  1. TCK modularity: Future TCK versions can test sub-interfaces independently
  2. Cognitive load: Each interface has a focused, understandable contract
  3. Dependency injection: RI services depend only on the sub-interfaces they need (e.g., Poller depends on JobClaimStore, not the full JobStore)
  4. Alternative implementations: A NoSQL store might implement JobCrudStore and JobClaimStore differently while reusing other sub-interface implementations
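Point 3 can be illustrated with a narrow-dependency sketch. ClaimSpi and PollerSketch below are stand-ins, not Ratchet types: the point is that a consuming service names only the slice of the SPI it actually uses, so any bean implementing the full composed store (or just that slice) satisfies it.

```java
import java.util.List;
import java.util.UUID;

// Stand-in for the claim slice of the SPI (not Ratchet's JobClaimStore).
interface ClaimSpi {
    List<UUID> claimBatch(String nodeId, int limit);
}

// A poller-like service depends on the narrow interface rather than the
// full composed store, which keeps its contract small and easy to stub.
class PollerSketch {
    private final ClaimSpi claims;

    PollerSketch(ClaimSpi claims) {
        this.claims = claims;
    }

    List<UUID> pollOnce(String nodeId) {
        return claims.claimBatch(nodeId, 10);
    }
}
```

In a CDI container the same shape applies: injecting the sub-interface type resolves to the single store bean, without coupling the service to the rest of the surface.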

Constraint Detection

Different databases report constraint violations differently. The ConstraintDetector interface abstracts this:

public interface ConstraintDetector {
    boolean isUniqueConstraintViolation(Exception e);
    boolean isForeignKeyViolation(Exception e);
}

Each SQL store module provides a dialect-specific implementation:

  • MySQL: Parses for "Duplicate entry" in the error message
  • PostgreSQL: Checks SQL state codes (23505 for unique violation, 23503 for FK violation)
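A minimal PostgreSQL-flavored detector might walk the exception cause chain looking for integrity-constraint SQLSTATE codes. This is an illustrative sketch, not the shipped implementation:

```java
import java.sql.SQLException;

// Illustrative PostgreSQL-style detector (not the shipped class): walk the
// cause chain for an SQLException and match integrity-constraint SQLSTATEs.
public class PgConstraintDetectorSketch {
    public boolean isUniqueConstraintViolation(Exception e) {
        return hasSqlState(e, "23505");   // unique_violation
    }

    public boolean isForeignKeyViolation(Exception e) {
        return hasSqlState(e, "23503");   // foreign_key_violation
    }

    private static boolean hasSqlState(Throwable t, String state) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SQLException sql && state.equals(sql.getSQLState())) {
                return true;
            }
        }
        return false;
    }
}
```

Walking the cause chain matters in practice because JPA providers typically wrap the driver's SQLException in their own persistence exceptions.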

This is used primarily for idempotency-key enforcement: when a duplicate key is detected, the submission is silently rejected rather than surfacing an error to the caller.

DDL Schema

SQL store modules ship DDL as plain SQL files in src/main/resources/ddl/. The *-schema.sql file is the authoritative clean-install schema for that dialect, and it reserves a ratchet_schema_version table for ordered upgrades.

Ratchet still does not run migrations automatically by default. Your application remains responsible for applying schema changes, whether through Flyway, Liquibase, or another deployment-time mechanism. When incremental Ratchet migration scripts are added, they live under ddl/migrations/ and follow the V###__description.sql convention. Those ordered V* files must compose to the same schema shipped in the clean-install DDL.

For SQL stores, if you do not already use a migration framework, ratchet-store-core also exposes SchemaMigrator, a small optional utility that discovers ordered V* scripts, serializes startup with a database advisory lock, validates checksums in ratchet_schema_version, and applies only pending scripts. Call it from a SchedulerLifecycleHook.beforeStart hook so migrations finish before the poller starts claiming jobs.

ratchet-store-mysql/src/main/resources/ddl/mysql-schema.sql
ratchet-store-postgresql/src/main/resources/ddl/postgresql-schema.sql

MongoDB does not ship SQL DDL. The ratchet-store-mongodb module creates the required collections and indexes at startup.

UUID Inspection by Store

  • PostgreSQL: query UUID columns directly; psql renders native uuid values as hyphenated strings.
  • MySQL: raw BINARY(16) values are not readable in CLI output. Apply the optional ddl/views/vw_jobs.sql operator views and query those views for hyphenated UUID strings. The views use BIN_TO_UUID(col) with no swap flag; BIN_TO_UUID(col, 1) is for MySQL's UUIDv1 time-reorder format and does not match Ratchet's Java-standard byte order.
  • MongoDB: use a MongoClient configured with UuidRepresentation.STANDARD so BSON subtype 4 UUID values round-trip correctly; mongosh renders them as UUID("...").

Optimistic Locking

SQL stores use JPA @Version on the version column, while MongoDB uses atomic filter-and-update operations. Both paths prevent lost updates when two nodes attempt to modify the same job concurrently. The engine uses compare-and-swap patterns for critical transitions:

// Atomic status transition — fails if another thread changed the status
boolean success = jobStore.compareAndSwapStatus(
        jobId, JobStatus.RUNNING, JobStatus.SUCCEEDED, null);

Combined with FOR UPDATE SKIP LOCKED in SQL stores or atomic document claiming in MongoDB, this ensures a ready job is claimed by only one node at a time.
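The compare-and-swap contract can be modeled in a few lines. Status and CasJobSketch below are stand-ins that mirror the semantics, not the store SPI: a transition applies only if the stored status still equals the expected status.

```java
import java.util.concurrent.atomic.AtomicReference;

// Models the compare-and-swap guarantee with stand-in types: the transition
// succeeds only when the observed status still matches the expected one.
class CasJobSketch {
    enum Status { PENDING, RUNNING, SUCCEEDED }

    private final AtomicReference<Status> status = new AtomicReference<>(Status.RUNNING);

    boolean compareAndSwapStatus(Status expected, Status next) {
        return status.compareAndSet(expected, next);
    }

    Status status() {
        return status.get();
    }
}
```

In the SQL stores the same effect comes from an UPDATE whose WHERE clause includes the expected status and version; in MongoDB, from a filtered findAndModify-style update.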