Overview
A high-level overview of how the ClickSet Worker Server is designed and operates.
What is the Worker Server?
The ClickSet Worker Server is a background task processor built in TypeScript on Node.js. It connects to the ClickSet API server's PostgreSQL database, picks up queued tasks, processes them, and writes results back. It is designed to run continuously as a standalone service, separate from the main API server.
Key principle: This worker does not own or manage the database. All tables and schema are controlled by the ClickSet API server. The worker reads from tasks and writes to tasks, task_events, datasets (field updates), and dataset_records (inserts and backfill updates).
Architecture
The system follows a simple producer-consumer pattern. The API server acts as the producer (creating tasks), and this worker acts as the consumer (processing them).
Task Lifecycle
Every task moves through a defined set of statuses:
- queued — Created by the API server and waiting to be picked up
- processing — Claimed by a worker and currently being executed
- completed — Successfully finished with results written back
- failed — Encountered an error during processing
- cancelled — Cancelled by the API server while queued or during processing
The worker checks for cancellation before starting a task and can detect cancellation mid-processing to stop work early.
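The mid-processing check can be sketched as a small helper. This is a hedged sketch, not the worker's actual code: the status lookup is passed in as a function so the snippet stays self-contained, whereas the real `isTaskCancelled` in task-helpers.ts would query the tasks table directly.

```typescript
// Status values match the lifecycle described above.
type TaskStatus = 'queued' | 'processing' | 'completed' | 'failed' | 'cancelled';

// Returns true once the API server has cancelled the task. Handlers can
// call this between units of work (e.g. between AI generation batches)
// to stop early instead of wasting effort on a cancelled task.
async function isTaskCancelled(
  fetchStatus: (taskId: string) => Promise<TaskStatus>,
  taskId: string,
): Promise<boolean> {
  return (await fetchStatus(taskId)) === 'cancelled';
}
```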
Atomic Task Claiming
To support multiple workers running in parallel without conflicts, task claiming uses PostgreSQL's FOR UPDATE SKIP LOCKED. This ensures:
- Only one worker can claim a given task
- Workers don't block each other — if a task's row is already locked by another worker, it is simply skipped
- Tasks are claimed in FIFO order (oldest first)
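The claim can be sketched as a single UPDATE wrapping a locked subselect. The SKIP LOCKED pattern itself is standard PostgreSQL; the column names used here (created_at, started_at) are assumptions about the API server's schema, and the pool is narrowed to a minimal interface so the snippet stays self-contained — in the worker this would be the pg Pool from db.ts.

```typescript
interface QueryablePool {
  query(sql: string): Promise<{ rows: Record<string, unknown>[] }>;
}

const CLAIM_SQL = `
  UPDATE tasks
  SET status = 'processing', started_at = now()
  WHERE id = (
    SELECT id FROM tasks
    WHERE status = 'queued'
    ORDER BY created_at        -- FIFO: oldest queued task first
    FOR UPDATE SKIP LOCKED     -- rows locked by other workers are skipped, not waited on
    LIMIT 1
  )
  RETURNING *`;

// Returns the claimed task row, or null when the queue is empty.
async function claimNextTask(pool: QueryablePool) {
  const { rows } = await pool.query(CLAIM_SQL);
  return rows[0] ?? null;
}
```

Because the SELECT locks the row before the UPDATE commits, two workers can never claim the same task, and neither ever waits on the other's lock.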
Supported Task Types
The worker handles ten task types, each routed to a dedicated handler:
- ai_response — Generates AI-powered text responses (e.g., chat completions)
- ai_generate_records — Generates structured data records using AI — see Records
- ai_generate_fields — Generates and optionally backfills field definitions — see Fields
- ai_generate_datasets — Generates new datasets or enriches existing ones using AI — see Datasets
- import_dataset — Creates a dataset from an uploaded CSV, JSON, or NDJSON file — see Datasets
- batch_update_datasets — Updates name, description, tags, or options on multiple datasets — see Datasets
- batch_delete_datasets — Soft-deletes or permanently deletes multiple datasets — see Datasets
- duplicate_dataset — Copies a dataset's metadata and optionally all its records — see Datasets
- cleanup_stale_attachments — Deletes pending attachment files and DB rows older than a TTL — see Datasets
- cleanup_dataset_attachments — Deletes all attachment files for a deleted dataset from storage — see Datasets
Each handler receives the full task object, processes it, and writes results back to the database via task events and status updates.
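The routing step can be sketched as a lookup table from task type to handler, mirroring the handlers/ directory. This is an illustrative shape only — the handler bodies are stubs, and the real router.ts may be organized differently.

```typescript
interface Task { id: string; type: string; payload?: unknown }
type Handler = (task: Task) => Promise<void>;

// One entry per supported task type; two shown here for brevity.
const handlers: Record<string, Handler> = {
  ai_response: async () => { /* generate text, stream task_events */ },
  import_dataset: async () => { /* parse upload, insert dataset_records */ },
};

// Dispatches a claimed task; an unknown type is surfaced as an error
// so the task can be marked failed rather than silently dropped.
async function route(task: Task): Promise<void> {
  const handler = handlers[task.type];
  if (!handler) throw new Error(`Unknown task type: ${task.type}`);
  await handler(task);
}
```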
Streaming via Task Events
For tasks that produce incremental output (like AI text generation), the worker writes task_events rows to the database as it works. The API server can then stream these events to the end user in real time. Event types include:
- text_delta — A chunk of generated text
- usage — Token usage statistics
- done — Signals the task is complete
- error — Reports an error during processing
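Appending one event is a single insert. A minimal sketch, assuming task_events has task_id, event_type, and payload columns — the actual column names belong to the API server's schema and may differ:

```typescript
interface EventPool {
  query(sql: string, params: unknown[]): Promise<unknown>;
}

type TaskEventType = 'text_delta' | 'usage' | 'done' | 'error';

// Writes one streaming event row; the API server tails these rows to
// relay incremental output to the end user in real time.
async function writeTaskEvent(
  pool: EventPool,
  taskId: string,
  eventType: TaskEventType,
  payload: unknown,
): Promise<void> {
  await pool.query(
    'INSERT INTO task_events (task_id, event_type, payload) VALUES ($1, $2, $3)',
    [taskId, eventType, JSON.stringify(payload)],
  );
}
```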
Project Structure
```
src/
├── index.ts           # Entry point — starts worker and dashboard
├── db.ts              # PostgreSQL connection pool
├── types.ts           # TypeScript interfaces
├── poller.ts          # Polling loop — claims and dispatches tasks
├── router.ts          # Routes tasks to handlers by type
├── task-helpers.ts    # completeTask, failTask, writeTaskEvent, isTaskCancelled
├── stats.ts           # In-memory stats tracking for this worker instance
├── storage.ts         # StorageClient abstraction (backed by Replit Object Storage)
├── server.ts          # Express dashboard server (port 5000)
├── providers/
│   └── index.ts       # Multi-provider AI abstraction (OpenAI, Anthropic, Gemini, xAI) + web search
└── handlers/
    ├── ai-response.ts                  # Handler for ai_response tasks
    ├── ai-generate-records.ts          # Handler for ai_generate_records tasks
    ├── ai-generate-fields.ts           # Handler for ai_generate_fields tasks
    ├── ai-generate-datasets.ts         # Handler for ai_generate_datasets tasks
    ├── import-dataset.ts               # Handler for import_dataset tasks
    ├── batch-update-datasets.ts        # Handler for batch_update_datasets tasks
    ├── batch-delete-datasets.ts        # Handler for batch_delete_datasets tasks
    ├── duplicate-dataset.ts            # Handler for duplicate_dataset tasks
    ├── cleanup-stale-attachments.ts    # Handler for cleanup_stale_attachments tasks
    └── cleanup-dataset-attachments.ts  # Handler for cleanup_dataset_attachments tasks
```
Database Connection
The worker connects to the API server's database using either:
- A single CLICKSET_DATABASE_URL connection string (preferred), or
- Individual variables: PGHOST, PGPORT, PGUSER, PGDATABASE, CLICKSET_PGPASSWORD
The environment variable is named CLICKSET_DATABASE_URL (not DATABASE_URL) to avoid conflicts with hosting platform auto-provisioned databases. SSL is enforced on all connections.
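The two configuration paths can be sketched as a small config builder. The env var names match the document; the exact ssl option value is an assumption — the document only states that SSL is enforced, and db.ts may configure it differently.

```typescript
// Builds pg Pool options from the environment, preferring the single
// connection string and falling back to individual variables.
function buildPoolConfig(env: Record<string, string | undefined>) {
  const ssl = true;  // assumption: the real ssl options in db.ts may differ
  if (env.CLICKSET_DATABASE_URL) {
    return { connectionString: env.CLICKSET_DATABASE_URL, ssl };
  }
  return {
    host: env.PGHOST,
    port: Number(env.PGPORT ?? '5432'),
    user: env.PGUSER,
    database: env.PGDATABASE,
    password: env.CLICKSET_PGPASSWORD,  // note: CLICKSET_PGPASSWORD, not PGPASSWORD
    ssl,
  };
}
```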
Monitoring Dashboard
The worker includes a built-in web dashboard on port 5000 that shows:
- Worker uptime and start time
- Task counts — claimed, completed, failed, cancelled (for this worker instance only)
- Global queue depth — how many tasks are queued or processing across all workers
- Current task being processed (type, ID, duration)
- Recent task history with status, duration, and error details
Stats are tracked in memory and reset when the worker restarts. The dashboard auto-refreshes every 3 seconds.
Graceful Shutdown
When the worker receives a SIGTERM or SIGINT signal, it:
- Stops the polling loop (no new tasks are claimed)
- Waits for the current task to finish (if any)
- Closes the database connection pool
- Exits cleanly
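The steps above can be sketched as a drain-and-close sequence. Names here are illustrative, not taken from the source: the polling loop is assumed to check a flag between iterations, and the pool-closing callback stands in for pg's pool.end().

```typescript
let stopping = false;                       // poller checks this before each claim
let inFlight: Promise<void> | null = null;  // set by the poller while a task runs

// 1. stop claiming, 2. let the current task finish, 3. close the pool.
async function drainAndClose(closePool: () => Promise<void>): Promise<void> {
  stopping = true;
  if (inFlight) await inFlight;
  await closePool();
}

for (const sig of ['SIGTERM', 'SIGINT'] as const) {
  process.on(sig, () => {
    void drainAndClose(async () => { /* pool.end() goes here */ })
      .then(() => process.exit(0));  // 4. exit cleanly
  });
}
```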