
Store & Forward (S&F) — Quick Guide

This guide explains how to collect monitoring data at the edge, buffer it safely to files, and then ingest it into the cloud database with the Orchestrator — all using the single entrypoint main.py.

Overview

  • Edge Monitoring writes NDJSON files (optionally gzip-compressed) instead of writing directly to the database.
  • Files are rotated by size/lines/time and moved to a transfer folder.
  • A transport mechanism (e.g., shared folder or sync agent) delivers files to the cloud inbox.
  • The S&F Orchestrator watches the inbox, sorts records by timestamp, and performs idempotent inserts/updates.

Edge — How to Enable S&F

  • Command:
      python main.py --monitoring <DATASOURCE> --store-and-forward --saf-source-id edge-01
  • Optional flags:
      • --saf-max-lines 10000 (lines per file; default)
      • --saf-max-bytes 8 (MB; default)
      • --saf-max-age 30 (seconds; default)
      • --saf-outbox instance/saf/outbox
      • --saf-transfer instance/saf/transfer
      • --saf-compress gzip (default)
      • --saf-storage-limit 100 (MB; warn at 90%, delete oldest at 100%)
  • Behavior:
      • Outbox: temporary staging for active .ndjson.part files while they are being written.
      • Transfer: finalized files are moved to transfer/ (as .ndjson or .ndjson.gz). This is the folder you sync to the cloud.
      • If storage reaches 100% of the limit, the oldest files are deleted and a warning is logged.
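The rotation and storage-limit rules above can be sketched roughly as follows. This is an illustrative sketch, not the actual implementation behind main.py: the function names (should_rotate, finalize, enforce_storage_limit) are assumptions, while the thresholds mirror the flag defaults listed above.

```python
import gzip
import time
from pathlib import Path

# Paths and limits mirror the CLI defaults; all names here are illustrative.
OUTBOX = Path("instance/saf/outbox")
TRANSFER = Path("instance/saf/transfer")
MAX_LINES = 10_000                    # --saf-max-lines 10000
MAX_BYTES = 8 * 1024 * 1024           # --saf-max-bytes 8 (MB)
MAX_AGE_S = 30                        # --saf-max-age 30 (seconds)
STORAGE_LIMIT = 100 * 1024 * 1024     # --saf-storage-limit 100 (MB)

def should_rotate(lines: int, size: int, opened_at: float) -> bool:
    """A .ndjson.part file rotates when any one of the three thresholds is hit."""
    return (lines >= MAX_LINES
            or size >= MAX_BYTES
            or time.time() - opened_at >= MAX_AGE_S)

def finalize(part: Path) -> Path:
    """Gzip a finished .ndjson.part file from the outbox into transfer/."""
    final = TRANSFER / (part.stem + ".gz")   # foo.ndjson.part -> foo.ndjson.gz
    TRANSFER.mkdir(parents=True, exist_ok=True)
    with part.open("rb") as src, gzip.open(final, "wb") as dst:
        dst.writelines(src)
    part.unlink()
    return final

def enforce_storage_limit() -> None:
    """Warn at 90% of the limit; delete oldest transfer files once it is exceeded."""
    files = sorted(TRANSFER.glob("*.ndjson*"), key=lambda p: p.stat().st_mtime)
    total = sum(p.stat().st_size for p in files)
    if total >= 0.9 * STORAGE_LIMIT:
        print(f"warning: transfer folder at {total / STORAGE_LIMIT:.0%} of limit")
    while total > STORAGE_LIMIT and files:
        oldest = files.pop(0)                # oldest mtime first
        total -= oldest.stat().st_size
        oldest.unlink()
```

Rotation is "first threshold wins": a quiet stream still ships a file every ~30 seconds, while a busy one rotates on lines or bytes.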

Cloud — Orchestrator

  • Command (headless):
      python main.py --saf-orchestrator --saf-inbox instance/saf/inbox
  • With Web UI:
      python main.py --ux --saf-orchestrator --saf-inbox instance/saf/inbox
      Then navigate to /monitoring/saf-orchestrator.
  • Behavior:
      • Reads top-level files in the inbox (ignores subfolders), validates each line as JSON, sorts all records globally by value_ts, and performs idempotent upserts.
      • Moves processed files to inbox/processed/; invalid files go to inbox/quarantine/.
      • Processed retention: the orchestrator enforces a size limit on inbox/processed/ (default 500 MB). At 90% it logs a warning; at 100% it deletes the oldest files.
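One orchestrator pass can be sketched as below. The helper names (read_records, ingest_once) and the caller-supplied upsert callback are assumptions for illustration; the real orchestrator is started via python main.py --saf-orchestrator.

```python
import gzip
import json
from pathlib import Path

INBOX = Path("instance/saf/inbox")  # matches --saf-inbox default used in this guide

def read_records(path: Path) -> list[dict]:
    """Parse one NDJSON(.gz) file; raise on any invalid line."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def ingest_once(upsert) -> None:
    """One inbox pass: top-level files only, global sort by value_ts, then upsert."""
    records, done, bad = [], [], []
    for p in sorted(INBOX.iterdir()):
        if not (p.is_file() and p.name.endswith((".ndjson", ".ndjson.gz"))):
            continue  # subfolders such as processed/ and quarantine/ are ignored
        try:
            records.extend(read_records(p))
            done.append(p)
        except (json.JSONDecodeError, UnicodeDecodeError, OSError):
            bad.append(p)
    records.sort(key=lambda r: r["value_ts"])  # ISO-8601 strings sort chronologically
    for rec in records:
        upsert(rec)  # idempotent: keyed on (source_id, source_seq)
    for p, folder in [(p, "processed") for p in done] + [(p, "quarantine") for p in bad]:
        target = INBOX / folder
        target.mkdir(parents=True, exist_ok=True)
        p.rename(target / p.name)
```

Because the sort is global across all files read in a pass, records arriving out of order across file boundaries are still applied chronologically.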

Record Format (NDJSON)

Each line is a JSON object with minimal fields:

{
  "source_id": "edge-01",
  "source_seq": 123,
  "created_at": "2025-10-17T10:00:00Z",
  "datasource": "rest-api",
  "stream_id": "stream-1",
  "stream_name": "S1",
  "value": { ... },
  "value_ts": "2025-10-17T10:00:00Z",
  "status": "online",
  "event_type": "scan_success"
}
  • source_seq is assigned at the edge per record; cloud uses it for idempotency.
  • value_ts drives chronological ordering across files.
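A record line can be assembled as sketched below. The field names match the format above; the make_record helper and the in-process counter are illustrative assumptions, not the edge writer's actual code.

```python
import itertools
import json
from datetime import datetime, timezone

# Illustrative per-process counter: source_seq must be strictly increasing
# per record at the edge so the cloud can deduplicate on (source_id, source_seq).
_seq = itertools.count(1)

def make_record(stream_id: str, stream_name: str, value: dict) -> str:
    """Build one NDJSON line for the outbox file."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    record = {
        "source_id": "edge-01",
        "source_seq": next(_seq),
        "created_at": now,
        "datasource": "rest-api",
        "stream_id": stream_id,
        "stream_name": stream_name,
        "value": value,
        "value_ts": now,         # drives chronological ordering in the cloud
        "status": "online",
        "event_type": "scan_success",
    }
    return json.dumps(record, separators=(",", ":"))  # one compact line per record
```

Each returned string is appended to the active .ndjson.part file followed by a newline; it must never contain embedded newlines.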

Tips & Troubleshooting

  • Avoid running multiple Monitoring processes on the same edge directory (to prevent duplicate work).
  • If you see repeated ingest logs, ensure files are moving to processed/ and the inbox contains only fresh .ndjson(.gz) files at top level.
  • If the Recent Files view is empty after changing the inbox, it falls back to showing the latest files from processed/ once they are available.
  • For large backlogs, increase --saf-max-lines/bytes/age and consider a larger storage limit.
  • If streams are missing in the cloud DB, create MonitoredStream entries (future option: auto-create placeholders).

Safety Notes

  • Compression is enabled; encryption is not included in Phase 1.
  • Storage policy (edge transfer / cloud processed): warn at 90%; at 100% delete oldest files to keep the system running.
  • Idempotency: re-ingesting the same files updates existing records; no duplicates.

Version: beta
