Logs that survive your cluster.

Long-term log archival and restoration for Graylog Open — the feature only the Enterprise edition ships. Streaming exports, SHA256 integrity, point-in-time restore, and an independent compliance audit Graylog admins can't tamper with.

Graylog Open · Self-hosted · Archival · Notifications · Independent audit · Open source · Apache 2.0
Three-step install

Prerequisites. A Linux host with python3 ≥ 3.10, pip, and git. The installer creates a system user, sets up a systemd service, generates a self-signed certificate, and brings up the Web UI on HTTPS port 8990.

sudo git clone https://github.com/jasoncheng7115/jt-glogarch.git /opt/jt-glogarch
sudo bash /opt/jt-glogarch/deploy/install.sh
Web UI: https://<server-ip>:8990 (log in with your Graylog credentials)
Config: /opt/jt-glogarch/config.yaml
Logs: journalctl -u jt-glogarch -f
Upgrade: sudo bash /opt/jt-glogarch/deploy/upgrade.sh (safe, takes a DB snapshot first)
Uninstall: sudo bash /opt/jt-glogarch/deploy/uninstall.sh (prompts before deleting any data)

Then point it at Graylog. Edit config.yaml with your Graylog API URL and a token (Graylog → System → Users → Edit tokens). Optional: add OpenSearch hosts to enable the ~5× faster Direct mode. Full setup → README · Configuration

Two paths in. Two paths out.

Pick API or OpenSearch Direct on the way out — pick GELF or OpenSearch Bulk on the way back in. Cross-mode dedup keeps everything coherent.

Graylog API export

Standard REST API path. Works with any Graylog Open install. Supports stream filters, time-window pagination past the 10K offset limit, and a JVM heap monitor that auto-pauses exports when Graylog is under memory pressure.
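The time-window idea can be sketched roughly like this. It is an illustration under stated assumptions, not jt-glogarch's actual code: `split_windows` and the `count_in` callback are hypothetical names, and the only number taken from the text is the 10K limit. A window that holds more messages than the limit is halved until every piece can be paged through safely.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterator, Tuple

Window = Tuple[datetime, datetime]
OFFSET_LIMIT = 10_000  # the 10K offset limit mentioned above


def split_windows(start: datetime, end: datetime,
                  count_in: Callable[[datetime, datetime], int],
                  limit: int = OFFSET_LIMIT) -> Iterator[Window]:
    """Yield [start, end) windows that each hold at most `limit` messages.

    `count_in` is a hypothetical callback that returns how many messages
    fall inside a window (in practice, a count query against the API).
    """
    stack = [(start, end)]
    while stack:
        lo, hi = stack.pop()
        too_small_to_split = (hi - lo) <= timedelta(milliseconds=1)
        if count_in(lo, hi) <= limit or too_small_to_split:
            yield lo, hi
            continue
        mid = lo + (hi - lo) / 2
        # Push the later half first so windows come out in time order.
        stack.append((mid, hi))
        stack.append((lo, mid))
```

Each yielded window can then be fetched with ordinary offset paging, because no single window exceeds the limit.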

OpenSearch Direct export

Bypasses Graylog and queries OpenSearch directly, roughly 5× faster than the API path. Per-index export with search_after pagination (no 10K limit), host failover, and automatic retry on transient HTTP 429/500/502/503 responses.
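A search_after loop with transient retry might look like the sketch below. The `search` callable, the `HttpError` class, the backoff schedule, and the choice of sort fields are all assumptions made for illustration; only the search_after mechanism and the retryable status codes come from the text.

```python
import time
from typing import Any, Callable, Dict, Iterator, List

RETRYABLE = {429, 500, 502, 503}  # transient statuses listed above


class HttpError(Exception):
    """Hypothetical error raised by `search` on a non-2xx response."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def scan_index(search: Callable[[Dict[str, Any]], List[Dict[str, Any]]],
               page_size: int = 1000, max_retries: int = 3,
               backoff: float = 1.0) -> Iterator[Dict[str, Any]]:
    """Yield every hit of one index using search_after pagination.

    `search` is a hypothetical callable that sends the body to the index's
    _search endpoint and returns the `hits.hits` list.
    """
    body: Dict[str, Any] = {
        "size": page_size,
        # search_after needs a stable sort; these fields are assumptions.
        "sort": [{"timestamp": "asc"}, {"_id": "asc"}],
    }
    while True:
        for attempt in range(max_retries + 1):
            try:
                hits = search(body)
                break
            except HttpError as exc:
                if exc.status not in RETRYABLE or attempt == max_retries:
                    raise
                time.sleep(backoff * 2 ** attempt)  # exponential backoff
        if not hits:
            return
        yield from hits
        # Continue from the sort values of the last hit, not from an offset.
        body["search_after"] = hits[-1]["sort"]
```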

GELF import

Default restore path. Each message replays through the full Graylog input → process → indexer chain. Pipeline rules, extractors, stream routing, alerts — all preserved.
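For a sense of what "replaying" a message means on the wire, here is a minimal sketch that turns one archived record into a GELF 1.1 payload. The record layout (`timestamp`, `source`, `message` plus custom fields) and the `to_gelf` helper are assumptions; how jt-glogarch maps fields internally is not described here.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict

# Record keys that map onto GELF's own fields or must not be forwarded.
# GELF does not allow `_id` as an additional field, so ids are skipped.
RESERVED = ("timestamp", "source", "message", "id", "_id")


def to_gelf(record: Dict[str, Any]) -> bytes:
    """Encode one archived record as a null-terminated GELF 1.1 message."""
    ts = datetime.fromisoformat(record["timestamp"].replace("Z", "+00:00"))
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assume UTC when no offset
    gelf: Dict[str, Any] = {
        "version": "1.1",
        "host": record.get("source", "unknown"),
        "short_message": record.get("message", ""),
        "timestamp": ts.timestamp(),  # seconds since the epoch
    }
    for key, value in record.items():
        if key in RESERVED:
            continue
        # Additional fields must be prefixed with an underscore.
        gelf["_" + key.lstrip("_")] = value
    # GELF over TCP delimits messages with a trailing null byte.
    return json.dumps(gelf).encode("utf-8") + b"\0"
```

Keeping the original timestamp in the payload is what lets a restored message land in its original time range rather than at import time.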

OpenSearch Bulk import

5–10× faster than GELF import: writes directly to OpenSearch via _bulk and skips Graylog processing entirely. Meant for "restore as-is" cases: disaster recovery, migration, point-in-time investigations.
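The _bulk endpoint takes newline-delimited JSON: one action line, then one document line. A minimal sketch of building such a body follows; `bulk_body` is a hypothetical helper, and carrying the original `_id` along is an assumption about how a restore could stay idempotent.

```python
import json
from typing import Any, Dict, Iterable


def bulk_body(index: str, docs: Iterable[Dict[str, Any]]) -> str:
    """Build an NDJSON body for _bulk: an action line, then a source line."""
    lines = []
    for doc in docs:
        source = {k: v for k, v in doc.items() if k != "_id"}
        action: Dict[str, Any] = {"index": {"_index": index}}
        if "_id" in doc:
            # Reusing the original id makes a repeated restore overwrite
            # documents instead of duplicating them.
            action["index"]["_id"] = doc["_id"]
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```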

Six things you'll do with it.

The same archive serves multiple needs — once the data lives in compressed .json.gz files with checksums, you can pull from any of these angles.

Compliance retention

Keep 1 year of authentication logs while Graylog's hot index retention stays at 90 days. Schedule a daily stream-filtered export and let cheap storage take everything past 90 days.

Forensic restore

An incident from 6 months ago needs investigation, but those logs have rotated out of OpenSearch. Find the archive, click Import, point it at a Graylog instance with a GELF input, and the logs are back online and searchable.

Cluster migration

Move from old Graylog to new Graylog. OpenSearch Direct exports the whole cluster fast, GELF or Bulk imports it into the new one. No vendor lock-in, no cloud middleman.

Disaster recovery

Mount remote storage at /data/graylog-archives and schedule daily exports. Even if your live Graylog cluster dies, you keep searchable archives at another site.

Cost reduction

OpenSearch hot tier is expensive. Archive older indices to compressed storage (~10× compression) and let the active cluster hold only recent data. Old data is still recoverable in minutes.

Independent audit

Graylog's built-in audit log is admin-deletable. jt-glogarch's independent operation audit captures what really happened on the cluster, in a database Graylog admins don't own.

Engineered for the long haul.

Built to run quietly for years against a busy Graylog cluster.

Streaming write

Messages are written to disk as they arrive; a whole export is never buffered in memory. .json.gz chunks roll over at 50 MB by default, so multi-GB indices cannot exhaust memory.
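A streaming writer with size-based rollover can be sketched as below. Several details are assumptions: one JSON object per line inside each .json.gz file, the size limit measured on uncompressed bytes, and the `write_chunks` name and file naming scheme.

```python
import gzip
import json
from pathlib import Path
from typing import Any, Dict, Iterable, List


def write_chunks(messages: Iterable[Dict[str, Any]], out_dir: Path,
                 prefix: str = "chunk",
                 max_bytes: int = 50 * 1024 * 1024) -> List[Path]:
    """Stream messages into gzip files, starting a new file at `max_bytes`.

    Only one message is held in memory at a time, so `messages` can be a
    generator over an arbitrarily large result set.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    fh = None
    written = 0  # uncompressed bytes written to the current chunk
    try:
        for msg in messages:
            line = (json.dumps(msg) + "\n").encode("utf-8")
            if fh is None or (written and written + len(line) > max_bytes):
                if fh is not None:
                    fh.close()
                path = out_dir / f"{prefix}-{len(paths):05d}.json.gz"
                fh = gzip.open(path, "wb")
                paths.append(path)
                written = 0
            fh.write(line)
            written += len(line)
    finally:
        if fh is not None:
            fh.close()
    return paths
```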

Smart deduplication

Same-mode exact match prevents re-export. Cross-mode coverage check prevents duplicates when switching between API and OpenSearch.
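One way to picture a coverage check: treat each existing archive as a time interval and ask whether the requested range is already fully covered by their union. This is a guess at the general technique, not the project's actual algorithm; `is_covered` and the numeric timestamps are illustrative.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) as epoch seconds, end exclusive


def is_covered(target: Interval, archived: List[Interval]) -> bool:
    """Return True if `target` lies entirely inside the union of `archived`."""
    start, end = target
    cursor = start  # everything before `cursor` is known to be covered
    for lo, hi in sorted(archived):
        if lo > cursor:
            break  # gap before the next archive starts
        cursor = max(cursor, hi)
        if cursor >= end:
            return True
    return cursor >= end
```

An export for a range that is already covered can be skipped regardless of which mode produced the earlier archives.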

Resume from interruption

Service crash mid-export? Restart and continue from the last completed chunk. Nothing lost, nothing duplicated.

SHA256 integrity

Every archive ships with a .sha256 sidecar. Scheduled re-verification runs in parallel workers; corrupted files are flagged on the dashboard.
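Verifying an archive against its sidecar takes only a few lines. The sketch below assumes the sidecar sits next to the archive as `<name>.sha256` and starts with the hex digest (the common sha256sum layout); the exact sidecar format used by jt-glogarch is not stated here.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB blocks so large archives never load fully."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for block in iter(lambda: fh.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify(path: Path) -> bool:
    """Compare the file's digest with the first token of its sidecar."""
    sidecar = path.with_name(path.name + ".sha256")
    expected = sidecar.read_text().split()[0].lower()
    return sha256_of(path) == expected
```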

Cron-based schedules

Export, cleanup, and verify jobs use POSIX cron syntax with the day-of-week numbering you actually expect (0 = Sunday). Live edits via the Web UI take effect immediately, with no service restart.

JVM heap guard

API exports auto-pause when Graylog heap exceeds 85%. Resume after GC recovers; stop after 5 minutes if it doesn't. Your Graylog stays alive.
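The pause-and-resume behaviour can be sketched as a small wait loop. `heap_pct` is a hypothetical callable returning Graylog's current heap usage as a percentage, and the poll interval is an assumption; the 85% threshold and the 5-minute limit come from the text.

```python
import time
from typing import Callable


def wait_for_heap(heap_pct: Callable[[], float],
                  threshold: float = 85.0,
                  timeout: float = 300.0,
                  poll: float = 5.0,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> bool:
    """Block while heap usage is above `threshold`.

    Returns True once usage drops (the export may continue) and False if
    it stays high for `timeout` seconds (the export should stop).
    """
    deadline = None
    while heap_pct() > threshold:
        now = clock()
        if deadline is None:
            deadline = now + timeout  # start timing at the first high reading
        elif now >= deadline:
            return False
        sleep(poll)
    return True
```

An exporter would call this between batches and abort the job when it returns False.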

Pre-flight compliance

Before any GELF or Bulk send, jt-glogarch checks target capacity and fixes mapping conflicts. After the import, it reconciles indexer failures. Zero indexer failures = compliance pass.

Notifications

Telegram · Discord · Slack · Microsoft Teams · Nextcloud Talk · Email. Bilingual (EN / 繁體中文). Independent triggers per event type.

Web UI + CLI

Full Web UI (light/dark, EN/zh-TW) plus a glogarch CLI for scripting. Same operations, same database, your choice.

Audit that admins can't tamper with.

Compliance frameworks (ISO 27001, PCI-DSS, GDPR…) require an audit trail of administrator actions. Graylog's built-in audit log is managed by Graylog itself — anyone with admin rights can delete or modify it. From an auditor's perspective: untrustworthy.

nginx access log → syslog

nginx in front of Graylog sends an access-log entry for every API call to jt-glogarch's syslog listener over UDP. Independent path, independent storage.

Graylog admins can delete their own audit log. They cannot delete this one.

60+ operation types

Stream / pipeline / user / search / dashboard / content pack / lookup table / extractor — every meaningful create / modify / delete is captured and classified.

Filterable in Web UI by user, method, URI, status, time range, sensitive-only.

Resolved usernames

Basic Auth, API tokens, browser sessions, cookie auth — all resolve to the actual Graylog username. Token prefix cache, per-user lookups, IP fallback.

Even when 5 admins share an IP, cookie-based session resolution distinguishes them.

Sensitive-operation alerts

Configurable patterns trigger real-time notifications via your channels. User deletions, admin grants, content pack installs — pings as they happen, not in next morning's report.

Active heartbeat

Every 5 minutes jt-glogarch probes through nginx and watches for the syslog echo. Quiet pipeline → instant alert. Catches "someone disabled the forwarding" without false positives.

Independent retention

Audit retention is separate from archive retention (default 180 days, tunable). Even if log archives are pruned, the operation history stays as long as you need.

Setup steps + nginx config template → README · Operation Audit

Screenshots

Web UI is bilingual (EN / 繁體中文) with light and dark themes.