Long-term log archival and restoration for Graylog Open — a capability otherwise exclusive to the Enterprise edition. Streaming exports, SHA-256 integrity checks, point-in-time restore, and an independent compliance audit trail that Graylog admins can't tamper with.
Prerequisites.
Linux host with python3 ≥ 3.10, pip, and git.
The installer creates a system user, sets up systemd, generates a self-signed
cert and brings up the Web UI on HTTPS port 8990.
Quick install (the script does everything below for you):

sudo git clone https://github.com/jasoncheng7115/jt-glogarch.git /opt/jt-glogarch
sudo bash /opt/jt-glogarch/deploy/install.sh

Manual install:

sudo git clone https://github.com/jasoncheng7115/jt-glogarch.git /opt/jt-glogarch
cd /opt/jt-glogarch
sudo pip install --no-build-isolation .
sudo cp deploy/jt-glogarch.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now jt-glogarch
Web UI: https://<server-ip>:8990 (login with your Graylog credentials)
Config file: /opt/jt-glogarch/config.yaml
Logs: journalctl -u jt-glogarch -f
Upgrade: sudo bash /opt/jt-glogarch/deploy/upgrade.sh (safe, takes a DB snapshot first)
Uninstall: sudo bash /opt/jt-glogarch/deploy/uninstall.sh (prompts before deleting any data)
Then point it at Graylog.
Edit config.yaml with your Graylog API URL and a token
(Graylog → System → Users → Edit tokens). Optional: add OpenSearch hosts to
enable the ~5× faster Direct mode. Full setup → README · Configuration
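As a rough illustration, a minimal config.yaml could look like the sketch below. The key names here are assumptions for illustration only, not jt-glogarch's actual schema; the README · Configuration page is authoritative.

```yaml
# Hypothetical sketch — actual keys may differ; see README · Configuration
graylog:
  api_url: "https://graylog.example.com:9000/api"
  api_token: "<your-token>"     # Graylog → System → Users → Edit tokens
opensearch:                     # optional: enables the ~5× faster Direct mode
  hosts:
    - "https://opensearch-1.example.com:9200"
archive:
  path: /data/graylog-archives
```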
Pick API or OpenSearch Direct on the way out — pick GELF or OpenSearch Bulk on the way back in. Cross-mode dedup keeps everything coherent.
Standard REST API path. Works with any Graylog Open install. Stream filters, time-window pagination beyond the 10K offset limit, JVM heap monitor that auto-pauses when Graylog gets stressed.
Bypasses Graylog and queries OpenSearch directly. ~5× faster. Per-index export with search_after pagination (no 10K limit), host failover, transient retry on 500/502/503/429.
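The search_after technique mentioned above can be sketched as a query builder. This is a minimal illustration of OpenSearch's documented search_after semantics, not jt-glogarch's actual internals; the function name and time field are assumptions.

```python
def next_page_query(time_field, page_size, last_sort_values=None):
    """Build the body for one page of a deep export past the 10K offset cap."""
    body = {
        "size": page_size,
        # A total order (tiebreaker on _id) makes search_after deterministic.
        "sort": [{time_field: "asc"}, {"_id": "asc"}],
        "query": {"match_all": {}},
    }
    if last_sort_values is not None:
        # Resume strictly after the previous page's last document; there is
        # no from/size offset, so the 10,000-hit window never applies.
        body["search_after"] = last_sort_values
    return body
```

The first page carries no search_after; every later page passes the sort values of the previous page's final hit, so the cursor advances without server-side state.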
Default restore path. Each message replays through the full Graylog input → process → indexer chain. Pipeline rules, extractors, stream routing, alerts — all preserved.
5–10× faster direct write to OpenSearch via _bulk. Skips Graylog processing entirely. For "restore as-is" — disaster recovery, migration, point-in-time investigations.
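A _bulk write works on a newline-delimited payload: one action line, then one source line per document. The helper below is a generic sketch of that format (function and index names are illustrative, not jt-glogarch's code).

```python
import json

def to_bulk_ndjson(index_name, docs):
    """Serialize documents into an OpenSearch _bulk NDJSON payload:
    an index action line followed by the document source, per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    return "\n".join(lines) + "\n"  # _bulk requires a trailing newline
```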
The same archive serves multiple needs — once the data lives in compressed .json.gz files with checksums, you can pull from any of these angles.
Keep 1 year of authentication logs while Graylog's hot index retention stays at 90 days. Schedule a daily stream-filtered export and let cheap storage take everything past 90 days.
An incident from six months ago needs investigation, but those logs have rotated out of OpenSearch. Find the archive, click Import, and point it at a Graylog instance with a GELF input — back online and searchable.
Move from old Graylog to new Graylog. OpenSearch Direct exports the whole cluster fast, GELF or Bulk imports it into the new one. No vendor lock-in, no cloud middleman.
Mount remote storage at /data/graylog-archives and schedule daily exports. Even if your live Graylog cluster dies, you keep searchable archives at another site.
OpenSearch hot tier is expensive. Archive older indices to compressed storage (~10× compression), let the active cluster carry only recent searches. Old data still recoverable in minutes.
Graylog's built-in audit log is admin-deletable. jt-glogarch's independent operation audit captures what really happened on the cluster, in a database Graylog admins don't own.
Built to run quietly for years against a busy Graylog cluster.
Never holds messages in memory. .json.gz chunks roll over at 50 MB by default. OOM-proof on multi-GB indices.
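The streaming-with-rollover idea can be sketched in a few lines: write each message as it arrives and open a fresh chunk once the compressed file hits the limit. This is a simplified illustration, not the project's actual writer.

```python
import gzip
import json
import os

class ChunkedGzipWriter:
    """Stream messages to .json.gz chunks, rolling over at a byte limit,
    so no index ever has to fit in memory. Illustrative sketch only."""

    def __init__(self, directory, limit_bytes=50 * 1024 * 1024):
        self.dir, self.limit = directory, limit_bytes
        self.seq, self.fh, self.path = 0, None, None

    def _open_next(self):
        self.seq += 1
        self.path = os.path.join(self.dir, f"chunk-{self.seq:05d}.json.gz")
        self.fh = gzip.open(self.path, "wt", encoding="utf-8")

    def write(self, message):
        if self.fh is None:
            self._open_next()
        self.fh.write(json.dumps(message) + "\n")
        self.fh.flush()
        # Roll over once the compressed on-disk size reaches the limit.
        if os.path.getsize(self.path) >= self.limit:
            self.fh.close()
            self.fh = None

    def close(self):
        if self.fh:
            self.fh.close()
            self.fh = None
```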
Same-mode exact match prevents re-export. Cross-mode coverage check prevents duplicates when switching between API and OpenSearch.
Service crash mid-export? Restart and continue from the last completed chunk. Nothing lost, nothing duplicated.
Every archive ships with a .sha256 sidecar. Scheduled re-verification runs in parallel workers; corrupted files are flagged on the dashboard.
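The sidecar mechanism is standard sha256sum-style hashing; a sketch of writing and re-verifying one (helper names are illustrative):

```python
import hashlib
import os

def _digest(path):
    """Hash a file in 1 MiB blocks so multi-GB archives never load into memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1024 * 1024), b""):
            h.update(block)
    return h.hexdigest()

def write_sidecar(archive_path):
    """Write <file>.sha256 next to the archive, sha256sum-compatible."""
    sidecar = archive_path + ".sha256"
    with open(sidecar, "w") as f:
        f.write(f"{_digest(archive_path)}  {os.path.basename(archive_path)}\n")
    return sidecar

def verify_sidecar(archive_path):
    """Re-hash the archive and compare against its sidecar digest."""
    with open(archive_path + ".sha256") as f:
        expected = f.read().split()[0]
    return _digest(archive_path) == expected
```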
Export, cleanup, verify — POSIX cron with the day-of-week numbering you actually expect. Live edits via Web UI take effect immediately, no service restart.
API exports auto-pause when Graylog heap exceeds 85%. Resume after GC recovers; stop after 5 minutes if it doesn't. Your Graylog stays alive.
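The pause/resume/stop logic described above boils down to a small state decision. The sketch below shows one way to express it, assuming the thresholds from the text (85% heap, 5-minute give-up); it is an illustration, not the project's code.

```python
PAUSE_THRESHOLD = 0.85   # pause exports above 85% heap usage
GIVE_UP_AFTER = 300      # abort if heap stays high for 5 minutes

def heap_decision(heap_used_ratio, paused_since, now):
    """Return (action, paused_since), where action is 'run', 'pause',
    or 'stop', and paused_since is when the current pause began (or None)."""
    if heap_used_ratio <= PAUSE_THRESHOLD:
        return "run", None                 # GC recovered: resume the export
    if paused_since is None:
        return "pause", now                # first breach: start pausing
    if now - paused_since >= GIVE_UP_AFTER:
        return "stop", paused_since        # stressed too long: abort cleanly
    return "pause", paused_since           # still high: keep waiting
```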
Before any GELF or Bulk send, jt-glogarch checks target capacity, fixes mapping conflicts, and reconciles indexer failures after import. Zero indexer failures = compliance pass.
Telegram · Discord · Slack · Microsoft Teams · Nextcloud Talk · Email. Bilingual (English / Traditional Chinese). Independent triggers per event type.
Full Web UI (light/dark, EN/zh-TW) plus a glogarch CLI for scripting. Same operations, same database, your choice.
Compliance frameworks (ISO 27001, PCI-DSS, GDPR…) require an audit trail of administrator actions. Graylog's built-in audit log is managed by Graylog itself — anyone with admin rights can delete or modify it. From an auditor's perspective: untrustworthy.
An nginx reverse proxy in front of Graylog forwards every API call to jt-glogarch's syslog listener over UDP. Independent path, independent storage.
Graylog admins can delete their own audit log. They cannot delete this one.
Stream / pipeline / user / search / dashboard / content pack / lookup table / extractor — every meaningful create / modify / delete is captured and classified.
Filterable in Web UI by user, method, URI, status, time range, sensitive-only.
Basic Auth, API tokens, browser sessions, cookie auth — all resolve to the actual Graylog username. Token prefix cache, per-user lookups, IP fallback.
Even when 5 admins share an IP, cookie-based session resolution distinguishes them.
Configurable patterns trigger real-time notifications via your channels. User deletions, admin grants, content pack installs — pings as they happen, not in next morning's report.
Every 5 minutes jt-glogarch probes through nginx and watches for the syslog echo. Quiet pipeline → instant alert. Catches "someone disabled the forwarding" without false positives.
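The heartbeat check reduces to: did the probe's syslog echo come back within a grace window? A minimal sketch of that predicate (the function name and 60-second grace value are assumptions for illustration):

```python
def echo_received(probe_sent_at, echo_times, grace_seconds=60):
    """True if at least one syslog echo arrived within the grace window
    after the probe went out through nginx, i.e. the pipeline is alive."""
    return any(
        probe_sent_at <= t <= probe_sent_at + grace_seconds
        for t in echo_times
    )
```

Requiring the echo to land *after* the probe avoids false positives from stale traffic still sitting in the listener.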
Audit retention is separate from archive retention (default 180 days, tunable). Even if log archives are pruned, the operation history stays as long as you need.
Setup steps + nginx config template → README · Operation Audit
The Web UI is bilingual (English / Traditional Chinese) with light and dark themes.