RepoPilot

AnalogJ/scrutiny

Hard Drive S.M.A.R.T Monitoring, Historical Trends & Real World Failure Thresholds

Overall: Healthy (healthy across the board)

Use as dependency: Healthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modify: Healthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn from: Healthy

Documented and popular — useful reference codebase to read through.

Deploy as-is: Healthy

No critical CVEs, sane security posture — runnable as-is.

  • Last commit 1w ago
  • 25+ active contributors
  • Distributed ownership (top contributor 29% of recent commits)
  • MIT licensed
  • CI configured
  • Tests present

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

Variant:
RepoPilot: Healthy
[![RepoPilot: Healthy](https://repopilot.app/api/badge/analogj/scrutiny)](https://repopilot.app/r/analogj/scrutiny)

Paste at the top of your README.md — renders inline like a shields.io badge.

Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/analogj/scrutiny on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: AnalogJ/scrutiny

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

  1. Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
  2. Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
  3. Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/AnalogJ/scrutiny shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across the board

  • Last commit 1w ago
  • 25+ active contributors
  • Distributed ownership (top contributor 29% of recent commits)
  • MIT licensed
  • CI configured
  • Tests present

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live AnalogJ/scrutiny repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/AnalogJ/scrutiny.

What it runs against: a local clone of AnalogJ/scrutiny — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in AnalogJ/scrutiny | Confirms the artifact applies here, not a fork |
| 2 | License is still MIT | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | Last commit ≤ 40 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>AnalogJ/scrutiny</code></summary>
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of AnalogJ/scrutiny. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/AnalogJ/scrutiny.git
#   cd scrutiny
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of AnalogJ/scrutiny and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "AnalogJ/scrutiny(\.git)?\b" \
  && ok "origin remote is AnalogJ/scrutiny" \
  || miss "origin remote is not AnalogJ/scrutiny (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(MIT)" LICENSE 2>/dev/null \
   || grep -qiE "\"license\"\s*:\s*\"MIT\"" package.json 2>/dev/null) \
  && ok "license is MIT" \
  || miss "license drift — was MIT at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"

# 4. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 40 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~10d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/AnalogJ/scrutiny"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

TL;DR

Scrutiny is a hard drive health monitoring dashboard that integrates with smartd to track S.M.A.R.T metrics over time and predict drive failures using real-world failure thresholds rather than manufacturer defaults. It provides a web UI for visualizing drive health trends across multiple drives, storing historical S.M.A.R.T attribute data in SQLite, and alerting on degradation patterns that precede failure. Monorepo with two main components: collector/ (Go CLI tools for gathering S.M.A.R.T metrics via collector-metrics and self-testing via collector-selftest) and webapp/ (frontend UI). The collector uses a pluggable shell abstraction (collector/pkg/common/shell) to handle local or remote command execution, stores data via GORM with SQLite, and provides configuration via Viper.

👥Who it's for

System administrators and homelab operators managing servers with multiple hard drives who need centralized visibility into drive health beyond smartd's command-line interface. Contributors are Go backend developers and frontend engineers building monitoring infrastructure.

🌱Maturity & risk

Active development with solid CI/CD setup (GitHub Actions workflows for CI, Docker builds, releases), test coverage tracking via codecov, and Go 1.25 support. The README explicitly notes 'Scrutiny is a Work-in-Progress and still has some rough edges', suggesting it's production-capable but actively evolving rather than fully stable.

Founder-led repo (AnalogJ) with historically concentrated ownership creates some maintenance concentration risk, though recent commits show broader contribution (top contributor at 29%). Dependency footprint is moderate but includes external integrations (InfluxDB client, shoutrrr notifications, ghw hardware detection) that could introduce breaking changes. The project explicitly acknowledges rough edges and the lack of mature S.M.A.R.T threshold defaults — real-world threshold accuracy is still being refined.

Active areas of work

The repo shows active CI/CD with nightly Docker builds and release pipelines. The file structure indicates ongoing work on collector metrics (collector/pkg/collector/metrics.go with tests), configuration management (config_test.go present), and shell abstraction testing. No specific recent commit data is visible, but the presence of comprehensive test files and multiple devcontainer configs suggests active development and contributor onboarding effort.

🚀Get running

git clone https://github.com/AnalogJ/scrutiny.git
cd scrutiny
make
# See Makefile for specific build targets

Daily commands:

# Build collector binaries
make build-collector
# Start web UI (inferred from repo structure)
make run
# For development with devcontainers, use .devcontainer configs for Docker, Podman, or rootless Docker

🗺️Map of the codebase

🛠️How to make changes

  • Adding S.M.A.R.T attribute tracking: modify collector/pkg/collector/metrics.go and add tests in metrics_test.go
  • New alert/notification types: extend collector/pkg/collector/base.go and integrate with shoutrrr in config
  • Custom thresholds: edit collector/pkg/config/config.go and factory logic
  • Frontend UI changes: modify files in webapp/frontend/src/
  • Shell command execution changes: adjust abstraction in collector/pkg/common/shell/ (local_shell.go for real execution, mock_shell.go for tests)

🪤Traps & gotchas

  1. Shell abstraction must be tested via mock_shell.go: local_shell.go directly executes smartctl commands, so changes here require corresponding mock implementations.
  2. S.M.A.R.T threshold data is manufacturer/model-dependent: config thresholds in collector/pkg/config are not universal — real-world thresholds may vary significantly by drive model.
  3. SQLite persistence is blocking on high-frequency metric collection: concurrent writes to the database from multiple collectors may lock; consider connection pooling if scaling beyond single-machine deployment.
  4. devcontainer setup requires specific Docker or Podman versions: three separate devcontainer configs exist (.devcontainer/docker, .devcontainer/docker-rootless, .devcontainer/podman) and the wrong one can cause setup failures.
  5. No API versioning visible: webapp may depend on internal API contracts that could break with schema changes.

💡Concepts to learn

  • S.M.A.R.T (Self-Monitoring, Analysis and Reporting Technology) — Core domain of this project; understanding S.M.A.R.T attributes, thresholds, and failure prediction is essential to contribute meaningfully to metrics logic and threshold tuning
  • Dependency Injection via Interfaces — Scrutiny uses interface-based design extensively (shell/interface.go, config/interface.go) to enable mocking and testing; understanding this pattern is critical for modifying collectors and adding new components
  • Mock Objects and Testability — The collector uses mock_shell.go for testing command execution without invoking real smartctl; this pattern is repeated throughout and contributors must maintain it
  • GORM ORM and Database Migrations — Historical S.M.A.R.T data persistence uses GORM with gormigrate for schema management; understanding GORM associations and migrations is needed for adding new metric types or altering data storage
  • Factory Pattern for Configuration — The collector/pkg/config/factory.go and shell/factory.go use factory pattern to instantiate objects based on configuration; this pattern controls how different backends (local shell, mock shell) are selected
  • Time-Series Data and Historical Trending — A core value proposition of Scrutiny is tracking S.M.A.R.T metrics over time to detect degradation; understanding time-series aggregation and anomaly detection concepts helps with threshold tuning and new metrics
  • CLI Tool Design with urfave/cli — The collector binaries (collector-metrics, collector-selftest) use urfave/cli/v2 for argument parsing and subcommand dispatch; understanding this framework is needed to add new collector modes or modify CLI behavior
  • smartmontools/smartmontools — The upstream smartd/smartctl project that Scrutiny wraps; understanding smartctl output format is essential
  • prometheus/node_exporter — Similar monitoring exporter pattern; provides alternative S.M.A.R.T collection via textfile collector that Scrutiny could integrate with
  • TrueNAS/TrueNAS — Enterprise NAS platform with built-in S.M.A.R.T monitoring UI; competitor/inspiration for Scrutiny's dashboard design
  • influxdata/influxdb — Optional time-series backend for Scrutiny metrics export (already integrated via influxdb-client-go in go.mod)

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add comprehensive unit tests for collector/pkg/detect/devices_*.go platform detection

The repository has platform-specific device detection for Linux, Windows, FreeBSD, and Darwin (collector/pkg/detect/devices_*.go), but only devices_linux_test.go exists. This creates coverage gaps for Windows, FreeBSD, and Darwin detection logic. Adding tests for these platform-specific implementations would catch regressions in device enumeration across different operating systems and improve overall test coverage significantly.

  • [ ] Create collector/pkg/detect/devices_windows_test.go with mock implementations for Windows registry/WMI calls
  • [ ] Create collector/pkg/detect/devices_freebsd_test.go with test cases for FreeBSD-specific device enumeration
  • [ ] Create collector/pkg/detect/devices_darwin_test.go with test cases for macOS device detection
  • [ ] Use the existing go.uber.org/mock patterns found in the codebase to mock system calls
  • [ ] Ensure tests cover both happy paths and error cases (missing devices, permission errors, etc.)

Add integration tests for collector command execution with override configurations

The config/testdata directory contains override_commands.yaml and override_device_commands.yaml test fixtures, but there are no corresponding integration tests that verify the collector actually uses these overridden commands. This is critical for ensuring custom S.M.A.R.T command configurations work correctly. Adding tests would validate the full flow from config parsing through command execution.

  • [ ] Create collector/pkg/collector/collector_test.go with integration tests that load override_commands.yaml and override_device_commands.yaml
  • [ ] Mock the shell execution using collector/pkg/common/shell/mock/mock_shell.go to verify correct commands are invoked
  • [ ] Add tests for both device-level and global command overrides
  • [ ] Test that invalid command configurations (collector/pkg/config/testdata/invalid_commands_*.yaml) properly fail with meaningful errors

Implement unit tests for collector/pkg/config/config.go parsing and validation

While config/testdata contains multiple YAML test fixtures (allow_listed_devices_present.yaml, device_type_comma.yaml, ignore_device.yaml, raid_device.yaml, simple_device.yaml), there is no corresponding collector/pkg/config/config_test.go file. This leaves critical configuration parsing logic untested. Adding comprehensive tests would ensure all YAML configurations are correctly parsed and validated before the collector runs.

  • [ ] Create collector/pkg/config/config_test.go with test functions for each YAML fixture
  • [ ] Add tests that validate allow-listing/ignore-listing device configuration parsing
  • [ ] Test RAID device detection and comma-separated device_type parsing
  • [ ] Test error cases with invalid_commands_*.yaml fixtures to ensure proper validation failures
  • [ ] Verify configuration merging behavior (global vs. device-specific overrides)

🌿Good first issues

  • Add unit tests for collector/pkg/collector/selftest.go: currently no corresponding _test.go file exists, so adding self-test execution tests would improve coverage and is isolated work
  • Document S.M.A.R.T attribute threshold overrides in collector/pkg/config/: the config structure exists but lacks usage examples or inline docs explaining how to customize per-drive-model thresholds
  • Implement shell command timeout handling in collector/pkg/common/shell/local_shell.go: smartctl commands may hang on unresponsive drives; adding configurable timeouts with tests in local_shell_test.go is a contained improvement

Top contributors

Click to expand

📝Recent commits

Click to expand
  • 5928d68 — (v0.9.2) Automated packaging of release by Packagr (packagr-io[bot])
  • df8dedb — Fix fallback WWN in UUID migration to avoid ghost disks (#1002) (dshoreman)
  • 309c7fc — Fix typo in Docker usage instructions (#996) (jonahmorg13)
  • a5a7b72 — (v0.9.1) Automated packaging of release by Packagr (packagr-io[bot])
  • d77976c — fix(collector): keep only devices with non-nil ScrutinyUUID (#991) (jreiml)
  • 9e61680 — (v0.9.0) Automated packaging of release by Packagr (packagr-io[bot])
  • ac6c068 — Install git from default repo in release workflow (#987) (kaysond)
  • ccded73 — Update image tags in example docker compose files (#973) (kaysond)
  • c3b2eb2 — Identify drives by a Scrutiny UUID instead of wwn (#960) (kaysond)
  • e4c40f7 — Update issue triage template (#962) (kaysond)

🔒Security observations

  • Low · Go Toolchain Version — go.mod. The project specifies 'go 1.25' in go.mod. Go 1.25 was released in August 2025, so this is a current rather than unreleased version; the actionable concern is whether builds track a release that still receives security patches. Fix: check the project's CI workflow files (.github/workflows/ci.yaml) to confirm the tested Go version and keep it on a supported, patched release.
  • Medium · Potential SQL Injection via gorm.io — collector/pkg/detect, collector/pkg/collector. The project uses gorm.io/gorm v1.31.1 for database operations. While GORM provides parameterized query support, the codebase structure suggests raw SQL queries or user input handling in the collector and detection modules. Without code review, SQL injection risks exist if input validation is insufficient. Fix: Ensure all database queries use GORM's parameterized methods (Where with args, not string concatenation). Implement input validation for device paths and command parameters. Review collector/pkg/detect/detect.go and related files for unsafe query construction.
  • Medium · Shell Command Injection Risk — collector/pkg/common/shell/local_shell.go, collector/pkg/detect/devices_linux.go. The project implements a shell abstraction layer (collector/pkg/common/shell/) that executes external commands (likely smartctl). This is a high-risk area for command injection attacks if device names or other parameters are not properly escaped. Fix: Ensure all shell command construction uses proper escaping (e.g., shellescape). Implement strict whitelisting for device names and parameters. Never directly interpolate user/config input into shell commands. Use Go's os/exec package with argument arrays, not shell string concatenation.
  • Medium · Configuration File Security — collector/pkg/config/config.go, collector/pkg/config/testdata/override_commands.yaml. Configuration files in collector/pkg/config/testdata/ include override_commands.yaml which allows arbitrary command execution. If user-provided config files are loaded without validation, this enables Remote Code Execution (RCE). Fix: Validate and sanitize all configuration file inputs. Restrict command overrides to a whitelist of safe commands. Ensure config files have strict file permissions (e.g., 0600). Log all configuration overrides. Consider disabling custom command execution in production environments.
  • Medium · Dependency Vulnerabilities - Temporary Fix Needed — go.mod - github.com/nicholas-fedor/shoutrrr. The project uses github.com/nicholas-fedor/shoutrrr v0.13.2 (a fork/temporary fix) instead of the official caronc/shoutrrr. This suggests a known issue was patched temporarily. Additionally, several transitive dependencies may have known CVEs that aren't tracked. Fix: Verify the reason for the shoutrrr fork and plan to merge changes upstream. Run go mod verify and use tools like nancy or Trivy to scan for known CVEs in all dependencies. Update to the latest versions of all direct dependencies.
  • Medium · Insufficient Input Validation for Device Detection — collector/pkg/detect/devices_linux.go, collector/pkg/detect/devices_darwin.go, collector/pkg/detect/devices_windows.go. Device detection modules (collector/pkg/detect/devices_*.go) parse output from external tools (smartctl, lsblk, etc.). Malformed or malicious output could cause parsing errors or exploits if not properly validated. Fix: Implement strict schema validation for all parsed device data. Add bounds checking for numeric values. Use typed JSON unmarshaling with strict error handling. Log and reject malformed input.
  • Low · Potential XSS in Web Frontend — webapp/frontend/src/ (not fully visible in provided structure). The project includes a web frontend (webapp/frontend/) which displays SMART data. If the frontend renders SMART attributes without proper escaping, stored XSS vulnerabilities could exist if device names or attributes contain malicious content. Fix: Ensure the frontend framework (likely Vue/React) uses automatic HTML escaping. Sanitize all user- and device-supplied strings (device names, attribute text) before rendering.

LLM-derived; treat as a starting point, not a security audit.


Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
