gh0stkey/HaE

Item: gh0stkey/HaE
Rating: 3
Author: RepoPilot

HaE - Highlighter and Extractor, Empower ethical hacker for efficient operations. 赋能白帽，高效作战！

Mixed

Single-maintainer risk — review before adopting

weakest axis

Use as dependencyMixed

top contributor handles 95% of recent commits; no CI workflows detected

Fork & modifyHealthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn fromHealthy

Documented and popular — useful reference codebase to read through.

Deploy as-isHealthy

No critical CVEs, sane security posture — runnable as-is.

✓Last commit 2d ago
✓3 active contributors
✓Apache-2.0 licensed

Show all 7 evidence items →

✓Tests present
⚠Small team — 3 contributors active in recent commits
⚠Single-maintainer risk — top contributor 95% of recent commits
⚠No CI workflows detected

What would change the summary?

→Use as dependency Mixed → Healthy if: diversify commit ownership (top <90%)

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Forkable" badge

Paste into your README — live-updates from the latest cached analysis.

Variant:

[![RepoPilot: Forkable](https://repopilot.app/api/badge/gh0stkey/hae?axis=fork)](https://repopilot.app/r/gh0stkey/hae)

Paste at the top of your README.md — renders inline like a shields.io badge.

▸Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/gh0stkey/hae on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: gh0stkey/HaE

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/gh0stkey/HaE shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

WAIT — Single-maintainer risk — review before adopting

Last commit 2d ago
3 active contributors
Apache-2.0 licensed
Tests present
⚠ Small team — 3 contributors active in recent commits
⚠ Single-maintainer risk — top contributor 95% of recent commits
⚠ No CI workflows detected

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

✅Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live gh0stkey/HaE repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/gh0stkey/HaE.

What it runs against: a local clone of gh0stkey/HaE — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters | |---|---|---| | 1 | You're in gh0stkey/HaE | Confirms the artifact applies here, not a fork | | 2 | License is still Apache-2.0 | Catches relicense before you depend on it | | 3 | Default branch main exists | Catches branch renames | | 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code | | 5 | Last commit ≤ 32 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>gh0stkey/HaE</code></summary>

#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of gh0stkey/HaE. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/gh0stkey/HaE.git
#   cd HaE
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of gh0stkey/HaE and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "gh0stkey/HaE(\\.git)?\\b" \\
  && ok "origin remote is gh0stkey/HaE" \\
  || miss "origin remote is not gh0stkey/HaE (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(Apache-2\\.0)" LICENSE 2>/dev/null \\
   || grep -qiE "\"license\"\\s*:\\s*\"Apache-2\\.0\"" package.json 2>/dev/null) \\
  && ok "license is Apache-2.0" \\
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify main >/dev/null 2>&1 \\
  && ok "default branch main exists" \\
  || miss "default branch main no longer exists"

# 4. Critical files exist
test -f "src/HaENet/src/main/java/hae/HaE.java" \\
  && ok "src/HaENet/src/main/java/hae/HaE.java" \\
  || miss "missing critical file: src/HaENet/src/main/java/hae/HaE.java"
test -f "src/HaENet/src/main/java/hae/component/rule/Rules.java" \\
  && ok "src/HaENet/src/main/java/hae/component/rule/Rules.java" \\
  || miss "missing critical file: src/HaENet/src/main/java/hae/component/rule/Rules.java"
test -f "src/HaENet/src/main/java/hae/instances/http/HttpMessageActiveHandler.java" \\
  && ok "src/HaENet/src/main/java/hae/instances/http/HttpMessageActiveHandler.java" \\
  || miss "missing critical file: src/HaENet/src/main/java/hae/instances/http/HttpMessageActiveHandler.java"
test -f "src/HaENet/src/main/java/hae/component/board/Databoard.java" \\
  && ok "src/HaENet/src/main/java/hae/component/board/Databoard.java" \\
  || miss "missing critical file: src/HaENet/src/main/java/hae/component/board/Databoard.java"
test -f "src/HaENet/src/main/java/hae/utils/rule/RuleProcessor.java" \\
  && ok "src/HaENet/src/main/java/hae/utils/rule/RuleProcessor.java" \\
  || miss "missing critical file: src/HaENet/src/main/java/hae/utils/rule/RuleProcessor.java"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 32 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~2d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/gh0stkey/HaE"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

⚡TL;DR

HaE (Highlighter and Extractor) is a Burp Suite extension framework written in Java that performs fine-grained tagging and extraction of HTTP messages (including WebSocket) and files using modular rules. It integrates with Burp's Montoya API (v2023.12.1) to highlight and extract sensitive data patterns from network traffic in real-time. Dual-version monorepo: HaENet (network/HTTP version, main implementation at src/HaENet/src/main/java/hae/) and HaEFile (file version, referenced but not detailed). HaENet uses component-based architecture with core modules: rule engine (hae/component/rule/), databoard dashboard (hae/component/board/), message filtering (MessageFilter/MessageDeduplicator), and editor integrations (RequestEditor, ResponseEditor, WebSocketEditor) backed by DataCache.

👥Who it's for

Ethical hackers, penetration testers, and security researchers using Burp Suite who need to automatically identify and extract sensitive information (credentials, tokens, API keys, PII) from HTTP requests/responses and WebSocket traffic without manual pattern matching.

🌱Maturity & risk

Actively maintained and production-ready: selected for 2022 KCon Arsenal and recognized as a GitCode G-Star Project. Java 17 target with modern Gradle build system and well-structured modular architecture suggest active development, though specific commit dates and CI/CD pipelines are not visible in provided metadata.

Low external dependency risk: only 4 core dependencies (Montoya API, SnakeYAML, Automaton, Caffeine) with no transitive explosion. Single maintainer (gh0stkey) creates potential fork/abandonment risk, but recent KCon recognition (2022) and G-Star status indicate active community interest. No visible test suite in file structure may indicate coverage gaps.

Active areas of work

No recent commit data provided in metadata, but active README maintenance and documentation structure (README_CN.md in Chinese, APPRECIATION_LIST.md) suggests ongoing updates. Project tracks both network and file analysis modes as separate implementations.

🚀Get running

cd src/HaENet
./gradlew build
# Produces JAR for Burp Suite extension installation

Then load the compiled JAR as a Burp extension via Extensions > Installed > Add.

Daily commands:

cd src/HaENet
./gradlew jar
# Output: build/libs/HaENet-all.jar (fat JAR with dependencies)
# Then import into Burp Suite Extensions tab

No dev server; runs as Burp extension directly.

🗺️Map of the codebase

src/HaENet/src/main/java/hae/HaE.java — Main entry point and plugin initialization for the Burp Suite extension; all contributors must understand the plugin lifecycle and how HaE registers with Montoya API.
src/HaENet/src/main/java/hae/component/rule/Rules.java — Core rule management system that drives the highlighting and extraction logic; essential for understanding how rules are loaded, validated, and applied to messages.
src/HaENet/src/main/java/hae/instances/http/HttpMessageActiveHandler.java — Primary HTTP message interceptor and processor; the main integration point with Burp Suite that triggers rule evaluation on all network traffic.
src/HaENet/src/main/java/hae/component/board/Databoard.java — UI component displaying extracted data and matches; critical for understanding how results are rendered and persisted in the Burp interface.
src/HaENet/src/main/java/hae/utils/rule/RuleProcessor.java — Rule parsing and compilation engine that converts YAML rules into executable matchers; foundational for regex and extraction logic.
src/HaENet/src/main/resources/rules/Rules.yml — Default rule definition file in YAML format; demonstrates the rule syntax and patterns that the framework evaluates against network data.
src/HaENet/src/main/java/hae/repository/impl/DataRepositoryImpl.java — Data persistence layer managing extracted entries and caching; critical for understanding state management and storage between sessions.

🛠️How to make changes

Add a new extraction rule

Open or edit src/HaENet/src/main/resources/rules/Rules.yml and add a new rule entry with name, pattern (regex), scope, and color fields (src/HaENet/src/main/resources/rules/Rules.yml)
Reload the Rules component in HaE UI or restart Burp Suite to parse the new rule via RuleProcessor (src/HaENet/src/main/java/hae/utils/rule/RuleProcessor.java)
The rule will automatically be evaluated by HttpMessageActiveHandler on all future HTTP messages (src/HaENet/src/main/java/hae/instances/http/HttpMessageActiveHandler.java)

Add a new message filter or UI feature to Databoard

Create a new filter method in MessageFilter.java or add UI controls in Databoard.java's constructor (src/HaENet/src/main/java/hae/component/board/Databoard.java)
Implement filtering logic that updates the MessageTableModel to show/hide rows based on criteria (src/HaENet/src/main/java/hae/component/board/message/MessageFilter.java)
Register any new UI events in MessageTableModel.getValueAt() or the table's selection listener (src/HaENet/src/main/java/hae/component/board/message/MessageTableModel.java)

Add support for a new message type or protocol handler

Create a new handler class extending Burp's handler interface (e.g., ProxyMessageHandler or WebSocketMessageHandler pattern) (src/HaENet/src/main/java/hae/instances/http/HttpMessageActiveHandler.java)
Implement the message interception logic and call MessageProcessor.process() to apply rule matching (src/HaENet/src/main/java/hae/instances/http/utils/MessageProcessor.java)
Register the new handler in HandlerRegistry.registerHandlers() so it is loaded by HaE.initialize() (src/HaENet/src/main/java/hae/service/HandlerRegistry.java)

Persist and retrieve custom extraction data

Ensure MessageEntry objects are created with all needed fields in MessageProcessor or the rule matcher (src/HaENet/src/main/java/hae/component/board/message/MessageEntry.java)
Call DataRepository methods (save, find, delete) from the UI or handler to persist entries using DataRepositoryImpl (src/HaENet/src/main/java/hae/repository/impl/DataRepositoryImpl.java)
Configure persistence settings (file path, cache size, TTL) in the Config UI and ConfigLoader (src/HaENet/src/main/java/hae/component/Config.java)

🔧Why these technologies

Burp Suite Montoya API (2023.12.1) — Enables deep integration with Burp Suite as a commercial plugin; provides modern async message interception and UI integration hooks
YAML (SnakeYAML 2.0) for rule definitions — Human-readable, minimal syntax for security researchers to define regex patterns and extraction rules without code changes
Caffeine caching (3. — undefined

🪤Traps & gotchas

Montoya API version lock: build.gradle pins montoya-api:2023.12.1 — incompatible with older/newer Burp versions; update requires testing against specific Burp Suite version. 2. Fat JAR dependency merging: build.gradle duplicatesStrategy=EXCLUDE may silently drop conflicting transitive classes; test carefully if adding dependencies. 3. No test suite visible: no src/test/ in provided structure; ensure manual testing before deployment. 4. Rule format expectations: Rules.java likely expects YAML config format (SnakeYAML imported) but example configs not provided in file list — check README_CN.md for config schema. 5. WebSocket editor: WebSocketEditor.java exists but may have incomplete Montoya API support in older extension versions.

🏗️Architecture

💡Concepts to learn

Montoya API extension hooks — HaE is a Burp extension; understanding how HttpMessageActiveHandler integrates with Montoya's event-driven model is essential for modifying interception behavior
Automaton-based regex engine — Dependency dk.brics.automaton provides DFA-based pattern matching used for rule compilation; more efficient than Java regex for large-scale traffic filtering
Message deduplication with caching — MessageDeduplicator and Caffeine cache prevent re-processing identical request/response pairs; critical optimization for Burp traffic flowing through databoard continuously
Fine-grained data extraction pipelines — HaE's core design applies modular rules sequentially (filter → deduplicate → render); understanding composition patterns helps extend extraction capabilities
YAML rule serialization — Rules are stored as YAML via SnakeYAML; users define extraction patterns in config files, so understanding YAML schema is required for rule authoring
TableModel/MVC for Swing databoards — Databoard uses MessageTableModel (MVC pattern) to render extracted findings in GUI; modifications to display logic require understanding Swing component binding
Scope filtering in network traffic — ScopedDataboardDialog limits rule matching to in-scope targets; critical for focusing extraction on target application only, reducing noise in burp logs

PortSwigger/burp-extensions-montoya-api — Official Burp Montoya API documentation and examples; required reference for understanding extension hooks used throughout HaE
OWASP/O-Saft — Alternative OWASP Burp extension for security testing; overlapping use case of extracting sensitive patterns from traffic
PortSwigger/turbo-intruder — Complementary Burp extension for advanced HTTP manipulation; users of HaE often pair with this for automated exploitation pipelines
projectdiscovery/nuclei — External companion tool for YAML-based vulnerability scanning; HaE rules could be adapted to nuclei format for CI/CD integration
gh0stkey/HaEFile — File analysis sibling of HaENet in same repo; processes extracted data from HaE for offline forensics

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add unit tests for RegularMatcher and MessageProcessor utilities

The HTTP message processing logic in src/HaENet/src/main/java/hae/instances/http/utils/ handles critical regex matching and message processing for rule extraction. These utilities lack unit tests, making it difficult to verify correctness of pattern matching behavior and prevent regressions when rules are updated. This is especially important for a security tool where false positives/negatives are problematic.

[ ] Create src/HaENet/src/test/java/hae/instances/http/utils/RegularMatcherTest.java with tests for various regex patterns (basic, complex, edge cases)
[ ] Create src/HaENet/src/test/java/hae/instances/http/utils/MessageProcessorTest.java with tests for message extraction, deduplication, and filtering logic
[ ] Add test configuration to build.gradle (testImplementation 'junit:junit:4.13.2' and 'org.mockito:mockito-core:5.x')
[ ] Test edge cases like malformed input, special characters, and empty messages

Add GitHub Actions CI workflow for Java build validation

The project has build.gradle and Java 17 target compatibility defined, but lacks automated CI to verify builds on commits. This means contributors could inadvertently break builds for the HaENet Burp extension or create compilation errors that go undetected until pull request review, slowing development cycles.

[ ] Create .github/workflows/build.yml with Java 17 setup and gradle build validation
[ ] Add test execution step: 'gradle test' to run any unit tests present
[ ] Add jar artifact build step to verify the extension builds correctly: 'gradle jar'
[ ] Configure workflow to run on push to main and all pull requests

Add integration tests for Rule evaluation against real HTTP messages

The Rule.java and Tester.java components in src/HaENet/src/main/java/hae/component/rule/ are core to the extraction functionality, but lack integration tests demonstrating rule application against realistic HTTP request/response payloads. This makes it difficult to verify that the rule system correctly extracts data per the configuration, and prevents regression detection when the rule engine is modified.

[ ] Create src/HaENet/src/test/java/hae/component/rule/RuleIntegrationTest.java with sample HTTP messages and rules
[ ] Add test cases for: basic keyword extraction, regex pattern matching, multi-line response parsing, WebSocket message handling
[ ] Create test fixture YAML rule files in src/HaENet/src/test/resources/rules/ with examples (credential patterns, API keys, tokens)
[ ] Mock Burp Montoya API components needed for Rule.execute() testing

🌿Good first issues

Add JUnit 5 test suite for src/HaENet/src/main/java/hae/component/rule/Rule.java and Tester.java covering regex and automaton pattern matching edge cases (currently no test/ directory visible)
Document rule YAML schema with examples in src/HaENet/README.md — currently users must infer config format from Rules.java implementation
Implement unit tests for MessageDeduplicator.java to verify duplicate detection logic and cache eviction under high-volume traffic scenarios

⭐Top contributors

Click to expand

@gh0stkey — 95 commits
@chen — 3 commits
@0Chencc — 2 commits

📝Recent commits

Click to expand

b8a27c9 — Create LICENSE (gh0stkey)
c948b7b — Delete src/HaENet/LICENSE (gh0stkey)
fe78fe4 — Fix rules library URL in README_CN.md (gh0stkey)
24fc6d1 — Update README.md (gh0stkey)
03a5d23 — Version: 5.2.2 Update (gh0stkey)
7ab4bd9 — Version: 5.2.2 Update (gh0stkey)
ef7af9a — Version: 5.2.2 Update (gh0stkey)
3614499 — Version: 5.2.1 Update (gh0stkey)
0ac4abb — Version: 5.2 Update (gh0stkey)
7d8e49f — Update README.md (gh0stkey)

🔒Security observations

HaE is a Burp Suite extension for HTTP/WebSocket message highlighting and extraction. The codebase demonstrates reasonable security practices with a modular architecture and centralized validation service. However, there are concerns regarding dependency management, YAML parsing safety, and potential data exposure through caching mechanisms. The project would benefit from dependency verification mechanisms, explicit input validation documentation, and a security-focused code review of message processing pipelines. The fat JAR bundling strategy complicates vulnerability tracking. Overall security posture is moderate but could be significantly improved with addressed recommendations.

Medium · Outdated Burp Montoya API dependency — src/HaENet/build.gradle. The project uses 'net.portswigger.burp.extensions:montoya-api:2023.12.1' which is over a year old. Newer versions may contain security patches and bug fixes that are absent in this version. Fix: Update to the latest stable version of montoya-api. Check the official Burp Suite documentation for the latest available version and update accordingly.
Low · Potential unsafe YAML parsing with SnakeYAML — src/HaENet/build.gradle, src/main/resources/rules/Rules.yml, src/HaENet/src/main/java/hae/utils/ConfigLoader.java. The project includes 'org.yaml:snakeyaml:2.0' for YAML configuration parsing. While version 2.0 is relatively recent, YAML parsing can be vulnerable to code execution if untrusted YAML is loaded without proper configuration. The file 'src/main/resources/rules/Rules.yml' suggests user-supplied or external YAML content may be parsed. Fix: Ensure SnakeYAML is configured with safe settings: use SafeConstructor instead of the default Constructor, disable code execution features, and validate all YAML input before parsing. Review ConfigLoader.java to verify safe YAML handling practices.
Low · Fat JAR bundling without verification — src/HaENet/build.gradle (jar task). The build.gradle uses 'duplicatesStrategy = DuplicatesStrategy.EXCLUDE' and bundles all runtime dependencies into a single JAR. This approach can mask dependency conflicts and makes vulnerability scanning more difficult. No dependency verification or signature checking is configured. Fix: Implement dependency verification using Gradle's built-in verification-metadata.xml. Consider using 'dependencyLocking' to ensure reproducible builds. Document the fat JAR approach and implement regular dependency scanning with tools like OWASP Dependency-Check or Snyk.
Low · Missing input validation framework — src/HaENet/src/main/java/hae/service/ValidatorService.java, src/HaENet/src/main/java/hae/instances/http/utils/RegularMatcher.java. The project contains 'ValidatorService.java' but the file structure suggests it may not comprehensively validate user inputs across all modules. Particular concern for HTTP message processing (HttpMessageActiveHandler.java, RegularMatcher.java) and WebSocket handling (WebSocketMessageHandler.java) which may process untrusted data. Fix: Implement centralized input validation for all user-supplied data. Specifically validate regex patterns before compilation to prevent ReDoS (Regular Expression Denial of Service) attacks in RegularMatcher.java. Document validation rules for HTTP and WebSocket message processing.
Low · Potential sensitive data exposure in caching — src/HaENet/src/main/java/hae/cache/DataCache.java. The project implements a DataCache component (DataCache.java) using Caffeine caching library. If HTTP requests/responses containing sensitive data (credentials, tokens, PII) are cached without expiration or encryption, they could be exposed. Fix: Review caching implementation to ensure: 1) Sensitive data is not cached, 2) Cache entries have appropriate TTL (time-to-live), 3) Cache is cleared on application shutdown, 4) Access to cached data is protected. Consider using encrypted storage for cached items.

LLM-derived; treat as a starting point, not a security audit.

👉Where to read next

Open issues — current backlog
Recent PRs — what's actively shipping
Source on GitHub

Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.

gh0stkey/HaE

Embed the "Forkable" badge

Onboarding doc

Onboarding: gh0stkey/HaE

🤖Agent protocol

🎯Verdict

✅Verify before trusting

⚡TL;DR

👥Who it's for

🌱Maturity & risk

Active areas of work

🚀Get running

🗺️Map of the codebase

🛠️How to make changes

Add a new extraction rule

Add a new message filter or UI feature to Databoard

Add support for a new message type or protocol handler

Persist and retrieve custom extraction data

🔧Why these technologies

🪤Traps & gotchas

🏗️Architecture

💡Concepts to learn

🔗Related repos

🪄PR ideas

Add unit tests for RegularMatcher and MessageProcessor utilities

Add GitHub Actions CI workflow for Java build validation

Add integration tests for Rule evaluation against real HTTP messages

🌿Good first issues

⭐Top contributors

Top contributors

📝Recent commits

Recent commits

🔒Security observations

👉Where to read next