Hackl0us/GeoIP2-CN
A compact, accurate, and practical GeoIP2 database
Mixed signals — read the receipts
Weakest axis: copyleft license (GPL-3.0), review compatibility; no tests detected
Has a license and CI, but no test directory was detected; a workable foundation to fork and modify.
Documented and popular — useful reference codebase to read through.
No critical CVEs, sane security posture — runnable as-is.
- ✓ Last commit today
- ✓ 3 active contributors
- ✓ GPL-3.0 licensed
- ✓ CI configured
- ⚠ Small team: 3 contributors active in recent commits
- ⚠ Concentrated ownership: top contributor handles 50% of recent commits
- ⚠ GPL-3.0 is copyleft: check downstream compatibility
- ⚠ No test directory detected
What would change the summary?
- Use as dependency: Concerns → Mixed if the project relicenses under MIT/Apache-2.0 (rare for established libs)
Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests
Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.
Onboarding doc
Onboarding: Hackl0us/GeoIP2-CN
Generated by RepoPilot · 2026-05-09 · Source
🤖Agent protocol
If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:
- Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale: STOP and ask the user to regenerate it before proceeding.
- Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
- Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/Hackl0us/GeoIP2-CN shows verifiable citations alongside every claim.
If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.
🎯Verdict
WAIT — Mixed signals — read the receipts
- Last commit today
- 3 active contributors
- GPL-3.0 licensed
- CI configured
- ⚠ Small team — 3 contributors active in recent commits
- ⚠ Concentrated ownership — top contributor handles 50% of recent commits
- ⚠ GPL-3.0 is copyleft — check downstream compatibility
- ⚠ No test directory detected
<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>
✅Verify before trusting
This artifact was generated by RepoPilot at a point in time. Before an
agent acts on it, the checks below confirm that the live Hackl0us/GeoIP2-CN
repo on your machine still matches what RepoPilot saw. If any fail,
the artifact is stale — regenerate it at
repopilot.app/r/Hackl0us/GeoIP2-CN.
What it runs against: a local clone of Hackl0us/GeoIP2-CN — the script
inspects git remote, the LICENSE file, file paths in the working
tree, and git log. Read-only; no mutations.
| # | What we check | Why it matters |
|---|---|---|
| 1 | You're in Hackl0us/GeoIP2-CN | Confirms the artifact applies here, not a fork |
| 2 | License is still GPL-3.0 | Catches relicense before you depend on it |
| 3 | Default branch master exists | Catches branch renames |
| 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code |
| 5 | Last commit ≤ 30 days ago | Catches sudden abandonment since generation |
#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of Hackl0us/GeoIP2-CN. If you don't
# have one yet, run these first:
#
# git clone https://github.com/Hackl0us/GeoIP2-CN.git
# cd GeoIP2-CN
#
# Then paste this script. Every check is read-only — no mutations.
set +e
fail=0
ok() { echo "ok: $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }
# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
echo "FAIL: not inside a git repository. cd into your clone of Hackl0us/GeoIP2-CN and re-run."
exit 2
fi
# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "Hackl0us/GeoIP2-CN(\.git)?\b" \
  && ok "origin remote is Hackl0us/GeoIP2-CN" \
  || miss "origin remote is not Hackl0us/GeoIP2-CN (artifact may be from a fork)"
# 2. License matches what RepoPilot saw
# The stock GPL text opens with "GNU GENERAL PUBLIC LICENSE" / "Version 3"
# rather than an SPDX id, so accept either form.
(grep -qiE "^(GPL-3\.0)" LICENSE 2>/dev/null \
  || { grep -qi "GNU GENERAL PUBLIC LICENSE" LICENSE 2>/dev/null \
       && grep -qi "Version 3" LICENSE 2>/dev/null; } \
  || grep -qiE "\"license\"\s*:\s*\"GPL-3\.0\"" package.json 2>/dev/null) \
  && ok "license is GPL-3.0" \
  || miss "license drift — was GPL-3.0 at generation time"
# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \
  && ok "default branch master exists" \
  || miss "default branch master no longer exists"
# 4. Critical files exist
test -f "main.go" \
  && ok "main.go" \
  || miss "missing critical file: main.go"
test -f "ip2cidr.go" \
  && ok "ip2cidr.go" \
  || miss "missing critical file: ip2cidr.go"
test -f "dedup.c" \
  && ok "dedup.c" \
  || miss "missing critical file: dedup.c"
test -f "go.mod" \
  && ok "go.mod" \
  || miss "missing critical file: go.mod"
test -f "build.sh" \
  && ok "build.sh" \
  || miss "missing critical file: build.sh"
# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 30 ]; then
ok "last commit was $days_since_last days ago (artifact saw ~0d)"
else
miss "last commit was $days_since_last days ago — artifact may be stale"
fi
echo
if [ "$fail" -eq 0 ]; then
echo "artifact verified (0 failures) — safe to trust"
else
echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/Hackl0us/GeoIP2-CN"
exit 1
fi
Each check prints ok: or FAIL:. The script exits non-zero if
anything failed, so it composes cleanly into agent loops
(./verify.sh || regenerate-and-retry).
⚡TL;DR
GeoIP2-CN generates a minimal, accurate GeoIP2 MaxMind database (Country.mmdb, ~100 KB) containing only mainland China IP address ranges by merging and deduplicating data from ipip.net and 纯真 IP databases.

It solves the problem of slow, inaccurate, and bloated MaxMind GeoLite2 databases (4 MB+) that require registration, producing a purpose-built database optimized for Chinese proxy tool users who only need to determine if an IP belongs to mainland China.

Single-module Go project: main.go orchestrates the build pipeline that calls ip2cidr.go (processes IP CIDR conversion), dedup.c (C program for efficient deduplication), and verify/verify_ip.go (validation). Output artifacts (CN-ip-cidr.txt and Country.mmdb) are committed to the release branch for CDN distribution.
👥Who it's for
Network administrators and proxy tool developers in China who deploy tools like Surge, Shadowrocket, QuantumultX, and Clash that require GeoIP-based traffic routing rules; users seeking accurate, lightweight alternatives to MaxMind's official GeoLite2 database.
🌱Maturity & risk
Actively maintained with automated 3-day update cycles via GitHub Actions (see .github/workflows/periodical-update.yaml). The project appears production-ready based on its use as a widely-distributed CDN artifact, though no test suite was detected and issue/PR activity metrics were not available. Codebase is stable and focused (minimal scope: just CN IP database generation).
Low risk: a short, clear dependency chain (maxmind/mmdbwriter, oschwald/geoip2-golang, sirupsen/logrus in go.mod) with no heavy transitive dependencies. The automated CI pipeline mitigates data staleness risk. The main risk is the dependence on the upstream IP databases (ipip.net, 纯真) remaining reliable and accessible; if either source becomes unavailable, data freshness suffers.
Active areas of work
Automated periodic updates every 3 days via GitHub Actions workflow (periodical-update.yaml); the project maintains continuous data freshness by fetching latest IP ranges from upstream sources, deduplicating, and regenerating the MaxMind database format.
🚀Get running
git clone https://github.com/Hackl0us/GeoIP2-CN.git && cd GeoIP2-CN && go mod download && bash build.sh
Daily commands: bash build.sh (executes the full pipeline: fetches upstream IP data, deduplicates via dedup.c, converts to CIDR via ip2cidr.go, generates Country.mmdb via maxmind/mmdbwriter, validates with verify/verify_ip.go)
🗺️Map of the codebase
- main.go: Entry point that orchestrates GeoIP2 database generation from multiple IP data sources.
- ip2cidr.go: Core logic converting IP address ranges to CIDR notation and deduplication for database building.
- dedup.c: C implementation for high-performance IP range deduplication, critical for data processing pipeline.
- go.mod: Declares maxmind/mmdbwriter dependency for generating GeoIP2-format databases and go version requirements.
- build.sh: Automated build and data fetch script that orchestrates compilation, data acquisition, and database generation.
- .github/workflows/periodical-update.yaml: GitHub Actions workflow enabling 3-day automatic data updates, keeping GeoIP2 database current without manual intervention.
- verify/verify_ip.go: Validation tool for testing generated GeoIP2 database accuracy against known IP addresses.
🛠️How to make changes
Add Support for a New IP Data Source
- Add new data fetch URL and parsing logic in main.go within the data sourcing section (alongside ipip.net and 纯真 fetches). (main.go)
- Parse the new source format into IP range objects and append to the master list before deduplication. (main.go)
- Invoke the existing dedup.c pipeline or extend ip2cidr.go to handle the new ranges. (ip2cidr.go)
- Test against verify/verify_ip.go to ensure the new source improves accuracy on known IPs. (verify/verify_ip.go)
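The parsing step above could look roughly like the following. This is a minimal sketch, not code from the repository: the `ipRange` type, the `parseRanges` function, and the input format (one `START END` pair of IPv4 addresses per line, with `#` comments) are all assumptions made for illustration, since the real formats of the upstream sources are not shown here.

```go
package main

import (
	"bufio"
	"encoding/binary"
	"fmt"
	"net"
	"strings"
)

// ipRange is a hypothetical inclusive IPv4 range; the repo's own type may differ.
type ipRange struct{ Start, End uint32 }

// ipToU32 converts a dotted IPv4 string to its 32-bit big-endian value.
func ipToU32(s string) (uint32, error) {
	ip := net.ParseIP(s).To4()
	if ip == nil {
		return 0, fmt.Errorf("not an IPv4 address: %q", s)
	}
	return binary.BigEndian.Uint32(ip), nil
}

// parseRanges reads lines of the assumed form "START END",
// skipping blank lines and lines that begin with '#'.
func parseRanges(input string) ([]ipRange, error) {
	var out []ipRange
	sc := bufio.NewScanner(strings.NewReader(input))
	for lineNo := 1; sc.Scan(); lineNo++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		fields := strings.Fields(line)
		if len(fields) != 2 {
			return nil, fmt.Errorf("line %d: want 2 fields, got %d", lineNo, len(fields))
		}
		start, err := ipToU32(fields[0])
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", lineNo, err)
		}
		end, err := ipToU32(fields[1])
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", lineNo, err)
		}
		if end < start {
			return nil, fmt.Errorf("line %d: range end precedes start", lineNo)
		}
		out = append(out, ipRange{Start: start, End: end})
	}
	return out, sc.Err()
}

func main() {
	// Placeholder documentation addresses (RFC 5737), not real source data.
	ranges, err := parseRanges("# sample\n192.0.2.0 192.0.2.255\n198.51.100.0 198.51.100.127\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("parsed %d ranges\n", len(ranges))
}
```

Rejecting malformed lines outright, rather than skipping them, is a deliberate choice in this sketch: with no manual review in the update cycle, a hard failure is easier to notice than a silently shrinking range list.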
Modify Build Frequency or Add Build Hooks
- Edit the cron schedule in the GitHub Actions workflow (e.g., change '3 days' interval to daily). (.github/workflows/periodical-update.yaml)
- Add pre- or post-build steps (e.g., linting, compression, S3 upload) by extending the workflow YAML. (.github/workflows/periodical-update.yaml)
- Update build.sh if additional shell commands, environment setup, or notification steps are needed. (build.sh)
Enhance Deduplication or IP Range Logic
- Optimize or modify the C dedup algorithm in dedup.c (e.g., add new conflict resolution strategy or compression). (dedup.c)
- Recompile dedup via build.sh and test the performance impact on real IP datasets. (build.sh)
- If new CIDR generation logic is needed, add helper functions to ip2cidr.go for range merging or filtering. (ip2cidr.go)
- Run verify/verify_ip.go to confirm accuracy on test IP vectors. (verify/verify_ip.go)
Add Validation or Test Coverage
- Extend verify/verify_ip.go to test additional IP ranges, edge cases, or regions. (verify/verify_ip.go)
- Create test datasets (e.g., known CN vs non-CN IPs) and integrate them into the workflow. (.github/workflows/periodical-update.yaml)
- Update build.sh to run verification after database generation, with failure notification on mismatch. (build.sh)
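One way to build such a test dataset check is to compare expected membership against the text CIDR list. The sketch below is an assumption-laden illustration, not the logic of verify/verify_ip.go: it assumes CN-ip-cidr.txt holds one CIDR per line, and the names `loadCIDRs`, `contains`, and `spotCheck` are hypothetical. It uses only the standard library, so it checks the text list rather than Country.mmdb.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"sort"
	"strings"
)

// loadCIDRs parses one CIDR per line (the assumed layout of CN-ip-cidr.txt).
func loadCIDRs(text string) ([]*net.IPNet, error) {
	var nets []*net.IPNet
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		_, n, err := net.ParseCIDR(line)
		if err != nil {
			return nil, err
		}
		nets = append(nets, n)
	}
	return nets, sc.Err()
}

// contains reports whether ip falls inside any network (linear scan,
// adequate for a handful of spot checks).
func contains(nets []*net.IPNet, ip string) bool {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return false
	}
	for _, n := range nets {
		if n.Contains(parsed) {
			return true
		}
	}
	return false
}

// spotCheck returns, sorted, the IPs whose membership differs from expect.
func spotCheck(nets []*net.IPNet, expect map[string]bool) []string {
	var bad []string
	for ip, want := range expect {
		if contains(nets, ip) != want {
			bad = append(bad, ip)
		}
	}
	sort.Strings(bad)
	return bad
}

func main() {
	// Placeholder documentation ranges (RFC 5737), not real CN allocations.
	nets, err := loadCIDRs("192.0.2.0/24\n198.51.100.0/25\n")
	if err != nil {
		panic(err)
	}
	mismatches := spotCheck(nets, map[string]bool{
		"192.0.2.10":     true,  // expected inside the list
		"198.51.100.200": false, // expected outside the list
	})
	fmt.Println("mismatches:", mismatches)
}
```

A build script could treat a non-empty mismatch list as a failure, which gives the "failure notification on mismatch" step a concrete exit condition.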
🔧Why these technologies
- Go (main.go, ip2cidr.go, verify_ip.go) — Cross-platform, fast startup, minimal dependencies; ideal for data processing pipelines and CLI tools without requiring a runtime.
- C (dedup.c) — High-performance deduplication of large IP range datasets; eliminates memory overhead and achieves near-native speed for millions of ranges.
- MaxMind mmdbwriter — Produces standardized GeoIP2 binary format (.mmdb) compatible with existing geolocation libraries; enables drop-in replacement for MaxMind GeoLite2.
- GitHub Actions + GitHub Pages/CDN — Fully automated 3-day refresh cycle with zero infrastructure cost; leverages CDN for global distribution and easy download access.
- CIDR + IP range merging — Compresses thousands of IP entries into compact CIDR blocks (~100 KB database) for fast lookup and minimal memory footprint in production.
⚖️Trade-offs already made
- China-only GeoIP2 database instead of global like GeoLite2
  - Why: Eliminates need for registration, vastly reduces database size, and improves accuracy for the primary use case (CN routing rules).
  - Consequence: Cannot lookup non-CN IPs; users relying on global geolocation must still use MaxMind GeoLite2 or other sources for out-of-CN queries.
- Reliance on third-party IP sources (ipip.net, 纯真) rather than official APNIC/ARIN registries
  - Why: Third-party sources are more frequently updated, better curated for China accuracy, and easier to automate.
  - Consequence: Potential lag behind registry changes; dependency on uptime of external data sources; accuracy as good as weakest merged source.
- Fully automated 3-day update cycle vs. manual updates
  - Why: Removes operational overhead, ensures database freshness, and enables subscribers to always use latest CN IP allocations.
  - Consequence: No manual review of changes; if a source introduces bad data, it will propagate into releases until verification catches it.
- Pure Go + C implementation vs. scripting (Python, Ruby, etc.)
  - Why: Faster execution, no external runtime dependency, and compiled binaries can run in restricted CI/CD environments.
  - Consequence: Steeper learning curve for contributors unfamiliar with Go/C; slightly more verbose code for simple data transformations.
🚫Non-goals (don't propose these)
- Does not support global (non-CN) IP geolocation; use MaxMind GeoLite2 or other sources for worldwide coverage.
- Does not provide real-time IP geolocation API; outputs static .mmdb file for offline use.
- Does not validate upstream data source correctness; assumes ipip.net and 纯真 provide accurate CN IP allocations.
- Does not offer multi-country or region-specific databases; limited to Mainland China (CN) classification.
🪤Traps & gotchas
build.sh likely requires network access to the ipip.net and 纯真 IP database sources (URLs and credentials are not visible in the provided files and may be supplied as environment variables). The dedup.c program must compile on the CI system; verify that a C toolchain is available. The output artifacts (Country.mmdb, CN-ip-cidr.txt) are pushed to the 'release' branch, not the default master branch, so verify your git workflow targets the correct branch. No explicit tests are visible in the file list; validation relies solely on verify/verify_ip.go.
💡Concepts to learn
- CIDR (Classless Inter-Domain Routing) notation — ip2cidr.go converts raw IP ranges into CIDR blocks (e.g., 1.0.1.0/24) which are the standard input format for MaxMind mmdbwriter and enable efficient IP range matching in proxy tools
- MaxMind mmdb binary database format — Country.mmdb uses MaxMind's openly specified binary format; understanding its structure (a memory-mappable binary search tree) explains why the project exists as an alternative to plain-text lists and why dedup.c optimization matters
- GeoIP geolocation database — Proxy tools use GeoIP databases to determine geographic location of IP addresses and make routing decisions (direct vs. proxy); this project creates a specialized CN-only version for Chinese users
- GitHub Actions CI/CD automation — periodical-update.yaml orchestrates the entire data refresh cycle every 3 days without manual intervention; understanding this workflow is essential to maintaining the project's core value proposition (automatic freshness)
- CDN-distributed static artifacts — The project leverages jsDelivr CDN to distribute Country.mmdb and CN-ip-cidr.txt globally; understanding edge caching behavior and CDN purge timing is important for deployment guarantees
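To make the CIDR concept concrete, here is a minimal sketch of covering an inclusive IPv4 range with CIDR blocks. It is a standard greedy approach written for illustration and is not taken from ip2cidr.go; the function names `rangeToCIDRs`, `mustU32`, and `u32ToIP` are assumptions.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
	"net"
)

// mustU32 converts a dotted IPv4 string to its 32-bit value; it panics on bad input.
func mustU32(s string) uint32 {
	ip := net.ParseIP(s).To4()
	if ip == nil {
		panic("not an IPv4 address: " + s)
	}
	return binary.BigEndian.Uint32(ip)
}

// u32ToIP converts a 32-bit value back to a 4-byte net.IP.
func u32ToIP(v uint32) net.IP {
	ip := make(net.IP, 4)
	binary.BigEndian.PutUint32(ip, v)
	return ip
}

// rangeToCIDRs covers the inclusive range [start, end] with CIDR blocks,
// greedily taking the largest aligned block that still fits.
func rangeToCIDRs(start, end uint32) []string {
	var out []string
	s, e := uint64(start), uint64(end) // 64-bit so s can step past 255.255.255.255
	for s <= e {
		// Alignment of s caps the block size (address 0 is aligned to everything).
		hostBits := 32
		if s != 0 {
			hostBits = bits.TrailingZeros64(s)
		}
		// Shrink the block until it no longer overshoots the end of the range.
		for hostBits > 0 && s+(uint64(1)<<hostBits)-1 > e {
			hostBits--
		}
		out = append(out, fmt.Sprintf("%s/%d", u32ToIP(uint32(s)), 32-hostBits))
		s += uint64(1) << hostBits
	}
	return out
}

func main() {
	for _, c := range rangeToCIDRs(mustU32("1.0.1.0"), mustU32("1.0.3.255")) {
		fmt.Println(c)
	}
}
```

For the range 1.0.1.0 to 1.0.3.255 this yields 1.0.1.0/24 followed by 1.0.2.0/23: the start address is only aligned to a /24, so a single /22 cannot cover the range.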
🔗Related repos
- maxmind/geoipupdate: Official MaxMind tool for downloading and updating GeoIP2 databases; this project reimplements database generation as a specialized alternative for CN-only use
- oschwald/geoip2-golang: Go library used to query GeoIP2 databases; verify/verify_ip.go likely depends on this to test the generated Country.mmdb
- v2fly/geoip: Similar project that generates GeoIP databases for proxy tools (e.g., v2ray); common alternative approach to the same problem domain
- Loyalsoldier/geoip: Another popular GeoIP database generator for Chinese proxy tools; maintains mmdb, dat, and text formats for Clash, Surge, v2ray compatibility
- maxmind/mmdbwriter: Direct dependency for writing MaxMind binary database format; core library enabling this project to generate Country.mmdb output
🪄PR ideas
To work on one of these in Claude Code or Cursor, paste:
Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.
Add integration tests for MMDB generation and query accuracy in verify/verify_ip.go
The repo has a verify/ directory with verify_ip.go but no actual test suite. Given that accuracy is the project's core selling point, adding comprehensive tests to validate that the generated MMDB correctly identifies CN IPs and rejects non-CN IPs is critical. This should test against known IP ranges from ipip.net and 纯真 databases.
- [ ] Create verify/verify_ip_test.go with test cases for known CN IP ranges (e.g., major ISPs, cloud providers)
- [ ] Add test cases for known non-CN IPs that should NOT match
- [ ] Test edge cases around IP range boundaries to catch off-by-one errors in dedup.c
- [ ] Run tests in periodical-update.yaml workflow to catch regressions in generated databases
Implement deduplication and overlap detection tests for dedup.c
The dedup.c file handles merging and deduplicating IP ranges from multiple sources (ipip.net, 纯真), but there are no tests validating correctness. Bugs here directly impact accuracy. Add tests to verify ranges are properly merged, overlaps eliminated, and no IP segments are lost or corrupted.
- [ ] Create a test suite (C or Go) that validates dedup.c logic with sample CIDR inputs
- [ ] Test cases: adjacent ranges should merge, overlapping ranges should deduplicate, non-overlapping ranges should remain separate
- [ ] Validate output CIDR blocks are in canonical form and properly sorted
- [ ] Add benchmarks to ensure dedup performance remains acceptable as dataset grows
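The expected behaviours in the checklist above (adjacent ranges merge, overlapping ranges collapse, disjoint ranges stay separate) can be pinned down with a small reference implementation to test against. This sketch is not the algorithm in dedup.c, which was not inspected here; `ipRange` and `mergeRanges` are hypothetical names, and ranges are modelled as inclusive 32-bit integer pairs.

```go
package main

import (
	"fmt"
	"sort"
)

// ipRange is a hypothetical inclusive range over 32-bit IPv4 values.
type ipRange struct{ Start, End uint32 }

// mergeRanges returns sorted, non-overlapping ranges covering the same
// addresses as the input. Overlapping and directly adjacent ranges are joined.
func mergeRanges(in []ipRange) []ipRange {
	if len(in) == 0 {
		return nil
	}
	sorted := append([]ipRange(nil), in...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Start < sorted[j].Start })

	out := []ipRange{sorted[0]}
	for _, r := range sorted[1:] {
		last := &out[len(out)-1]
		// uint64 arithmetic avoids overflow when last.End is the maximum address.
		if uint64(r.Start) <= uint64(last.End)+1 {
			if r.End > last.End {
				last.End = r.End
			}
			continue
		}
		out = append(out, r)
	}
	return out
}

func main() {
	merged := mergeRanges([]ipRange{
		{Start: 50, End: 60}, // disjoint from the rest
		{Start: 21, End: 30}, // adjacent to 10-20
		{Start: 10, End: 20},
		{Start: 25, End: 40}, // overlaps 21-30
	})
	for _, r := range merged {
		fmt.Printf("%d-%d\n", r.Start, r.End)
	}
}
```

A test suite for the C program could feed the same inputs to both implementations and compare the outputs, which catches lost or duplicated segments without hard-coding expected results for large datasets.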
Add GitHub Actions workflow to validate MMDB output against upstream data sources
The periodical-update.yaml workflow generates the database but has no validation step. Add a dedicated workflow that downloads the latest ipip.net and 纯真 databases, generates the CN MMDB, and runs spot-checks against known IPs to catch upstream data quality issues before publishing.
- [ ] Create .github/workflows/validate-mmdb.yaml that runs on periodical-update completion
- [ ] Step: Download both ipip.net and 纯真 source databases (or their test datasets)
- [ ] Step: Run main.go to generate MMDB and execute verify/verify_ip.go tests
- [ ] Step: Add sample IP validation (e.g., test 10+ major CN ISP IPs and 5+ foreign IPs) to catch regressions
- [ ] Optional: Add coverage reporting or comparison metrics vs previous generation
🌿Good first issues
- Add Go unit tests for ip2cidr.go to verify CIDR conversion logic handles edge cases (overlapping ranges, single IPs, empty input); currently validation is manual-only
- Document the exact format and URL sources for ipip.net and 纯真 IP databases in README.md's Technical Details section, including any authentication requirements or rate limits
- Create a GitHub Actions pre-flight check job that validates Country.mmdb output against a set of known mainland China IPs (e.g., from Alibaba Cloud, Tencent) to catch regressions before release
📝Recent commits
- 4eb80da: Merge pull request #42 from TechCiel/master (Hackl0us)
- c1fc379: Deep deduplication for text CIDR list (TechCiel)
- c053afa: Update README (Stash) (Hackl0us)
- a56eb02: Update README (Hackl0us)
- a97f6e8: Add detailed usages and warnings. (Hackl0us)
- c0a257c: Add GPL v3.0 LICENSE (Hackl0us)
- 4624c81: Remove artifacts folder and its contents. Artifacts have been moved to release branch permanently. (Hackl0us)
- 554dd48: Add Social Preview picture and its license. (Hackl0us)
- 9a009a8: Force push artifacts to release branch to avoid .git foler takes up too much space as commits increase. (Hackl0us)
- 10181e2: Modify Readme and repo name to avoid trademark disputes; Add usages for proxy tools; (Hackl0us)
🔒Security observations
The codebase has significant security concerns, primarily around dependencies pinned to 2019-2020 releases that may be missing later security fixes. Go 1.14 is also end-of-life. While the project's core functionality (IP database processing) appears sound, the lack of modern dependency management and supply chain security practices presents moderate to high risk. Updating all dependencies and the Go version is the first priority.
- High · Outdated dependency: mmdbwriter (go.mod). The dependency 'github.com/maxmind/mmdbwriter' is pinned to version v0.0.0-20200911190049-91ab57d2e8e9 from September 2020. This is several years old and may contain unpatched security vulnerabilities. Fix: Update to the latest version of mmdbwriter. Run 'go get -u github.com/maxmind/mmdbwriter' and test thoroughly. Review the changelog for breaking changes.
- High · Outdated dependency: logrus (go.mod). The dependency 'github.com/sirupsen/logrus' is pinned to version v1.6.0 from August 2020. This is significantly outdated and may have known security issues. Current versions are v1.9.x+. Fix: Update to the latest stable version of logrus. Run 'go get -u github.com/sirupsen/logrus' and verify no breaking changes affect the codebase.
- High · Outdated dependency: geoip2-golang (go.mod). The dependency 'github.com/oschwald/geoip2-golang' is pinned to version v1.4.0 from August 2019. This is the oldest of the three pins and may contain unpatched vulnerabilities. Fix: Update to the latest version of geoip2-golang. Check for breaking API changes and thoroughly test the update.
- Medium · Outdated Go version (go.mod). The project specifies 'go 1.14', which was released in February 2020 and is no longer supported; the Go team maintains only the two most recent major releases. Fix: Update to a currently supported Go release. Ensure code is compatible with modern Go versions and rebuild/test thoroughly.
- Medium · Automated workflow without secret management audit (.github/workflows/periodical-update.yaml). The repository contains a periodical update workflow (periodical-update.yaml) that likely performs automated data updates. The workflow file was not reviewed, so credentials or tokens could be exposed in the CI/CD configuration. Fix: Review the workflow file for hardcoded secrets. Ensure all credentials use GitHub Secrets rather than hardcoded values. Implement branch protection rules and require code reviews for workflow changes.
- Medium · Missing SBOM and supply chain security (project root / build.sh). No Software Bill of Materials (SBOM), dependency lock verification, or supply chain security measures are evident. The project downloads and integrates external IP databases without documented verification mechanisms. Fix: Implement checksum verification for downloaded databases. Generate and maintain an SBOM. Use Go's module verification (go.sum) and consider signing releases. Document data source integrity checks.
- Low · Potential data source integrity risk (dedup.c / build.sh). The project merges IP data from multiple external sources (ipip.net and 纯真 databases). No documented verification mechanism is visible for ensuring data authenticity and integrity of these external sources. Fix: Implement cryptographic verification (checksums/signatures) for all external data sources. Document the process for validating data source integrity and include it in CI/CD.
- Low · Missing code signing and release security (project release management). No evidence of signed releases, artifacts, or code signing practices. Users cannot cryptographically verify the authenticity of distributed artifacts. Fix: Implement signed releases using GPG keys. Sign artifacts and provide verification instructions. Consider using tools like cosign for artifact signing.
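The checksum verification suggested in several of the fixes above could be sketched as follows. This is an illustration under assumptions, not an existing part of the build: the function names are hypothetical, and it presumes a trusted SHA-256 digest is published somewhere for each downloaded file, which the upstream sources may or may not provide.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"strings"
)

// sha256Hex streams r through SHA-256 and returns the lowercase hex digest.
func sha256Hex(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// verifyChecksum compares the digest of r with wantHex (case-insensitive).
func verifyChecksum(r io.Reader, wantHex string) error {
	got, err := sha256Hex(r)
	if err != nil {
		return err
	}
	if !strings.EqualFold(got, strings.TrimSpace(wantHex)) {
		return fmt.Errorf("checksum mismatch: got %s, want %s", got, wantHex)
	}
	return nil
}

// verifyBytes is an in-memory convenience wrapper around verifyChecksum.
func verifyBytes(data []byte, wantHex string) error {
	return verifyChecksum(bytes.NewReader(data), wantHex)
}

// verifyFile checks a file on disk, e.g. a freshly downloaded source database.
func verifyFile(path, wantHex string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()
	return verifyChecksum(f, wantHex)
}

func main() {
	// SHA-256 of the three bytes "abc" (a well-known test vector).
	const want = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
	if err := verifyBytes([]byte("abc"), want); err != nil {
		fmt.Println("verification failed:", err)
		os.Exit(1)
	}
	fmt.Println("checksum ok")
}
```

A checksum only proves the file matches what was published; it does not establish that the published data is correct, so it complements rather than replaces the spot checks on known IPs.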
LLM-derived; treat as a starting point, not a security audit.
👉Where to read next
- Open issues — current backlog
- Recent PRs — what's actively shipping
- Source on GitHub
Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.