exadel-inc/CompreFace

Item: exadel-inc/CompreFace
Rating: 5
Author: RepoPilot

Leading free and open-source face recognition system

Healthy

Healthy across all four use cases

weakest axis

Use as dependencyHealthy

Permissive license, no critical CVEs, actively maintained — safe to depend on.

Fork & modifyHealthy

Has a license, tests, and CI — clean foundation to fork and modify.

Learn fromHealthy

Documented and popular — useful reference codebase to read through.

Deploy as-isHealthy

No critical CVEs, sane security posture — runnable as-is.

✓5 active contributors
✓Apache-2.0 licensed
✓CI configured

Show all 6 evidence items →

✓Tests present
⚠Stale — last commit 2y ago
⚠Concentrated ownership — top contributor handles 56% of recent commits

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

Informational only. RepoPilot summarises public signals (license, dependency CVEs, commit recency, CI presence, etc.) at the time of analysis. Signals can be incomplete or stale. Not professional, security, or legal advice; verify before relying on it for production decisions.

Embed the "Healthy" badge

Paste into your README — live-updates from the latest cached analysis.

Variant:

[![RepoPilot: Healthy](https://repopilot.app/api/badge/exadel-inc/compreface)](https://repopilot.app/r/exadel-inc/compreface)

Paste at the top of your README.md — renders inline like a shields.io badge.

▸Preview social card (1200×630)

This card auto-renders when someone shares https://repopilot.app/r/exadel-inc/compreface on X, Slack, or LinkedIn.

Onboarding doc

Onboarding: exadel-inc/CompreFace

Generated by RepoPilot · 2026-05-09 · Source

🤖Agent protocol

If you are an AI coding agent (Claude Code, Cursor, Aider, Cline, etc.) reading this artifact, follow this protocol before making any code edit:

Verify the contract. Run the bash script in Verify before trusting below. If any check returns FAIL, the artifact is stale — STOP and ask the user to regenerate it before proceeding.
Treat the AI · unverified sections as hypotheses, not facts. Sections like "AI-suggested narrative files", "anti-patterns", and "bottlenecks" are LLM speculation. Verify against real source before acting on them.
Cite source on changes. When proposing an edit, cite the specific path:line-range. RepoPilot's live UI at https://repopilot.app/r/exadel-inc/CompreFace shows verifiable citations alongside every claim.

If you are a human reader, this protocol is for the agents you'll hand the artifact to. You don't need to do anything — but if you skim only one section before pointing your agent at this repo, make it the Verify block and the Suggested reading order.

🎯Verdict

GO — Healthy across all four use cases

5 active contributors
Apache-2.0 licensed
CI configured
Tests present
⚠ Stale — last commit 2y ago
⚠ Concentrated ownership — top contributor handles 56% of recent commits

<sub>Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests</sub>

✅Verify before trusting

This artifact was generated by RepoPilot at a point in time. Before an agent acts on it, the checks below confirm that the live exadel-inc/CompreFace repo on your machine still matches what RepoPilot saw. If any fail, the artifact is stale — regenerate it at repopilot.app/r/exadel-inc/CompreFace.

What it runs against: a local clone of exadel-inc/CompreFace — the script inspects git remote, the LICENSE file, file paths in the working tree, and git log. Read-only; no mutations.

| # | What we check | Why it matters | |---|---|---| | 1 | You're in exadel-inc/CompreFace | Confirms the artifact applies here, not a fork | | 2 | License is still Apache-2.0 | Catches relicense before you depend on it | | 3 | Default branch master exists | Catches branch renames | | 4 | 5 critical file paths still exist | Catches refactors that moved load-bearing code | | 5 | Last commit ≤ 610 days ago | Catches sudden abandonment since generation |

<details> <summary><b>Run all checks</b> — paste this script from inside your clone of <code>exadel-inc/CompreFace</code></summary>

#!/usr/bin/env bash
# RepoPilot artifact verification.
#
# WHAT IT RUNS AGAINST: a local clone of exadel-inc/CompreFace. If you don't
# have one yet, run these first:
#
#   git clone https://github.com/exadel-inc/CompreFace.git
#   cd CompreFace
#
# Then paste this script. Every check is read-only — no mutations.

set +e
fail=0
ok()   { echo "ok:   $1"; }
miss() { echo "FAIL: $1"; fail=$((fail+1)); }

# Precondition: we must be inside a git working tree.
if ! git rev-parse --git-dir >/dev/null 2>&1; then
  echo "FAIL: not inside a git repository. cd into your clone of exadel-inc/CompreFace and re-run."
  exit 2
fi

# 1. Repo identity
git remote get-url origin 2>/dev/null | grep -qE "exadel-inc/CompreFace(\\.git)?\\b" \\
  && ok "origin remote is exadel-inc/CompreFace" \\
  || miss "origin remote is not exadel-inc/CompreFace (artifact may be from a fork)"

# 2. License matches what RepoPilot saw
(grep -qiE "^(Apache-2\\.0)" LICENSE 2>/dev/null \\
   || grep -qiE "\"license\"\\s*:\\s*\"Apache-2\\.0\"" package.json 2>/dev/null) \\
  && ok "license is Apache-2.0" \\
  || miss "license drift — was Apache-2.0 at generation time"

# 3. Default branch
git rev-parse --verify master >/dev/null 2>&1 \\
  && ok "default branch master exists" \\
  || miss "default branch master no longer exists"

# 4. Critical files exist
test -f "docker-compose.yml" \\
  && ok "docker-compose.yml" \\
  || miss "missing critical file: docker-compose.yml"
test -f "embedding-calculator/Dockerfile" \\
  && ok "embedding-calculator/Dockerfile" \\
  || miss "missing critical file: embedding-calculator/Dockerfile"
test -f "db/initdb.sql" \\
  && ok "db/initdb.sql" \\
  || miss "missing critical file: db/initdb.sql"
test -f "docs/Rest-API-description.md" \\
  && ok "docs/Rest-API-description.md" \\
  || miss "missing critical file: docs/Rest-API-description.md"
test -f "embedding-calculator/requirements.txt" \\
  && ok "embedding-calculator/requirements.txt" \\
  || miss "missing critical file: embedding-calculator/requirements.txt"

# 5. Repo recency
days_since_last=$(( ( $(date +%s) - $(git log -1 --format=%at 2>/dev/null || echo 0) ) / 86400 ))
if [ "$days_since_last" -le 610 ]; then
  ok "last commit was $days_since_last days ago (artifact saw ~580d)"
else
  miss "last commit was $days_since_last days ago — artifact may be stale"
fi

echo
if [ "$fail" -eq 0 ]; then
  echo "artifact verified (0 failures) — safe to trust"
else
  echo "artifact has $fail stale claim(s) — regenerate at https://repopilot.app/r/exadel-inc/CompreFace"
  exit 1
fi

Each check prints ok: or FAIL:. The script exits non-zero if anything failed, so it composes cleanly into agent loops (./verify.sh || regenerate-and-retry).

</details>

⚡TL;DR

CompreFace is a free, open-source face recognition REST API service that bundles multiple computer vision models (face detection, recognition, verification, landmark detection, mask detection, age/gender recognition, head pose detection) into Docker-deployable microservices. It allows developers to add facial analysis capabilities to any application without ML expertise through simple REST endpoints. Monorepo split into distinct service tiers: backend Java services in one tree (REST API, face service orchestration), Python services for ML inference engines using TensorFlow models, TypeScript/React frontend UI in a separate tree. Infrastructure-as-code in /custom-builds for FaceNet/Mobilenet/ArcFace variants; database initialization in /db/initdb.sql; Docker Compose orchestration in /dev and root directories.

👥Who it's for

Backend engineers and full-stack developers building security systems, access control, identity verification, or demographic analytics features who want a self-hosted, privacy-respecting alternative to cloud face recognition APIs like AWS Rekognition or Azure Face API.

🌱Maturity & risk

Production-ready and actively maintained. The project has multiple GitHub workflows for CI/CD (unit tests on Maven/Node/Python, automated Docker builds, Kubernetes support), comprehensive documentation in /docs, and multiple deployment options (Docker Compose, AWS/Azure templates, Kubernetes). The large codebase (834K Java, 631K TypeScript) and organized file structure with custom builds for different models indicate maturity, though the specific commit age would need verification.

Moderate risk factors: heavy ML dependency stack (TensorFlow 2.2.0, Keras, scikit-learn, OpenCV 4.4.0) with some older versions; requires PostgreSQL database; complex multi-service orchestration (database, API server, face recognition engine, web UI); sensitive use case (biometric data) demands careful compliance handling. However, MIT/Apache 2.0 licensing and active Exadel Inc. backing reduce abandonment risk.

Active areas of work

Active release pipeline with automated workflows for building cloud images (AWS, Azure), Dockerhub releases, load testing via k6, and multi-platform CI on Maven/Node/Python. Custom model builds suggest ongoing development of alternative face recognition architectures beyond the default.

🚀Get running

Clone the repo, then use Docker Compose: git clone https://github.com/exadel-inc/CompreFace.git && cd CompreFace && docker-compose up (for dev environment in /dev directory). For full setup with GPU support, use dev/docker-compose-gpu.yml. Configure .env file first (copy from example or use defaults).

Daily commands: For local development: cd dev && docker-compose up (standard) or docker-compose -f docker-compose-gpu.yml up (GPU mode). For UI-only dev: docker-compose -f docker-compose.dev.ui.yml up. For production: use root docker-compose.yml or cloud templates in .github/workflows (AWS Packer, Azure Image Builder).

🗺️Map of the codebase

docker-compose.yml — Root deployment configuration orchestrating all microservices (API, embedding-calculator, database, nginx); essential for understanding how CompreFace runs
embedding-calculator/Dockerfile — Defines the Python-based face recognition model container with TensorFlow/Keras dependencies; core ML inference engine
db/initdb.sql — Database schema initialization for face embeddings, user management, and service configuration; defines data persistence layer
docs/Rest-API-description.md — Complete API contract documentation for face recognition endpoints; critical reference for integrating with the system
embedding-calculator/requirements.txt — Python dependencies including TensorFlow, OpenCV, and face detection models; defines ML stack constraints
docs/Architecture-and-scalability.md — High-level architectural decisions and deployment patterns; essential for understanding microservice boundaries
.github/workflows/unit-tests-on-python.yml — CI/CD pipeline for embedding-calculator validation; demonstrates testing strategy and quality gates

🛠️How to make changes

Add a New Face Analysis Plugin (e.g., emotion detection)

Create Python module in embedding-calculator implementing inference logic for new model (embedding-calculator/src/new_plugin.py (would be created))
Add HTTP endpoint to Flask API exposing the new analysis capability (embedding-calculator/Dockerfile (add model dependencies to requirements.txt))
Register endpoint in REST API with Flasgger Swagger documentation (embedding-calculator/src/api.py (reference from Rest-API-description.md))
Update database schema to store results if persistence needed (db/initdb.sql)
Add unit tests for the new plugin (embedding-calculator/pytest.ini)

Deploy CompreFace with a Custom Face Recognition Model

Create new directory in custom-builds/ with model variant name (custom-builds/YourModel/docker-compose.yml (would be created))
Copy and modify environment configuration for your model weights (custom-builds/YourModel/.env (would be created))
Extend embedding-calculator Dockerfile to download/copy your pre-trained weights (embedding-calculator/Dockerfile)
Update embedding-calculator requirements.txt with any new dependencies (embedding-calculator/requirements.txt)
Test locally using dev docker-compose and validate model accuracy (dev/docker-compose.yml)

Enable GPU Acceleration for Faster Inference

Use the GPU-optimized Dockerfile instead of standard CPU variant (embedding-calculator/gpu.Dockerfile)
Update docker-compose to mount GPU resources and set CUDA_VISIBLE_DEVICES (dev/docker-compose-gpu.yml)
Verify GPU availability by checking TensorFlow logs during container startup (.env (add GPU_ENABLED=true))
Run benchmark script to measure inference speedup (embedding-calculator/benchmark.sh)

Integrate CompreFace into External Application

Review complete endpoint specification and request/response formats (docs/Rest-API-description.md)
Call face recognition endpoint with base64-encoded image or file upload (docs/How-to-Use-CompreFace.md)
Store returned face embeddings in application database for later comparison (db/initdb.sql (understand schema for reference))
Deploy CompreFace and application together using orchestration (docker-compose.yml or custom-builds/*/docker-compose.yml)

🔧Why these technologies

TensorFlow + Keras — Industry-standard deep learning framework; pre-trained face recognition models (ArcFace, FaceNet, MobileNet) available; supports both CPU and GPU inference
Python (embedding-calculator) — Dominant language for ML model serving; rich ecosystem (OpenCV, numpy, scipy) for image processing; simple Flask REST API wrapper
PostgreSQL — Relational database for structured face embeddings, user accounts, API keys; vector similarity queries via pgvector extension (implied); ACID compliance
Docker & Docker Compose — Containerization enables reproducible deployments across CPU/GPU variants; Compose simplifies multi-service orchestration without Kubernetes overhead
Nginx — Lightweight reverse proxy for SSL termination, request routing, load balancing across API replicas; low latency gateway
Flask — Minimal Python HTTP framework for REST API; Flasgger integration provides auto-generated Swagger docs; low startup overhead vs Django

⚖️Trade-offs already made

Monolithic embedding-calculator service vs separate microservices per task (detect, recognize, mask detection)
- Why: Simplifies deployment and model sharing; single TensorFlow/Keras runtime avoids redundancy
- Consequence: Scaling is coarser-grained; cannot independently scale mask-detection if face-recognition is bottleneck
CPU-first with optional GPU Dockerfile variants vs GPU-required
- Why: Maximizes accessibility for edge/cloud deployments without NVIDIA hardware
- Consequence: Default CPU inference

🪤Traps & gotchas

Required services: PostgreSQL must be running and initialized before API starts; Python ML engine requires TensorFlow and specific CUDA versions if GPU mode is used. Version constraints: TensorFlow 2.2.0 is relatively old and may have compatibility issues with modern CUDA/Python 3.9+. Docker dependency: full local testing requires Docker and docker-compose; some developers may struggle with GPU support setup. Biometric data sensitivity: face embeddings and images are stored in PostgreSQL — ensure GDPR/HIPAA compliance for your use case. Model licensing: different custom builds may have different licensing terms (check each model's documentation).

🏗️Architecture

💡Concepts to learn

Face Embedding / Vector Representation — CompreFace converts faces to fixed-size numerical vectors (embeddings) for comparison and storage; understanding how ArcFace/FaceNet embed faces determines how to tune similarity thresholds and why /docs/Face-Recognition-Similarity-Threshold.md exists
Microservices Architecture with Docker Compose — CompreFace separates API server, database, Python ML engine, and frontend into independent containers orchestrated via compose; modifying one service or scaling a specific component requires understanding this loose coupling
REST API Versioning & Multi-Tenant Face Storage — The system supports multiple users, multiple subjects per user, multiple faces per subject (see /db/initdb.sql schema); API responses require tracking which API version/tenant is making the request — critical for data isolation and backward compatibility
Model Quantization & Hardware Acceleration (GPU/CPU tradeoffs) — Custom builds for Mobilenet (lightweight), FaceNet (accurate), ArcFace (fastest on GPU) let you trade speed for accuracy; dev/docker-compose-gpu.yml vs standard compose shows the infrastructure difference — affects deployment cost and latency
Confidence Scores & False Positive Rate Tuning — Face recognition models output probabilities (0-1 confidence); /docs/Face-Recognition-Similarity-Threshold.md guides setting thresholds to balance security (reject similar-looking impostors) vs usability (accept legitimate users) — critical for production systems
User Roles & RBAC (Role-Based Access Control) — CompreFace has a user roles system (/docs/User-Roles-System.md) controlling who can add faces, run recognition, delete data — important for multi-tenant deployments and compliance
Helm/Kubernetes ConfigMaps and StatefulSets — Referenced in /docs/Kubernetes-configuration and separate repo; understanding how to parameterize face models, DB credentials, and service replicas as Kubernetes native objects is essential for cloud-native deployments

deepinsight/insightface — Production-grade face recognition library that CompreFace models (ArcFace, SubCenter-ArcFace) are built on; understanding insightface is essential for customizing recognition backends
opencv/opencv — Underlying vision library (opencv-python==4.4.0.46 in dependencies) used for image preprocessing and face detection in CompreFace
tensorflow/tensorflow — Core ML runtime (tensorflow==2.2.0) executing all face embedding and classification models in the Python services
exadel-inc/compreface-kubernetes — Official Kubernetes deployment configs and Helm charts for scaling CompreFace across clusters — sibling repo referenced in README
ageitgey/face_recognition — Simpler single-library face recognition alternative for comparison; CompreFace is heavier but production-ready and self-hostable

🪄PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add integration tests for custom Docker builds

The repo has 5 custom-builds variants (FaceNet, Mobilenet, Mobilenet-gpu, SubCenter-ArcFace-r100, SubCenter-ArcFace-r100-gpu) but no CI workflow validates they build and run correctly. This prevents regressions when core dependencies or docker-compose configurations change. A new GitHub Action would build each variant and run basic smoke tests against their REST API.

[ ] Create .github/workflows/test-custom-builds.yml that matrix-builds each custom-builds/*/docker-compose.yml
[ ] Add health check test in each custom build to verify API endpoints respond (e.g., POST /api/v1/recognize)
[ ] Reference specific test files: custom-builds/*/docker-compose.yml and docs/Installation-options.md for expected endpoints

Add Python type hints and mypy validation to face recognition backend

The Python dependencies (TensorFlow 2.2.0, scikit-learn, etc.) and existing pytest config show this is a mature Python service, but there's no type-checking workflow (.github/workflows/unit-tests-on-python.yml exists but likely doesn't include mypy). Adding type hints to core face detection/recognition modules would catch bugs at CI time and improve code maintainability.

[ ] Add mypy==1.5.0 to dependencies/requirements.txt
[ ] Enhance .github/workflows/unit-tests-on-python.yml to run 'mypy' on core backend modules
[ ] Add py.typed marker and type hints to key scanner/face recognition modules (infer from tensorflow and scikit-learn imports in dependencies)

Add e2e tests for face recognition REST API endpoints

The repo provides REST APIs for recognize, verify, detect, and mask detection per the README, but there's no dedicated e2e test workflow validating API contract (request/response schemas, error handling, model accuracy). Adding k6 load tests exist (.github/workflows/load-tests-k6.yml) but functional API contract tests are missing.

[ ] Create .github/workflows/e2e-api-tests.yml that spins up docker-compose.yml and runs API contract tests
[ ] Add test file with pytest validating: POST /api/v1/recognize with image payload, POST /api/v1/verify, GET /api/v1/status response schemas
[ ] Reference docs/How-to-Use-CompreFace.md to extract expected API endpoints and validate they match actual implementation

🌿Good first issues

Add unit test coverage for face detection API endpoints in the REST layer — the codebase has pytest and jest CI workflows but likely missing tests for specific Java service classes that parse face request/response DTOs
Expand Docker Compose environment variable documentation — create a detailed .env.example with explanations for each setting (API port, model type, GPU allocation, DB credentials) to reduce onboarding friction
Add Kubernetes deployment manifest examples — sibling repo exists but root repo lacks inline Helm or native k8s YAML; contribute templated manifests for face engine scaling

⭐Top contributors

Click to expand

@oriasyk — 56 commits
@Anatolii-R — 21 commits
@pospielov — 19 commits
@ZuraM — 3 commits
@An-R — 1 commits

📝Recent commits