Onboarding: kilimchoi/engineering-blogs

Item: kilimchoi/engineering-blogs
Rating: 1
Author: RepoPilot

Generated by RepoPilot · 2026-05-05 · Source

Verdict

AVOID — Stale and unlicensed — last commit 2y ago

5 active contributors
Distributed ownership (top contributor 46%)
CI configured
⚠ Stale — last commit 2y ago
⚠ Small team — 5 top contributors
⚠ No license — legally unclear to depend on
⚠ No test directory detected

Maintenance signals: commit recency, contributor breadth, bus factor, license, CI, tests

TL;DR

kilimchoi/engineering-blogs is a curated, crowd-sourced list of software engineering blogs maintained as a Markdown README and a companion OPML file. The Ruby script generate_opml.rb parses README.md to extract blog names and URLs, then outputs engineering_blogs.opml — a standard RSS/Atom feed subscription bundle importable into any RSS reader. It solves the problem of discovering and subscribing to engineering blogs across companies, individual engineers, and product/technology categories in a single place. The repo is flat and minimal: README.md is the canonical data source containing all blog entries in alphabetical sections; generate_opml.rb is a single Ruby script that reads the README and writes engineering_blogs.opml. The GitHub Actions workflow in .github/workflows/engineering_blogs.yml automates OPML regeneration.

Who it's for

Software engineers and engineering managers who want to follow technical writing from the industry — whether that's keeping up with infrastructure choices at Airbnb, reading Artsy's open-source culture posts, or finding niche individual contributor blogs. Contributors are typically developers who want to add their own or a favourite company's engineering blog to the curated list.

Maturity & risk

The repo is a widely starred 'awesome list' style repository with thousands of stars and years of history. It has a GitHub Actions workflow at .github/workflows/engineering_blogs.yml and a .ruby-version file pinning the Ruby version, which shows the tooling was set up deliberately. Maintenance has slowed, however: the last commit is about two years old (see Verdict). It is not a software library but a community-contributed document, so judge it by how current the list is rather than by release cadence.

Risk is low for readers: the deliverables are a README and an OPML file, and the generator needs only Ruby plus the gems declared in the Gemfile. The main content risk is link rot: blog URLs go stale, and the file list shows no automated URL validation in CI. The missing license (see Verdict) also makes reuse or redistribution of the list legally unclear. Single-maintainer risk exists (kilimchoi), though community PRs drive most additions.

Active areas of work

Recent activity consists almost entirely of community PRs adding new blog entries to README.md, and the last commit is about two years old. The GitHub Actions workflow (engineering_blogs.yml) appears to regenerate the OPML file when the README changes. No large refactors or new features are visible in the file structure.

Get running

git clone https://github.com/kilimchoi/engineering-blogs.git
cd engineering-blogs
bundle install
ruby generate_opml.rb

This regenerates engineering_blogs.opml from README.md

Daily commands: bundle exec ruby generate_opml.rb

Output is written to engineering_blogs.opml in the project root

Map of the codebase

README.md — The primary artifact of this repo — a curated, alphabetically-sorted list of engineering blogs that contributors directly edit to add or remove entries.
generate_opml.rb — The only executable script in the repo; it parses README.md and generates the engineering_blogs.opml file, so any structural change to README formatting must be reflected here.
engineering_blogs.opml — The auto-generated OPML feed file consumed by RSS readers; it is the primary machine-readable output of this project and must stay in sync with README.md.
contributing.md — Defines the exact conventions (alphabetical order, link format, section rules) every contributor must follow when editing README.md.
.github/workflows/engineering_blogs.yml: CI workflow that appears to run generate_opml.rb and update the OPML file on push; if it does, it is the enforcement gate that keeps README.md and the OPML in sync.
Gemfile — Declares Ruby dependencies required to run generate_opml.rb; contributors need this to set up their local environment.

Components & responsibilities

README.md (Markdown) — Authoritative human-readable and human-editable list of all curated engineering blogs, organised alphabetically by company and individual.
- Failure mode: A formatting error (wrong list syntax, broken link, non-alphabetical placement) silently corrupts the parsed output or fails CI.
generate_opml.rb (Ruby, standard library XML/string processing) — Parses README.md via regex, extracts blog name/URL pairs, and writes a valid OPML XML file with two outline groups (Companies, Individuals).
- Failure mode: Regex mismatch on README formatting changes causes entries to be silently omitted from the OPML output.
engineering_blogs.opml (OPML (XML dialect)) — Machine-readable output consumed by RSS feed readers; the primary deliverable for end-users who want to subscribe to all listed blogs at once.
- Failure mode: Out-of-sync with README.md if CI is skipped or the commit step fails; malformed XML breaks all feed reader imports.
GitHub Actions Workflow (GitHub Actions, Ruby): Appears to run the generator on push/PR and commit the updated OPML, which is meant to keep the two files from drifting apart.
- Failure mode: If the workflow token lacks write permissions, the OPML auto-commit step silently fails and the files drift.
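
The "silently omitted" failure mode above can be made visible with a small check. The following is a minimal sketch, not the repo's code: ENTRY_PATTERN is a hypothetical stand-in for whatever regex generate_opml.rb actually uses, and unparsed_bullets is an illustrative helper name. It reports bullet lines that look like list items but would not match the assumed entry format.

```ruby
# Hypothetical entry pattern; the real regex in generate_opml.rb may differ.
ENTRY_PATTERN = %r{\A\s*[*-]\s+\[([^\]]+)\]\((https?://[^)\s]+)\)}

# Returns bullet lines that look like list items but do not match ENTRY_PATTERN,
# i.e. lines a regex-based parser would skip without raising an error.
def unparsed_bullets(markdown)
  markdown.each_line.map(&:chomp).select do |line|
    line.match?(/\A\s*[*-]\s+/) && !line.match?(ENTRY_PATTERN)
  end
end

# Illustrative input only; these are not real README entries.
sample = <<~MD
  ## A Companies
  * [Example One](https://example.com/one)
  * [Example Two] (https://example.com/two)
  * Example Three https://example.com/three
MD

unparsed_bullets(sample).each { |line| warn "Skipped by parser: #{line}" }
```

Running a check like this before regenerating the OPML turns a silent omission into a visible warning; whether it belongs in CI depends on how strict the real parser is.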

Data flow

Contributor → README.md — Manually adds or edits a blog entry in the correct alphabetical section via PR.
README.md → generate_opml.rb — Script reads the full README text and extracts all * [Name](URL) entries per section using regex.
generate_opml.rb → engineering_blogs.opml — Script writes structured OPML XML with one outline element per blog entry, grouped by Companies and Individuals.
engineering_blogs.opml → End User RSS Reader — User downloads or imports the raw OPML file from GitHub to subscribe to all listed feeds at once.
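
The flow above can be sketched end to end in a few lines of Ruby. This is an illustration under stated assumptions rather than the contents of generate_opml.rb: the ENTRY regex, the helper names, and the outline attributes (type, text, htmlUrl) are all assumed, and the sketch emits a flat list instead of the Companies/Individuals grouping described above.

```ruby
# Assumed entry format: "* [Name](URL)" at the start of a line.
ENTRY = %r{^\s*[*-]\s+\[([^\]]+)\]\((https?://[^)\s]+)\)}

XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escapes the characters that are unsafe inside XML text and attribute values.
def xml_escape(text)
  text.gsub(/[&<>"]/, XML_ESCAPES)
end

# Returns an array of [name, url] pairs found in the Markdown text.
def extract_entries(markdown)
  markdown.scan(ENTRY)
end

# Renders the pairs as a minimal OPML document (flat, no grouping).
def to_opml(entries, title: 'Engineering Blogs')
  lines = ['<?xml version="1.0" encoding="UTF-8"?>', '<opml version="1.0">',
           "  <head><title>#{xml_escape(title)}</title></head>", '  <body>']
  entries.each do |name, url|
    lines << format('    <outline type="rss" text="%s" htmlUrl="%s"/>',
                    xml_escape(name), xml_escape(url))
  end
  lines.push('  </body>', '</opml>')
  lines.join("\n") + "\n"
end

# Illustrative input only.
sample = "* [Example & Co](https://example.com/blog?a=1&b=2)\n* not an entry\n"
puts to_opml(extract_entries(sample))
```

Escaping names and URLs before writing them into attributes matters here because a single unescaped ampersand would make the whole file fail to import.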

How to make changes

Add a new company engineering blog

Find the correct alphabetical section (e.g. '## A Companies') in the README and insert a new markdown list entry in the format * [Company Name](https://blog.url), maintaining alphabetical order within the section. (README.md)
Run ruby generate_opml.rb locally to regenerate the OPML file and verify your new entry appears correctly with the right title and URL in the XML output. (generate_opml.rb)
Commit both README.md and the updated engineering_blogs.opml together; the CI workflow will re-validate on push. (engineering_blogs.opml)

Add a new individual/group contributor blog

Locate the '## Individuals/Group Contributors' section and the correct alphabetical sub-section, then insert * [Author Name](https://blog.url) in sorted order. (README.md)
Regenerate the OPML by running ruby generate_opml.rb and confirm the new outline entry appears under the correct OPML group. (generate_opml.rb)
Commit the updated README.md and engineering_blogs.opml; the CI workflow enforces consistency automatically. (.github/workflows/engineering_blogs.yml)

Modify the OPML generation logic (e.g. add a new XML attribute)

Edit the Ruby script to change how outline elements are constructed, e.g. adding a new xmlUrl or category attribute to each OPML entry. (generate_opml.rb)
Run ruby generate_opml.rb locally and inspect engineering_blogs.opml to verify the new attribute appears correctly on all entries. (engineering_blogs.opml)
Ensure the CI workflow still passes by checking it doesn't assert on a fixed OPML schema that would break with your new attribute. (.github/workflows/engineering_blogs.yml)
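
One way to keep attribute changes small is to build each outline element from a Hash, so adding an attribute is a one-line change. This is a sketch only: outline_tag is a hypothetical helper, and the attribute names shown are assumptions rather than what generate_opml.rb currently emits.

```ruby
XML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escapes characters that are unsafe inside XML attribute values.
def xml_escape(text)
  text.gsub(/[&<>"]/, XML_ESCAPES)
end

# Builds one self-closing <outline/> element from a Hash of attributes.
# Attributes with nil values are skipped; insertion order is preserved.
def outline_tag(attributes)
  rendered = attributes.compact.map do |key, value|
    format('%s="%s"', key, xml_escape(value.to_s))
  end
  "<outline #{rendered.join(' ')}/>"
end

# Illustrative values only; "category" is the hypothetical new attribute.
puts outline_tag(type: 'rss', text: 'Example Co',
                 htmlUrl: 'https://example.com/blog', category: 'Companies')
```

After a change like this, inspecting a handful of entries in engineering_blogs.opml is usually enough to confirm the attribute appears everywhere.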

Update contribution rules or formatting conventions

Edit the contributing guidelines to reflect new rules (e.g. requiring RSS feed URLs, new link format). (contributing.md)
If the new convention changes the Markdown format that generate_opml.rb parses (e.g. new link structure), update the regex or parsing logic accordingly. (generate_opml.rb)
Update the CI workflow if new validation steps are needed to enforce the new conventions automatically on every PR. (.github/workflows/engineering_blogs.yml)

Why these technologies

Markdown (README.md) — Renders natively on GitHub as the project homepage and is trivially editable by contributors without tooling; the alphabetical structure is human-verifiable.
OPML — De-facto standard format for sharing lists of RSS/Atom feeds; allows users to import the entire curated list into any feed reader with one file.
Ruby (generate_opml.rb) — Lightweight scripting language well-suited for text parsing and XML generation; low dependency footprint for a simple transformation script.
GitHub Actions — Native CI for GitHub repos; keeps OPML regeneration automatic and zero-config for contributors who only edit README.md.

Trade-offs already made

Commit generated OPML to the repository
- Why: Allows users to download the OPML directly from GitHub without running any tooling or build step.
- Consequence: The repo always has a two-file change per blog addition (README + OPML), and merge conflicts on the OPML file are possible on concurrent PRs.
Use README.md as the single source of truth rather than a structured data file (JSON/YAML)
- Why: Maximises contributor accessibility — anyone can edit Markdown; no schema knowledge required.
- Consequence: The parser in generate_opml.rb is fragile to formatting inconsistencies; a malformed link can silently drop entries or break generation.
No web frontend or database
- Why: Keeps the project purely static and maintenance-free; GitHub renders everything needed.
- Consequence: No search, filtering, or categorisation beyond alphabetical grouping is possible without significant re-architecture.

Non-goals (don't propose these)

Validating that linked blogs are still live or publishing new content
Providing blog content summaries or previews
Supporting user accounts, personalisation, or bookmarking
Automated discovery of new engineering blogs
Full-text search across listed blogs

Anti-patterns to avoid

Regex-based Markdown parsing (generate_…)

Traps & gotchas

The .ruby-version file pins a specific Ruby version. If your local Ruby differs you may get warnings or failures; use rbenv or rvm to match it exactly.
The OPML file must be regenerated and committed alongside any README change. PRs that only edit README.md without updating engineering_blogs.opml are often flagged.
Entries must follow the precise Markdown format * [Name](URL), with no variation in trailing slashes; generate_opml.rb likely relies on a regex matching this exact pattern.
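
For the Ruby version gotcha, a quick local check can save a confusing failure. The sketch below assumes .ruby-version holds a plain version string, optionally prefixed with "ruby-"; ruby_version_matches? is an illustrative helper, not part of the repo.

```ruby
# Compares the version pinned in a .ruby-version file with the running interpreter.
# Accepts both "3.2.2" and "ruby-3.2.2" style contents.
def ruby_version_matches?(pinned_text, running = RUBY_VERSION)
  pinned = pinned_text.strip.sub(/\Aruby-/, '')
  pinned == running
end

path = '.ruby-version'
if File.exist?(path)
  pinned = File.read(path)
  unless ruby_version_matches?(pinned)
    warn "Ruby #{RUBY_VERSION} is running, but #{path} pins #{pinned.strip}"
  end
end
```

Version managers such as rbenv or rvm normally handle this automatically; the check is mainly useful when running the script with a system Ruby.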

Concepts to learn

OPML (Outline Processor Markup Language) — OPML is the file format generated by this repo's core script — understanding its XML structure is essential to modifying generate_opml.rb or debugging malformed output.
RSS Feed Aggregation — The entire value proposition of engineering_blogs.opml is that it bundles hundreds of RSS/Atom feed URLs into one importable file — knowing how RSS aggregators use OPML explains why this format matters.
Awesome List Format — This repo follows the sindresorhus/awesome convention for structured Markdown curation lists, including alphabetical sectioning and badge requirements — knowing this format explains the README structure.
Markdown Parsing via Regex — generate_opml.rb almost certainly uses Ruby regex to extract name/URL pairs from Markdown bullet points rather than a full Markdown AST parser — understanding this approach explains its fragility to format deviations.
GitHub Actions Workflow Triggers — The .github/workflows/engineering_blogs.yml workflow automates OPML regeneration — understanding on: push vs on: pull_request triggers explains when the generated file is updated versus when contributors must do it manually.

Related repos

sindresorhus/awesome — The 'awesome list' meta-repository whose format and badge this repo follows; defines the contributing standards this project is built around.
wrongsideofmemphis/Planet-GNOME-opml — Another community-maintained OPML file for a developer community, showing the same pattern of curated blog aggregation.
nickhould/craft-beers-dataset — Example of a similar 'curated data as a repo' pattern maintained via README and generated artifacts.
rust-lang/this-week-in-rust — An alternative approach to engineering blog aggregation — a newsletter format that surfaces community blog posts weekly, solving a similar discovery problem.
feedbin/feedbin — An RSS reader that can directly import the engineering_blogs.opml this repo generates, representing a key downstream consumer of this project's output.

PR ideas

To work on one of these in Claude Code or Cursor, paste: Implement the "<title>" PR idea from CLAUDE.md, working through the checklist as the task list.

Add a CI workflow step to validate all URLs in README.md are reachable (detect dead blog links)

The repo curates hundreds of engineering blog URLs. Over time, blogs move or shut down, leaving dead links. The existing .github/workflows/engineering_blogs.yml likely runs the OPML generator but there is no link-checking step. Adding a dead-link checker (e.g. using lychee or awesome_bot) directly to the existing workflow would automatically flag broken entries, keeping the list trustworthy and reducing manual maintenance burden.

[ ] Open .github/workflows/engineering_blogs.yml and inspect its current steps
[ ] Add a new job or step using the lycheeverse/lychee-action GitHub Action (or awesome_bot gem, which is Ruby-friendly given the existing Gemfile) to crawl all URLs in README.md
[ ] Configure an allowlist/timeout to avoid false positives from rate-limited sites (e.g. Medium, Substack)
[ ] Set the job to run on a schedule (e.g. weekly via cron) so dead links are caught continuously, not just on PRs
[ ] Document the new check in contributing.md so contributors know dead links will be caught automatically
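
If a dedicated tool such as lychee is not wanted, the same idea can be sketched with Ruby's standard library. This is a rough illustration, not a replacement for those tools: the LINK regex, the helper names, and the "status below 400 means alive" rule are assumptions, and the fetcher is injectable so the logic can be exercised without network access.

```ruby
require 'net/http'
require 'uri'

# Assumed link format: any Markdown link target starting with http(s).
LINK = %r{\]\((https?://[^)\s]+)\)}

# Returns the unique link targets found in the Markdown text.
def extract_urls(markdown)
  markdown.scan(LINK).flatten.uniq
end

# Default fetcher: sends a HEAD request and returns the status code as an
# Integer, or nil if the request fails (DNS error, timeout, TLS error, ...).
def head_status(url, timeout: 10)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https',
                  open_timeout: timeout, read_timeout: timeout) do |http|
    http.head(uri.request_uri).code.to_i
  end
rescue StandardError
  nil
end

# Returns the URLs whose status is missing or >= 400 according to the fetcher.
def dead_links(markdown, fetcher: method(:head_status))
  extract_urls(markdown).reject do |url|
    status = fetcher.call(url)
    status && status < 400
  end
end
```

Calling dead_links(File.read('README.md')) would check every link over the network. Some sites reject HEAD requests or rate-limit crawlers, so results from a sketch like this need the same allowlist and retry handling mentioned in the checklist.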

Add a validation script that enforces alphabetical ordering of entries within each section of README.md

The README is organized into lettered sections (A–Z for companies, individual contributors, etc.). As new entries are added via PRs, it is easy to accidentally insert a blog out of alphabetical order. There is currently no automated check for this. A small Ruby script (fitting naturally alongside the existing generate_opml.rb) that parses each section and asserts sorted order would prevent ordering regressions and make reviews easier.

[ ] Create a new file validate_order.rb in the repo root, mirroring the style of generate_opml.rb
[ ] Parse README.md section by section (using the existing ## A Companies, ## B Companies heading pattern) and extract all hyperlink labels
[ ] Assert each section's labels are in case-insensitive alphabetical order; exit non-zero with a clear error message if not
[ ] Add a step in .github/workflows/engineering_blogs.yml that runs ruby validate_order.rb on every pull request
[ ] Update contributing.md to mention the alphabetical ordering requirement and note that the CI check enforces it
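
A possible shape for validate_order.rb is sketched below. It assumes sections are introduced by Markdown headings and entries are bullets of the form * [Name](URL); the constant and method names are illustrative, and the real README may need a different heading pattern.

```ruby
HEADING = /\A#+\s+(.+)/
ENTRY_LABEL = /\A\s*[*-]\s+\[([^\]]+)\]\(/

# Groups entry labels by the heading they appear under, in file order.
def labels_by_section(markdown)
  sections = Hash.new { |hash, key| hash[key] = [] }
  current = '(no heading)'
  markdown.each_line do |line|
    if (heading = line[HEADING, 1])
      current = heading.strip
    elsif (label = line[ENTRY_LABEL, 1])
      sections[current] << label
    end
  end
  sections
end

# Returns [section, earlier_label, later_label] for every adjacent pair that
# is out of case-insensitive alphabetical order.
def ordering_violations(markdown)
  labels_by_section(markdown).flat_map do |section, labels|
    labels.each_cons(2)
          .select { |before, after| before.downcase > after.downcase }
          .map { |before, after| [section, before, after] }
  end
end
```

A wrapper script could read README.md, print each violation, and exit non-zero when the list is not empty, which is what a CI step needs in order to fail the build.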

Add a consistency check that ensures every URL in README.md also appears in engineering_blogs.opml (and vice versa)

The repo maintains two representations of the same data: README.md (human-readable) and engineering_blogs.opml (machine-readable for RSS readers). The generate_opml.rb script generates the OPML from the README, but there is no automated test asserting the two files are actually in sync. A contributor could update README.md without regenerating the OPML, causing silent drift. A diff/validation script run in CI would catch this immediately.

[ ] Add a CI step in .github/workflows/engineering_blogs.yml that runs ruby generate_opml.rb and then uses git diff --exit-code engineering_blogs.opml to assert the generated file matches the committed file
[ ] If generate_opml.rb does not already write to the file (only prints to stdout), update it to support a --write flag or redirect output, enabling the diff check
[ ] Optionally, create a separate validate_sync.rb script that cross-references URLs extracted from README.md against xmllint-parsed entries in engineering_blogs.opml and reports any asymmetry
[ ] Update contributing.md to instruct contributors to run ruby generate_opml.rb locally (redirecting its output to engineering_blogs.opml if the script only prints to stdout) before submitting a PR that adds or removes a blog
[ ] Add a clear CI failure message explaining how to fix an out-of-sync engineering_blogs.opml
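
The optional validate_sync.rb idea can be sketched as a set comparison. Everything here is an assumption to be checked against the real files: the entry regex, the helper names, and especially the htmlUrl attribute. The commit log mentions looking up URLs via the Feedly API, so the OPML may carry feed URLs that differ from the README links, in which case the comparison should use whichever attribute holds the README URL.

```ruby
require 'set'

# Assumed README entry format and assumed OPML attribute name.
README_LINK = %r{^\s*[*-]\s+\[[^\]]+\]\((https?://[^)\s]+)\)}
OPML_URL = /htmlUrl="([^"]+)"/

# Returns the set of entry URLs found in the README text.
def readme_urls(markdown)
  markdown.scan(README_LINK).flatten.to_set
end

# Returns the set of URLs found in the OPML text.
# Only &amp; is unescaped here, which is a simplification.
def opml_urls(opml)
  opml.scan(OPML_URL).flatten.map { |url| url.gsub('&amp;', '&') }.to_set
end

# Reports URLs present in one file but missing from the other.
def sync_report(markdown, opml)
  in_readme = readme_urls(markdown)
  in_opml = opml_urls(opml)
  { missing_from_opml: (in_readme - in_opml).to_a.sort,
    missing_from_readme: (in_opml - in_readme).to_a.sort }
end
```

If both lists in the report are empty the files agree on their URL sets; the simpler git diff approach from the first checklist item remains the stricter check because it also catches title and ordering differences.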

Good first issues

1. Add a URL validator script that checks each link in README.md for HTTP 200 responses and reports dead links; no such tooling is currently visible in the repo.
2. Add RSpec or Minitest tests for generate_opml.rb covering edge cases like entries with parentheses in names or HTTPS-only URLs; there are currently no test files.
3. Expand the GitHub Actions workflow to validate automatically that engineering_blogs.opml is in sync with README.md on every PR, preventing stale OPML submissions.

Top contributors

@kilimchoi: 29 commits
@kaizensoze: 21 commits
Kilim Choi (git author name, no linked GitHub handle): 7 commits
@nucreativa: 3 commits
@frwdd: 3 commits

Recent commits

50eab27 — Update workflow file (kilimchoi)
54a6e7f — remove blogs that are not working (kilimchoi)
924f9f6 — Add "Palantir" Specializes in big data analytics (#1152) (Erfan-ffa)
5373fe3 — Update README.md (kilimchoi)
24b9a03 — Update README.md (kilimchoi)
a00fb44 — add godaddy (Kilim Choi)
995dda5 — fix opml generator (Kilim Choi)
f7d1f99 — Update Shopify engineering blog links (readme and opml) (#1057) (summersk)
cb0ca61 — add back the check for generating opml file (Kilim Choi)
4b7c232 — use feedly api to fetch the url (Kilim Choi)

Security observations

This repository is a low-risk, static content project (a curated list of engineering blogs) with no backend services, databases, authentication mechanisms, or sensitive data processing. The attack surface is minimal. The identified vulnerabilities are primarily low-severity hygiene issues: a committed .DS_Store file leaking minor filesystem metadata, potential unpinned CI/CD actions posing a supply chain risk, unverified dependency pinning, community-contributed URLs lacking formal validation, and the absence of a vulnerability disclosure policy. No hardcoded secrets, injection vulnerabilities, exposed ports, or Docker misconfigurations were detected. Overall security posture is good for a project of this nature, with improvements recommended primarily around supply chain security and repository hygiene.

Low · .DS_Store File Committed to Repository — .DS_Store. The macOS metadata file '.DS_Store' has been committed to the repository. This file can reveal directory structure information and internal path details about the developer's local filesystem, which may assist attackers in reconnaissance. Fix: Remove the .DS_Store file from the repository using 'git rm --cached .DS_Store', add '**/.DS_Store' to the .gitignore file to prevent future commits, and audit git history to ensure no sensitive data was inadvertently exposed.
Low · Absence of Dependency Version Pinning Verification — Gemfile, Gemfile.lock. The Gemfile and Gemfile.lock are present, which is good practice. However, without visibility into the actual content of these files, it cannot be confirmed that all dependencies are pinned to specific secure versions. Unpinned or loosely pinned Ruby gem dependencies (e.g., using '~>' or '>=' without upper bounds) can expose the project to supply chain attacks or inadvertent inclusion of vulnerable versions. Fix: Ensure all gems in the Gemfile are pinned to specific versions. Regularly run 'bundle audit' (via bundler-audit gem) as part of the CI/CD pipeline to detect known vulnerabilities in dependencies. Review Gemfile.lock to confirm all transitive dependencies are locked.
Low · GitHub Actions Workflow May Use Unpinned Actions — .github/workflows/engineering_blogs.yml. The GitHub Actions workflow file at '.github/workflows/engineering_blogs.yml' may reference third-party GitHub Actions by branch name (e.g., 'uses: actions/checkout@main') rather than by a specific commit SHA. This exposes the workflow to supply chain attacks where a compromised upstream action could execute malicious code in the CI/CD pipeline. Fix: Pin all GitHub Actions to a specific commit SHA (e.g., 'uses: actions/checkout@a81bbbf8298c0fa03ea29cdc473d45769f953675') rather than a branch or tag. Regularly review and update pinned SHAs using tools like Dependabot or manual audits.
Low · OPML File May Contain Unvalidated External URLs — engineering_blogs.opml, generate_opml.rb, README.md. The 'engineering_blogs.opml' file and 'README.md' contain a large number of external URLs contributed by the community. If the OPML file or README is consumed programmatically (e.g., by the generate_opml.rb script without sanitization), malformed or malicious URLs could potentially cause unexpected behavior such as SSRF if URLs are fetched server-side, or open redirect/phishing risks if rendered in a UI context. Fix: Validate all URLs in the OPML file against an allowlist of safe schemes (https://, http://) and ensure no javascript: or data: URIs are present. If generate_opml.rb fetches URLs, implement proper input validation and avoid following unvalidated redirects. Add automated URL validation in the CI pipeline.
Low · No SECURITY.md or Vulnerability Disclosure Policy — Repository root. The repository does not appear to include a SECURITY.md file or a defined vulnerability disclosure policy. This means security researchers have no clear channel to responsibly report vulnerabilities found in the project. Fix: Add a SECURITY.md file to the repository root that outlines the supported versions, how to report a vulnerability, and the expected response timeline. GitHub also supports a dedicated security advisory feature that should be enabled.
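
The scheme allowlist suggested in the unvalidated-URL finding above can be sketched with Ruby's URI library. This is a minimal illustration: safe_blog_url? and ALLOWED_SCHEMES are hypothetical names, and the check only inspects the URL's shape, not whether the site is trustworthy or reachable.

```ruby
require 'uri'

ALLOWED_SCHEMES = %w[http https].freeze

# Returns true only for parseable URLs with an http(s) scheme and a host.
# Unparseable input and schemes such as javascript: or data: are rejected.
def safe_blog_url?(url)
  uri = URI.parse(url)
  ALLOWED_SCHEMES.include?(uri.scheme&.downcase) && !uri.host.to_s.empty?
rescue URI::InvalidURIError
  false
end

# Illustrative inputs only.
['https://example.com/blog', 'javascript:alert(1)', 'ftp://example.com/feed'].each do |candidate|
  puts "#{candidate}: #{safe_blog_url?(candidate) ? 'ok' : 'rejected'}"
end
```

A filter like this could run over the extracted entries before the OPML is written, so a malformed contribution is rejected at generation time instead of reaching feed readers.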

LLM-derived; treat as a starting point, not a security audit.

Where to read next

Open issues — current backlog
Recent PRs — what's actively shipping
Source on GitHub

Generated by RepoPilot. Verdict based on maintenance signals — see the live page for receipts. Re-run on a new commit to refresh.
