Type Safety Everywhere & The Pipeline Grows

This week's work centered on type safety and reliability: end-to-end TypeScript typing across the database SDK, a broader web scraper pipeline, and more capable recording infrastructure.

Week 69 repos · 103 commits

PostgreSDK

A landmark week for PostgreSDK, shipping nine releases (v0.17.0 through v0.18.10) with a focus on TypeScript type safety and developer ergonomics:

  • Select/exclude field filtering added to all CRUD operations, include methods, and JSONB tables, with full TypeScript overloads so return types narrow automatically to the fields you select
  • Automatic type inference for included relations: nested includes are now fully typed at every level
  • Idempotent generation with smart caching — the code generator now detects file-level changes and skips unnecessary regeneration, with version-based cache invalidation
  • Fixed nested include parameter resolution to use the correct target table's spec
  • Fixed pair combination method generation for tables with 4–6 relationships
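
To illustrate how overloads can narrow a return type from a select or exclude list, here is a minimal sketch. It is not PostgreSDK's generated code: the User type, the in-memory users array, and the getUser function are all hypothetical names invented for this example.

```typescript
// Hypothetical row type; not taken from PostgreSDK's generated output.
interface User {
  id: number;
  email: string;
  passwordHash: string;
}

// In-memory stand-in for a table, so the sketch runs without a database.
const users: User[] = [{ id: 1, email: "a@example.com", passwordHash: "x" }];

// Overloads: `select` narrows the result to the chosen keys, `exclude`
// removes the listed keys, and no options returns the full row.
function getUser<K extends keyof User>(
  id: number,
  opts: { select: K[] },
): Pick<User, K> | undefined;
function getUser<K extends keyof User>(
  id: number,
  opts: { exclude: K[] },
): Omit<User, K> | undefined;
function getUser(id: number): User | undefined;
function getUser(
  id: number,
  opts?: { select?: (keyof User)[]; exclude?: (keyof User)[] },
): unknown {
  const row = users.find((u) => u.id === id);
  if (!row) return undefined;

  const out: Record<string, unknown> = {};
  for (const key of Object.keys(row) as (keyof User)[]) {
    // Keep a key if it is selected, or if it is not excluded.
    const keep = opts?.select
      ? opts.select.includes(key)
      : !opts?.exclude?.includes(key);
    if (keep) out[key] = row[key];
  }
  return out;
}

// The compiler types `summary` as Pick<User, "id" | "email"> | undefined,
// so reading summary?.passwordHash would be a compile-time error.
const summary = getUser(1, { select: ["id", "email"] });
```

The implementation signature returns unknown on purpose: callers only ever see the three overloads, which is where the narrowing happens.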

Link Scraper Pipeline

The scraper grew significantly this week, adding new sources and reliability improvements:

  • Added five new scraping sources, bringing the total to nine
  • Domain URL limits and import deduplication prevent runaway crawls and redundant data
  • Parallel page discovery with database-level conflict resolution for higher throughput
  • Enhanced page discovery with homepage link extraction and dry-run testing support
  • Round-robin website selection for balanced recorder submissions
  • URL normalization and metadata sanitization overhauled for cleaner data
  • Expanded page type taxonomy and refined AI classification categories
  • SNS state-change alerts with Terraform IaC for operational visibility
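
As a rough sketch of how URL normalization, import deduplication, and a per-domain limit can work together, consider the following. The normalization rules (dropping fragments and utm_ parameters, sorting query parameters, trimming trailing slashes) and the names normalizeUrl and selectUrls are assumptions made for illustration; the pipeline's actual rules are not described in this log.

```typescript
// Hypothetical normalization rules, chosen only for illustration.
function normalizeUrl(raw: string): string {
  const url = new URL(raw); // throws on unparseable input; lowercases the host
  url.hash = "";

  // Collect keys first, then delete, to avoid mutating while iterating.
  const tracking: string[] = [];
  url.searchParams.forEach((_value, key) => {
    if (key.startsWith("utm_")) tracking.push(key);
  });
  for (const key of tracking) url.searchParams.delete(key);
  url.searchParams.sort();

  // Treat "/a/" and "/a" as the same page, but leave the root path alone.
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
  }
  return url.toString();
}

// Deduplicate on the normalized form and cap accepted URLs per hostname.
function selectUrls(rawUrls: string[], maxPerDomain: number): string[] {
  const seen = new Set<string>();
  const perDomain = new Map<string, number>();
  const accepted: string[] = [];

  for (const raw of rawUrls) {
    let normalized: string;
    try {
      normalized = normalizeUrl(raw);
    } catch {
      continue; // skip anything that is not a valid absolute URL
    }
    if (seen.has(normalized)) continue;

    const host = new URL(normalized).hostname;
    const count = perDomain.get(host) ?? 0;
    if (count >= maxPerDomain) continue;

    seen.add(normalized);
    perDomain.set(host, count + 1);
    accepted.push(normalized);
  }
  return accepted;
}
```

Deduplicating on the normalized form rather than the raw string is what lets variants such as a trailing slash or a tracking parameter collapse into a single entry before the domain limit is counted.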

Inspiration Index Pipeline

  • Tech stack detection added as a new field: pages are now analyzed for the technologies they use
  • Migrated to a normalized technology storage schema with type-safe SDK includes
  • Improved AI classification with better examples and a model upgrade
  • Taxonomy color delimiters updated for more reliable parsing
  • Schema cleaned up and DB SDK regenerated

Autoscroll Recorder API

  • Custom tech detection replaces the third-party Wappalyzer library, with a new detectTechnologies API param
  • Switched to Ghostery adblocker for more reliable cookie and ad blocking
  • Persistent Chrome profile replaces dynamic locale detection for more consistent browser behavior
  • Deterministic S3 keys replace date-partitioned keys for simpler asset management
  • Navigation errors excluded from browser crash detection to reduce false positives
  • ECS task event capture added for operational observability
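
The difference between date-partitioned and deterministic keys can be sketched as follows. The key layout and the RecordingAsset fields (jobId, device, fileName) are hypothetical and not the recorder's actual scheme; the point is only that a deterministic key depends on nothing but stable inputs.

```typescript
// Hypothetical asset description; field names are assumptions for this sketch.
interface RecordingAsset {
  jobId: string;
  device: string;
  fileName: string;
}

// Date-partitioned: the key depends on when the upload happens, so finding
// the object later requires knowing (or storing) the upload date.
function datePartitionedKey(asset: RecordingAsset, now: Date): string {
  const day = now.toISOString().slice(0, 10).replace(/-/g, "/"); // e.g. 2024/01/31
  return `recordings/${day}/${asset.jobId}/${asset.device}/${asset.fileName}`;
}

// Deterministic: the same asset always maps to the same key, so the key can
// be recomputed from the job alone and a retry overwrites the earlier upload.
function deterministicKey(asset: RecordingAsset): string {
  return `recordings/${asset.jobId}/${asset.device}/${asset.fileName}`;
}
```

With the deterministic form, a retried job writes to the same location instead of leaving a second copy under a different date prefix.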

Autoscroll Recorder Web

  • Role-based access control with admin user support
  • Device selection added to the job creation API
  • Bulk job retry API endpoint for operational recovery
  • Waitlist system added with improved retry distribution handling

Inspiration Index App

  • Feedback system with save buttons and navigation improvements shipped
  • Path-based device routing replaces URL param-based routing for cleaner URLs
  • API payloads optimized using the new field selection/exclusion from the Pipeline DB SDK
  • Saved items Zod schemas extracted to a shared directory

Silo Event Store API

  • Bot detection system launched with user agent analysis, UBID tracking, and iOS browser support
  • Team membership enrichment system added
  • UBID enrichment system for improved visitor identity tracking
  • Bot enrichment optimized with bulk SQL updates
  • Idempotent team enrichment with corrected property field names
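
One common way to turn many per-row updates into a single bulk statement in PostgreSQL is UPDATE ... FROM (VALUES ...). The sketch below builds such a parameterized query. The events table, its id and is_bot columns, and the buildBulkUpdate helper are hypothetical; the actual enrichment queries may look quite different.

```typescript
// Hypothetical shape of one enrichment result.
interface BotVerdict {
  eventId: string;
  isBot: boolean;
}

interface Query {
  text: string;
  values: (string | boolean)[];
}

// Builds one parameterized statement that updates every row in the batch,
// instead of issuing one UPDATE per event.
function buildBulkUpdate(verdicts: BotVerdict[]): Query | null {
  if (verdicts.length === 0) return null; // an empty VALUES list is invalid SQL

  const values: (string | boolean)[] = [];
  const tuples = verdicts.map((verdict, i) => {
    values.push(verdict.eventId, verdict.isBot);
    // Placeholders are 1-based: ($1, $2), ($3, $4), ...
    return `($${i * 2 + 1}::text, $${i * 2 + 2}::boolean)`;
  });

  const text =
    "UPDATE events AS e SET is_bot = v.is_bot " +
    `FROM (VALUES ${tuples.join(", ")}) AS v(id, is_bot) ` +
    "WHERE e.id = v.id";
  return { text, values };
}
```

Because the statement sets each row to a fixed value, running it twice with the same batch leaves the table unchanged, which fits the idempotent enrichment goal.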

Formgen

  • Rating matrix export now expands fields into individual columns for better spreadsheet usability
  • Excel export fallback fixed for nested data structures
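
Expanding a rating matrix into individual columns might look roughly like this. The row shape, the "field: item" column naming, and the flattenRow helper are assumptions for illustration, not Formgen's actual export code.

```typescript
type Cell = string | number | boolean | null;
// A row value is either a plain cell or a rating matrix (item -> rating).
type Row = Record<string, Cell | Record<string, Cell>>;

// Turns each matrix entry into its own column so a spreadsheet shows one
// rating per cell instead of a serialized object.
function flattenRow(row: Row): Record<string, Cell> {
  const flat: Record<string, Cell> = {};
  for (const [field, value] of Object.entries(row)) {
    if (value !== null && typeof value === "object") {
      for (const [item, rating] of Object.entries(value)) {
        flat[`${field}: ${item}`] = rating; // column naming is an assumption
      }
    } else {
      flat[field] = value;
    }
  }
  return flat;
}

const exported = flattenRow({
  name: "Ada",
  satisfaction: { speed: 4, support: 5 },
});
```

Here exported has three columns: name, "satisfaction: speed", and "satisfaction: support".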

This shiplog is partially AI-generated from commit history.