Pipeline Hardening & GPU-Accelerated Recording

Week 38

#Inspiration Index

The pipeline infrastructure received significant reliability improvements.

  • S3-only upload support means the pipeline now handles video uploads directly from S3 without requiring recorder metadata.
  • Staging environment fixes resolved Step Functions failures and MediaConvert S3 path issues that were blocking staging deployments.
  • Infrastructure automation standardized naming conventions and improved error handling across Terraform configurations.
  • Streamlined testing improved the /test-pipeline command with clearer documentation and better enforcement.

#Autoscroll Recorder

Switched Chrome rendering from EGL to GLX for proper X11/Xorg WebGL support, resolving GPU task failures and enabling hardware-accelerated recording.

Built comprehensive retry functionality for failed recordings:

  • Automatic detection and recovery of failed recordings.
  • Fixed retry job status tracking to prevent false completion states.
  • Clear error messages when S3 videos are detected during recovery.
  • Proper permission handling for S3 object deletion and ECS task management.
  • Docker image metadata logging for better debugging.
  • Resolved timeout issues at 120s for GPU tasks.
  • Fixed EventBridge Pipe and Vercel IAM permissions for full ECS and SQS access.