Computer Vision Engineer (Detection, Tracking & 2D Metric Calibration Specialist)

Remote · Full-time

Project Context

CrackCoach is an AI platform for automatic analysis of show-jumping videos. This role builds the IMAGE-level perception and geometry stack that everything depends on: detection, tracking, obstacle understanding, jump segmentation, and metric calibration in real-world competition footage. Without a rock-solid perception and geometric foundation, pose estimation, biomechanics, and AI coaching are not reliable.

⸻

Core Mission and Responsibilities

You will design, implement, and validate a production-grade computer vision pipeline capable of ingesting raw competition videos and producing robust, structured, and metric-aware outputs. Your responsibilities include:

• Video ingestion and preprocessing: handle codecs, resolutions, FPS, orientation, stabilization, and cropping policies.
• Horse-and-rider detection using state-of-the-art detectors (YOLO / RT-DETR / Detectron2 or equivalent).
• Persistent tracking across frames (ByteTrack, BoT-SORT, DeepSORT, Kalman-based trackers).
• Obstacle detection and scene understanding for show-jumping arenas (rails, poles, standards).
• Obstacle-to-jump association logic: correctly identify which obstacle is being jumped and when.
• Automatic segmentation of a full round video into individual jump clips (per-obstacle segments).
• 2D trajectory reconstruction of the horse in image space with stable, low-jitter trajectories.

2D Metric Calibration (Image → Ground Plane)

In addition to perception, this role includes implementing a robust 2D metric calibration module:

• Estimate a ground-plane homography (image → ground) using stable scene references such as obstacle bases or other ground contact points.
• Compute a pixel-to-meter scale, ideally leveraging known or user-declared obstacle heights (e.g. “course at 1.35m”) when available.
• Project horse trajectories from image space to ground-plane coordinates in meters.
• Enable metric estimates such as:
  • approach speed (m/s)
  • distances between obstacles (m)
  • take-off and landing distances at ground level (m)
  • approximate stride length at ground level (when combined later with biomechanics)
• Provide a calibration confidence indicator and gracefully fall back to relative (pixel-based) measures when calibration is unreliable.

The calibration module must be robust, non-blocking, and designed for real-world competition footage (single camera, uncontrolled viewpoints).

⸻

Required Technical Skills

• Strong background in computer vision applied to video (sports footage experience is a strong plus).
• Proven experience with object detection (YOLO family, Detectron2, RT-DETR, etc.).
• Multi-object tracking expertise (ByteTrack / BoT-SORT / DeepSORT; handling occlusions and ID switches).
• Experience with segmentation models (Mask R-CNN, YOLO-Seg, SAM-family) if needed for background removal.
• Solid understanding of image-space geometry and camera perspective limitations.
• Experience implementing 2D metric calibration using planar homography and RANSAC.
• Comfortable working with pixel-to-meter conversions and expressing metric uncertainty.
• Advanced Python and OpenCV; deep learning framework (PyTorch preferred).
• Experience building modular, maintainable pipelines with clear interfaces and exports.

⸻

Key Technical Challenges

• Highly variable camera angles, zoom levels, and lighting conditions.
• Dynamic occlusions from obstacles, rails, other horses, and spectators.
• Motion blur and compression artifacts in user-generated videos.
• Background clutter and false positives (banners, rails, similar shapes).
• Maintaining stable trajectories despite noisy detections and temporary misses.
• Correct obstacle differentiation and obstacle association in multi-obstacle scenes.
• Metric calibration with a single camera, limited scene control, and partial reference data.
• Performance constraints: processing HD videos in minutes, not hours.

⸻

Expected Deliverables

• A fully modular computer vision pipeline (source code) that ingests raw video and outputs:
  • detections
  • tracks
  • obstacle detections
  • jump segments
  • 2D trajectories
  • ground-plane metric projections (when calibration is reliable)
• A 2D calibration module producing pixel-to-meter scale, ground-plane mapping, and confidence scores.
• Trained detection/segmentation models (weights + training scripts) when custom training is required.
• Clean data exports (JSON / CSV) and stable ROI frame exports for pose estimation and biomechanics.
• Visual validation outputs (overlays showing boxes, tracks, obstacles, jump boundaries, and metric projections).
• Clear technical documentation defining interfaces and data formats for downstream pose estimation, biomechanics, and AI coaching stages.

⸻

Important Notes

• This role does NOT include pose estimation or biomechanics (handled by separate specialists).
• Metric calibration is 2D ground-plane based, not full 3D reconstruction.
• Robustness and graceful degradation are more important than theoretical precision.

Apply to this job
Apply Now →