Senior AI Engineer, Video Search (Applied Research & Product)

Department: Technology
Employment Type: Full-time
Location: Remote / Hybrid / Conshohocken, PA
Reports To: Director of AI

About ZeroEyes, Inc.

ZeroEyes was founded by former Navy SEALs, self-starters and elite technologists with a mission to reduce the threat and impact of mass shootings and gun-related violence using our best-in-class artificial intelligence (AI) platform that detects visible firearms before there’s a threat. As a member of the ZeroEyes team, you’ll have the unique opportunity to join a forward-facing, purpose-driven company, and your perseverance and individual skill set will become crucial to our mission’s success.

About the role

We’re hiring a Senior AI Engineer to help lead applied research and productionization of video search, from natural-language queries to fast, scalable retrieval across archives and live streams. You’ll develop models, pipelines, and high-performance APIs. We value people who care more about truth than winning arguments, mentor generously, and take personal responsibility for the organization’s success.

What you’ll do

  • Contribute to the video search stack end to end: dataset curation, model training/fine-tuning, indexing, retrieval APIs, latency/throughput optimization, and real-world evaluation.
  • Applied research → production: Evaluate and integrate V-JEPA2-style representations for video understanding and retrieval; compare/compose with CLIP/SigLIP/TimeSformer/ViViT/Video-LLMs for NL→video.
  • Text–video alignment: Build query encoders for natural-language search (prompting, adapters, contrastive losses, distillation) and robust negative mining; support multilingual queries.
  • Temporal grounding: Deliver moment-localization and highlight detection (segment-level embeddings, token-aligned pooling, temporal R@K / mAP).
  • Indexing at scale: Stand up vector/search infra (FAISS, Milvus, pgvector, Pinecone) with sharding, HNSW/IVF/ScaNN, hybrid signals (text + metadata + structure).
  • Latency & cost: Optimize preprocessing (frame sampling, shot detection), feature caching, batch inference, and low-latency serving (ONNX Runtime/TensorRT or ROCm paths).
  • Cross-GPU strategies: Design and implement multi-GPU training and serving, including FSDP/ZeRO, tensor & pipeline parallelism, sharded/streamed decoding, NCCL/RCCL communication tuning, mixed precision/quantization, and elastic autoscaling.
  • Quality & evaluation: Define task-specific metrics (R@K, nDCG, mAP, temporal mAP); build dashboards and A/B tests; run bias/robustness checks and failure-mode analyses.
  • Security & compliance awareness: Design for privacy, auditability, and clean separation of controlled data; collaborate with platform/DevOps on IaC, CI/CD, and observability.
  • Mentor & collaborate: Level up adjacent teams (ML Ops, backend, product). Write clear design docs and ADRs; lead design reviews.

What you’ll bring

  • 6–10+ years of total experience, including 4+ years applying deep learning to video, vision, or multimodal retrieval with shipped features or products.
  • Hands-on with PyTorch (preferred) and modern video backbones; practical experimentation with V-JEPA/V-JEPA2 (or JEPA-style self-supervised video objectives).
  • Strong with text–image/video retrieval (CLIP-family, BLIP/BLIP-2, SigLIP, Q-Former/adapters) and contrastive training at scale.
  • GPU performance & serving: mixed precision, ONNX Runtime/TensorRT (NVIDIA) or ROCm paths; profiling (nsys/nvprof/rocprof), post-training quantization, distillation.
  • Cross-GPU & distributed training: FSDP/ZeRO, DDP, tensor/pipeline parallelism, NCCL/RCCL, model sharding/checkpointing, and cluster scheduling (Kubernetes + GPU operators).
  • ROCm/MIGraphX experience (preferred): building/optimizing models on AMD GPUs; familiarity with MIOpen, MIGraphX backends, and ROCm toolchain.
  • Search infrastructure: FAISS/Milvus/Pinecone/pgvector, ANN indexes (HNSW/IVF), re-ranking (cross-encoders), and caching strategies.
  • Data & MLOps: scalable curation, labeling/weak supervision, feature stores, experiment tracking (Weights & Biases/MLflow), CI for ML, and reproducible training.
  • Solid software engineering: Python (prod-grade), plus a systems language (Go/C++/Rust) or strong willingness to learn; API design; testing; code reviews.
  • Clear communicator with a bias to measure, publish results, and change direction quickly when the data says so.

Nice-to-haves

  • Temporal detection/segmentation, tracking, re-ID, and multi-camera association.
  • Video-RAG and structured retrieval (combining embeddings with metadata/knowledge graphs).
  • On-device or edge inference; WebRTC/RTSP ingest; FFmpeg/GStreamer pipelines.
  • Experience in regulated or high-assurance environments (FedRAMP/HIPAA/CJIS) and privacy-preserving ML.

Values

  • No jerks
  • Be authentic
  • Be effective
  • Attention to detail
  • All in, all the time

Eligibility

  • Must be authorized to work in the U.S. Ability to obtain and maintain a Public Trust or other clearance may be required.

Apply for Senior AI Engineer, Video Search (Applied Research & Product) at ZeroEyes

Ready to join our team? Submit your application below for this role. We look forward to reviewing your application.
