Senior AI Engineer, Video Search (Applied Research & Product)

Department: Technology

Employment Type: Full-time

Location: Remote / Hybrid / Conshohocken, PA

Reports To: Director of AI

About ZeroEyes, Inc.

ZeroEyes was founded by former Navy SEALs, self-starters and elite technologists with a mission to reduce the threat and impact of mass shootings and gun-related violence using our best-in-class artificial intelligence (AI) platform that detects visible firearms before there’s a threat. As a member of the ZeroEyes team, you’ll have the unique opportunity to join a forward-facing, purpose-driven company, and your perseverance and individual skill set will become crucial to our mission’s success.

About the role

We’re hiring a Senior AI Engineer to help lead applied research and productionization of video search, from natural-language queries to fast, scalable retrieval across archives and live streams. You’ll develop models, pipelines, and high-performance APIs. We value people who care more about truth than winning arguments, mentor generously, and take personal responsibility for the organization’s success.

What you’ll do

Contribute to the video search stack end to end: dataset curation, model training/fine-tuning, indexing, retrieval APIs, latency/throughput optimization, and real-world evaluation.
Applied research → production: Evaluate and integrate V-JEPA2-style representations for video understanding and retrieval; compare/compose with CLIP/SigLIP/TimeSformer/ViViT/Video-LLMs for NL→video.
Text–video alignment: Build query encoders for natural-language search (prompting, adapters, contrastive losses, distillation) and robust negative mining; support multilingual queries.
Temporal grounding: Deliver moment-localization and highlight detection (segment-level embeddings, token-aligned pooling, temporal R@K / mAP).
Indexing at scale: Stand up vector/search infra (FAISS, Milvus, pgvector, Pinecone) with sharding, HNSW/IVF/ScaNN, hybrid signals (text + metadata + structure).
Latency & cost: Optimize preprocessing (frame sampling, shot detection), feature caching, batch inference, and low-latency serving (ONNX Runtime/TensorRT or ROCm paths).
Cross-GPU strategies: Design and implement multi-GPU training and serving, including FSDP/ZeRO, tensor & pipeline parallelism, sharded/streamed decoding, NCCL/RCCL communication tuning, mixed precision/quantization, and elastic autoscaling.
Quality & evaluation: Define task-specific metrics (R@K, nDCG, mAP, temporal mAP), build dashboards and A/B tests; run bias/robustness checks and failure-mode analyses.
Security & compliance aware: Design for privacy, auditability, and clean separation of controlled data; collaborate with platform/DevOps on IaC, CI/CD, and observability.
Mentor & collaborate: Level up adjacent teams (ML Ops, backend, product). Write clear design docs and ADRs; lead design reviews.

What you’ll bring

6–10+ years of total experience, including 4+ years applying deep learning to video, vision, or multimodal retrieval with shipped features or products.
Hands-on with PyTorch (preferred) and modern video backbones; practical experimentation with V-JEPA/V-JEPA2 (or JEPA-style self-supervised video objectives).
Strong with text–image/video retrieval (CLIP-family, BLIP/BLIP-2, SigLIP, Q-Former/adapters) and contrastive training at scale.
GPU performance & serving: mixed precision, ONNX Runtime/TensorRT (NVIDIA) or ROCm paths; profiling (nsys/nvprof/rocprof), post-training quantization, distillation.
Cross-GPU & distributed training: FSDP/ZeRO, DDP, tensor/pipeline parallelism, NCCL/RCCL, model sharding/checkpointing, and cluster scheduling (Kubernetes + GPU operators).
ROCm/MIGraphX experience (preferred): building/optimizing models on AMD GPUs; familiarity with MIOpen, MIGraphX backends, and ROCm toolchain.
Search infrastructure: FAISS/Milvus/Pinecone/pgvector, ANN indexes (HNSW/IVF), re-ranking (cross-encoders), and caching strategies.
Data & MLOps: scalable curation, labeling/weak supervision, feature stores, experiment tracking (Weights & Biases/MLflow), CI for ML, and reproducible training.
Solid software engineering: Python (prod-grade), plus a systems language (Go/C++/Rust) or strong willingness to learn; API design; testing; code reviews.
Clear communicator with a bias to measure, publish results, and change direction quickly when the data says so.

Nice-to-haves

Temporal detection/segmentation, tracking, re-ID, and multi-camera association.
Video-RAG and structured retrieval (combining embeddings with metadata/knowledge graphs).
On-device or edge inference; WebRTC/RTSP ingest; FFmpeg/GStreamer pipelines.
Experience in regulated or high-assurance environments (FedRAMP/HIPAA/CJIS) and privacy-preserving ML.

Values

No jerks
Be authentic
Be effective
Attention to detail
All in, all the time

Eligibility

Must be authorized to work in the U.S. Ability to obtain and maintain a Public Trust or other clearance may be required.
