Note: This is a remote job open to candidates in the USA. Strattmont is a construction-tech company headquartered in Riyadh, focused on building a connected-worker platform for large-scale construction sites. They are seeking a Computer Vision & AI Lead to own the SiteGuard stack, develop detection models, and lead a team of senior engineers in enhancing safety compliance through advanced AI technologies.
Responsibilities
- Own hiring end-to-end for SiteGuard: define the bar across ML, backend, and frontend disciplines, run the process, and make decisions. You have hired senior engineers before and know what good looks like
- Manage performance directly - set clear expectations, give continuous feedback, run meaningful reviews, and act decisively when someone is not meeting the bar
- Build a high-ownership engineering culture where engineers take initiative, write their own issues, and feel accountable for product outcomes, not just task completion
- Mentor engineers at every level - from onboarding new contributors to developing senior engineers. Your track record includes engineers who grew significantly under your leadership
- Own the squad's delivery: run agile ceremonies, plan sprints, and partner with Product to translate the SiteGuard roadmap into shippable increments
- Own the full lifecycle of SiteGuard's detection capabilities - PPE compliance, hazard detection, unsafe-behavior recognition - from problem framing through training, evaluation, and production deployment
- Lead the strategic use of modern architectures and foundation models: transformer-based detectors (RT-DETR, Co-DETR) alongside YOLO-family models, including recent releases (YOLOv10/11); zero/few-shot approaches via CLIP, DINOv2, GroundingDINO, and SAM 2; and VLMs (Gemini Vision, GPT-4V, LLaVA, Qwen-VL) for scene understanding and incident reasoning where they outperform purpose-built detectors
- Lead the use of generative AI to augment training data - diffusion-based synthetic image generation (ControlNet, Stable Diffusion) for rare PPE violations, lighting conditions, and site environments that real-world collection cannot cover economically
- Define and drive model-quality targets (precision/recall, false-alarm rates) and the retraining loops that sustain them as site conditions change
- Oversee dataset strategy: collection, AI-assisted annotation (using SAM, CLIP, or VLMs as labeling tools), curation, and governance of site imagery and video
- Lead development of the on-site edge agent that runs inference close to the camera - optimizing transformer-based and classical models for constrained hardware (INT8/FP16 quantization, TensorRT, ONNX, batching, CUDA/NPU accelerators)
- Lead development of cloud camera agents for cloud-based deployments, ensuring detection output stays consistent with the on-site agents regardless of deployment type
- Engineer for the realities of the field: on-prem gateway deployment, intermittent connectivity, store-and-forward, and graceful degradation
- Ensure low-latency, reliable detection-to-alert pipelines from camera to platform, including on-device pre-filtering before cloud VLM calls where cost and latency demand it
- Own the golden labeled dataset as the single source of truth for evaluation, fine-tuning, and production monitoring. Feed production reviewer signals (labels, corrections) back into the dataset continuously
- Design evaluation metrics for both classical detector output and VLM-generated detections (precision/recall, false-positive control, human-agreement). Run every model, prompt, or schema change as a regression test against the golden set before release; no change ships without a measured quality bar
- Architect data-extraction, AI-assisted annotation, and training pipelines that ensure reproducibility and versioning of datasets and models (experiment tracking, model registries, dataset versioning)
- Implement CI for models and code: automated retraining/evaluation, LLM-as-judge patterns for open-ended detection outputs, and production monitoring for model drift, cost, and accuracy degradation
- Build deterministic post-processing guardrails over model output - domain/OSHA rule filters, confidence calibration, audit trails - so the product behaves predictably even when model output varies
- Raise engineering maturity across SiteGuard repositories: test coverage, CI gating, and coverage reporting
- Own the SiteGuard product surfaces (web dashboards and frontends) and the APIs that deliver detected violations and safety events into the Company's core platform
- Ensure alerting, reporting, and analytics turn raw detections into clear, prioritized actions - including LLM-generated incident summaries and natural-language search over safety event history
- Enforce code review, testing, and quality standards across model and application code
- Champion privacy-by-design for video and personal data - anonymization, access controls, retention limits, and responsible use of footage in compliance with Company and customer requirements. Implement responsible-AI safeguards on all AI-generated outputs: confidence thresholds, human-in-the-loop review for high-severity alerts, and audit trails
- Collaborate with Hardware/camera, DevOps, Field Engineering, and customer teams to validate SiteGuard in real site conditions and incorporate feedback
- Lead R&D into new detection capabilities - multi-modal models combining video and sensor data, behavioral analysis, crowd analytics - and evaluate emerging CV/AI approaches with honest, grounded judgment
- Stay current with the fast-moving foundation-model and VLM ecosystem and translate new research into concrete roadmap decisions for the squad
Skills
- 6+ years of software/ML engineering experience with a clear progression from senior IC to engineering leadership
- 3+ years directly managing engineers - leading teams, owning hiring, running performance cycles. Mentoring is not the same as managing; this role requires the latter
- A track record of hiring: you have built or significantly grown an engineering team and made independent hiring decisions at the senior engineer level and above
- Proven experience shipping production systems built on vision/multimodal foundation models (VLMs/LLMs via cloud APIs) - owning quality, latency, and cost from prototype to scale
- Hands-on experience operating high-throughput, asynchronous video/media processing pipelines in production
- Strong Python (async-first: asyncio, FastAPI or equivalent); production service design with a focus on reliability and observability
- Hands-on with multimodal/VLM APIs - Gemini/Vertex AI, OpenAI, or Anthropic equivalents: prompt engineering, structured/JSON-schema-constrained output, context/caching, and per-model parameter tuning
- Computer-vision foundations: object detection and segmentation across both classical architectures (YOLO-family) and modern transformer-based detectors (RT-DETR, Co-DETR, GroundingDINO); video frame extraction and handling (OpenCV/FFmpeg); spatial reasoning over model output
- Foundation models and zero/few-shot approaches: CLIP, DINOv2, SAM 2 for annotation assistance and detection; VLMs for scene understanding and incident reasoning
- Edge inference optimization: ONNX, TensorRT, quantization (INT8/FP16), deployment to constrained hardware (Jetson, Hailo, or equivalent)
- Distributed pipeline design: message brokers, relational databases with async ORM and migration tooling, object storage - comfortable across cloud providers (GCP, AWS, or Azure)
- MLOps stack: experiment tracking, model registries, dataset versioning, and CI pipelines for model evaluation
- Daily use of Claude Code and Codex as primary tools in real engineering work, with formed opinions about when to trust their output and when not to
- Current awareness of the AI model landscape: practical differences between frontier models (Gemini, GPT-4o, Claude, DeepSeek) for vision tasks, code generation, and structured output
- Tracks AI trends actively - model releases, VLM capabilities, agentic framework developments - and translates them into concrete, grounded team guidance
- Exceptional written and spoken English. This is a hard requirement. You write clearly and precisely - design documents, evaluation reports, and stakeholder updates are well-structured and unambiguous. You can explain a model's failure mode to an HSE manager and an architecture decision to an engineer with equal clarity
- Experience with safety, surveillance, or video analytics in industrial or construction environments. OSHA/EHS domain knowledge is a strong plus
- Experience with synthetic data generation pipelines (ControlNet, Stable Diffusion) for computer-vision training data augmentation
- LLMOps / observability for model-backed services: tracing model calls, output monitoring, A/B testing of prompts and schemas
- Agentic frameworks (LangChain, LlamaIndex, AutoGen, or equivalent) applied to safety workflows or multi-step incident management
- Comfort working config-over-code for multi-tenant rollouts (per-project model and prompt configuration)
- Familiarity with on-prem/edge deployment, gateways, and operating under intermittent connectivity
- Experienced people manager: you have had difficult performance conversations, managed out engineers who were not meeting the bar, and done so with fairness and directness
- Research-oriented curiosity balanced with production pragmatism - you read papers, run experiments, and ship to production. You distinguish AI approaches that deliver real site-safety value from impressive benchmarks that do not survive the field
- Excellent written and spoken English - able to translate model behavior, failure modes, and confidence levels to non-technical stakeholders, including explaining why an AI made a specific safety call
- Strategic, outcome-driven thinking: makes technology decisions based on product value and field reliability, not novelty
- Comfort operating in a fast-paced, evolving environment where both site priorities and the AI landscape shift quickly
Benefits
- Competitive salary, performance bonus, and equity participation.
- High-autonomy role with direct product and company impact - you are building a safety AI product from the ground up, not maintaining an inherited codebase.
- A small, senior engineering team where your decisions matter and your name is on the architecture.
- Relocation support for candidates joining from outside KSA.
- Health insurance, annual flights, and the standard Company benefits package.
Company Overview
Strattmont offers IT services, consulting, and talent acquisition, focusing on hiring and managed IT solutions. It was founded in 2020, and is headquartered in Brvenica, Brvenica, MKD, with a workforce of 11-50 employees. Its website is https://strattmont.com.