POC to PROD: Hard Lessons from 200+ Enterprise GenAI Deployments - Randall Hunt, Caylent

39.5K views · Jul 23, 2025 · 19:15 min

Takeaway

In enterprise GenAI, frame annotation, multimodal pooled embeddings, and boring infra (pgvector, Elasticsearch) beat exotic approaches.

Summary

Caylent's 200+ enterprise GenAI builds: multimodal video search for Nature Footage using Amazon Nova Pro captions + Titan v2 pooled frame embeddings in Elasticsearch.
Sports highlight extraction hack: use an ffmpeg amplitude spectrogram of crowd cheering to find clips, then embed the audio/video and send push notifications via SNS or AWS End User Messaging.
Annotating video frames (e.g. drawing a blue line on the 3-point arc) dramatically outperforms raw video for VLM Q&A; SAM 2 can auto-annotate.
Favors Postgres + pgvector over OpenSearch for production vector stores; warns GenAI is not the silver bullet your CTO read about in WSJ.
Built a building-decarbonization HVAC agent for Brainbox AI (Time 100 invention) and water-conservation systems for Simmons.
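
The pooled frame embeddings mentioned in the summary can be illustrated with a small sketch. It assumes that "pooled" means mean pooling (averaging per-frame vectors into one clip-level vector), which the summary does not confirm, and it uses an in-memory dict and toy 3-dimensional vectors as stand-ins for Elasticsearch and for real model embeddings. The function names `mean_pool`, `cosine`, and `rank_clips` are hypothetical, chosen only for illustration.

```python
import math

def mean_pool(frame_embeddings):
    """Average per-frame vectors into one clip-level vector (assumed pooling method)."""
    if not frame_embeddings:
        raise ValueError("need at least one frame embedding")
    dim = len(frame_embeddings[0])
    pooled = [0.0] * dim
    for vec in frame_embeddings:
        if len(vec) != dim:
            raise ValueError("all frame embeddings must share one dimension")
        for i, x in enumerate(vec):
            pooled[i] += x
    return [x / len(frame_embeddings) for x in pooled]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_clips(query_vec, clip_index):
    """Return clip ids sorted by similarity to the query, best match first."""
    scored = [(cosine(query_vec, vec), clip_id) for clip_id, vec in clip_index.items()]
    return [clip_id for _, clip_id in sorted(scored, reverse=True)]

# Toy "frame embeddings"; in a real system these would come from an embedding model.
clips = {
    "whale.mp4": mean_pool([[0.9, 0.1, 0.0], [0.8, 0.2, 0.0]]),
    "forest.mp4": mean_pool([[0.0, 0.2, 0.9], [0.1, 0.1, 0.8]]),
}
print(rank_clips([1.0, 0.0, 0.0], clips))
```

Storing one pooled vector per clip keeps the index small compared with indexing every frame, at the cost of blurring clips whose content changes a lot from frame to frame.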

multimodal · enterprise · video-search

Original description

The transition from experimental GenAI demonstrations to robust, production-grade systems involves significant technical and organizational complexities. Humans provide a ceiling on the true ROI of automations. This session synthesizes key patterns and practical strategies gathered from more than 200 GenAI implementations across multiple industries and business sizes.

Beyond the general lessons that apply to most products leveraging GenAI, we'll cover detailed observations within three application areas: multimodal understanding and search, enterprise knowledge retrieval, and AI agent architectures. We will share real-world comparative performance data and metrics on embedding models, vector index implementations, and explore various implementation methodologies that balance performance and cost.

Additionally, the session addresses organizational insights critical to successful AI deployments, such as the importance of clearly defined evaluation processes and understanding real-world user interaction challenges, highlighted by examples from healthcare environments. Attendees will gain an understanding of decision-making criteria, including the appropriate complexity of prompt engineering versus more elaborate orchestration methods, token/cost management strategies in multilingual settings, and the challenges in driving behavioral change with new UX and application interaction capabilities.

Participants will leave equipped with practical, data-supported insights for effectively navigating their own GenAI projects, including benchmarks and criteria for informed technology selection, and techniques to streamline the transition from initial concept to sustainable operational deployment. Please note, we all know this field evolves rapidly and we will mark which lessons we believe are immutable.

About Randall Hunt
Randall Hunt is a technology leader, investor, and hands-on-keyboard coder based in Los Angeles, CA. Previously, Randall led software and developer relations teams at Facebook, SpaceX, AWS, MongoDB, and NASA. Randall spends most of his time listening to customers, building demos, writing blog posts, and mentoring junior engineers. Python and C++ are his favorite programming languages, but he begrudgingly admits that JavaScript rules the world. Outside of work, Randall loves to read science fiction, advise startups, travel, and ski.

Recorded at the AI Engineer World's Fair in San Francisco.