Merge to main
Podcast Description
Merge to Main is a podcast where we talk with people who've been in the trenches of tech leadership. No fancy buzzwords or corporate speak - just honest conversations about what it's actually like to build and lead engineering teams. Each week, we explore the real challenges of engineering leadership - from hiring and team culture to shipping quality products. Practical, no-nonsense conversations that help you become a better leader today.
Podcast Insights
Content Themes
The podcast emphasizes engineering leadership themes such as hiring practices, team culture, quality assurance, and adapting to new technologies like AI. For instance, episodes might delve into hiring strategies that weigh soft skills alongside technical abilities, examine cultural shifts within teams, and analyze the balance between quality and development speed.

From BERT to Agents: Building Production AI at Booking.com
After seven years building ML systems that serve millions of travelers, Georgios Chouliaras has watched the field transform from hand-coded chatbot rules to autonomous agents—and he's learned which shiny new approaches actually work in production.
Georgios Chouliaras, Senior Machine Learning Scientist at Booking.com, joins me to share hard-won insights from deploying AI at scale. His journey spans customer service chatbots that broke during COVID (because the training data didn't include "global pandemic"), company-wide ML best practices, and now the cutting edge of agent development.
In this episode, we explore:
- Why LLMs represent the biggest abstraction leap since high-level programming languages, and what control you sacrifice for that flexibility
- The practical framework for deciding when LLMs beat classical ML (hint: it's not always about having text data)
- How to build LLM judges that actually work: starting with binary labels, achieving annotator agreement before anything else, and why boundary cases matter most for few-shot examples
- What's genuinely unsolved in agents right now: memory as lifelong learning, and planning approaches that don't collapse under complexity
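One concrete step from the LLM-judge discussion above is achieving annotator agreement on binary labels before trusting any automated judge. As an illustrative sketch (not code from the episode), Cohen's kappa is a standard way to measure that agreement:

```python
# Before validating an LLM judge, check that human annotators agree on the
# same binary labels (1 = pass, 0 = fail). Cohen's kappa measures agreement
# beyond chance. Illustrative implementation, not from the episode.

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length binary label lists."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's rate of 1-labels
    p_a1 = sum(labels_a) / n
    p_b1 = sum(labels_b) / n
    expected = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
    if expected == 1.0:
        return 1.0
    return (observed - expected) / (1 - expected)

annotator_1 = [1, 1, 0, 1, 0, 1, 1, 0]
annotator_2 = [1, 1, 0, 0, 0, 1, 1, 0]
print(f"kappa = {cohens_kappa(annotator_1, annotator_2):.2f}")  # kappa = 0.75
```

If two humans can't agree on the labels, no judge prompt can be meaningfully validated against them.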
Georgios challenges some popular assumptions: the ReAct pattern everyone implements? He hasn't seen it consistently outperform simpler approaches. Massive parameter counts? Architecture and training data now matter more. His underhyped pick: straightforward function calling often beats elaborate agent architectures.
The core takeaway: use the simplest tool that solves your problem. Production users don't care if you're running a sophisticated multi-agent system; they care if it works.
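To make the "straightforward function calling" point concrete, here is a minimal sketch of the pattern: the model emits a JSON tool call and the application dispatches it directly, with no planner or multi-agent loop. The tool names, schema shape, and `dispatch` helper are all illustrative assumptions, not an API from the episode:

```python
# Minimal function-calling sketch: a model produces a JSON tool call and the
# app executes it directly. All names here are hypothetical examples.
import json

def get_weather(city: str) -> str:
    # Stand-in for a real weather API call
    return f"Sunny in {city}"

TOOLS = {"get_weather": get_weather}

def dispatch(model_output: str) -> str:
    """Parse a {"name": ..., "arguments": {...}} tool call and run it."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# Simulated model output; a real LLM would generate this string
print(dispatch('{"name": "get_weather", "arguments": {"city": "Amsterdam"}}'))
# prints "Sunny in Amsterdam"
```

The whole control flow fits in one dispatcher, which is exactly why it is easy to test and debug compared to a multi-agent architecture.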
Connect with Georgios:
- LinkedIn: https://www.linkedin.com/in/chouligi/
Connect with me:
Check out our awesome sponsor, dearmachines.com: QA AI Agents for Continuous Testing.

Disclaimer
This podcast’s information is provided for general reference and was obtained from publicly accessible sources. The Podcast Collaborative neither produces nor verifies the content, accuracy, or suitability of this podcast. Views and opinions belong solely to the podcast creators and guests.
For a complete disclaimer, please see our Full Disclaimer on the archive page. The Podcast Collaborative bears no responsibility for the podcast’s themes, language, or overall content. Listener discretion is advised. Read our Terms of Use and Privacy Policy for more details.