Technically Speaking with Chris Wright

Podcast Description
Struggling to keep pace with the ever-changing world of technology? For experienced tech professionals, making sense of this complexity to find real strategic advantages is key. This series offers a clear path, featuring insightful, casual conversations with leading global experts, innovators, and key voices from Red Hat, all cutting through the hype.
Drawing from Red Hat’s deep expertise in open source and enterprise innovation, each discussion delves into new and emerging technologies, from artificial intelligence and the future of cloud computing to cybersecurity, data management, and beyond. The focus is on understanding not just the ‘what,’ but the important ‘why’ and ‘how’: exploring how these advancements can shape long-term strategic developments for your organization and your career. Gain an insider’s perspective that humanizes complex topics, helping you anticipate what’s next and make informed decisions. Equip yourself with the knowledge to turn today’s emerging tech into valuable, practical strategies and apply innovative thinking in your work.
Tune in for forward-looking discussions that connect the dots between cutting-edge technology and real-world application, leveraging a rich understanding of the enterprise landscape. Learn to navigate the future of tech with confidence.
Podcast Insights
Content Themes
The podcast explores a variety of technology topics, including artificial intelligence, cloud computing, open source innovation, and cybersecurity. Specific episodes include discussions on AI optimization strategies with experts like Nick Hill and insights into enterprise AI implementations with Brian Stevens, focusing on real-world applications and strategic developments.

Scaling LLM inference for production isn’t just about adding more machines; it demands new intelligence in the infrastructure itself. In this episode, we’re joined by Carlos Costa, Distinguished Engineer at IBM Research, a leader in large-scale compute and a key figure in the llm-d project. We discuss how to move beyond single-server deployments and build the intelligent, AI-aware infrastructure needed to manage complex workloads efficiently.
Carlos Costa shares insights from his deep background in HPC and distributed systems, including:
• The evolution from traditional HPC and large-scale training to the unique challenges of distributed inference for massive models.
• The origin story of the llm-d project, a collaborative, open-source effort to create a much-needed “common AI stack” and control plane for the entire community.
• How llm-d extends Kubernetes with the specialization required for AI, enabling state-aware scheduling that standard Kubernetes wasn’t designed for.
• Key architectural innovations like the disaggregation of prefill and decode stages and support for wide parallelism to efficiently run complex Mixture of Experts (MoE) models; a simplified sketch of the prefill/decode split follows this list.
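
To make the disaggregation idea above a little more concrete, here is a minimal, illustrative Python sketch. It is not llm-d’s actual API or code; the PrefillWorker, DecodeWorker, and serve names are hypothetical, and the “model” is a placeholder. It only shows the general pattern discussed in the episode: a compute-heavy prefill stage processes the prompt once and hands its KV cache to a separate, memory-bandwidth-bound decode stage.

# Illustrative sketch only: NOT llm-d's API, just a toy model of how
# disaggregated serving splits work between prefill and decode pools.
from dataclasses import dataclass, field

@dataclass
class KVCache:
    # Stand-in for the attention key/value state produced during prefill.
    tokens: list[int] = field(default_factory=list)

@dataclass
class Request:
    prompt_tokens: list[int]
    max_new_tokens: int

class PrefillWorker:
    """Compute-bound stage: processes the whole prompt once and emits a KV cache."""
    def run(self, req: Request) -> KVCache:
        return KVCache(tokens=list(req.prompt_tokens))

class DecodeWorker:
    """Bandwidth-bound stage: generates tokens one at a time from the cache."""
    def run(self, req: Request, cache: KVCache) -> list[int]:
        output = []
        for _ in range(req.max_new_tokens):
            # Placeholder for a real model forward pass over the cached state.
            next_token = hash(tuple(cache.tokens)) % 50_000
            cache.tokens.append(next_token)
            output.append(next_token)
        return output

def serve(req: Request, prefill: PrefillWorker, decode: DecodeWorker) -> list[int]:
    # A state-aware scheduler would choose workers based on cache locality and
    # load; here we simply hand the prefill result to a decode worker.
    cache = prefill.run(req)
    return decode.run(req, cache)

if __name__ == "__main__":
    print(serve(Request(prompt_tokens=[1, 2, 3], max_new_tokens=4),
                PrefillWorker(), DecodeWorker()))

In a real deployment the two stages would run in separate worker pools so each can be scaled and scheduled independently, which is the kind of AI-aware placement decision standard Kubernetes scheduling was not designed to make on its own.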
Tune in to discover how this collaborative, open-source approach is building the standardized, AI-aware infrastructure necessary to make massive AI models practical, efficient, and accessible for everyone.

Disclaimer
This podcast’s information is provided for general reference and was obtained from publicly accessible sources. The Podcast Collaborative neither produces nor verifies the content, accuracy, or suitability of this podcast. Views and opinions belong solely to the podcast creators and guests.
For a complete disclaimer, please see our Full Disclaimer on the archive page. The Podcast Collaborative bears no responsibility for the podcast’s themes, language, or overall content. Listener discretion is advised. Read our Terms of Use and Privacy Policy for more details.