Tech Threads: Weaving the Intelligent Future
Podcast Description
This podcast, hosted by Baya Systems, explores the cutting edge of technology, from AI acceleration to data movement and chiplet innovation. Each episode examines advancements shaping the future of computing, featuring insights from industry experts on the trends and challenges defining the tech landscape. Tune in to stay ahead in the rapidly evolving world of technology.
Podcast Insights
Content Themes
The podcast covers cutting-edge themes in technology such as AI acceleration, data movement, and chiplet innovation. For instance, in the inaugural episode, the discussion revolves around the rapid evolution of the semiconductor industry, highlighting the impact of AI on chip design and the implications of a slowing Moore's Law.

In this episode of Tech Threads: Weaving the Intelligent Future, Baya Systems’ Nandan Nayampally sits down with Charlie Cheng, founder and CEO of TC Lab, for an in-depth conversation on the memory wall and why it has become one of the defining bottlenecks in AI infrastructure. While memory constraints have existed for decades, AI inference is bringing the issue into sharper focus by turning memory bandwidth into a direct driver of user experience, system performance, and data center economics.
Charlie shares his perspective on the industry’s shift toward alternative AI architectures, from high-bandwidth memory and SRAM-based approaches to emerging 3D memory technologies and hybrid-bonded architectures that bring memory much closer to compute. He explains why inference workloads, especially token generation and KV cache access, can quickly become bandwidth-bound, and why solving that challenge requires rethinking the relationship between compute, memory, packaging, and on-chip data movement.
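As a rough illustration of why token generation tends to be bandwidth-bound, the sketch below estimates an upper bound on single-stream decode speed. It is a back-of-envelope model, not something taken from the episode: it assumes every generated token requires reading all model weights plus the full KV cache from memory once, and it ignores compute time, batching, and caching effects. The function names and all numbers (an 8-billion-parameter model in 16-bit precision, 32 layers, 8 KV heads, head dimension 128, an 8192-token context, 1 TB/s of memory bandwidth) are hypothetical values chosen only for the example.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_len, bytes_per_value=2):
    """Size of the KV cache for one sequence (hypothetical helper).

    The leading factor of 2 accounts for storing one key tensor and one
    value tensor per layer.
    """
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_value


def decode_tokens_per_second(weight_bytes, cache_bytes, bandwidth_bytes_per_s):
    """Upper bound on decode rate if memory traffic is the only limit.

    Assumption: each new token reads all weights and the whole KV cache once.
    """
    bytes_per_token = weight_bytes + cache_bytes
    return bandwidth_bytes_per_s / bytes_per_token


# Hypothetical example values, not figures from the podcast.
weights = 8_000_000_000 * 2  # 8e9 parameters at 2 bytes each
cache = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128, context_len=8192)
bandwidth = 1e12  # 1 TB/s

rate = decode_tokens_per_second(weights, cache, bandwidth)
print(f"KV cache: {cache / 2**30:.2f} GiB, upper bound: {rate:.1f} tokens/s")
```

Under these assumptions the bound scales linearly with memory bandwidth and shrinks as the context (and therefore the KV cache) grows, which is the relationship the episode description points to when it ties memory bandwidth to user experience.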
The discussion also explores what happens when memory bottlenecks are reduced or removed. As more bandwidth becomes available to AI accelerators, the pressure shifts to the rest of the system, including networks-on-chip, chiplet fabrics, and data movement architectures. For companies building next-generation AI chips, hyperscale infrastructure, autonomous systems, and edge inference platforms, this creates both a challenge and an opportunity: the need for more flexible, scalable, and software-defined approaches to moving data efficiently across increasingly complex systems.
Tune in for an expert look at why the future of AI performance depends as much on memory innovation and data movement as it does on compute, and how new architectures could help unlock faster, more efficient, and more scalable AI systems.
