Tenstorrent Unveils Next-Gen Servers for Fast Tokens, No Disaggregation Needed

eetimes.com 2026-04-28 Sally Ward-Foxton

Entities

Companies: Tenstorrent, NVIDIA, Equinix, Cirrascale, AI&Turiyam, Better Brain, OrionVM, GUC, Wiwynn, MRPeasy, Siemens EDA

People: Jim Keller, Jasmina Vasiljevic, Amr El-Ashmawi, Jensen Huang, David Bennett, Harry Foster

Technologies: 3nm EUV, Blackhole chips, Galaxy Blackhole, LLM, video generation, agentic AI, pipeline parallelism, TTLang, CUDA, SRAM, DRAM

Tags

AI chips, Large Language Models, Server hardware, Token generation, Compute clusters, Heterogeneous computing, High-performance computing, AI inference, Data center, Compute optimization, Software stack, Edge computing

News Summary

Tenstorrent is set to unveil its next-generation Galaxy Blackhole servers and cluster systems, designed to deliver fast token generation without relying on the industry trend toward disaggregated infe...

Industry Analysis

By rejecting disaggregated inference and unifying prefill and decode on a single platform, Tenstorrent is pushing for a shift back toward integrated AI server design. Its 32-chip Galaxy Blackhole system leverages 3nm EUV and massive on-chip SRAM to reduce DRAM bottlenecks, but this intensifies co-design demands on EDA vendors (e.g., Siemens) and on advanced packaging. While the approach sidesteps some U.S. export controls, its reliance on TSMC’s 3nm node introduces supply chain fragility. Against NVIDIA’s CUDA moat, Tenstorrent is betting that TTLang and general-purpose compatibility will lure enterprises frustrated by vendor lock-in, which could pressure NVIDIA to open its inference stack faster. If Equinix’s Distributed AI Hub demonstrates deployment efficiency within 18 months, the industry may pivot from raw FLOPS to TCO-optimized edge clusters, undermining the NVLink-centric HPC orthodoxy.


This page displays AI-generated summaries and metadata for research purposes. Original content belongs to the respective publishers.