SemiPulse | AI-Powered Semiconductor Supply Chain Intelligence & Market Signals

Boosting MoE Training Throughput with Advanced Fusion Kernels | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-16 NVIDIA Developer

Mixture-of-experts (MoE) models have quickly become a foundational component of modern, large-scale AI systems. They are widely adopted because they enable substantially larger model capacity while activating only a subset of parameters for each token, offering an unparalleled approach for scaling performance within a practical compute budget. As model scales continue to grow, the optimization of

Mixture-of-Experts AI Training Optimization GPU Performance Deep Learning Acceleration NVIDIA Technical Blog cuDNN Frontend Transformer Engine Megatron-Core

NVIDIA Achieves Leading Agentic Coding Performance on First Agentic AI Benchmark | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-13 NVIDIA Developer

AI agents have fundamentally changed the complexity of inference workloads. Until now, the industry has struggled to define a standard for measuring how inference systems perform under these conditions. Artificial Analysis AgentPerf (AA-AgentPerf) offers the industry’s first multi-vendor open benchmarks profiling trajectories that are representative of real-world AI agent coding tasks. This post

AI Agents Benchmarking NVIDIA GPU Performance Inference Systems Concurrent Agents Hardware Optimization Co-design

Run DiffusionGemma on NVIDIA for Developer-Ready, High-Throughput Text Generation | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-11 NVIDIA Developer

Developers building real-time AI—such as chat assistants, copilots, and agentic workflows—are often constrained by token-by-token generation speed. This limits responsiveness, increases serving costs, and makes fluid, interactive experiences difficult to achieve. DiffusionGemma, created by Google DeepMind and optimized to run efficiently across NVIDIA platforms, introduces a new approach to tex

AI text generation NVIDIA GPU DiffusionGemma parallel computing large language model low-latency inference multimodal AI enterprise AI applications

Designing Production-Ready Battery Energy Storage Systems for AI Factories | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-10 NVIDIA Developer

AI factories are changing what data-center infrastructure must do. Unlike traditional data centers, AI factories are built to manufacture intelligence at scale. They run power-dense training and inference workloads, increasingly support agentic and reasoning models, and must deliver predictable performance even as compute demand shifts rapidly. In this environment, electrical infrastructure is no

AI Factory Battery Energy Storage System BESS Data Center Infrastructure Power Grid Smart Grid Renewable Energy Power Quality

Accelerating Federated Learning Research with AI Agents and NVIDIA FLARE Auto-FL | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-10 NVIDIA Developer

Federated learning (FL) research often begins with a deceptively simple question: What should we try next? A new aggregation rule, a FedProx coefficient, a server optimizer setting, a SCAFFOLD variant, or a model architecture tweak may all look promising before an experiment starts. After the run finishes, the harder questions begin: Did the change actually improve the metric? Was the comparison

Federated Learning AI Agents Automated Experimentation NVIDIA FLARE Machine Learning Optimization Model Training Experiment Ledger Algorithm Improvement

Train Models Faster with JAX and MaxText Using NVFP4 on NVIDIA Blackwell | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-09 NVIDIA Developer

Pre-training frontier LLMs comes down to throughput. When training spans trillions of tokens across thousands of accelerators, every percentage point of step time can add up to days of training and substantial compute costs. Numerical precision is one of the highest-leverage knobs available, but low-bit mixed-precision pretraining is hard to get right. To address this, the NVFP4 training recipe

Large Language Models AI Training NVIDIA Blackwell Mixed Precision Training TransformerEngine JAX MaxText 4-bit Quantization

NVIDIA Nemotron 3 Ultra Powers Faster, More Efficient Reasoning for Long-Running Agents | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-04 NVIDIA Developer

Single-turn chatbots are evolving into long-running agents that can reason, maintain context, use tools, and run efficiently across many turns to complete complex workflows. However, these multi-agent workflows cause token counts to grow quickly. Agents plan, call tools, invoke sub-agents, receive information, and then pass history, outputs, and reasoning steps back into the model continuously. A

NVIDIA Large Language Model Agent Orchestration Multi-Agent Systems Reasoning Efficiency Mixture-of-Experts Long-running Agents Model Optimization

Deploy Agentic-Ready AI at the Edge with Memory Efficiency in NVIDIA JetPack 7.2 | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-02 NVIDIA Developer

As AI agents move from the digital world to the physical environment, they can readily use NVIDIA Jetson to accelerate real-world deployment with optimized memory and performance. NVIDIA JetPack 7.2 directly supports one-command deployment of NVIDIA NemoClaw, an open source stack that adds privacy and security controls to OpenClaw. It introduces NVIDIA agent skills for Jetson—Jetson device-side

NVIDIA Jetson Edge AI Agentic AI Memory Efficiency AI Deployment Embedded Systems GPU Partitioning MIG Technology

Develop Physical AI Reasoning, World, and Action Models with NVIDIA Cosmos 3 | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-01 NVIDIA Developer

Physical AI systems must understand the real world before they can act within it. Robots, autonomous vehicles, and smart spaces need to understand what’s happening in their world, predict what’s likely to happen next, and generate actions for specific environments, embodiments, and tasks. NVIDIA Cosmos 3 is a frontier foundation model for physical AI that combines physical reasoning, world gener

Physical AI World Modeling Action Generation NVIDIA Robotics Autonomous Driving Synthetic Data Vision-Language Model

Advancing AI Infrastructure for Agentic AI with NVIDIA DOCA In-Silicon Security | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-01 NVIDIA Developer

The AI era is driving a new class of infrastructure: AI factories that transform data into intelligence for autonomous AI agents operating at unprecedented scale. Powered by accelerated computing, AI factories enable enterprises to train, fine-tune, and deploy AI with greater speed and efficiency. This new class of infrastructure also introduces a fundamentally new attack surface spanning infras

AI Infrastructure NVIDIA BlueField DOCA Security In-silicon Security AI Factory Cybersecurity Data Protection Agentic AI

NVIDIA Vera CPU Sets a New Standard for Agentic Workloads in AI Factories | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-01 NVIDIA Developer

NVIDIA DSX OS Delivers Open, Modular Software for Operating AI Factories at Scale | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-06-01 NVIDIA Developer

AI is now essential infrastructure, powered by AI factories that generate intelligence in the form of tokens. As demand grows, these factories must scale faster, operate more efficiently, and lower the cost of intelligence across the five-layer stack: energy, chips, infrastructure, models, and applications. NVIDIA DSX platform provides the complete playbook for designing, simulating, building, an

AI Factory NVIDIA DSX Open Source Software Modular Architecture AI Infrastructure GPU Cluster Energy Efficiency Intelligent Scheduling

Run Step 3.7 Flash on NVIDIA GPUs with Enterprise-Ready Multimodal AI | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-05-29 NVIDIA Developer

AI applications are moving beyond text generation to multimodal systems that can perceive, search, and reason across images, documents, video, and language in real time—turning fragmented information into actionable insights. Step 3.7 Flash, the latest from StepFun, brings these capabilities to production and enterprise-scale, available on NVIDIA-accelerated infrastructure. It is a 198B-paramet

Multimodal AI NVIDIA GPU StepFun Enterprise AI Vision-Language Model Mixture-of-Experts Inference Optimization NVIDIA NIM

NVIDIA Dynamo Snapshot: Fast Startup for Inference Workloads on Kubernetes | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-05-28 NVIDIA Developer

The cold-start problem: In production inference deployments, demand fluctuates over time, requiring inference replicas to scale elastically. However, cold-starting inference workloads on Kubernetes can take several minutes. During that time, GPUs are allocated but idle, generating no tokens and serving no requests. This delay increases the risk of service level agreement (SLA) violations during t

AI Inference Kubernetes GPU Acceleration Cold Start Problem Containerization Checkpoint/Restore NVIDIA Dynamo CUDA

NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-05-27 NVIDIA Developer

Large language models (LLMs) are revolutionizing the financial trading landscape by enabling sophisticated analysis of vast amounts of unstructured data to generate actionable trading insights. These advanced AI systems can process financial news, social media sentiment, earnings reports, and market data to predict stock price movements and automate investment strategies with unprecedented accurac

Large Language Models Financial Trading STAC-AI Benchmark NVIDIA Blackwell LLM Inference RAG Pipeline EDGAR Dataset TensorRT LLM

NVIDIA CUDA 13.3 Enhances GPU Development with Tile Programming in C++, Compiler Autotuning, and Python Updates | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-05-27 NVIDIA Developer

NVIDIA CUDA 13.3 brings new capabilities and performance optimizations to developers across the CUDA ecosystem. The launch of NVIDIA CUDA Tile programming in C++ enables high-level, tile-based kernel development that automatically manages complex low-level GPU details for optimal performance and portability. Additionally, CUDA Tile programming is now supported on Compute Capability 9.0 (NVIDIA Ho

NVIDIA CUDA GPU Development Tile Programming Compiler Optimization Python Support GPU Architecture Hopper Architecture

Extract More Kernel Performance with NVIDIA CompileIQ Auto-Tuning | NVIDIA Technical Blog - NVIDIA Developer

0.92

developer.nvidia.com 2026-05-27 NVIDIA Developer

NVIDIA CompileIQ tackles one of the hardest problems in performance engineering: finding the compiler options that unlock the best performance for a specific workload. Consider a team that has spent weeks optimizing an LLM inference pipeline on GPUs, tuning batch sizes, quantizing to FP8, adopting flash attention, fusing every kernel they can. The profiler says there’s nothing left to squeeze. B

GPU Performance Optimization Compiler Optimization AI Infrastructure NVIDIA CUDA AI Inference Acceleration Auto-Tuning Kernel Optimization Machine Learning

Building Token-Metered AI Services on Telco AI Factories | NVIDIA Technical Blog - NVIDIA Developer

0.75

developer.nvidia.com 2026-05-21 NVIDIA Developer

NVIDIA-Verified Agent Skills Provide Capability Governance for AI Agents | NVIDIA Technical Blog - NVIDIA Developer

0.85

developer.nvidia.com 2026-05-20 NVIDIA Developer

How the NVIDIA Vera Rubin Platform is Solving Agentic AI’s Scale-Up Problem | NVIDIA Technical Blog - NVIDIA Developer

0.9

developer.nvidia.com 2026-05-15 NVIDIA Developer

Semiconductor News & Analysis Feed