Sustainable NPU Architecture

The surge in AI applications has made the design of efficient neural processing units (NPUs) crucial. As AI models are tailored to their purposes, they diversify not only in size but across every dimension, including the operator types, data formats, and numeric formats they employ. Industry and academia respond by proposing customized solutions and designing new NPU chips for each emerging model with novel characteristics.
We raise the question: "Is this approach sustainable?"
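As a minimal sketch of the numeric-format diversity mentioned above, the snippet below renders one FP32 weight in three formats commonly found in inference hardware. The helper names and values are our own illustration, not part of any project listed here.

```python
import struct

# Illustrative only: the same FP32 weight approximated in three numeric
# formats an NPU may be asked to support, each losing different precision.

def to_fp16(x: float) -> float:
    # IEEE 754 half precision (10-bit mantissa); 'e' is the half format code
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_bf16(x: float) -> float:
    # bfloat16: keep only the high 16 bits of the FP32 encoding
    raw = struct.pack("<f", x)                          # 4 bytes, little-endian
    return struct.unpack("<f", b"\x00\x00" + raw[2:])[0]

def to_int8(x: float, scale: float = 1 / 127) -> float:
    # symmetric linear quantization, then dequantize back to float
    q = max(-128, min(127, round(x / scale)))
    return q * scale

w = 0.1234567
print(to_fp16(w), to_bf16(w), to_int8(w))  # three different approximations of w
```

A fixed-format datapath handles only one of these encodings natively, which is one concrete reason per-model NPU redesign keeps recurring.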

  • Shared on-chip memory architecture and management technique for scale-out NPU architectures
  • On-chip memory management technique for any embedding vector operation
  • Energy- and cost-efficient heterogeneous off-chip memory system for NPUs
  • Processing unit architecture for arbitrary numeric formats
Future Memory Systems

To be added.

  • Comprehensive analysis of CXL-based server memory systems
Software Techniques for GPUs

To be added.

  • Automatic thread allocation technique for multi-tenant inference on embedded GPUs
  • Memory oversubscription-aware tensor migration scheduling technique
  • Accelerating K-means clustering algorithm on embedded GPUs