Articles tagged with "GPUs, Dataflow, DeepLearning"

Efficient Synchronous Dataflow Execution For GPUs (NVIDIA, UW-Madison)

Researchers from NVIDIA and the University of Wisconsin-Madison have published a technical paper titled “Kitsune: Enabling Dataflow Execution on GPUs with Spatial Pipelines.” The paper examines the challenges that the bulk-synchronous execution model of GPUs poses for deep learning applications. To enable dataflow execution on GPUs, the researchers introduce Kitsune, a set of primitives together with an end-to-end compiler based on PyTorch Dynamo. According to the authors, Kitsune delivers significant performance improvements and reduces off-chip traffic for both inference and training across a range of applications.

SemiEngineering

5 months ago

