
Efficient Synchronous Dataflow Execution For GPUs (NVIDIA, UW-Madison)

Source: SemiEngineering

TL;DR (AI generated)

Researchers from NVIDIA and the University of Wisconsin-Madison have published a technical paper titled “Kitsune: Enabling Dataflow Execution on GPUs with Spatial Pipelines.” The paper examines why the GPU's bulk-synchronous execution model is a poor fit for many deep learning workloads: each kernel must finish writing its full output before the next kernel can begin reading it, forcing intermediate results through off-chip memory. The researchers introduce Kitsune, a set of primitives and an end-to-end compiler built on PyTorch Dynamo, to enable dataflow execution on GPUs instead. Kitsune shows significant performance improvements and reduced off-chip traffic for both inference and training across a range of applications.
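The paper's own primitives are not shown here, but the execution-model shift it describes can be sketched in plain Python. This is an illustrative analogy, not Kitsune's API: the bulk-synchronous version materializes the entire intermediate result between stages, while the dataflow version streams elements through a bounded queue so the stages run concurrently and only a small window of intermediates is ever live. All function and variable names are hypothetical.

```python
# Illustrative sketch only (not Kitsune's API): bulk-synchronous vs.
# dataflow execution of a two-stage computation.
import queue
import threading

def stage_a(x):  # first stage, e.g. a pointwise op
    return x * 2

def stage_b(x):  # second stage, consumes stage_a's output
    return x + 1

def bulk_synchronous(batch):
    # Stage A processes the whole batch and materializes every
    # intermediate value before stage B reads any of it — analogous
    # to kernels round-tripping results through off-chip memory.
    intermediate = [stage_a(x) for x in batch]
    return [stage_b(x) for x in intermediate]

def dataflow(batch):
    # Stages communicate element-by-element through a bounded queue,
    # so producer and consumer overlap and only a small window of
    # intermediates exists at any moment — the spatial-pipeline idea.
    q = queue.Queue(maxsize=2)
    out = []

    def producer():
        for x in batch:
            q.put(stage_a(x))
        q.put(None)  # end-of-stream marker

    t = threading.Thread(target=producer)
    t.start()
    while (item := q.get()) is not None:
        out.append(stage_b(item))
    t.join()
    return out

batch = list(range(8))
# Both schedules compute the same result; only the execution order
# and intermediate footprint differ.
assert bulk_synchronous(batch) == dataflow(batch)
```

On a GPU the payoff of the pipelined schedule is that intermediates can stay in on-chip storage instead of spilling to DRAM, which is the source of the off-chip-traffic reduction the paper reports.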
