Articles tagged with "GPUs, Dataflow, DeepLearning"

Efficient Synchronous Dataflow Execution For GPUs (NVIDIA, UW-Madison)

Researchers from NVIDIA and the University of Wisconsin-Madison have published a technical paper titled “Kitsune: Enabling Dataflow Execution on GPUs with Spatial Pipelines.” The paper examines how the bulk-synchronous execution model of GPUs limits deep learning applications. To address this, the researchers introduce Kitsune, a set of primitives and an end-to-end compiler based on PyTorch Dynamo that together enable dataflow execution on GPUs. The paper reports significant performance improvements and reduced off-chip traffic for both inference and training across a range of applications.

SemiEngineering
