cuda

A quick and easy introduction to CUDA programming for GPUs. This post dives into CUDA C++ with a simple, step-by-step parallel programming example.

Parallel thread execution (PTX) is a virtual machine instruction set architecture that has been part of CUDA from its beginning. You can think of PTX as the…

Porting functions that don't exist in common graph manipulation tools to an NPU.

Today I installed Ubuntu 23.10 from scratch on my computer. After installing all the software I need, I tried to install CUDA from the NVIDIA website, following their instructions: https://

CUDA Templates for Linear Algebra Subroutines.

CUDA Graphs can provide a significant performance increase, as the driver is able to optimize execution using the complete description of tasks and dependencies. Graphs provide incredible benefits for…

While there have been efforts by AMD over the years to make it easier to port codebases targeting NVIDIA's CUDA API to run atop HIP/ROCm, it still requires work on the part of developers.

When discussing GenAI, the term "GPU" almost always enters the conversation and the topic often moves toward performance and access. Interestingly, the word "GPU" is assumed to mean "Nvidia" products. (As an aside, the popular Nvidia hardware used in GenAI is not technically...

I've tried multiple ways to get CUDA running but can't get past the driver recognition problem (at least that's what I think the issue is). Surely…

Minimal first-steps instructions to get CUDA running on a standard system.

NVIDIA announces the newest CUDA Toolkit software release, 12.0. This release is the first major release in many years and it focuses on new programming models and CUDA application acceleration…

Follow this series to learn about CUDA programming from scratch with Python. Part 4 of 4.

There seem to be several options to install CUDA on Ubuntu 20.10: It is pre-bundled with 20.10, there are various installers at the official NVIDIA page, etc. Question: What is a recommended way to

The new NVIDIA A100 GPU based on the NVIDIA Ampere GPU architecture delivers the greatest generational leap in accelerated computing. The A100 GPU has revolutionary hardware capabilities and we’re…

Numba is an open-source Python compiler from Anaconda that can compile Python code for high-performance execution on CUDA-capable GPUs or multicore CPUs.