Thursday, July 22, 2010

Nvidia CUDA Toolkit 3.1 up for grabs

Developers tempted to use GPU power to accelerate their applications can download, use and abuse a new version of the CUDA Toolkit, which is available for Windows, Mac OS and Linux and includes extra features and support.

The CUDA Toolkit 3.1 packs the following updates and additions:

- GPUDirect gives third-party devices direct access to CUDA memory

- Support for 16-way concurrency allows up to 16 different kernels to run at the same time on Fermi architecture GPUs

- Runtime / Driver interoperability enables applications to mix and match the CUDA Driver API with the CUDA C Runtime and math libraries via buffer sharing and context migration

- New language features added to CUDA C / C++:

Support for printf() in device code

Support for function pointers and recursion makes it easier to port many existing algorithms to Fermi GPUs

- Unified Visual Profiler now supports both CUDA C/C++ and OpenCL, and adds support for CUDA Driver API tracing

- Math Libraries Performance Improvements, including:

Improved performance of selected transcendental functions from the log, pow, erf, and gamma families

Significant improvements in double-precision FFT performance on Fermi-architecture GPUs for 2^n transform sizes

Streaming API now supported in CUBLAS for overlapping copy and compute operations

CUFFT Real-to-complex (R2C) and complex-to-real (C2R) optimizations for 2^n data sizes

Improved performance for GEMV and SYMV subroutines in CUBLAS

Optimized double-precision implementations of divide and reciprocal routines for the Fermi architecture

- New and updated SDK code samples demonstrating how to use:

Function pointers in CUDA C/C++ kernels

OpenCL / Direct3D buffer sharing

Hidden Markov Model in OpenCL

Microsoft Excel GPGPU example showing how to run an Excel function on the GPU

... and can be found on this page.
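
The new CUDA C / C++ language features listed above (printf() in device code, function pointers and recursion) can be sketched roughly as follows. This is only an illustrative sketch, not one of the SDK samples: the names factorial, square, negate, unary_op, demo_kernel and run_demo are made up for this example, and it assumes a Fermi-class GPU (compute capability 2.0 or higher).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// All names in this file are hypothetical and chosen for illustration only.

// Recursion: the function calls itself, on the host or on the device.
__host__ __device__ int factorial(int n)
{
    return (n <= 1) ? 1 : n * factorial(n - 1);
}

// Two small device functions that share the same signature.
__device__ int square(int x) { return x * x; }
__device__ int negate(int x) { return -x; }

// Pointer type for a device function that maps int to int.
typedef int (*unary_op)(int);

__global__ void demo_kernel(int n, int use_square, int *out)
{
    // Function pointer: choose which device function to call at run time.
    unary_op op = use_square ? square : negate;

    int f = factorial(n);
    out[0] = op(f);

    // printf() in device code: the output is buffered on the GPU and
    // shown on the host once the host synchronizes with the device.
    printf("thread %d: factorial(%d) = %d, op result = %d\n",
           (int)threadIdx.x, n, f, out[0]);
}

// Launches the kernel with a single thread and copies the result back.
// Returns 0 on success and 1 on any CUDA error.
int run_demo(int n, int use_square, int *result)
{
    int *d_out = NULL;
    if (cudaMalloc((void **)&d_out, sizeof(int)) != cudaSuccess)
        return 1;

    demo_kernel<<<1, 1>>>(n, use_square, d_out);

    // cudaDeviceSynchronize() is the current name of this call; toolkits of
    // the 3.x era used cudaThreadSynchronize() for the same purpose.
    cudaError_t err = cudaDeviceSynchronize();
    if (err == cudaSuccess)
        err = cudaMemcpy(result, d_out, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_out);
    return (err == cudaSuccess) ? 0 : 1;
}
```

To try it, add a main() that calls something like run_demo(5, 1, &result) and build with nvcc, targeting compute capability 2.0 or higher (for example -arch=sm_20 with toolkits of that era). The kernel is launched with a single thread so that the printed line and the value written to out stay easy to follow.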
