This new version offers several key enhancements. Through unified memory, applications can access both CPU and GPU memory without needing to copy data between the two. It also lets developers add GPU acceleration to applications written in a range of programming languages.
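To make the unified memory idea concrete, here is a minimal sketch (not taken from the release, and requiring a CUDA-capable GPU to run): a single allocation made with `cudaMallocManaged` is dereferenced by both host and device code, with no explicit `cudaMemcpy` calls.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Simple kernel that increments every element of the array.
__global__ void increment(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main() {
    const int n = 1024;
    int *data = nullptr;

    // One allocation visible to both CPU and GPU -- unified memory
    // removes the need for explicit host/device copies.
    cudaMallocManaged(&data, n * sizeof(int));

    for (int i = 0; i < n; ++i) data[i] = i;       // written by the CPU

    increment<<<(n + 255) / 256, 256>>>(data, n);  // updated by the GPU
    cudaDeviceSynchronize();                       // wait before the CPU reads

    printf("data[0] = %d, data[%d] = %d\n", data[0], n - 1, data[n - 1]);

    cudaFree(data);
    return 0;
}
```

Without unified memory, the same program would need separate host and device buffers plus two `cudaMemcpy` calls around the kernel launch.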
Drop-in libraries substantially accelerate applications’ BLAS and FFTW calculations by replacing CPU-based libraries with GPU-accelerated equivalents, with no source-code changes required.
Additionally, CUDA 6 features redesigned BLAS and FFT GPU libraries that automatically scale performance across up to eight GPUs in a single node, supporting workloads of up to 512 GB and delivering over nine teraFLOPS of performance per node. A new BLAS drop-in library also allows for multi-GPU scaling.
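The multi-GPU scaling is exposed through a host-side cuBLAS interface in which the library tiles the computation and moves data between devices automatically. A minimal sketch using the cublasXt API (device IDs and the matrix size here are illustrative assumptions, and at least two GPUs are presumed present):

```cuda
#include <cstdlib>
#include <cublasXt.h>

int main() {
    const int n = 4096;  // illustrative square-matrix size
    float *A = (float *)malloc((size_t)n * n * sizeof(float));
    float *B = (float *)malloc((size_t)n * n * sizeof(float));
    float *C = (float *)malloc((size_t)n * n * sizeof(float));
    // ... fill A and B with application data ...

    cublasXtHandle_t handle;
    cublasXtCreate(&handle);

    // Spread the GEMM across the listed devices; the library handles
    // tiling and host/device data movement automatically.
    int devices[2] = {0, 1};  // assumes two GPUs in the node
    cublasXtDeviceSelect(handle, 2, devices);

    const float alpha = 1.0f, beta = 0.0f;
    cublasXtSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, n, n, n,
                  &alpha, A, n, B, n, &beta, C, n);

    cublasXtDestroy(handle);
    free(A); free(B); free(C);
    return 0;
}
```

Note that the matrices live in ordinary host memory; this is what makes the drop-in replacement of a CPU BLAS possible.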
“By automatically handling data management, Unified Memory enables us to quickly prototype kernels running on the GPU and reduces code complexity, cutting development time by up to 50 percent,” said Rob Hoekstra, manager, Scalable Algorithms Department, Sandia National Laboratories. “Having this capability will be very useful as we determine future programming model choices and port more sophisticated, larger codes to GPUs.”
CUDA 6 will also include a complete kit of programming tools, GPU-accelerated math libraries, and documentation and programming guides. CUDA 6 is expected to be available in the first quarter of 2014.