CUDA 1.0 (C for NVidia GPUs) now documents the PTX assembly language.

The CUDA Developer SDK provides examples with source code to help you get started with CUDA.
Examples include:
Parallel bitonic sort
Matrix multiplication
Matrix transpose
Performance profiling using timers
Parallel prefix sum (scan) of large arrays
Image convolution
1D DWT using Haar wavelet <-- is this like what Prime95 uses?
OpenGL and Direct3D graphics interoperation examples
CUDA BLAS and FFT library usage examples
CPU-GPU C and C++ code integration
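
For anyone wondering what the "1D DWT using Haar wavelet" example computes: DWT there stands for discrete wavelet transform. As far as I know, the DWT in Prime95 is the FFT-based irrational-base discrete weighted transform, which is a different thing. Below is a minimal CPU-side sketch in plain C of one level of a 1D Haar transform. It is not the SDK's code; the function name `haar_dwt_1d_level`, the output layout, and the orthonormal 1/sqrt(2) scaling are my own assumptions for illustration.

```c
#include <assert.h>

/* One level of a 1D Haar wavelet transform (illustrative sketch, not SDK code).
 * in:  n input samples, n must be even
 * out: n/2 approximation coefficients followed by n/2 detail coefficients
 * Assumption: orthonormal scaling by 1/sqrt(2). */
void haar_dwt_1d_level(const float *in, float *out, int n)
{
    const float inv_sqrt2 = 0.70710678f;
    int half = n / 2;
    int i;

    assert(n > 0 && n % 2 == 0);

    for (i = 0; i < half; ++i) {
        float a = in[2 * i];
        float b = in[2 * i + 1];
        out[i]        = (a + b) * inv_sqrt2; /* scaled pairwise sum (low-pass) */
        out[half + i] = (a - b) * inv_sqrt2; /* scaled pairwise difference (high-pass) */
    }
}
```

Each output pair depends on only two neighbouring inputs, so every iteration of the loop is independent of the others. That independence is what makes this kind of transform a natural fit for one GPU thread per pair.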

NVidia is also releasing a Tesla line of GPU cards,
calculation accelerators without video connections.

The base model C870 has 1.5 GB of memory and a price to match: $1299.
Correction: The G92 GPU will be in the GeForce 9800 cards.
