Saturday, March 15, 2008

How to run or execute a CUDA program

Learn by Example (CUDA Programming Part 1)

Code: Device Query

Website (Download page) :

Windows : http://developer.download.nvidia.com/compute/cuda/sdk/Projects/deviceQuery.zip
Linux : http://developer.download.nvidia.com/compute/cuda/sdk/Projects/deviceQuery.tar.gz

This program was compiled under Linux (Ubuntu 7.10, NVIDIA 8800 GTX).

Program Execution:

Extract the code into a Test directory on the desktop.

Run make from the NVIDIA_CUDA_SDK directory:

root@fluorine8:~/Desktop/Test/NVIDIA_CUDA_SDK$ make

If you hit the "nvcc: command not found" error, add these paths from the command line:

export PATH=$PATH:/usr/local/cuda/bin

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib
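
The two exports above can be wrapped in a small script so that sourcing it more than once does not keep growing the variables. This is only a minimal sketch: the helper name append_once and the CUDA_HOME variable are my own, not part of the SDK, and /usr/local/cuda is assumed to be the install location, as in the exports above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the CUDA SDK): print the colon-separated
# list $1 with directory $2 appended, unless $2 is already in the list.
append_once() {
    list=$1
    dir=$2
    case ":$list:" in
        *":$dir:"*)
            printf '%s\n' "$list" ;;            # already present, unchanged
        *)
            if [ -n "$list" ]; then
                printf '%s\n' "$list:$dir"
            else
                printf '%s\n' "$dir"            # avoid a leading ":" entry
            fi ;;
    esac
}

# Assumed install location, taken from the exports above.
CUDA_HOME=/usr/local/cuda

PATH=$(append_once "$PATH" "$CUDA_HOME/bin")
LD_LIBRARY_PATH=$(append_once "${LD_LIBRARY_PATH:-}" "$CUDA_HOME/lib")
export PATH LD_LIBRARY_PATH

# Report whether the shell can now find nvcc.
if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc found at $(command -v nvcc)"
else
    echo "nvcc still not found; check that CUDA is installed in $CUDA_HOME"
fi
```

The exports only last for the current shell session, so the same lines would need to go into a startup file such as ~/.bashrc to make them permanent.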

Run make once more:


make[1]: Entering directory `/home/class/wasp01/Desktop/Test/NVIDIA_CUDA_SDK/common'
a - obj/release/bank_checker.cpp_o
a - obj/release/cmd_arg_reader.cpp_o
a - obj/release/cutil.cpp_o
a - obj/release/error_checker.cpp_o
a - obj/release/stopwatch.cpp_o
a - obj/release/stopwatch_linux.cpp_o
a - obj/release/cutil_interop.cpp_o
make[1]: Leaving directory `/home/class/wasp01/Desktop/Test/NVIDIA_CUDA_SDK/common'
make -C projects/deviceQuery/
make[1]: Entering directory `/home/class/wasp01/Desktop/Test/NVIDIA_CUDA_SDK/projects/deviceQuery'
make[1]: Leaving directory `/home/class/wasp01/Desktop/Test/NVIDIA_CUDA_SDK/projects/deviceQuery'
Finished building all

This message means the compilation went through properly.
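
Instead of reading the log for "Finished building all", you can also look at the exit status of make, which is non-zero when the build fails. A rough sketch, where run_build is just a name I picked for illustration:

```shell
#!/bin/sh
# Hypothetical wrapper: run the given build command and report the result
# based on its exit status rather than on the text it prints.
run_build() {
    if "$@"; then
        echo "build succeeded"
    else
        status=$?                      # exit status of the failed command
        echo "build failed with exit status $status"
        return "$status"
    fi
}

# Usage, from the NVIDIA_CUDA_SDK directory:
#   run_build make
```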


Now, to run the program, go to the bin/linux/release directory:

root@fluorine8:~/Desktop/Test/NVIDIA_CUDA_SDK$ cd bin/linux/release/

Check that the deviceQuery file is in place:


root@fluorine8:~/Desktop/Test/NVIDIA_CUDA_SDK/bin/linux/release$ ls
deviceQuery
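
The same check can be scripted. The sketch below tests whether a file exists and is executable before you try to run it; check_binary is a made-up helper name, not something the SDK provides.

```shell
#!/bin/sh
# Hypothetical helper: report whether the file given as $1 is ready to run.
check_binary() {
    if [ -x "$1" ]; then
        echo "$1 is ready to run"
    elif [ -e "$1" ]; then
        echo "$1 exists but is not executable"
        return 1
    else
        echo "$1 is missing; re-run make"
        return 1
    fi
}

# Usage, from the bin/linux/release directory:
#   check_binary ./deviceQuery && ./deviceQuery
```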

Run the file with ./deviceQuery:

root@fluorine8:~/Desktop/Test/NVIDIA_CUDA_SDK/bin/linux/release$ ./deviceQuery

RESULT

There is 1 device supporting CUDA

Device 0: "GeForce 8800 GTS 512"
Major revision number: 1
Minor revision number: 1
Total amount of global memory: 536150016 bytes
Total amount of constant memory: 65536 bytes
Total amount of shared memory per block: 16384 bytes
Total number of registers available per block: 8192
Warp size: 32
Maximum number of threads per block: 512
Maximum sizes of each dimension of a block: 512 x 512 x 64
Maximum sizes of each dimension of a grid: 65535 x 65535 x 1
Maximum memory pitch: 262144 bytes
Texture alignment: 256 bytes
Clock rate: 1674000 kilohertz

Test PASSED

Press ENTER to exit...
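
The raw numbers are easier to read after a little arithmetic. The sketch below copies three values from the output above into shell variables (the variable names are mine) and converts them to friendlier units using integer division.

```shell
#!/bin/sh
# Values copied from the deviceQuery output above.
global_mem_bytes=536150016
clock_khz=1674000
max_threads_per_block=512
warp_size=32

# Integer arithmetic, so any fractional part is dropped.
global_mem_mib=$((global_mem_bytes / 1048576))          # bytes -> MiB
clock_mhz=$((clock_khz / 1000))                         # kHz -> MHz
warps_per_block=$((max_threads_per_block / warp_size))  # full warps in a maximum-size block

echo "Global memory: $global_mem_mib MiB"
echo "Clock rate: $clock_mhz MHz"
echo "Warps in a full block: $warps_per_block"
```

For this card that works out to 511 MiB of global memory, a 1674 MHz clock, and 16 warps in a 512-thread block.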

Hurrah! Throw yourself a party. You are done with the first step in the CUDA software compilation and bench testing process.

Feel free to post comments and queries.

Regards,
Mihir

4 comments:

Anonymous said...

excellent..thanks for these instructions

farhad said...

Thanks a lot man! I was confused how to run a CUDA project on linux and your instructions helped :-)

Anonymous said...

If you cannot run ./deviceQuery and get a linking error involving cudart.so.4 or similar, then I suggest modifying your /etc/ld.so.conf by adding
/etc/cuda/lib64
/etc/cuda/lib

Unknown said...

[deviceQuery] starting...

I just downloaded the most recent drivers from the NVIDIA website and I got this error. Please help!

./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 35
-> CUDA driver version is insufficient for CUDA runtime version
[deviceQuery] test results...
FAILED