The host runtime component of the CUDA software environment can be used only by host functions. It provides functions for device management, context management, memory management, code module management, and execution control. It comprises two APIs: a low-level API called the CUDA Driver API, and a higher-level API called the CUDA Runtime API, which is implemented on top of the CUDA Driver API.
Since version 3.1 of the CUDA software, these APIs are interoperable: applications can perform most operations through the more streamlined CUDA Runtime API while calling into the lower-level interfaces of the CUDA Driver API on an as-needed basis.
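As a minimal sketch of this interoperability (assuming a CUDA 3.1 or later toolkit, per the version noted above), the example below allocates a buffer with the Runtime API and then touches it with a Driver API call in the same, implicitly created context. The buffer size, the choice of cuMemsetD8, and the build command in the comment are illustrative assumptions, not details taken from this text.

/* Assumed build command: nvcc interop.cu -lcuda -o interop */
#include <cuda.h>          /* CUDA Driver API (cu* entry points)    */
#include <cuda_runtime.h>  /* CUDA Runtime API (cuda* entry points) */
#include <stdio.h>

int main(void)
{
    /* Runtime API call: implicitly initializes the runtime and attaches to the
       device's primary context, so no explicit cuInit()/cuCtxCreate() appears here. */
    unsigned char *buf = NULL;
    cudaMalloc((void **)&buf, 256);

    /* Driver API call on the same allocation: a Runtime API device pointer is a
       valid CUdeviceptr in the context the runtime set up above. */
    cuMemsetD8((CUdeviceptr)buf, 0xFF, 256);

    /* Back to the Runtime API to read the result. */
    unsigned char host[256];
    cudaMemcpy(host, buf, 256, cudaMemcpyDeviceToHost);
    printf("first byte after the Driver API memset: 0x%02X\n", host[0]);

    cudaFree(buf);
    return 0;
}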
Most commonly, applications will use the CUDA Runtime API, which greatly eases device management by providing implicit initialization, context management, and device code module management.
In contrast, the CUDA Driver API offers more explicit control, which is useful only in certain limited circumstances, but it requires more code and is somewhat harder to program. For example, it is more difficult to configure and launch kernels using the CUDA Driver API, since the execution configuration and kernel parameters must be specified with explicit function calls instead of the execution configuration syntax (<<<…>>>).
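To make this contrast concrete, the sketch below launches the same hypothetical vecAdd kernel both ways. The kernel itself, the vecAdd.ptx module file, and the launch dimensions are assumptions made for this example rather than details from the text, and the Driver API path assumes cuInit() and a current context have already been established (for instance by earlier Runtime API calls).

#include <cuda.h>
#include <cuda_runtime.h>

/* Hypothetical kernel used by both launch paths; extern "C" keeps the name
   unmangled so the Driver API can look it up as "vecAdd". */
extern "C" __global__ void vecAdd(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

/* Runtime API: grid/block sizes and kernel arguments go directly into the
   execution configuration syntax. */
void launchWithRuntime(const float *a, const float *b, float *c, int n)
{
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(a, b, c, n);
}

/* Driver API: the module is loaded and the kernel looked up by name, and every
   launch parameter is passed through explicit function calls. "vecAdd.ptx" is
   an assumed module file (e.g., produced separately with nvcc -ptx). */
void launchWithDriver(CUdeviceptr a, CUdeviceptr b, CUdeviceptr c, int n)
{
    CUmodule   module;
    CUfunction kernel;
    cuModuleLoad(&module, "vecAdd.ptx");
    cuModuleGetFunction(&kernel, module, "vecAdd");

    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    void *args[] = { &a, &b, &c, &n };   /* addresses of each kernel argument */
    cuLaunchKernel(kernel,
                   blocks, 1, 1,         /* grid dimensions                */
                   threads, 1, 1,        /* block dimensions               */
                   0, NULL,              /* shared memory bytes, stream    */
                   args, NULL);          /* kernel parameters, extra options */
}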
The C/C++ host code generated by nvcc utilizes the CUDA Runtime, so applications that link to this code will depend on the CUDA Runtime; similarly, any code that uses the CUBLAS, CUFFT, and other CUDA Toolkit libraries will also depend on the CUDA Runtime, which is used internally by these libraries.
The two APIs can be easily distinguished: the CUDA Driver API is delivered through the nvcuda/libcuda dynamic library and all its entry points are prefixed with cu, whereas the CUDA Runtime is delivered through the cudart dynamic library and all its entry points are prefixed with cuda. Note that the APIs relate only to host code; the kernels that are executed on the device are the same, regardless of which API is used.
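For example (the specific calls chosen here are merely illustrative), allocating a device buffer and copying data into it looks like this in each API; the Driver API version assumes cuInit() has been called and a context is current.

#include <cuda.h>          /* Driver API: cu* prefix, nvcuda.dll / libcuda.so */
#include <cuda_runtime.h>  /* Runtime API: cuda* prefix, cudart               */

void copyWithRuntime(const float *host, size_t bytes)
{
    float *dev = NULL;
    cudaMalloc((void **)&dev, bytes);                      /* cuda* entry point */
    cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice);
    cudaFree(dev);
}

void copyWithDriver(const float *host, size_t bytes)
{
    CUdeviceptr dev = 0;
    cuMemAlloc(&dev, bytes);                               /* cu* entry point */
    cuMemcpyHtoD(dev, host, bytes);
    cuMemFree(dev);
}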
The functions that make up these two APIs are explained in the CUDA Reference Manual.