CUDA 6.5 starter example with timing and Unified Memory access.

I’m on the Chinese New Year holiday, so I finally have some time to get back to CUDA, which I had left behind for a long time. I found that the CUDA programming documentation doesn’t present a complete Unified Memory code example. Here is a blog article about Unified Memory in CUDA 6+, so what I did here is simply combine this developer-friendly feature with a working method for timing both the GPU kernel’s run time and the run time of the same function on the CPU.
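To make the idea concrete, here is a minimal sketch of that combination: Unified Memory allocated with cudaMallocManaged, the kernel timed with CUDA events, and the CPU version timed with clock(). The vector-add workload and the names addKernel, addCpu, and the array size are assumptions chosen only for illustration, and error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <cmath>
#include <ctime>
#include <cuda_runtime.h>

// GPU version: each thread adds one pair of elements.
__global__ void addKernel(const float *a, const float *b, float *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

// CPU version of the same function, used for the timing comparison.
void addCpu(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; ++i) c[i] = a[i] + b[i];
}

int main()
{
    const int n = 1 << 20;               // illustrative size, not from the post
    const size_t bytes = n * sizeof(float);
    float *a, *b, *cGpu, *cCpu;

    // Unified Memory: the same pointer is usable on host and device,
    // so no explicit cudaMemcpy calls are needed.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&cGpu, bytes);
    cudaMallocManaged(&cCpu, bytes);

    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Time the kernel with CUDA events.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaEventRecord(start);
    addKernel<<<blocks, threads>>>(a, b, cGpu, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float gpuMs = 0.0f;
    cudaEventElapsedTime(&gpuMs, start, stop);

    // Make sure the GPU is idle before the host touches managed memory.
    cudaDeviceSynchronize();

    // Time the CPU version with clock().
    clock_t t0 = clock();
    addCpu(a, b, cCpu, n);
    double cpuMs = 1000.0 * (double)(clock() - t0) / CLOCKS_PER_SEC;

    // Check that both versions agree.
    float maxErr = 0.0f;
    for (int i = 0; i < n; ++i)
        maxErr = fmaxf(maxErr, fabsf(cGpu[i] - cCpu[i]));

    printf("GPU kernel: %.3f ms\n", gpuMs);
    printf("CPU loop:   %.3f ms\n", cpuMs);
    printf("Max error:  %g\n", maxErr);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(a);
    cudaFree(b);
    cudaFree(cGpu);
    cudaFree(cCpu);
    return 0;
}
```

This should build with something like `nvcc example.cu -o example` (the file name is hypothetical). Keep in mind that, depending on the GPU and driver, the event-measured time may also include the cost of migrating managed data to the device, so the two numbers are only a rough comparison.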