This project implements K-fold cross-validation in parallel using CUDA C. CUDA kernels written in C run the cross-validation in parallel for linear and logistic regression. We achieved a 3x speedup with only 1000 data points, and the same approach is useful for other machine learning algorithms.
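
To make the idea concrete, here is a minimal sketch of how folds can be mapped onto GPU threads. It is not the kernel used in this project: it assumes a single input feature, contiguous folds, and one thread per fold, and it fits the line with closed-form least squares. The names `kfold_linreg_mse` and `run_kfold` are hypothetical and chosen only for illustration.

```cuda
#include <cuda_runtime.h>
#include <stddef.h>

// Illustrative kernel (assumption: one thread handles one fold).
// Each thread fits y = a*x + b on the training points by closed-form
// least squares and stores the mean squared error on its held-out fold.
__global__ void kfold_linreg_mse(const float *x, const float *y,
                                 int n, int k, float *mse)
{
    int fold = blockIdx.x * blockDim.x + threadIdx.x;
    if (fold >= k) return;

    // Contiguous fold boundaries [lo, hi); integer division spreads
    // any remainder across the folds.
    int lo = (int)((long long)fold * n / k);
    int hi = (int)((long long)(fold + 1) * n / k);

    // Accumulate sums over the training points (everything outside [lo, hi)).
    double sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    int m = 0;
    for (int i = 0; i < n; ++i) {
        if (i >= lo && i < hi) continue;
        sx += x[i];
        sy += y[i];
        sxx += (double)x[i] * x[i];
        sxy += (double)x[i] * y[i];
        ++m;
    }

    // Slope and intercept; fall back to 0 if the system is degenerate.
    double denom = m * sxx - sx * sx;
    double a = (denom != 0.0) ? (m * sxy - sx * sy) / denom : 0.0;
    double b = (m > 0) ? (sy - a * sx) / m : 0.0;

    // Mean squared error on the held-out points.
    double err = 0.0;
    for (int i = lo; i < hi; ++i) {
        double r = y[i] - (a * x[i] + b);
        err += r * r;
    }
    mse[fold] = (hi > lo) ? (float)(err / (hi - lo)) : 0.0f;
}

// Hypothetical host helper: copies the data to the GPU, launches one
// thread per fold, and copies the k per-fold errors back into mse_out.
// Returns 0 on success and -1 on invalid arguments or a CUDA error.
int run_kfold(const float *x, const float *y, int n, int k, float *mse_out)
{
    if (n <= 0 || k <= 1 || k > n) return -1;

    float *dx = NULL, *dy = NULL, *dmse = NULL;
    size_t bytes = (size_t)n * sizeof(float);
    size_t out_bytes = (size_t)k * sizeof(float);

    cudaError_t err = cudaMalloc((void **)&dx, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&dy, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void **)&dmse, out_bytes);
    if (err == cudaSuccess) err = cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dy, y, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        int threads = 128;
        int blocks = (k + threads - 1) / threads;
        kfold_linreg_mse<<<blocks, threads>>>(dx, dy, n, k, dmse);
        err = cudaGetLastError();
    }
    // This blocking copy also waits for the kernel to finish.
    if (err == cudaSuccess) err = cudaMemcpy(mse_out, dmse, out_bytes, cudaMemcpyDeviceToHost);

    cudaFree(dx);
    cudaFree(dy);
    cudaFree(dmse);
    return (err == cudaSuccess) ? 0 : -1;
}
```

Calling `run_kfold(x, y, n, k, mse)` from host code fills `mse` with one error value per fold, and averaging those values gives the cross-validation score. With one thread per fold, the parallelism is limited to `k` threads, so a version tuned for speed would likely also parallelize the sums inside each fold. That refinement is left out here to keep the sketch short.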