v2.8.0
This cherry-picked commit fixes a bug that prevented LSTM and GRU from being computed via cuDNN when the input is a RaggedTensor. With the fix, the cuDNN path is used again, resulting in a large speedup (easily 10 times). It also adds a TF2-only test that compares LSTM and GRU on ragged and masked tensors, on both CPU and GPU.
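For readers unfamiliar with the ragged-versus-masked comparison, here is a minimal pure-Python sketch of the idea. It is not the test added by this commit and does not use TensorFlow or cuDNN: `step` is a toy scalar recurrence standing in for an LSTM/GRU cell, and `run_ragged`, `pad_with_mask`, and `run_masked` are hypothetical helper names chosen for illustration. It only shows why a ragged batch and its padded, masked equivalent should yield the same final states.

```python
import math


def step(state, x, w=0.5, u=0.8):
    """Toy scalar recurrence standing in for an LSTM/GRU cell (illustrative only)."""
    return math.tanh(w * x + u * state)


def run_ragged(sequences):
    """Process each variable-length sequence over its own timesteps only."""
    finals = []
    for seq in sequences:
        state = 0.0
        for x in seq:
            state = step(state, x)
        finals.append(state)
    return finals


def pad_with_mask(sequences, pad_value=0.0):
    """Convert ragged sequences into a dense padded batch plus a boolean mask."""
    max_len = max((len(s) for s in sequences), default=0)
    padded = [list(s) + [pad_value] * (max_len - len(s)) for s in sequences]
    mask = [[True] * len(s) + [False] * (max_len - len(s)) for s in sequences]
    return padded, mask


def run_masked(padded, mask):
    """Process the padded batch, carrying the state unchanged on masked steps."""
    finals = []
    for row, row_mask in zip(padded, mask):
        state = 0.0
        for x, keep in zip(row, row_mask):
            if keep:
                state = step(state, x)
        finals.append(state)
    return finals


# Three sequences of different lengths, i.e. a ragged batch.
ragged = [[1.0, 2.0, 3.0], [4.0], [5.0, 6.0]]
padded, mask = pad_with_mask(ragged)

ragged_out = run_ragged(ragged)
masked_out = run_masked(padded, mask)
```

Both paths apply the recurrence to the same real timesteps in the same order, so the final states should match; a test in this spirit checks that agreement for each layer type and device.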