ADAM optimizer, perplexity, and a bunch of CUDA kernels

Opcodes

Novigrad's opcodes are currently these:

```rust
pub enum OpCode {
    GemmNTN,
    GemmNNN,
    GemmTNN,
    GemmTNT,
    ReduceSum,
    Add,
    ScalarAdd,
    ScalarMul,
    Clip,
    Normalize,
    Mul,
    Div,
    Sqrt,
    Softmax,
    Sub,
    Reshape(Vec<usize>),
    Sigmoid,
    CrossEntropyLoss,
    ResidualSumOfSquares,
    Dropout(f32),
    Concat,
    Unconcat,
}
```

CUDA

I read Optimizing Parallel Reduction in CUDA by Mark Harris of NVIDIA. I am not a CUDA ninja, but I wrote a bunch of kernels. The reduction stuff is nice.

Adam

I implemented the ADAM optimizer.