The heatmap on the right in the image shows the error, which gets progressively worse as the numbers get larger. Notably, the error is also not symmetric in the operands, which suggests the model hasn’t learned that addition is commutative. Even after 2^128 or so training examples (it seems the training set is every pair of unsigned 64-bit integers), it couldn’t figure out that a+b = b+a.
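
For anyone who wants to check the symmetry claim numerically rather than by eyeballing a heatmap, here is a rough sketch. Everything in it is made up for illustration: `toy_model` is a hypothetical stand-in (not the actual model), and `error_grid` and `max_asymmetry` are helper names I invented. The idea is just that if the model respected a+b = b+a, the error grid would equal its own transpose.

```python
def error_grid(predict, values):
    """Absolute error of predict(a, b) against the true sum, for every pair."""
    return [[abs(predict(a, b) - (a + b)) for b in values] for a in values]


def max_asymmetry(grid):
    """Largest |E[i][j] - E[j][i]|; 0 means the error grid is symmetric."""
    n = len(grid)
    return max(abs(grid[i][j] - grid[j][i]) for i in range(n) for j in range(n))


def toy_model(a, b):
    # Hypothetical stand-in for the real model: its error depends only on
    # the first operand, so swapping a and b changes the error.
    return a + b + a // 8


values = [0, 8, 64, 512]
grid = error_grid(toy_model, values)
asym = max_asymmetry(grid)

# Size of the training set if it really is every pair of unsigned 64-bit ints.
pairs = (2 ** 64) ** 2
```

With this toy model, `grid[3][0]` and `grid[0][3]` differ, so `asym` is nonzero, while an exact adder like `lambda a, b: a + b` gives an all-zero grid. The same check could in principle be run against the real model's predictions on a sample of operand pairs.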