Article 2 Nyströmformer: Approximating self-attention in linear time and memory via the Nyström method