https://www.borealisai.com/research-blogs/tutorial-17-transformers-iii-training/
https://www.lesswrong.com/posts/b3CQrAo2nufqzwNHF/how-to-train-your-transformer
https://thegradient.pub/nlp-imagenet/ -> predictions: to anticipate how NLP will develop, we can try to gauge the effect ImageNet had on computer vision