Anybody can write a few lines of TensorFlow or PyTorch code that define a neural network for Natural Language Understanding (NLU) tasks. Scientific papers reflecting modern research seem clear enough to implement in code (e.g. Python or Java/DL4J), at least the few I reviewed this year and before (e.g. those on GPT-3 and BERT).
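To illustrate that first claim, here is a minimal sketch (assuming PyTorch) of such a few-lines network for a toy NLU classification task; the vocabulary size, dimensions and the task itself are invented for the example:

```python
# A toy NLU classifier: embed tokens, encode the sequence, classify it.
# All sizes and the 4-class task are made up purely for illustration.
import torch
import torch.nn as nn

class TinyNLUClassifier(nn.Module):
    def __init__(self, vocab_size=30_000, embed_dim=128, num_classes=4):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, 256, batch_first=True)
        self.classifier = nn.Linear(256, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)       # (batch, seq, embed_dim)
        _, (hidden, _) = self.encoder(embedded)    # hidden: (1, batch, 256)
        return self.classifier(hidden[-1])         # (batch, num_classes)

model = TinyNLUClassifier()
logits = model(torch.randint(0, 30_000, (2, 16)))  # two dummy token sequences
```

Defining the network really is the easy part; everything that follows is about what it takes to make such a model competitive.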
However, unless you have a breakthrough neural network architecture to roll out to production, whose pitch won you some capital, i.e. at least a few dozen thousand USD to spare on weeks of TPU/GPU cluster time in AWS/GCP/Azure, you won't be able to train a modern, competitive language model for NLU tasks from scratch.
The biggest challenges:
The NLU tasks in scope are question answering, text summarization, text generation and translation.
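For orientation, each of these tasks can be exercised with a ready-made pipeline; the sketch below assumes the Hugging Face transformers library (my choice for illustration, not something prescribed here), and each pipeline downloads a default checkpoint on first use:

```python
# Illustrative only: the four NLU tasks in scope, via default pipelines.
from transformers import pipeline

qa = pipeline("question-answering")
print(qa(question="Where can GPT-J be fine-tuned?",
         context="GPT-J can be fine-tuned on TPUs or on large GPUs."))

summarizer = pipeline("summarization")
print(summarizer("A long article about cloud landing zones goes here ...",
                 max_length=40))

generator = pipeline("text-generation")
print(generator("Cloud architecture is", max_length=30))

translator = pipeline("translation_en_to_de")
print(translator("Transfer learning saves compute costs."))
```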
As you can judge from text completions produced by GPT-J, it is not particularly well versed in the peculiarities of cloud architectures, for example.
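You can reproduce such completions yourself; the sketch below assumes the public "EleutherAI/gpt-j-6B" checkpoint and a GPU with roughly 16 GB of VRAM (the model is loaded in half precision to fit), and the prompt is just an example of mine:

```python
# Prompt GPT-J for a completion on a cloud-architecture topic.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B", torch_dtype=torch.float16
).to("cuda")

prompt = "In a multi-region cloud architecture, the load balancer"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=40,
                            do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```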
That's where Transfer Learning comes in handy. It allows you to re-use a base language model whose pre-training alone cost a few dozen thousand USD in compute. Transfer Learning lets you feed the pre-trained model the word associations captured in an augmenting, domain-specific text dataset.
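In code, this boils down to continuing training from the pre-trained weights on your own text. The sketch below uses "distilgpt2" as a small stand-in for GPT-J (fine-tuning GPT-J itself needs far more memory), and the two domain sentences are a hypothetical dataset of mine:

```python
# Minimal transfer-learning loop: keep pre-trained weights, train further
# on domain text. distilgpt2 stands in for GPT-J to keep the example small.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

domain_sentences = [
    "A landing zone groups cloud accounts under shared guardrails.",
    "Multi-AZ deployments trade cost for availability.",
]

model.train()
for epoch in range(3):
    for text in domain_sentences:
        batch = tokenizer(text, return_tensors="pt")
        # For causal LM fine-tuning the labels are the input ids themselves.
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```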
Running GPT-J on commodity hardware poses a few challenges by itself:
You might get access to Google's TPU Research Cloud, which would dramatically speed up transfer learning experiments with GPT-J. My application to TRC has not borne fruit yet, though.
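Without TPUs, the main commodity-hardware pain point is simply fitting the ~6B parameters into memory. One mitigation (a sketch under my own assumptions, not a recipe): load the weights in half precision, roughly 12 GB instead of 24 GB, and let the accelerate package spread layers across GPU, CPU RAM and disk:

```python
# Memory-conscious loading of GPT-J on modest hardware.
# Requires the accelerate package for device_map / offloading.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/gpt-j-6B",
    torch_dtype=torch.float16,   # halves the footprint vs fp32
    device_map="auto",           # place layers on GPU / CPU as space allows
    offload_folder="offload",    # spill leftover layers to disk if needed
)
```

Inference this way is slow, but it at least lets you experiment before any TPU quota comes through.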