Looking into the math and the data reveals that transformers are both overused and underused.
Transformers are best known for their applications in natural language processing. They were originally designed for translating between languages,[1] and are now most famous for their use in large language models like ChatGPT (generative pretrained transformer).
But since their introduction, transformers have been applied to ever more tasks, with great results. These include image recognition,[2] reinforcement learning,[3] and even weather prediction.[4]
Even the seemingly specific task of language generation with transformers holds plenty of surprises, as we've already seen. Large language models have emergent properties that feel more intelligent than just predicting the next word. For example, they may know various facts about the world, or mimic the nuances of a person's style of speech.
The success of transformers has led some people to ask whether transformers can do everything. If transformers generalize to so many tasks, is there any reason not to use a transformer?
Clearly, there is still a case for other machine learning models and, as is often forgotten these days, non-machine-learning models and human thought. But transformers do have a number of unique properties, and have shown incredible results so far. There is also a considerable mathematical and empirical foundation…