Transformer (architecture)

Neural sequence model built mainly on attention: parallelizable and foundational for LLMs since "Attention Is All You Need" (2017).

The Transformer architecture replaced many recurrent motifs with self-attention: each token attends to others to build contextual vectors. Encoder-only, decoder-only, encoder–decoder hybrids exist (BERT, GPT-style causally etc.).

Nearly all contemporary large language models (LLMs) use transformers at scale: pretrained on token corpora, then optionally fine-tuned for chat or tools.