Explain encoder-only, decoder-only, and encoder-decoder transformer architectures and typical NLP tasks for which each is used.

Multiple Choice

Explain encoder-only, decoder-only, and encoder-decoder transformer architectures and typical NLP tasks for which each is used.

Explanation:

The key idea is matching the architecture to the kind of processing a task needs. An encoder-only model reads the whole input with bidirectional self-attention and builds rich contextual representations, which makes it ideal for comprehension and embedding tasks, such as sentence classification, question answering, or token labeling, where what matters is the meaning encoded in the input. A decoder-only model generates text autoregressively, using causal (left-to-right) attention to predict each token from the tokens before it, so it shines at text generation, continuation, and conversational responses. An encoder-decoder model combines both: the encoder turns the input into a contextual representation, and the decoder attends to that representation through cross-attention while producing an output sequence, making it well suited for translation, summarization, and other text-to-text transformations.
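To make the distinction concrete, here is a minimal sketch using the Hugging Face transformers library (assumed to be installed along with PyTorch). The checkpoints bert-base-uncased, gpt2, and t5-small are simply common examples of each family, not the only options.

```python
# Minimal sketch: one model from each transformer family (illustrative checkpoints).
import torch
from transformers import (
    AutoTokenizer,
    AutoModel,              # encoder-only backbone (e.g. BERT)
    AutoModelForCausalLM,   # decoder-only language model (e.g. GPT-2)
    AutoModelForSeq2SeqLM,  # encoder-decoder model (e.g. T5)
)

text = "Transformers come in three main architectural flavors."

# 1. Encoder-only (BERT): read the input and produce contextual embeddings.
enc_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
with torch.no_grad():
    enc_out = encoder(**enc_tok(text, return_tensors="pt"))
sentence_embedding = enc_out.last_hidden_state.mean(dim=1)  # one vector for the input

# 2. Decoder-only (GPT-2): continue the text autoregressively, token by token.
dec_tok = AutoTokenizer.from_pretrained("gpt2")
decoder = AutoModelForCausalLM.from_pretrained("gpt2")
continuation = decoder.generate(**dec_tok(text, return_tensors="pt"), max_new_tokens=20)
print(dec_tok.decode(continuation[0], skip_special_tokens=True))

# 3. Encoder-decoder (T5): encode the input, then decode a new output sequence.
s2s_tok = AutoTokenizer.from_pretrained("t5-small")
seq2seq = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
summary = seq2seq.generate(
    **s2s_tok("summarize: " + text, return_tensors="pt"), max_new_tokens=20
)
print(s2s_tok.decode(summary[0], skip_special_tokens=True))
```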

This aligns with the statement that encoder-only models (like BERT) handle comprehension and embeddings, decoder-only models (like GPT) handle text generation, and encoder-decoder models (like T5) handle translation and summarization.
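Thinking task-first, the same mapping can be sketched with the transformers pipeline API. Again, the model names are illustrative, and the chosen tasks (fill-mask, text-generation, summarization) are just typical examples for each architecture family rather than an exhaustive list.

```python
# Typical task per family, expressed as Hugging Face pipelines (illustrative checkpoints).
from transformers import pipeline

# Encoder-only (BERT): comprehension, e.g. predicting a masked word.
fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("Paris is the [MASK] of France.")[0]["token_str"])

# Decoder-only (GPT-2): open-ended text generation.
generate = pipeline("text-generation", model="gpt2")
print(generate("Once upon a time", max_new_tokens=15)[0]["generated_text"])

# Encoder-decoder (T5): text-to-text transformation, e.g. summarization.
summarize = pipeline("summarization", model="t5-small")
article = (
    "Transformers now dominate NLP. Encoder-only models focus on understanding, "
    "decoder-only models focus on generation, and encoder-decoder models map an "
    "input sequence to a new output sequence such as a translation or a summary."
)
print(summarize(article)[0]["summary_text"])
```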

Other options misstate these roles: translation isn’t typically done with an encoder alone because generating the target text requires a decoding step; embeddings aren’t the primary purpose of an end-to-end encoder-decoder setup; and saying all three are used equally for all NLP tasks ignores the strengths and typical use cases of each architecture.
