  1. [1706.03762] Attention Is All You Need - arXiv.org

    Jun 12, 2017 · View a PDF of the paper titled Attention Is All You Need, by Ashish Vaswani and 7 other authors

  2. Attention Is All You Need - arXiv.org

    Aug 2, 2023 · In this work, we presented the Transformer, the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder …

  3. [1706.03762] Attention Is All You Need - ar5iv

    Mar 3, 2024 · In this work, we presented the Transformer, the first sequence transduction model based entirely on attention, replacing the recurrent layers most commonly used in encoder …

  4. [2501.06425] Tensor Product Attention Is All You Need - arXiv.org

    Jan 11, 2025 · In this paper, we propose Tensor Product Attention (TPA), a novel attention mechanism that uses tensor decompositions to represent queries, keys, and values …

  5. arXiv.org e-Print archive

    This paper introduces the Transformer model, a novel architecture for natural language processing tasks based on self-attention mechanisms.

  6. [2501.05730] Element-wise Attention Is All You Need - arXiv.org

    Jan 10, 2025 · In contrast to these approaches, we propose a novel element-wise attention mechanism, which uses the element-wise squared Euclidean distance, instead of the dot …

  7. [1902.10186] Attention is not Explanation - arXiv.org

    Feb 26, 2019 · In this work, we perform extensive experiments across a variety of NLP tasks that aim to assess the degree to which attention weights provide meaningful 'explanations' for …

  8. [2412.01818] Beyond Text-Visual Attention: Exploiting Visual Cues …

    Dec 2, 2024 · Most existing works use attention scores between text and visual tokens to assess the importance of visual tokens. However, in this study, we first analyze the text-visual …
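
Several of the results above point to the original Transformer paper (arXiv:1706.03762), whose core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. For orientation, here is a minimal NumPy sketch of that formula; the single-head, unbatched, unmasked layout is an illustrative simplification, not the paper's full multi-head design.

```python
# Minimal sketch of scaled dot-product attention from arXiv:1706.03762:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
# Single head, no batching, no masking: illustrative simplifications.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v) -> (n_q, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (n_q, n_k) similarity logits
    scores -= scores.max(axis=-1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                            # attention-weighted sum of values

# Tiny usage example with random inputs
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 queries, d_k = 4
K = rng.normal(size=(3, 4))   # 3 keys
V = rng.normal(size=(3, 5))   # 3 values, d_v = 5
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 5)
```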