Transformer (deep learning architecture)#Efficient implementation wikipedia

About 102,000 results

Wikipedia
https://en.wikipedia.org/wiki/Transformer_(deep...
Transformer (deep learning architecture) - Wikipedia
A transformer is a deep learning architecture developed by researchers at Google and based on the multi-head attention mechanism, proposed in a 2017 paper "Attention Is All You Need". Text is converted to numerical representations called tokens, and each token is converted into a vector via looking up from …
History
Predecessors
For many years, sequence modelling and generation was done by using plain recurrent neural networks (RNNs). A well-cited early example was the …
Training
Methods for stabilizing training
The plain transformer architecture had difficulty converging. In the original paper the authors recommended using learning rate warmup. That is, the learning rate should linearly scale up from 0 to maximal value for the first part of …
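A minimal sketch of the linear warmup described in the snippet above. The function name `warmup_lr` and the parameters `max_lr` and `warmup_steps` are hypothetical, chosen for illustration. The snippet is truncated before it says what happens once warmup ends, so holding the rate at its maximum afterwards is an assumption here, not the schedule from the paper.

```python
def warmup_lr(step: int, max_lr: float, warmup_steps: int) -> float:
    """Scale the learning rate linearly from 0 to max_lr over warmup_steps.

    Assumption: after warmup the rate is simply held at max_lr; the source
    snippet does not say what follows the warmup phase.
    """
    if warmup_steps <= 0:
        # No warmup requested: use the maximal rate from the start.
        return max_lr
    # Fraction of warmup completed, capped at 1.0 once warmup is over.
    progress = min(1.0, step / warmup_steps)
    return max_lr * progress


if __name__ == "__main__":
    for step in (0, 25, 50, 100, 200):
        print(step, warmup_lr(step, max_lr=1e-3, warmup_steps=100))
```

In a training loop, a value like this would typically be computed once per optimizer step and written into the optimizer's learning rate.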
Subsequent work
Alternative activation functions
The original transformer uses the ReLU activation function. Other activation functions were developed. …
Applications
The transformer has had great success in natural language processing (NLP). Many large language models such as GPT-2, GPT-3, …
Architecture
All transformers have the same primary components:
• Tokenizers, which convert text into tokens.
Full transformer architecture
Sublayers
Each encoder layer contains 2 sublayers: the self-attention and the feedforward network. Each decoder layer contains 3 sublayers: the causally masked self-attention, the cross-attention, and the feedforward network.
See also
• seq2seq – Family of machine learning approaches
• Perceiver – Variant of Transformer designed for multimodal data
• Vision transformer – Variant of Transformer designed for vision processing
Wikipedia text under the CC-BY-SA license
arXiv.org
https://arxiv.org/abs/2009.06732
[2009.06732] Efficient Transformers: A Survey - arXiv.org
Sep 14, 2020 · Transformer model architectures have garnered immense interest lately due to their effectiveness across a range of domains like language, vision and …
- Authors: Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler
- Publish Year: 2020
arXiv.org
https://arxiv.org/abs/2001.04451
[2001.04451] Reformer: The Efficient Transformer - arXiv.org
Jan 13, 2020 · We introduce two techniques to improve the efficiency of Transformers. For one, we replace dot-product attention by one that uses locality-sensitive hashing, …
- Authors: Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya
- Publish Year: 2020
Transformer (deep learning architecture)#Efficient implementation …

bing.com/videos
Transformers Neural Networks Explained | NLP with Deep Learning | Deep Learning Course | Edureka
13:01 · 21K views · May 21, 2021
YouTube · edureka!
Lecture 21 - Transformer Implementation
1:01:13 · 28K views · Dec 2, 2022
YouTube · Deep Learning Systems Course
L19.5.1 The Transformer Architecture
22:36 · 18K views · May 14, 2021
YouTube · Sebastian Raschka
Live -Transformers Indepth Architecture Understanding- Attention Is All You Need
1:19:24 · 218K views · Sep 3, 2020
YouTube · Krish Naik
arXiv.org
https://arxiv.org/abs/2210.01069
Dual-former: Hybrid Self-attention Transformer for Efficient Image ...
Oct 3, 2022 · Recently, image restoration transformers have achieved comparable performance with previous state-of-the-art CNNs. However, how to efficiently leverage …
Tags:
Deep Learning
Convolutional Neural Networks
Machine Learning Mastery
https://machinelearningmastery.com/the-transformer-model
The Transformer Model - MachineLearningMastery.com
Tutorial Overview
Prerequisites
The Transformer Architecture
Sum Up: The Transformer Model
Comparison to Recurrent and Convolutional Layers
Summary
In this tutorial, you discovered the network architecture of the Transformer model. Specifically, you learned:
1. How the Transformer architecture implements an encoder-decoder structure without recurrence and convolutions
2. How the Transformer encoder and decoder work
3. How the Transformer self-attention compares to recurrent and convolutional l...
DataCamp
https://www.datacamp.com/tutorial/how-…
How Transformers Work: A Detailed Exploration of …
Jan 9, 2024 · A transformer is a type of artificial intelligence model that learns to understand and generate human-like text by analyzing patterns in large amounts of text data. Transformers are a current state-of-the-art …
Tags:
The Transformer Model
The Transformer Architecture
MDPI
https://www.mdpi.com/2076-3417/14/10/4316
A Historical Survey of Advances in Transformer Architectures
Mar 19, 2024 · Due to its prowess in sequence modeling and machine translation, the transformer architecture was initially widely implemented and indeed emerged as the …
Tags:
Deep learning
The Transformer Model
The Transformer Architecture
Turing
https://www.turing.com/kb/brief-introduction …
The Ultimate Guide to Transformer Deep Learning
A Transformer is a deep learning model that adopts the self-attention mechanism. This model also analyzes the input data by weighting each component differently. It is used primarily in artificial intelligence (AI) and …
Tags:
Deep learning
The Transformer Model
Nature
https://www.nature.com/articles/s41746-024-01235-0
Zero shot health trajectory prediction using transformer
1 day ago · ETHOS is a novel application of the transformer deep-learning architecture, originally conceptualized for natural language processing 3. This architecture, a …
Tags:
The Transformer Model
The Transformer Architecture
Towards Data Science
https://towardsdatascience.com/a-deep-div…
A Deep Dive Into the Transformer Architecture — The …
Jul 21, 2020 · The introduction of the vanilla Transformer in 2017 disrupted sequence-based deep learning significantly. By doing away with recurrent connections entirely, transformer architectures are better …
Tags:
Deep learning
The Transformer Architecture
Machine Learning
People also search for
efficient transformers architecture
efficient transformers paper
efficient transformers
transformer wikipedia
efficient transformers research paper
reformer transformer model
Transformer (deep learning architecture)#Efficient implementa…