How do transformers work? Learn it by hand 👇

𝗪𝗮𝗹𝗸𝘁𝗵𝗿𝗼𝘂𝗴𝗵

[1] Given
↳ Input features from the previous block (5 positions)
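
To make the shapes concrete, here is a minimal NumPy setup that the sketches below assume: 5 positions, each a d=3 feature vector. The values are random placeholders, not the numbers from the visual.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3                               # feature dimension per position in this walkthrough
X = rng.standard_normal((5, d))     # input features from the previous block: 5 positions
```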

[2] Attention
↳ Feed all 5 features to a query-key attention module (QK) to obtain an attention weight matrix (A). I will skip the details of this module here and unpack it in a follow-up post.
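
Since the module's internals are deferred, here is just one common way such a module could produce A, scaled dot-product attention; the projections W_q and W_k are hypothetical stand-ins, not taken from the visual.

```python
W_q = rng.standard_normal((d, d))   # hypothetical query projection
W_k = rng.standard_normal((d, d))   # hypothetical key projection

Q, K = X @ W_q, X @ W_k             # queries and keys, both (5, 3)
scores = Q @ K.T / np.sqrt(d)       # (5, 5): similarity of every pair of positions

# row-wise softmax so each position's weights sum to 1
A = np.exp(scores - scores.max(axis=-1, keepdims=True))
A /= A.sum(axis=-1, keepdims=True)  # attention weight matrix A, shape (5, 5)
```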

[3] Attention Weighting
↳ Multiply the input features by the attention weight matrix to obtain attention-weighted features (Z). Note that there are still 5 positions.
↳ The effect is to combine features across positions (horizontally); in this case, X1 := X1 + X2, X2 := X2 + X3, etc.
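
Continuing the sketch above, attention weighting is a single matrix product:

```python
Z = A @ X   # (5, 5) @ (5, 3) -> (5, 3): still 5 positions, d is unchanged

# Each output row is a weighted mix of input rows (positions). For example,
# if row 0 of A were [1, 1, 0, 0, 0], then Z[0] = X[0] + X[1], i.e. X1 := X1 + X2.
```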

[4] FFN: First Layer
↳ Feed all 5 attention weighted features into the first layer.
↳ Multiply these features by the weight matrix and add the biases.
↳ The effect is to combine features across feature dimensions (vertically).
↳ The dimensionality of each feature is increased from 3 to 4.
↳ Note that each position is processed by the same weight matrix. This is what the term "position-wise" refers to.
↳ Note that the FFN is essentially a multi-layer perceptron (a sketch of this layer follows).
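
A minimal sketch of the first FFN layer under the walkthrough's dimensions; the hidden width of 4 is the illustrative choice here (real transformers typically expand much more, e.g. 4x the model dimension):

```python
d_hidden = 4
W1 = rng.standard_normal((d, d_hidden))  # one weight matrix shared by all 5 positions
b1 = rng.standard_normal(d_hidden)

H = Z @ W1 + b1                          # (5, 3) @ (3, 4) -> (5, 4): d goes from 3 to 4
```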

[5] ReLU
↳ Negative values are set to zero by ReLU.
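
In code, ReLU is an element-wise clamp at zero:

```python
H = np.maximum(0, H)   # zero out negative entries; shape stays (5, 4)
```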

[6] FFN: Second Layer
↳ Feed all 5 features (now d=4) into the second layer.
↳ The dimensionality of each feature is decreased from 4 back to 3.
↳ The output is fed to the next block to repeat this process.
↳ Note that the next block would have a completely separate set of parameters.
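
The second layer projects back down, which completes the block; putting steps [4] to [6] together gives the whole position-wise FFN (same assumed shapes as above):

```python
W2 = rng.standard_normal((d_hidden, d))  # (4, 3), also shared across positions
b2 = rng.standard_normal(d)

out = H @ W2 + b2                        # (5, 4) @ (4, 3) -> (5, 3): back to d=3

# Steps [4] to [6] in one line:
# out = np.maximum(0, Z @ W1 + b1) @ W2 + b2
```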

#ai #transformers #genai #learning

💯 BEST DATA SCIENCE CHANNELS ON TELEGRAM 🌟
