
A Dive Into Multihead Attention, Self-Attention and Cross-Attention

Posted by admin
In this video, I will first give a recap of Scaled Dot-Product Attention, and then dive into Multihead Attention. After that, we will see two different ways of using the attention mechanism: Self-Attention and Cross-Attention.

Solution of the exercise: We have X: T1×d and Y: T2×d. We build Q from Y, so Q: T2×d. We build K and V from X, therefore K: T1×d and V: T1×d. Then the compatibility matrix QK^T has shape T2×T1, and the final output is Z = Softmax(QK^T / sqrt(d)) V, with Z: T2×d.
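The shape bookkeeping of the exercise can be checked numerically. Below is a minimal sketch of single-head cross-attention in NumPy, assuming hypothetical dimensions T1, T2, d and simple d×d projection matrices (none of these names come from the video; they are illustration only):

```python
import numpy as np

# Hypothetical dimensions: X has T1 tokens, Y has T2 tokens, embedding size d.
T1, T2, d = 5, 3, 8
rng = np.random.default_rng(0)

X = rng.standard_normal((T1, d))  # source sequence: provides keys and values
Y = rng.standard_normal((T2, d))  # target sequence: provides queries

# Learned projections (assumed square d x d here for simplicity).
W_q = rng.standard_normal((d, d))
W_k = rng.standard_normal((d, d))
W_v = rng.standard_normal((d, d))

Q = Y @ W_q  # (T2, d): queries built from Y
K = X @ W_k  # (T1, d): keys built from X
V = X @ W_v  # (T1, d): values built from X

# Compatibility matrix QK^T scaled by 1/sqrt(d): shape (T2, T1).
scores = Q @ K.T / np.sqrt(d)

# Row-wise softmax (numerically stabilized by subtracting the row max).
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

# Final output Z = Softmax(QK^T / sqrt(d)) V: shape (T2, d).
Z = weights @ V

print(scores.shape)  # (3, 5), i.e. (T2, T1)
print(Z.shape)       # (3, 8), i.e. (T2, d)
```

With Y as the query source and X as the key/value source this is cross-attention; setting Y = X recovers self-attention, where all shapes collapse to T1.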
Posted July 9, 2023
