llm (5)
What's the big deal, intuition on query-key-value vectors, multiple heads, multiple layers, and more.
21 May 2023  ·  8 min  ·  deeplearning nlp llm
It started with a question that had no clear answer, and led to eight PRs from the community.
Should chat be the main UX for LLMs? I don't think so and believe we can do better.
Also, shortcomings in document retrieval and how to overcome them with search & recsys techniques.
09 Apr 2023  ·  14 min  ·  llm deeplearning learning 🛠
Asking LLMs to generate biographies to get a sense of how they memorize and regurgitate.