llm (25)
How the sharing of 1M Bluesky posts uncovered the strong anti-AI sentiment on Bluesky.
Look at and label your data, build and evaluate your LLM-evaluator, and optimize it against your labels.
27 Oct 2024  ·  13 min  ·  llm eval learning 🛠 🩷
Being a human judge at the Weights & Biases LLM-as-a-Judge Hackathon
Use cases, techniques, alignment, finetuning, and critiques of LLM-evaluators.
18 Aug 2024  ·  49 min  ·  llm eval production 🔥
Special double-feature closing keynote from the 6 authors of the hit O'Reilly article on Applied LLMs.
27 Jun 2024  ·  2 min  ·  llm ai engineering production
Challenges and lessons from deploying LLM experiences: evals, scalability, guardrails.
31 May 2024  ·  2 min  ·  llm engineering production leadership
Structured input/output, prefilling, n-shot prompting, chain-of-thought, reducing hallucinations, etc.
26 May 2024  ·  17 min  ·  llm production 🔥
From the tactical nuts & bolts to the operational day-to-day to the long-term business strategy.
12 May 2024  ·  1 min  ·  llm engineering production leadership 🔥
Building an AI coach with speech-to-text, text-to-speech, an LLM, and a virtual number.
Evals for classification, summarization, translation, copyright regurgitation, and toxicity.
31 Mar 2024  ·  33 min  ·  llm eval machinelearning
Overcoming the bottleneck of human annotations in instruction-tuning, preference-tuning, and pretraining.
Some fundamental papers and a one-sentence summary for each; start your own paper club!
How to use open-source, permissive-use data and collect fewer labeled samples for our tasks.
05 Nov 2023  ·  12 min  ·  llm eval machinelearning python
The biggest deployment challenges, backward compatibility, multi-modality, and SF work ethic.
Evals, retrieval-augmented generation, guardrails, and collecting feedback; all that good stuff.
09 Oct 2023  ·  17 min  ·  llm ai engineering production
Reference, context, and preference-based metrics, self-consistency, and catching hallucinations.
Distinguishing problems with external vs. internal LLMs, and data vs. non-data patterns.
13 Aug 2023  ·  6 min  ·  llm production
Evals, RAG, fine-tuning, caching, guardrails, defensive UX, and collecting user feedback.
30 Jul 2023  ·  66 min  ·  llm engineering production 🔥
Writing drafts via retrieval-augmented generation. Also reflecting on the week's journal entries.
11 Jun 2023  ·  6 min  ·  llm engineering 🛠
What's the big deal, intuition on query-key-value vectors, multiple heads, multiple layers, and more.
21 May 2023  ·  8 min  ·  deeplearning nlp llm 🔥
It started with a question that had no clear answer, and led to eight PRs from the community.
Should chat be the main UX for LLMs? I don't think so and believe we can do better.
Generating Dr. Seuss headlines, fake WSJ quotes, HackerNews troll comments, and more.
Also, shortcomings in document retrieval and how to overcome them with search & recsys techniques.
09 Apr 2023  ·  14 min  ·  llm deeplearning learning 🛠 🔥
Asking LLMs to generate biographies to get a sense of how they memorize and regurgitate.