PODCAST

Mechanical Dreams

An automatically generated podcast about machine learning and natural language processing. The two fictional hosts talk about papers that I want to learn more about on my way to work. It's not good, but it's useful.

All Episodes

11:03
Optimal Linear Decay Learning Rate Schedules and...
Mechanical Dreams ·
2025/01/05
en-us
9:34
Normalization Layer Per-Example Gradients are...
Mechanical Dreams ·
2024/12/20
en-us
11:48
Efficient and Approximate Per-Example Gradient Norms...
Mechanical Dreams ·
2024/12/20
en-us
9:04
Phi-4
Mechanical Dreams ·
2024/12/14
en-us
11:02
Rephrasing natural text data with different languages...
Mechanical Dreams ·
2024/12/13
en-us
11:58
Unveiling and Consulting Core Experts in...
Mechanical Dreams ·
2024/12/12
en-us
8:58
EXAONE 3.5
Mechanical Dreams ·
2024/12/11
en-us
6:14
Model soups - averaging weights of multiple...
Mechanical Dreams ·
2024/12/09
en-us
13:13
Model-Aware Data Selection for Efficient Pretraining...
Mechanical Dreams ·
2024/12/06
en-us
12:39
Nemotron-CC
Mechanical Dreams ·
2024/12/05
en-us
12:03
Tülu 3
Mechanical Dreams ·
2024/12/02
en-us
13:03
The Zamba2 Suite
Mechanical Dreams ·
2024/11/29
en-us
10:07
Small-scale proxies for large-scale Transformer...
Mechanical Dreams ·
2024/11/26
en-us
10:28
Hardware Scaling Trends and Diminishing Returns in...
Mechanical Dreams ·
2024/11/25
en-us
8:33
Scaling Laws and Compute-Optimal Training Beyond...
Mechanical Dreams ·
2024/11/19
en-us
9:10
Understanding WSD Learning Rates
Mechanical Dreams ·
2024/11/18
en-us
7:29
Toward Understanding Why Adam Converges Faster Than...
Mechanical Dreams ·
2024/11/16
en-us
13:36
Amuro & Char - Analyzing the Relationship between...
Mechanical Dreams ·
2024/11/08
en-us
6:27
Evaluation data contamination in LLMs: How do we...
Mechanical Dreams ·
2024/11/07
en-us
7:40
How Does Critical Batch Size Scale in Pre-training?
Mechanical Dreams ·
2024/11/04
en-us
91 results