Shorter, lower-overhead pieces: paper reading notes, conference reflections, and project-status updates from QMI Lab and AstroLLM. Long-form essays live at /writing/.
Subscribe via the notes feed.
-
The headline that was within noise
I wrote down three predictions about which retrieval arm would win, then watched the aggregate Recall@10 ranking dissolve into noise with only twenty-nine queries. The findings that held up were single queries, not averages.
-
The label review that lowered my score
A retrieval pilot over 500 real exoplanet papers scored Recall@10 in the low 0.8s; reviewing my own relevance labels pulled it down to 0.69. The drop is the part worth trusting.