Shorter, lower-overhead pieces: paper reading notes, conference reflections, and project-status updates from QMI Lab and AstroLLM. Long-form essays live at /writing/.

Subscribe via the notes feed.

  • The headline that was within noise

    I wrote down three predictions about which retrieval arm would win, then watched the aggregate Recall@10 ranking dissolve into noise across twenty-nine queries. The findings that held up were single queries, not averages.

  • The label review that lowered my score

    A retrieval pilot over 500 real exoplanet papers scored Recall@10 in the low 0.8s; reviewing my own relevance labels pulled it down to 0.69. The drop is the part worth trusting.