Status:
TODO: Write a stronger wrap-up and relink to the taskonomy ideas
Summary of changes:
- Tutorial objectives explanation contained ambiguous terms and needed clarification (fixed)
- Andrew's slides refer to pre-training and post-training. This was fine at the time, but in 2025 these terms have well-established, specific meanings for LLMs, which could be a point of confusion. It may be worth adding a short terminological note just before this video so the distinction is clear.
- Contrastive learning: fixed the comment on real vectors; linked back to ResNet in the prior Neuromatch course
- Conflation of terminology; several terms were used without being explained (pre-normalization, leaky ReLU, "plain networks") (updated)
- Added explanation / intuition for why residual blocks are more successful (see the residual-block sketch after this list)
- The reason the high values appear on the diagonal isn't fully explained in the solution, so I added some more intuition there (see the similarity-matrix sketch after this list)
- Added a bit more information about numerical stability, so students feel more comfortable skipping the technical details that aren't directly related to how contrastive learning works
- Part of the coding solution relies on a function tucked away in the collapsed "Helper Function" section, which can't be found via a Google search; it's effectively hidden and almost impossible to work out without clicking the answer. I think this is too hard on students, so I've added a hint pointing to where the function can be found. Otherwise, students would spend too long stuck on this section before giving in and looking at the solution
- The idea of a non-matching pair is not explained, and I can imagine students getting very confused by this, so I've added an explanation (also illustrated in the similarity-matrix sketch after this list)
- Added "Big Picture" explanation, brought it right back to re-state the main points that should be emphasised
- Amended the section on contrastive learning in the real world to be a bit more intuitive and fixed some small language errors / typos
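
For reference, here is a minimal sketch of the kind of residual block the new intuition refers to. This is a generic PyTorch example, not the tutorial's actual implementation; the class name, channel count, and layer choices are placeholders.

```python
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """Minimal residual block: output = activation(F(x) + x).

    The skip connection means the block only needs to learn a *residual*
    correction to the identity, which keeps gradients flowing even when
    the learned transformation F(x) is close to zero.
    """

    def __init__(self, channels: int = 64):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.bn2 = nn.BatchNorm2d(channels)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.act(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.act(out + x)  # skip connection: add the input back in


# Quick shape check
block = ResidualBlock(channels=64)
x = torch.randn(2, 64, 32, 32)
print(block(x).shape)  # torch.Size([2, 64, 32, 32])
```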
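And a small sketch of why the diagonal of the similarity matrix is high and what a non-matching pair is. Again, this is a generic illustration with made-up tensor names (`z_a`, `z_b`), not the notebook's hidden helper function.

```python
import torch
import torch.nn.functional as F

# Suppose z_a and z_b hold embeddings of two augmented views of the same
# batch of images, so row i of z_a and row i of z_b come from the SAME image.
batch_size, dim = 8, 128
z_a = F.normalize(torch.randn(batch_size, dim), dim=1)
z_b = F.normalize(torch.randn(batch_size, dim), dim=1)

# Cosine-similarity matrix: entry (i, j) compares image i's first view
# with image j's second view.
sim = z_a @ z_b.T

# Diagonal entries (i == j) are MATCHING pairs: two views of the same image.
# After training, these similarities should be large, which is why the
# diagonal of the matrix lights up.
matching = sim.diag()

# Off-diagonal entries (i != j) are NON-MATCHING pairs: views of two
# different images. The contrastive loss pushes these similarities down.
non_matching = sim[~torch.eye(batch_size, dtype=torch.bool)]

print(matching.shape, non_matching.shape)  # torch.Size([8]) torch.Size([56])
```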