W2D3: Microlearning


One intro video and one long lecture, listed as being 2 hours long.
It's dense and full of points where students who are not confident at this level of complexity could trip up. However, it is difficult to edit down:
  • Against the global goal of reducing content by 30%, with tutorials advertised as taking 3 hours, this denser material is already offset by a shorter listed completion time
  • Reducing it any further might leave students with too little to do on that day, and the 30% overall reduction can't come from taking too much away from a single day



Things to check / do
  • Check the notation L(delta W): is the argument really the perturbation delta W, not just W?
  • Reinforce the idea of (W' - W) being the directional derivative
  • Clarify that psi is a matrix of perturbations (kind of implied, but could be made more explicit)
  • Very confusing for L(psi) and L(0) to actually mean L(W') and L(W) (with and without perturbations)
  • Checked the code and that IS what is going on - definitely should be clarified in the text (see the sketch after this list)
  • Ask Xaq what the purpose is of keeping this notation and whether it wouldn't be clearer to use W' and W; I'd certainly understand it much more clearly that way
  • Also, why sigma alone and not sigma multiplied by an identity matrix, to show it's not just a single value - OR IS IT? The same shift in every weight, rather than randomly sampled and independent?
  • Clarify where the MLP code can be found for students to study (the "helper functions" section)
  • Clarify the roles of Wh and Wy as the layer weights (input-to-hidden and hidden-to-output)
  • The description of the network in Section 1 should be tied more closely to the variable names in the preceding code - this would be slightly out of order, but still helpful for students to review and go back and check their understanding
  • Make it clear that Psi is a matrix of random values drawn from a multivariate Gaussian, and not a single value replicated so that all weights shift in the same direction (or maybe it is??). Ah, it is a matrix, but because this model is about local updates, it makes more sense to think of it at the single-synapse level and keep lowercase sigma
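
For whoever edits the notebook, a minimal NumPy sketch of what the code check above confirmed. The names Wh, Wy, sigma, and Psi are taken from the notes; the ReLU nonlinearity, MSE loss, dimensions, and function names are my assumptions, not the tutorial's actual helper functions. It shows that L(Psi) and L(0) are simply L(W + Psi) and L(W), and that Psi is one independent Gaussian draw per synapse with std sigma.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy dimensions (hypothetical; the tutorial's helper functions define the real ones)
    n_in, n_hidden, n_out, n_samples = 5, 10, 2, 20

    # Layer weights as named in the notebook: Wh (input-to-hidden), Wy (hidden-to-output)
    Wh = rng.normal(size=(n_hidden, n_in))
    Wy = rng.normal(size=(n_out, n_hidden))

    x = rng.normal(size=(n_in, n_samples))
    y_target = rng.normal(size=(n_out, n_samples))

    def forward(Wh, Wy, x):
        # Two-layer MLP with ReLU hidden units (assumed nonlinearity)
        h = np.maximum(0.0, Wh @ x)
        return Wy @ h

    def loss(Wh, Wy, x, y_target):
        # Mean squared error, written as L(W) in the tutorial's notation
        y = forward(Wh, Wy, x)
        return 0.5 * np.mean((y - y_target) ** 2)

    # Perturbations: one independent Gaussian sample per synapse, std sigma.
    # This is the Psi in "L(Psi) vs L(0)", i.e. L(W + Psi) vs L(W).
    sigma = 0.01
    Psi_h = sigma * rng.normal(size=Wh.shape)
    Psi_y = sigma * rng.normal(size=Wy.shape)

    L_0 = loss(Wh, Wy, x, y_target)                    # L(0)   == L(W)
    L_psi = loss(Wh + Psi_h, Wy + Psi_y, x, y_target)  # L(Psi) == L(W + Psi)

    # For small sigma, (L_psi - L_0) approximates the directional derivative of L
    # along Psi, which is the quantity the weight-perturbation update exploits.
    print(L_psi - L_0)

If the text keeps the L(Psi) / L(0) notation, a comment like the two inline ones above (mapping L(0) to L(W) and L(Psi) to L(W + Psi)) would already remove most of the confusion flagged in the list.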