• PlexSheep@infosec.pub
    3 months ago

    Thanks for answering my frustrated questions, was a long day yesterday. I’ll try to understand the deeper truths later, but I can already tell the matrix stuff goes over my head.

    • affiliate@lemmy.world
      3 months ago

      anytime. i’ve also had my fair share of long days studying analysis. and i feel like most of my time spent trying to learn analysis was spent fighting with the textbooks. i think the (ε,δ) stuff is to blame for that, but that’s a whole other topic.

      anyways, i was thinking a bit more about the matrix stuff and i think i have a better explanation if you’re interested, since my previous one was probably a bit too abstract. i think it should honestly be criminal to teach multivariate analysis before linear algebra, since a lot of the purpose of multivariate analysis is to turn complicated problems into linear problems. but anyways, here’s the big picture:

      you don’t really need to understand the ins and outs of matrices and be super familiar with them to get a sense of what the total derivative is, and how it should behave. for that purpose, here are some of the highlights of matrices and the total derivative:

      Let A be an m x n matrix. Then:

      • Multiplication with A defines a so-called “linear function” from ℝⁿ to ℝᵐ. Put simply, this means that if you have a line through the origin in ℝⁿ, and you multiply each point in that line with A, then the result is a line through the origin in ℝᵐ. (This is because, under the hood, matrix multiplication is just a bunch of scalar multiplication and addition.)
      • There’s a slight catch to what I said above: sometimes you multiply the points in a line with a matrix and they all get sent to the 0 vector instead of to another line. (Compare this to what happens when A is a 1 x 1 matrix, i.e. a number: multiplying every point in ℝ with A will either give you only the number 0, or it will give you all of ℝ.)
      • Now think about a plane: it’s something spanned by two lines. (The simplest case being ℝ², which is spanned by the x and y axes.) Since matrices send lines to either lines or 0, there are three options for what can happen to a plane: it gets sent to a plane (no line in the plane gets sent to 0), or a line (exactly one line in the plane gets sent to 0), or 0 (every line in the plane gets sent to 0). You can do some fancy math to show that the first case (where a plane gets sent to a plane) is much more likely than the other two cases. So this is where the idea of a tangent plane comes from: approximate a function with a matrix, and the matrix corresponds to a plane that “stays close” to the function.
      • In any case, matrix multiplication is an extremely easy thing for computers to do, because there’s a formula for it. In contrast, evaluating arbitrary functions is not easy, and there’s no formula for that. This is really the main benefit of the total derivative: you can approximate the behavior of a function with matrix multiplication. And we know a whole lot more about dealing with matrices than we do about dealing with random functions.
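
      if it helps to see the bullet points above in action, here’s a minimal sketch in plain python. it’s only an illustration: the helper name `matvec` and the example values `A`, `B` and `v` are made up, not anything standard. it checks that the points t·v on a line get sent to the points t·(Av), and that a different matrix can squash that same line down to 0.

```python
# illustrative sketch: `matvec`, `A`, `B` and `v` are made-up names and values

def matvec(M, x):
    """Multiply the matrix M (a list of rows) with the vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

# a 3 x 2 matrix, so multiplying with it is a function from R^2 to R^3
A = [[1, 2],
     [0, 1],
     [3, -1]]

v = [2, 1]          # direction of a line through the origin in R^2
Av = matvec(A, v)   # direction of the image line in R^3

# every point t*v on the line is sent to t*(Av), which is again a line
line_to_line = all(
    matvec(A, [t * vi for vi in v]) == [t * c for c in Av]
    for t in range(-3, 4)
)

# a 2 x 2 matrix that sends the same line to the 0 vector
# (it also flattens the whole plane R^2 onto the line spanned by [1, 2])
B = [[1, -2],
     [2, -4]]
line_to_zero = all(
    matvec(B, [t * vi for vi in v]) == [0, 0]
    for t in range(-3, 4)
)

print(Av, line_to_line, line_to_zero)
```

      integers are used so the comparisons are exact; with floats you’d want to compare up to a small tolerance instead.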

      So those are two ways to look at the total derivative: you can try to get a geometric understanding of what it does (approximate the function with the best fitting plane), or try to look at why it’s useful (turning harder problems into easier problems). But just to be clear, dealing with matrices is still hard, it’s just comparatively a lot easier than dealing with random functions.
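
      and here’s a rough sketch of the “approximate a function with a matrix” idea, again in plain python. everything in it is an assumption made for illustration: the example function `f`, the helper names `jacobian` and `linear_approx`, and the use of central finite differences to estimate the partial derivatives (in a textbook you’d compute them by hand). it compares f(p + h) with f(p) + Jh for smaller and smaller h.

```python
import math

# illustrative sketch: `f`, `jacobian` and `linear_approx` are made-up names

def f(x, y):
    """An arbitrarily chosen nonlinear function from R^2 to R^2."""
    return [x * x * y, math.sin(x) + y]

def jacobian(func, x, y, eps=1e-6):
    """Estimate the 2 x 2 matrix of partial derivatives with central differences."""
    fx_plus, fx_minus = func(x + eps, y), func(x - eps, y)
    fy_plus, fy_minus = func(x, y + eps), func(x, y - eps)
    return [
        [(fx_plus[i] - fx_minus[i]) / (2 * eps),
         (fy_plus[i] - fy_minus[i]) / (2 * eps)]
        for i in range(2)
    ]

def linear_approx(func, p, h):
    """f(p) + J h: the tangent-plane estimate of f(p + h)."""
    J = jacobian(func, p[0], p[1])
    fp = func(p[0], p[1])
    return [fp[i] + J[i][0] * h[0] + J[i][1] * h[1] for i in range(2)]

p = (1.0, 2.0)
errors = []
for step in (0.1, 0.01, 0.001):
    h = (step, -step)
    exact = f(p[0] + h[0], p[1] + h[1])
    approx = linear_approx(f, p, h)
    # largest gap between the true value and the matrix-based estimate
    errors.append(max(abs(e - a) for e, a in zip(exact, approx)))
    print(step, errors[-1])
```

      the printed error should shrink much faster than the step itself, which is what “stays close” to the function means here.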