There are thousands of different proofs of the Pythagorean theorem, and some of them are really cool. The purely trigonometric proof that was found by some high school students recently is a great one. However, I think the greatest proof of all is this little gem that has been attributed to Einstein [1].
Take any right triangle. You can divide it into two non-overlapping right triangles that are both similar to the original triangle by dropping a perpendicular from the right angle to the hypotenuse. To see that the triangles are similar, you just compare interior angles. (It's better to leave that as an exercise than to describe it in words, but in any case, this is a very commonly known construction.) The areas of the two small triangles add up to the area of the big triangle, but the two small triangles have the two legs of the big triangle as their respective hypotenuses. Because area scales as the square of the similarity ratio (which I think is intuitively obvious), it follows that the squares of the legs' lengths must add up to the square of the hypotenuse's length, QED.
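The construction above can be checked numerically. This is a minimal sketch, not a proof: the coordinate layout (right angle at the origin, legs on the axes), the 3-4 example triangle, and the helper names `altitude_foot`, `tri_area`, and `dot_at` are all assumptions made for illustration.

```python
def altitude_foot(a, b):
    """Foot D of the perpendicular from the right angle C to the hypotenuse AB.

    Assumed layout (not from the thread): C = (0, 0), A = (b, 0), B = (0, a).
    """
    ax, ay = float(b), 0.0
    bx, by = 0.0, float(a)
    dx, dy = bx - ax, by - ay  # direction of the hypotenuse AB
    # Orthogonal projection of the origin C onto the line through A and B.
    t = -(ax * dx + ay * dy) / (dx * dx + dy * dy)
    return (ax + t * dx, ay + t * dy)


def tri_area(p, q, r):
    """Area of triangle pqr via the shoelace (cross product) formula."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2


def dot_at(v, p, q):
    """Dot product of the vectors v->p and v->q (zero means a right angle at v)."""
    return (p[0] - v[0]) * (q[0] - v[0]) + (p[1] - v[1]) * (q[1] - v[1])


a, b = 3.0, 4.0  # example legs, chosen arbitrarily
C, A, B = (0.0, 0.0), (b, 0.0), (0.0, a)
D = altitude_foot(a, b)

big = tri_area(C, A, B)
small_1 = tri_area(C, A, D)  # its hypotenuse is the leg CA (length b)
small_2 = tri_area(C, D, B)  # its hypotenuse is the leg CB (length a)

print("foot of altitude:", D)
print("areas:", big, small_1 + small_2)           # the two small areas add up to the big one
print("right angles at D:", dot_at(D, C, A), dot_at(D, C, B))  # both approximately 0
```

The two small triangles share the altitude CD and each has a right angle at D, which is what makes them similar to the original once the remaining angles are compared.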
It's really a perfect proof: it's simple, intuitive, as direct as possible, and it's pretty much impossible to forget.
[1] https://paradise.caltech.edu/ist4/lectures/Einstein%E2%80%99...
This proof assumes that the area of a triangle is some function k c^2 of the hypotenuse c where k is constant for similar triangles.
This doesn’t seem super obvious to me, and it’s a bit more than just assuming area scales with the square of hypotenuse length, it indeed needs to be a constant fraction.
To me that truth isn’t necessarily any less fundamental than the Pythagorean theorem itself. But to each their own.
BTW Terence Tao has a write-up of this proof as well: https://terrytao.wordpress.com/2007/09/14/pythagoras-theorem...
> This proof assumes that the area of a triangle is some function k c^2 of the hypotenuse c where k is constant for similar triangles.
It is elementary to show that the area of a triangle is base * height / 2. (It follows from the fact that you can make a rectangle out of it using two identical sub-triangles. I assume you're willing to concede that the area of a rectangle is base * height.) If you scale your triangle by c, both base and height will be multiplied by c, and 2 will not.
> This proof assumes that the area of a triangle is some function k c^2 of the hypotenuse c where k is constant for similar triangles.
Area in what units? "Square" units? But we're free to choose any unit we want, so I choose units where the triangle itself with hypotenuse H has area H^2 units. To justify that, I think the only thing we need is the fact that area scales as the square of length. (There's that word "square" again, which implies a specific shape that is actually completely arbitrary when talking about area. Perhaps it's better to say that "area scales as length times length.")
> To me that truth isn’t necessarily any less fundamental than the Pythagorean theorem itself.
I think the Pythagorean Theorem is surprisingly non-fundamental, in that you can get surprisingly far without it. It's surprising because we usually learn about it so early.
You got a bunch of responses already, here is an intuitive reason.
In similar triangles all distances are scaled by a factor k, by definition. Then, intuitively the areas are scaled by a factor of k^2, since you obtain an area by multiplying two distances.
So the ratio of the area over the hypotenuse is scaled by a factor of k^2/k=k.
It is not hard to confirm the intuition that the areas are scaled by a factor of k^2, since it is precisely the product of the lengths of the two sides adjacent to the right angle.
I think it would be clearer with an explicit function of a rectangle's area with respect to its diagonals:
A = x y
x = r cos(θ)
y = r sin(θ)
=> A(r) = r² cos(θ) sin(θ)
A(c) = A(a) + A(b)
=> c² cos(θ) sin(θ) = a² cos(θ) sin(θ) + b² cos(θ) sin(θ)
=> c² = a² + b²
I don't get his "modern" proof. Specifically the step where he says "it's easy to see geometrically that these matrices differ by a rotation" seems to be doing a lot of heavy lifting. The first matrix transforms e1 to (a,-b), the second scales e1 to (c,0). If you can see that you obtain one of these vectors by rotating the other, then you've shown that their lengths are equal (i.e. a²+b²=c²), which is what we want to show in the first place.
Draw it on graph paper:
Set B as the origin, and let BC (the 'a' side) be on the positive side of the x-axis. Let AC be on the positive side of the y-axis.
The left matrix is a clockwise rotation and scaling. This is clearly seen if you draw the transformation applied to the two axis basis vectors. (The scaling factor isn't obvious yet.)
Then the left matrix carries (1,0) to the side AB, which has magnitude c, and carries (0,1) to a perpendicular line of the (importantly) same magnitude.
So it's a rotation and a scaling by c.
The right matrix obviously is a scaling by c.
> If you can see that you obtain one of these vectors by rotating the other, then you've shown that their lengths are equal (i.e. a²+b²=c²), which is what we want to show in the first place.
Yes, that's the point!
You're assuming that we know that the squared length of the vector (a, -b) is a²+b². We don't know that.
We start by assuming that the position vector (a, -b) has length c. This implies that we can rotate that vector until it becomes the position vector (c, 0).
As you note, we can create the two vectors above from (1, 0) using linear transformation matrices [(a, b), (-b, a)] and [(c, 0), (0, c)]
So we could create the position vector (c, 0) by starting at (1, 0), applying the linear transformation [(a, b), (-b, a)], then applying a rotation to bring it back to the e1 axis.
Thus for some rotation matrix R,
R × [(a, b), (-b, a)] = [(c, 0), (0, c)]
The determinant of a rotation matrix is 1, so the determinant of the left side is 1×(a²+b²), while the determinant of the right side is c², which is how we end up with a²+b²=c².
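The matrix identity above can be sanity-checked with concrete numbers. This is only a sketch under stated assumptions: the values a = 3, b = 4, c = 5 are chosen for illustration, the rotation R is written with entries a/c and b/c (an assumption about which rotation is meant), and the helpers `matmul2` and `det2` are hypothetical names. Note that checking det R = 1 numerically already presupposes a² + b² = c², so this illustrates the argument rather than resolving the circularity question.

```python
def matmul2(p, q):
    """Product of two 2x2 matrices given as nested tuples of rows."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def det2(m):
    """Determinant of a 2x2 matrix given as nested tuples of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


a, b, c = 3.0, 4.0, 5.0  # c is taken as given here, not derived

M = ((a, b), (-b, a))                   # sends e1 = (1, 0) to (a, -b)
R = ((a / c, -b / c), (b / c, a / c))   # counter-clockwise rotation taking (a, -b) to (c, 0)
RM = matmul2(R, M)

print("R x M =", RM)                    # approximately ((c, 0), (0, c))
print("det R =", det2(R))               # approximately 1
print("det M =", det2(M), " c^2 =", c * c)
```

Taking determinants of both sides of R × M = c·I gives det(R)·(a² + b²) = c², which is the step the comment above relies on.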
Now the only thing which I'm not sure of is whether there's a way to show that the determinant of a rotation matrix is 1 without assuming the Pythagorean identity already.
> Now the only thing which I'm not sure of is whether there's a way to show that the determinant of a rotation matrix is 1 without assuming the Pythagorean identity already
You can define the determinant that way. Now the question is why the cross multiplication formula for determinant accurately computes the area.
You can prove that via decomposition into right triangles https://youtu.be/_OiMiQGKvvc?si=TyEge1_0W4rb648b
Or you can go in reverse from the coordinate formula, to prove that the area is correctly predicted by the determinant.
Yep - I'm just not sure if any of those proofs implicitly assume Pythagoras, and haven't thought through them properly.
I was initially going to say we know that det R = 1 by using the trigonometric identity cos²x+sin²x=1, but then found out that all the proofs of it seem to assume Pythagoras, and in fact, the identity is called the Pythagorean trigonometric identity.
> This proof assumes that the area of a triangle is some function k c^2 of the hypotenuse c where k is constant for similar triangles.
I would state it differently: Given the specific triangle, the ratio between the area of the square on the hypotenuse and the area of the triangle is some constant k that is invariant under scaling of the triangle.
This should be intuitively obvious: When you have a picture with some shapes in it, scaling the picture won’t change the relative proportions of the shapes in the picture. You don’t have to know the absolute dimensions of the picture to determine the area ratio of two shapes within the picture. The constant k above will be different for differently shaped triangles, but will be the same for triangles of the same shape (same angles).
So, for a given triangle T1 with hypotenuse c we have, for some k: area(T1) = k x c²
Now, we subdivide T1 into two smaller triangles T2 and T3 that have the property of both being a scaled version of T1, and their hypotenuses being the a and b of T1. Hence we have:
area(T2) = k x a²
area(T3) = k x b²
all with the same k, since they have the same shape (only differing by scaling).
Because we have
area(T1) = area(T2) + area(T3),
it follows that
k x c² = k x a² + k x b²
and since k ≠ 0, we get
c² = a² + b².
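The claim that T1, T2, and T3 share the same k can be checked on an example. This is a minimal sketch, assuming a 3-4 right triangle placed with its right angle at the origin; the helper names `tri_area` and `shape_constant` are made up for illustration, and `math.dist` computes Euclidean distance, so this is a numeric check and not an independent proof.

```python
import math


def tri_area(p, q, r):
    """Area of triangle pqr via the shoelace (cross product) formula."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2


def shape_constant(right_vertex, p, q):
    """k = area / hypotenuse**2 for a right triangle with its right angle at right_vertex."""
    return tri_area(right_vertex, p, q) / math.dist(p, q) ** 2


a, b = 3.0, 4.0  # example legs, chosen arbitrarily
C, A, B = (0.0, 0.0), (b, 0.0), (0.0, a)

# Foot D of the altitude from C, as a point on AB at fraction t measured from A.
t = b * b / (a * a + b * b)
D = (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))

k1 = shape_constant(C, A, B)  # T1: hypotenuse AB (length c)
k2 = shape_constant(D, C, B)  # T2: hypotenuse CB (length a)
k3 = shape_constant(D, C, A)  # T3: hypotenuse CA (length b)

print(k1, k2, k3)  # the three values agree up to rounding
```

Since the three constants agree, dividing area(T1) = area(T2) + area(T3) by k leaves exactly c² = a² + b².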
In pictures (as far as possible with ASCII art):
(The three are supposed to be the same exact picture, just rotated and scaled, and T1 can be split into T2 and T3, hence c² must also be the sum of a² and b².)
[ASCII diagram: the square a² with triangle T2 attached, plus the square b² with triangle T3 attached, equals the square c² with triangle T1 attached.]
The underlying “miracle” is that you can subdivide any right triangle into two smaller copies of itself. The Pythagorean theorem then follows immediately from that. This subdivision capability is something that might be amenable to some further underlying explanation.
P.S.: T2/T3 are also mirrored, not just scaled, from T1, but obviously that doesn’t change anything about their area.
> where k is constant for similar triangles.
you can see that by simply scaling the figure of (the triangle + square on its hypotenuse) as a whole; whatever size the triangle is the ratio of the two pieces doesn't change
> This doesn’t seem super obvious to me, and it’s a bit more than just assuming area scales with the square of hypotenuse length, it indeed needs to be a constant fraction.
The second half of your sentence is not correct; if area scales with the square of any one-dimensional measurement (including hypotenuse length, because the hypotenuse is one-dimensional), that is sufficient to prove the theorem.
The statement you're looking for is: "triangle A is similar to triangle C with a length ratio of a/c, therefore the area of triangle A is equal to the area of triangle C multiplied by the square of that ratio".
It is in fact necessary that the area will scale with the square of hypotenuse length, because the hypotenuse is one-dimensional and area is two-dimensional. If you decided to measure the area of the circle that runs through the three corners of the triangle, the triangle's area would scale linearly with that.
It isn't clear to me what scenario you're thinking might mess with the proof.
> This proof assumes that the area of a triangle is some function k c^2 of the hypotenuse c where k is constant for similar triangles.
So, for similar shapes, you can set your own measurements.
1. Say I have two triangles X and Y and they're similar. I take a straightedge, mark off the length of the longest side (x) of triangle X, and say "this length is 1". Then I calculate the area of triangle X. It will be something. Call it k.
2. Now I take a second straightedge, mark off the length of the longest side (y) of triangle Y, and I label that length "1". I can calculate the area of triangle Y and, by definition, it must be k. But it is equal to k using a scale that differs from the scale I used to measure triangle X.
3. We can ask what the area of triangle Y would be if I measured it using the ruler marked in "x"es instead of the one marked in "y"s. This is easier if we have the same area in a shape that's easier to measure. So construct a square, using the "y" ruler, with area equal to k.
4. Now measure that square with the "x" ruler. The side length, measured in y units, is √k. Measured in our new x units, it's (y/x)√k. When we square that, we find that the x-normed area is equal to... k(y/x)².
This is why it's obvious that k must be constant for similar triangles. k is just a name for the scale-free representation of whatever it is that you're measuring. It has to be constant because, when you change the scaling that you use to label a shape, the shape itself doesn't change. And that's what similarity means.
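Steps 1 to 4 above amount to a small piece of arithmetic, sketched below. The function name `area_in_x_units` and the sample values for k, x, and y are assumptions for illustration; x and y are taken to be the two longest-side lengths expressed in some common reference unit.

```python
import math


def area_in_x_units(k, x, y):
    """Area of triangle Y when measured with the ruler whose unit length is x.

    k is the area of Y measured with the ruler whose unit length is y.
    """
    side_in_y_units = math.sqrt(k)               # side of a square with area k (y ruler)
    side_in_x_units = side_in_y_units * (y / x)  # the same segment re-measured with the x ruler
    return side_in_x_units ** 2


k, x, y = 0.24, 5.0, 10.0  # sample values, chosen arbitrarily
print(area_in_x_units(k, x, y), k * (y / x) ** 2)  # the two numbers agree up to rounding
```

Changing rulers only rescales lengths by y/x, so the area picks up the factor (y/x)² while k itself stays fixed.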
Unfortunately, this doesn't work for me because I have difficulty visualizing things, and I suppose a good number of people have the same problem.
So for one particular subset of the population it is difficult or impossible to understand, and because it cannot be understood it will not be remembered.
Not complaining, just noting the amusing thing that different explanations may have all sorts of problems.
Although if there was a video of it I guess I would understand it then. Not sure if everyone with visualization issues would though.
To be fair, I'm constrained by plain text on Hacker News. The argument I wrote down requires a diagram to be fully understood, so I described it in words expecting the reader to draw it themselves, or at least mentally visualize it (for those used to doing it).
To be clear though, as far as I know every proof of the Pythagorean Theorem requires some sort of diagram, and the one I gave requires literally the least amount of drawing out of all the proofs (which is a bold claim, but call it a conjecture). That's why I felt comfortable writing out the proof just in words.
Also, iterating that triangular subdivision gives rise to the Pythagorean tree fractal: https://en.wikipedia.org/wiki/Pythagoras_tree_(fractal)
Indeed, a wonderful proof. It does, though, make one implicit assumption that if one stretches the fabric by the same amount, all holes in it stretch by the same amount. In particular, it assumes that triangle stretching is size-independent. Perhaps there are fabrics where that is not true...
You mean non-Euclidean fabric?
https://www.johndcook.com/blog/2022/09/08/trig-hyperbolic-ge...
https://math.hmc.edu/funfacts/spherical-pythagorean-theorem/
That is one possibility. Probably the only one, though.
The Pythagorean theorem is only true in Euclidean space, where that “stretching” assumption is true. So you are right about there being assumptions, and indeed they are imposing limitations to the applicability of the theorem.
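To make the non-Euclidean case concrete, here is a small sketch using the spherical Pythagorean theorem, cos(c/R) = cos(a/R) cos(b/R), for a right triangle on a sphere of radius R. The function name `spherical_hypotenuse` and the sample leg lengths are assumptions for illustration.

```python
import math


def spherical_hypotenuse(a, b, radius=1.0):
    """Hypotenuse c of a spherical right triangle: cos(c/R) = cos(a/R) * cos(b/R)."""
    return radius * math.acos(math.cos(a / radius) * math.cos(b / radius))


for leg in (0.01, 0.1, 1.0):
    c = spherical_hypotenuse(leg, leg)
    # Compare c^2 with the Euclidean prediction a^2 + b^2 = 2 * leg^2.
    print(leg, c * c, 2 * leg * leg)
```

For triangles that are small relative to the sphere, c² is very close to a² + b²; for large ones it falls noticeably short, which is the size-dependence the "fabric" comment is pointing at.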
> it follows that...
"Now just draw the rest of the owl."
Does it not feel like you skipped something here? The areas add up and area scales quadratically, therefore... Pythagorean Theorem? It definitely is not clear how this follows, even after the questionable assumption that it's obvious area scales quadratically.
He didn't skip anything but he left the "obvious" details (for a mathematician) to the reader:
Let C be the area of the big triangle, A and B be the areas of the two small triangles. By construction we know that C = A + B. Moreover, a, b, c are the hypotenuses of the triangles A, B and C.
The area scaling quadratically with the similarity ratio means that
A = (a/c)² C, and B = (b/c)² C.
Now, plug this into A + B = C, cancel C, rearrange.
The math is obvious enough, I agree. But the description of the approach feels like it's lacking something - specifically, something along the lines of "now write down the scaling equations and simplify the area summation." I feel like it's not at all clear they're switching to an algebraic argument there.
Mathematicians explain things the way I imagine musicians would if the ancient Greeks had insisted on making all musical instruments in a range audible only to dogs.
I'd be like, "How do I actually hear the difference between a major and minor sixth?" And the musician would be like, "Just play them into the cryptophone and note the difference in the way your dog raises its eyebrows."
The very few remaining musicians in this hellscape would be the ones who are unwittingly transposing everything to the human range in their sleep, then spending the day teaching from the Second Edition of the Principles of Harmonic Dog Whistling for all us schmucks.
Luckily we don't live in that musical universe. But mathwise, something like that seems to be the case.
Look, I think it's pretty hard for most of us to read long math arguments in plain text, so I wrote in the simplest language I could, leaving the simple details for the reader to fill in.
I will add that in the vast majority of mathematical literature, both in pedagogy and in research, the active participation of the reader is assumed: the reader is expected to verify the argument for themselves, and that often includes filling in the details of some simple arguments. That's exactly why math literature uses the plural first-person "we," because it's supposed to be as if the writer and reader are developing the argument together.
In contrast, listening to music can be purely passive (but doesn't need to be).
The thing is that in my head there is no algebraic argument: we go from (1) similarity ratios being A:B:C and (2) the first two areas adding up to the third area, straight to the conclusion of A^2 + B^2 = C^2. I think your point about a step being missing here is valid, but when I search my intuition, it's still not coming up as algebraic. I suspect this is the same for others like me who are inclined to think geometrically, but I'd like to hear their opinions.
Here's an attempt at filling in the geometric intuition with something more concrete. You know how it's common to visualize the theorem with squares on the three sides of the triangle and saying that the two small squares add up to the big one? And then everyone stares at it and says "huh?" because that fact is far from obvious from that diagram. Here's the thing though, we're free to choose different area units if we want. So just choose units where our triangle itself with a given hypotenuse H has area H^2 units. Then we can give the argument above without any extra factors and cancellations.
To fully justify the "choose any units," you do need to check that it's logically consistent, which you could say is more missing steps, but I think this idea is far more fundamental than the Pythagorean Theorem. Our use of squares to define the fundamental units of area really is a completely arbitrary choice. We call them "square units," which already biases us to think of area in a specific way, but there's absolutely no reason we can't use any other shape. Of course squares are convenient because you can stack them up neatly and count them, but that doesn't seem to be helpful at all in this context, so it's natural to choose something else.
> So just choose units where our triangle itself with a given hypotenuse H has area H^2 units.
This is not at all trivial. You're claiming you can choose units in such a way (reusing my notation from before) that simultaneously
A = a², B = b², C = c².
Intuitively, you can do that precisely because the triangles are similar and area is quadratic in the similarity ratio. But there is definitely some algebra behind that.
To be clear, I'm just claiming that we can choose a specific area unit, and the three equations you wrote are then obvious consequences of that. It's true, you do need to assume area scales as the square of length, but IMO that's a pretty fundamental fact, and I think that's intuitive for many others. But as always, YMMV.
> The purely trigonometric proof that was found by some high school students recently is a great one.
It was geometric, using trigonometric vocabulary.
Trig relies on some geometric assumptions. The definition of sin and cos is going to require a right triangle, for instance.
I think proof #6 on this page is easier to follow and uses the same similar triangles. But then it’s just some basic algebra without assuming anything about areas of similar triangles :)
That one's also very neat!
> The purely trigonometric proof that was found by some high school students recently is a great one.
I failed to understand what was so cool about that proof. It relied on concepts such as Cartesian coordinate systems, and the measure of an angle (not just a pure geometric concept), and even concepts like convergence of infinite sums, which weren't purely geometric.
Geometry was formalized in the 20th century and has moved past informal proofs.
Einstein's proof relies on the fact that the theorem works with any shape, not just squares, such as pentagons: https://commons.wikimedia.org/wiki/File:Pythagoras_by_pentag...
Or any arbitrary vector graphics, like Einstein's face. So in the proof, the shape on the hypotenuse is the same as the original triangle, and on the other two sides there are two smaller versions of it, which when joined have the same area (and shape) as the big one.
Fair enough. However, none of the hundreds or thousands of proofs explain it. They all prove it, like by saying "this goes here, that goes there, this is the same as that, therefore logically you're stupid," but it still seems like weird magic to me. Some explanation is missing.
Draw a square around Einstein's face. Call the side length of the square a and the area of the square A. We have A=a^2. Einstein takes up some portion p < 1 of that area, so Einstein has area E = pA. Now we scale the whole thing by factor f. So the new square has side lengths fa, and thus area A' = (fa)^2 = f^2×a^2 = f^2×A. Since the relative portion the face takes up doesn't change with scaling, the face now has size pA' = p×f^2×A = f^2 × pA = f^2 E.
Does that help or was that not the part you were missing?
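The scaling argument above (any shape's area picks up a factor of f² when the picture is scaled by f) can be tried on an arbitrary polygon. This is a minimal sketch: the polygon is an arbitrary stand-in for "Einstein's face", and the helper names `polygon_area` and `scale` are assumptions.

```python
def polygon_area(points):
    """Shoelace formula for a simple polygon given as a list of (x, y) vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def scale(points, f):
    """Scale every vertex by the factor f about the origin."""
    return [(f * x, f * y) for x, y in points]


shape = [(0, 0), (4, 0), (5, 2), (3, 5), (1, 3)]  # arbitrary example shape
f = 2.5

E = polygon_area(shape)
E_scaled = polygon_area(scale(shape, f))
print(E, E_scaled, f ** 2 * E)  # the scaled area matches f^2 * E
```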
No, that part is fine: I'm happy with the fact that it works with arbitrary shapes. What bothers me is that the area on the hypotenuse is equal to the sum of the areas on the other two sides, when the triangle has a right angle.
This is somewhat like saying that I'm troubled by the fact that 1+1=2, I know. But that's a potentially distracting sidetrack, let's not get into that one.
What definition of area are you using in the first place, for non-square objects? Most people find area intuitive and informal, but if you describe area formally, it should be easy to use your definition to account for scaling.
I was saying two separate things. Thing 1, the non-square shapes are relevant to Einstein's nice proof. Thing 2, considering squares now if you like, Pythagoras's theorem has a magical quality which proofs can't dispel.
If you travel some distance, square it, travel some other distance perpendicularly, square that too, and add the results, you get the square of the straight distance from start to finish. Every proof just seems like a reformulation of this freaky fact.
That is interesting and made me think. Only after following some of the other subcomments did I manage to understand it. Personally, replacing the term "similarity ratio" with "scale factor" made all the difference. At first I thought it was a circular argument, relying on pythag to prove pythag, but that scale factor is actually the key, along with the fact that side lengths scale linearly but the area scales quadratically. It feels similar to the trick where adding logarithms gives us multiplication.
Yeah, the similarity ratio/scale factor and its connection to area is the key part of the proof. Sorry, that could have been clearer.
All good! Thanks for the maths though, just when I think I understand it... I don't! It's one of those love-hate relationships: I love it, it hates me.