
> I think you are not using 'proof' in the way that most people would understand it.

I have graded many students' math exams over the years.

A "significant" amount of the "proofs" provided by the students in those exams were wrong in one way or another.

Many students continued the exams under the assumption that what they had just incorrectly proved was actually correct.

They had no way to verify whether their proofs were correct or not. No theorem provers were allowed in the exam.

Yet all of them called what they wrote "a proof".

---

This is pretty similar to how Rust works.

Without unsafe, Rust proves your programs sound for you.

Unsafe lets you supply your own soundness proofs for parts of the program that the compiler does not know how to prove (or disprove).

If the proofs are not correct, Rust makes no guarantees.
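A rough sketch of what that looks like in practice (the function and the invariant here are made up for illustration): the SAFETY comment is the programmer-supplied "proof", and the compiler takes it entirely on trust.

  // A hypothetical sketch: the caller's obligation is the "proof".
  /// Returns the first element of `v`.
  ///
  /// # Safety
  /// The caller must guarantee that `v` is non-empty.
  unsafe fn first_unchecked(v: &[u32]) -> u32 {
      // SAFETY (asserted by us, never checked by the compiler):
      // the caller promised v.len() >= 1, so index 0 is in bounds.
      unsafe { *v.get_unchecked(0) }
  }

  fn main() {
      let v = vec![1, 2, 3];
      // SAFETY: v has three elements, so it is non-empty.
      println!("{}", unsafe { first_unchecked(&v) });
  }

If the argument in that comment is wrong, nothing complains; the program is simply unsound.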

The claim that "because Rust does not prove these for you, they are not proofs" is unfair.

That's like claiming that any "proof" that has not been proved in a theorem prover does not deserve to be called "a proof".

I respect your way of seeing this. I just don't see it that way.




> Putting aside their steep learning curve, formal proof methods do not guarantee that the code you've written is bug free.

People seem to always bring this up but what's better? Verified code is about as close as you're ever going to get to bug free.

If you're doing a large proof, getting the specification wrong and not eventually noticing while working on the proof isn't common. You'll likely have something that isn't provable and get proof goals that don't make any sense. I don't think it's a common problem for mathematicians either that they end up proving the wrong thing.

I haven't heard of any examples of complex verified programs where later someone found a huge flaw in the specification. It would be like writing a set of unit tests and a complex program that passed them which had behaviour that goes against what you wanted.


> However, if you make a mistake in your proof, then the compiler will happily verify the wrong thing.

I don't use theorem provers in my day-to-day tasks, but I took the 101 course at university. Back then I wrote basic proofs with the tools, and if your problem is correctly stated and entered into the system, you simply CAN'T arrive at an invalid proof. The only way to fool the compiler is to use logical shortcuts, which are clearly defined in terms of language keywords, so you know exactly where the weaknesses in the proof are and can look for them.

Edit: I don't know every theorem prover out there, so to give a bit more context about my experience: it was with the Coq theorem prover.
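For what it's worth, other provers draw the same line. A minimal sketch in Lean 4 (not Coq, just an analogous illustration): the shortcut is an explicit keyword that the compiler flags loudly, much like Coq's Admitted.

  -- The theorem is accepted into the file, but Lean emits a
  -- "declaration uses 'sorry'" warning, so the gap cannot hide.
  theorem add_zero_example (n : Nat) : n + 0 = n := by
    sorry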


> If you could actually write the proof, then you would not want the dynamic checks, as all they offer is protection against errors that you can prove you have not made.

Even if I write a formal proof, I'm not guaranteed that my program won't index out of bounds. Formal proofs have errors in them all the time. I want dynamic checks to catch me in those cases.
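For instance (a made-up sketch, not anyone's real code): even with a paper proof that the index is in bounds, a cheap runtime check still catches a broken proof instead of silently reading out of bounds.

  // Hypothetical example: suppose we "proved" on paper that idx < data.len().
  fn lookup(data: &[u32], idx: usize) -> u32 {
      // Belt and braces: check the claimed invariant at runtime anyway.
      assert!(idx < data.len(), "the paper proof was wrong");
      data[idx] // indexing is itself bounds-checked and panics on failure
  }

  fn main() {
      let v = [10, 20, 30];
      println!("{}", lookup(&v, 2));
  }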

Further to the point, let me reiterate: programmers do not (in almost all cases) write these proofs you are talking about.


> But at the same time, people often call either flawed proofs or purported proofs formal proofs in a looser sense.

They are wrong. A purported proof is only a proof if it is actually correct. If you cannot reliably come up with a proof, not mere purported proofs, then just be honest and say so. We are not in a consultant-client relationship, so you gain nothing by lying about your skills to me.

> You know that my point was that a manually verified proof might still be wrong.

Only if you cannot reliably come up with correct proofs.

> And yet rather than addressing that point you decide to evade it by an ill-considered nit pick on whether an incorrect proof is still a proof.

It is not an “ill-considered nitpick”. It is essential to my point. If you can prove that your assertions hold, you do not need to check them dynamically. (Of course, this is logically a triviality, and it is awful that I need to explicitly make this point.)


> Proofs are a big part of any decent data structures and algorithms course.

There is no need for anything more than loosely agreed-upon pseudocode for the proofs, though; actually, many proofs can be made without any code.

> So you want a language that makes it possible for students to prove things about programs written in it, without hand-waving away large parts of its semantics

You can write proofs in any language. I'm not convinced that using a language that is easy to automatically prove correctness of is useful for learning proofs. If anything it's the perfect opportunity for syntax and semantics to blind the student to the fundamental assumptions and arguments that allow them to understand and write proofs.


> And then save time because you can be sure that the proofs you write are correct.

For better or worse, we both know that's not required to be a successful mathematician. I remember reading a presentation from a guy advocating mechanised proof who realised that almost every important proof in his field had been later shown to be false.

> Did you try isar?

I had a go at using Isabelle (in general) and truthfully never really got into Isar - I discovered that both my supervisors have written large theories in HOL4 (JIT compilers etc), but have no recent experience with Isabelle, so that's what I'm using. I've seen examples of Isar which look nice, but I'm skeptical about its scalability. You make a good point, though.

> It's not at all a good point. There are theorem provers that rely on the correctness of just a very small kernel. Everything on top (libraries) is then guaranteed to follow the rules imposed by the kernel.

Yes, of course - but the richness of a theorem prover comes from the logics you define on top of it.

If the library implementation of a particularly complex structure, defined externally to the logic with no easy-to-coverify definition, contains an uncaught mistake in its definition, any entailed inconsistencies are transmitted to the theorems that rely on that data structure.

Sure, your theorems are correct in the logic - but they might not be useful. I obviously pick the more engineering based example, but that's merely due to my being ignorant of most pure mathematics - there are probably analogues.

This is a growing problem for companies like Intel, who use a software model to test their firmware before the hardware is available [1] - they want to verify that the operations are identical.

[1] - http://www.cs.utexas.edu/users/hunt/FMCAD/FMCAD13/papers/44-...


> Then your proof would be rejected by the compiler.

Only for some types of mistake, surely. If that was always the case, we'd have a compiler that could read minds.

> As for whether you're proving the right thing or the wrong thing, a type system is no less helpful than a test suite.

Really? I can write down in my test suite, "assert (add 1 1) == 2". Anyone can come along and look at that and make sure it matches the spec. We can add additional tests for various bounds, and possibly use a quickcheck-like tool in addition, and for 99.99% of use cases be happy and confident that we're at least writing the right thing.

What's the type for that and does it actually have any resemblance to "the add function adds two numbers together", or do I have to read a couple of papers and have a background in university-level math to convince myself that the proof actually does what it says?
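For comparison, a minimal Lean 4 sketch (names are my own) of what the two roughly look like in a proof assistant: the "test" is one concrete case, while the type states the property for every input.

  -- The test: a single concrete case, closed by computation.
  example : 1 + 1 = 2 := rfl

  -- The specification as a type: add agrees with built-in addition
  -- for every input, not just the cases we thought to write down.
  def add (a b : Nat) : Nat := a + b

  theorem add_spec (a b : Nat) : add a b = a + b := rfl

Whether that reads as clearly as the assert is, of course, exactly the question being argued here.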


> One major benefit that proof assistants have over other formal methods is that they can be used to prove essentially arbitrary properties of programs, which is not true of any automated methods.

It is true of model checkers.

> There are plenty of other examples, like the seL4 microkernel.

Yep, there are plenty of such examples, I'd say 2 or 4, and they're all less than 10KLOC, i.e. 1/5 of jQuery.

> I also do not understand what bearing computational complexity results have on humans manually formalizing their informal intuitions about program correctness in a proof assistant.

Because humans can't outperform Turing machines.

> Humans do not work on arbitrary problems where they have no intuition at all about why a program is correct, hence referring to worst-case computational complexity results does not make much sense here.

For one, the results aren't just worst-case. One of the most important ones is on parameterized complexity. Second, problems that are very hard in the worst case have rarely been proven easy in the "real world" case, unless the real world case was somehow categorized in a neat subset, and this isn't the case here. It's possible that we're usually far from the worst case, but if so, automated methods could also exploit that, and still be better than manual ones in most cases.

> In fact, if you have no intuition about why a program works, then how did you manage to write it in the first place? You surely did not write random garbage in the hopes that you would get a working program.

This is a common assumption, but a wrong one: https://pron.github.io/posts/people-dont-write-programs

> In this sense, proof assistants subsume all other formal methods, since they can be "scripted" to include them.

The same can be said about all other formal methods. Indeed, various sound static analysis tools like Frama-C can optionally fall back to manual proofs when needed. In the end, though, in practice deductive proofs don't scale as well as other methods. It's a good tool to have in your toolbox, but you don't want it as your default. Your model checker could just as easily ask for help from a proof assistant. So it's not the case that any one method subsumes all others, but you could combine them. Which is why dependent types are problematic: they make these combinations harder than, say, contract systems, which are flexible in the choice of verification method.

> You may be thinking of a programming style where the type of a program encodes all the properties you are interested in, but you can avoid this style where necessary, by separating your program and its correctness proof.

True, but that's not the style discussed here, of dependent Haskell or Idris.


> I don't think a big proof that has been accepted by the community has ever been found to be false by programmatic methods.

Right, because that's not how it works. Errors aren't discovered after the proof has been translated into machine language, because that translation is itself the verification: as soon as the code is written, the proof is verified. Necessarily, then, any error discovery has to happen before that.


> Explicit proofs (e.g., in math and CS) are more precise but longer, harder to check, and harder to debug.

Sorry, I've done a LOT in both (1) proofs in pure and applied math and (2) algorithms and software, and your statement is strongly wrong and backwards -- the math is much easier than the computing.

E.g., my Ph.D. dissertation in applied math had both proofs and the results of some software. There was no problem getting errors out of the proofs.

Proofs are really easy to check. I've done thousands of proofs, and I'm not sure I've ever had a mistake in one.

I've written perhaps a million lines of software, and over 50% of the programs had bugs until I found and fixed them.

One of the real advantages, powers, of math is how easy it is to check proofs.

Commonly it is much easier to have a correct, relatively easy-to-read proof of a math theorem than it is to have a correct, relatively easy-to-read proof of an algorithm in which it is easy to see that the proof and the algorithm are as claimed. Even for a simple, old algorithm such as Quicksort, it is not easy to prove it correct; even the notation is not easy. In computing, proofs of correctness commonly have to be in the framework of mathematical induction, and there the algorithm requires notation typically not seen otherwise.

Uh, mathematical induction is to show something is true for all the natural numbers, and the technique basically returns to one of the good definitions of the natural numbers.
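As a tiny illustration of that inductive style, here is a minimal Lean 4 sketch (made-up names, and certainly not the notation the parent has in mind): a toy recursive function and a property of it proved by induction on the natural-number argument.

  -- Builds a list of n copies of x; length is proved by induction on n.
  def replicate' {a : Type} : Nat -> a -> List a
    | 0, _ => []
    | n + 1, x => x :: replicate' n x

  theorem length_replicate' {a : Type} (n : Nat) (x : a) :
      (replicate' n x).length = n := by
    induction n with
    | zero => simp [replicate']
    | succ k ih => simp [replicate', ih]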

Beyond just algorithms, in computing proofs of correctness are so difficult that we essentially give up: The reason is, the assumptions are too difficult to list. If you will, a common problem is "too many edge cases".

It is true that there is something to learn about how to write proofs.

Pure math has a standard that in precision puts it far above essentially all other writing: some pictures or intuitive explanations may be tolerated, but they are not necessary and mostly not wanted. Also, there is no role for "try it and find out", for "doing some examples to see how it goes", or for the manual manipulation we do when learning to use socket wrenches or a chef's knife. An old joke was how complicated a math description would be for how to put on a suit coat.

E.g., a missed definition sticks out worse than a sore thumb. E.g., it should be clear, maybe even mentioned, in the proof, for each of the assumptions, where it got used.

So, the math writing has to give really solid results just from the words and math notation. Bluntly, that is unique in writing.

For accomplishing good, new things, I'm strongly on the side of math with proofs.

The core of my startup is an example. There are no heuristics there, just some math with proofs. Without the math, there would be only guessing and heuristics on how to manipulate the input data to give the desired outputs, and the chances of any heuristics having stumbled onto the same manipulations as provided by the math are zip, zilch, and zero.

E.g., some of the math is close to some math commonly considered, but my math fixes some essentially universal mistakes, really misunderstandings, that have become standard. That is, the common understanding is that there is a fundamental problem there, but that is not true, and my math, with proofs, shows that. Without the math and proofs, I would not have thought of the data manipulations or, if I did, have confidence in their correctness or power.

Sorry, guys: In my opinion, math is the main tool for the most important progress in applying computing to help civilization.


> And are you saying formal proof assistants like coq and isabelle are fools errands[?]

No, and I have no clue how you could get that from what I wrote. To move back into a more familiar domain, Rice's theorem puts a frustrating limit on what we can accomplish via static analysis of programs, and yet the static analysis we can do remains extremely helpful in programming.


> I cannot understand this "solution" of turning to machine-verified proofs. Coq, Isabelle, etc. invariably contain bugs. Even if they didn't, you can still run into hardware bugs. Computers are not a source of perfect computation and software is certainly not a source of perfect logic.

The issue is not that computers are always perfect. The issue is that humans are far, far worse than computers at detecting errors in proofs. So-called proofs are later found to be wrong, even after lots of human peer review, and there are often doubts about published papers. If pure math is about proof (and it is), then checking proofs with only humans is unacceptable in the long term. Humans simply aren't good enough at it, and why should we make humans do a job (verifying proofs) when computers are obviously better at it?

In addition, there are many mechanisms for countering computer errors that make their likelihood essentially zero. Hardware bugs happen, but executing software on multiple different computers using diverse CPUs basically makes that disappear.

Software can have errors, but there are ways to counter that too. The Metamath community verifies set.mm (its main database at http://us.metamath.org/mpeuni/mmset.html ) by running four different verifiers implemented by four different developers in four different programming languages (C, Rust, Python, and Java). There are at least 13 different Metamath verifiers, so a few more could be added if desired. The underlying Metamath language is extremely simple, too, so a verifier can be written in only a few hundred lines of code (reducing the risk of error in any one of the verifiers). There's evidence that N-version programming doesn't reduce errors as quickly as if errors were independent (see Knight and Leveson's paper), but even so, it's still quite difficult to slip by that many independent verifiers. In the long term, proof verifiers can be proven correct. It's hard to prove large programs correct, but you only need to verify the verifier; there are already examples such as ivy for prover9.

I can imagine that computers might someday be better at creating proofs. But that isn't necessary for computers to be useful. The main issue today is that proofs are typically not computer-verified, and computers can do that today. Many tools have managed to do well on Freek's "Formalizing 100 Theorems" challenge list at http://www.cs.ru.nl/%7Efreek/100/ - and for the most part that is without a lot of investment.

It's not easy to use existing tools, that's definitely true. But a big part of the problem is that there hasn't been a lot of work to use or grow them. If more work and money was invested in improving the systems for verifying proofs, and it was held in higher regard, a lot more would happen.

I believe future mathematicians will view proofs unchecked by computers the same way we view alchemy today. I suspect they'll say something like, "once in a while those alchemists and pre-mathematicians happened to be correct, but of course we don't accept their work without computer verification today." I think we should be working to make that future happen sooner.


> That's why the need for assertions to check that invariants hold.

No, you need proof.

> Until we all program in Coq or similar

So you're saying humans are fundamentally incapable of establishing the logical validity of what they assert by themselves? This contradicts historical evidence that people have done this for well over 2 millennia, using various methods and tools.

> A correct program is a spectrum, not a binary option.

Some errors might be easier to fix or have less disastrous consequences than others, but a correct program is one that has no errors, so I don't see where the spectrum is.


> I'd be interested in whether proofs like these will be formalized in proof assistants that can be checked with computer code, so that it removes doubt of error.

This does not remove doubt of error.


> I want proof before I run the program.

Except that for most interesting propositions you can't have proof. You need to show that the crude propositions that your type system can prove are what makes all the difference. I'm not so sure that's the case.


> One reason it matters to me is that if I write a program that computes something in a proof, I need to be able to understand and verify (or possibly check that other people I trust have verified) the source and algorithms

Do you actually do this verification? How do you accomplish this? The software stacks are huge. Why do you trust other people over the people who develop Mathematica, who just happen to be paid?


> I was shocked to learn that the binary search program that Bentley proved correct and subsequently tested in Chapter 5 of Programming Pearls contains a bug.

I haven't seen the mentioned proof, but if said proof is not formal and mechanized and/or does not consider all possibilities, including overflow, then should we really consider it to be a proof of correctness? It might prove some desirable properties, but I don't think we should leave it at that. I certainly don't think we should claim that "It is not sufficient merely to prove a program correct; you have to test it too."

When it comes to software programs, I believe proofs can and should be exhaustive. That does not necessarily mean you need to exhaustively test every input, but it does mean you need to prove correctness for every input, including inputs that might result in overflow or undefined behavior. Otherwise, we should not consider it a proof of correctness.
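For the record, the bug usually cited here is the classic midpoint overflow: computing mid as (low + high) / 2 overflows once low + high exceeds the integer range, even though the true midpoint is representable. A rough sketch of the problem and the standard fix, in Rust rather than the book's original code, with made-up names:

  // Overflow-prone midpoint: lo + hi can exceed i32::MAX even though
  // both operands and the true midpoint are representable.
  fn mid_buggy(lo: i32, hi: i32) -> i32 {
      (lo + hi) / 2 // panics in debug builds, wraps in release
  }

  // The standard fix: compute the offset first, so nothing overflows
  // (assuming lo <= hi).
  fn mid_fixed(lo: i32, hi: i32) -> i32 {
      lo + (hi - lo) / 2
  }

  fn main() {
      println!("{}", mid_buggy(2, 10));                  // 6: fine for small values
      println!("{}", mid_fixed(i32::MAX - 1, i32::MAX)); // fine: i32::MAX - 1
      // mid_buggy(i32::MAX - 1, i32::MAX) would overflow
  }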


"And at the very bottom there is a base piece of theorem-proving software that is only a few hundred lines of code that has been validated only by humans and self tests."

Is it possible for that part of the system to prove its own correctness?

It's obvious that the danger with such a proof is that the code may think it's correct while being incorrect in thinking so.

But if I have learned one thing, it's that mathematicians can sometimes be really sly foxes.


> I gather you understand the topic, but I'm still left wondering: who is this for, and what will they do with it?

'Understand' is a big word. I've played around with Idris (which is a Haskell-like programming language that supports Homotopy type theory as well). I have a grasp of the underlying mathematics but don't really understand the stuff deeply.

Yes, I went on a bit of a rant and totally ignored your (totally reasonable) question. Theorem proving and proof checking are quite niche, both within computer science and mathematics. In computer science, the idea is that you can formally prove the correctness of computer programs to avoid software failures and have very robust programs. In mathematics, the ideal is to have

  1. A universal language for proofs
  2. A database of computer-verified results
  3. Easy verification of new results
It's a mighty interesting topic, and unfortunately, I've also found it quite difficult to get more than a shallow understanding of it.