6 comments

  • enum 7 minutes ago
    The post says this in other words: in Lean, Rocq, or any other theorem prover, you get a formally-verified proof, but you do NOT get a formally verified theorem statement.

    So, even if the proof is correct, you need to determine if the theorem is what you want. Making that determination requires expertise. Since you cannot "run the theorem", you cannot vibe-code your way through it. E.g., there is no equivalent of "web app seems to be working!" You have to actually understand what the theorems are saying in a deep way.
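
    A minimal Lean sketch of the gap (the names here are invented for illustration): the proof below checks, but the statement quietly encodes a bug that no proof checker can catch.

      -- A deliberately buggy "insert" that silently drops its argument.
      def insertNat (x : Nat) (xs : List Nat) : List Nat := xs

      -- Lean verifies this proof, and the theorem is true as stated.
      -- But the statement itself is the wrong one: a correct insert
      -- would satisfy (insertNat x xs).length = xs.length + 1.
      theorem insertNat_preserves_length (x : Nat) (xs : List Nat) :
          (insertNat x xs).length = xs.length := rfl

    The checker guarantees the proof matches the statement; only a human reading the statement can tell that it is the wrong statement.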

  • Paracompact 33 minutes ago
    > Third, you need to decide how far “down the stack” you want to go. That is to say, the software you want to verify operates over some kind of more complex system, for instance, maybe it’s C code which gets compiled down to X86 and runs on a particular chip, or maybe it’s a controller for a nuclear reactor and part of the system is the actual physical dynamics of the reactor. Do you really want your proof to involve specifying the semantics of the C compiler and the chip, or the way that the temperature and other variables fluctuate in the reactor?

    I can appreciate what he's getting at, but my utopian vision for the future is that we won't need to reinvent the wheel like this each time we want verified software! E.g. for high-consequence systems, the hard part of compiler correctness is already handled by the efforts of CompCert, and SystemVerilog assertions for the design guarantees of processors are becoming more commonplace.

  • crvdgc 36 minutes ago
    Some valid points, but I wish the authors had developed them more.

    On the semantic gap between the original software and its representation in the ITP, program extraction as in Rocq probably deserves some discussion: the software is written natively in the ITP, and you have to prove the extraction itself sound. For example, MetaRocq did this for Rocq.

    For the how far down the stack problem, there are some efforts from https://deepspec.org/, but it's inherently a difficult problem and often gets less love than the lab environment projects.

  • Rochus 2 days ago
    Interesting article, thanks. There is indeed a "semantic gap". However, there is also a practical solution: bidirectional LLM translation. You can verify the formal specification by back-translating it to natural language with another LLM session, allowing human review at the intent level rather than requiring expertise in e.g. Event-B syntax (see https://rochuskeller.substack.com/p/why-rust-solves-a-proble...). This addresses the concern about "mis-defining concepts" without requiring the human to be a formal methods expert. The human can review intent and invariants in natural language, not proof obligations. The AI handles the mathematical tedium while the human focuses on domain correctness, which is exactly where human expertise belongs.
    • ratmice 1 hour ago
      why do we invent these formal languages except to be more semantically precise than natural language? What does one gain besides familiarity by translation back into a more ambiguous language?

      Mis-defining concepts can be extremely subtle. If you look at the allsome quantifier https://dwheeler.com/essays/allsome.html you'll see that these problems predate AI, and I struggle to see how natural language is going to help in cases like the "All Martians" case, where the confusion may be over whether Martians exist at all; something relatively implicit.
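
      To make the vacuity concrete, here is a sketch in Lean (the types are invented for illustration): if Martian is uninhabited, the universal statement is provable no matter what green means, and Wheeler's allsome quantifier is exactly the conjunction that blocks this.

        axiom Martian : Type
        axiom green : Martian → Prop

        -- If there are no Martians, "all Martians are green" holds vacuously.
        theorem all_martians_green (noMartians : Martian → False) :
            ∀ m : Martian, green m :=
          fun m => (noMartians m).elim

        -- Wheeler's "allsome" quantifier adds the existential explicitly,
        -- so an empty domain no longer satisfies it.
        def AllSome (p : Martian → Prop) : Prop :=
          (∀ m, p m) ∧ (∃ m, p m)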

      • Rochus 9 minutes ago
        We build pretty complex systems based only on "natural language" specifications. I think you are conflating specification ambiguity with verification accessibility.

        > What does one gain besides familiarity by translation back into a more ambiguous language?

        You gain intent verification. Formal languages are precise about implementation, but they are often opaque about intent. A formal specification can be "precisely wrong". E.g. you can write a perfectly precise Event-B spec that says "When the pedestrian button is pressed, the traffic light turns Green for cars"; the formalism is unambiguous, the logic is sound, the proof holds, but the intent is fatally flawed. Translating this back to natural language ("The system ensures that pressing the button turns the car light green") allows a human to instantly spot the error.
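
        For instance, sketching the same mistake in Lean rather than Event-B (all names invented): the proof goes through, and only reading the statement reveals the danger.

          inductive Light where | red | green
          inductive Event where | buttonPressed

          -- Formally unambiguous, internally consistent, fatally flawed:
          -- pressing the pedestrian button turns the *car* light green.
          def step : Light → Event → Light
            | _, .buttonPressed => .green

          theorem button_gives_green (l : Light) :
              step l .buttonPressed = .green := rfl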

        > All Martians are green

        Modern LLMs are actually excellent at explicating these edge cases during back-translation if prompted correctly. If the formal spec allows vacuous truth, the back-translation agent can be instructed to explicitly flag existential assumptions. E.g. "For every Martian (assuming at least one exists), the color is Green", or "If there are no Martians, this rule is automatically satisfied". You are not translating back to casual speech; you are translating back to structured, explicit natural language that highlights exactly these kinds of edge cases.

      • lindenr 33 minutes ago
        I agree: if AI (or humans) have mistranslated a natural language statement into a formal statement, we should not rely on AI to correctly translate the formal statement back into natural language.

        For many statements I expect it's not possible to retain the exact meaning of the formal-language sentence without the natural language becoming at least as complex, and if you don't retain meaning exactly then you're vulnerable to the kind of thing the article warns about.

      • smarx007 43 minutes ago
        > why do we invent these formal languages except to be more semantically precise than natural language

        To be... more precise?

        On a more serious note, I cannot recommend enough "Exactly: How Precision Engineers Created the Modern World" by Winchester. While the book talks mostly about precision in mechanical engineering, it made me appreciate _precision_ itself to a greater degree.

        • ratmice 37 minutes ago
          A rhetorical question? My point is that back-translation into natural language is translating into a less precise form. How is that going to help? No number of additional abstraction layers is going to solve human confusion.
          • smarx007 24 minutes ago
            Oh well, that flew over my head. You are right.
  • paulajohnson 20 minutes ago
    A formal specification language is a programming language that we don't know how to compile.

    If we can use AI to automatically implement a formal spec, then that formal specification language has just become a programming language.

  • psuedobrain 52 minutes ago
    Slight nitpick, but isn't Agda based on the MLTT family rather than the CoC family of languages?