mst/7/index.html

<!DOCTYPE html>
<!-- Akai (pandoc HTML5 template)
     designer:     soimort
     last updated: 2016-05-06
     last adapted: 2016-11-01 -->
<html>
  <head>
    <meta charset="utf-8">
    <meta name="generator" content="pandoc">
    <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
    <meta name="author" content="Mort Yao">
    <meta name="dcterms.date" content="2017-09-06">
    <title>The Logical Implication</title>
    <link rel="canonical" href="https://www.soimort.org/mst/7">
    <style type="text/css">code { white-space: pre; }</style>
    <link rel="stylesheet" href="//cdn.soimort.org/normalize/5.0.0/normalize.min.css">
    <link rel="stylesheet" href="//cdn.soimort.org/fonts/latest/URW-Palladio-L.css">
    <link rel="stylesheet" href="//cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css">
    <link rel="stylesheet" href="/__/css/style.css">
    <link rel="stylesheet" href="/__/css/pygments.css">
    <script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.0/MathJax.js?config=TeX-AMS_CHTML-full" type="text/javascript"></script>
    <!--[if lt IE 9]>
      <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
    <![endif]-->
    <script src="//cdn.soimort.org/jk/latest/jk.min.js"></script>
    <script src="/__/js/main.js"></script>
    <link rel="icon" href="/favicon.png">
    <link rel="apple-touch-icon" href="/favicon.png">
    <link rel="alternate" type="application/atom+xml" href="/feed.atom">
  </head>
  <body>
    <article>
      <header>
        <h1 class="title"><a href="https://www.soimort.org/mst/7">The Logical Implication</a></h1>
        <h1 class="subtitle"><a href="https://www.soimort.org/mst/7">Languages and models of a first-order logic.</a></h1>
        <address class="author">Mort Yao</address>
        <h3 class="date">2017-09-06</h3>
      </header>
      <div id="content">
<p><strong>Prologue.</strong> This is the summary of a set of notes I took in KU’s <a href="http://kurser.ku.dk/course/nmaa13036u/2016-2017">Introduction to Mathematical Logic</a> course, that I should have finished months ago but somehow procrastinated until recently. I basically followed Enderton’s <em>A Mathematical Introduction to Logic</em> book (with all its conventions of mathematical notations). A few issues to address in advance:</p>
<ul>
<li>These notes are mainly about first-order logic. <em>Propositional logic</em> is assumed to be a prerequisite. The important thing to know is that its semantics (as a Boolean algebra) can be fully described by a truth table, which is obviously finite and mechanizable.</li>
<li><em>Naïve set theory</em> is also a prerequisite, that is, one must accept the notion of <em>sets</em> unconditionally: A set is just a collection of objects, and it must exist as we describe it. We are convinced that no paradoxical use of sets will ever emerge, and that no “naïve set” could be a proper class. The formal notion of sets is built upon first-order logic, but without some informal reference to sets we can’t even say what a first-order logic is.
<ul>
<li>A set <span class="math inline">\(S\)</span> is said to be “countable” iff there exists a bijection <span class="math inline">\(f : \mathbb{N} \to S\)</span>; otherwise it is “uncountable”.</li>
</ul></li>
<li>One must be willing to accept the validity of mathematical induction on natural numbers (or the well-ordering principle for natural numbers). Again, to formalize the induction principle we need the set theory, but we don’t have it in the first place (until we introduce a first-order logic).</li>
<li>On the <em>model-theoretic</em> part: Notes on <a href="https://wiki.soimort.org/math/logic/fol/definability/">definability and homomorphisms</a> are omitted from this summary, not to mention ultraproducts and the Löwenheim-Skolem theorem. Model theory is a big topic in its own right and deserves an individual treatment, better with some algebraic and topological contexts. (Note that the homomorphism theorem is used in proving the completeness theorem; also the notion of definability is mentioned in the supplementary sections.)</li>
<li>On the <em>proof-theoretic</em> part: Notes on <a href="https://wiki.soimort.org/math/logic/fol/metatheorems/">some metatheorems</a> are also omitted, as they are a purely technical aspect of a Hilbert-style deductive system. (One may find it convenient to prove an actual theorem with more metatheorems, but they are really not adding any extra power to our system.)</li>
<li>The relation between logic and <em>computability</em> (i.e., <em>Gödel’s incompleteness theorems</em>) is not discussed.
<ul>
<li>But the meanings of “decidable” and “undecidable” are clear from the previous notes <a href="/mst/6/">Mst. #6</a> (from a computer scientist’s perspective).</li>
</ul></li>
<li><em>Axiomatic set theory</em>, which is another big part of the course, is not included in these notes. (Maybe I’m still too unintelligent to grasp the topic.) But it is good to know:
<ul>
<li>First-order logic has its limitation in definability (i.e., it’s not capable of ruling out non-standard models of arithmetic), until we assign to it a set-theoretic context. So set theory is often considered a foundation of all mathematics (for its expressive power).</li>
<li>Axiom of Choice (AC) causes some counter-intuitive consequences, but it was shown to be consistent with ZF axioms (Gödel 1938). And there are models of ZF<span class="math inline">\(\cup\lnot\)</span>AC so well as ZF<span class="math inline">\(\cup\)</span>AC.</li>
<li>Constructivists tend to avoid AC in mathematics. However, <a href="https://wiki.soimort.org/math/logic/fol/completeness/">Henkin’s proof of the completeness theorem</a> in first-order logic assumes AC (Step II in finding a maximal consistent set). (thus it is a non-constructive proof!<a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a>)</li>
</ul></li>
<li><em>Intuitionistic logic</em> is not in the scope of these course notes. (And most logic books, including the Enderton one, are not written by constructive mathematicians.) Basically, in a Hilbert-style system, a classical logic would admit all tautologies in propositional logic as Axiom Group 1. Intuitionistic logic, in contrast, rejects those tautologies that are non-constructive in a first-logic setting.</li>
</ul>
<hr />
<ul>
<li><a href="https://wiki.soimort.org/math/logic/fol/languages/"><strong>First-order language</strong></a>: A <a href="https://wiki.soimort.org/comp/language/">formal language</a> consisting of the following symbols:
<ol type="1">
<li>Logical symbols
<ul>
<li>Parentheses: <span class="math inline">\((\)</span>, <span class="math inline">\()\)</span>.</li>
<li>Connective symbols: <span class="math inline">\(\to\)</span>, <span class="math inline">\(\lnot\)</span>.</li>
<li>Variables: <span class="math inline">\(v_1\)</span>, <span class="math inline">\(v_2\)</span>, …</li>
<li>Equality symbol (optional 2-place predicate symbols): <span class="math inline">\(=\)</span>.</li>
</ul></li>
<li>Parameters (non-logical symbols; open to interpretation)
<ul>
<li>Universal quantifier symbol: <span class="math inline">\(\forall\)</span>.</li>
<li>Predicate symbols (relation symbols): <span class="math inline">\(P_1\)</span>, <span class="math inline">\(P_2\)</span>, …</li>
<li>Constant symbols (0-place function symbols): <span class="math inline">\(c_1\)</span>, <span class="math inline">\(c_2\)</span>, …</li>
<li>Function symbols: <span class="math inline">\(f_1\)</span>, <span class="math inline">\(f_2\)</span>, …</li>
</ul></li>
</ol>
<ul>
<li>When specifying a concrete first-order language <span class="math inline">\(\mathcal{L}\)</span>, we must say (i) whether the quality symbol is present; (ii) what the parameters are.</li>
</ul></li>
</ul>
<p><strong>Remark 7.1. (Language of propositional logic)</strong> The language of propositional logic may be seen as a stripped form of first-order languages, in which parentheses, connective symbols and sentential symbols (the only parameters; may be treated as 0-place predicate symbols) are present. Intuitively, that language might seem too weak to encode our formal reasoning in all kinds of mathematics and many practical areas, so to speak.</p>
<ul>
<li><strong>Terms</strong> and <strong>formulas</strong>
<ul>
<li>An <em>expression</em> is a finite sequence of symbols (i.e., a finite string). Among all expressions, we are interested in two kinds of them which we refer to as terms and formulas.</li>
<li>A <em>term</em> is either:
<ul>
<li>a single variable or constant symbol; or</li>
<li><span class="math inline">\(f t_1 \cdots t_n\)</span>, where <span class="math inline">\(f\)</span> is a <span class="math inline">\(n\)</span>-place function symbol, and every <span class="math inline">\(t_i\)</span> <span class="math inline">\((1 \leq i \leq n)\)</span> is also a term.</li>
</ul></li>
<li>A <em>formula</em> (or <em>wff</em>, well-formed formula) is either:
<ul>
<li><span class="math inline">\(P t_1 \cdots t_n\)</span>, where <span class="math inline">\(P\)</span> is a <span class="math inline">\(n\)</span>-place predicate symbol (or the equality symbol <span class="math inline">\(=\)</span>), and every <span class="math inline">\(t_i\)</span> <span class="math inline">\((1 \leq i \leq n)\)</span> is a term; or</li>
<li>one of the following forms:
<ul>
<li><span class="math inline">\((\lnot \psi)\)</span>, where <span class="math inline">\(\psi\)</span> is also a formula;</li>
<li><span class="math inline">\((\psi \to \theta)\)</span>, where <span class="math inline">\(\psi\)</span> and <span class="math inline">\(\theta\)</span> are also formulas;</li>
<li><span class="math inline">\(\forall v_i \psi\)</span>, where <span class="math inline">\(v_i\)</span> is a variable and <span class="math inline">\(\psi\)</span> is also a formula.</li>
</ul></li>
</ul></li>
<li>A variable may occur <em>free</em> in a formula. A formula without any free variable is called a <em>sentence</em>.</li>
</ul></li>
</ul>
<p><strong>Remark 7.2. (Metatheory and philosophical concerns)</strong> A first-order expression, as a finite sequence (also called a <em>tuple</em>), may be defined in terms of ordered pairs in axiomatic set theory. But we will not appeal to set theory in our first definitions of expressions in logic. (So far we have no notion about what a “set” formally is!)</p>
<p>A further concern is whether our definitions of terms and formulas are well established, that is, since we are defining the notions of terms and formulas <em>inductively</em>, would it be possible that there is a certain term or formula that is covered by our recursive definition, but can never be actually built using these operations? To show that first-order terms/formulas are well-defined, a beginner might try to prove these induction principles by mathematical induction on the complexities of terms/formulas, but that would rely on the fact that the set of natural numbers <span class="math inline">\(\omega\)</span> is well-ordered so that we can apply induction on numbers; to justify things like this, it is essential to use set theory or second-order logic, which we don’t even have until we define a first-order logic. Thus, unavoidable circularity emerges if we try to look really fundamentally.</p>
<p>For now, we must appeal to a metatheory that we can easily convince ourselves by intuition, so that we will accept these induction principles and the notion of “naïve sets” (or <em>collections</em>, if we don’t want to abuse the formal term of sets too much). Notwithstanding, I <em>believe</em> that a prudent person can bootstrap theories like this without drawing in any inconsistency.</p>
<p><strong>Remark 7.3. (Context freedom, unique readability and parentheses)</strong> Since the formations of first-order terms and formulas make use of context-free rules, one familiar with <a href="/mst/6/">formal languages and automata theory</a> might ask, “Are the set of terms/formulas <a href="https://wiki.soimort.org/comp/language/context-free/">context-free languages</a>?” Generally they are not, since our set <span class="math inline">\(V\)</span> of variables (samely for predicate and function symbols) could be infinitely (or even uncountably) large, but a context-free grammar requires that every set must be finite. However, in our first-order language <span class="math inline">\(\mathcal{L}\)</span>, if these symbols can be effectively decidable, then there is an algorithm that accepts terms or formulas (or parses them). Furthermore, such parses are guaranteed to be unique, as shown by the Unique Readability Theorems in Enderton p. 105ff. Indeed, the inclusion of parentheses in our first-order language enables us to write any formula unambiguously. If we leave out all the parentheses, does a formula like <span class="math inline">\(\forall x P x \to \lnot Q x\)</span> mean <span class="math inline">\((\forall x P x \to (\lnot Q x))\)</span> or <span class="math inline">\(\forall x (P x \to (\lnot Q x))\)</span>? An alternative syntax would be to use logical connectives in a prefix manner, e.g., <span class="math inline">\(\to \forall x P x \lnot Q x\)</span> and <span class="math inline">\(\forall x \to P x \lnot Q x\)</span>, but that is hardly as comprehensible as our chosen syntax.</p>
<p><strong>Remark 7.4. (Abbreviations on notation)</strong> Why don’t we have the existential quantifier <span class="math inline">\(\exists\)</span> and some other connectives such like <span class="math inline">\(\land\)</span>, <span class="math inline">\(\lor\)</span> and <span class="math inline">\(\leftrightarrow\)</span>, in our language? Because any first-order formula that makes use of these symbols can be seen as syntactical abbreviations and should be rewritten using <span class="math inline">\(\forall\)</span>, <span class="math inline">\(\to\)</span> and <span class="math inline">\(\lnot\)</span>, as will be shown. A deeper reason is that <span class="math inline">\(\{ \to, \lnot \}\)</span> is a functionally complete set of Boolean algebraic operators that is sufficient to express all possible truth tables in propositional logic. On the other hand, a formula like <span class="math inline">\(\exists x \varphi\)</span> is just <span class="math inline">\((\lnot \forall x (\lnot \varphi))\)</span>, following from our understanding of what an existential quantification is.</p>
<p><strong>Remark 7.5. (Sentences and truth values)</strong> In propositional logic, we don’t <em>generally</em> know whether a formula evaluates to true until every sentential symbol is assigned a truth value. (Sometimes we can tell the truth value with a little less information than what is required, if we apply a so-called short-circuit evaluation strategy, e.g., if <span class="math inline">\(A_1\)</span> is false then we immediately know <span class="math inline">\((A_1 \to A_2)\)</span> is true, or if <span class="math inline">\(A_2\)</span> is true then <span class="math inline">\((A_1 \to A_2)\)</span> is also true. But it is not the general case, and one should expect to evaluate both <span class="math inline">\(A_1\)</span> and <span class="math inline">\(A_2\)</span> before getting the answer.) Similarly, in a first-order logic, every free variable needs to have a definite assignment so as to give rise to the truth value of a formula. This is done by specifying a function <span class="math inline">\(s\)</span> (where <span class="math inline">\(\operatorname{dom} s = V\)</span> is the set of all variables) as the assignment of variables, and when applying <span class="math inline">\(s\)</span> to a formula <span class="math inline">\(\varphi\)</span> we get <span class="math inline">\(\varphi[s]\)</span>, which is a sentence that has a definite meaning (i.e., no variable occurs free). Note that the assignment of variables alone is not sufficient to determine the truth of a sentence – For example, <span class="math inline">\((P x y \to P x f y)\ [s(x \,|\, 0)(y \,|\, 0)]\)</span> is a sentence since no variable occurs free in it, but we can’t decide whether it is true because we don’t know what the predicate <span class="math inline">\(P\)</span> and the function <span class="math inline">\(f\)</span> are. If we say, <span class="math inline">\(P\)</span> is the arithmetic “less-than” relation and <span class="math inline">\(f\)</span> is the successor function <span class="math inline">\(f : x \mapsto x + 1\)</span>, then we can tell that this is a true sentence (in fact <span class="math inline">\(P x y\)</span> is false as <span class="math inline">\(0 &lt; 0\)</span> is false, but <span class="math inline">\(P x f y\)</span> is true as <span class="math inline">\(0 &lt; 1\)</span> is true, so the overall sentence as a conditional is true). We could write <span class="math inline">\(P\)</span> and <span class="math inline">\(f\)</span> as <span class="math inline">\(&lt;\)</span> and <span class="math inline">\(S\)</span>, but the conventional interpretation of these symbols should not be taken for granted as if every symbol comes with an inherited meaning – They don’t, until we give them meanings.</p>
<ul>
<li><a href="https://wiki.soimort.org/math/logic/fol/structures/"><strong>Structures</strong></a>: A <em>structure</em> (or an <em>interpretation</em>) <span class="math inline">\(\mathfrak{A}\)</span> assigns a domain <span class="math inline">\(|\mathfrak{A}|\)</span> to the language <span class="math inline">\(\mathcal{L}\)</span>, and:
<ul>
<li>Every predicate symbol is assigned a relation <span class="math inline">\(P^\mathfrak{A} \subseteq |\mathfrak{A}|^n\)</span>.</li>
<li>Every function symbol is assigned a function <span class="math inline">\(f^\mathfrak{A} : |\mathfrak{A}|^n \to |\mathfrak{A}|\)</span>.</li>
<li>Every constant symbol is assigned a member <span class="math inline">\(c^\mathfrak{A}\)</span> of the domain <span class="math inline">\(\mathfrak{A}\)</span>.</li>
<li>The universal quantifier symbol <span class="math inline">\(\forall\)</span> is assigned the domain <span class="math inline">\(|\mathfrak{A}|\)</span>. (So it makes sense to say: “for all <span class="math inline">\(x\)</span> in <span class="math inline">\(|\mathfrak{A}|\)</span>…”)</li>
</ul></li>
<li><strong>Satisfaction</strong> and <strong>truth</strong>
<ul>
<li>Given a structure <span class="math inline">\(\mathfrak{A}\)</span> and an assignment of variables <span class="math inline">\(s : V \to |\mathfrak{A}|\)</span>, we define an extension function <span class="math inline">\(\bar{s} : T \to |\mathfrak{A}|\)</span> (where <span class="math inline">\(T\)</span> is the set of all terms) that maps any term into the domain <span class="math inline">\(|\mathfrak{A}|\)</span>.</li>
<li>With the term valuation <span class="math inline">\(\bar{s}\)</span>, we define recursively that a structure <span class="math inline">\(\mathfrak{A}\)</span> <em>satisfies</em> a formula <span class="math inline">\(\varphi\)</span> with an assignment <span class="math inline">\(s\)</span> of variables, written as <span class="math display">\[\models_\mathfrak{A} \varphi[s]\]</span> If this is not the case, then <span class="math inline">\(\not\models_\mathfrak{A} \varphi[s]\)</span> and we say that <span class="math inline">\(\mathfrak{A}\)</span> does not satisfy <span class="math inline">\(\varphi\)</span> with <span class="math inline">\(s\)</span>.</li>
<li>For a sentence <span class="math inline">\(\sigma\)</span> (which is just a formula with no free variables), the assignment of variables <span class="math inline">\(s : V \to |\mathfrak{A}|\)</span> does not make a difference whether <span class="math inline">\(\varphi\)</span> is satisfied by <span class="math inline">\(\mathfrak{A}\)</span>. So if <span class="math inline">\(\models_\mathfrak{A} \sigma\)</span>, we say that <span class="math inline">\(\sigma\)</span> is <em>true</em> in <span class="math inline">\(\mathfrak{A}\)</span> or that <span class="math inline">\(\mathfrak{A}\)</span> is a <em>model</em> of <span class="math inline">\(\sigma\)</span>.</li>
<li><strong>Satisfiability</strong> of formulas: A set <span class="math inline">\(\Gamma\)</span> of formulas is said to be <em>satisfiable</em> iff there is a structure <span class="math inline">\(\mathfrak{A}\)</span> and an assignment <span class="math inline">\(s\)</span> of variables such that <span class="math inline">\(\models_\mathfrak{A} \Gamma[s]\)</span>.</li>
</ul></li>
<li><strong>Logical implication</strong> and <strong>validity</strong>
<ul>
<li>In a language <span class="math inline">\(\mathcal{L}\)</span>, we say that a set <span class="math inline">\(\Gamma\)</span> of formulas <em>logically implies</em> a formula <span class="math inline">\(\varphi\)</span>, iff for every structure <span class="math inline">\(\mathfrak{A}\)</span> of <span class="math inline">\(\mathcal{L}\)</span> and every assignment <span class="math inline">\(s : V \to |\mathfrak{A}|\)</span> such that <span class="math inline">\(\models_\mathfrak{A} \gamma [s]\)</span> (for all <span class="math inline">\(\gamma \in \Gamma\)</span>), it also holds that <span class="math inline">\(\models_\mathfrak{A} \varphi [s]\)</span> (note that <span class="math inline">\(\varphi\)</span> is not required to be a sentence): <span class="math display">\[\Gamma \models \varphi\]</span> This is the analogue of tautological implication in propositional logic: <span class="math inline">\(A \Rightarrow B\)</span>, iff every truth assignment that satisfies <span class="math inline">\(A\)</span> also satisfies <span class="math inline">\(B\)</span>.</li>
<li>If the empty set logically implies a formula, i.e., <span class="math inline">\(\emptyset \models \varphi\)</span>, we write this fact simply as <span class="math inline">\(\models \varphi\)</span> and say that <span class="math inline">\(\varphi\)</span> is <em>valid</em>. A formula is valid iff given any assignment of variables, it is true in every structure; this is the analogue of tautologies in propositional logic: something that is considered “always true”.</li>
</ul></li>
</ul>
<p><strong>Remark 7.6. (Dichotomy of semantic truthness and the liar’s paradox)</strong> It should be made clear from the definition that given a structure and an assignment, either <span class="math inline">\(\models_\mathfrak{A} \varphi[s]\)</span> (exclusive) or <span class="math inline">\(\not\models_\mathfrak{A} \varphi[s]\)</span>, but not both! It follows from our intuition that a statement is either semantically true or false; and there is no third possibility.</p>
<p>A problem arises with self-referential terms, woefully: Assume that we have a first-order language <span class="math inline">\(\mathcal{L}\)</span> with a 1-place predicate symbol <span class="math inline">\(P\)</span>, and the structure <span class="math inline">\(\mathfrak{A}\)</span> assigns it the domain <span class="math inline">\(|\mathfrak{A}| = \text{Formula}(\mathcal{L})\)</span>, <span class="math inline">\(P\)</span> is interpreted as <span class="math inline">\(P^\mathfrak{A} = \{ \langle \sigma \rangle \,:\, \models \sigma \}\)</span>, that is, <span class="math inline">\(\sigma \in P^\mathfrak{A}\)</span> iff <span class="math inline">\(\models \sigma\)</span>. Let the sentence <span class="math inline">\(\tau\)</span> be <span class="math inline">\((\lnot P x)\)</span> and the assignment <span class="math inline">\(s : V \to |\mathfrak{A}|\)</span> maps the variable <span class="math inline">\(x\)</span> to the sentence <span class="math inline">\(\tau\)</span>, then is <span class="math inline">\(\tau[s]\)</span> true or false in <span class="math inline">\(\mathfrak{A}\)</span>? If we take <span class="math inline">\(\tau[s]\)</span> as true, that is, <span class="math inline">\((\lnot P \tau)\)</span> is true, then <span class="math inline">\(P \tau\)</span> must be false, so <span class="math inline">\(\tau \not\in P^\mathfrak{A}\)</span> thus <span class="math inline">\(\not\models \tau\)</span>. If we take <span class="math inline">\(\tau[s]\)</span> as false, that is, <span class="math inline">\((\lnot P \tau)\)</span> is false, then <span class="math inline">\(P \tau\)</span> must be true, so <span class="math inline">\(\tau \in P^\mathfrak{A}\)</span> thus <span class="math inline">\(\models \tau\)</span>. This is known as the classical <em>liar’s paradox</em>. One possible way to resolve this (given by Alfred Tarski) is by disabling impredicativity in our structures; more precisely, one can define a semantic hierarchy of structures that allows us to predicate truth only of a formula at a lower level, but never at the same or a higher level. This matter is far beyond the scope of this summary, but the important lesson to learn here is that it is generally a bad idea to allow something both true <em>and</em> false in our semantics; it would put our enduring effort to cumulate all “mathematical truths” into void.</p>
<p><strong>Remark 7.7. (Decidability of truth/validity)</strong> In propositional logic, it is easy to see that given a truth assignment of sentential symbols, every formula can be decided for its truth or falsehood. Moreover, even without any truth assignment, one can enumerate a truth table to find out whether a given formula is a tautology. Truth and validity are decidable in propositional logic. However, this is often not the case in first-order logic: In order to decide whether a sentence is true, one needs to find the truth values of all prime formulas (i.e., formulas like <span class="math inline">\(P t_1 \cdots t_n\)</span> and <span class="math inline">\(\forall v_i \psi\)</span>) first, but the domain <span class="math inline">\(|\mathfrak{A}|\)</span> may be an (uncountably) infinite set, thus makes it impossible to mechanically check the universal quantification for all members; moreover, the functions used in building terms may not be Turing-computable at all. To decide the validity of a sentence, we have to check its truth in all structures of the language (whose set may also be uncountably large), and that is an even more impossible task.<a href="#fn2" class="footnoteRef" id="fnref2"><sup>2</sup></a></p>
<p>If semantic truth/validity is generally undecidable, how do we say that some formula is true in a predefined structure? Well, we can’t, in most general cases, since an infinite argument of truth is a useless argument (you can’t present it to someone / some Turing machine, as no physical device is capable of handling such an infinite object). Fortunately, there is a feasible way to say something is true, without appealing to any specific structures (that may give rise to unwanted undecidability), and that is called a formal deduction (also called a proof, expectedly).</p>
<ul>
<li><a href="https://wiki.soimort.org/math/logic/fol/deductions/"><strong>Formal deduction</strong></a>: Given a set <span class="math inline">\(\Lambda\)</span> of formulas (<em>axioms</em>), a set of <em>rules of inference</em>, we say that a set <span class="math inline">\(\Gamma\)</span> of formulas (<em>hypotheses</em>) <em>proves</em> another formula <span class="math inline">\(\varphi\)</span>, or <span class="math inline">\(\varphi\)</span> is a <em>theorem</em> of <span class="math inline">\(\Gamma\)</span>, iff there is a finite sequence (called a <em>deduction</em> of <span class="math inline">\(\varphi\)</span> from <span class="math inline">\(\Gamma\)</span>) <span class="math inline">\(\langle \alpha_0, \dots, \alpha_n \rangle\)</span> such that
<ol>
<li><span class="math inline">\(\alpha_n\)</span> is just <span class="math inline">\(\varphi\)</span>.</li>
<li>For each <span class="math inline">\(0 \leq k \leq n\)</span>, either
<ul>
<li><span class="math inline">\(\alpha_k \in \Gamma \cup \Lambda\)</span>; or</li>
<li><span class="math inline">\(\alpha_k\)</span> is obtained by a rule of inference from a subset of previous formulas <span class="math inline">\(A \subseteq \bigcup_{0 \leq i &lt; k} \alpha_i\)</span>. <span class="math display">\[\Gamma \vdash \varphi\]</span></li>
</ul></li>
</ol>
<ul>
<li><strong>Formal systems</strong> and <strong>proof calculi</strong>: Different deductive systems made different choices on the set of axioms and rules of inferences. A <em>natural deduction</em> system may consist of no axiom but many rules of inference; on the contrary, a Hilbert-style system (named obviously after David Hilbert) uses many axioms but only two rules of inference. A <em>proof calculus</em> is the approach to formal deduction in a specified system, and as it is called a “calculus”, any derivation in it must contain only a finite number of steps so as to be calculable (by a person or by a machine).</li>
<li>We will use a Hilbert-style deductive system here:
<ul>
<li><strong>Rules of inference</strong>
<ol>
<li><em>Modus ponens</em> <span class="math display">\[\frac{\Gamma \vdash \psi \quad \Gamma \vdash (\psi \to \varphi)}{\Gamma \vdash \varphi}\]</span></li>
<li><em>Generalization</em> <span class="math display">\[\frac{\vdash \theta}{\vdash \forall x_1 \cdots \forall x_n \theta}\]</span> (where <span class="math inline">\(\theta \in \Lambda\)</span>.)</li>
</ol></li>
<li><strong>Logical axioms</strong>: In a deductive system, axioms are better called <em>logical axioms</em>, to stress the fact that they are logically valid formulas in every structure, i.e., that their validity is not open to interpretation.
<ol>
<li>(Tautology) <span class="math inline">\(\alpha\)</span>, where <span class="math inline">\(\models_t \alpha\)</span>. (take sentential symbols to be prime formulas in first-order logic)</li>
<li>(Substitution) <span class="math inline">\(\forall x \alpha \to \alpha^x_t\)</span>, where <span class="math inline">\(t\)</span> is substitutable for <span class="math inline">\(x\)</span> in <span class="math inline">\(\alpha\)</span>.</li>
<li><span class="math inline">\(\forall x (\alpha \to \beta) \to (\forall x \alpha \to \forall x \beta)\)</span>.</li>
<li><span class="math inline">\(\alpha \to \forall x \alpha\)</span>, where <span class="math inline">\(x\)</span> does not occur free in <span class="math inline">\(\alpha\)</span>.</li>
<li><span class="math inline">\(x = x\)</span>.</li>
<li><span class="math inline">\(x = y \to (\alpha \to \alpha&#39;)\)</span>, where <span class="math inline">\(\alpha\)</span> is atomic and <span class="math inline">\(\alpha&#39;\)</span> is obtained from <span class="math inline">\(\alpha\)</span> by replacing <span class="math inline">\(x\)</span> in zero or more places by <span class="math inline">\(y\)</span>.</li>
</ol></li>
</ul></li>
</ul></li>
</ul>
<p><strong>Remark 7.8. (Validity of logical axioms)</strong> It should be intuitively clear that all logical axioms are convincing, and that their validity can be argued without appealing to any specific model. In particular, for an axiom <span class="math inline">\(\theta \in \Lambda\)</span>, there is <span class="math inline">\(\vdash \theta\)</span>; we must be able to argue (in our meta-language) that <span class="math inline">\(\models \theta\)</span>, so that we can be convinced that our deductive system is a <em>sound</em> one. Remember that for any formula <span class="math inline">\(\varphi\)</span>, either <span class="math inline">\(\models \varphi\)</span> or <span class="math inline">\(\not\models \varphi\)</span> (which is just <span class="math inline">\(\models (\lnot \varphi)\)</span>). If a proof of <span class="math inline">\(\theta\)</span> (not as a formal deduction, but as an argument in our meta-language) does not even imply <span class="math inline">\(\models \theta\)</span>, that would be very frustrating.</p>
<p><strong>Remark 7.9. (Tautological implication, logical implication and deduction)</strong> If <span class="math inline">\(\Gamma \models_t \varphi\)</span> (i.e., <span class="math inline">\(\varphi\)</span> is tautologically implied by <span class="math inline">\(\Gamma\)</span> in propositional logic), we can argue that <span class="math inline">\(\Gamma \models \varphi\)</span> when replacing sentential symbols by prime formulas in first-order logic. In the special case that <span class="math inline">\(\Gamma = \emptyset\)</span>, we are proving the validity of Axiom Group 1: <span class="math inline">\(\models_t \alpha \implies \models \alpha\)</span> (every tautology is valid). The converse does not hold though, since we have <span class="math inline">\(\models (\alpha \to \forall x \alpha)\)</span> (by Axiom Group 4), but <span class="math inline">\(\not\models_t (\alpha \to \forall x \alpha)\)</span> as <span class="math inline">\(\alpha\)</span> and <span class="math inline">\(\forall x \alpha\)</span> are two different sentential symbols (surely <span class="math inline">\((A_1 \to A_2)\)</span> is not a tautology in propositional logic!).</p>
<p>It is worth noticing that even though <span class="math inline">\(\Gamma \models \varphi \not\implies \Gamma \models_t \varphi\)</span>, we do have <span class="math inline">\(\Gamma \models \varphi \iff \Gamma \cup \Lambda \models_t \varphi\)</span>. Intuitively, the set <span class="math inline">\(\Lambda\)</span> of logical axioms gives us a chance to establish truths about quantifiers and equalities (other than treating these prime formulas as sentential symbols that are too unrefined for our first-order logic). I haven’t done a proof of this, but I believe it should be <em>non-trivial</em> on both directions. Combining with Theorem 24B in Enderton p. 115, we get the nice result concluding that <span class="math display">\[\Gamma \vdash \varphi \iff \Gamma \cup \Lambda \models_t \varphi \iff \Gamma \models \varphi\]</span> which entails both the soundness and the completeness theorems. It is basically saying that these three things are equivalent:</p>
<ol type="1">
<li><span class="math inline">\(\Gamma\)</span> proves <span class="math inline">\(\varphi\)</span>. (There is a formal deduction that derives <span class="math inline">\(\varphi\)</span> from <span class="math inline">\(\Gamma\)</span> and axioms <span class="math inline">\(\Lambda\)</span> in our deductive system; it is finite and purely syntactical, in the sense that it does not depend on any structure or assignment of variables.)</li>
<li><span class="math inline">\(\Gamma\)</span> logically implies <span class="math inline">\(\varphi\)</span>. (For every structure and every assignment of variables, <span class="math inline">\(\varphi\)</span> holds true given <span class="math inline">\(\Gamma\)</span>. Of course, this is a semantical notion in the sense that it does involve structures and assignments of variables, which are infinite in numbers so it would be impossible for one to check this mechanically.)</li>
<li><span class="math inline">\(\Gamma \cup \Lambda\)</span> tautologically implies <span class="math inline">\(\varphi\)</span>. (We can reduce a first-order logic to propositional logic by adding logical axioms to the set of hypotheses, preserving all truths. For each prime formula, this is still a semantical notion for its truth value depends on structures / assignments of variables.)</li>
</ol>
<ul>
<li><a href="https://wiki.soimort.org/math/logic/fol/soundness/"><strong>Soundness</strong></a>
<ol type="a">
<li><span class="math inline">\(\Gamma \vdash \varphi \implies \Gamma \models \varphi\)</span>.</li>
</ol>
<ul>
<li>Proof idea:
<ol>
<li>For <span class="math inline">\(\varphi \in \Lambda\)</span>, show that every logical axiom is valid (Lemma 25A in Enderton p. 132ff.), that is, <span class="math inline">\(\models \varphi\)</span>. Then trivially <span class="math inline">\(\Gamma \models \varphi\)</span>;</li>
<li>For <span class="math inline">\(\varphi \in \Gamma\)</span>, we have trivially <span class="math inline">\(\Gamma \models \varphi\)</span>;</li>
<li><span class="math inline">\(\varphi\)</span> is obtained by generalization on variable <span class="math inline">\(x\)</span> from a valid formula <span class="math inline">\(\theta\)</span>. Since <span class="math inline">\(\models \theta\)</span> (if <span class="math inline">\(\theta\)</span> is an axiom, then this is already shown in Step 1; if <span class="math inline">\(\theta\)</span> is another generalization, then this can be shown by IH), for every structure <span class="math inline">\(\mathfrak{A}\)</span> and <span class="math inline">\(a \in |\mathfrak{A}|\)</span>, <span class="math inline">\(\models_\mathfrak{A} \theta[s(x|a)]\)</span>, then by definition we have <span class="math inline">\(\models_\mathfrak{A} \forall x \theta[s]\)</span>. Therefore <span class="math inline">\(\models \forall x \theta\)</span>;</li>
<li><span class="math inline">\(\varphi\)</span> is obtained by modus ponens from <span class="math inline">\(\psi\)</span> and <span class="math inline">\((\psi \to \varphi)\)</span>. By IH we have <span class="math inline">\(\Gamma \models \psi\)</span> and <span class="math inline">\(\Gamma \models (\psi \to \varphi)\)</span>. Show that <span class="math inline">\(\Gamma \models \varphi\)</span> using Exercise 1 in Enderton p. 99. (NB. the wording in the last line of Enderton p. 131, i.e., “follows at once”, seems too sloppy to me: we have not proved modus ponens semantically yet.)</li>
</ol></li>
<li><strong>Consistency</strong> of formulas: A set <span class="math inline">\(\Gamma\)</span> of formulas is said to be <em>consistent</em> iff for no formula <span class="math inline">\(\varphi\)</span> it is the case that both <span class="math inline">\(\Gamma \vdash \varphi\)</span> and <span class="math inline">\(\Gamma \vdash (\lnot\varphi)\)</span>.
<ul>
<li>By the soundness theorem, an inconsistent set <span class="math inline">\(\Gamma\)</span> of formulas gives rise to both <span class="math inline">\(\Gamma \models \varphi\)</span> and <span class="math inline">\(\Gamma \models (\lnot\varphi)\)</span>. As discussed before, it would throw our trust on mathematical truths into fire – we will have proved that some statement is both true and false!</li>
</ul></li>
</ul>
<ol start="2" type="a">
<li>If <span class="math inline">\(\Gamma\)</span> is satisfiable, then <span class="math inline">\(\Gamma\)</span> is consistent.<br />
(a. and b. are equivalent representations of the soundness theorem.)</li>
</ol></li>
<li><a href="https://wiki.soimort.org/math/logic/fol/completeness/"><strong>Completeness</strong></a>
<ol type="a">
<li><span class="math inline">\(\Gamma \models \varphi \implies \Gamma \vdash \varphi\)</span>.</li>
<li>If <span class="math inline">\(\Gamma\)</span> is consistent, then <span class="math inline">\(\Gamma\)</span> is satisfiable.<br />
(a. and b. are equivalent representations of the completeness theorem.)</li>
</ol>
<ul>
<li>Proof idea: We will prove first a weaker form of b., i.e., the completeness for a countable language <span class="math inline">\(\mathcal{L}\)</span>. Let <span class="math inline">\(\Gamma\)</span> be a consistent set of formulas. We show that it is satisfiable.
<ol>
<li>Extend the language <span class="math inline">\(\mathcal{L}\)</span> with a countable set <span class="math inline">\(\bar{C}\)</span> of new constant symbols and get a new language <span class="math inline">\(\mathcal{L}_\bar{C}\)</span>;</li>
<li>Given the set <span class="math inline">\(\Gamma\)</span> of <span class="math inline">\(\mathcal{L}\)</span>-formulas, show that there is a set <span class="math inline">\(\bar\Gamma\)</span> of <span class="math inline">\(\mathcal{L}_\bar{C}\)</span>-formulas that is consistent, complete, deductively closed and Henkinized, i.e., for every formula <span class="math inline">\(\exists x \varphi \in \Gamma\)</span> there is a “witness” constant <span class="math inline">\(\bar{c} \in \bar{C}\)</span> such that <span class="math inline">\(\varphi^x_\bar{c} \in \bar{\Gamma}\)</span>;</li>
<li>Build a structure <span class="math inline">\(\mathfrak{A}_0\)</span> from <span class="math inline">\(\bar\Gamma\)</span> where <span class="math inline">\(|\mathfrak{A}_0|\)</span> is the set of terms of <span class="math inline">\(\mathcal{L}_\bar{C}\)</span>. The assignment <span class="math inline">\(s\)</span> maps every variable to itself;</li>
<li>Define an equivalence relation <span class="math inline">\(E\)</span> on <span class="math inline">\(|\mathfrak{A}_0|\)</span>: <span class="math inline">\(t E t&#39; \iff t = t&#39; \in \bar\Gamma\)</span>. Show by induction that for any <span class="math inline">\(\mathcal{L}_\bar{C}\)</span>-formula <span class="math inline">\(\varphi\)</span>, <span class="math inline">\(\models_{\mathfrak{A}_0} \varphi^* [s] \iff \varphi \in \bar\Gamma\)</span> (where <span class="math inline">\(\varphi^*\)</span> is <span class="math inline">\(\varphi\)</span> with <span class="math inline">\(=\)</span> replaced by <span class="math inline">\(E\)</span> everywhere);</li>
<li>Show by the homomorphism theorem that <span class="math inline">\(\models_\mathfrak{A} \varphi[s] \iff \varphi \in \bar\Gamma\)</span> (where <span class="math inline">\(\mathfrak{A} = \mathfrak{A}_0 / E\)</span>);</li>
<li>Restrict the structure <span class="math inline">\(\mathfrak{A}\)</span> (a model of <span class="math inline">\(\mathcal{L}_\bar{C}\)</span>) to <span class="math inline">\(\mathcal{L}\)</span> by dropping all new constants <span class="math inline">\(\bar{C}\)</span>. Then <span class="math inline">\(\Gamma\)</span> is satisfiable with <span class="math inline">\(\mathfrak{A}\)</span> and <span class="math inline">\(s\)</span> in <span class="math inline">\(\mathcal{L}\)</span>.</li>
</ol></li>
</ul></li>
<li><a href="https://wiki.soimort.org/math/logic/fol/compactness/"><strong>Compactness</strong></a>
<ol type="a">
<li><span class="math inline">\(\Gamma \models \varphi \implies\)</span> There is a finite <span class="math inline">\(\Gamma_0 \subseteq \Gamma\)</span> such that <span class="math inline">\(\Gamma_0 \models \varphi\)</span>.</li>
<li>If every finite subset <span class="math inline">\(\Gamma_0\)</span> of <span class="math inline">\(\Gamma\)</span> is satisfiable, then <span class="math inline">\(\Gamma\)</span> is satisfiable.<br />
(a. and b. are equivalent representations of the compactness theorem.)</li>
</ol>
<ul>
<li>Proof idea: A simple corollary of soundness and completeness theorems.</li>
</ul></li>
</ul>
<p><strong>Remark 7.10. (Soundness and completeness)</strong> The soundness and the completeness theorems together evidence the equivalence of consistency and satisfiability of a set of formulas, or that of validity and provability of a formula. The completeness theorem is by no means an obvious result; the first proof was given by Kurt Gödel in 1930<a href="#fn3" class="footnoteRef" id="fnref3"><sup>3</sup></a>, but the proof that we use today is given by Leon Henkin in 1949 <span class="citation" data-cites="henkin1949completeness">[1]</span>, which easily generalizes to uncountable languages.</p>
<p><strong>Remark 7.11. (Completeness and incompleteness)</strong> Note that the completeness theorem should not be confused with Gödel’s <em>incompleteness theorems</em>. Specifically, the completeness theorem claims that (unconditionally) every formula that is logically implied by <span class="math inline">\(\Gamma\)</span> is also deducible from <span class="math inline">\(\Gamma\)</span> (i.e., <span class="math inline">\(\Gamma \models \varphi \implies \Gamma \vdash \varphi\)</span>), while the first incompleteness theorem claims that (under some conditions) some consistent deductive systems are incomplete (i.e., there is some formula <span class="math inline">\(\varphi\)</span> such that neither <span class="math inline">\(\Gamma \vdash \varphi\)</span> nor <span class="math inline">\(\Gamma \vdash (\lnot\varphi)\)</span>). As is clearly seen, the incompleteness theorem is purely syntactical and matters for provability (or decidability, one might say). The aforementioned liar’s paradox, where our semantics raises a contradiction that neither <span class="math inline">\(\Gamma \models \varphi\)</span> nor <span class="math inline">\(\Gamma \models (\lnot\varphi)\)</span> reasonably holds, may be seen as a semantical analogue of the first incompleteness theorem.</p>
<section id="equality-is-logical" class="level2">
<h2>Equality is logical</h2>
<p>The equality symbol <span class="math inline">\(=\)</span> is a logical symbol, in the sense that the equivalence relation it represents is <em>not</em> open to interpretation and always means what it’s intended to mean (i.e., “the LHS is <em>equal</em> to the RHS”). But if so, how do we say <span class="math display">\[1+1=2\]</span> is a true sentence then? Can’t we just interpret the equality symbol as something else in a structure <span class="math inline">\(\mathfrak{A}\)</span> such that <span class="math inline">\(\models_\mathfrak{A} (\lnot 1+1=2)\)</span>?</p>
<p>One reason is that in many first-order systems, functions (operations) are defined as axioms using equalities; we need a general way to say “something is defined as…” or just “something is…” There would be no better way of saying this rather than representing it as an equality, so we won’t have the hassle of interpreting a made-up relation in every model. Consider the language of elementary number theory, in the intended model <span class="math inline">\(\mathfrak{N} = (\mathbb{N}; \mathbf{0}, \mathbf{S}, +, \cdot)\)</span> of Peano arithmetic, the addition function is defined as a set of domain-specific axioms: <span class="math display">\[\begin{align*}
a + \mathbf{0} &amp;= a &amp;\qquad(1) \\
a + \mathbf{S} b &amp;= \mathbf{S} (a + b) &amp;\qquad(2)
\end{align*}\]</span> By Axiom Group 6 we have <span class="math inline">\(\mathbf{S0} + \mathbf{0} = \mathbf{S0} \to (\mathbf{S0} + \mathbf{S0} = \mathbf{S}(\mathbf{S0} + \mathbf{0}) \to \mathbf{S0} + \mathbf{S0} = \mathbf{S}\mathbf{S0})\)</span>. By (1) <span class="math inline">\(\mathbf{S0} + \mathbf{0} = \mathbf{S0}\)</span>. By (2) <span class="math inline">\(\mathbf{S0} + \mathbf{S0} = \mathbf{S}(\mathbf{S0} + \mathbf{0})\)</span>. Applying modus ponens twice, we get <span class="math inline">\(\mathbf{S0} + \mathbf{S0} = \mathbf{S}\mathbf{S0}\)</span>, which is the result we want (sloppily written as <span class="math inline">\(1+1=2\)</span>).</p>
<p>The equality is a little special, since it is the most common relation with <em>reflexivity</em>, as justified by Axiom Group 5, i.e., <span class="math inline">\(x = x\)</span> for any variable <span class="math inline">\(x\)</span>. We could exclude these from our logical axioms, but in many cases we would still need a reflexive relation symbol to denote equivalence (justified by some domain-specific axioms) otherwise. Technically, it would be convenient to just treat it as a logical symbol (together with the useful Axiom Groups 5 and 6). Note that though our logical axioms did not say anything about other properties like <em>symmetry</em>, <em>antisymmetry</em> and <em>transitivity</em>, they follow easily from Axiom Groups 5 and 6, in our deductive system:</p>
<p><strong>Lemma 7.12. (Symmetry)</strong> If <span class="math inline">\(x = y\)</span>, then <span class="math inline">\(y = x\)</span>.</p>
<p><em>Proof.</em> Given <span class="math inline">\(x = y\)</span>. By Axiom Group 5 we have <span class="math inline">\(x = x\)</span>. By Axiom Group 6 we have <span class="math inline">\(x = y \to (x = x \to y = x)\)</span>. Applying modus ponens twice, we get <span class="math inline">\(y = x\)</span>. <p style='text-align:right !important;text-indent:0 !important;position:relative;top:-1em'>&#9632;</p></p>
<p><strong>Lemma 7.13. (Transitivity)</strong> If <span class="math inline">\(x = y\)</span> and <span class="math inline">\(y = z\)</span>, then <span class="math inline">\(x = z\)</span>.</p>
<p><em>Proof.</em> Given <span class="math inline">\(x = y\)</span>, by symmetry it holds <span class="math inline">\(y = x\)</span>. Also <span class="math inline">\(y = z\)</span>. By Axiom Group 6 we have <span class="math inline">\(y = x \to (y = z \to x = z)\)</span>. Applying modus ponens twice, we get <span class="math inline">\(x = z\)</span>. <p style='text-align:right !important;text-indent:0 !important;position:relative;top:-1em'>&#9632;</p></p>
<p><strong>Lemma 7.14. (Antisymmetry)</strong> If <span class="math inline">\(x = y\)</span> and <span class="math inline">\(y = x\)</span>, then <span class="math inline">\(x = y\)</span>.</p>
<p><em>Proof.</em> Trivial.</p>
<p>Note that if any partial order is <a href="https://wiki.soimort.org/math/logic/fol/definability/">definable</a> on a structure, the equality symbol may not be indispensable in our language, e.g., consider the language of set theory, where the equivalence of two sets <span class="math inline">\(x = y\)</span> may be defined as <span class="math display">\[(\forall v (v \in x \to v \in y) \land \forall v (v \in y \to v \in x))\]</span></p>
</section>
<section id="the-limitation-of-first-order-logic" class="level2">
<h2>The limitation of first-order logic</h2>
<p>Consider again the language of elementary number theory, in the intended model <span class="math inline">\(\mathfrak{N} = (\mathbb{N}; \mathbf{0}, \mathbf{S}, +, \cdot)\)</span> of Peano arithmetic, we have the set of all true sentences <span class="math inline">\(\text{Th}\mathfrak{N}\)</span>.<a href="#fn4" class="footnoteRef" id="fnref4"><sup>4</sup></a> Now we add a new constant symbol <span class="math inline">\(c&#39;\)</span> to our language, and extend our theory with a countable set of sentences <span class="math inline">\(\text{Th}\mathfrak{N} \cup \{ \underbrace{\mathbf{S} \cdots \mathbf{S}}_{n\ \text{times}} \mathbf{0} &lt; c&#39; \,:\, n \in \mathbb{N} \}\)</span> (Here we define <span class="math inline">\(x &lt; y\)</span> as <span class="math inline">\(\lnot\forall z ((\lnot z = \mathbf{0}) \to (\lnot x + z = y))\)</span>). Is there still a model for this extended theory?</p>
<p>For any given <span class="math inline">\(n \in \mathbb{N}\)</span>, the sentence <span class="math inline">\(\underbrace{\mathbf{S} \cdots \mathbf{S}}_{n\ \text{times}} \mathbf{0} &lt; c&#39;\)</span> is clearly satisfiable (by simply taking <span class="math inline">\(c&#39; = \mathbf{S} n\)</span>). Then it is easily shown that every finite subset <span class="math inline">\(\Gamma_0 \subseteq \text{Th}\mathfrak{N} \cup \{ \underbrace{\mathbf{S} \cdots \mathbf{S}}_{n\ \text{times}} \mathbf{0} &lt; c&#39; \,:\, n \in \mathbb{N} \}\)</span> is satisfiable. By the compactness theorem (b.), <span class="math inline">\(\text{Th}\mathfrak{N} \cup \{ \underbrace{\mathbf{S} \cdots \mathbf{S}}_{n\ \text{times}} \mathbf{0} &lt; c&#39; \,:\, n \in \mathbb{N} \}\)</span> is also satisfiable. This means that we can construct a <em>non-standard model of arithmetic</em> <span class="math inline">\(\mathfrak{N}&#39;\)</span> with an additional bizarre element (specifically <span class="math inline">\(c&#39;_0\)</span>) that turns out to be greater than any other number (thus the model of this theory is not isomorphic to our standard model <span class="math inline">\(\mathfrak{N}\)</span>).</p>
<p>Recall that in a Peano arithmetic modeled by <span class="math inline">\(\mathfrak{N}&#39;\)</span>, numbers are closed under the successor function <span class="math inline">\(\mathbf{S}\)</span>. More precisely, if <span class="math inline">\(k \in |\mathfrak{N}&#39;|\)</span> , then <span class="math inline">\(\mathbf{S}k \in |\mathfrak{N}&#39;|\)</span> and <span class="math inline">\(\mathbf{S}k \neq \mathbf{0}\)</span>. This implies that not only <span class="math inline">\(c&#39;_0 \in |\mathfrak{N}&#39;|\)</span>, but also <span class="math inline">\(\mathbf{S} c&#39;_0, \mathbf{S}\mathbf{S} c&#39;_0, \mathbf{S}\mathbf{S}\mathbf{S} c&#39;_0, \dots\)</span> are all non-standard numbers in <span class="math inline">\(|\mathfrak{N}&#39;|\)</span>. As none of these numbers is equal to <span class="math inline">\(\mathbf{0}\)</span> (by one of Peano axioms), they form an infinite “chain” separately besides our familiar standard ones. One can write down all the (standard and non-standard) numbers sloppily as the following sequence: <span class="math display">\[\langle 0, 1, 2, \dots, \quad c&#39;_0, c&#39;_1, c&#39;_2, \dots \rangle\]</span> where <span class="math inline">\(0\)</span> is just <span class="math inline">\(\mathbf{0}\)</span>, <span class="math inline">\(1\)</span> is <span class="math inline">\(\mathbf{S0}\)</span>, <span class="math inline">\(2\)</span> is <span class="math inline">\(\mathbf{SS0}\)</span>, <span class="math inline">\(c&#39;_1\)</span> is <span class="math inline">\(\mathbf{S} c&#39;_0\)</span>, <span class="math inline">\(c&#39;_2\)</span> is <span class="math inline">\(\mathbf{SS} c&#39;_0\)</span>, etc.</p>
<p>Clearly, every number but <span class="math inline">\(0\)</span> and <span class="math inline">\(c&#39;_0\)</span> in the sequence has a unique predecessor. There is certainly no such a predecessor <span class="math inline">\(j\)</span> of <span class="math inline">\(0\)</span>, because otherwise we would have <span class="math inline">\(\mathbf{S}j = \mathbf{0}\)</span>, contradicting our axioms. But can we have a predecessor of <span class="math inline">\(c&#39;_0\)</span>? There is no axiom preventing us from constructing that thing! So here we go, enlarge our collection of numbers to: <span class="math display">\[\langle 0, 1, 2, \dots, \quad \dots, c&#39;_{-2}, c&#39;_{-1}, c&#39;_0, c&#39;_1, c&#39;_2, \dots \rangle\]</span> where for each <span class="math inline">\(c&#39;_{i}\)</span>, <span class="math inline">\(c&#39;_{i+1} = \mathbf{S} c&#39;_{i}\)</span>. Again, we know that every such non-standard numbers <span class="math inline">\(c&#39;_i\)</span> is greater than any standard number <span class="math inline">\(n\)</span> (otherwise we can find a standard number <span class="math inline">\(n-i\)</span> such that <span class="math inline">\((\lnot n-i &lt; c&#39;_0)\)</span>, contradicting our initial construction of <span class="math inline">\(c&#39;_0\)</span> by compactness. So the non-standard part is still a separate chain, thus as written above.</p>
<p>We can go even further. Let <span class="math inline">\(|\mathfrak{N}&#39;|\)</span> be this set of standard and non-standard numbers, and <span class="math inline">\(\mathfrak{N}&#39; = (|\mathfrak{N}&#39;|; \mathbf{0}, \mathbf{S}, +, \cdot)\)</span> is still the intended model of Peano arithmetic on <span class="math inline">\(|\mathfrak{N}&#39;|\)</span>. Consider adding yet another constant symbol <span class="math inline">\(c&#39;&#39;\)</span>. Is <span class="math inline">\(\text{Th}\mathfrak{N}&#39; \cup \{ \underbrace{\mathbf{S} \cdots \mathbf{S}}_{n&#39;\ \text{times}} \mathbf{0} &lt; c&#39;&#39; \,:\, n&#39; \in |\mathfrak{N}&#39;| \}\)</span> satisfiable? By the same reasoning as before, we get a model <span class="math inline">\(\mathfrak{N}&#39;&#39;\)</span>, with its domain of numbers <span class="math display">\[\langle 0, 1, 2, \dots, \quad \dots, c&#39;_{-2}, c&#39;_{-1}, c&#39;_0, c&#39;_1, c&#39;_2, \dots, \quad \dots, c&#39;&#39;_{-2}, c&#39;&#39;_{-1}, c&#39;&#39;_0, c&#39;&#39;_1, c&#39;&#39;_2, \dots \rangle\]</span></p>
<p>Counter-intuitively, this is not the kind of “arithmetic” we are used to. What we are trying to do is to formulate the arithmetic for <em>standard</em> natural numbers that we use everyday (i.e., <span class="math inline">\(0, 1, 2, \dots\)</span>) in first-order logic, but these non-standard numbers come out of nowhere and there is an infinite “stage” of them, such that each number in a latter stage is greater than every number in a former stage (How is that even possible?). And what is worse, in a subset of the non-standard model <span class="math inline">\(\mathfrak{N}&#39;\)</span>: <span class="math display">\[\{ \dots, c&#39;_{-2}, c&#39;_{-1}, c&#39;_0, c&#39;_1, c&#39;_2, \dots \}\]</span> There is no minimal element with respect to the intended ordering relation <span class="math inline">\(&lt;\)</span>, thus it is not <em>well-ordered</em> by <span class="math inline">\(&lt;\)</span>, so our good old mathematical induction will no longer work with non-standard numbers.</p>
<p>Well, the lucky part is that we can at least have the induction axiom as a first-order sentence: <span class="math display">\[\varphi(\mathbf{0}, y_1, \dots, y_k) \land \forall x (\varphi (x, y_1, \dots, y_k) \to \varphi (\mathbf{S}(x), y_1, \dots, y_k))
\to \forall x \varphi(x, y_1, \dots, y_k))\]</span> Since the standard model <span class="math inline">\(\mathfrak{N}\)</span> and the non-standard model <span class="math inline">\(\mathfrak{N}&#39;\)</span> are <em>elementarily equivalent</em> (i.e., they satisfy the same sentences in a language excluding non-standard numbers), we still enjoy the nice induction principle for all of standard natural numbers. But for the non-standard part, we’re out of luck.</p>
<p>Ideally, we would like to have a bunch of axioms that perfectly defines <em>the model</em> of arithmetic, where no non-standard part is allowed to exist, i.e., the set of numbers is well-ordered by a definable ordering relation <span class="math inline">\(&lt;\)</span> so that we can apply the induction principle on all of them.<a href="#fn5" class="footnoteRef" id="fnref5"><sup>5</sup></a> Unfortunately, this is infeasible in first-order logic (without formulating the notion of sets). We will demonstrate two potential resolutions.</p>
<p>The intuition here is that in order to rule out any non-standard chains of numbers, we must find a 1-place predicate <span class="math inline">\(P \subseteq |\mathfrak{N}|\)</span> such that for every standard number <span class="math inline">\(n\)</span> we have <span class="math inline">\(P n\)</span>, distinguishing it from <span class="math inline">\((\lnot P n&#39;)\)</span> where <span class="math inline">\(n&#39;\)</span> is non-standard. Certainly <span class="math inline">\(\mathbf{0}\)</span> is a standard number; consequently every standard number <span class="math inline">\(x\)</span> is followed by <span class="math inline">\(\mathbf{S}x\)</span>, which is also a standard one. <span class="math display">\[P \mathbf{0} \land \forall x (P x \to P \mathbf{S} x)\]</span> Once we have this <span class="math inline">\(P\)</span> in mind, we say that every number <span class="math inline">\(n \in P\)</span>, so it is also standard. There would be no other numbers in our model, thus define our set of all numbers: <span class="math display">\[\forall n P n\]</span> Notice that <span class="math inline">\(P\)</span> is not in our language; it is something we have yet to construct for our standard model. How to deal with this issue?</p>
<ol type="1">
<li>The first approach is by allowing quantification over relations. So we can say “for all such relations <span class="math inline">\(P\)</span>, it holds that <span class="math inline">\(\forall n P n\)</span>”. Formally: <span class="math display">\[\forall P (P \mathbf{0} \land \forall x (P x \to P \mathbf{S} x)) \to \forall n P n\]</span> Of course, in our previous definition of first-order languages, for <span class="math inline">\(\forall v_i \psi\)</span> to be a well-formed formula, <span class="math inline">\(v_i\)</span> is a variable such that <span class="math inline">\(v_i \in |\mathfrak{N}|\)</span> given a model <span class="math inline">\(\mathfrak{N}\)</span>; here we have <span class="math inline">\(P \subseteq |\mathfrak{N}|\)</span> hence <span class="math inline">\(P \in \mathcal{P}(|\mathfrak{N}|)\)</span>. So in a first-order logic we would not be able to do this (we can only quantify a variable in the domain <span class="math inline">\(|\mathfrak{N}|\)</span>). This approach leads to a <strong>second-order logic</strong> (where we can not only quantify over a plain variable in <span class="math inline">\(|\mathfrak{N}|\)</span>, but also quantify over a relation variable in the power set of the domain, i.e., <span class="math inline">\(\mathcal{P}(|\mathfrak{N}|)\)</span>; that gives our logic more expressive power!).</li>
<li>As we see, a relation is essentially a subset of <span class="math inline">\(|\mathfrak{N}|\)</span> (thus its range is also a set); it is tempting to formulate Peano arithmetic using the notion of sets. Indeed, we can rewrite the formula in Approach 1 into the language of set theory as: <span class="math display">\[\forall y (\emptyset \in y \land \forall x (x \in y \to S(x) \in y)) \to \forall n\ n \in y\]</span> where we encode the standard number <span class="math inline">\(\mathbf{0}\)</span> as <span class="math inline">\(\emptyset\)</span>, <span class="math inline">\(\mathbf{S}x\)</span> as <span class="math inline">\(S(x) = x \cup \{x\}\)</span>. Clearly there is no non-standard number in this set-theoretic model. This is exactly how we define natural numbers <span class="math inline">\(\mathbb{N}\)</span> (as a minimal <em>inductive set</em> <span class="math inline">\(\omega\)</span>) in <strong>set theory</strong>, and its existence is justified by the so-called <em>axiom of infinity</em>. Note that once we introduce the set theory (using a first-order language), we have the equivalently expressive (sometimes paradoxical) power of any arbitrary higher-order logic. Fundamentally.<a href="#fn6" class="footnoteRef" id="fnref6"><sup>6</sup></a></li>
</ol>
</section>
<section id="references-and-further-reading" class="level2">
<h2>References and further reading</h2>
<p><strong>Books:</strong></p>
<p>Herbert B. Enderton, <em>A Mathematical Introduction to Logic</em>, 2nd ed.</p>
<p>Kenneth Kunen, <em>The Foundations of Mathematics</em>.</p>
<p><strong>Articles:</strong></p>
<p>Terence Tao, “The completeness and compactness theorems of first-order logic,” <a href="https://terrytao.wordpress.com/2009/04/10/the-completeness-and-compactness-theorems-of-first-order-logic/" class="uri">https://terrytao.wordpress.com/2009/04/10/the-completeness-and-compactness-theorems-of-first-order-logic/</a>.</p>
<p>Asger Törnquist, “The completeness theorem: a guided tour,” <a href="http://www.math.ku.dk/~asgert/teachingnotes/iml-completenessguide.pdf" class="uri">http://www.math.ku.dk/~asgert/teachingnotes/iml-completenessguide.pdf</a>.</p>
<p>Eliezer Yudkowsky, “Godel’s completeness and incompleteness theorems,” <a href="http://lesswrong.com/lw/g1y/godels_completeness_and_incompleteness_theorems/" class="uri">http://lesswrong.com/lw/g1y/godels_completeness_and_incompleteness_theorems/</a>.</p>
<p>Eliezer Yudkowsky, “Standard and nonstandard numbers,” <a href="http://lesswrong.com/lw/g0i/standard_and_nonstandard_numbers/" class="uri">http://lesswrong.com/lw/g0i/standard_and_nonstandard_numbers/</a>.</p>
<p>David A. Ross, “Fun with nonstandard models,” <a href="http://www.math.hawaii.edu/~ross/fun_with_nsmodels.pdf" class="uri">http://www.math.hawaii.edu/~ross/fun_with_nsmodels.pdf</a>.</p>
<p><strong>Papers:</strong></p>
<div id="refs" class="references">
<div id="ref-henkin1949completeness">
<p>[1] L. Henkin, “The completeness of the first-order functional calculus,” <em>The journal of symbolic logic</em>, vol. 14, no. 3, pp. 159–166, 1949. </p>
</div>
</div>
</section>
<section class="footnotes">
<hr />
<ol>
<li id="fn1"><p>Is there a constructive approach as a replacement of Henkin’s construction? <a href="https://math.stackexchange.com/questions/1462408/is-there-a-constructive-approach-as-a-replacement-of-henkins-construction" class="uri">https://math.stackexchange.com/questions/1462408/is-there-a-constructive-approach-as-a-replacement-of-henkins-construction</a><a href="#fnref1" class="footnoteBack">↩</a></p></li>
<li id="fn2"><p>We are using the notion of decidability/undecidability here even before we get to Gödel’s incompleteness theorem, but hopefully it’s no stranger to us as computability theory has <a href="/mst/6/">all the model-specific issues</a> (though non-logical) covered.<a href="#fnref2" class="footnoteBack">↩</a></p></li>
<li id="fn3"><p>Gödel’s original proof of the completeness theorem: <a href="https://en.wikipedia.org/wiki/Original_proof_of_G%C3%B6del%27s_completeness_theorem">https://en.wikipedia.org/wiki/Original_proof_of_G%C3%B6del%27s_completeness_theorem</a><a href="#fnref3" class="footnoteBack">↩</a></p></li>
<li id="fn4"><p>It might be interesting to know that <span class="math inline">\(\text{Th}(\mathbb{N}; \mathbf{0}, \mathbf{S}, +, \cdot)\)</span> is an undecidable theory, as shown by the aforementioned incompleteness theorem.<a href="#fnref4" class="footnoteBack">↩</a></p></li>
<li id="fn5"><p>If we accept the Axiom of Choice, then every set can have a well-ordering; so this is actually a reasonable request.<a href="#fnref5" class="footnoteBack">↩</a></p></li>
<li id="fn6"><p>Many logicians (Kurt Gödel, Thoralf Skolem, Willard Van Quine) seem to adhere to first-order logic. And that’s why.<a href="#fnref6" class="footnoteBack">↩</a></p></li>
</ol>
</section>
      </div>
      <!-- (www.soimort.org) last updated: 2016-05-07 -->
      <aside id="soimort-toolbar">
        <a href="/"><i class="fa fa-home" aria-hidden="true"></i></a>
      </aside>
    </article>
  </body>
</html>