Measuring the intelligence of an idealized mechanical knowing agent

Samuel Allen Alexander (ORCID 0000-0002-7930-110X)
The U.S. Securities and Exchange Commission
samuelallenalexander@gmail.com
https://philpeople.org/profiles/samuel-alexander/publications

Abstract. We define a notion of the intelligence level of an idealized mechanical knowing agent. This is motivated by efforts within artificial intelligence research to define real-number intelligence levels of complicated intelligent systems. Our agents are more idealized, which allows us to define a much simpler measure of intelligence level for them. In short, we define the intelligence level of a mechanical knowing agent to be the supremum of the computable ordinals that have codes the agent knows to be codes of computable ordinals. We prove that if one agent knows certain things about another agent, then the former necessarily has a higher intelligence level than the latter. This allows our intelligence notion to serve as a stepping stone to obtain results which, by themselves, are not stated in terms of our intelligence notion (results of potential interest even to readers totally skeptical that our notion correctly captures intelligence). As an application, we argue that these results comprise evidence against the possibility of intelligence explosion (that is, the notion that sufficiently intelligent machines will eventually be capable of designing even more intelligent machines, which can then design even more intelligent machines, and so on).

Keywords: Machine intelligence · Knowing agents · Ordinal numbers · Intelligence explosion

1 Introduction

In formal epistemology, when we study the knowledge of knowing agents, we usually idealize their knowledge. We assume, for example, that if an agent knows A and knows A → B, then that agent knows B.
We might assume the agent knows all the first-order axioms of Peano arithmetic, even though there are infinitely many such axioms (because the axiom of mathematical induction is an infinite schema). See [19] (section 2) for an excellent description of this idealization process. This idealization process is important because it acts as a simplifying assumption which makes it possible to reason about knowledge. Without such simplifying assumptions, the deep structure of knowledge would be hidden behind the distracting noise and arbitrariness surrounding real-world knowledge.

In this paper, we will describe a way to measure the intelligence level of an idealized mechanical knowing agent (a knowing agent is mechanical if its knowledge-set can be enumerated by a Turing machine; see [6]).

We anticipate that the reader might object that knowing agents might not be totally ordered by intelligence (perhaps there are two agents A and B such that A is more intelligent in certain ways and B is more intelligent in others); the same goes for human beings, but that does not stop psychologists from studying IQ scores. Our intelligence measure is somewhat like an IQ test in the sense that it assigns intelligence levels to agents in spite of the fact that true intelligence probably is not a total ordering. Similarly, the reader might object that intelligence is not 1-dimensional and therefore one single measurement is probably not enough. Again, we would make the same comparison to IQ. In general, any formal measure of intelligence is certain to have limits (the map is not the territory).

This paper was motivated by authors like Legg and Hutter [14], Hernández-Orallo and Dowe [12], and Hibbard [13], who attempt to use real numbers to measure the intelligence of intelligent systems (footnote 1). Those systems perform actions and observe the results of those actions in surrounding environments (footnote 2).
By contrast, the agents we consider do not perform actions in their environments, nor do they make observations about those environments. To us, a knowing agent is more like an intelligent system that has been placed in a particularly bleak environment: an empty room totally devoid of stimulus and rewards. Thus abandoned, the system has nothing else to do but enumerate theorems all day. Despite these spartan conditions, we discovered a method of measuring the intelligence of idealized mechanical knowing agents. (In Section 6, we will describe a thought experiment whereby our idealized agents can be obtained as a type of cross section of less idealized agents, so that in spite of the idealized nature of the agents we predominantly study, some insight can be gained into more realistic intelligent systems.)

Footnote 1: In the case of Hibbard, natural numbers are used.

Footnote 2: Such authors essentially consider an environment to be a function which takes as input a finite sequence of actions and which outputs a real-number reward and an observation for each such action-sequence. To those authors, an intelligent system is essentially a function which takes a finite sequence of reward-observation pairs and outputs an action. A system and an environment interact with each other to produce an infinite reward-observation-action sequence in the obvious way. Those authors' goal is to assign numerical intelligence-measurements to such systems, with the intention that a higher-intelligence system should outperform a lower-intelligence system (as measured by total reward earned) "on average" (across an infinite universe of environments). This is, of course, an oversimplification of those authors' work.

Whereas authors like Legg and Hutter attempt to measure intelligence based on what an intelligent system does, we measure intelligence based on what a knowing mechanical agent knows. And whereas authors like Legg and Hutter use real numbers to measure intelligence, our method uses computable ordinal numbers instead.

To see one of the benefits of using ordinals to measure intelligence, consider the following question: if A1, A2, ... are agents such that each Ai+1 is significantly more intelligent than Ai, does it necessarily follow that for every agent B, there must be some i such that Ai is more intelligent than B? If we were to measure intelligence using natural numbers (for example, as the Kolmogorov complexity of the agent), the answer would automatically be "yes", but the reason has nothing to do with intelligence and everything to do with the topology of the natural numbers. A real-number-valued intelligence measure would also force the answer to be "yes" assuming that "Ai+1 is significantly more intelligent than Ai" implies "Ai+1's intelligence is at least +1 higher than Ai's intelligence". In actuality, we see no reason why the answer to the question must be "yes", at least for idealized agents. Imagine a master-agent who designs sub-agents. Over the course of eternity, the master-agent might design better and better sub-agents, each one significantly more intelligent than the previous, but each one intentionally kept less intelligent than the master-agent.

Another benefit of measuring intelligence using computable ordinals is that, because the computable ordinals are well-founded (i.e., there is no infinite strictly-descending sequence of computable ordinals), we obtain a well-founded structure on idealized knowing agents (i.e., there is no infinite sequence of mechanical knowing agents each with strictly greater intelligence than the next).
Further, this well-foundedness is inherited by any relation on idealized mechanical knowing agents that respects our intelligence measure (a relation ≺ is said to respect an intelligence measure if whenever B ≺ A, then A has a higher intelligence than B according to that measure). For example, say that knower A totally endorses knower B if A knows the codes of Turing machines that enumerate B and also A knows that B is truthful (we will better formalize this later). We will show that whenever A totally endorses B, A has a strictly larger intelligence than B according to our ordinal-valued measure of intelligence. It immediately follows that there is no infinite sequence of mechanical knowing agents, each one of which totally endorses the next. (This result, although we arrive at it by means of our intelligence measure, does not itself make any direct reference to that measure, and should be of interest even to critics who would flatly deny that our intelligence measure is the correct way to measure a mechanical knowing agent's intelligence.)

As a practical application, the result in the previous paragraph provides a skeptical lens through which to view the idea of intelligence explosion, as described by Hutter [15]. We will elaborate upon this in Section 7.

2 An intuitive ordinal notation system

Whatever intelligence is, it surely involves certain core components such as pattern-matching, creativity, and the ability to generalize. In this section we introduce an intuitive ordinal notation system which will illuminate the relationship between ordinal notation and those three core components of intelligence. Later in the paper, in order to simplify technical details, we will use an equivalent but more abstract ordinal notation system.

Definition 1. Let 𝒫 be the smallest set of computer programs such that for every computer program P, if, when P is run, P outputs nothing except elements of 𝒫, then P ∈ 𝒫. For each P ∈ 𝒫, let |P| be the smallest ordinal α such that α > |Q| for every program Q which P outputs. We say that P notates the ordinal |P|.

Example 1. (Some finite ordinals)
1. Let P_0 be the program "End", which immediately ends, outputting nothing. Vacuously, P_0 outputs nothing except elements of 𝒫, so P_0 ∈ 𝒫. |P_0| is the smallest ordinal α bigger than |Q| for every Q which P_0 outputs, so |P_0| = 0 (since P_0 outputs nothing).
2. Let P_1 be the program "Print('End')", which outputs P_0 and then stops. Certainly P_1 outputs nothing except elements of 𝒫, so P_1 ∈ 𝒫. |P_1| is the smallest ordinal α bigger than |Q| for every Q which P_1 outputs, so |P_1| = 1.
3. Let P_2 be the program "Print('Print('End')')", which outputs P_1 and then stops. Then P_2 ∈ 𝒫 and |P_2| = 2.

Using their pattern-matching skills, the reader should recognize a pattern forming in Example 1. Through the use of creativity and generalization, the reader can short-circuit that pattern to obtain the first infinite ordinal, ω.

Example 2. Let P_ω be the program:

    Let X = 'End';
    While(True) { Print(X); Let X = "Print('"+X+"')" }

which outputs "End", "Print('End')", "Print('Print('End')')", and so on forever. By reasoning similar to Example 1, these outputs are in 𝒫 and they notate 0, 1, 2, .... Thus P_ω ∈ 𝒫 and |P_ω| is the smallest ordinal bigger than all of 0, 1, 2, ..., i.e., the smallest infinite ordinal, ω.

One might think of P_ω as a naive attempt to print every ordinal. The attempt fails, of course, because it does not print P_ω itself. In similar fashion, it can be shown that no program can succeed at printing exactly the set of computable ordinals (𝒫 is not computably enumerable).

Example 3. (The next few ordinals)
1. Let P_{ω+1} be the program "Print(P_ω)" (where P_ω is from Example 2). Then P_{ω+1} ∈ 𝒫 and |P_{ω+1}| = ω + 1.
2. Let P_{ω+2} be "Print(P_{ω+1})". Then P_{ω+2} ∈ 𝒫 and |P_{ω+2}| = ω + 2.

We could continue Example 3 all day, notating ω + 3, ω + 4, and so on. But the reader is more intelligent than that. Using their pattern-matching skill, their creativity, and their generalization skill, the reader can short-circuit the process.

Example 4. (Starting to accelerate)
1. Let P_{ω·2} be the program:

    Let X = P_ω;
    While(True) { Print(X); Let X = "Print('"+X+"')" }

Similar to Example 2, P_{ω·2} ∈ 𝒫 and |P_{ω·2}| = ω · 2.
2. Let P_{ω·3} be the program:

    Let X = P_{ω·2};
    While(True) { Print(X); Let X = "Print('"+X+"')" }

Then P_{ω·3} ∈ 𝒫 and |P_{ω·3}| = ω · 3.

Again, we could continue Example 4 all day, notating ω · 4, ω · 5, and so on. But the reader is more intelligent than that and can identify the pattern and creatively abstract it to reach ω · ω = ω^2:

Example 5. Let P_{ω^2} be the program (where ⌜ and ⌝ delimit a string literal that itself contains quotation marks):

    Let LEFT = ⌜Let X = "⌝;
    Let RIGHT = ⌜"; While(True){ Print(X); Let X = "Print('"+X+"')" }⌝;
    Let X = 'End';
    While(True) { Let X = LEFT + X + RIGHT; Print(X) }

Then P_{ω^2} ∈ 𝒫 and |P_{ω^2}| = ω^2.

We can continue along these same lines as long as we like, without ever reaching an end:

Exercise 1.
1. Write programs notating ω^3, ω^4, ....
2. Use your creativity and your pattern-matching and generalization skills to notate ω^ω.
3. Write programs notating ω^{ω^ω}, ω^{ω^{ω^ω}}, ....
4. Use your creativity and your pattern-matching and generalization skills to short-circuit the above and notate the smallest ordinal, called ε0, with the property that ε0 = ω^{ε0}.
5. Contemplate creative ways to go far beyond ε0.

In the above examples and exercises, at various points we need to apply creativity to transcend all the techniques developed previously. I conjecture that each such transcending requires strictly greater intelligence than the ones before it. If this informal conjecture is true, then it seems natural to measure intelligence as follows: an agent's intelligence level is equal to the supremum of the ordinals the agent comes up with if the agent is allowed to spend all eternity inventing ordinal notations (footnote 3).
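The programs in Examples 1, 2 and 4 can be imitated in an ordinary programming language. The following is only a minimal sketch, under the assumption that programs of the toy language are represented as Python strings; the names `notation_stream` and `finite_ordinal` are hypothetical and do not come from the paper. The generator mirrors the loop shared by the programs notating ω and ω · 2, and the helper recovers the finite ordinal notated by a nested "Print" program.

```python
from itertools import islice


def notation_stream(seed="End"):
    """Mimic the loop of Examples 2 and 4: yield seed, Print('seed'), ...

    With seed "End" this imitates the program notating omega; the stream
    is infinite, so callers must truncate it.
    """
    x = seed
    while True:
        yield x
        x = "Print('" + x + "')"


def finite_ordinal(program):
    """Return the ordinal notated by End, Print('End'), Print('Print('End')'), ...

    Only the finite notations of Example 1 are handled (an assumption of
    this sketch); anything else raises ValueError.
    """
    depth = 0
    while program != "End":
        if not (program.startswith("Print('") and program.endswith("')")):
            raise ValueError("not a finite-ordinal notation of the toy language")
        program = program[len("Print('"):-len("')")]
        depth += 1
    return depth


# First few outputs of the omega-notating loop, and the ordinals they notate.
first_programs = list(islice(notation_stream(), 4))
first_ordinals = [finite_ordinal(p) for p in first_programs]
```

No finite run of `notation_stream` exhibits ω itself: ω is notated by the generator as a whole, since it is the least ordinal above everything the generator yields.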
Theoretically, the above examples and exercises might someday be able to serve as a bridge between artificial intelligence research and neuroscience. Namely: observe human subjects' brains while they work on designing the programs in question, to see how the magnitude of the ordinal being notated corresponds to the regions of the brain that activate.

Footnote 3: For another connection to intelligence, consider the open-ended problem: "Find a very fast-growing computable function". It seems plausible that solutions should span much or all of the range of mathematical intelligence. And yet, so-called fast-growing hierarchies (which ultimately trace back to G.H. Hardy [11]) essentially reduce the problem to that of notating computable ordinals.

3 Preliminaries

In Section 2 we introduced an intuitive ordinal notation system (and, implicitly, the notion of the output of a computer program). To get actual work done, we'll need an ordinal notation system (and notion of program output) which is easier to work with. We begin with a formalized notion of computer program outputs.

Definition 2. For n ∈ N, let Wn be the nth computably enumerable set of natural numbers (i.e., the set of naturals enumerated by the nth Turing machine).

For example, if n is such that the nth Turing machine never halts, then Wn = ∅. If n is such that the nth Turing machine enumerates exactly the prime numbers, then Wn is the set of prime numbers.

The following ordinal notation system is equivalent to Definition 1 but easier to formally work with. This ordinal notation system is a simplification of a well-known ordinal notation system invented by Kleene [16].

Definition 3. (Compare Definition 1) Let O be the smallest subset of N with the property that for every n ∈ N, if Wn ⊆ O, then n ∈ O. For each n ∈ O, let |n| be the smallest ordinal α such that α > |m| for all m ∈ Wn. For n ∈ O, we say that n notates |n|.
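The "smallest subset" clause of Definition 3 is a least-fixed-point construction, which can be illustrated on a finite toy. The sketch below rests on strong simplifying assumptions that are not part of the paper: the family of sets Wn is replaced by a hypothetical finite dictionary `TOY_W` of finite sets, so every notated ordinal is a natural number. The real sets Wn are computably enumerable and possibly infinite, and the real O is not computable.

```python
def notations(W):
    """Least fixed point of Definition 3 for a finite toy family W.

    W maps an index n to a finite set standing in for W_n.  Returns a dict
    sending each n in the toy O to the (finite) ordinal |n| it notates.
    Indices missing from W are treated as never entering O (an assumption
    of this sketch).
    """
    rank = {}
    changed = True
    while changed:
        changed = False
        for n, members in W.items():
            # n enters O once every element of W_n is already in O.
            if n not in rank and all(m in rank for m in members):
                # |n| is the least ordinal strictly above |m| for m in W_n.
                rank[n] = max((rank[m] + 1 for m in members), default=0)
                changed = True
    return rank


# Hypothetical toy data: index 3 lists itself and index 4 depends on 3,
# so neither is ever admitted to O.
TOY_W = {0: set(), 1: {0}, 2: {0, 1}, 3: {3}, 4: {2, 3}, 5: {1}}
toy_rank = notations(TOY_W)
print(sorted(toy_rank.items()))  # [(0, 0), (1, 1), (2, 2), (5, 2)]
```

The self-referential index 3 shows why "smallest" matters: the set {0, 1, 2, 3, 4, 5} also satisfies the closure property, but it is not the least set that does.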
Intuitively, we want to identify a knowing agent with its knowledge-set in a certain carefully-chosen language. The language will contain a symbol n for each natural number n; a symbol W (we intend that W(x, y) be read like "x ∈ Wy"); a symbol O (we intend that O(x) be read like "x ∈ O"); and finally, the language will contain modal operators K1, K2, .... For any formula φ in the language, the formula Ki(φ) is intended to express "Agent i knows φ". When no confusion results, we will abbreviate Ki(φ) as Kiφ. For example, suppose we have chosen Agents 1, 2, 3, .... Agent (say) 5 shall be identified with the set of statements (within the language) that Agent 5 knows. If one of the statements known by Agent 5 is K7(1 = 1), then that statement is read like "Agent 7 knows 1 = 1", which is semantically interpreted as the statement that Agent 7 (i.e., the set of Agent 7's knowledge) contains the statement 1 = 1. Nontrivial statements can be built up using quantifiers. For example, the statement ∀x(K2O(x) → K3O(x)) expresses that for every natural number n, if Agent 2 knows n ∈ O, then Agent 3 also knows n ∈ O.

Unfortunately, the naive intuition in the above paragraph would expose us to philosophical questions like "what does it mean for a statement to be true?" Thus, we must formalize everything using techniques from mathematical logic. A reader uninterested in all the formal details can safely skim the definitions in this section (which assume familiarity with first-order logic) and instead read our commentary on those definitions.

Definition 4. (Standard Definitions)
1. When a first-order model M is clear from context, an assignment is a function s mapping the set of first-order variables into the universe of M.
2. For any assignment s, variable x, and element u of the universe of M, s(x|u) is the assignment which agrees with s everywhere except that it maps x to u.
3. For any variable x and any formula φ and term t in a first-order language, φ(x|t) is the result of substituting t for x in φ.
4. If M is a first-order model over a first-order language L, and if φ is an L-formula such that M |= φ[s] for all assignments s, then we say M |= φ.
5. An L-formula φ is a sentence if it has no free variables.
6. An L-formula φ is tautological if for every L-model M, M |= φ.
7. A universal closure of a formula φ is a sentence ∀x1 ··· ∀xn φ where the variables x1, ..., xn include all the free variables of φ.

Definition 4 merely reviews standard material from first-order logic. Informally, M |= φ can be read, "φ is true (in model M)". First-order logic does not touch on modal operators like Ki, so we need to extend first-order logic. We want to work with statements involving modal operators and also quantifiers (∀, ∃) in the same statement: we want to do quantified modal logic. Quantified modal logic semantics is relatively cutting-edge. For our extension, we will make use of the so-called base logic from [6], as rephrased in [4].

Definition 5. (The base logic)
– A language L of the base logic consists of a first-order language L0 together with a set of symbols called operators. L-formulas and their free variables are defined as usual, with the additional clause that for any operator K and any L-formula φ, K(φ) is an L-formula, with the same free variables as φ. Syntactic parts of Definition 4 extend to the base logic in the obvious ways.
– With L as above, an L-model M consists of a first-order model M0 for L0, along with a function which takes one operator K, one L-formula φ, and one M0-assignment s, and outputs either True or False (in which case we write M |= Kφ[s] or M ⊭ Kφ[s], respectively), such that:
  1. Whether or not M |= Kφ[s] does not depend on s(x) if x is not a free variable of φ.
  2. Whenever φ and ψ are alphabetic variants (by which we mean that one is obtained from the other by renaming bound variables in a way which is consistent with the binding of the quantifiers), then M |= Kφ[s] if and only if M |= Kψ[s].
  3. For variables x and y such that y is substitutable for x in Kφ, M |= Kφ(x|y)[s] if and only if M |= Kφ[s(x|s(y))].
  The definition of M |= φ[s] (and of M |= φ) for arbitrary L-formulas φ is obtained from this by induction. Semantic parts of Definition 4 extend to the base logic in the obvious ways.

The following are some standard axioms which any idealized knowing agent presumably should satisfy. Axioms E1-E3 below are taken from [6].

Definition 6. Suppose L is a language in the base logic, with an operator K. The axioms of knowledge for K in L consist of the following schemas, where φ, ψ vary over L-formulas.
– (E1) Any universal closure of Kφ whenever φ is tautological.
– (E2) Any universal closure of K(φ → ψ) → Kφ → Kψ.
– (E3) Any universal closure of Kφ → φ.
By the axioms of knowledge in L, we mean the set of axioms of knowledge for K in L, for all L-operators K.

For example, for operator K5, the corresponding E3 schema expresses the truthfulness of Agent 5, stating that whenever Agent 5 knows a fact φ (i.e., whenever K5φ is true in the model in question), then φ is true (in the model in question). The E1 schema for K5 essentially states that Agent 5 is smart enough to know tautologies. The E2 schema for K5 expresses that Agent 5's knowledge is closed under modus ponens: whenever Agent 5 knows φ → ψ and also knows φ, then Agent 5 knows ψ.

We will now formally define the language we spoke of intuitively above. The lack of the usual arithmetical symbols S, + and · might be surprising to mathematical logicians; we do not need those symbols. Their absence emphasizes that our results are independent of Gödel-style diagonalization (footnote 4).

Definition 7.
– Let LO be the language which has a constant symbol n for each n ∈ N, a unary predicate symbol O (intended as a predicate for the set O of ordinal notations), a binary predicate symbol W (we intend that W(x, y) be interpreted as x ∈ Wy where Wy is the yth computably enumerable set), and operators Ki for all i ∈ N.
– An LO-model M is standard if the following conditions hold:
  1. M has universe N.
  2. For each n ∈ N, M interprets n as n.
  3. M interprets O as O.
  4. M interprets W as the set of pairs (m, n) ∈ N^2 such that m ∈ Wn.

To understand the next definition, recall that in Definition 3 we defined the ordinal notation system O as the smallest set of naturals such that for every natural n, if Wn ⊆ O then n ∈ O. To say Wn ⊆ O is equivalent to saying that for every m ∈ N, if m ∈ Wn, then m ∈ O.

Definition 8. By the axiom of O, we mean the axiom ∀y(∀x(W(x, y) → O(x))) → O(y).

Footnote 4: To be clear, our results would still apply to agents who are aware of these arithmetical symbols, but our results do not require as much. Our most important results concern well-foundedness, which is a negative property (because it states a lack of infinite descending sequences), and so by weakening our language like this, we strengthen those results.

4 A measure of a mechanical knowing agent's intelligence

"Once upon a time, Archimedes was charged with the task of testing the strength of a certain AI. He thought long and hard but made no progress. Then one day, Archimedes took his brain out to wash it in a tub full of computable ordinals. When he put his brain in the tub, he noticed that certain ordinals splashed out. He suddenly realized he could compare different AIs by putting them in the tub and comparing which ordinals splashed out.
Archimedes was so excited that he ran through the city shouting 'Eureka!', without even remembering to put his brain back in his head." (Folktale, modified)

Although our intention is to define a measure of intelligence for one idealized mechanical knowing agent, all of our results will be about how this measure compares between different agents. For this reason, the following definition defines a system of knowing agents, rather than a single knowing agent. Of course, a single knowing agent can be thought of as being a system of knowing agents all of whom are equal to herself. The idea behind this definition is to identify a knowing agent with the set of that agent's knowledge in LO.

Definition 9. By a system of knowing agents, we mean a standard LO-model M satisfying the axioms of knowledge. If M is a system of knowing agents, we refer to the operators K1, K2, ... as the knowing agents of M. A knowing agent Ki of M is mechanical if {φ : φ is an LO-sentence and M |= Kiφ} is computably enumerable. If Ki is mechanical for all i ∈ N, we say M is a system of mechanical knowing agents.

We are now ready to define our measurement of the intelligence of an idealized mechanical knowing agent. This measure takes values from the computable ordinals (foreshadowed by [10]; also hinted at in [5]).

Definition 10. Let M be a system of mechanical knowing agents. For any knowing agent Ki of M, the intelligence ‖Ki‖ of Ki is the least ordinal α such that for all n ∈ N, if M |= KiO(n), then α > |n| (where |n| is the ordinal notated by n, see Definition 3).

In less formal language, Definition 10 says that ‖Ki‖ is the smallest ordinal bigger than all the computable ordinals that have codes that Ki knows to be codes of computable ordinals (footnote 5). Note that ‖Ki‖ > ‖Kj‖ does not necessarily imply that Ki knows everything Kj knows.

Lemma 1. For any knowing agent Ki of a system M of mechanical knowing agents, ‖Ki‖ exists and is a computable ordinal.
Footnote 5: This is similar to the way the strength of mathematical theories is measured in the area of proof theory [17].

Proof. Since M is a system of mechanical knowing agents, Ki is mechanical, so {φ : φ is an LO-sentence and M |= Kiφ} is computably enumerable. It follows that X = {n ∈ N : M |= KiO(n)} is computably enumerable. Since M satisfies the axioms of knowledge, in particular M |= KiO(n) → O(n) for all n ∈ N. Since M is standard, it follows that n ∈ O whenever M |= KiO(n). Altogether, X is a computably enumerable subset of O. Thus {|n| : n ∈ X} is a computably enumerable set of computable ordinals. It follows there is a computable ordinal α such that α is the least ordinal greater than |n| for all n ∈ X. By construction, α = ‖Ki‖. □

As promised in the introduction, we immediately obtain a well-founded structure on the class of idealized mechanical knowing agents.

Corollary 1. Let M be a system of mechanical knowing agents. There is no infinite sequence i1, i2, ... such that ‖Ki1‖ > ‖Ki2‖ > ···.

Proof. Immediate from the fact that there is no infinite strictly-decreasing sequence α1 > α2 > ··· of ordinals. □

5 Well-foundedness of knowledge hierarchies

It is remarkable that our intelligence measure (Definition 10) and Corollary 1 do not hinge on the agents in question actually having any idea what computable ordinals are. Our results apply perfectly well to knowing agents who have been programmed to know, e.g., "There is a certain set O, but I'm not going to tell you anything else about O, it might even be empty or all of N". If we merely require that the knowers know the axiom ∀y(∀x(W(x, y) → O(x))) → O(y) of O (Definition 8) (which still, in isolation, does not rule out any interpretations for O, since it does not rule out W being interpreted as empty), we can obtain a stronger well-foundedness result than Corollary 1.

Definition 11. Suppose M is a system of mechanical knowing agents.
The agents of M are said to have rudimentary knowledge of ordinals if for every i ∈ N, M |= Ki(∀y(∀x(W(x, y) → O(x))) → O(y)).

Definition 12. Let M be a system of mechanical knowing agents. Knowing agent Ki of M is said to totally endorse knowing agent Kj of M if the following conditions hold:
1. ("Ki knows the truthfulness of Kj") M |= KiΦ whenever Φ is any universal closure of Kjφ → φ.
2. ("Ki knows codes for Kj") For every formula φ with exactly one free variable x, there is some n ∈ N such that M |= Ki∀x(Kjφ ↔ W(x, n)).

In the above definition, since M is standard, M interprets W in the intended way, so the clause W(x, n) can be read: "x is in the nth computably enumerable set". Thus, the condition "Ki knows codes for Kj" can be glossed as follows: for every formula φ of one free variable x, Ki knows a code for a Turing machine which generates exactly those x for which Kj knows φ.

Reinhardt showed in [18] that a mechanical knowing agent cannot know its own truthfulness and know codes for itself (see also discussion in [1], [3], [6], [7]). Using our new terminology, Reinhardt's result can be rephrased as: an idealized mechanical knowing agent cannot totally endorse itself.

Theorem 1. Suppose M is a system of mechanical knowing agents whose agents have rudimentary knowledge of ordinals (Definition 11). If agent Ki of M totally endorses agent Kj of M, then ‖Ki‖ > ‖Kj‖.

Proof. Since Ki knows codes for Kj (Definition 12), in particular, there is some n ∈ N such that M |= Ki∀x(KjO(x) ↔ W(x, n)). Fix this n for the remainder of the proof.

Claim 1: M |= Ki∀x(W(x, n) → O(x)).
To see this, define the following sentences:
    Φ1 ≡ ∀x(KjO(x) → O(x))
    Φ2 ≡ ∀x(KjO(x) ↔ W(x, n))
    Φ3 ≡ ∀x(W(x, n) → O(x)).
Clearly Φ1 → Φ2 → Φ3 is tautological, so Ki(Φ1 → Φ2 → Φ3) is an axiom of knowledge (Definition 6, part E1). By repeated applications of E2 of Definition 6, it follows that KiΦ1 → KiΦ2 → KiΦ3 is a consequence of the axioms of knowledge.
Since Φ1 is a universal closure of KjO(x) → O(x), Condition 1 of Definition 12 says M |= KiΦ1. By choice of n, M |= KiΦ2. Since M satisfies the axioms of knowledge, this establishes M |= KiΦ3, proving Claim 1.

Claim 2: M |= Ki((∀x(W(x, n) → O(x))) → O(n)).
This is a given because it is exactly what it means for Ki to have rudimentary knowledge of ordinals (Definition 11).

Claim 3: M |= KiO(n).
To see this, define the following sentences:
    Ψ1 ≡ ∀x(W(x, n) → O(x))
    Ψ2 ≡ O(n).
By Claim 1, M |= KiΨ1. By Claim 2, M |= Ki(Ψ1 → Ψ2). By E2 of Definition 6, M |= Ki(Ψ1 → Ψ2) → KiΨ1 → KiΨ2. Having established the premises of the latter implication, we obtain its conclusion: M |= KiΨ2, proving Claim 3.

Armed with Claim 3, we are ready to finish the main proof. Let α = ‖Ki‖ and β = ‖Kj‖; we must show α > β. By Definition 10, β is the least ordinal such that for all m ∈ N, if M |= KjO(m), then β > |m|. By choice of n, M |= Ki∀x(KjO(x) ↔ W(x, n)). Since Ki is truthful (footnote 6), it follows that M |= ∀x(KjO(x) ↔ W(x, n)), so the set of m ∈ N such that M |= KjO(m) is the same as the set of m ∈ N such that M |= W(m, n), and since M is standard, this set is Wn. So β is the least ordinal greater than every element of {|m| : m ∈ Wn}, and hence β = |n| by the definition of O (Definition 3). By Definition 10, α is the least ordinal such that for all m ∈ N, if M |= KiO(m), then α > |m|. By Claim 3, M |= KiO(n), so α > |n| = β, as desired. □

An informal weakening of Theorem 1 has a short English gloss: "If A knows the code and the truthfulness of B, then A is more intelligent than B." This is a weakening because in order for A to know the code of B would require that A know a single Turing machine which enumerates all of B's knowledge, whereas Theorem 1 only requires that for each formula φ of one free variable x, A knows a Turing machine, depending on φ, that enumerates B's knowledge of φ.

It should be noted that Theorem 1 does not use the full strength of its hypotheses.
For example, it never uses the fact that Ki "knows its own truthfulness" (that is, that M |= KiΦ for every universal closure Φ of any formula Kiφ → φ), so the theorem could be strengthened to cover agents who doubt their own truthfulness. We avoided stating the theorem in its fullest strength in order to keep it simple.

It can be shown that the converse of Theorem 1 is not true: it is possible for one agent to be more intelligent than another agent despite the former not knowing the truthfulness and the code of the latter.

The following corollary, proved using our intelligence measure (Definition 10), does not itself directly refer to our intelligence measure, and should be of interest even to a reader who is completely uninterested in our intelligence measure.

Footnote 6: Ki is truthful because M satisfies the axioms of knowledge, one of which, E3, is an axiom which states that Ki is truthful.

Corollary 2. Suppose M is a system of mechanical knowing agents whose agents have rudimentary knowledge of ordinals. There is no infinite sequence of agents of M each one of which totally endorses the next.

Proof. If Ki1, Ki2, ... were an infinite sequence of agents of M, each one of which totally endorses the next, then by Theorem 1, each one of them would be more intelligent than the next. In other words, we would have ‖Ki1‖ > ‖Ki2‖ > ···, but this would contradict the well-foundedness of the ordinals. □

A weaker version of Corollary 2 debuted in the author's dissertation [2].

6 Application to less-idealized agents

Our results so far have been entirely restricted to idealized knowing agents, who occupy a timeless space at infinity, where they have had all eternity to indulge in introspection, totally isolated all that time, and still isolated, from all outside stimulus. This simplifying assumption makes structural results possible.
If we are willing to relax a little from the strict formality we have maintained so far, we can fruitfully speculate about what light these results shed on systems of less idealized agents.

Real-world agents interact with the world, making observations about it, perhaps receiving instructions from it. The agents might even receive rewards and punishments from the surrounding world. Based on these outside influences, the agents update their knowledge. Without further constraining them, it would be a mistake [20] to identify such real-world agents with their knowledge-sets.

In order to force such agents into conformity with the pure knowing agent idealization, it is necessary to take a drastic measure. We propose a thought experiment not unlike Searle's famous Chinese Room. Suppose we start with a collection of agents, say, Agent 1, Agent 2, and so on. We will perform a two-step process, whose steps are as follows:

1. Issue a special self-referential command to the agents. The command is:
   – Until further notice, do nothing but utter facts, namely: all the facts that you can think of, that you know to be true, expressible in the language LO, where each operator Ki is interpreted as the set of facts which Agent i would utter if Agent i were given this command and then immediately isolated from all outside stimulus.
2. As soon as the above command has been issued, isolate each agent from all outside stimulus (for example, by severing all the sensory inputs of all the agents).

The agents are not limited in what languages they use to come up with the above facts. An agent is free to take intermediate steps which cannot be expressed⁷ in LO, in order to arrive at facts which can be so expressed. Once the agent arrives at a fact in LO, the agent is commanded to utter that fact, even if the reasoning behind it is not so expressible. For example, an agent might combine non-LO facts like

1. "My math professor told me that the limit, called ε0, of the series of ordinals ω, ω^ω, ω^(ω^ω), . . ., is itself an ordinal"
2. "I trust my math professor"

and conclude O(n), where n is some canonical code for ε0. Intermediate steps like "My math professor told me..." are not to be uttered, unless they can be expressed in LO. Another example: Agent 4 might combine non-LO facts like

1. "Agent 5's math professor told him that ε0 is an ordinal"
2. "Agent 5 trusts his math professor"

and conclude K5O(n), where n is some canonical code for ε0. This does not necessarily allow Agent 4 to conclude O(n), if Agent 4 does not trust Agent 5.

⁷ Since, in the real world, there are some very intelligent people who do not even know what the ordinal numbers are, one might wish to modify the self-referential command in the thought-experiment to include some instruction about the definition of ordinal numbers.

Some of the agents in question might misbehave. An agent might immediately utter the statement 1 = 0 just out of spite (or out of anger at having its sensory inputs severed). An agent might become catatonic and not utter anything at all. An agent might defiantly utter things not in the language LO (for example, angry demands to have its sensory inputs restored). Some agents might not close their knowledge under modus ponens: an otherwise well-behaved agent might utter A, and utter A → B, but never get around to uttering B, perhaps due to memory limitations or, again, despondency at having its senses blinded. It might even be that an agent who wants to behave accidentally trusts an agent who does not want to behave. If there is some n ∉ O such that the former agent determines that the latter agent would assert O(n), then the former might itself assert O(n), and thereby be infected by error.

Of the poorly-behaved agents, there is little we can say. But as for the well-behaved agents, we can assign them ordinals using our intelligence measure (Definition 10).
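To make the idea of assigning ordinals concrete, the following is a minimal Python sketch, not part of the formal development, of a notation system for ordinals below ε0 in Cantor normal form. It is only an illustrative stand-in for the codes of computable ordinals used in this paper: the tuple encoding and the names `less`, `omega_power`, and `tower` are assumptions introduced here. The sketch compares the ordinals ω, ω^ω, ω^(ω^ω) from the example above.

```python
from typing import Tuple

# Assumed toy encoding (not the paper's codes): an ordinal below ε0 in Cantor
# normal form is a tuple of exponents (e1, ..., ek) with e1 >= ... >= ek,
# standing for ω^e1 + ... + ω^ek. Each exponent is itself such a tuple.
Ordinal = Tuple["Ordinal", ...]

ZERO: Ordinal = ()        # the empty sum
ONE: Ordinal = (ZERO,)    # ω^0
OMEGA: Ordinal = (ONE,)   # ω^1


def omega_power(exponent: Ordinal) -> Ordinal:
    """Return the notation for ω^exponent."""
    return (exponent,)


def less(a: Ordinal, b: Ordinal) -> bool:
    """Strict ordinal comparison: compare exponents term by term, then length."""
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    # All shared terms agree, so the shorter sum is the smaller ordinal.
    return len(a) < len(b)


def tower(height: int) -> Ordinal:
    """tower(0) = 1, tower(1) = ω, tower(2) = ω^ω, tower(3) = ω^(ω^ω), ..."""
    result = ONE
    for _ in range(height):
        result = omega_power(result)
    return result


# The towers form a strictly increasing sequence.
increasing = all(less(tower(h), tower(h + 1)) for h in range(5))
```

Every `tower(h)` lies below ε0, which is the limit of the towers and so has no notation in this particular scheme. An agent that asserts O(n) for a code n of ε0 is therefore relying on a richer notation system than this toy one.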
To be more precise, at any particular moment t in time, we could perform the experiment, obtain a subset (depending on t) of well-behaved agents, and assign each well-behaved agent an ordinal (depending on t). This is necessary because, up until we perform the experiment, the agents can update their knowledge based on observations about the outside world.

7 Application to intelligence explosion

"Sons are seldom as good men as their fathers; they are generally worse, not better." - Homer

There has been much speculation about the possibility of a rapid explosion of artificial intelligence. The reasoning is that if we can create an artificial intelligence sufficiently advanced, that system might itself be capable of designing artificial intelligence systems. Explosion would occur if one system were able to design an even more intelligent one, which could then design an even more intelligent one, and so on. See [15]. I will argue that our results suggest skepticism: intelligence explosion, if not ruled out, is at least not a foregone conclusion of sufficiently advanced artificial intelligence.

Suppose S1 is an intelligent system, and S1 designs another intelligent system S2. The fact that S1 designs S2 strongly suggests that S1 knows the code of S2 (more on this later). And if the goal of S1 is that S2 should be highly intelligent, then in particular S1 should design S2 in such a way that S2 does not believe falsehoods to be true (at least not mathematical falsehoods). But if S1 knows S2's mathematical knowledge is truthful, and S1 knows the code of S2, then S1 totally endorses S2, in the sense that if we apply the procedure from the previous section to reduce S1 and S2 to knowing agents A1 and A2 respectively, then A1 totally endorses A2 (Definition 12)⁸. And Theorem 1 tells us that whenever A1 totally endorses A2, ‖A1‖ > ‖A2‖.
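The inequality ‖A1‖ > ‖A2‖ can be illustrated with a toy model. The sketch below is a simplification under stated assumptions, not the paper's construction: an agent is reduced to a finite set of ordinals below ω², each written as a pair (a, b) standing for ω·a + b, and `level` and `designer_of` are hypothetical helper names. For a finite set, the least ordinal strictly greater than every member is just the successor of the maximum, which is far simpler than the general case covered by Definition 10.

```python
from typing import FrozenSet, Tuple

# Assumed toy encoding: the pair (a, b) stands for the ordinal ω·a + b.
# Python's tuple ordering on such pairs agrees with the ordinal ordering.
Ordinal = Tuple[int, int]


def level(known: FrozenSet[Ordinal]) -> Ordinal:
    """Least ordinal strictly greater than every ordinal in a finite set."""
    if not known:
        return (0, 0)
    a, b = max(known)
    return (a, b + 1)  # successor of the largest known ordinal


def designer_of(child_known: FrozenSet[Ordinal]) -> FrozenSet[Ordinal]:
    """Knowledge of a hypothetical designer that totally endorses the child.

    Toy assumption: the designer accepts every ordinal the child knows and,
    mirroring Claim 3 of the proof of Theorem 1, also knows the child's level.
    """
    return child_known | {level(child_known)}


child = frozenset({(0, 3), (1, 0)})   # knows 3 and ω
parent = designer_of(child)
grandparent = designer_of(parent)

# level(child) == (1, 1), level(parent) == (1, 2), level(grandparent) == (1, 3)
levels = [level(grandparent), level(parent), level(child)]
```

In this toy model each designer sits strictly above the system it designs, so reading a chain of designs from designer to designed gives strictly decreasing levels.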
This suggests that under these assumptions it is impossible for even one intelligent system to design a more intelligent system, much less for intelligence explosion to occur. Even if the reader does not accept that our measure truly captures intelligence, the well-foundedness of total endorsement (Corollary 2) still applies, telling us that this scenario cannot be repeated (with S2 designing S3, which designs S4, and so on) indefinitely (else the corresponding agents A1, A2, A3, . . . would, by the above reasoning, have the property that A1 totally endorses A2, who totally endorses A3, and so on forever, contradicting Corollary 2). This still seems to disprove, or at least severely limit, the possibility of intelligence explosion, even to an audience that disagrees with our intelligence measure.

The reader might point out that S1 only knows the code of S2 at the moment of S2's creation. After S2 is created, S2 might augment its knowledge based on its interactions with the world around it. But by the discrete nature of machines, at any particular point in time, S2 will only have made finitely many observations about the outside world. We could modify the procedure in the previous section: before commanding S1 and S2 to enumerate LO-expressible facts, we could simply inform S1 exactly which observations S2 has made up until then.

Unwrapping definitions, the argument can be glossed informally: "If an intelligent machine S1 were to design an intelligent machine S2, presumably S1 would know the code and mathematical truthfulness of S2. Thus, S1 could infer that the following is a computable ordinal (and infer a code for it): 'the least ordinal bigger than every computable ordinal α such that α has some code n such that S2 knows n is a code of a computable ordinal'. Thus, S1 would necessarily know a computable ordinal bigger than all the computable ordinals S2 knows.
This suggests S2 would necessarily be less intelligent than S1, at least assuming that more intelligent systems know computable ordinals at least as large as those known by less intelligent systems. Even without that assumption, since there is no infinite descending sequence of ordinals, this argument still suggests the process of one intelligent machine designing another cannot go on indefinitely."

⁸ Anticipated by Gödel [9], who said: "For the creator necessarily knows all the properties of his creatures, because they can't have any others except those he has given them."

Intelligence explosion is not entirely ruled out, if designers of machines are allowed to collaborate. If S and T are intelligent systems, it is possible that S and T could collaborate to create a child intelligent system U in the following way: S contributes source code for one part of U, but keeps that source code secret from T. T contributes source code for the remaining part of U, but keeps that source code secret from S. Then neither S nor T individually knows the full source code of U, so the argument in this section does not apply, and it is, at least a priori, possible for U to be more intelligent than S and T. This seems to hint at a possible Knight-Darwin Law for artificial intelligence. The Knight-Darwin Law [8] is a biological principle stating (in modernized language) that it is impossible for there to be an infinite chain x_1, x_2, . . . of organisms such that each x_i asexually produces x_{i+1}.

Acknowledgments

We acknowledge Alessandro Aldini, Pierluigi Graziani, and the anonymous reviewers for valuable feedback and improvements on this manuscript. We acknowledge José Hernández-Orallo, Marcus Hutter, and Peter Koellner for helpful pointers to literature references. We acknowledge Arie de Bruijn, Timothy J. Carlson, D.J. Kornet, and Stewart Shapiro for comments and discussion about earlier embryonic versions of certain results in this paper.

References

1.
Aldini, A., Fano, V., & Graziani, P. (2015). Theory of Knowing Machines: Revisiting Gödel and the Mechanistic Thesis. In Gadducci, F., and Tavosanis, M. (Eds.), History and Philosophy of Computing. HaPoC 2015. IFIP Advances in Information and Communication Technology, vol 487 (pp. 57–70). Springer, Cham.
2. Alexander, S. A. (2013). The theory of several knowing machines. Doctoral dissertation, The Ohio State University.
3. Alexander, S. A. (2014). A machine that knows its own code. Studia Logica, 102(3), 567–576.
4. Alexander, S. A. (2015). Fast-collapsing theories. Studia Logica, 103(1), 53–73.
5. Alexander, S. A. (2018). Mathematical shortcomings in a simulated universe. The Reasoner, 12(9), 71–72.
6. Carlson, T.J. (2000). Knowledge, machines, and the consistency of Reinhardt's strong mechanistic thesis. Annals of Pure and Applied Logic, 105(1–3), 51–82.
7. Carlson, T.J. (2016). Collapsing Knowledge and Epistemic Church's Thesis. In Horsten, L., and Welch, P. (Eds.), Gödel's Disjunction: The Scope and Limits of Mathematical Knowledge (pp. 129–148). Oxford University Press.
8. Darwin, F. (1898). The Knight-Darwin Law. Nature, 58, 630–632.
9. Gödel, K. (1951). Some basic theorems on the foundations of mathematics and their implications. In Feferman, S., Dawson, J. W., Goldfarb, W., Parsons, C., and Solovay, R. M. (Eds.), Collected Works, Volume III: Unpublished Essays and Lectures (pp. 304–323). New York and Oxford: Oxford University Press.
10. Good, I. J. (1969). Gödel's Theorem is a Red Herring. The British Journal for the Philosophy of Science, 19(4), 357–358.
11. Hardy, G. H. (1904). A theorem concerning the infinite cardinal numbers. Quarterly Journal of Mathematics, 35, 87–94.
12. Hernández-Orallo, J., & Dowe, D. L. (2010). Measuring universal intelligence: Towards an anytime intelligence test. Artificial Intelligence, 174(18), 1508–1539.
13. Hibbard, B. (2011). Measuring agent intelligence via hierarchies of environments. In International Conference on Artificial General Intelligence (pp. 303–308). Springer, Berlin, Heidelberg.
14. Legg, S., & Hutter, M. (2007). Universal intelligence: A definition of machine intelligence. Minds and Machines, 17(4), 391–444.
15. Hutter, M. (2012). Can intelligence explode? Journal of Consciousness Studies, 19(1–2), 143–166.
16. Kleene, S. C. (1938). On Notation for Ordinal Numbers. The Journal of Symbolic Logic, 3(4), 150–155.
17. Pohlers, W. (2009). Proof theory: An introduction. Springer.
18. Reinhardt, W. (1985). Absolute versions of incompleteness theorems. Noûs, 19(3), 317–346.
19. Shapiro, S. (1998). Incompleteness, Mechanism, and Optimism. The Bulletin of Symbolic Logic, 4(3), 273–302.
20. Wang, P. (2007). Three fundamental misconceptions of artificial intelligence. Journal of Experimental & Theoretical Artificial Intelligence, 19(3), 249–268.