STRUCTURAL LEARNING II. Issues and Approaches
JOSEPH M. SCANDURA, EDITOR
GORDON AND BREACH, SCIENCE PUBLISHERS
NEW YORK. LONDON. PARIS

Copyright © 1976 by Gordon and Breach Science Publishers Inc., One Park Avenue, New York, N. Y. 10016, U. S. A.
Editorial office for the United Kingdom: Gordon and Breach Science Publishers Ltd., 42 William IV Street, London W. C. 2, England
Editorial office for France: Gordon & Breach, 7-9 rue Emile Dubois, 75014 Paris, France
Library of Congress catalog card number 75-34846
ISBN 0-677-15110-1.
All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.
Printed in Great Britain by Bell & Bain Ltd., Glasgow

TWO THEORIES OF PROOF
JOHN CORCORAN

There was, until very lately, a special difficulty in the principles of mathematics. It seemed plain that mathematics consists of deductions, and yet the orthodox accounts of deduction were largely or wholly inapplicable to existing mathematics. Not only the Aristotelian syllogistic theory, but also the modern doctrines of Symbolic Logic, were either theoretically inadequate to mathematical reasoning, or at any rate required such artificial forms of statement that they could not be practically applied.
--Russell

This part of the series has a dual purpose. In the first place we will discuss two kinds of theories of proof. The first kind will be called a theory of linear proof. The second has been called a theory of suppositional proof.
The term "natural deduction" has often and correctly been used to refer to the second kind of theory, but I shall not do so here because many of the theories so-called are not of the second kind--they must be thought of either as disguised linear theories or theories of a third kind (see postscript below). The second purpose of this part is [25] to develop some of the main ideas needed in constructing a comprehensive theory of proof. The reason for choosing the linear and suppositional theories for this purpose is that the linear theory includes only rules of a very simple nature, and the suppositional theory can be seen as the result of making the linear theory more comprehensive.

1. THEORIES OF LINEAR PROOF

A theory of linear proof is a theory of proof which holds that proofs have a certain simple structure which can be metaphorically called linear.[26] As will be obvious shortly, such theories can be quite plausible a priori but, of course, the comprehensiveness of a theory of proof is an empirical matter. As usual, what is plausible a priori turns out to be a gross oversimplification of reality. Because theories of linear proof are simple and (often) plausible it is perhaps remarkable that the first theory of proof was actually suppositional in nature (cf. my "A Mathematical Model of Aristotle's Logic"). However, theories of linear proof are quite old, tracing their common history at least as far back as Boole. Indeed, Boole (pp. 142, 143) was so clear about his own theory that his description of it can still serve as a concise introduction to the general topic.

All demonstration essentially consists of the deduction of conclusions from premises,--a conclusion once deduced being itself admissible as a premiss. And it is in this order that reasoning usually proceeds. Certain premises are laid down, either from experience or from testimony, or from some other extra logical source; from these are deduced conclusions which simply or combined with other premises derived from the same class of sources as those first given, serve as bases for further inference, until the chain of argument is completed. At any stage of the process we may find ourselves dealing with two sorts of data, viz., such as have been deduced in the previous course of argument from given data, and such as have not before appeared. A very slight examination of any actual specimen of demonstrative reasoning will show that such are the materials of its composition and such the order of its progress.

[25] See par. 2 of the second article in this series.

[26] A system of logic need not contain a theory of proof. There are other purposes for constructing such systems. For example, a logician may be concerned to codify consequences of sets of premises without even considering the problem of describing proofs per se. A system designed for such a purpose was called a consequence system in Corcoran (1969). Many consequence systems are systems of formal deductions which would be theories of linear proof were they put forth as theories of proof--which they usually are not.

In more modern terminology we can say that a linear proof of a conclusion c from a set P of premises is a sequence of lines, beginning with a list of all or some of the premises and such that each subsequent line is derived immediately from premises and/or previously proved lines and, finally, ending with c. In other words a linear proof of c from P is written linearly in a column, say, beginning with the premises P at the top and proceeding step-by-step through intermediate conclusions derived from P and ending with c, the final conclusion.
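The definition just given can be put in executable form. What follows is only a minimal sketch in Python and is not part of the original text. It assumes that sentences are plain strings, that the only logical axioms are identities of the form 't = t', and that the only immediate inference rule is a crude, string-based substitution of equals that replaces a single occurrence. The names Line, is_identity, replace_once, justified and is_linear_proof_of are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Set


@dataclass(frozen=True)
class Line:
    sentence: str             # e.g. "a = b"; a toy stand-in for a parsed formula
    is_premise: bool = False  # True when the line is marked as a premise


def is_identity(sentence: str) -> bool:
    """Assumed logical axiom: 't = t' for any term t."""
    parts = sentence.split(" = ")
    return len(parts) == 2 and parts[0] == parts[1]


def replace_once(p: str, t: str, s: str) -> Set[str]:
    """All strings obtained from p by replacing exactly one occurrence of t by s."""
    results = set()
    start = p.find(t)
    while start != -1:
        results.add(p[:start] + s + p[start + len(t):])
        start = p.find(t, start + 1)
    return results


def justified(sentence: str, earlier: Sequence[str]) -> bool:
    """True if the sentence is a logical axiom or follows immediately
    from earlier lines by the toy substitution rule."""
    if is_identity(sentence):
        return True
    for eq in earlier:
        if eq.count(" = ") != 1:
            continue
        t, s = eq.split(" = ")
        for p in earlier:
            if sentence in replace_once(p, t, s) or sentence in replace_once(p, s, t):
                return True
    return False


def is_linear_proof_of(lines: Sequence[Line], premises: Set[str], conclusion: str) -> bool:
    """Check the definition: every line is a premise in P, a logical axiom,
    or derived from earlier lines, and the last line is the conclusion c."""
    earlier = []
    for line in lines:
        if line.is_premise:
            if line.sentence not in premises:
                return False
        elif not justified(line.sentence, earlier):
            return False
        earlier.append(line.sentence)
    return bool(lines) and lines[-1].sentence == conclusion


proof = [Line("a = b", True), Line("b = c", True), Line("a = c")]
print(is_linear_proof_of(proof, {"a = b", "b = c"}, "a = c"))  # True
```

Because the check looks only at earlier lines and never revises them, every initial segment of an accepted sequence is also accepted as a proof of its own last line, which is the sense in which such proofs are linear. String matching is of course no substitute for a real sentential grammar; it is used here only to keep the sketch short.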
This is the basic idea of a linear proof; in practice things are a little more complicated, but the following general statement always holds--in a linear proof from premises P to conclusion c each sentence in the proof is a logical consequence of P. (The reader should note that the concept of logical consequence as defined above is not relative to any system of proof.)

There are three minor modifications to be made to the above loose account of linear proofs. The first is that for clarity the premises shall be marked as such to make it clear that they are not asserted to follow from any sentences which they may happen to follow. The second is that in some systems premises may be written at any place in the proof, not just at the top.[27] Finally, in addition to assumptions and inferences, properly so-called, all linear systems of proof permit the writing of so-called logical axioms at any point in a proof. For example, in writing proofs in algebra we often have occasion to write logical identities, t = t, in proofs. Corresponding to this we would have a logical axiom rule which permits any proof to be lengthened by addition of a logical identity. For another example, in setting up a "proof by cases" we often write in proof lines of the form 'p or not-p' where p is a sentential formula. Corresponding to this we would have a logical axiom rule which permits any proof to be lengthened by addition of "excluded middle formulas." These two are probably the most prominent logical axiom rules.[28]

Below we will mark premises with a plus sign. Thus '+p' marks p as a premise. The following rules give a (non-comprehensive) theory of the proofs found in the abstract algebra of equations--i.e., the so-called equational algebras wherein all sentences are either equations or universal generalizations of equations. Following the statement of the rules we will give a proof of the theorem (x)(x = x⁻¹⁻¹) [every element is identical to the inverse of its own inverse] from the group axioms.

Rule Set A

Initial String Rule (Kernel Rule)

(1) Premises: A finite sequence of sentences each prefixed by + is a proof.[29]

Production Rules

(2) Identity Law: any proof may be lengthened by addition of any logical identity, (t = t) where t is a constant term.

(3) Substitution of "Equals": any proof containing (t = s) and also p may be lengthened by adding p' where p' is the result of replacing occurrences of t in p by s and/or vice versa.

(4) Instances: any proof containing (v)p(v) may be lengthened by adding p(t)--where v is a variable and t is a term composed of constants.

(5) Generalizations: any proof containing p(d), d a dummy [30] constant, may be lengthened by addition of (v)p(v) provided that no assumptions concern d (i.e., provided d is "arbitrary").

(6) Repetition: any proof may be lengthened by repeating any previous line dropping a '+' if it occurs.

[27] The motivation for allowing new premises to be introduced in the course of a linear proof may be the observation that one often tries to get a conclusion from only some of the available premises but then discovers in the course of constructing the proof that others are needed. From the present point of view this observation is irrelevant and taking cognizance of it may lead to an incorrect theory. Our purpose is not to describe how paths of reasoning emerge in thought but rather to describe how they are described once found. It seems to be a general fact that in actual proofs where premises are made explicit at all they are put down at the beginning. If this is so then any theory which fails to take account of it is, strictly speaking, incorrect regardless of how valid it may be on other grounds.

[28] One way of characterizing the difference between normal reasoning and the so-called Hilbert-type systems of deduction is to note that in the former there are few logical axiom rules but many inference rules whereas in the latter there are commonly many logical axiom rules but few (usually one or two) inference rules. (Cf. Thomason, chapters III, IV, V, esp. pp. 62-63.)

[29] Note that for purely heuristic reasons we have tacitly been using the term "proof" in such a way that a partial proof is counted as a proof; thus a finished proof will be a "proof" which satisfies some additional conditions. This issue will be discussed in more detail below. See especially the discussion of "developments" of axiomatic theories in Section 5.

Obviously, each of the above rules corresponds exactly to a rule commonly used in proofs in algebra. Notice however that there are commonly used rules which do not appear in the list. For example, the only way of instantiating here is by rule 4 and this permits the elimination of quantifiers only one per application. This will be an annoying deficiency. Similarly for generalizations. Another deficiency is that substitutions can be done using only one equation at a time. In the proof below we have starred the lines that would remain were the deficiencies eliminated.
    + (x)(y)(z)(x.(y.z) = (x.y).z)       *
    + (x)(x.1 = x)                       *
    + (x)(1.x = x)                       *
    + (x)(x.x⁻¹ = 1)                     *
    + (x)(x⁻¹.x = 1)                     *
      (y)(z)(a.(y.z) = (a.y).z)
      (z)(a.(a⁻¹.z) = (a.a⁻¹).z)
      a.(a⁻¹.a⁻¹⁻¹) = (a.a⁻¹).a⁻¹⁻¹      *
      a⁻¹.a⁻¹⁻¹ = 1                      *
      a.a⁻¹ = 1                          *
      a.1 = (a.a⁻¹).a⁻¹⁻¹
      a.1 = 1.a⁻¹⁻¹                      *
      a.1 = a                            *
      a = 1.a⁻¹⁻¹
      1.a⁻¹⁻¹ = a⁻¹⁻¹                    *
      a = a⁻¹⁻¹                          *
      (x)(x = x⁻¹⁻¹)                     *

Having a more powerful instantiating rule would permit going from the associative law directly to the first unquantified line--skipping two lines. The other two unstarred lines would be skipped by doing two substitutions at a time.

Incidentally, the above rule set (or discourse grammar) describes proofs but it does not make explicit what "a proof of c from P" is. Naturally, we define a proof to be a proof of c from P if c is the last line of the proof and all premises in the proof are in P. The above example is a proof of (x)(x = x⁻¹⁻¹) from the group axioms.

As the rule set is being used here, the (metalinguistic) symbols p, p(t), p(v), and p(d) refer to formulas in the language of groups. Thus this set of rules presupposes a sentential grammar for the language of groups.[31] However, if we interpreted the symbols as referring to formulas in the arithmetic language, then we could use Rule Set A for the theory of proof needed to complete the Partial Grammar of the Arithmetic Language given at the end of the first article. This would actually be a bit silly for two reasons: first, the Partial Grammar has no quantifiers so rules 4 and 5 would never apply; second, the Partial Grammar does have the logical connectives whereas none of the rules permit any inferences involving connectives. The point, therefore, is not that the Partial Grammar would be finished but rather that the reader can now see what a finished grammar would be like.

The respective natures of an alphabet, a rule set for words, a rule set for phrases, and a rule set for sentences are already clear from the Partial Grammar. Now we have also seen a discourse grammar which describes or produces a certain set of proofs. This discourse grammar, Rule Set A, is a theory of proof.

Rule Set A is obviously a correct theory of proof--each of its rules corresponds exactly to (or is) an actual rule of inference that we have all used when doing proofs in elementary group theory. Rule Set A is obviously not comprehensive in the sense that I have defined the term because, e.g., it lacks the complex rules alluded to above which permit the unstarred lines to be omitted. However, it is complete in a certain sense.[32]

In any theory of proof which describes or produces only linear proofs, it is possible to give a very simple description of all proofs from a particular set P of premises to a particular conclusion, c. Given a definition of the logical axioms and the rules one can then say: _a proof of c from P is a finite sequence of lines ending with c, each line of which either is an assumption in P or is a logical axiom or is obtained from previous lines by a rule._ The underlined expression (or rather an even simpler version of it) has become a slogan and, sometimes, a battlecry. One eminent logician related to me that when he first heard this slogan presented he was struck by its simplicity and truth and was moved to say to himself, "By God, that is what proofs are!"

If one takes the slogan as a rough description of all proofs, then one is led (1) to distinguish three kinds of rules of inference and (2) to believe that all rules of inference must be of one of the three kinds. The first kind contains only the rule of assumption--essentially to the effect that an assumption may be written to start (or to lengthen) any proof provided that it is marked as an assumption. The second kind contains all logical axiom rules--to the effect that a logical axiom may be written to lengthen any proof. The third kind contains all immediate inference rules: rules which state that any proof containing one or two (or some fixed finite number of) sentences of certain specified forms may be lengthened by adding a sentence in another form.

[30] Use of the term "dummy" is redundant here; a dummy constant is simply one which does not occur in the premises. Usually the constants occurring in premises are given special symbolization: '0,' '1,' 'n,' 'e,' etc.; whereas dummies are indicated by 'a,' 'b,' 'c,' 'd,' or by variables subscripted with a '0,' e.g., x₀. Incidentally, Thomason (chapter IX, esp. p. 183) does not class his rule of generalization with the immediate inference rules. His rule of generalization is sound but, in my opinion, it does not correspond to actual reasoning as closely as does the present rule.

[31] The possibility of obtaining a correct theory of (symbolic) proof depends on having a "correct" symbolic sentential grammar to begin with. Indeed, finding "natural reasoning" blocked by restrictions dictated by peculiarities of the sentential grammar can indicate need for revision of the latter. For example, in the otherwise correct theory of symbolic proof given by Resnik (1970), every proof of Fyy from (x)(y)Fxy involves getting a generalization of Fyy as an intermediate step because of the need to avoid "capturing." Similar situations are common. However, it is possible to design the symbolic language in such a way as to make "capturing" grammatically impossible. This makes it unnecessary to add special restrictions on the rules. Once the symbolic language is thus revised, as in Lemmon (1965), as an unexpected advantage one finds that intrinsically awkward symbolic sentences are eliminated without loss of expressive power.

[32] A theory of proof for a particular language is called equationally complete when the following holds: given any set P of equational sentences (either equations properly so-called or universal generalizations thereof) and any single equational sentence c, if c is a logical consequence of P, then there is a proof of c from P constructible by the rules of the theory. Rule Set A is equationally complete. This fact will be plausible to any reader who understands it. To the other readers the following remarks are addressed. Let P be the axioms for groups. Let c be any equational sentence written in the language of groups and which is true in all groups. c, then, is a logical consequence of P; since (1) a group is by definition any mathematical system in which the axioms of groups are true and (2) to say that c is a logical consequence of P is to say that c is true in any mathematical system which makes all of the sentences in P true. The above-mentioned completeness condition implies, then, that by using Rule Set A one can construct a proof starting with P as assumptions (as in the example) and ending with c. In fact, such a proof can be gotten by lengthening the one given as a sample. The equational completeness of Rule Set A was proved several years ago by Jan Kalicki and Dana Scott (1955).

2. IMMEDIATE RULES AND SUBSIDIARY PROOF RULES

It so happens that by surveying the proofs in the mathematical literature (or by looking at our own proofs) we find many rules that are not of any of the above three kinds. Indeed, if all rules were of the three above kinds then there would be no room in mathematical reasoning for making subsidiary assumptions. Much of the most elegant and enlightening reasoning in mathematics turns on the ability to imagine good subsidiary assumptions. Below are some examples. (1) In proving that the square root of two is not rational, we assume, in addition to the axioms of arithmetic, the subsidiary assumption that the square root of two is rational.[33] (2) In proving the right cancellation law [(x)(y)(z)((x.z = y.z) ⊃ x = y)] from the group axioms, we assume, in addition to the group axioms, that a.d = b.d where a, b and d are arbitrarily chosen but fixed elements of the group. (3) Whenever we give proofs by cases after we have proved that there are two cases, say, we assume that the first case holds and then prove our theorem in that case, then we assume the second case and prove our theorem in that case--finally we conclude that the theorem holds in general.

In each of these three examples the proof involves making subsidiary assumptions, assumptions other than those from which the conclusion is shown to follow. At some point in each of these examples an inference is made not from certain previous lines in a proof but rather from (or on the basis of) a certain part of the proof. In other words, there are rules which can be stated as follows: any proof containing a subsidiary proof of a certain form may be extended by adding p. For example, in reductio reasoning we are following the rule: any proof containing a subsidiary proof beginning with p and containing a contradiction may be extended by adding -p (not-p).

A subsidiary proof begins with a subsidiary assumption, a "new" assumption made for purposes of reasoning. The subsidiary assumption is marked with a "beginning" corner bracket '⌜'. Thus '⌜p' may be read "for purposes of reasoning suppose p" or simply "suppose p." When the subsidiary reasoning is completed one adds a "closing" or "ending" corner bracket '⌞' to the last line. Each time an ending bracket is added it is matched with the last beginning bracket not yet matched. The latter is always on the line containing the supposition which begins the subsidiary proof in question. Thus a subsidiary proof may be defined as a section of a proof enclosed in matching brackets. The details, if not already clear, will be so after considering a couple of examples.

Two paragraphs back we stated the reductio rule. We now give as an example an indirect (reductio) proof of -(x)-(x = x⁻¹) [not every element is different from its own inverse] from the group axioms.

    + (x)(y)(z)((x.(y.z)) = ((x.y).z))
    + (x)(x.1 = x)
    + (x)(1.x = x)
    + (x)(x.x⁻¹ = 1)
    + (x)(x⁻¹.x = 1)
    ⌜ (x)-(x = x⁻¹)
      -(1 = 1⁻¹)               *
      1.1⁻¹ = 1                      subsidiary proof
      1.1⁻¹ = 1⁻¹
    ⌞ 1 = 1⁻¹                  *
      -(x)-(x = x⁻¹)

The subsidiary proof is enclosed in matching brackets. The contradiction in question is "between" the starred lines. Notice that the conclusion is inferred to follow from the group axioms (not from all assumptions) on the basis of the subsidiary proof. Once a subsidiary proof is marked off by an ending bracket (⌞), it must be regarded as an isolated, separate unit in the proof. In particular, one may no longer apply any of the immediate inference rules to lines inside of the subsidiary proof. For example, we could not write down as a next line -(1 = 1⁻¹) by repetition because this does not follow from only the group axioms.

Let us use the phrase 'subsidiary proof rule' to refer to rules which permit the lengthening of a proof on the basis of a subsidiary proof. Of course, the most notorious of subsidiary proof rules is the rule of conditionalization which permits inference of 'if p then q' on the basis of a subsidiary proof beginning with p and ending with q. We will give a proof of the right cancellation law from the group axioms to illustrate this. (In the proofs below we do not necessarily follow Rule Set A but use other commonly known rules as well.)

    + (x)(y)(z)((x.(y.z)) = ((x.y).z))
    + (x)(x.1 = x)
    + (x)(1.x = x)
    + (x)(x.x⁻¹ = 1)
    + (x)(x⁻¹.x = 1)
    ⌜ a.d = b.d
      (a.d).d⁻¹ = (b.d).d⁻¹          subsidiary proof
      a.(d.d⁻¹) = b.(d.d⁻¹)
    ⌞ a = b
      (a.d = b.d) ⊃ (a = b)
      (x)(y)(z)((x.z = y.z) ⊃ x = y)

It will be valuable to notice that in proofs by cases more than one subsidiary proof is needed--one for each case. Actually, all proofs-by-cases rules are "combinations" of the two-case rule stated as follows: any proof containing 'c1 or c2', together with two subsidiary proofs, one beginning with c1 the other beginning with c2, both ending with c, can be extended by adding c. To illustrate this we will give a proof of the two-sided cancellation law. The proof will involve one application of the two-case rule inside of a subsidiary proof on which conditionalization is used.

    + (x)(y)(z)((x.(y.z)) = ((x.y).z))
    + (x)(x.1 = x)
    + (x)(1.x = x)
    + (x)(x.x⁻¹ = 1)
    + (x)(x⁻¹.x = 1)
    ⌜ ((a.d = b.d) ∨ (d.a = d.b))
    ⌜ a.d = b.d
      (a.d).d⁻¹ = (b.d).d⁻¹          first subsidiary proof
      a.(d.d⁻¹) = b.(d.d⁻¹)
    ⌞ a = b
    ⌜ d.a = d.b
      d⁻¹.(d.a) = d⁻¹.(d.b)          second subsidiary proof
      (d⁻¹.d).a = (d⁻¹.d).b
    ⌞ a = b
    ⌞ a = b                                        -cases rule*
      ((a.d = b.d) ∨ (d.a = d.b)) ⊃ a = b          -conditionalization**
      (x)(y)(z)(((x.z = y.z) ∨ (z.x = z.y)) ⊃ x = y)

The notations on the right are designed to help the reader see exactly where and how the two subsidiary proof rules are applied (* and **).

Before we proceed to a discussion of theories of suppositional proof (theories involving subsidiary proof rules), the reader should note that the above three proofs are not linear because the subsidiary assumptions are not among the premises from which the proof proceeds and neither are they consequences of the premises. That is, for example, in the proof of the cancellation law from the group axioms there are sentences which are not logical consequences of the group axioms. Thus in these proofs we do not reason in a linear fashion--we take "side trips."

[33] For a wide-ranging discussion of this particular proof in the general context of a concern with the history and the soundness of indirect reasoning see Cauman (1966).

3. THEORIES OF SUPPOSITIONAL PROOF

The defining characteristic of a theory of suppositional proof is that the rules permit the use of subsidiary assumptions which are later "discharged" and are not among the assumptions from which the final conclusion is shown to follow. These rules are subsidiary proof rules which countenance an inference not from previous lines but rather on the basis of a subsidiary proof. Such rules are not unusual but rather they comprise the essence of clear, elegant mathematical reasoning. Indeed, I think the mathematically experienced reader will agree that linear proofs have a very computational flavor to them, whereas suppositional proofs seem to embody more creative and enlightening reasoning.

There are a few questions concerning the formulation of suppositional rules which might have been annoying some readers. I will digress slightly at this point to take up some of them.

In the first place we must give an explicit rule for adding subsidiary assumptions: rule of supposition--any proof can be lengthened by addition of a formula prefixed by a beginning bracket. Secondly the following rule explicitly accounts for introduction of closing brackets: closing rule--any proof containing more beginning brackets than ending brackets can be modified by affixing a closing bracket to its last line. The idea is that each supposition line, ⌜p, starts a subsidiary proof and that each subsidiary proof must start with the last supposition line not already a part of another subsidiary proof. Each time an ending bracket is put into a proof there is exactly one beginning bracket with which to match it--namely the last one not already matched.

A subproof of a proof is a sequence of lines beginning and ending with matched brackets. An occurrence of a sentence is inactive in a proof if it occurs within a subproof. An occurrence is active if not inactive. A sentence is active in a proof if it has an active occurrence therein.
A subproof occurrence in a proof is inactive if it occurs within another subproof. An occurrence of a subproof is active if not inactive. A subproof is active in a proof if it has an active occurrence in the proof. All rules must be stated so that they apply only to sentences and subproofs which are active. A given proof is a proof of its last line (if active) from its set of active assumptions (premises plus active suppositions).

Now we can state two important general principles for suppositional proofs. Let p be a given line of a suppositional proof. (1) The sequence of lines up to and including p is itself a proof. Let us call this the partial proof ending with p. (2) In any suppositional proof, each given line p is a logical consequence of the active assumptions of the partial proof ending with p. (If p is prefixed by a ⌜, the ⌜ counts as in the partial proof--if by ⌞ the ⌞ does not count as in the subproof.)

Now we can define a finished proof to be one which satisfies the following two conditions: first, it contains no active subsidiary assumptions; second, it ends with an active sentence. The first condition guarantees that any reasoning for purposes of which a supposition has been made is completed. The second condition allows a subsidiary proof rule to be applied after the last subsidiary proof has been completed. This definition includes every proof that one would want to count as finished and excludes most unfinished proofs, but it still counts as "finished" certain proofs which one would not wish to consider as such. A more adequate definition would involve intricacies undesirable in an article of this sort (but see below for an easy improvement).
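The bookkeeping just described (matching each ending bracket with the last unmatched beginning bracket, separating active from inactive lines, and testing whether a proof is finished) is mechanical enough to sketch in code. The following Python fragment is an illustration only and is not part of the original text. It assumes that a line is recorded with a kind ("premise", "supposition" or "plain") and a flag saying whether an ending bracket is affixed to it; it checks none of the inference rules themselves. All names in it are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Line:
    sentence: str
    kind: str = "plain"   # "premise" ('+'), "supposition" (beginning bracket) or "plain"
    closes: bool = False  # True if an ending bracket is affixed to this line


def subproofs(proof: Sequence[Line]) -> List[Tuple[int, int]]:
    """Match each ending bracket with the last beginning bracket not yet matched.
    Returns (start, end) index pairs, one per completed subproof."""
    unmatched, spans = [], []
    for i, line in enumerate(proof):
        if line.kind == "supposition":
            unmatched.append(i)
        if line.closes:
            if not unmatched:
                raise ValueError("ending bracket with no unmatched beginning bracket")
            spans.append((unmatched.pop(), i))
    return spans


def active_lines(proof: Sequence[Line]) -> List[int]:
    """Indices of lines that do not occur within any completed subproof."""
    inactive = set()
    for start, end in subproofs(proof):
        inactive.update(range(start, end + 1))
    return [i for i in range(len(proof)) if i not in inactive]


def active_assumptions(proof: Sequence[Line]) -> List[str]:
    """Premises plus active (not yet discharged) suppositions."""
    return [proof[i].sentence for i in active_lines(proof)
            if proof[i].kind in ("premise", "supposition")]


def is_finished(proof: Sequence[Line]) -> bool:
    """First condition: no active subsidiary assumptions.
    Second condition: the proof ends with an active sentence."""
    active = active_lines(proof)
    no_open_supposition = all(proof[i].kind != "supposition" for i in active)
    return bool(proof) and no_open_supposition and (len(proof) - 1) in active


# Schematic reductio: premise G, suppose p, reach q and not-q, conclude not-p.
proof = [
    Line("G", "premise"),
    Line("p", "supposition"),
    Line("q"),
    Line("not-q", closes=True),
    Line("not-p"),
]
print(active_assumptions(proof))      # ['G']
print(is_finished(proof))             # True
print(active_assumptions(proof[:3]))  # ['G', 'p']
```

The last call illustrates the first general principle: the initial segment ending with q is itself a proof, namely a proof of q from G together with the still active supposition p.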
It is obvious that the framework of a suppositional theory is much more adequate for characterizing mathematical proofs than is the framework of a linear theory [34]--even though anything that can be proved in a given suppositional theory will also admit of proof in some linear theory. In other words, we are not contrasting the abstract power of such theories but rather their relative adequacies in characterizing the proofs which we actually write.[35] Given the advantage of suppositional theories, one may well ask: are there other kinds of rules of proof which could be added and which would constitute an even more adequate framework?

[34] Proofs in suppositional theories have the abstract form of nest structures in the sense of Smullyan (1965).

[35] In particular, there are many proofs which cannot be accounted for except in a suppositional theory.

Let us put this question another way. Besides the premises rule and the closing rule we have seen four kinds of rules of inference: (1) assumption rules, (2) logical axiom rules, (3) immediate inference rules and (4) subsidiary proof rules. Are there other kinds of rules which are actually employed in writing of proofs?

The most obvious kind of rule to suggest adding is a rule that permits the writing of "goals." Frequently when we are writing a proof, after some assumptions (premises and/or suppositions) have been entered, we indicate our goal by writing, for example, "we want to show p." This is actually a very handy device which helps convey the reasoning to be expressed in the proof. Since the purpose of proofs is to express reasoning we should certainly consider such a rule. We could state it: Any proof may be lengthened by adding ?p.
The question mark in this context could be read "to prove," say.3b \,'ewould then have to define all occurrences of ?p as inactive because otherwise we would be applying immediate rules to "hat "e were trying to prove--thus begging the question. Now let us consider another important kind of rule. We have actually given an example of this kind of rule, but we did not classify it. Notice that all of the above kinds of rules apply only to a part of a proof to which they apply, i.e., it is usually unnecessary to look at each line in the whole proof in order to apply any of the above four kinds of rules-supposition does not require looking at any lines, the same for logical axiom rules, immediate inference rules involve only fixed finite numbers of lines, subsidiary proof rules involve perhaps a few subsidiary proofs plus perhaps a few active lines. The rule of generalization, however, requires looking at a particular line p(d) and then checking through the ,-'hoieproof to determine that nothing has been assumed about d--i.e., that d is indeed arbitrary ('d' is dummy). Such rules we call global inu:lediate rules. Thus, the classification of linear rules above "as inadequate. In addition there are subsidiary rules which involve reference to the entire proof to "hich they are applied. The most prominent example of a global subsidiary rule is the rule that is generally used in reasoning from an existentially quantified statement. For example, suppose that "e have assumed the right cancellation law in a proof and we are aiming to prove ('Ix)(y) (y.x~x) ~ (x) (x=x-l). We assume the antecedent (3 x) (y) (y.x;x) and \Je say "let Xc be such an object." ("Let" is a 8ure sign of an assumption.) We are <1ssuming that :'0 is an "arbitrary object" satisfy!ng the condition (y)(y.xO = x9)' ~I reason then of an ~genuinely) aro1trary b that b.xO = xo and toat b .xO = xO' Then, uS1ng the cancellation l3\J, infer b ~ b-l. Since b is arbitrary, (x)(x = x-l). 
Now we say: "Since x₀ was arbitrary and (x)(x = x⁻¹) does not depend on x₀, the conclusion follows from the original assumption." This corresponds, in the formalized version below, to taking (x)(x = x⁻¹) out of the subsidiary proof and making it active [starred line].

    + (x)(y)(z)((x·z = y·z) ⊃ x = y)            (cancellation law)
    ? (∃x)(y)(y·x = x) ⊃ (x)(x = x⁻¹)
    ┌ (∃x)(y)(y·x = x)
    │┌ (y)(y·x₀ = x₀)                           "let x₀ be such an object"
    ││ b·x₀ = x₀
    ││ b⁻¹·x₀ = x₀
    ││ b·x₀ = b⁻¹·x₀
    ││ (b·x₀ = b⁻¹·x₀) ⊃ b = b⁻¹
    ││ b = b⁻¹
    │└ (x)(x = x⁻¹)
    └ (x)(x = x⁻¹)                              *
      (∃x)(y)(y·x = x) ⊃ (x)(x = x⁻¹)

³⁶ This rule may profitably be compared with a similar device of Kalish and Montague (pp. 1?ff) which involves writing 'show p' to indicate a goal and which requires the 'show' to be crossed out once "the goal has been reached." As useful and valid as this device surely is, it is not correct in our sense because it violates the principle that every subproof of a (partial) proof is itself a (partial) proof. The latter is a rough statement which corresponds to the apparent facts that we do not alter previously written (partial) proofs and that we read them "top to bottom," checking each line as encountered. The Kalish-Montague device may correspond better to a description of how proofs "emerge in thought" which, of course, is not our goal.

It might be worthwhile to do another example using the above rule. We will prove (y)((∃x)(Dx & Hyx) ⊃ (∃z)(Az & Hyz)) from (x)(Dx ⊃ Ax).

    + (x)(Dx ⊃ Ax)
    ┌ (∃x)(Dx & Hbx)
    │┌ (Da & Hba)                               "let a be such an object"
    ││ Da
    ││ Da ⊃ Aa
    ││ Aa
    ││ Hba
    ││ Aa & Hba
    │└ (∃z)(Az & Hbz)
    └ (∃z)(Az & Hbz)
      (∃x)(Dx & Hbx) ⊃ (∃z)(Az & Hbz)
      (y)((∃x)(Dx & Hyx) ⊃ (∃z)(Az & Hyz))

The rule just exemplified could be called "existential instantiation" because it involves "instantiating" an existential statement to begin the subsidiary proof.
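The bookkeeping behind existential instantiation can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a formulation taken from the text: the class name Proof, its methods, the helper mentions, and the ASCII notation with parenthesized arguments (E for the existential quantifier, > for the conditional) are all hypothetical. The sketch records which sentences are active and enforces two global conditions, namely that the witness name is new when the subsidiary proof is opened and that the exported conclusion does not depend on it. It does not check that the intermediate steps are sound.

```python
import re


def mentions(sentence, name):
    """True if `name` occurs in `sentence` as a whole token (naive syntactic check)."""
    return re.search(rf"(?<!\w){re.escape(name)}(?!\w)", sentence) is not None


class Proof:
    """Hypothetical bookkeeping for a suppositional proof; soundness is NOT checked."""

    def __init__(self, premises):
        # frames[0] is the main proof; each later frame is an open subsidiary proof.
        self.frames = [{"witness": None, "lines": list(premises)}]

    def active(self):
        """Sentences of the main proof and of every still-open subsidiary proof."""
        return [s for frame in self.frames for s in frame["lines"]]

    def add(self, sentence):
        """Stand-in for any immediate inference rule (assumed correct, unchecked)."""
        self.frames[-1]["lines"].append(sentence)

    def instantiate(self, existential, witness, instance):
        """Open a subsidiary proof: 'let `witness` be such an object'."""
        if existential not in self.active():
            raise ValueError("the existential statement must be active")
        # Global condition (an assumption of this sketch): the name must be new.
        if any(mentions(s, witness) for s in self.active()):
            raise ValueError("the witness name must not occur in any active line")
        self.frames.append({"witness": witness, "lines": [instance]})

    def close(self):
        """Close the innermost subsidiary proof and export its last line."""
        if len(self.frames) == 1:
            raise ValueError("no subsidiary proof is open")
        frame = self.frames[-1]
        conclusion = frame["lines"][-1]
        if mentions(conclusion, frame["witness"]):
            raise ValueError("the conclusion still depends on the witness")
        self.frames.pop()  # the interior lines become inactive
        self.frames[-1]["lines"].append(conclusion)
        return conclusion


# The inner part of the second example, with the existential taken as a premise.
proof = Proof(["(x)(D(x) > A(x))", "(Ex)(D(x) & H(b,x))"])
proof.instantiate("(Ex)(D(x) & H(b,x))", "a", "D(a) & H(b,a)")
proof.add("A(a) & H(b,a)")
proof.add("(Ez)(A(z) & H(b,z))")
exported = proof.close()
```

Keeping the open subsidiary proofs on a stack makes "active" a simple notion here: a sentence is active exactly when its frame has not yet been closed, which matches the way the starred line is taken out of the subsidiary proof in the examples above.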
Often in writing a proof, after a pair of contradictory sentences has been proved (made active), we write 'a contradiction' and it is on the basis of that notation that we apply the reductio rule. Thus it is necessary (for comprehensiveness) to add a special symbol, say X, to the language of proofs.³⁷ "X" can be read 'a contradiction.' The rule of contradiction introduction is the following: any proof which contains active sentences p and not-p may be lengthened by addition of X. Given this we can now state two new reductio rules: any proof which ends in a subsidiary proof beginning p (respectively -p) and ending with X can be lengthened by addition of -p (respectively p).

The usual proof of Russell's theorem [no set contains exactly the sets not containing themselves] involves all three of the rules just mentioned together with the subsidiary proof rule of "existential instantiation." It should be mentioned that Russell's theorem is proved without the use of premises--it is proved using logic alone. For this reason it is often counted as a "law of logic"--indeed, its denial implies a contradiction.

    ? -(∃x)(y)(x∈y ≡ -(y∈y))
    ┌ (∃x)(y)(x∈y ≡ -(y∈y))
    │┌ (y)(x₀∈y ≡ -(y∈y))                       "let x₀ be such an object"
    ││ x₀∈x₀ ≡ -(x₀∈x₀)
    ││ ...
    │└ X
    └ X                                         "but x₀ was arbitrary"
      -(∃x)(y)(x∈y ≡ -(y∈y))

Because of limitations of space we merely mention a class of rules called definitional rules which actually form a subclass of the global subsidiary rules and which, as can be surmised from the name, countenance the use of nominal definitions within proofs.

As a final question we consider the nature of an axiomatic development of a mathematical theory. An axiomatic development of a theory begins with the axioms.
Subsequently the first theorem is proved, then the second, then the third, etc. However, after the first proof the axioms are not repeated. Moreover, in addition to the axioms, previously proved theorems are also used as new "axioms"--but these are generally not rewritten either. One way of characterizing such a development is to say that it is one long proof and that axioms and previously proved theorems can be used because they are already active above. There is something artificial about this characterization--we usually say that a development of a theory contains many proofs, whereas here we say that it is just one long proof. It is obvious that there is a level above the level of proofs--a level containing axiomatic developments which, in a sense, are composed of proofs. This implies that in a development of a theory³⁸ there is structure which is not reducible to the structure of proofs. Thus there are at least two levels of language above the sentential level.

³⁷ Linguistically, this may be a radical move. We are adding to the "sentences" used in discourses something that does not appear in the underlying language.

4. SUMMARY OF SUPPOSITIONAL THEORIES

We have seen that linear theories contain four kinds of rules: premises, logical axiom, immediate inference, and global immediate inference. Next, we noticed that suppositional theories contain two additional kinds of rules: subsidiary proof rules and global subsidiary proof rules. It is important to realize that relative to linear systems both kinds of subsidiary rules are radical innovations because they countenance inferences based not on previously proved sentences but rather on previously performed patterns of reasoning.³⁹ In addition, we pointed out that the definitional rules are merely a species of the global subsidiary proof rules.
³⁸ In a development of an axiomatic theory each theorem and each lemma is a "main goal" and within the course of deduction of a main goal one often chooses "intermediate goals" in order to focus on the local direction of the reasoning. Several things follow. The first is that one needs at least two "goal indicators," one for main goals and one for intermediate goals. One way of handling this is to use a single question mark to indicate a main goal, two question marks to indicate a conclusion to be reached in proving a main goal, (perhaps) three question marks to indicate a conclusion to be reached in proving a "level-two" goal, etc. The second is that the notion of a "finished proof" must be modified in order that a proof is counted as finished only if all of its goals have been reached in the required order. As each subsequent theorem or lemma has been reached the entire proof up to that point should be finished and it may be necessary to have a special symbol to indicate the end of a finished proof. Indeed many current authors use such symbols. Kelley (1955) uses a small shaded rectangle which he attributes to Halmos; Suppes (1960) uses the traditional 'Q.E.D.'; and Dean (1966) uses a triple asterisk. For further discussion of the structure of a development of an axiomatic theory see my "A Mathematical Model of Aristotle's Syllogistic."

³⁹ For a more detailed discussion see my "Three Logical Theories."

We explained the concept of an active sentence⁴⁰ in a proof and we asserted that the general principle behind suppositional proofs has two parts: (1) that given a proof and a sentence occurrence p in the proof, the part of the proof ending with p is also a proof (called the partial proof ending with p) and (2) each such p is a logical consequence of the active assumptions of the partial proof ending with p. Given this principle, the notation for subsidiary proofs, and the classification of the rules,
anyone having a background in mathematics is prepared to formulate his own theory of proof.⁴¹

⁴⁰ See Section 3 above.

5. SUMMARY OF THE SERIES

In the interest of accuracy we must admit that the obvious heuristic value of the notion of a partial proof probably refutes the hypothesis that the class of discourses has a kernel/transformations structure. The proof discourses are clearly the "finished proofs" and it does not seem to be the case that these have the requisite structure: one does not build up finished proofs by applying "natural" transformations to other finished proofs. Indeed, it seems to be generally the case with discourses that the beginning of a discourse is not itself a discourse but rather it seems that the beginning of a discourse makes "promises" which must be fulfilled later in order for the discourse to be "finished." When we put down some axioms and "a goal" (see above), that proof is not finished until the goal has been reached. Likewise with discourses, generally. For example, if someone were to say, "I have called this meeting to give you my views on the latest crisis," and then sat down, he would not have uttered a complete discourse. There are innumerable similar examples. The conclusion that the class of discourses fails to have a kernel/transformations structure seems inescapable.

In Part I we discussed some fundamental concepts involved in the analysis of mathematical reasoning. In addition, we introduced the concept of levels of language and pointed out that a grammar of an entire language should be composed of several grammars, one at each level. We also made the point that a proof is a certain kind of discourse which, in turn, suggested the possibility of a theory of proof--a discourse grammar which describes the proofs of a language. In Part II we outlined what a theory of proof would be like.
We noted that the grammatical rules used in describing proofs are the rules of inference according to which we write proofs. We discussed the nature of our knowledge of rules of inference, distinguishing weak and strong varieties of such knowledge. Finally, we speculated concerning the utility of a theory of proof vis-a-vis improvements in mathematical education.

⁴¹ In mathematical logic one constructs a precisely defined mathematical analog (formal deductive system) of a system of proofs and a precise mathematical analog (formal semantic system) corresponding to the (actual or imagined) system of interpretations associated with the language. In this way the philosophical problem of the soundness of a system of proofs is replaced by a precise mathematical problem. The form of the main lemma in a soundness proof for a system of linear proofs is this: for every proof Π, the assumptions of Π taken together imply each sentence in Π. In my opinion the form of the corresponding lemma for any correct theory of suppositional proofs is this: for every proof Π the active assumptions of Π taken together imply each active sentence of Π. This opinion, if correct, will account for the feeling of strangeness encountered in trying to construct proofs in the system of Quine's Methods (pp. 159-167).

In the course of Part III, we contrasted what has become the traditional theory with a newer and more adequate theory whose essential features were discovered in the 1920's (Jaskowski). The older theory holds that mathematical reasoning proceeds from axioms step-by-step to conclusions in a strictly linear fashion; i.e., each step in a proof must be a logical consequence of the axioms. Apparently this view was first systematized by Boole in the nineteenth century. It remained the commonly accepted view until the 1920's when Lukasiewicz pointed out in his seminar that the theory did not agree with mathematical practice.
Jaskowski, who was a student in the seminar, accepted the project of developing the exact details of a theory of proof which would take into account the salient features of mathematical reasoning not accounted for by Boole's theory. The newer theory is largely the result of Jaskowski's effort. The older theory we called linear, the newer suppositional. We gave several examples of rules and proofs with the intention of supplying enough detail so that the basic ideas can be grasped in a useful way.

6. POSTSCRIPT

The linguist and the logician will doubtless disagree with many of the above assertions. Several serious oversimplifications have been made--mostly concerning linguistics. My hope has been to show the overlap and possible cross-fertilization between, on the linguistic side, the ideas of Harris and Chomsky and, on the logical side, the ideas of Jaskowski. I have tried to do this in a way that would be of benefit to persons of diverse backgrounds. I was trying to write to an audience of mathematics educators, linguists, mathematicians, psychologists, and logicians.

One final technical point: the so-called natural deduction systems found in books by Suppes, Lemmon, and Mates are not theories of suppositional proof. By looking carefully at each of them, one notices that the lines of their proofs are not sentences, but rather ordered pairs (P, c) where P is a set of "premises" and c is a single sentence. Moreover, a grammar to generate their proofs takes the form of a linear theory without any assumptions. In particular, in each of these systems each proof is a finite sequence of lines (P₁, c₁), (P₂, c₂), ..., (Pₙ, cₙ) where each line is either (axiomatically) of the form ({c}, c) or else is the result of applying an immediate rule to a fixed, finite number of preceding lines. An example of such a rule would be: if (Pᵢ, d) and (Pⱼ, d ⊃ c) are lines in a proof, then the proof can be lengthened by writing (Pᵢ + Pⱼ, c).
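A minimal Python sketch may make the shape of such pair-based systems concrete. The function names axiom and detach, the encoding of a conditional d ⊃ c as the tuple (">", d, c), and the reading of Pᵢ + Pⱼ as set union are assumptions made for this illustration; none of them is taken from the books cited.

```python
def axiom(c):
    """An axiomatic line ({c}, c): the sentence c paired with itself as sole premise."""
    return (frozenset([c]), c)


def detach(line_d, line_cond):
    """From (Pi, d) and (Pj, d > c) produce (Pi + Pj, c), reading + as set union."""
    premises_i, d = line_d
    premises_j, cond = line_cond
    is_conditional = isinstance(cond, tuple) and len(cond) == 3 and cond[0] == ">"
    if not is_conditional or cond[1] != d:
        raise ValueError("second line must have the form (Pj, d > c)")
    return (premises_i | premises_j, cond[2])


# A proof is just a list of lines; every line is itself a whole "argument".
p_implies_q = (">", "p", "q")
q_implies_r = (">", "q", "r")

lines = [axiom("p"), axiom(p_implies_q), axiom(q_implies_r)]
lines.append(detach(lines[0], lines[1]))  # ({p, p > q}, q)
lines.append(detach(lines[3], lines[2]))  # ({p, p > q, q > r}, r)
```

Nothing in this sketch is ever supposed or discharged: each line carries its own premise set, and the only operations are writing an axiomatic pair and combining earlier pairs, which is why a grammar for such proofs is linear in form.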
The idea behind constructing a proof of c from P in these systems is not to try to deduce c from P, but rather to construct the ordered pair (P, c) starting initially from ordered pairs ({x}, x) using rules which when applied to "valid arguments" produce "valid arguments." In a word, these systems stack up valid arguments starting with the simple and building to the complex. As far as either the characterization of normal reasoning or utility in teaching is concerned, it seems to me that none of these systems fares well in comparison to a suppositional system as found in the following: Anderson and Johnstone (1962), Kalish and Montague (1964), Leblanc (1966), or Thomason (1970).

Acknowledgements

This work originated as a talk given at the Conference on Mathematical and Structural Learning held at the University of Pennsylvania in April of 1968. I wish to thank Dr. Joseph Scandura for inviting me to what developed into a valuable conference. If the final version is a material improvement over the talk, then Professors James Greeno, Paul Rosenbloom and Joseph Scandura deserve credit for their suggestions and criticism (not all of which I had wisdom to agree with). I also wish to acknowledge the fact that had I not been fortunate enough to receive a Summer Research Fellowship (NSF-IG-68-3) from the National Science Foundation through the auspices of the University of Pennsylvania, then I likely would not have written these pages. I gratefully acknowledge the helpful and sympathetic criticism that I have received from the students in my logic seminar. Especially significant in regard to this work were the ideas of John Herring, William Frank, Edward Keenan and George Weaver. Finally, I acknowledge ideas received in private communication separately from Mr. James Munz, Linguistics Project, University of Pennsylvania and Professor J. J.
LeTourneau, Mathematics Department, Hampshire College.

I wish to dedicate this work to the memory of Albert L. Hammond and Ludwig Edelstein, both late of Johns Hopkins University. These two men were instrumental not only in bringing me to appreciate the search for truth but also in making practical arrangements in order that I could pursue graduate studies. Both had tragic lives: one was harassed by philistines in American higher education, the other by Nazis in Germany as well--but they both fought the good fight and neither lost faith in the ultimate value of truth and kindness.