A quantitative-informational approach to logical consequence

Alves, Marcos Antonio; Loffredo D'Ottaviano, Itala M. A quantitative-informational approach to logical consequence. In: The Road to Universal Logic. Switzerland: Springer International Publishing, pp. 105-24 (Studies in Universal Logic), 2015. DOI: 10.1007/978-3-319-15368-1_3. http://link.springer.com/chapter/10.1007/978-3-319-15368-1_3.

Marcos Antonio Alves
UNESP, Philosophy Department, State University of São Paulo, Marília/SP, Brazil
marcosalves@marilia.unesp.br

Itala M. Loffredo D'Ottaviano
Unicamp, Philosophy Department, Centre for Logic, Epistemology and the History of Science – CLE, University of Campinas, Campinas, Brazil
itala@cle.unicamp.br

Abstract

In this work, we propose a definition of logical consequence based on the relation between the quantity of information present in a particular set of formulae and a particular formula. As a starting point, we use Shannon's quantitative notion of information, founded on the concepts of logarithmic function and probability value. We first consider some of the basic elements of an axiomatic probability theory, and then construct a probabilistic semantics for languages of classical propositional logic. We define the quantity of information for the formulae of these languages and introduce the concept of informational logical consequence, identifying some important results, among them: certain arguments that have traditionally been considered valid, such as modus ponens, are not valid from the informational perspective; the logic underlying informational logical consequence is not classical, and is at the least paraconsistent sensu lato; informational logical consequence is not a Tarskian logical consequence.

Keywords: Logical consequence. Information. Probability. Semantics. Informational logical consequence. Non-classical logics. Paraconsistent logic.

1 Introduction

Logical consequence can be considered in different ways.
In intuitive terms, it is often understood as a relation established between a given set of statements of a language and a statement of the same language. In logic, it is common to define it in a way that is close to the intuitive one, for example, as a type of relation between elements of the power set of a nonempty set, as done by Feitosa and D'Ottaviano (2004), amongst others. In this sense, logical consequence is a relation between a given set (which could be empty or even infinite) of formulae of a language (usually formal) and a formula of the same language. The statements or formulae belonging to the given set are termed premises, and the other statement or formula is termed the conclusion. The role of the premises is to found, sustain, and support the conclusion.

In the thirties, Tarski (1956a; 1956b; 1956c; 1956d) introduced and improved his definition of consequence operator, or logical consequence, and proved its fundamental properties (cf. ALVES, 2012b). In Tarskian terms, given the language L of the (classical) propositional calculus, a consequence operator on the set of formulae of L, Form(L), is a function C: ℘(Form(L)) → ℘(Form(L)) such that, for all Γ, Δ ⊆ Form(L), it satisfies the following properties:

(T1) Γ ⊆ C(Γ);
(T2) If Γ ⊆ Δ, then C(Γ) ⊆ C(Δ);
(T3) C(C(Γ)) ⊆ C(Γ).

The first property above is named reflexivity; the second is named monotonicity; and the third property is named transitivity. A logical consequence operator that satisfies these properties is usually named a Tarskian logical consequence.

When suitably combined in a sequence, the set of premises and the conclusion constitute an argument. We can say that a conclusion is a logical consequence of a set of premises if, and only if, the argument constituted by their union is logically valid, or, simply, valid.
In semantic terms, the relation of consequence is usually defined starting from the degree of truth of the premises and conclusion: a formula is a logical consequence of a given set of formulae if, and only if, it is true under every circumstance (for example, valuation, structure, interpretation, or model) in which all the premises are true.

In this work, we propose a definition of logical consequence based on the quantity of information present in the set of premises and in the conclusion. As a starting point, we use the usual languages of classical propositional logic (CPL), as constructed by Shoenfield (1967), for example. In our theoretical approach to logical consequence, we do not consider the qualitative semantic aspects of information, considered in works such as those of Dretske (1986) and Gonzalez (1996). In our case, the informational value of a message or a formula depends only on its probability of occurrence, established using a probability space.

In the next section, we consider some elements of a usual axiomatic theory of probabilities, indicating some of its definitions and basic results, which will be used later; these concepts include the notions of event and random experiment.

In the third section, we construct a probabilistic semantics for CPL; we establish a relation between the formulae of the CPL language and the events of a random experiment, from which we define a probability value for each formula of a given language. At the end of the section, we introduce the definitions of probabilistically valid and probabilistically equivalent formulae, and of probabilistic logical consequence.

In the fourth section, we discuss the notion of quantity of information present in a formula of a CPL language; we propose a quantitative-informational definition of logical consequence, which we call informational logical consequence, and we demonstrate some of the results and properties that follow from this definition.
In particular, we show the existence of arguments which are considered valid according to the classical perspective, but which are invalid from the informational perspective. For example, modus ponens is informationally invalid, given the possibility that the conclusion of this argument could possess a greater quantity of information than its set of premises. Furthermore, we show that the logic underlying informational logical consequence is not classical, but is, at the least, paraconsistent sensu lato. In addition, we demonstrate that although it might satisfy the property of transitivity, informational logical consequence is neither reflexive nor monotonic; in other words, it is not a Tarskian logical consequence. In the final considerations, we summarize and analyze the main properties and results of informational logical consequence.

Our approach is based on the quantitative concept of information, developed in the Mathematical Theory of Communication. One of the pioneers in studies of the quantification, storage, and transmission of information was Hartley (1928). He describes the quantity of information present in a source in terms of its number of possible messages. Later, Shannon (1948) further developed this idea, including new factors such as the effect of noise in the channel, possible economy in the transmission of information, and the possibility that messages might possess distinct quantities of information.

From the quantitative perspective, there can only be information where there is doubt; this, in turn, requires the existence of alternatives, which presupposes the presence of choice, selection, and discrimination. For Hartley (1928), the information in a message is measured by the freedom of choice that someone has in selecting it, based on a source. According to Shannon and Weaver (1949, p. 8-9), "... information in communications theory relates not so much to what you do say, as to what you could say.
That is, information is a measure of one's freedom of choice when one selects a message." In an unbiased toss of a coin, for example, there are two equally probable possibilities: heads or tails. In an unbiased throw of a dice, there are six possibilities. The degree of freedom of choice in the first case is less than in the second. In the case of the dice, we could say many more things than would be possible in the case of the coin. Hence, from the present perspective, the quantity of information present in the throw of the dice is greater than that present in the toss of the coin.

For Hershberger (1955), information can be defined as a measure of the reduction of uncertainty. It is related to the unpredictability in a message or a source, with the emergence of an element that was absent prior to its occurrence. In the toss of the coin, the reduction of uncertainty is less than in the throw of the dice. The occurrence of an event in a source such as the toss of a coin eliminates only one alternative, while in the throw of a dice five equally probable alternatives are eliminated. Information is also associated with notions such as those of order (or organization, in the terms of Bresciani Filho and D'Ottaviano (1990)) and entropy, as discussed by Alves (2012a).

The central interest in quantitative studies of information generally lies in measurement of the quantity of information in a source, rather than in particular messages. In a broad sense, a source can be characterized as a process that generates information. Its constituent elements can be understood as a finite set of events, of messages, of symbols that have a certain probability of occurrence. Discrete ergodic sources are those in which every element produced, in addition to being discrete, has the same statistical properties as any other, and its properties remain unaltered with time.
Once the probabilities of occurrence of the elements are discovered, it becomes possible to predict their probability of occurrence at any moment, as occurs in the toss of a coin or the throw of a dice.

According to Shannon and Weaver (1949), the quantity of information of the i-th message of a source F, denoted by I_i(F), is the numerical value defined by:

I_i(F) =df – log2 p_i(F),

where "p_i(F)" denotes the probability of occurrence of the i-th message of F. The quantity of information in a source F with n elements, denoted by H_F, is defined by:

H_F =df Σ_{i=1}^{n} p_i(F) × I_i(F).

If H_C and H_D denote the quantities of information in the toss of the coin and the throw of the dice, respectively, then H_C = 1 and H_D ≈ 2.58. Since, in both cases, the events corresponding to the messages are equiprobable, it can be demonstrated that the quantity of information in each message in the source is equal to the quantity of information in the source itself.

The greater the freedom of choice and reduction of uncertainty in a source, the more informative it is. Information reaches its maximum value in a source when all its messages have an equal chance of being selected. The informational value is zero when only one of them can occur.

It is within this framework that in the next sections we propose our informational perspective of logical consequence.

2 Elements of an axiomatic theory of probability

We employ probability in cases where two or more different results can occur in a given circumstance. This means that the result is not predictable (or is indeterminate), in the sense that it is not possible to determine in advance which result might occur at a given moment. Probability theory, henceforth denoted P, studies random experiments.

As a basis for P, we shall use the Zermelo-Fraenkel set theory with the Axiom of Choice (ZFC), with the usual elementary arithmetic theory (cf. ENDERTON, 1977). Hence, the language (alphabet and definitions) and theorems of ZFC will also be considered elements of P.
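As a numerical check of the source quantities H_C and H_D given in the Introduction, the following minimal sketch evaluates the two formulas attributed to Shannon and Weaver for an unbiased coin and an unbiased dice. The function names are illustrative and not part of the paper, and assigning 0 to a zero-probability message is an assumption of the sketch.

```python
from math import log2

def message_information(p):
    """Quantity of information -log2(p) of a message with probability p.

    Assigning 0 to a zero-probability message is an assumption of this sketch.
    """
    return log2(1 / p) if p > 0 else 0.0

def source_information(probabilities):
    """H_F: sum over the messages of p_i(F) times I_i(F)."""
    return sum(p * message_information(p) for p in probabilities)

coin = [1 / 2] * 2   # heads, tails
dice = [1 / 6] * 6   # faces one to six

H_C = source_information(coin)
H_D = source_information(dice)
print(f"H_C = {H_C:.2f}, H_D = {H_D:.2f}")  # H_C = 1.00, H_D = 2.58
```

Because both sources are equiprobable, each message carries the same quantity of information as the source itself, which is why H_D coincides with log2 6.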
The only symbols belonging to the alphabet of P are the symbols A_i, for 0 ≤ i ≤ n and i ∈ N, where "A" represents the primitive concept of P known as happening, result or occurrence, used for the definition of random experiment.

Definition 2.1 (Random experiment)
a: A random experiment or random phenomenon, denoted by "Σ", is one that, repeated various times, presents different results or occurrences, called results of Σ or occurrences of Σ, denoted by "A_i(Σ)".
b: The sample space of a random experiment Σ, denoted by "U(Σ)", is the set of all possible results of Σ.
c: The number of elements of the sample space, denoted by "n(U(Σ))", in which n(U(Σ)) > 0 is finite, is the quantity of elements of U(Σ).
d: A sample space is equiprobable when all its elements have the same chance of occurring.

The toss of a coin, the throw of a dice, and the removal of a card from a pack are examples of random experiments. Their results would be the fall of the coin with one or other face upwards, the fall of the dice with one of the numbers from one to six upwards, and the withdrawal of one of the cards of the pack, respectively.

Definition 2.2 (Event of a random experiment)
a: An event of a random experiment Σ, denoted by "E(Σ)", is any subset of the sample space U(Σ), in other words, E(Σ) ⊆ U(Σ).
b: The number of elements of an event E(Σ), denoted by "n(E(Σ))", is the quantity of elements of U(Σ) belonging to E(Σ).
c: An elementary event E of Σ is that where n(E(Σ)) = 1.
d: The certain event E of Σ is that where n(E(Σ)) = n(U(Σ)).
e: The impossible event E of Σ, denoted by "∅", is that where n(E(Σ)) = 0.
f: A contingent event E of Σ is that where n(∅) < n(E(Σ)) < n(U(Σ)).
g: The event of Σ complementary to E(Σ), denoted by "Ē(Σ)", is defined by: Ē(Σ) =df {A ∈ U(Σ) | A ∉ E(Σ)}.
h: The event E of Σ, the union of Ei(Σ) and Ej(Σ), denoted by "(Ei(Σ) ∪ Ej(Σ))", is defined by: (Ei(Σ) ∪ Ej(Σ)) =df {A ∈ U(Σ) | A ∈ Ei(Σ) or A ∈ Ej(Σ)}.
i: The event E of Σ, the intersection of Ei(Σ) and Ej(Σ), denoted by "(Ei(Σ) ∩ Ej(Σ))", is defined by: (Ei(Σ) ∩ Ej(Σ)) =df {A | A ∈ Ei(Σ) and A ∈ Ej(Σ)}.

Henceforth, when there is no risk of ambiguity, we shall remove the references to Σ between parentheses of the notations. Hence, instead of A(Σ), U(Σ), or n(U(Σ)), we shall use only A, U, or n(U), respectively.

Example 2.3 (Random experiments)

Σi                        | U(Σi)              | E(Σi)                               | n(E(Σi))
Σ1 (Toss of coin)         | {H, T}             | E1(Σ1): {H} (Falls heads)           | 1
                          |                    | E2(Σ1): {T} (Falls tails)           | 1
Σ2 (Toss of biased coin)  | {H1, H2, H3, T}    | E1(Σ2): {H1, H2, H3} (Falls heads)  | 3
                          |                    | E2(Σ2): {T} (Falls tails)           | 1
Σ3 (Throw of dice)        | {1, 2, 3, 4, 5, 6} | E1(Σ3): {2, 4, 6} (Falls even)      | 3
                          |                    | E2(Σ3): ∅ (Falls heads)             | 0

In the above example, a "model" is proposed for each random experiment, its sample space, and some of its events, associating them with entities of a "world". In the third column, in parentheses, we give the common name for each event in order to express the results that comprise it. Although our theoretical approach does not consider qualitative elements of information, the examples suggested throughout this paper involve content, in order to aid understanding. However, the same results could have been obtained using a purely formal mode of construction.

The notion of random experiment can be compared with that of a discrete ergodic source of information, as indicated in the Introduction of this work. In a way that is similar to this type of source, the elements belonging to a random experiment must be previously defined with precision. Furthermore, every sample space considered, in addition to being finite, must be equiprobable, and the probability values of the events must be given and fixed. This will enable us to determine with precision the existence of the relation of logical consequence between formulae (as shown in Section 4).
Definition 2.4 (Probability of an event)
The probability of occurrence of an event E in the random experiment Σ with an equiprobable sample space U(Σ), denoted by "p(E(Σ))", is the numerical value defined by:

p(E(Σ)) =df n(E(Σ)) / n(U(Σ)).

The probability function, p, provides events of a random experiment with values between 0 and 1; in other words, p: E(Σ) → [0,1] ∩ Q. The probability value of an event E(Σ) is given by p(E(Σ)).

Example 2.5 (Probability of the events of Example 2.3)

E                 | p(E)
E1(Σ1)            | 1/2
E2(Σ1)            | 1/2
E1(Σ2)            | 3/4
E2(Σ2)            | 1/4
E1(Σ3)            | 1/2
E2(Σ3)            | 0
E1(Σ3) ∪ E2(Σ3)   | 1/2
E1(Σ2) ∩ E2(Σ2)   | 0
Ē1(Σ2)            | 1/4

Having constituted the basic elements of the language of P, we now describe its axioms and some of its elementary results. The axioms for P are as follows:

(AxP1): p(E) ≥ 0, for every E ⊆ U
(AxP2): p(Ei ∪ Ej) = p(Ei) + p(Ej) – p(Ei ∩ Ej)
(AxP3): p(E ∪ Ē) = 1

In the next theorem, we state some results concerning the probability of events that are important for the remainder of this paper. For simplicity, we do not present the proof of this and some other theorems. Their proofs can be found in Alves (2012b).

Theorem 2.6
a: p(E) ≤ 1, for every E ⊆ U.
b: p(U) = 1.
c: p(∅) = 0.
d: n(Ei ∩ Ej) = 0 ⇒ p(Ei ∪ Ej) = p(Ei) + p(Ej).
e: p(E ∩ Ē) = 0.
f: p(Ē) = 1 – p(E).
g: Σ_{i=1}^{m} p(Ei) = 1, for Ei = {Ai}, for 1 ≤ i ≤ m, with U = {A1, ..., Am}.
■

Based on P, in what follows we develop a probabilistic semantics for classical propositional logic (CPL).

3 A probabilistic semantics for languages of CPL

We call this perspective the probabilistic semantics for CPL (henceforth, SP). As shown by Alves (2012b), the behavior of SP is not strictly equivalent to the behavior of the usual classical veritative-functional semantics (henceforth, SV), as developed by Tarski (1956a), Mendelson (1964), Shoenfield (1967), and Mates (1972), amongst others. We associate the formulae of a CPL language, denoted by "L", with the events of a random experiment, Σ.
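Definition 2.4 and the axioms of P can be checked on the biased-coin experiment Σ2 of Example 2.3. The sketch below is only an illustration: events are modelled as Python frozensets, the helper name `p` mirrors the paper's notation, and the variable names are assumptions of the example.

```python
from fractions import Fraction

def p(event, sample_space):
    """p(E) = n(E) / n(U) for an event E of an equiprobable sample space U."""
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))

# Sigma_2 of Example 2.3: a coin biased towards heads.
U2 = frozenset({"H1", "H2", "H3", "T"})
E1 = frozenset({"H1", "H2", "H3"})   # falls heads
E2 = frozenset({"T"})                # falls tails
complement_E1 = U2 - E1

print(p(E1, U2), p(E2, U2), p(complement_E1, U2))  # 3/4 1/4 1/4

# AxP2: p(Ei ∪ Ej) = p(Ei) + p(Ej) - p(Ei ∩ Ej)
assert p(E1 | E2, U2) == p(E1, U2) + p(E2, U2) - p(E1 & E2, U2)
# AxP3 and Theorem 2.6e: an event and its complement
assert p(E1 | complement_E1, U2) == 1
assert p(E1 & complement_E1, U2) == 0
```

Using exact fractions rather than floats keeps the values rational, in line with the codomain stated for p.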
We define some of the notions that are fundamental to the objectives of this paper, and describe some results characteristic of SP. The expression "Form(L)" denotes the set of formulae of a language L; the letters "φ", "ψ", and "γ" are metalinguistic variables that represent elements of Form(L); "P0", "P1", "P2", etc. are the atomic formulae of L; and "Γ" represents any finite subset of Form(L). We adopt negation and disjunction as primitive logical connectives of L.

Definition 3.1 (Situation for a language L)
A function f is a Σ-situation for L, or simply a situation, denoted by "f(Σ)", if f(Σ): Form(L) → ℘(U(Σ)), such that:
a: If φ is atomic, then f(Σ)(φ) = E(Σ), defined by f itself;
b: If φ is of the form ¬ψ, then f(Σ)(φ) is the event complementary to f(Σ)(ψ), written f̄(Σ)(ψ);
c: If φ is of the form ψ ∨ γ, then f(Σ)(φ) = f(Σ)(ψ) ∪ f(Σ)(γ).

A situation for L consists of an attribution of a single event of a given random experiment to each well-formed formula of L, according to a function f. The fact that a situation is defined using a function enables us to avoid ambiguities in the next definitions, especially in the case of informational logical consequence. To say that f is a Σ-situation for L is the same as saying that the random experiment Σ is an f-structure for L.

Although each formula of L is associated with a single event in a given situation f(Σ), distinct formulae can be associated with the same event in f(Σ). This always occurs, given that the set of formulae of any L is infinite, in contrast to the number of events of a random experiment, which is always finite.

Definition 3.2 (Probability of formulae according to a situation)
The probability function of a formula φ according to f(Σ), denoted by "P(f(Σ))", is defined as follows, where "p" is the probability function concerning events as defined in P:
a: If φ is atomic, P(f(Σ))(φ) =df p(f(Σ)(φ));
b: If φ is of the form ¬ψ, P(f(Σ))(φ) =df p(f̄(Σ)(ψ));
c: If φ is of the form ψ ∨ γ, P(f(Σ))(φ) =df p(f(Σ)(ψ) ∪ f(Σ)(γ)).
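Definitions 3.1 and 3.2 are recursive, so they translate directly into a small evaluator. The sketch below is one possible encoding and not the authors' construction: formulae are nested tuples, a situation is given by a dictionary from atom names to events, and the atom assignments are illustrative events of a dice throw.

```python
from fractions import Fraction

# Illustrative encoding of formulae (an assumption of this sketch):
#   ("atom", "P1"), ("not", phi), ("or", phi, psi)
def event_of(formula, atom_events, sample_space):
    """Event assigned to a formula by the situation (Definition 3.1)."""
    kind = formula[0]
    if kind == "atom":
        return atom_events[formula[1]]
    if kind == "not":
        return sample_space - event_of(formula[1], atom_events, sample_space)
    if kind == "or":
        return (event_of(formula[1], atom_events, sample_space)
                | event_of(formula[2], atom_events, sample_space))
    raise ValueError(f"unknown connective: {kind}")

def probability_of(formula, atom_events, sample_space):
    """Probability value of a formula in the situation (Definition 3.2)."""
    event = event_of(formula, atom_events, sample_space)
    return Fraction(len(event), len(sample_space))

U = frozenset(range(1, 7))                      # throw of a dice
atoms = {"P1": frozenset({2, 4, 6}),            # falls even
         "P3": frozenset({1, 2, 3, 5})}
P1, P3 = ("atom", "P1"), ("atom", "P3")
disjunction = ("or", P1, P3)

print(probability_of(P1, atoms, U))                    # 1/2
print(probability_of(disjunction, atoms, U))           # 1
print(probability_of(("not", disjunction), atoms, U))  # 0
```

Since the dictionary fixes one event per atom and the recursion is deterministic, every formula receives exactly one event, while distinct formulae may share an event, as the text observes.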
It can be seen that P(f(Σ)): Form(L) → [0,1] ∩ Q, such that p(f(Σ)(φ)) is the image of φ according to P(f(Σ)). The probability value of the formula φ according to f(Σ) is given by P(f(Σ))(φ), a rational number between 0 and 1. It is the probability value of the event of Σ corresponding to φ, according to f(Σ). It can be shown that P(f(Σ)) satisfies the properties described in Theorem 2.6, interpreted in the light of SP.

Henceforth, when we say "for every f", we mean "for every situation f(Σ), given Σ". In addition, when there is no possibility of ambiguity or imprecision, we shall use "P(φ)" as an abbreviation for "P(f(Σ))(φ)", "p(φ)" as an abbreviation for "p(f(Σ)(φ))", "f(φ)" as an abbreviation for "f(Σ)(φ)", and "f" as an abbreviation for "f(Σ)".

In accordance with the usual definitions in L, we have:
a: f(φ ∧ ψ) = f(¬(¬φ ∨ ¬ψ))
b: f(φ → ψ) = f(¬φ ∨ ψ)
c: f(φ ↔ ψ) = f((φ → ψ) ∧ (ψ → φ))
d: P(φ ∧ ψ) = P(¬(¬φ ∨ ¬ψ))
e: P(φ → ψ) = P(¬φ ∨ ψ)
f: P(φ ↔ ψ) = P((φ → ψ) ∧ (ψ → φ)).

Example 3.3 (Probability of formulae in Σ1 and Σ2, for U(Σ1) = {1,2,3,4,5,6} and U(Σ2) = {H, T})

φ            | f(Σ1)(φ)   | P(f(Σ1))(φ) | f(Σ2)(φ) | P(f(Σ2))(φ)
P1           | {2,4,6}    | 1/2         | {H}      | 1/2
P2           | {1,3,5}    | 1/2         | {T}      | 1/2
P3           | {1,2,3,5}  | 2/3         | ∅        | 0
P4           | {1}        | 1/6         | ∅        | 0
P5           | ∅          | 0           | ∅        | 0
P1 ∧ P2      | ∅          | 0           | ∅        | 0
P1 ∧ P3      | {2}        | 1/6         | ∅        | 0
P2 ∧ P3      | {1,3,5}    | 1/2         | ∅        | 0
P1 ∨ P3      | U          | 1           | {H}      | 1/2
¬(P1 ∨ P3)   | ∅          | 0           | {T}      | 1/2
P1 → P2      | {1,3,5}    | 1/2         | {T}      | 1/2

Theorem 3.4: For every f, we have:
a: P(¬φ) = 1 – P(φ).
b: f(φ ∧ ψ) = f(φ) ∩ f(ψ).
c: P(φ ∧ ψ) = p(f(φ) ∩ f(ψ)).
d: P(φ → ψ) = p(f̄(φ) ∪ f(ψ)).
■

Definition 3.5 (Inconsistency, validity, and contingency of formulae of L in SP)
a: A formula φ is probabilistically inconsistent or probabilistically contradictory, which is denoted by "⊥P", if, for every f(Σ), P(f(Σ))(φ) = 0.
b: A formula φ is probabilistically valid or probabilistically tautological, which is denoted by "⊤P", if, for every f(Σ), P(f(Σ))(φ) = 1.
c: A formula φ is probabilistically contingent if it is neither probabilistically contradictory nor probabilistically valid.
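Definition 3.5 quantifies over every situation, which cannot be enumerated in general. The following sketch therefore only explores all situations over one fixed, small sample space: a "contingent" verdict is then conclusive, whereas "valid" or "inconsistent" verdicts hold for that sample space only. The tuple encoding of formulae, the extra "and" case (justified by Theorem 3.4b), and all function names are assumptions of the example.

```python
from fractions import Fraction
from itertools import combinations, product

def powerset(universe):
    """All events (subsets) of a finite sample space."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def event_of(formula, atom_events, universe):
    """Event of a formula; 'and' is read as intersection (Theorem 3.4b)."""
    kind = formula[0]
    if kind == "atom":
        return atom_events[formula[1]]
    if kind == "not":
        return universe - event_of(formula[1], atom_events, universe)
    left = event_of(formula[1], atom_events, universe)
    right = event_of(formula[2], atom_events, universe)
    if kind == "or":
        return left | right
    if kind == "and":
        return left & right
    raise ValueError(f"unknown connective: {kind}")

def classify(formula, atom_names, universe):
    """Collect the probability values over every atom-to-event assignment."""
    values = set()
    for choice in product(powerset(universe), repeat=len(atom_names)):
        atom_events = dict(zip(atom_names, choice))
        event = event_of(formula, atom_events, universe)
        values.add(Fraction(len(event), len(universe)))
    if values == {Fraction(1)}:
        return "valid on this sample space"
    if values == {Fraction(0)}:
        return "inconsistent on this sample space"
    return "contingent"

U = frozenset({"H", "T"})
P1 = ("atom", "P1")
print(classify(("or", P1, ("not", P1)), ["P1"], U))   # valid on this sample space
print(classify(("and", P1, ("not", P1)), ["P1"], U))  # inconsistent on this sample space
print(classify(P1, ["P1"], U))                        # contingent
```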
Definition 3.6 (Probabilistically equivalent formulae)
Two formulae, φ and ψ, are probabilistically equivalent, which is denoted by "φ ≡ ψ", if, for every situation f(Σ), P(f(Σ))(φ) = P(f(Σ))(ψ).

Definition 3.7 (Probability value of a set of formulae)
Let Γ = {φ1, ..., φn} ⊆ Form(L). The probability value of Γ according to f(Σ), denoted by "P*(f(Σ))(Γ)", is defined by:

P*(f(Σ))(Γ) =df P(f(Σ))(φ1 ∧ ... ∧ φn).

If Γ = ∅, then P*(f(Σ))(Γ) = 1. When Γ is infinite, P*(f(Σ))(Γ) is undefined.

It can be seen that P*(f(Σ)): ℘(Form(L)) → [0,1] ∩ Q, such that P*(f(Σ))(Γ) is the image of Γ according to P*(f(Σ)). The probability value of the set of formulae Γ according to f(Σ) is given by P*(f(Σ))(Γ), a rational number between 0 and 1. When there is no risk of ambiguity or imprecision, we shall write P(Γ) instead of P*(f(Σ))(Γ).

According to Definition 3.7, the probability value of a given set of formulae of L in a situation f is, in the final analysis, defined from the probability value of the intersection of the events associated with the elements of Γ. In other words, by Definition 3.7 and Theorem 3.4c, we have that P*(f(Σ))(Γ) = P(f(Σ))(φ1 ∧ ... ∧ φn) = p(f(Σ)(φ1) ∩ ... ∩ f(Σ)(φn)). Given the definitions in question, we can say that f(Σ)(Γ) = f(Σ)(φ1 ∧ ... ∧ φn).

Strictly speaking, it is not possible to define the probability value of an empty set of formulae. Definition 3.2, which forms the basis of Definition 3.7, only applies to the probability values of formulae of language L. But ∅ is not a formula of L. Although a formula might be associated with the empty event, with its probability value being defined as zero, it makes no sense to say that an empty formula can have any probability value. In order not to leave the empty set without a definition of the probability value, it was decided to define it arbitrarily in the way described above.
This decision can be justified as follows: in this case, to say that P(∅) ≠ 1 would signify the existence of some φi ∈ ∅ such that P(φi) ≠ 1. Since there are no formulae in ∅ with this value, given the inexistence of formulae in ∅, then P(∅) = 1.

Definition 3.8 (Probabilistic logical consequence)
A formula φ is probabilistic logical consequence of a set Γ of formulae, which is denoted by "Γ ⊨P φ", if, for every situation f(Σ), P(f(Σ))(Γ) ≤ P(f(Σ))(φ).

When Γ = ∅, instead of ∅ ⊨P φ we simply write ⊨P φ. The expression "Γ ⊭P φ" denotes that φ is not probabilistic logical consequence of Γ. The formulae of Γ are called premises and φ is termed conclusion.

Despite possessing certain specific characteristics, SP is a semantics for CPL. The formulae considered valid by SP are exactly the same as those considered valid by SV. Furthermore, Γ ⊨P φ if, and only if, Γ ⊨V φ. The similarities and differences between SP and SV are described by Alves (2012b). In the following discussion, we address the notion of informational logical consequence.

4 Informational logical consequence

In this section, we propose a quantitative-informational definition of logical consequence and present some of its properties. We first introduce notions such as the quantity of information present in a formula of L and in a set of formulae, based on the notion of quantity of information developed by Shannon and Weaver (1949).

Definition 4.1 (Quantity of information of a formula according to a situation)
The quantity of information or informational value of a formula φ of L according to a situation f(Σ), denoted by "I(f(Σ))(φ)", is the numerical value defined by:

I(f(Σ))(φ) =df – log2 P(f(Σ))(φ).

When P(f(Σ))(φ) = 0, we define that log2 0 = 0, in other words, I(f(Σ))(φ) = 0. When there is no risk of ambiguity or imprecision, we shall use "I(φ)" instead of "I(f(Σ))(φ)". It can be shown that I(f(Σ)): Form(L) → Q+.
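The probability value of a set of premises (Definition 3.7) and the informational value of a formula (Definition 4.1) can be sketched together. In the hypothetical helper below, the intersection is folded starting from the whole sample space, so an empty list of premise events yields probability 1; this is offered only as one way of reading the stipulation P(∅) = 1, not as the authors' justification. The names `set_probability` and `information` are illustrative.

```python
from fractions import Fraction
from functools import reduce
from math import log2

def set_probability(events, sample_space):
    """P*: probability of the intersection of the premises' events.

    The fold starts from the sample space, so no premises gives probability 1.
    """
    joint = reduce(lambda acc, event: acc & event, events, sample_space)
    return Fraction(len(joint), len(sample_space))

def information(probability):
    """I = -log2 P, with the stipulated value 0 when P = 0."""
    if probability == 0:
        return 0.0
    return log2(float(1 / probability))

U = frozenset(range(1, 7))             # throw of a dice
evens = frozenset({2, 4, 6})
odds = frozenset({1, 3, 5})
P3 = frozenset({1, 2, 3, 5})

print(set_probability([], U))            # 1
print(set_probability([evens, P3], U))   # 1/6
print(set_probability([evens, odds], U)) # 0

for prob in (Fraction(1, 2), Fraction(2, 3), Fraction(1, 6), Fraction(1), Fraction(0)):
    print(prob, round(information(prob), 2))
```

The loop prints the informational values 1.0, 0.58, 2.58, 0.0 and 0.0, showing that both certain and impossible formulae are informationally empty under the stated convention.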
Example 4.2 (Quantity of information in formulae based on the two situations of Example 3.3)

φ         | f(Σ1)(φ)   | I(f(Σ1))(φ) | f(Σ2)(φ) | I(f(Σ2))(φ)
P1        | {2,4,6}    | 1           | {H}      | 1
P2        | {1,3,5}    | 1           | {T}      | 1
P3        | {1,2,3,5}  | 0.58        | ∅        | 0
P4        | {1}        | 2.58        | ∅        | 0
¬P1       | {1,3,5}    | 1           | {T}      | 1
P1 ∧ P3   | {2}        | 2.58        | ∅        | 0
P1 ∨ P3   | U          | 0           | {H}      | 1
P1 → P2   | {1,3,5}    | 1           | {T}      | 1
P1 → P3   | {1,2,3,5}  | 0.58        | {T}      | 1
P2 → P3   | U          | 0           | {H}      | 1

Definition 4.3 (Quantity of information in a set of formulae)
Let Γ = {φ1, ..., φn} ⊆ Form(L). The informational value of Γ (quantity of information in Γ), according to f(Σ), denoted by "I*(f(Σ))(Γ)", is defined by:

I*(f(Σ))(Γ) =df – log2 P*(f(Σ))(Γ).

When there is no risk of ambiguity or imprecision, we shall use I(Γ) instead of I*(f(Σ))(Γ). Although the domain of this function may be different from that of the information function concerning formulae, its image set is the same; in other words, I*(f(Σ)): ℘(Form(L)) → Q+.

Definition 4.4 (Informational logical consequence)
A formula φ is informational logical consequence of a set Γ of sentences, which is denoted by "Γ ⊫ φ", if, for every f(Σ), I(f(Σ))(Γ) ≥ I(f(Σ))(φ).

When Γ = ∅, instead of "∅ ⊫ φ", we simply write "⊫ φ"; "Γ ⊯ φ" denotes that φ is not informational logical consequence of Γ. The formulae of Γ are called premises and φ is termed conclusion.

According to the above definition, a formula is informational logical consequence of a given set of formulae if, and only if, the quantity of information present in the conclusion is never greater than the quantity of information in the premises. In probabilistic logical consequence, the relation is inverse.

Theorem 4.5
a: φ → ψ ⊫ ¬φ ∨ ψ.
b: φ ∧ ψ ⊫ ¬(¬φ ∨ ¬ψ).
c: φ ∨ ψ ⊫ ¬(¬φ ∧ ¬ψ).
d: φ ∧ ψ ⊫ ψ ∧ φ.
e: φ ∨ ψ ⊫ ψ ∨ φ.
f: ¬¬φ ⊫ φ.
g: φ ∧ φ ⊫ φ.
h: φ ∨ φ ⊫ φ.
i: ⊤P ∧ φ ⊫ φ.
j: ⊤P ∨ φ ⊫ ⊤P.
k: ⊥P ∧ φ ⊫ ⊥P.
l: ⊥P ∨ φ ⊫ φ.
m: ⊤P ⊫ ⊥P.
n: φ ⊫ φ.
■

Since the quantity of information in the premise and conclusion of each one of the items of the above theorem is the same for each given situation, the reciprocal of each one of the items is also valid, such that one formula is informational logical consequence of the other.

The first three items of the above theorem show that the definitions of one connective, obtained from others, are maintained in informational logical consequence. The first item shows that the notion of implication presupposed here is that of material implication. The fourth and sixth items show, respectively, that the logic underlying informational logical consequence is neither temporal logic nor intuitionistic logic.

Theorem 4.6
a: ⊫ φ ⇔ I(φ) = 0, for every f.
b: ⊫ φ ⇔ Γ ⊫ φ, for every Γ.
c: I(Γ) = 0, for every f, and Γ ⊫ φ ⇒ ⊫ φ.
d: ⊨P φ ⇒ ⊫ φ.
■

Proposition 4.7
a: φ ∧ ψ ⊯ φ.
b: φ → ψ, φ ⊯ ψ.
c: φ → ψ, ¬ψ ⊯ ¬φ.
d: φ → ψ, ψ → γ ⊯ φ → γ.
e: φ ⊯ ψ → φ.
f: φ ⊯ φ ∨ ψ.
g: φ ∨ ψ, ¬φ ∨ γ ⊯ ψ ∨ γ.
■

The above proposition shows that the rules of inference of a large part of formal logical systems are not valid in informational terms. The second item is the rule of modus ponens, adopted in systems such as that of Mendelson (1964); the last two items are the expansion rule and the cut rule, adopted by Shoenfield (1967).

The same can be said when the above items are treated as arguments. Arguments that are traditionally considered valid can present more information in the conclusion than in the set of premises, which means that, according to the perspective in question, they act to amplify information.

As shown by Alves (2012b), the invalidity of these rules of inference or arguments is generally due to the possibility of the set of premises being informationally empty in a given situation. In the case of the expansion rule, when, in a given situation f(Σ), P(φ) = 0 and I(ψ) ≠ 0, then I(φ) < I(φ ∨ ψ). Hence, φ ⊯ φ ∨ ψ.
While the premise is informationally empty, the conclusion produces novelty and reduces uncertainty. For example, in the game of dice, the sentence "it fell on number seven", for which the probability of occurrence is zero, would not reduce the uncertainty about what occurred in the game. However, the sentence "it fell on an even number or on number seven" possesses a quantity of information that is greater than zero, since it reduces uncertainty: the dice could have fallen on numbers two, four, or six, eliminating the possibility of having fallen on an odd number. Thus, in the conclusion, there is something that did not exist in the premise. There was an informational gain, given that the quantity of information in the premise was null.

In the case of modus ponens, the situation f(Σ1) of Example 4.2 above provides an example in which the informational value of the premises, interpreted as "if it falls on evens, then it falls on odds" and "it falls on evens", is less than the informational value of the conclusion, interpreted as "it falls on odds". Here, the informational value of the premises is null. Since the events "fall on odds" and "fall on evens" are mutually exclusive, the sentence "if it falls on evens, then it falls on odds" is equivalent to "it falls on odds or it falls on odds", which is equivalent to "it falls on odds". The probability of the set of premises is therefore defined from the intersection of "it falls on odds" and "it falls on evens", equivalent to "it falls on odds and does not fall on odds", for which the probability is zero. However, the conclusion "it falls on odds" has probability 1/2 and therefore an informational value of 1, which is greater than the informational value of the set of premises.

The example discussed in the preceding paragraph, which is an individual case of Theorem 4.8b, outlined below, seems to fit an intuitive notion of informational logical consequence.
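The two dice examples above, for the expansion rule and for modus ponens, can be reproduced numerically. The sketch below works directly with events of the dice throw; the variable names and the helper `information` are illustrative, and the value 0 for the impossible event follows the convention of Definition 4.1.

```python
from math import log2

U = frozenset(range(1, 7))     # sample space of the throw of a dice
evens = frozenset({2, 4, 6})   # "it falls on evens"
odds = frozenset({1, 3, 5})    # "it falls on odds"
seven = frozenset()            # "it fell on number seven": impossible event

def information(event):
    """-log2 of the event's probability, taken as 0 for the impossible event."""
    if not event:
        return 0.0
    return log2(len(U) / len(event))

# Expansion rule: premise "seven", conclusion "even or seven".
expansion_premise = information(seven)
expansion_conclusion = information(evens | seven)

# Modus ponens: premises "if evens then odds" and "evens", conclusion "odds".
implication = (U - evens) | odds      # material implication as an event
premises_event = implication & evens  # a set of premises is read as an intersection
mp_premises = information(premises_event)
mp_conclusion = information(odds)

print(expansion_premise, expansion_conclusion)  # 0.0 1.0
print(mp_premises, mp_conclusion)               # 0.0 1.0
```

In both cases the conclusion carries more information than the premises in this situation, so a single situation is enough to refute the corresponding informational consequence.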
Intuitively, "fall on odd and not fall on odd" provides no information concerning a circumstance. Hence, any probabilistically contingent conclusion can contain more information than is contained by the premises.

Several steps of the demonstrations of the following theorems have been omitted. These steps, indicated by "TPP", refer to theorems previously proved in Alves (2012b).

Theorem 4.8: Let Γ = {φ1, ..., φn}. Then:
a: Γ ⊫ φ if and only if I(φ) = 0, for every f, or (φ1 ∧ ... ∧ φn) ≡ φ;
b: If I(φ) ≠ 0, for a given f, then ⊥P ⊯ φ;
c: Γ ⊫ ⊤P;
d: Γ ⊫ ⊥P;
e: If I(φ) ≠ 0, for a given f, then ⊤P ⊯ φ.

Proof
a: (⇒): Let Γ ⊫ φ. Suppose that I(φ) ≠ 0, for a given f, and (φ1 ∧ ... ∧ φn) ≢ φ. It is then possible to show the existence of f′ such that I(f′)(Γ) < I(f′)(φ), contradicting the initial hypothesis: I(f′)(Γ) = 0 and I(f′)(φ) = I(f)(φ). Hence, when Γ ⊫ φ we have that if I(φ) ≠ 0, for a given f, then (φ1 ∧ ... ∧ φn) ≡ φ, in other words, I(φ) = 0, for every f, or (φ1 ∧ ... ∧ φn) ≡ φ.
(⇐): Case 1: Let I(φ) = 0, for every f. Then, by TPP, I(Γ) ≥ I(φ), for every f and every Γ. So, by Definition 4.4, Γ ⊫ φ. Case 2: By Definition 3.6, (φ1 ∧ ... ∧ φn) ≡ φ if and only if P(φ1 ∧ ... ∧ φn) = P(φ), for every f. By Definition 3.7, P(φ1 ∧ ... ∧ φn) = P(φ), for every f, if and only if P(Γ) = P(φ), for every f. Then, by TPP, I(Γ) = I(φ), for every f. So, by Definition 4.4, Γ ⊫ φ.
b: Let P(φ) = 1/2. Then I(⊥P) = 0 < I(φ) = 1.
c: Since I(⊤P) = 0, then I(Γ) ≥ I(⊤P).
d: Since I(⊥P) = 0, then I(Γ) ≥ I(⊥P).
e: Let P(φ) = 1/2. Then I(⊤P) = 0 < I(φ) = 1.
■

The first item above describes the arguments that are valid according to the informational perspective of logical consequence.

The second item explains that a probabilistically contradictory formula cannot lead informationally to any informative formula. In fact, only formulae that are probabilistically valid or inconsistent, in other words not informative, are informational logical consequence of contradictory formulae.
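Items b to e of Theorem 4.8 can be explored with a brute-force search for counterexamples over a single small sample space. This is only a sketch under stated assumptions: premises and conclusions are modelled as hypothetical Python callables that map a tuple of atom events to an event, and the search covers one fixed Σ, so a returned counterexample is conclusive while a result of `None` is merely evidence, not a proof.

```python
from itertools import combinations, product
from math import log2

def powerset(universe):
    """All events (subsets) of a finite sample space."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def information(event, universe):
    """-log2 of the event's probability, taken as 0 for the impossible event."""
    if not event:
        return 0.0
    return log2(len(universe) / len(event))

def find_counterexample(premises, conclusion, n_atoms, universe):
    """Return atom events with I(premises) < I(conclusion), or None."""
    for events in product(powerset(universe), repeat=n_atoms):
        joint = universe                      # empty premise set: whole space
        for premise in premises:
            joint = joint & premise(events)
        if information(joint, universe) < information(conclusion(events), universe):
            return events
    return None

U = frozenset({"H", "T"})
phi = lambda ev: ev[0]                        # an atomic formula
bottom = lambda ev: ev[0] & (U - ev[0])       # phi and not phi
top = lambda ev: ev[0] | (U - ev[0])          # phi or not phi

print(find_counterexample([bottom], phi, 1, U))  # a situation refuting bottom ⊫ phi
print(find_counterexample([top], phi, 1, U))     # a situation refuting top ⊫ phi
print(find_counterexample([phi], top, 1, U))     # None
print(find_counterexample([phi], bottom, 1, U))  # None
```

The first two searches return a situation in which φ has probability 1/2, matching the choice P(φ) = 1/2 made in the proofs of items b and e.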
Presupposing the distinction between classical and non-classical systems, as suggested by Da Costa (1993; 1997), or by Haack (1978), we can conclude from the second item of the above theorem that classical logic is not the logic underlying informational logical consequence. This is because, in classical formal logical systems, a contradiction generates any formula. Considering that complementary logics, such as modal logics, retain the same principles of classical logic, we can also conclude that no complementary logic underlies informational logical consequence.

The remaining candidates are heterodox logics, such as intuitionistic and paraconsistent systems, as described by D'Ottaviano (1992). In intuitionistic logics, negation possesses certain particular characteristics. Such characteristics do not permit, for example, recourse to proofs employing reduction to the absurd, given that formulae such as ¬¬φ → φ are not valid in these systems. Meanwhile, it can be shown that φ ⊫ ¬¬φ and ¬¬φ ⊫ φ and, from Theorem 4.9b below, we have ⊫ φ ↔ ¬¬φ. Therefore, intuitionistic logic cannot provide a basis for informational logical consequence.

Theorem 4.8, especially the second item, indicates that the logic underlying informational logical consequence is, at least, paraconsistent sensu lato. This is because this notion of consequence does not permit, for example, the Ex Falso Quodlibet, also known as the Explosion Principle: from a contradiction, not every formula follows.

The third item of the above theorem is shared by both probabilistic and veritative-functional logical consequence, but for different reasons. Informationally, a probabilistically valid formula is logical consequence of any formula, because its value is minimal; hence, it cannot provide more information than any other formula.
In probabilistic and veritative-functional terms, it is logical consequence of any formula because its value is maximal; in other words, it possesses the probabilistic value '1' or the truth value 'V'. The fourth item, like the second, is not valid in the other two notions of logical consequence considered here. In these notions, an inconsistent formula is logical consequence solely of a contradictory set of formulae. In the informational version, it is logical consequence of any set of formulae. In informational terms, there is no distinction between valid and inconsistent formulae. The fifth result expresses a similarity between informational logical consequence and the other notions of consequence: contingent formulae are not logical consequence of valid formulae.

Theorem 4.9
a: If φ ⊫ ψ, and ψ ⊫ γ, then φ ⊫ γ.
b: If φ ≡ ψ, then ⊫ φ ↔ ψ.
c: It is not the case that: Γ, φ ⊫ ψ if and only if Γ ⊫ φ → ψ.
d: It is not the case that: ⊫ φ and ⊫ ψ if and only if φ ≡ ψ.
e: It is not the case that: Γ ⊫ φ if and only if f(Γ) ⊆ f(φ), for every f.
f: It is not the case that: Γ ⊫ φ if and only if Γ ⊨V φ.

Proof
a: Let φ ⊫ ψ, and ψ ⊫ γ. By Definition 4.4, I(φ) ≥ I(ψ) ≥ I(γ), for every f. Then, by TPP, I(φ) ≥ I(γ), for every f. So, by Definition 4.4, φ ⊫ γ.
b: By TPP, if φ ≡ ψ, then φ ↔ ψ is ⊤P. And, by TPP, if φ ↔ ψ is ⊤P, then ⊨P φ ↔ ψ. So, by Theorem 4.6d, ⊫ φ ↔ ψ.
c: (⇏): Let Γ = ∅. ⊥P ⊯ ψ, but ⊫ ⊥P → ψ. (⇍): ⊯ ψ → ⊥P, but ψ ⊫ ⊥P.
d: (⇏): ⊫ ⊥P and ⊫ ⊤P, but ⊥P ≢ ⊤P. (⇍): φ ∧ ψ ≡ φ ∧ ψ, but ⊨P φ ∧ ψ and ⊨P φ ∧ ψ do not occur.
e: (⇏): ⊤P ⊫ ⊥P, but f(⊤P) = U ⊈ f(⊥P) = ∅. (⇍): Let f(φ) ≠ ∅. Then f(⊥P) ⊆ f(φ), but ⊥P ⊯ φ.
f: (⇏): φ ∨ ¬φ ⊫ φ ∧ ¬φ, but φ ∨ ¬φ ⊭V φ ∧ ¬φ. (⇍): ψ ⊨V φ ∨ ψ, but ψ ⊯ φ ∨ ψ. ■

The above theorem expresses some of the characteristic properties of informational logical consequence, when compared to the probabilistic and veritative-functional versions.
In contrast to the latter two versions, the converse of Theorem 4.9b, especially, cannot be shown for the informational version; in other words, it is not the case that if ⊫ φ ↔ ψ, then φ ≡ ψ, given that, for example, ⊫ ⊥P ↔ ⊤P, but ⊥P ≢ ⊤P. Theorem 4.9c shows the invalidity of the semantic counterpart of the Deduction Theorem. The fourth and fifth items illustrate some of the innate characteristics of informational logical consequence, as discussed in the final considerations (below). Theorem 4.9f expresses a distinction between the relations of informational and veritative-functional logical consequence.

We show below that informational logical consequence is not a Tarskian logical consequence, according to the definition set out in the introduction of this paper.

Theorem 4.10
a: It is not the case that: if φ ∈ Γ, then Γ ⊫ φ.
b: It is not the case that: if Γ ⊆ Δ and Γ ⊫ φ, then Δ ⊫ φ.
c: If Γ ⊫ ψ, for each ψ ∈ Δ, and Δ ⊫ φ, then Γ ⊫ φ.

Proof: Let f be any situation.
a: Let P(Γ) = 0 and P(φ) = 1⁄2. Then, by TPP, I(Γ) < I(φ).
b: Let Γ = {φ}, Δ = {φ, ¬φ} and I(φ) ≠ 0. Then, by TPP, Γ ⊫ φ and Δ ⊯ φ.
c: Case 1: Let I(Γ) = 0. Then, by Hypothesis, I(ψ) = 0, for each ψ ∈ Δ. By TPP, I(Δ) = 0. Thus, by Hypothesis, I(φ) = 0. So, I(Γ) = I(φ). Case 2: Let I(Γ) ≠ 0. By TPP, and by Hypothesis, P(Γ) ≤ P(ψ), for each ψ ∈ Δ. In this case, it can be shown that f′(Γ) ⊆ f′(Δ) ⊆ f′(φ), for every f′. Then, by TPP and Definition 3.8, P(Γ) ≤ P(Δ) ≤ P(φ). So, by TPP, I(φ) ≤ I(Γ). ■

Thus, informational logical consequence is neither reflexive nor monotonic, although it satisfies the transitivity condition of Theorem 4.10c.

This completes, for the purposes of this paper, our presentation of informational logical consequence. Other results and properties can be found in Alves (2012b).
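The counterexamples behind Theorems 4.10a, 4.10b and 4.9c can be reproduced in a single situation. This is a sketch under stated assumptions rather than the authors' construction: a fair six-sided die, formulae interpreted directly as events, and the information of a premise set computed from the conjunction (intersection) of its members, with the empty premise set treated as certain. The helper names `info_of_premises` and `holds_here` are hypothetical. Since ⊫ quantifies over every situation, a single failing situation is enough to refute a consequence.

```python
from math import log2

U = frozenset(range(1, 7))  # assumption: outcomes of one fair die

def info(event):
    p = len(event) / len(U)
    return 0.0 if p in (0.0, 1.0) else -log2(p)

def info_of_premises(events):
    """Information of a premise set, taken as that of the conjunction of its members."""
    conjunction = U  # the empty premise set denotes the whole sample space
    for event in events:
        conjunction &= event
    return info(conjunction)

def holds_here(premises, conclusion):
    """I(premises) >= I(conclusion) in this one situation (necessary for consequence)."""
    return info_of_premises(premises) >= info(conclusion)

even = frozenset({2, 4, 6})
odd = U - even        # plays the role of the negation of "even"
bottom = frozenset()  # a probabilistically contradictory formula

# Theorem 4.10a: a member of the premise set need not be a consequence of it.
reflexivity_case = holds_here([even, odd], even)   # False

# Theorem 4.10b: enlarging the premise set can destroy a consequence.
before_adding = holds_here([even], even)           # True
after_adding = holds_here([even, odd], even)       # False

# Theorem 4.9c with an empty set of side premises: the conditional "bottom -> even"
# follows from nothing, yet "even" does not follow from "bottom".
conditional = (U - bottom) | even                  # equals the whole sample space
from_empty_set = holds_here([], conditional)       # True
from_contradiction = holds_here([bottom], even)    # False
```

The two positive checks do not depend on the chosen situation: a formula always carries exactly the information of the singleton set containing it, and a conditional with a contradictory antecedent always denotes the whole sample space.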
Final considerations

The shift of perspective in the analysis of the notion of logical consequence, from the truth value to the quantity of information of formulae, enables characteristics to emerge that are uniquely attributable to this approach. Alves (2012b) presents some of the main similarities and differences between this concept and the usual perspective whereby logical consequence is defined in terms of the preservation of truth from the premises to the conclusion of an argument. In what follows, we highlight the main results obtained from the elements presented in this paper.

(FC1) Informationally empty formulae are informational logical consequence of the empty set (Theorem 4.6a). This means that probabilistically valid and contradictory formulae are self-sustained, which illustrates a first difference between, on one hand, the veritative-functional and probabilistic versions and, on the other hand, the informational version of logical consequence. In the former two versions, a contradiction is generally not a logical consequence of a given set of premises.

(FC2) The formulae that are informational logical consequence of a given set of formulae whose quantity of information is always null are informational logical consequence of the empty set (Theorem 4.6c). This result also illustrates an inherent characteristic of the informational perspective of logical consequence. From this, it follows that if a formula is informational logical consequence of a contradictory set of formulae, it is informational logical consequence of the empty set. This does not generally hold in the case of the veritative-functional and probabilistic perspectives of logical consequence.

(FC3) Some of the rules of inference of classical formal logical systems, and some of the arguments traditionally considered valid, do not possess general validity in the informational perspective of logical consequence (Proposition 4.7).
The cases in which the set of premises possesses null information provide examples showing that the conclusion can be more informative than the set of premises in these rules or arguments. Meanwhile, it can be shown that when the quantity of information in the set of premises is greater than zero, the quantity of information in the conclusion is never greater than the quantity of information in the set of premises. This result indicates that according to the veritative-functional perspective, the conclusion of a valid argument can possess more information than its set of premises. This seems to conflict with the conception that in a valid argument, the information in the conclusion is already implicitly or explicitly given in the premises. We leave it open for future work to investigate the nature of the information underlying the veritative-functional perspective, as well as the association between the truth and probability values and the sentences involved in a logical consequence relation. We believe that this analysis should resolve, at least partially, the strangeness indicated in this paragraph.

In the informational and veritative-functional perspectives, amplifying inductive arguments, where the conclusion is more informative than the set of premises, are invalid. Meanwhile, such arguments may be considered interesting in some areas of knowledge, given that they amplify information.

The proposition in question also shows that the informational perspective of logical consequence is not equivalent to the veritative-functional and probabilistic perspectives. There are formulae that are the veritative-functional and probabilistic logical consequence of a given set of formulae, but are not informational logical consequence of it. On the other hand, as already shown from Theorems 4.6d or 4.8d, some formulae can be informational logical consequence of a given set of formulae, but cannot be logical consequence from the probabilistic or veritative-functional points of view.
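The observation in (FC3) about premises with non-null information can be checked exhaustively for modus ponens on a small model. The sketch below assumes a fair six-sided die and enumerates every assignment of events to two atoms p and q; it is an illustration of the claim for this one rule and this one sample space, not a proof. The counter names are hypothetical.

```python
from itertools import combinations
from math import log2

U = frozenset(range(1, 7))  # assumption: outcomes of one fair die
EVENTS = [frozenset(c) for r in range(len(U) + 1)
          for c in combinations(sorted(U), r)]

def info(event):
    p = len(event) / len(U)
    return 0.0 if p in (0.0, 1.0) else -log2(p)

violations_when_informative = 0  # informative premises, yet a more informative conclusion
gains_from_null_premises = 0     # null premises, informative conclusion

for p in EVENTS:
    for q in EVENTS:
        conditional = (U - p) | q     # "if p then q"
        premises = conditional & p    # conjunction of both premises
        if info(premises) > 0:
            if info(q) > info(premises):
                violations_when_informative += 1
        elif info(q) > 0:
            gains_from_null_premises += 1

print(violations_when_informative, gains_from_null_premises)
```

Under these assumptions no situation with informative premises yields a more informative conclusion, whereas situations with informationally null premises and an informative conclusion do occur, which is exactly where modus ponens fails informationally.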
(FC4) A formula is informational logical consequence of a given set of formulae only if that set of premises is probabilistically equivalent to the conclusion, or if the conclusion is informationally null (Theorem 4.8). This result shows that a probabilistically contradictory formula can be informational consequence of a given set of informative formulae. As an example: φ ⊫ φ ∧ ¬φ. If, on one hand, informational logical consequence is distinct from traditional classical deductive logical consequence, on the other hand, it should not be considered to be an inductive inference. This is because a characteristic of induction is that it permits the conclusion to possess more information than its set of premises.

(FC5) The logic underlying informational logical consequence is, at the least, paraconsistent sensu lato (Theorem 4.8b). One of the motives for arriving at this conclusion is that in paraconsistent logics, the principle of explosion is not valid: from a contradiction, not every formula follows. A proposal for future work is to analyze the elementary characteristics of a paraconsistent system, and show that they are satisfied by informational logical consequence.

The logic underlying informational logical consequence is not classical logic, since in classical formal logical systems a contradiction generates any formula. Consequently, the complementary logics are also unable to provide a basis for informational logical consequence, given that they satisfy the principles of classical logic. Heterodox logics, such as the intuitionistic and temporal systems, are also unable to fulfill this role, as shown in the commentaries on Theorems 4.8b and 4.5d.

From Theorem 4.8b, we find that, in contrast to the veritative-functional and probabilistic perspectives, in formal logical systems that adopt informational logical consequence, the inconsistency of a given theory does not imply its triviality, as shown by Alves (2012b).
In formal logical systems such as the classical ones, the triviality of a theory signifies that any formula would be considered a theorem of the given theory. Consequently, from the completeness theorem, for these systems we have Γ ⊨V ψ and Γ ⊨P ψ, for any ψ. Meanwhile, since in informational logical consequence, in general, ⊥P ⊯ ψ, it is not possible to conclude that the inconsistency of a theory implies its triviality.

(FC6) The semantic counterpart of the Deduction Theorem is not valid in the informational perspective of logical consequence (Theorem 4.9c). The last three items of Theorem 4.9 illustrate other significant results specific to the informational perspective. In the veritative-functional and probabilistic perspectives, if two formulae are logical consequence of the empty set of premises, then they are equivalent, as they are both valid. In the informational version, as expressed in Theorem 4.9d, this does not possess general validity. A probabilistically valid formula and another that is probabilistically inconsistent provide an example to demonstrate the invalidity of this result. Theorem 4.9e illustrates another specificity of informational logical consequence. The corresponding result can be easily shown for the probabilistic version, as reported by Alves (2012b). In the case of the veritative-functional version, it is necessary to adapt the right-hand side of the result: the truth value of the set of premises is less than or equal to the truth value of the conclusion, in all evaluations. This signifies that it is never possible that all the premises are true while the conclusion is false in the same evaluation. Finally, the last item of the theorem states that informational and veritative-functional logical consequence are not satisfied by exactly the same sets of sentences.

(FC7) Informational logical consequence is not a Tarskian logical consequence (Theorem 4.10).
A large part of the characteristics specific to the informational perspective of logical consequence, such as non-monotonicity, derives from cases in which the quantity of information is null. The specific difference of this perspective is found in cases that concern extreme probability values, whether in individual situations or in all situations. While this may be the basic difference, it should not be considered a small difference. It produces results that might be considered discrepant, when compared to the traditional perspectives of logical consequence. Amongst these, we recall that some rules of inference do not constitute valid arguments, and that not every formula is informational logical consequence of a contradictory set of formulae. Furthermore, it shows that logical consequence, when analyzed from the informational viewpoint in question, ceases to be Tarskian, and that the logic underlying this perspective is not classical, but at the very least paraconsistent.

(FC8) The set of premises of an informational inference is always finite, and the sample space is constituted of a finite set of events. These two characteristics represent serious restrictions in our proposal. The first restriction indicates the impossibility of dealing with arguments using a potentially infinite set of premises, such as the arguments described by Tarski (1956b). This author claims to have constructed a theory in which all sentences of the type "n possesses the property P", for natural n, are theorems of the theory, while the sentence "every natural number possesses the property P" cannot be proved in the theory. Hence, this sentence is not their logical consequence, which in this case seems absurd. The second restriction limits the possible models for a language of a formal system. A proposal for future work would be to consider informational logical consequence based on a definition that involves an infinite probability space.
It would then become viable to analyze informational logical consequence for languages of first-order theories, which has not been addressed in the present work.

References

ALVES, M. A. Informação e conteúdo informacional: notas para um estudo da ação. In: GONZALEZ, M. E. Q.; BROENS, M. C. (Orgs.). Informação, Conhecimento e Ação Ética. São Paulo: Cultura Acadêmica, p. 98-112, 2012a.
______. Lógica e Informação: Uma Análise da Consequência Lógica a Partir de uma Perspectiva Quantitativa da Informação. Campinas/SP: Universidade Estadual de Campinas, 2012b (Doctoral thesis).
BRESCIANI FILHO, E.; D'OTTAVIANO, I. M. L. Conceitos básicos de sistêmica. In: D'OTTAVIANO, I. M. L.; GONZALEZ, M. E. Q. (Orgs.). Auto-organização: Estudos Interdisciplinares. Campinas: Universidade Estadual de Campinas/CLE, p. 283-306, 1990 (Coleção CLE, v. 30).
DA COSTA, N. C. A. Lógica Indutiva e Probabilidade. 2nd ed. São Paulo: Ed. HUCITEC, 1993.
______. Logiques Classiques et Non Classiques: Essai sur les Fondements de la Logique. Paris: Masson, 1997.
D'OTTAVIANO, I. M. L. A lógica clássica e o surgimento das lógicas não-clássicas. In: ÉVORA, F. R. R. (Org.). Século XIX: O Nascimento da Ciência Contemporânea. Campinas: Universidade Estadual de Campinas/CLE, p. 65-93, 1992 (Coleção CLE, v. 11).
DRETSKE, F. Knowledge and the Flow of Information. Oxford: Basil Blackwell, 1981.
ENDERTON, H. B. Elements of Set Theory. San Diego: Academic Press, 1977.
FEITOSA, H. A.; D'OTTAVIANO, I. M. L. Um olhar algébrico sobre as traduções intuicionistas. In: SAUTTER, F. T.; FEITOSA, H. A. (Orgs.). Lógica: Teoria, Aplicações e Reflexões. Campinas: Universidade Estadual de Campinas/CLE, p. 59-90, 2004 (Coleção CLE, v. 39).
GONZALEZ, M. E. Q. Informação e cognição: uma proposta de (dis)solução do problema mente-corpo. In: Encontro Brasileiro/Internacional de Ciências Cognitivas, 2, 1996, Campos dos Goytacazes. Anais... Campos dos Goytacazes: Universidade Estadual do Norte Fluminense, p. 53-60, 1996.
HAACK, S. Philosophy of Logics. Cambridge: Cambridge University Press, 1978.
HARTLEY, R. Transmission of information. Bell System Technical Journal, v. 7, p. 535-563, 1927.
HERSHBERGER, W. Principles of Communication Systems. New York: Prentice-Hall, 1955.
MATES, B. Elementary Logic. New York: Oxford University Press, 1972.
MENDELSON, E. Introduction to Mathematical Logic. Princeton, NJ: D. Van Nostrand, 1964.
SHANNON, C. A mathematical theory of communication. The Bell System Technical Journal, v. 27, p. 379-423, 1948.
______; WEAVER, W. The Mathematical Theory of Communication. Urbana: University of Illinois Press, 1949.
SHOENFIELD, J. Mathematical Logic. Reading, Mass.: Addison-Wesley Publishing Company, 1967.
TARSKI, A. On some fundamental concepts of metamathematics. In: TARSKI, A. Logic, Semantics, Metamathematics: Papers from 1923 to 1938. Translated by J. H. Woodger. Oxford: Clarendon Press, p. 30-37, 1956a.
______. Fundamental concepts of methodology of deductive sciences. In: TARSKI, A. Logic, Semantics, Metamathematics: Papers from 1923 to 1938. Translated by J. H. Woodger. Oxford: Clarendon Press, p. 60-109, 1956b.
______. Concept of truth in formalized languages. In: TARSKI, A. Logic, Semantics, Metamathematics: Papers from 1923 to 1938. Translated by J. H. Woodger. Oxford: Clarendon Press, p. 152-278, 1956c.
______. On the concept of logical consequence. In: TARSKI, A. Logic, Semantics, Metamathematics: Papers from 1923 to 1938. Translated by J. H. Woodger. Oxford: Clarendon Press, p. 409-420, 1956d.