The desire to be puzzled

A prominent professor in the philosophy of mathematics once told me that the key to writing an attractive philosophy paper is to present the reader with a puzzle. “Give me a puzzle, and I’ll be interested”, he said. As I was surrounded by mathematicians and philosophers of mathematics who were steadily exchanging puzzles, I had no doubt that he was right: mathematicians and philosophers of mathematics like puzzles. But then, mightn’t it be the case that this fondness for puzzles influences much more than just our judgement of a philosophy paper (and our conversations over dinner)? Here’s a crazy idea – or maybe not so crazy – does our desire to be puzzled affect our judgement of a certain foundational mathematical theory?

The foundational mathematical theory which I have in mind is, of course, Cantor’s transfinite set theory. Given its general acceptance nowadays, it is easy to forget that Cantor’s theory is not the inevitable way to generalize arithmetic from the finite to the infinite, and that it is in fact based on an objectionable conceptual choice: it requires that we give up the principle that a whole is always bigger than each of its proper parts. Since the notion of ‘size’ of a collection (or: set) is in Cantor’s theory defined in terms of one-one correspondence, the collection of natural numbers has the same size, i.e. is ‘just as big’, as, for example, the collection of all even numbers, even though the latter is a proper part of the former.

As is finely described in Mancosu’s (2009) paper Measuring the size of infinite collections of natural numbers, throughout history two basic intuitions have been at play concerning the ‘size’ of collections:

PW (Part-Whole, or Euclid’s axiom): every whole is strictly bigger than each of its proper parts;

HP (Hume’s principle, or Cantor’s axiom): two collections have the same size iff there is a one-to-one association between their elements.

For finite collections, PW and HP are both obviously true; the problem is that for infinite collections they turn out to be jointly inconsistent. We can only hold on to both intuitions if we deny that the relations of equality, less than, and greater than apply to infinite collections (this was Galileo’s solution), or that infinite collections can be taken as a whole to which a size can be attributed (as Leibniz held); and this is nowadays commonly considered to be too high a price.
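The clash between PW and HP can be made concrete with the even numbers. What follows is only a minimal sketch: the function names `to_even` and `from_even` are hypothetical, the natural numbers are assumed to start at 0, and a finite sample can of course only illustrate the correspondence, not prove it.

```python
from itertools import count, islice


def to_even(n: int) -> int:
    """Send the natural number n to the even number 2n."""
    return 2 * n


def from_even(m: int) -> int:
    """Inverse map: send the even number m back to m // 2."""
    if m % 2 != 0:
        raise ValueError("not an even number")
    return m // 2


# The first few pairs of the correspondence 0<->0, 1<->2, 2<->4, ...
pairs = [(n, to_even(n)) for n in islice(count(), 6)]

# On a finite sample the map can be undone without loss, so no natural
# number and no even number is left unpaired (HP: same size).
round_trip_ok = all(from_even(to_even(n)) == n for n in range(1000))

# Every even number is a natural number, but 1 is not even, so the even
# numbers form a proper part of the naturals (PW: strictly smaller).
one_is_even = (1 % 2 == 0)
```

Here `round_trip_ok` speaks for HP and `one_is_even` being false speaks for PW; both verdicts concern the same two collections, which is exactly the tension described above.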

Cantor himself acknowledged that there are two intuitions at play concerning the notion of ‘size’ of collections. He wrote that in some sense the collection of natural numbers is bigger (he calls it ‘richer’) than the collection of the even numbers (because, for example, the collection of even numbers is a proper part of the natural numbers), but in another sense, the collection of natural numbers is just as big as the collection of even numbers (because every natural number has an even number corresponding to it). Nevertheless, as is well-known, Cantor opted for abandoning PW and adopting HP as the basis for the notion of ‘size’ in his transfinite theory.

Cantor’s choice engendered a beautiful and powerful mathematical theory, which seems to have led us to believe that dropping PW is the only way to generalize arithmetic to infinite sets. But, as is pointed out by Mancosu, this is in fact not the case: mathematical theories have been developed which generalize finite arithmetic in such a way that they preserve the part-whole principle for infinite sets. Examples are F.M. Katz’s Class Size theory and Benci, Di Nasso & Forti’s theory of numerosities.

Thus, Cantor’s theory is commonly accepted even though it forces us to let go of the highly intuitive part-whole principle and there are alternatives which do not force us to do so. What makes Cantor’s theory so attractive?

A possible answer to this question, and indeed Kitcher’s (1984) answer, is that Cantor’s theory is superior to its alternatives because of its explanatory power. Kitcher, as quoted in Mancosu, writes that the real advantage of Cantor’s theory is that

“we do not even need to go so far into transfinite arithmetic to receive explanatory dividends. Cantor’s initial results on the denumerability of the rationals and the algebraic numbers, and the non-denumerability of the reals, provide us with new understanding of the differences between the real numbers and the algebraic numbers. Instead of viewing transcendental real numbers (numbers which are not the roots of polynomial equations in rational coefficients) as odd curiosities, our comprehension of them increased when we see why algebraic numbers are the exception rather than the rule.” (Kitcher 1984, p. 221)

Thus, according to Kitcher, the benefit of Cantor’s theory is that while generalizing arithmetic from finite to infinite sets, it yields new insights which are not linked to this generalization, namely, new understanding of the differences between the real and the algebraic numbers.

But to what extent did Cantor’s theory provide us with new understanding of the differences between the real numbers and the algebraic numbers? Is it really true that our comprehension of the transcendental numbers increased with Cantor’s theory?

For sure, Cantor provided us with a new way to look at transcendental numbers. His theory employs new concepts (such as set, equinumerosity, denumerability, non-denumerability) and a new proof methodology (diagonalization). Cantor’s diagonal method allows us to construct transcendental numbers, and indeed infinitely many of them. A different method to construct infinitely many transcendental numbers was already given by Liouville in 1844, but on the assumption of Cantor’s transfinite theory, there are not simply infinitely many transcendental numbers, but uncountably (or non-denumerably) many of them. Importantly, Cantor’s proof that there are more transcendental than algebraic numbers is one by contradiction, and rests upon the assumption that the reals are uncountable.
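The core move of the diagonal method can be sketched on a finite list. This is an illustration under assumptions, not Cantor’s own presentation: the function name `diagonal` is hypothetical, the digit strings in `rows` are made-up sample data standing in for the decimal expansions of an enumerated list of numbers, and the choice of the digits 4 and 5 is just one convenient way to avoid expansions ending in all 0s or all 9s.

```python
def diagonal(rows):
    """Return a digit string that differs from rows[i] at position i."""
    out = []
    for i, row in enumerate(rows):
        d = int(row[i])
        # Pick 5 unless the diagonal digit is already 5, then pick 4.
        out.append("5" if d != 5 else "4")
    return "".join(out)


# Hypothetical sample data: each string stands for the first digits
# after the decimal point of the i-th number in some enumeration.
rows = [
    "1415926535",
    "7182818284",
    "4142135623",
    "0005000000",
    "3333333333",
]

new_digits = diagonal(rows)

# The constructed string disagrees with every listed row somewhere,
# namely at the row's own index, so it cannot occur in the list.
differs_from_all = all(new_digits[i] != rows[i][i] for i in range(len(rows)))
```

Applied to a complete enumeration of the algebraic numbers instead of five sample rows, the same construction yields a number outside the enumeration, that is, a transcendental one.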

It thus seems that we might as well argue, contrary to Kitcher, that Cantor’s transfinite theory upgraded the status of transcendental numbers from being “odd curiosities” to an outright mystery. Today, several classes of transcendental numbers have been identified, but still we have found only countably many of them (see the Wiki). This means, in Cantor’s framework, that there are uncountably many that we are missing. It seems to me that on the basis of a theory which does preserve the part-whole principle (such as the two theories mentioned before), even if it can be proved that there are more transcendental than algebraic numbers, there will be ‘more’ of them in a much less interesting, that is, less puzzling, way.

The factors which eventually lead to the acceptance or rejection of a theory are for sure not always transparent, nor need they be completely rational. For all I know, it adds to the allure of a mathematical theory if it provides us with a nice new puzzle.

Thanks to Eric Wawerczyk for helpful discussion.

Generality in explanations

A favorite example in the debate about the applicability of mathematics and mathematical explanation is the one concerning the life cycle of cicadas (cf. Ginammi 2014, p. 109; Mancosu SEP): the (biological) fact that cicadas have a prime life cycle is to be explained by the (mathematical) fact that prime periods minimize the intersections with other periods (and therefore make the cicadas better protected from predators).
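The mathematical fact can be illustrated with a small count. The sketch below rests on assumptions that are not in the sources cited above: the function name `coincidences` is hypothetical, predators are assumed to have periods of 2 to 9 years, candidate cicada cycles run from 12 to 18 years, and the horizon of 1000 years is arbitrary. A cycle of c years and a predator period of q years coincide every lcm(c, q) years.

```python
from math import gcd


def coincidences(cycle, predator_periods, horizon):
    """Count the years up to `horizon` in which an emergence of the
    cicadas coincides with a peak of some predator, summed over all
    predator periods."""
    total = 0
    for q in predator_periods:
        lcm = cycle * q // gcd(cycle, q)  # years between two coincidences
        total += horizon // lcm
    return total


# Hypothetical parameters, chosen only for illustration.
predator_periods = range(2, 10)
horizon = 1000

counts = {c: coincidences(c, predator_periods, horizon) for c in range(12, 19)}

# The two cycles with the fewest coincidences in this range.
best_two = sorted(counts, key=counts.get)[:2]
```

Under these assumptions the two cycles with the fewest coincidences are the primes 13 and 17. Longer cycles have fewer emergences to begin with, so the raw count slightly favors 17 over 13, but both primes lie well below all their composite neighbors.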

However, as Ginammi explains when he discusses this example in his dissertation about the applicability of mathematics, it is pointed out by Pincock (2012, ch. 10.2) that in the explanation of the prime life cycles of cicadas, we could replace the proposition

(p) prime periods minimize intersections with other periods

with

(p*) prime periods less than 100 years minimize intersections with other periods.

In this case, the weaker (p*) seems to have as much explanatory value as (p), because both explain the actual life cycles of cicadas equally well. This example suggests that in biology we can adopt propositions of different strength to explain the same conclusion, and we have no decisive reason to choose one over the other (cf. Ginammi 2014, p. 110).

However, do we really not have a reason to prefer one of (p) and (p*)? According to an ideal of science that originated from Aristotle’s Analytica Posteriora, proper scientific proofs should state their premises in their most general (hence strongest) form (cf. Betti & De Jong 2010). In accordance with this tradition, in Bolzano’s view for example

(t) Equiangular triangles have angles that together equal two right angles

should never be deployed in a proper scientific proof. The reason for this is that having angles that together equal two right angles can be truthfully predicated of all triangles, and not only of those that are equiangular. Therefore, as Bolzano argues, in a proper scientific proof we should use the maximally general

(t*) Triangles have angles that together equal two right angles (cf. Bolzano 1837, §447).

Pincock argues, as Ginammi quotes him (2014, p. 110), concerning the different propositions that we can adopt to explain the prime life cycles of cicadas:

I do not believe that the ability to explain nonactual instances of these phenomena should heighten the explanatory power of these explanations

At first sight, Pincock seems to have a point here. Why should we adopt premises that seem way too strong to explain the fact that we are concerned with?

However, I would like to pose the opposite question: why should we not adopt the stronger premise?

Evidently, both (p) and (p*) are truths that are themselves in need of proof. How do we prove these truths? It seems that in both cases, the most straightforward choice would be a proof that appeals to the concept of prime number. Moreover, it seems that the proof of (p*) just is the proof of (p), where the conclusion is narrowed down to prime numbers less than 100. But now suddenly it looks artificial to use (p*) in the proof of our biological fact: the proof of the mathematical fact that it is based on gives us the more general (p) “for free”.
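One way to make this concrete is the following worked step. It is a sketch under an assumption of my own reading, namely that “intersections” of a cycle of c years with another period of q years recur every lcm(c, q) years; the symbols c and q do not appear in the sources.

```latex
\[
  \operatorname{lcm}(c, q) \;=\; \frac{c \cdot q}{\gcd(c, q)} .
\]
If $c$ is prime and $1 < q < c$, then $\gcd(c, q) = 1$, and hence
\[
  \operatorname{lcm}(c, q) \;=\; c \cdot q ,
\]
which is the largest value that $\operatorname{lcm}(c, q)$ can take
for the given $c$ and $q$.
```

The only property of c used here is that it is prime. A bound such as c < 100 plays no role in the argument, so restricting the conclusion to (p*) would have to be done as an extra step after the general result is already in hand.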

Maybe this points us to an essential difference between proofs in mathematics and proofs in the empirical sciences: whereas the former are proofs about concepts and for this reason are naturally stated in their most general (strongest) form, the latter are proofs about actual things, and can for this reason be narrowed down to cover just the actual instances.

Bolzano was not sure whether the ideal of proper science that he explicated for the deductive sciences should also hold for the natural sciences. And maybe this is an interesting question to raise concerning the problem of applicability of mathematics: given that we appeal to mathematical concepts in our proofs of empirical facts, does this imply that the ideal of generality that we have for proofs in mathematics gives us a reason to prefer stronger premises over weaker ones in a proof of an empirical fact?

Read Michele Ginammi’s blog.