Heegner’s solution to the ‘Class Number 1 problem’

I have just completed a short monograph on the so-called ‘Class Number 1 problem’. It was written to fulfill the EPSRC ‘broadening requirement’, having attended the Part C Modular Forms course this term in Oxford, and was therefore, by design, a little outside my comfort zone; I hope this disclaimer will temper the disdain of any serious algebraic number theorists who happen across this article. No expert, I set out to write the kind of exposition of this topic that I, as an interested mathematician with a slightly different specialism, would have liked to have read myself. Although there already exist some very thorough accounts of this topic in the literature (we make reference to a book by Cox, and essays by Booher, Green and Kezuka), I know of no shorter survey which nonetheless gives a detailed description of the entire argument and sketch proofs of most of the important results.

 

 

Introduction

Gauss found nine imaginary quadratic fields with class number 1, and conjectured that he had found them all. In 1952 Heegner published a purported proof, based heavily on the work of Weber from the third volume of his landmark, but fearsome, treatise Lehrbuch der Algebra. Heegner was unknown to the mathematical community at the time, and it was felt that his proof contained a serious gap. Stark and Baker independently published the first accepted proofs in 1966, but then Stark examined the argument of Heegner and discovered it to be very similar to his own. Indeed, he went on to show that the ‘gap’ in Heegner’s proof was virtually non-existent. Furthermore, he noticed that enough technical machinery could be avoided to have enabled Weber to prove this result some 60 years earlier.

A detailed historical overview of progress on the problem has been written by Goldfeld. It is worth noting that Heilbronn and Linfoot knew in the 1930s that there were at most 10 imaginary quadratic fields with class number 1.

The aim of this short essay is to outline Heegner’s argument, prove a few of the important constituent lemmas, and to relate some of the theory to that covered in the Part C Modular Forms course. With this latter aim in mind, we shall focus mostly on the modular functions involved in the proof, rather than the input from algebraic number theory; indeed, two particularly technical propositions will be left entirely unproved. However, we will assume familiarity with basic results concerning the ring of integers \mathcal{O}_K of an imaginary quadratic field, and concerning non-maximal orders \mathcal{O}\subset \mathcal{O}_K. This theory is well covered in Chapter 7 of Cox’s ‘Primes of the form x^2 + ny^2’. Regarding the input from modular forms, we will make heavy reference to Eisenstein series and the Ramanujan \Delta-function, and introduce other modular functions which are invariant under other congruence subgroups \Gamma\leqslant SL_2(\mathbb{Z}). There will also be an analogy to a lemma from the theory of Hecke operators.

Serre has an approach to the class number 1 problem which is much more geometric, constructing a particular modular curve and then counting special points on it, which has more of a flavour of the first half of the Part C Modular Forms course. However, the approach is extremely involved and this is only a short project; Booher gives details, and also discusses the relationship to Heegner’s argument.

Note on references: We have relied heavily on the excellent essay of Booher, the paper by Stark, and the astonishing book ‘Primes of the form x^2 + ny^2’ by David Cox, which despite its unassuming title provided a wealth of insight into all aspects of the argument. The master’s essay of Kezuka is comprehensive but is rarely more than a recitation of Cox. Green has an essay which covers much of the background regarding complex multiplication, although from a more high-brow viewpoint than we shall pursue here.

 

Section 1: Plan of attack

We recall that modular forms may be viewed as functions on lattices, and that an imaginary quadratic field K is uniquely determined by its lattice of integers \mathcal{O}_{K}. Heegner’s starting point is the observation that if a function f on lattices were injective, then lattices L can be classified by the value f(L). The much-celebrated j-function satisfies (a slight weakening of) this property, and the rich structure of this function allows one to compute a short list of potential values j(\mathcal{O}_K) for imaginary quadratic fields with class number 1, which the known examples completely exhaust.

The j-function shall be formally defined in the next section. For now let us collect a series of facts about this remarkable function which we will need for the main theorem, some of which we will prove later.

 

Theorem 1: Some properties of the j-function

Let p\equiv 3(\text{mod }8), p\neq 3, and suppose K=\mathbb{Q}(\sqrt{-p}) has class number 1. The j-function satisfies the following properties.

  1. j:\mathfrak{h}\rightarrow \mathbb{C} is weakly modular of weight 0 for SL_2(\mathbb{Z}), with a simple pole at the cusp.
  2. j(z) has a unique holomorphic cube-root \gamma_2(z):\mathfrak{h}\rightarrow \mathbb{C} which is real on the positive imaginary axis.
  3. \gamma_2\left(\frac{3+\sqrt{-p}}{2}\right)\in\mathbb{Z}
  4. K(j(\sqrt{-p})) is a degree 3 extension of K for p>60

 

Remark: From now on we call any meromorphic function f:\mathfrak{h}\rightarrow \mathbb{C} that is weakly modular of weight 0 for SL_2(\mathbb{Z}) a modular function for SL_2(\mathbb{Z}).

Remark: The hypotheses of the above theorem will be implicit throughout, sometimes called ‘the usual hypotheses’. The class number 1 problem can be easily reduced to this subcase.

Fact 3 is a special instance of a deep connection between j(z) and algebraic integers (for example, j\left(\frac{3+\sqrt{-p}}{2}\right) is an algebraic integer of degree exactly the class number h(-p)), which holds in much more generality. Fact 4 is also hiding a much more general result, namely that if \mathfrak{a}\subset \mathcal{O} is a proper ideal in an order \mathcal{O}\subset \mathcal{O}_K then K(j(\mathfrak{a})) is the ring class field for that order; in particular [K(j(\mathfrak{a})):K]=h(\mathcal{O}), and under the usual hypotheses we have h(\mathbb{Z}[\sqrt{-p}])=3. This is called ‘The First Main Theorem of Complex Multiplication’, and is more advanced than anything we will do here.

For ease of notation, in all that follows we set \tau_0=\frac{3+\sqrt{-p}}{2}.

The field extension K(j(\sqrt{-p})) admits another description in terms of the Weber functions \mathfrak{f}, \mathfrak{f}_1 and \mathfrak{f}_2. Again these shall be defined properly in the next section, but for now let us say that they are modular functions for certain congruence subgroups \Gamma\leqslant SL_2(\mathbb{Z}), closely related to the function \Delta(z). The two critical facts are as follows:

 

Theorem 2: Facts about Weber functions

Under the usual hypotheses, we have:

  1. \gamma_2(\tau_0)=\frac{\mathfrak{f}(\tau_0)^{24}-16}{\mathfrak{f}(\tau_0)^8}
  2. K(\mathfrak{f}(\sqrt{-p})^2)=K(j(\sqrt{-p}))

 

The first fact is actually universally true, and the second is true under much weaker hypotheses. The critical observation is that these two theorems imply that both \alpha:= \mathfrak{f}(\sqrt{-p})^2 and \alpha^4 are algebraic integers of degree 3, satisfying (for some integers a, b, c) the equations

x^3+ax^2+bx+c=0

and

x^3-\gamma_2(\tau_0)x - 16 = 0

respectively. This puts extremely strong conditions on the coefficients a, b and c: indeed, we can show that c=2, that the pair (a,b) must satisfy a certain Diophantine equation, and further we may express \gamma_2(\tau_0) in terms of a and b. This restriction is enough to determine a short finite list of possibilities for (a,b,c), hence a short list of possibilities for \gamma_2(\tau_0)^3=j(\mathcal{O}_K). By great fortune (!), the known imaginary quadratic fields with class number 1 completely exhaust this list.

 

Section 2: Definition of the j-function

Classically, the j-function arises from the study of invariants of elliptic curves. However, owing to the purpose of this project, we will emphasise the connection to the two modular forms of weight 12 out of which the j-function is formed. Recall the Eisenstein series of weight k, normalised to have constant term 1,

E_k(z)=\dfrac{1}{2}\sum\limits_{\substack{(c,d)\in\mathbb{Z}^2\setminus (0,0)\\(c,d)=1}}\dfrac{1}{(cz+d)^k}

and the Ramanujan \Delta-function

\Delta(z)=\dfrac{E_4(z)^3-E_6(z)^2}{1728}

We have that E_4(z)^3 and \Delta(z) are linearly independent modular forms of weight 12. Thus we may define the modular function

j(z)=\dfrac{E_4(z)^3}{\Delta(z)}

which is holomorphic on \mathfrak{h}, since \Delta(z) does not vanish there. Further, having proven that the q-expansions of E_4(z) and \Delta(z) have integer coefficients, and that the q-expansion of \Delta(z) begins with q, it is then immediate that the q-expansion

j(z)=\dfrac{1}{q}+744+196884q + 21493760q^2+\cdots

has integer coefficients. This q-expansion also shows that j(z) has a simple pole at the cusp.
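The first few coefficients above can be reproduced with exact integer arithmetic. The following is a minimal sketch, assuming two standard facts not proved in this essay: the expansion E_4 = 1 + 240\sum\sigma_3(n)q^n, and the product formula \Delta = q\prod(1-q^n)^{24}. The helper names are illustrative only.

```python
def sigma3(n):
    """Sum of the cubes of the positive divisors of n."""
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

def mul(a, b, n):
    """Product of two power series (coefficient lists), truncated to n terms."""
    out = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai:
            for k, bk in enumerate(b[:n - i]):
                out[i + k] += ai * bk
    return out

def inverse(a, n):
    """Inverse of a power series with constant term 1, truncated to n terms."""
    inv = [1] + [0] * (n - 1)
    for k in range(1, n):
        inv[k] = -sum(a[i] * inv[k - i] for i in range(1, k + 1))
    return inv

def j_coefficients(n):
    """Coefficients of q^-1, q^0, ..., q^(n-2) in the q-expansion of j."""
    e4 = [1] + [240 * sigma3(k) for k in range(1, n)]
    e4_cubed = mul(mul(e4, e4, n), e4, n)
    # Delta / q = prod_{m >= 1} (1 - q^m)^24, built one factor at a time.
    delta_over_q = [1] + [0] * (n - 1)
    for m in range(1, n):
        factor = [0] * n
        factor[0], factor[m] = 1, -1
        for _ in range(24):
            delta_over_q = mul(delta_over_q, factor, n)
    # j = E_4^3 / Delta = q^-1 * E_4^3 / (Delta / q)
    return mul(e4_cubed, inverse(delta_over_q, n), n)

print(j_coefficients(4))
```

Since \Delta/q has constant term 1, its inverse again has integer coefficients, which is exactly why integrality of j follows from that of E_4 and \Delta.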

Remark: The non-constant coefficients in this q-expansion correspond to finite linear combinations of dimensions of the irreducible representations of the ‘monster group’ M, the largest sporadic simple group. This mysterious phenomenon is known as ‘monstrous moonshine’.

 

The j-invariant of a lattice

The key result of this section is that, in the interpretation of j as a function on lattices, j(L)=j(L^\prime) just when L and L^\prime are homothetic, i.e. just when \exists \lambda\in\mathbb{C}\setminus\{0\} s.t. L^\prime=\lambda L. To prove this we will have to go via some theory of elliptic curves, and so rather than define j([\omega_1,\omega_2]):=j(\frac{\omega_1}{\omega_2}) we will in fact redefine the j-function on lattices directly, using notation from elliptic curves. This is undesirable, of course, but in fact amounts to nothing more than a few renormalisations.

Let L\subset \mathbb{C} be a lattice. Recall the modular forms

G_4(L)=\sum\limits_{z\in L\setminus \{0\}}\dfrac{1}{z^4}

G_6(L)=\sum\limits_{z\in L\setminus \{0\}}\dfrac{1}{z^6}

of weight 4 and 6 respectively, both constant multiples of the corresponding Eisenstein series. Setting g_2(L)=60G_4(L) and g_3(L)=140G_6(L), we define the j-function on lattices as

j(L)=1728\dfrac{g_2(L)^3}{g_2(L)^3-27g_3(L)^2}

It is a painful but elementary exercise to show that the normalisations are consistent with the definition of j(z).

 

Lemma 3

Let L and L^\prime be two lattices in \mathbb{C}. Then j(L)=j(L^\prime)\Longleftrightarrow\exists \lambda\in\mathbb{C}\setminus\{0\}\text{ s.t. } L=\lambda L^\prime.

 

Proof: The ‘right-to-left’ implication is obvious. For the other direction, suppose that j(L)=j(L^\prime) and for simplicity suppose that none of the four values g_2(L), g_2(L^\prime), g_3(L), g_3(L^\prime) are zero — the proofs in these cases are similar. Define \lambda\in\mathbb{C} such that

\lambda^4=\dfrac{g_2(L^\prime)}{g_2(L)}

We will show that L=\lambda L^\prime.

It is routine to calculate from j(L)=j(L^\prime) that (possibly replacing \lambda with i\lambda) this choice implies that

\lambda^6=\dfrac{g_3(L^\prime)}{g_3(L)}

and thus (by the homogeneity properties of g_2 and g_3) that g_2(L)=g_2(\lambda L^\prime) and g_3(L)=g_3(\lambda L^\prime). Now, recall that for any lattice L we can define the Weierstrass \wp-function as a meromorphic function invariant under L, whose poles are exactly the elements of L, satisfying the differential equation

\wp^\prime(z)^2=4\wp(z)^3-g_2(L)\wp(z)-g_3(L)

This differential equation determines the Laurent expansion of \wp(z) around 0 uniquely, and hence \wp(z) uniquely everywhere (not just near 0). As L is precisely the poles of \wp, this determines the original lattice. Since g_2(L)=g_2(\lambda L^\prime) and g_3(L)=g_3(\lambda L^\prime), we get that L=\lambda L^\prime.

 

 

Section 3: The modular equation

In this section we begin the assault on the third part of Theorem 1, namely the statement that \gamma_2(\tau_0)\in\mathbb{Z}. [NB: \gamma_2(z) is standard notation for the appropriate cube-root of j(z), coming from Weber.] In fact, we will only show that j(\tau_0)\in\mathbb{Z}: the result for \gamma_2 is similar but more technical, and we refer the reader to Cox. The crucial observation is the following lemma:

 

Lemma 4

Every holomorphic modular function is a polynomial in j(z).

 

Proof: We recall that every holomorphic function on a compact Riemann surface is constant. If f(z) is a modular function for SL_2(\mathbb{Z}) whose only pole is at the cusp, we construct a polynomial P such that P(j(z)) has exactly the same coefficients for negative powers of q as f(z) does — P(j(z)) and f(z) have the same ‘principal part’. This is possible since the pole of j(z) is simple. Then f(z)-P(j(z)) is holomorphic on the compact modular curve X(1), and hence is constant.

 

For N\in\mathbb{N}, let \{\gamma_i\} be a set of coset representatives for \Gamma_0(N) in SL_2(\mathbb{Z}), w.l.o.g. including the identity. We construct the function

\Phi_N(x,j(z))=\prod\limits_{i}(x-j(N\gamma_i z))

 

Lemma 5

The following hold:

  1. \Phi_N(x,j(z)) is a polynomial in x and j(z)
  2. This polynomial has integer coefficients
  3. If N is not a square, then the leading coefficient of \Phi_N(x,x) is \pm 1

 

Remark: The equation \Phi_N(x,j(z))=\prod\limits_{i}(x-j(N\gamma_i z)) is called the modular equation.

Sketch proof: For the first part, we note that the coefficient of x^k is symmetric in the j(N\gamma_i z) and hence a modular function for SL_2(\mathbb{Z}): we then apply Lemma 4 to each coefficient.

For the second and third parts, we use a slight refinement of Lemma 7.3 from the Part C course, proved as part of the theory of Hecke operators. Letting

C(N)=\left\{\left(\begin{matrix} a & b \\ 0 & d \end{matrix}\right): ad=N, a \geqslant 1, 0 \leqslant b < d,\text{gcd}(a,b,d)=1\right\}

then for any \gamma\in SL_2(\mathbb{Z}) there is a \tilde{\gamma}\in SL_2(\mathbb{Z}) and a unique \sigma\in C(N) such that

\left(\begin{matrix} N & 0 \\ 0 & 1 \end{matrix}\right)\gamma=\tilde{\gamma}\sigma

[Compare this with Lemma 7.3, which states that (if \mathcal{C}(N):=\{(\begin{smallmatrix} a & b \\ 0 & d \end{smallmatrix}): ad=N, a \geqslant 1, 0 \leqslant b < d\}) then any 2\times 2 integer matrix A with det(A) = N can be written as \gamma^{-1}\sigma for some \gamma\in SL_2(\mathbb{Z}) and unique \sigma\in\mathcal{C}(N).]

Using this, we can rewrite the modular equation as \Phi_N(x,j(z))=\prod\limits_{\sigma\in C(N)}(x-j(\sigma z)), and so it is enough to show that any symmetric function f(z) in the j(\sigma z) can be written as a polynomial in j(z) with integer coefficients. Closer inspection of the argument from Lemma 4 reveals that, as the q-expansion for j(z) has integral coefficients, it is enough to show that the q-expansion of f(z) has integral coefficients.

Note: from Lemma 4 we know that f(z) does have some q-expansion.

It is immediate that the coefficients lie in \mathbb{Q}(\zeta_N), where \zeta_N is a primitive N^{th} root of unity. An easy Galois theory argument shows that they must lie in \mathbb{Q}. Another inspection of Lemma 4 reveals that these coefficients are all algebraic integers, hence in fact lie in \mathbb{Z}.

For part 3, a short calculation with \Phi_N(j(z),j(z)) shows that the leading coefficient is a root of unity — the key fact turns out to be that there is no \sigma\in C(N) with a=d, since N is not a square — and so by part 2 this coefficient must be \pm 1.
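The set C(N) from the sketch proof is small enough to enumerate by hand or by machine. The following is a minimal sketch which lists C(N) and compares its size with N\prod_{r\vert N}(1+\frac{1}{r}), the standard formula for the index of \Gamma_0(N) in SL_2(\mathbb{Z}) (quoted here as an assumption, since it is not stated in this essay). It also checks the observation used in part 3, that no element of C(N) has a=d when N is not a square. The function names are illustrative only.

```python
from math import gcd, isqrt

def C(N):
    """All (a, b, d) with a*d = N, a >= 1, 0 <= b < d and gcd(a, b, d) = 1."""
    return [(a, b, N // a)
            for a in range(1, N + 1) if N % a == 0
            for b in range(N // a)
            if gcd(gcd(a, b), N // a) == 1]

def psi(N):
    """N times the product of (1 + 1/r) over the primes r dividing N."""
    result, n, r = N, N, 2
    while r * r <= n:
        if n % r == 0:
            result += result // r      # multiply by (1 + 1/r)
            while n % r == 0:
                n //= r
        r += 1
    if n > 1:
        result += result // n
    return result

def has_diagonal_element(N):
    """True if some element of C(N) has a == d."""
    return any(a == d for a, _, d in C(N))

for N in (2, 3, 5, 6):
    print(N, len(C(N)), psi(N), C(N))
```

In particular the degree of \Phi_N(x,j(z)) in x is the size of C(N), which grows quickly with N.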

 

 

Section 4: Values of the j-function at singular moduli

We are now ready for the first big result.

 

Theorem 6

Under the usual hypotheses, j(\tau_0)\in\mathbb{Z}.

 

Proof: The proof of this theorem is in two parts. Firstly, we use the modular equation to show that j(\tau_0) is an algebraic integer. Should one so wish, deep general results from complex multiplication may then be employed to conclude directly that the degree of j(\tau_0) is exactly h(-p)=1. However, there is a more elementary (although slightly indirect) approach, using only the most basic results on complex multiplication, which shows that the degree of j(\tau_0) is at most h(-p), which in this regime is good enough. It should be noted that Booher seems to have a slick elementary way of arguing directly from the modular equation, but it is bogus.

j(\tau_0) is an algebraic integer: Note that j(\tau_0)=j(\mathcal{O}_K). Suppose one could find \alpha\in\mathcal{O}_K such that N=N(\alpha) was not a square and such that \Phi_N(j(\alpha\mathcal{O}_K),j(\mathcal{O}_K))=0. Then since j(\alpha\mathcal{O}_K)=j(\mathcal{O}_K) we have that j(\tau_0) is a root of \Phi_N(x,x), which we have already noted to be (up to sign) a monic polynomial with integer coefficients. It remains to find a suitable \alpha. There is a great flexibility of choice, but we can very concretely take \alpha=\frac{1+\sqrt{-p}}{2}. Then N(\alpha)=\frac{p+1}{4}, which is certainly non-square (if \frac{p+1}{4}=m^2 then p=(2m-1)(2m+1), which is impossible for a prime p\neq 3), and

\alpha\mathcal{O}_K=\frac{1+\sqrt{-p}}{2}\left[1,\frac{1-\sqrt{-p}}{2}\right]=\left[\frac{1+\sqrt{-p}}{2},\frac{p+1}{4}\right]=\frac{p+1}{4}\left[1,\sigma \frac{1+\sqrt{-p}}{2}\right]

where

\sigma=\left(\begin{matrix} 1 & 0 \\ 0 & \frac{p+1}{4} \end{matrix}\right)\in C(N)

Hence j(\alpha\mathcal{O}_K)=j(\sigma(\mathcal{O}_K)) for some (explicit) \sigma\in C(N), and therefore we have \Phi_N(j(\alpha\mathcal{O}_K),j(\mathcal{O}_K))=0 as required.

The degree of j(\tau_0) is at most h(-p): We introduce the most basic result from the theory of complex multiplication.

 

Lemma 7

Let L be a fixed lattice, and let \alpha\in \mathbb{C}\setminus \mathbb{R}. Then TFAE:

  1. \wp(\alpha z) is a rational function of \wp(z)
  2. \alpha L\subset L
  3. There is an order \mathcal{O} in an imaginary quadratic field K such that \alpha \in \mathcal{O} and L is homothetic to a proper fractional \mathcal{O}-ideal

 

This is Theorem 10.14 from Cox, and the proof is extremely short once one recalls a particular universality property of \wp(z), namely that every even elliptic function for L is a rational function in \wp(z).

We refer to the ring of \alpha\in\mathbb{C} satisfying \alpha L\subset L as the ring of complex multiplications for L. Fixing an order \mathcal{O} in an imaginary quadratic field, we consider those lattices L which have \mathcal{O} as their full ring of complex multiplications. The third equivalence in the above lemma then implies that w.l.o.g. L is homothetic to a proper fractional \mathcal{O}-ideal. Noting (almost tautologically) that two proper fractional \mathcal{O}-ideals are homothetic as lattices iff they belong to the same element of the class group C(\mathcal{O}), we reach the following result:

 

Corollary 8

There is a one-to-one correspondence between C(\mathcal{O}) and (homothety classes of) lattices with \mathcal{O} as their full ring of complex multiplications.

 

We are now ready to show that j(\tau_0)=j(\mathcal{O}_K) has degree at most h(\mathcal{O}_K). Indeed, let \rho be any automorphism of \mathbb{C}, and pick some \alpha\in \mathcal{O}_K. Lemma 7 gives us that

\wp(\alpha z;g_2(\mathcal{O}_K),g_3(\mathcal{O}_K))=\dfrac{P(\wp( z; g_2(\mathcal{O}_K),g_3(\mathcal{O}_K)))}{Q(\wp(z; g_2(\mathcal{O}_K),g_3(\mathcal{O}_K)))}

for some polynomials P,Q\in \mathbb{C}[X]. We let \rho act on Laurent series by acting on each coefficient, and since the coefficients of the Laurent series for \wp(z) are rational functions of g_2 and g_3 we get that

\wp(\rho(\alpha) z;\rho(g_2(\mathcal{O}_K)),\rho(g_3(\mathcal{O}_K)))=\dfrac{P^\rho(\wp( z; \rho(g_2(\mathcal{O}_K)),\rho(g_3(\mathcal{O}_K))))}{Q^\rho(\wp(z; \rho(g_2(\mathcal{O}_K)),\rho(g_3(\mathcal{O}_K))))}

We observe that the discriminant (\rho(g_2(\mathcal{O}_K)))^3-27(\rho(g_3(\mathcal{O}_K)))^2\neq 0, since \rho is an automorphism, and hence by the standard theory of \wp there exists some lattice L with

g_2(L)=\rho(g_2(\mathcal{O}_K))

and

g_3(L)=\rho(g_3(\mathcal{O}_K))

Using Lemma 7 again, we have that L has complex multiplication by \rho(\alpha). Since \alpha\in\mathcal{O}_K was arbitrary, we have that \mathcal{O}_K is contained in the ring of complex multiplications of L. Applying \rho^{-1} and interchanging the roles of L and \mathcal{O}_K gives the reverse inclusion. Since L has \mathcal{O}_K as its full ring of complex multiplications, we conclude by Corollary 8 that j(L) can take only h(\mathcal{O}_K) many possible values. But we observe that j(L)=\rho(j(\mathcal{O}_K)), and so we conclude that j(\mathcal{O}_K) has at most h(\mathcal{O}_K) conjugates. This concludes the argument.
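As a numerical sanity check on Theorem 6, one can evaluate j(\tau_0) in floating point for the five primes p\equiv 3(\text{mod }8) on the known list and compare with the cubes of the values of \gamma_2(\tau_0) quoted in Section 7. This is only a sketch: it assumes the Lambert series E_4 = 1+240\sum n^3q^n/(1-q^n) and the product formula for \Delta(z), and it is of course evidence rather than proof. The function names are illustrative only.

```python
import math

def j_from_q(q, terms=40):
    """Evaluate j = E_4^3 / Delta at a real nome q with 0 < |q| < 1."""
    e4 = 1 + 240 * sum(n**3 * q**n / (1 - q**n) for n in range(1, terms))
    delta = q
    for n in range(1, terms):
        delta *= (1 - q**n) ** 24
    return e4**3 / delta

def j_at_tau0(p):
    """j((3 + sqrt(-p)) / 2), where q = exp(2*pi*i*tau_0) = -exp(-pi*sqrt(p))."""
    q = -math.exp(-math.pi * math.sqrt(p))
    return j_from_q(q)

# Values of gamma_2(tau_0) as listed in Section 7 of this essay.
GAMMA2 = {11: -32, 19: -96, 43: -960, 67: -5280, 163: -640320}

for p, g in GAMMA2.items():
    print(p, j_at_tau0(p), g**3)
```

Because q is real and negative at \tau_0, the whole computation stays in real arithmetic; double precision gives the integer to a relative accuracy far better than is needed to recognise it.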

 

It is an easy task to generalise the above proof for more general orders \mathcal{O}\subset \mathcal{O}_K, and ideals within them. We will need this generalisation later for the order \mathbb{Z}[\sqrt{-p}]\subset \mathbb{Z}[\frac{1+\sqrt{-p}}{2}]=\mathcal{O}_K of conductor f=2. One may show (see Cox, Chapter 10):

 

Theorem 9

Let \mathcal{O}\subset \mathcal{O}_K be an order in an imaginary quadratic field, and let \mathfrak{a}\subset\mathcal{O} be a proper fractional \mathcal{O}-ideal. Then j(\mathfrak{a}) is an algebraic integer of degree at most h(\mathcal{O}).

 

 

Section 5: Weber functions

We define the Dedekind \eta-function as

\eta(z)=q^{\frac{1}{24}}\prod\limits_{n\geqslant 1}(1-q^n)

In light of the product formula for \Delta(z) we see that \eta(z) is a (particular choice of) 24^{th} root of \Delta(z), and so it inherits certain invariance properties. In particular

\eta(z + 1) = \zeta_{24}\eta(z)

and

\eta\left(\dfrac{-1}{z}\right)=\sqrt{-iz}\eta(z)

for a particular branch of the square root.
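Both transformation laws are easy to test numerically from the product definition. The sketch below assumes that q^{\frac{1}{24}} means e^{2\pi i z/24} and that the square root is the principal branch (which has positive real part, since -iz has positive real part for z\in\mathfrak{h}); the function names are illustrative only.

```python
import cmath

def eta(z, terms=60):
    """Dedekind eta at z in the upper half-plane, from the truncated product.

    The root q^(1/24) is taken to be exp(2*pi*i*z/24).
    """
    q = cmath.exp(2j * cmath.pi * z)
    product, q_power = 1, 1
    for _ in range(terms):
        q_power *= q
        product *= 1 - q_power
    return cmath.exp(2j * cmath.pi * z / 24) * product

def transformation_errors(z):
    """Absolute errors in the two transformation laws at the point z."""
    zeta24 = cmath.exp(2j * cmath.pi / 24)
    shift = abs(eta(z + 1) - zeta24 * eta(z))
    # cmath.sqrt is the principal branch of the square root
    inversion = abs(eta(-1 / z) - cmath.sqrt(-1j * z) * eta(z))
    return shift, inversion

print(transformation_errors(0.3 + 0.8j))
```

The truncation at 60 factors is adequate only when the imaginary parts of z and -1/z are not too small; closer to the real axis more factors would be needed.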

Remark: These invariance properties can be shown directly by careful consideration of the conditionally convergent Eisenstein series of weight 2.

The Weber functions \mathfrak{f}, \mathfrak{f}_1 and \mathfrak{f}_2 may be defined concretely in terms of \eta(z), but these definitions — although useful for calculations — are extremely unintuitive. Fortunately, there is a more abstract interpretation of these functions in terms of the lattice [1,z], which we will subsequently discuss. Indeed, we may initially define

\mathfrak{f}(z)=\zeta_{48}^{-1}\dfrac{\eta\left(\frac{z+1}{2}\right)}{\eta(z)}

\mathfrak{f}_1(z)=\dfrac{\eta\left(\frac{z}{2}\right)}{\eta(z)}

\mathfrak{f}_2(z)=\sqrt{2}\dfrac{\eta(2z)}{\eta(z)}

These functions then satisfy the following abstract theorem:

 

Theorem 10

Let L=[1,z] and let g_2=g_2(L), etc. Let e_1=\wp(\frac{z}{2}), e_2=\wp(\frac{1}{2}) and e_3=\wp(\frac{z+1}{2}) be the three roots of the cubic 4x^3-g_2x-g_3. Then

e_2-e_1=\pi^{2}\eta(z)^4\mathfrak{f}(z)^8

e_2-e_3=\pi^{2}\eta(z)^4\mathfrak{f}_1(z)^8

e_3-e_1=\pi^{2}\eta(z)^4\mathfrak{f}_2(z)^8

 

The proof of this theorem is highly non-trivial, using the Weierstrass \sigma-function, and is most of the work in proving the product formula for \Delta(z). Indeed, the product formula follows immediately after recalling that (up to suitable normalisations) \Delta(z) is the discriminant of the cubic 4x^3-g_2x-g_3. However, it goes some way to describing how the Weber functions arise naturally from the classical study of elliptic curves. For proof, see Booher or Cox.

The above theorem can also be used to derive some surprising algebraic relationships between \gamma_2(z) and the Weber functions.

 

Theorem 11

\gamma_2(z)=\dfrac{\mathfrak{f}(z)^{24}-16}{\mathfrak{f}(z)^8}=\dfrac{\mathfrak{f}_1(z)^{24}+16}{\mathfrak{f}_1(z)^8}=\dfrac{\mathfrak{f}_2(z)^{24}+16}{\mathfrak{f}_2(z)^8}

 

Proof: From Vieta’s formulae we get that g_2(z)=-4(e_1e_2+e_2e_3+e_3e_1) and e_1+e_2+e_3=0. Elementary manipulations then show us that, amongst other similar identities, 3g_2(z) = 4((e_2- e_1)^2 - (e_2-e_3)(e_3 - e_1)). By Theorem 10, this gives us that

g_2(z)=\frac{4}{3}\pi^4\eta(z)^8(\mathfrak{f}(z)^{16} - \mathfrak{f}_1(z)^8\mathfrak{f}_2(z)^8)

Noting that \gamma_2(z)=\frac{E_4(z)}{\eta(z)^8}, and after consideration of the appropriate normalisation factor relating E_4(z) and g_2(z), we derive

\gamma_2(z)=(\mathfrak{f}(z)^{16} - \mathfrak{f}_1(z)^8\mathfrak{f}_2(z)^8)

We then note a final identity, namely \mathfrak{f}(z)\mathfrak{f}_1(z)\mathfrak{f}_2(z)=\sqrt{2}, which follows from the product formula expression. This yields the first equality, and the others are similar.
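The identities in this proof can be checked numerically straight from the \eta-quotient definitions. The following sketch evaluates the three Weber functions at a sample point and compares the three expressions in Theorem 11 with E_4(z)/\eta(z)^8. It assumes the Lambert series for E_4 and the same choice of q^{\frac{1}{24}}=e^{2\pi i z/24} as in the definition of \eta(z); the function names are illustrative only.

```python
import cmath

def eta(z, terms=80):
    """Dedekind eta from the truncated product, with q^(1/24) = exp(2*pi*i*z/24)."""
    q = cmath.exp(2j * cmath.pi * z)
    product, q_power = 1, 1
    for _ in range(terms):
        q_power *= q
        product *= 1 - q_power
    return cmath.exp(2j * cmath.pi * z / 24) * product

def e4(z, terms=80):
    """E_4(z) = 1 + 240 * sum n^3 q^n / (1 - q^n), truncated."""
    q = cmath.exp(2j * cmath.pi * z)
    total, q_power = 0, 1
    for n in range(1, terms + 1):
        q_power *= q
        total += n**3 * q_power / (1 - q_power)
    return 1 + 240 * total

def weber(z):
    """Return (f, f1, f2) at z, using the eta-quotient definitions."""
    zeta48 = cmath.exp(2j * cmath.pi / 48)
    f = eta((z + 1) / 2) / (zeta48 * eta(z))
    f1 = eta(z / 2) / eta(z)
    f2 = cmath.sqrt(2) * eta(2 * z) / eta(z)
    return f, f1, f2

def gamma2_values(z):
    """E_4/eta^8 followed by the three expressions of Theorem 11."""
    f, f1, f2 = weber(z)
    return [e4(z) / eta(z) ** 8,
            (f**24 - 16) / f**8,
            (f1**24 + 16) / f1**8,
            (f2**24 + 16) / f2**8]

z = 0.2 + 1.1j
f, f1, f2 = weber(z)
print(f * f1 * f2, gamma2_values(z))
```

All four numbers printed by `gamma2_values` should agree up to rounding, and the product of the three Weber functions should be \sqrt{2}.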

 

Remark: Most presentations of the class number 1 problem use the \mathfrak{f}_2 equation in the above theorem, having done most of the work using \mathfrak{f}. Although this conjures up the elusive connection to the modular curve X(24) (indeed, as we shall discuss below, \mathfrak{f}_2 is a modular function for \Gamma(24)), all the calculations can be done with \mathfrak{f} only. This saves a bit of work at the end.

 

\gamma_2(z) and the Weber functions as modular functions

For certain congruence subgroups, \gamma_2(z) and (powers of) the Weber functions are modular functions. We present a sketch proof here to emphasise the broader connections to the Part C course, although, since we are presenting a much-reduced proof of the main theorem, we won’t see these results applied in any other proofs in this essay.

 

Lemma 12

\gamma_2(z) is a modular function for \Gamma(3), \mathfrak{f}(z)^6 is a modular function for \Gamma(8), and \mathfrak{f}_2(z) is a modular function for \Gamma(24).

 

In fact, \gamma_2(z) and \mathfrak{f}(z)^6 are modular functions for larger subgroups, although not subgroups of the form considered in the Part C course.

Sketch proof: Given the transformation properties of j(z) and \eta(z), none of these are deep results. Indeed, it is easy to show from the invariance properties of j(z) that we have \gamma_2(z+1)=\zeta_3^2\gamma_2(z) and \gamma_2(\frac{-1}{z})=\gamma_2(z). Since z\rightarrow z+1 and z \rightarrow \frac{-1}{z} generate (the action of) SL_2(\mathbb{Z}) on \mathfrak{h}, we can show that for any (\begin{smallmatrix} a & b\\ c & d \end{smallmatrix})\in SL_2(\mathbb{Z}) we have

\gamma_2\left(\left(\begin{smallmatrix} a & b\\ c & d \end{smallmatrix}\right)z\right)=\zeta_3^{ac-ab+a^2cd-cd}\gamma_2(z)

If b\equiv c \equiv 0 (\text{mod }3), then the exponent is a multiple of 3 and so \gamma_2(z) is invariant under this transformation. A q-expansion argument shows that \gamma_2(z) is meromorphic.

The results for \mathfrak{f}(z)^6 and \mathfrak{f}_2 are very similar arguments, based instead on the transformation properties of \eta(z), but are slightly messier since \eta(z) is not quite a modular function for SL_2(\mathbb{Z}). For full proof, as ever, see Cox.

 

Remark: The second result of Lemma 12 is vital in establishing the algebraic properties of \mathfrak{f}(\sqrt{-p}) in the next section, although the proof is omitted.

 

 

Section 6: The field extensions \mathbb{Q}(j(\sqrt{-p})) and \mathbb{Q}(\mathfrak{f}(\sqrt{-p})^2)

The technical heart of the proof is the argument showing that K(\mathfrak{f}(\sqrt{-p})^2) is a degree 3 extension, and hence (as \mathfrak{f}(\sqrt{-p}) is real) that \mathbb{Q}(\mathfrak{f}(\sqrt{-p})^2) is a degree 3 extension. The approach is first to show that K(j(\sqrt{-p})) is a degree 3 extension, and then to identify the fields K(j(\sqrt{-p})) and K(\mathfrak{f}(\sqrt{-p})^2).

There are deep theorems of complex multiplication that give the value of [K(j(\sqrt{-p})):K] as a trivial corollary. Indeed, recall from an earlier remark the so-called ‘First Main Theorem of Complex Multiplication’, which gives that K(j(\sqrt{-p})) is the ring class field for the order \mathcal{O}=\mathbb{Z}[\sqrt{-p}]\subset \mathbb{Z}[\frac{1+\sqrt{-p}}{2}]=\mathcal{O}_K, and so certainly [K(j(\sqrt{-p})):K]=h(\mathcal{O}). Taking f to be the conductor of \mathcal{O} we have the general formula

h(\mathcal{O})=\dfrac{h(\mathcal{O}_K)f}{[\mathcal{O}_K^{\times}:\mathcal{O}^{\times}]}\prod\limits_{\substack{r\vert f\\ r\text{ prime}}}\left(1-\left(\frac{d_K}{r}\right)\frac{1}{r}\right)

which in this special case (h(\mathcal{O}_K)=1, f=2, [\mathcal{O}_K^{\times}:\mathcal{O}^{\times}]=1, d_K=-p\equiv 5(\text{mod }8)) reduces to

h(\mathcal{O})=\dfrac{1 \times 2}{1}\left(1+\frac{1}{2}\right)=3
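The value h(\mathcal{O})=3 can be cross-checked without the formula, by counting reduced binary quadratic forms. The sketch below assumes the standard fact (see Cox, Chapter 7) that the class number of the order of discriminant D equals the number of reduced primitive positive definite forms of discriminant D; here \mathcal{O}_K has discriminant -p and \mathbb{Z}[\sqrt{-p}] has discriminant -4p. The function name is illustrative only.

```python
from math import gcd

def class_number(D):
    """Number of reduced primitive forms a x^2 + b xy + c y^2 with b^2 - 4ac = D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    count = 0
    a = 1
    while 3 * a * a <= -D:                  # reduced forms have a <= sqrt(-D / 3)
        for b in range(-a + 1, a + 1):      # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue                    # not reduced
            if gcd(gcd(a, abs(b)), c) == 1:
                count += 1
        a += 1
    return count

for p in (11, 19, 43, 67, 163):
    print(p, class_number(-p), class_number(-4 * p))
```

For p=11, for instance, the three reduced forms of discriminant -44 are x^2+11y^2 and 3x^2\pm 2xy+4y^2.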

However it was noted by Stark that, for the purposes of the class number 1 problem, one could prove [K(j(\sqrt{-p})):K]=3 directly, without going via such a strong structural result.

 

Lemma 13

Let p\equiv 3(\text{mod }8), p\neq 3, p>60, and let K=\mathbb{Q}(\sqrt{-p}) be an imaginary quadratic field with h(-p)=1. Then [K(j(\sqrt{-p})):K]=3.

 

Sketch proof: We have already shown in Theorem 9 that j(\sqrt{-p}) is an algebraic integer of degree at most h(\mathcal{O})=3. Therefore it suffices to show that j(\sqrt{-p}) is not rational or quadratic.

It transpires that a short analysis of the modular equation \Phi_2(x,j(\tau_0)) allows us to eliminate the case where j(\sqrt{-p}) is quadratic. If j(\sqrt{-p}) is rational, then it is an integer (since it is an algebraic integer), so from the q-series, writing q=e^{2\pi i\tau_0}=-e^{-\pi\sqrt{p}} and using j(\sqrt{-p})=j(2\tau_0), we observe that

j(\tau_0)^2 - 1488j(\tau_0) + 160512 - j(\sqrt{-p})=42987520q + O(q^2)

The left-hand-side is an integer, and the right-hand-side is non-zero with absolute value less than 1 for large enough p, which is a contradiction. Stark calculates that p>60 is enough.
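The constants 1488, 160512 and 42987520 in the displayed identity come from nothing more than the first four coefficients of the q-expansion of j(z). A minimal sketch of that bookkeeping follows, using the fact that j(2\tau) has the same expansion in the variable q^2; the names are illustrative only.

```python
# Coefficients of j in powers of q: j = 1/q + 744 + 196884 q + 21493760 q^2 + ...
J = {-1: 1, 0: 744, 1: 196884, 2: 21493760}

def multiply(A, B, max_power):
    """Product of two truncated Laurent series, keeping powers <= max_power."""
    out = {}
    for i, a in A.items():
        for k, b in B.items():
            if i + k <= max_power:
                out[i + k] = out.get(i + k, 0) + a * b
    return out

def lhs_coefficients():
    """Coefficients of j(tau)^2 - 1488 j(tau) + 160512 - j(2 tau) up to q^1.

    With J known up to q^2, the square is only reliable up to q^1,
    so everything is truncated there.
    """
    max_power = 1
    total = multiply(J, J, max_power)
    for n, c in J.items():
        if n <= max_power:
            total[n] = total.get(n, 0) - 1488 * c
        if 2 * n <= max_power:
            total[2 * n] = total.get(2 * n, 0) - c   # j(2 tau) is a series in q^2
    total[0] = total.get(0, 0) + 160512
    return total

print(lhs_coefficients())
```

The q^{-2}, q^{-1} and constant terms cancel exactly, leaving 42987520q as the leading term.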

 

It remains to identify the fields K(j(\sqrt{-p})) and K(\mathfrak{f}(\sqrt{-p})^2). The formula above connecting \mathfrak{f} and \gamma_2 shows that K(j(\sqrt{-p}))\subset K(\mathfrak{f}(\sqrt{-p})^2), and also that the reverse inclusion would be implied by showing \mathfrak{f}(\sqrt{-p})^6\in K(j(\sqrt{-p})). This is the statement whose proof in Weber’s Algebra Vol. 3 is highly questionable, and this is (essentially) where the gap in Heegner’s original proof lies. Unfortunately — for our brief survey, at least — a correct proof is a substantial undertaking. Stark presents a long proof in the style of Weber, while Cox and Birch use substantial modern machinery from algebraic number theory; both of these are well beyond the scope of this short essay. However, what we can say is that it is critical to all methods that \mathfrak{f}(z)^6 is a modular function for \Gamma(8), so the proof of this fact was not in vain.

 

 

Section 7: The final argument

Let \mathbb{Q}(\sqrt{-n}) be an imaginary quadratic field with class number 1, with n square-free, and assume that n\notin\{1,2,3,7,11,19,43,67,163\}. We aim for a contradiction.

It is known by more elementary means that we may restrict to the case where n=p is prime, and where p\equiv 3(\text{mod }8). Indeed, it is an old theorem of Landau (see Cox for reference) that h(-4n)=1 iff n=1,2,3,4,7, and one may also find in Cox an easy argument showing that h(-n) is even if n has at least two distinct prime factors; these two theorems are enough to kill all the other cases. Further, laborious checking of lower primes allows us to assume p>60. In this regime, we have remarked above that \mathbb{Q}(\mathfrak{f}(\sqrt{-p})^2) is a degree 3 extension.

Defining \alpha:= \mathfrak{f}(\sqrt{-p})^2, we recall the equation

\alpha^{12}-\gamma_2(\tau_0)\alpha^4-16=0

which shows that \alpha is an algebraic integer, and we have seen that it has degree 3. Hence, for some integers a, b, and c, we have

\alpha^3+a\alpha^2+b\alpha+c=0

Manipulations: Separating odd and even degree terms, and squaring, we get

\alpha^6+(2b-a^2)\alpha^4+(b^2-2ac)\alpha^2-c^2=0

Repeating the process produces

\alpha^{12}+(-4ac-2b^2+4a^2b-a^4)\alpha^8+(b^4-4ab^2c+2a^2c^2+4bc^2)\alpha^4-c^4=0
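The two displayed identities are instances of the classical root-squaring step: if p(x)=x^3+ax^2+bx+c then p(x)p(-x) is, up to sign, a cubic in x^2 whose roots are the squares of the roots of p. The sketch below checks both displayed sets of coefficients against this with exact integer arithmetic on a grid of sample values; it is a verification aid with illustrative function names, not part of the argument.

```python
def square_roots_step(a, b, c):
    """Given x^3 + a x^2 + b x + c, return (A, B, C) such that
    y^3 + A y^2 + B y + C vanishes at the squares of its roots."""
    return 2 * b - a * a, b * b - 2 * a * c, -c * c

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for k, qk in enumerate(q):
            out[i + k] += pi * qk
    return out

def step_matches_product(a, b, c):
    """Check p(x) * p(-x) == -(x^6 + A x^4 + B x^2 + C)."""
    A, B, C = square_roots_step(a, b, c)
    product = poly_mul([c, b, a, 1], [c, -b, a, -1])
    return product == [-C, 0, -B, 0, -A, 0, -1]

def fourth_power_coefficients(a, b, c):
    """Coefficients of alpha^8, alpha^4 and 1 after applying the step twice."""
    return square_roots_step(*square_roots_step(a, b, c))

def displayed_coefficients(a, b, c):
    """The coefficients as written in the displayed alpha^12 equation."""
    return (-4 * a * c - 2 * b**2 + 4 * a**2 * b - a**4,
            b**4 - 4 * a * b**2 * c + 2 * a**2 * c**2 + 4 * b * c**2,
            -c**4)

print(fourth_power_coefficients(2, 4, 2), displayed_coefficients(2, 4, 2))
```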

Collecting the facts we already know, and using the tower law, it is easily seen that \alpha^4 has degree 3 over \mathbb{Q}; further, we have found two different representations for its minimal polynomial. Thus we may equate coefficients.

Immediately we get that c=\pm 2, and can assume w.l.o.g. that c=2. We can then express \gamma_2(\tau_0) in terms of a and b by equating the \alpha^4 coefficient, namely \gamma_2(\tau_0) = -(b^4-8ab^2+8a^2+16b), and then solve for a and b using the \alpha ^8 coefficient. Indeed, under the substitutions x=\frac{-a}{2}, y=\frac{b-a^2}{2}, elementary manipulation shows that x and y are integers satisfying

2x(x^3 + 1) = y^2

Standard, although quite lengthy, arguments regarding the solution of Diophantine equations (in particular using the UFD \mathbb{Z}[\frac{1+\sqrt{-3}}{2}]) show that the only solutions are (0, 0), (-1, 0), (1,\pm 2) and (2,\pm 6), giving the list of possible values of \gamma_2(\tau_0) as 0, -96, -5280, -32, -640320 and -960. The five non-trivial values correspond to the cube-roots of j(\mathcal{O}_K) for the fields \mathbb{Q}(\sqrt{-19}), \mathbb{Q}(\sqrt{-67}), \mathbb{Q}(\sqrt{-11}), \mathbb{Q}(\sqrt{-163}) and \mathbb{Q}(\sqrt{-43}) respectively. In particular, there can be no more imaginary quadratic fields with class number 1.
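The final bookkeeping is easy to reproduce. The following sketch searches a bounded range for integer solutions of 2x(x^3+1)=y^2 and converts each into a value of \gamma_2(\tau_0), by undoing the substitutions x=\frac{-a}{2}, y=\frac{b-a^2}{2} with c=2. A bounded search is of course no substitute for the Diophantine argument, which is what rules out solutions beyond any bound; the function names are illustrative only.

```python
from math import isqrt

def solutions(bound):
    """Integer solutions (x, y) of 2x(x^3 + 1) = y^2 with |x| <= bound."""
    found = []
    for x in range(-bound, bound + 1):
        rhs = 2 * x * (x**3 + 1)
        if rhs < 0:
            continue
        y = isqrt(rhs)
        if y * y == rhs:
            found.append((x, y))
            if y:
                found.append((x, -y))
    return found

def gamma2(x, y):
    """gamma_2(tau_0) = -(b^4 - 8ab^2 + 8a^2 + 16b) with a = -2x, b = 2y + a^2."""
    a = -2 * x
    b = 2 * y + a * a
    return -(b**4 - 8 * a * b**2 + 8 * a**2 + 16 * b)

for x, y in solutions(1000):
    print((x, y), gamma2(x, y), gamma2(x, y) ** 3)
```

The cubes printed in the last column are the corresponding values of j(\mathcal{O}_K); for example (-1,0) gives -96 and hence j=-884736.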

 

 

Bibliography

A. Baker, Linear forms in logarithms of algebraic numbers, Mathematika (1966), p204-216

J. Booher, Modular Curves and the Class Number One problem, http://math.stanford.edu/~jbooher/expos/class_number_one.pdf

D. A. Cox, Primes of the form x^2+ny^2, Wiley Classics, 2013 (second edition)

D. Goldfeld, Gauss’ class number problem for imaginary quadratic fields, Bulletin (New Series) of the American Mathematical Society, Vol. 13, No. 1, July 1985

B. J. Green, The Ramanujan Constant, http://people.maths.ox.ac.uk/greenbj/papers/ramanujanconstant.pdf

K. Heegner, Diophantische analysis und modulfunktionen, Math Z. 56 p227-253, 1952

Y. Kezuka, The Class Number Problem, http://wwwf.imperial.ac.uk/~buzzard/maths/research/notes/Yukako_Kezuka_MSc_Project.pdf

A. Lauder, Lecture Notes for Part C Modular Forms MT 2014, http://www0.maths.ox.ac.uk/system/files/coursematerial/2014/3116/8/MF_2014_Lectures_v201114.pdf

H. M. Stark, A complete determination of the complex quadratic fields of class number one, Michigan Mathematical Journal (1967), p1-27

H. M. Stark, On a gap in the theorem of Heegner, Journal of Number Theory, 1 p16-27, 1969

H. Weber, Lehrbuch der Algebra, Vol. 3, Chelsea, New York, 1961
