A Lemma of Bateman-Katz

Here we present a formulation of a lemma from Bateman-Katz‘s work on the cap-set problem, which may be of independent interest in additive combinatorics at large. The proof is also novel — Ben Green knows of no other instances of this argument — and may yet find other applications. Indeed, Thomas Bloom recently used this lemma together with other bounds to give his improvement to Roth’s theorem.

[Our main result will be ‘Corollary 3’ below, which will follow from Lemma 1 and Theorem 2. ]

For any abelian group G and finite A\subseteq G, we define the ‘2m-fold additive energy’  of A by

E_{2m}(A)=\left\lvert\{(a_{1},\cdots,a_{m},a_{1}^{\prime},\cdots,a_{m}^{\prime})\in A^{2m}:a_{1}+\cdots+a_{m}=a_{1}^{\prime}+\cdots a_{m}^{\prime}\}\right\rvert

Any (a_{1},\cdots,a_{m},a_{1}^{\prime},\cdots,a_{m}^{\prime}) counted on the right-hand-side we call an ‘additive tuple’. The energy is one way of measuring the additive structure of A. The main theorems of this note link data concerning the energy to data concerning another notion of additive structure, namely (in the case where G is a finite vector space) dimension.

Lemma 1

 If \lvert S\rvert = d and \text{dim}(\text{span}(S))=d-k, with 1\leqslant k\leqslant d, then E_{2m}(S)\leqslant 2^{2m+1}m^{4m+3}k^{2m}d^{m}

Remarks on Lemma 1

  1. There are about m^{2m} d^{m} diagonal solutions counted in the additive energy, and — thinking of m as much smaller than d — the claim is exactly saying that (up to some tame factors) these dominate when k is very small, i.e. when S is almost completely linearly independent. This matches our intuition.
  2. The claim is (worse than) trivial if k is the same order as d.
  3. We shall see that information about dimension meshes most naturally with equations in the elements of S, and not additive tuples. The count for the former is roughly equivalent to Bloom’s notion of ‘restricted energy’, and we will, like him, have to undergo a messy calculation to convert information about this restricted energy  into information about  E_{2m} (S). The issue that E_{2m} (S) cannot be calculated directly does genuinely seem to have been missed by B-K. Broadly speaking this means that their definition giving what it means for the large spectrum to be ‘additive smoothing’ is overly optimistic. I am currently trying to establish whether this has any major effect on their work; this will probably be the subject of a future blog post.
  4. My m dependence is extremely wasteful, and no doubt one can do much better. However, don’t be too scared of extra m^m factors; in Bloom’s argument they roughly correspond to \log\log N factors in the final bound. Also remember that there are m^m factors in the trivial lower bound, so we cannot eschew them completely.
  5. The reason we go to the trouble of proving these theorems for E_{2m} (S) rather than the restricted energy is that  E_{2m} (S) has a particularly pleasant expression in terms of the Fourier transform, which allows for many useful manipulations (in particular Holder’s inequality). 

Viewing Lemma 1 in the contrapositive, it is saying that sets with larger-than-trivial energy have lower-than-trivial dimension. Theorem 2 is a more refined version, namely that sets with large energy have subsets with very low dimension.

Theorem 2

Let integers d\geqslant 2m\geqslant\text{max}(2,2Ce) be arbitrary parameters and C>0 be independent of d.  Let V be finite vector space, and A\subseteq V satisfy E_{2m}(A)>\lvert A \rvert ^{2m}d^{-m}\left(2^{4m+3}m^{6m+4}C^{2m}+2m^{2m+1}\right). Then \exists S\subseteq A with \text{dim}(\text{span}(S))\leqslant d and \lvert S\rvert \geqslant\dfrac{C}{d} \lvert A\rvert

Remarks on Theorem 2

  1. This theorem is good in the case where d is a smallish power of \lvert A\rvert; B-K have d\approx \lvert A\rvert ^{\frac{1}{3}}. It gives some structural information even when the 2m-fold additive energy of A is quite a long way below the maximum possible.
  2. This is the first main ‘new’ result in Bateman-Katz’s paper, and is the main topic of this post.
  3. 2m\geqslant 2Ce is just a minor technical condition that streamlines a particular estimation step. It could certainly be removed with a little more effort, and no effort at all if one didn’t care about how the constants in Theorem 2 depended on m.
  4. One needs Lemma 1 as an ingredient in the proof of Theorem 2, applied to a suitably chosen random subset S\subseteq A.
  5. For certain choices of the parameters we can use Theorem 2  to bootstrap an improved theorem regarding E_{8}(A), say, which would be better than applying Theorem 2 directly with m=4.

Expanding on this last remark, we note that by Holder’s inequality we have

E_{2m}(A)\geqslant \dfrac{E_{8}(A)^{\frac{m-1}{3}}}{\lvert A \rvert ^{\frac{m-4}{3}}}

[This follows from the fact that E_{2m}(A)=\sum\lvert \hat{A}(r)\rvert^{2m} and taking Holder in the form \sum\lvert f\rvert ^{2k}\leqslant\left(\sum\lvert f \rvert^{2m}\right)^{\frac{k-1}{m-1}}\left(\sum\lvert f \rvert ^{2}\right)^{\frac{m-k}{m-1}} with k=4, followed by Parseval applied to the second bracket].

Therefore we can make the conclusion of Theorem 2 provided that, for some m,

E_{8}(A)>\left(\lvert A\rvert ^{2m+\frac{m-4}{3}}\right)^{\frac{3}{m-1}}d^{\frac{-3m}{m-1}}\left(2^{4m+3}m^{6m+4}C^{2m}+2m^{2m+3}\right)^{\frac{3}{m-1}}   (1)

Suppose we wanted to be able to take d=\lvert A \rvert ^{\epsilon} for some fixed \epsilon. Applying Theorem 2 directly would only allow us to do so when E_{8}(A)\gg\lvert A\rvert ^{8-4\epsilon} (so for example we would be forced to have \epsilon>\frac{1}{4} to get a non-trivial result). However, equation (1) allows us to conclude whenever E_{8}(A)\gg_{m}\lvert A\rvert ^{7+\frac{3}{m-1}-\frac{3\epsilon m}{m-1}}. Taking m large enough, depending on \epsilon, we win if E_{8}(A)\gg_{\epsilon}\lvert A\rvert ^{7-2\epsilon}, which is better for \epsilon<\frac{1}{2}. We state this as a corollary:

Corollary 3

Let \epsilon\in (0,\frac{1}{2}), and let V be finite vector space, and A\subseteq V. Then there exist a constant K=K(\epsilon) such that if E_{8}(A)\geqslant K(\epsilon)\lvert A\rvert ^{7-2\epsilon} then \exists S\subseteq A with \text{dim}(\text{span}(S))\leqslant \lvert A\rvert^{\epsilon} and \lvert S\rvert\geqslant\lvert A\rvert^{1-\epsilon}

Proof: Above discussion.

Proofs of Lemma 1 and Theorem 2 are below.

Proof of Lemma 1

Let U be the formal vector space of dimension d spanned by the elements of S. Let E\subseteq U be the set of all linear combinations of the elements of S, with at most 2m variables, which equal 0 when evaluated in V.

Now, consider the linear map from U to \text{span}(S) given by evaluation in V. This map is clearly surjective, so has a kernel of dimension k. But every element of E is in this kernel, so \text{span}(E) has a basis of at most k elements of U, which we can assume — recalling the linear algebra fact that every spanning set contains a basis — consists of elements from E. These basis elements mention at most 2mk elements of S, so all equations in E combined mention at most 2mk elements of S.

Regrettably E does not contain all additive 2m-tuples (a_{1},\cdots,a_{m},a_{1}^{\prime},\cdots,a_{m}^{\prime}). There are two phenomena which are not counted in E:

  1. We could have a_{i}=a_{j}^{\prime} for one or more pairs (i,j) (diagonal behaviour).
  2. We could have repetition of a_{i} on one side of the equation that is some multiple of the characteristic of the underlying field.

To prove the claim all we do is take both of these into account. However, the reader is allowed to find this simple process rather painful. There is a similar — and equally painful — lemma in Bloom.

Indeed, a particular  additive 2m-tuple could have r pairs a_{i}=a_{j}^{\prime}, say. There are (m(m-1)\cdots (m-r+1))^2 choices for the positions of these pairs, and d^{r} choices for what the elements are. We remove these pairs from both sides to get another equation, in which no a_{i} equals any a_{j}^{\prime}.

The elements mentioned in this equation only come from that special set of size at most 2mk except those elements which appear a multiple of p times on one side of the equation, where p is the characteristic of the underlying field.

Let’s say that there are np such elements a_{i}. Then there are

\left(\begin{matrix} m-r\\p,p,\cdots,p,m-r-np\end{matrix}\right)

ways of choosing the subsets of repeated indices (where there are n instances of p in the multinomial coefficient) and d^n way of assigning elements of S to these repeated blocks. [Of course this overcounts tuples where a particular element of S is repeated 2p times, but we’re only after an upper bound]. There are at most (2mk)^{m-r-np} ways of filling in the remaining a_{i}.

By performing a similar analysis for the a_{j}^{\prime} we see that

E_{2m}(S)\leqslant\sum\limits_{r=0}^{m} (m(m-1)\cdots (m-r+1))^{2} d^{r}\left(\sum\limits_{n=0}^{\frac{m-r}{p}}\left(\begin{smallmatrix} m-r\\p,p,\cdots,p,m-r-np\end{smallmatrix}\right) d^{n}(2mk)^{m-r-np}\right)^{2}

which — approximating factorials by powers, forgetting about the denominator of the multinomial coefficient completely, and taking the maximum value of everything in the sum — is at most


which — as p\geqslant 2 — is at most


which is at most


as claimed.

Remark: When k=0 we only get contribution from terms which come entirely from phenomena 1 and 2. Running the above calculation we get a bound of 2m^{2m+1}d^{m} in this case.

Proof of Theorem 2

We prove the contrapositive statement. Suppose every S\subseteq A with \text{dim}(\text{span}(S))\leqslant d  satisfies \lvert S\rvert \leqslant \frac{C}{d}\lvert A\rvert. We will show that this assumption makes a random subset of A of size d  have small energy, and therefore that the entire set A has small energy.

Indeed, let S\subseteq A be a subset of size d picked uniformly at random.

Claim: For any integer k in the range 0\leqslant k\leqslant d, we claim that \mathbb{P}(\text{dim}(\text{span}(S))=d-k)\leqslant \dfrac{C^k}{k!}.

Proof of Claim: We can consider picking the elements of S one at a time, uniformly, without replacement. For 0\leqslant l\leqslant d-1, let S_l be the set constructed after l<d trials, and a_{l+1}\in A be the next chosen element. Then


by assumption, since \text{span}(S_{l})\cap A is certainly a subset of A whose span has dimension at most d.

Now, the events X_{l}:= a_{l+1}\in \text{span}(S_{l}) are not — in general — independent. Indeed, suppose that V=\mathbb{F}_{3}^{N} and let A=\{u,-u,v,-v\}, for some linearly independent vectors u and v. Then X_2 and X_3 both have positive probabilities but \mathbb{P}(X_{2}\cap X_{3})=0. [Bloom’s paper uses a slightly different probabilistic model in which we pick with replacement, but this example still scuppers independence even in that instance].

However, it is true that \mathbb{P}\left(\bigcap\limits_{l\in L}X_{l}\right)\leqslant \prod\limits_{l\in L}\mathbb{P}(X_{l}). Indeed, if any of the X_l have probability zero then the above inequality is trivial. Otherwise, if l_{1} < l_{2} then conditioning on X_{l_{1}} cannot increase \text{span}(S_{l_{2}}) hence cannot increase \mathbb{P}(X_{l_{2}}), i.e. \mathbb{P}(X_{l_{2}}\vert X_{l_{1}})\leqslant\mathbb{P}(X_{l_{2}}).  This shows the inequality for pairwise intersections. We may show similarly that \mathbb{P}\left(X_{l}\vert\bigcap\limits_{l\in L^{\prime}}X_{l}\right)\leqslant\mathbb{P}(X_{l}) whenever \text{max}(L^\prime)<l. Writing out \mathbb{P}\left(\bigcap\limits_{l\in L}X_{l}\right) as a suitable product of conditional probabilities, this proves the inequality.

[In B-K and in Bloom, it is not entirely obvious from the wording whether independence is implicitly (and incorrectly) assumed or whether this weaker statement is (correctly) assumed.]

The probability that \text{dim}(\text{span}(S))=d-k is exactly the probability that k of the events X_l occur. By the above discussion, we have

\mathbb{P}(\text{dim}(\text{span}(S))=d-k)\leqslant \left(\begin{smallmatrix} d\\k\end{smallmatrix}\right)(\dfrac{C}{d})^k\leqslant\dfrac{C^k}{k!}

thus proving the claim.

Claim: With S the random set of size d constructed earlier, \mathbb{E}(E_{2m}(S))\leqslant d^{m}\left(2^{4m+3}m^{6m+4}C^{2m}+2m^{2m+3}\right)

Proof of claim: We have \mathbb{E}(E_{2m}(S))=\sum\limits_{k=0}^{d}\mathbb{E}(E_{2m}(S)\vert \text{dim}(\text{span}(S))=d-k)\mathbb{P}(\text{dim}(\text{span}(S))=d-k)

which, by Theorem 1 and the remark following its proof, and the previous claim, is at most


It remains just to bound this sum. The infinite sum \sum\limits_{k=1}^{\infty}k^{2m}\dfrac{C^k}{k!} is clearly convergent;  we observe that if k\geqslant2C\left(1+\dfrac{1}{k}\right)^{2m} then  \dfrac{\left(\frac{k+1}{k}\right)^{2m}C}{k}\leqslant \dfrac{1}{2}.  and we can bound the sum by the ratio test. If k\geqslant \text{max}(2m,2Ce) then the above certainly holds. By assumption, 2m\geqslant 2Ce, hence we may split the sum as


and bound both bits as

2m(2m)^{2m}C^{2m}+(2m)^{2m}\dfrac{C^{2m}}{(2m)!}\cdot 2

which is certainly at most


Inputting this into the upper bound for the expected energy, and adding in the k=0 contribution, we get the bound

\mathbb{E}(E_{2m}(S))\leqslant d^{m}\left(2^{4m+3}m^{6m+4}C^{2m}+2m^{2m+1}\right)

as claimed.

We are now nearly ready to finish to proof of the Theorem 2. Indeed, we can easily bound \mathbb{E}(E_{2m}(S)) below by noting that the probability that a certain additive 2m-tuple is entirely contained in S is at least \left(\dfrac{d}{\lvert A \rvert}\right)^{2m} and so by linearity of expectation \mathbb{E}(E_{2m}(S))\geqslant\left(\dfrac{d}{\lvert A \rvert}\right)^{2m}E_{2m}(A). Therefore, combining our upper and lower bounds, we get

E_{2m}(A)\leqslant \lvert A \rvert ^{2m}d^{-m}\left(2^{4m+3}m^{6m+4}C^{2m}+2m^{2m+1}\right)

as required.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s