Category: technical

  • S-unit, log embedding, and Dirichlet’s unit theorem

    I’ve been working on lattice-based cryptography projects for a while. Before that, I spent some time computing group cohomology related to {S}-units, which also show up in attacks on lattice-based cryptosystems. In this post, I’m going to talk about {S}-units and how to use convex geometry to recover their structure (Dirichlet’s unit theorem generalized to {S}-units). This post is aimed at readers without an algebraic number theory background.

    I like {S}-units a lot. Beyond number theory, they show up in algebraic K-theory and algebraic geometry, and they relate to some of the most profound and mysterious stuff in mathematics, such as the higher regulators and the Birch-Swinnerton-Dyer conjecture. Good material for a future post.

    Lattice and convex geometry

    Many people’s mental picture of a lattice is a stack of molecules packed together in space. This presumes an external reference frame. To a number theorist, a lattice is a more intrinsic object. Take a number field {K}, a finite extension of the field {\mathbb{Q}} of rational numbers. Its ring of integers {\mathcal{O}_K} consists of the elements {x \in K} whose minimal polynomial over {\mathbb{Q}} has integer coefficients. This generalizes {\mathbb{Z}}, the subring of integers in {\mathbb{Q}}. For example, if {K = \mathbb{Q}(i)}, then {\mathcal{O}_K = \mathbb{Z}[i]}, the ring of Gaussian integers. More generally, the ring of integers {\mathcal{O}_K} is a {\mathbb{Z}}-module that spans {K} as a vector space over {\mathbb{Q}}, hence the name “lattice”.
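    To make the integrality condition concrete, here is a small Python sketch using sympy’s `minimal_polynomial` (the specific numbers are just illustrative):

```python
from sympy import I, sqrt, symbols, minimal_polynomial

x = symbols('x')

# i is an algebraic integer: its minimal polynomial over Q, x^2 + 1, has integer
# coefficients, so i lies in the ring of integers Z[i] of Q(i).
p_i = minimal_polynomial(I, x)

# The golden ratio (1 + sqrt(5))/2 "looks" fractional, yet its minimal polynomial
# x^2 - x - 1 is monic with integer coefficients, so it is an algebraic integer too.
p_phi = minimal_polynomial((1 + sqrt(5)) / 2, x)

print(p_i, p_phi)
```

    Note that integrality is a condition on the minimal polynomial, not on how the element is written down.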

    Number theorists are interested in generalizations of these objects because they encode arithmetic information, for example Galois group actions and torsion elements, which often stay invisible to the “geometry”, i.e. to thinking of algebraic objects as spaces familiar to most people’s geometric intuition.

    However, real geometry does play a pivotal role in understanding number theory, which is a bit uncanny in retrospect. One can embed a number field into a real vector space through the “canonical embedding”. For example, one can picture {\mathbb{Q}(i)} as a subset of the complex plane in the usual sense, where {i} corresponds to the point {(0, 1)} in the {(x, y)}-plane. This respects the relation {i^2 = -1}: recall that multiplication by a complex number amounts to a rotation and scaling. More generally, one can embed cyclotomic extensions similarly, where a generator is mapped to a root of unity on the unit circle. Using the primitive element theorem, one can embed any number field into a real vector space in a way that is compatible with addition and multiplication.

    One might wonder why to bother with the canonical embedding. The answer is that {\mathbb{R}} and {\mathbb{C}} have useful properties invisible at the algebraic level. For example, they come equipped with the standard Euclidean norms, which allows one to define the canonical embedding norm on a number field. More interestingly still, convex geometry constrains arithmetic structures. This is perhaps best motivated by Dirichlet’s unit theorem.

    Before stating the theorem, we recall that an Archimedean place is an embedding of the number field {K} into {\mathbb{R}} or {\mathbb{C}}. It is called a real place if the target is {\mathbb{R}} and a complex place otherwise. Due to the conjugate action, the complex places come in pairs, so we write {r_1} for the number of real places and {2r_2} for the number of complex places. Dirichlet’s unit theorem states that the group of units {\mathcal{O}_K^\times} is generated by {r_1 + r_2 - 1} independent elements up to roots of unity, specifically {\mathcal{O}_K^{\times} \cong \mathbb{Z}^{r_1 + r_2 - 1} \times \mu_K}, where {\mu_K} denotes the group of roots of unity in {K}.

    This statement looks purely algebraic, but is actually a convex geometry statement in disguise. The standard proof follows by constructing a logarithmic embedding {L: \mathcal{O}_K^\times \to \mathbb{R}^{r_1 + r_2}}, where


    \displaystyle L(\alpha) = \bigl(\log|\sigma_1(\alpha)|, \ldots, \log|\sigma_{r_1}(\alpha)|, \; 2\log|\sigma_{r_1 + 1}(\alpha)|, \ldots, \; 2\log|\sigma_{r_1 + r_2}(\alpha)|\bigr), \hfill (1)

    where {\sigma_1, \ldots, \sigma_{r_1}} are the real places and {\sigma_{r_1 + 1}, \ldots, \sigma_{r_1 + r_2}} the complex places of {K}. The key geometric observation is that the coordinates of {L(\alpha)} sum to {0} (equivalently, {\log\lvert N_{K/\mathbb{Q}}(\alpha)\rvert = 0}), so {L(\mathcal{O}_K^\times)} lies in the hyperplane of dimension {r_1 + r_2 - 1} cut out by this relation. The image {L(\mathcal{O}_K^\times)} is a discrete subgroup of that hyperplane; we write {T} for the quotient of the hyperplane by {L(\mathcal{O}_K^\times)}. Using Minkowski’s convex body theorem, which states that a sufficiently large symmetric convex body contains a nonzero lattice point, one can construct a continuous surjection from a compact space onto {T}; since the continuous image of a compact space is compact, {T} is compact, hence {L(\mathcal{O}_K^\times)} must have full rank in the hyperplane! With more work and a careful look at the kernel of {L}, one recovers Dirichlet’s unit theorem.
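    As a quick numerical sanity check of the log embedding (a sketch; I take {K = \mathbb{Q}(\sqrt{2})}, whose fundamental unit is the well-known {1 + \sqrt{2}}):

```python
import math

# K = Q(sqrt(2)) has two real places (r1 = 2, r2 = 0), given by sqrt(2) -> +/- sqrt(2),
# so the unit group has rank r1 + r2 - 1 = 1; a fundamental unit is u = 1 + sqrt(2).
sqrt2 = math.sqrt(2)
u_embeddings = [1 + sqrt2, 1 - sqrt2]          # the two real embeddings of u

log_vector = [math.log(abs(s)) for s in u_embeddings]

# N(u) = (1 + sqrt2)(1 - sqrt2) = -1, so |N(u)| = 1 and the coordinates sum to 0:
# L(u) lies on the line x + y = 0 inside R^2, as the theorem predicts.
print(log_vector, sum(log_vector))
```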

    S-units

    The ring of integers {\mathcal{O}_K} is a Dedekind domain, so every nonzero fractional ideal factors uniquely into prime ideals. For {x \in K^\times}, write {(x) = \prod_{i} \mathfrak{p}_i^{e_i}} for the prime factorization of the principal fractional ideal {(x)}; for a prime ideal {\mathfrak{q}}, the valuation {v_{\mathfrak{q}}(x)} is the exponent of {\mathfrak{q}} in this product.

    Fix a finite set {S} of nonzero prime ideals of {\mathcal{O}_K}. The ring of {S}-integers {\mathcal{O}_{K,S}} consists of those {x \in K} such that {v_{\mathfrak{p}}(x) \geq 0} for every nonzero prime ideal {\mathfrak{p} \notin S}. Its unit group {\mathcal{O}_{K,S}^{\times}} is the group of {S}-units: those {\alpha \in K^\times} such that {v_{\mathfrak{p}}(\alpha) = 0} for all {\mathfrak{p} \notin S}. Equivalently, only primes in {S} may appear in the factorization of {(\alpha)}.
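    For {K = \mathbb{Q}} these definitions are elementary enough to compute directly; here is a small sketch (the helper names `v_p` and `is_S_unit` are my own):

```python
from fractions import Fraction

def v_p(x: Fraction, p: int) -> int:
    """p-adic valuation of a nonzero rational: the exponent of p in x."""
    num, den, e = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p; e += 1
    while den % p == 0:
        den //= p; e -= 1
    return e

def is_S_unit(x: Fraction, S: set) -> bool:
    """x is an S-unit iff only primes in S appear in the factorization of (x)."""
    x = abs(Fraction(x))
    for p in S:
        x *= Fraction(p) ** (-v_p(x, p))   # strip the p-part for each p in S
    return x == 1                          # what remains must be a unit of Z

print(is_S_unit(Fraction(8, 9), {2, 3}))   # 8/9 = 2^3 * 3^-2 is a {2,3}-unit
print(is_S_unit(Fraction(10), {2, 3}))     # 10 is not: 5 divides it
```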

    Listing the primes in {S} as {\mathfrak{p}_1, \ldots, \mathfrak{p}_{\lvert S\rvert}}, one defines a logarithmic embedding {L: \mathcal{O}_{K,S}^{\times} \to \mathbb{R}^{r_1 + r_2 + \lvert S\rvert}} by the same Archimedean coordinates as in (1), together with the finite coordinates {-v_{\mathfrak{p}_j}(\alpha) \log N\mathfrak{p}_j}:

    \displaystyle L(\alpha) = \bigl(\log\lvert\sigma_1(\alpha)\rvert, \ldots, \log\lvert\sigma_{r_1}(\alpha)\rvert, \; 2\log\lvert\sigma_{r_1+1}(\alpha)\rvert, \ldots, 2\log\lvert\sigma_{r_1+r_2}(\alpha)\rvert, \; -v_{\mathfrak{p}_1}(\alpha)\log N\mathfrak{p}_1, \ldots, -v_{\mathfrak{p}_{\lvert S\rvert}}(\alpha)\log N\mathfrak{p}_{\lvert S\rvert}\bigr), \hfill (2)

    where {N\mathfrak{p}_i = \lvert\mathcal{O}_K/\mathfrak{p}_i\rvert} is the ideal norm of {\mathfrak{p}_i}. By the product formula (the finite coordinate at {\mathfrak{p}} is exactly {\log\lvert\alpha\rvert_{\mathfrak{p}} = -v_{\mathfrak{p}}(\alpha)\log N\mathfrak{p}}), the coordinates of {L(\alpha)} sum to {0}, so the image lies in a hyperplane of dimension {r_1 + r_2 + \lvert S\rvert - 1}. The same convex-geometry arguments as for Dirichlet’s unit theorem show that {L(\mathcal{O}_{K,S}^{\times})} is a full-rank lattice in this hyperplane.
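    Over {\mathbb{Q}}, where the finite coordinate at {p} is {\log\lvert\alpha\rvert_p = -v_p(\alpha)\log p}, the product formula can be checked numerically (a sketch with {S = \{2, 3\}} and the {S}-unit {\alpha = 8/9}):

```python
import math
from fractions import Fraction

def v_p(x: Fraction, p: int) -> int:
    """p-adic valuation of a nonzero rational."""
    num, den, e = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p; e += 1
    while den % p == 0:
        den //= p; e -= 1
    return e

# K = Q has one real place and no complex places; take S = {2, 3}.
# For the S-unit alpha = 8/9, the log vector is
#   ( log|alpha|, -v_2(alpha) log 2, -v_3(alpha) log 3 ).
alpha, S = Fraction(8, 9), [2, 3]
coords = [math.log(abs(alpha))] + [-v_p(alpha, p) * math.log(p) for p in S]
print(coords, sum(coords))   # the coordinates sum to 0 by the product formula
```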

  • Miller’s Algorithm and Pairing

    I’ve recently been implementing some optimizations related to pairing computations in elliptic curve cryptography. This note summarises some of what I’ve learned so far, and I hope it will be helpful for someone coming from a pure maths background who wants to learn about implementation and optimization details.

    Elliptic Curve Pairing For Mathematicians 

    A pairing in cryptography is a bilinear map with certain computational properties that are very useful for cryptographic applications. There are many applications of pairing-based cryptography and I will not go over them here. The Weil pairing was the first proposed pairing map, which is really Poincaré duality for étale cohomology. In practice, cryptographers use other related variants (e.g. the Tate pairing, coming from Poitou-Tate duality) for performance reasons. Coming from a background outside number theory, I found Silverman’s survey to be a clear conceptual explanation of where the various pairings arise from.

    For an elliptic curve (a smooth, projective algebraic curve of genus 1 with a marked point {\mathcal{O}}), its rational points are in bijection with the degree zero part of its Picard group. One proof of this fact is based on a dimensionality argument using Riemann-Roch. Concretely, this map sends a point {\mathcal{P}} to the line bundle represented by the divisor {(\mathcal{P}) - (\mathcal{O})}. Tensor product of line bundles in the Picard group induces a multiplication map on the rational points, which recovers the classical group law. 

    Miller’s Algorithm 

    In practice, computing pairings on elliptic curves in a given presentation can be challenging, although it is conceptually clear (to algebraic geometers) that they define bilinear maps. Miller’s original paper considers the problem of computing a rational function on an algebraic curve with a prescribed divisor. For an elliptic curve, this basically boils down to the identity

    \displaystyle \mathrm{div}(h_{\mathcal{P}, \mathcal{Q}}) = (\mathcal{P}) + (\mathcal{Q}) - (\mathcal{P} + \mathcal{Q}) - (\mathcal{O}), \ \ \ \ \ (1)

    where {h_{\mathcal{P}, \mathcal{Q}}} is a rational function given by the ratio of two lines, defined below. Equation (1) follows straightforwardly from the following two divisor identities, for the line passing through {\mathcal{P}} and {\mathcal{Q}} and the vertical line at the {x}-coordinate of {\mathcal{P}+\mathcal{Q}}.

    \displaystyle (y - \lambda x - v) = (\mathcal{P}) + (\mathcal{Q}) + (-\mathcal{P} - \mathcal{Q}) - 3(\mathcal{O}), \ \ \ \ \ (2)

    \displaystyle (x - x_{\mathcal{P}+\mathcal{Q}}) = (\mathcal{P} + \mathcal{Q}) + (-\mathcal{P} - \mathcal{Q}) - 2(\mathcal{O}). \ \ \ \ \ (3)

    We then set {h_{\mathcal{P}, \mathcal{Q}} = \frac{y - \lambda x - v}{x - x_{\mathcal{P}+\mathcal{Q}}}} to recover the desired identity. Miller’s algorithm (as described below, where {N = \sum_{i=0}^{t} \varepsilon_i 2^i} is the binary expansion of {N} with {\varepsilon_t = 1}) then uses (1) at each step to compute the rational function whose divisor is {N(\mathcal{P}) - ([N]\mathcal{P}) - (N-1)(\mathcal{O})} for an {N}-torsion point {\mathcal{P}}; the correctness proof is essentially a telescoping-sum argument.

    Set {T = P} and {f = 1} 
    Loop {i = t - 1} down to {0} 
    Set {f = f^2 \cdot h_{T,T}} 
    Set {T = 2T} 
    If {\varepsilon_i = 1} 
    Set {f = f \cdot h_{T,P}} 
    Set {T = T + P} 
    End If 
    End {i} Loop 
    Return the value {f}

    (This section follows the presentation in Silverman’s book on the arithmetic of elliptic curves.)
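    To make the loop concrete, here is a minimal Python sketch (not the optimized real thing) over the toy curve {y^2 = x^3 + 2x + 2} over {\mathbb{F}_{17}}, a standard textbook curve whose group of points has prime order {19}; the helper names are mine, and no pairing-specific normalizations such as the final exponentiation are included. It cross-checks the double-and-add Miller loop against the naive telescoping product, which computes the same function {f_{N,\mathcal{P}}}:

```python
# Toy curve E: y^2 = x^3 + 2x + 2 over F_17; E(F_17) has prime order 19
# (a standard textbook example), so no intermediate point below is 2-torsion.
p, a = 17, 2
O = None  # point at infinity

def add(P1, P2):
    """Chord-and-tangent group law on E(F_p)."""
    if P1 is O: return P2
    if P2 is O: return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and (y1 + y2) % p == 0:
        return O
    if P1 == P2:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p   # tangent slope
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p          # chord slope
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def h(T, U, Q):
    """Evaluate h_{T,U} = (line through T and U) / (vertical at T+U) at Q, as in (1)."""
    (x1, y1), (x2, y2), (xq, yq) = T, U, Q
    if T == U:
        lam = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3, _ = add(T, U)
    num = (yq - y1 - lam * (xq - x1)) % p   # y - lambda*x - v, as in (2)
    den = (xq - x3) % p                     # x - x_{T+U}, as in (3)
    return num * pow(den, -1, p) % p

def miller(P, Q, N):
    """Double-and-add Miller loop computing f_{N,P}(Q)."""
    T, f = P, 1
    for bit in bin(N)[3:]:                  # binary digits of N below the leading 1
        f = f * f * h(T, T, Q) % p
        T = add(T, T)
        if bit == '1':
            f = f * h(T, P, Q) % p
            T = add(T, P)
    return f

def miller_naive(P, Q, N):
    """Linear-time telescoping product: f_{k+1} = f_k * h_{kP,P}(Q)."""
    T, f = P, 1
    for _ in range(N - 1):
        f = f * h(T, P, Q) % p
        T = add(T, P)
    return f

P, Q = (5, 1), (5, 16)     # Q = -P, chosen so that no line in the loop vanishes at Q
v_fast, v_naive = miller(P, Q, 9), miller_naive(P, Q, 9)
print(v_fast, v_naive)     # the two constructions of f_{9,P}(Q) agree
```

    The points here are chosen so that no intermediate line vanishes at {Q}; a production implementation must also handle the vertical-line cases.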

    Optimization 

    There are many techniques to speed up pairing computations; I’ll probably address them in a later post. The notes give a detailed discussion of the optimization details of pairing calculations. One particular optimization for the Miller loop is that one can drop the denominator of the line function (the so-called “denominator elimination” technique).

  • Gröbner bases and geometry

    Algebra is the offer made by the devil to the mathematician. The devil says: “I will give you this powerful machine, it will answer any question you like. All you need to do is give me your soul: give up geometry and you will have this marvelous machine.” — Michael Atiyah

    Back in the day when dinosaurs still roamed Palo Alto and I was a clueless Stanford junior undergrad, I tried and failed many times to understand Gröbner bases. Now I have to implement Gröbner bases for my research, and after a Saturday’s suffering, I finally seem to understand what’s going on, thanks to the wonderful paper [1] What can be computed in algebraic geometry?. This note summarises my learning so far. In retrospect, I wish someone had taught me this when I took my first scheme-based algebraic geometry class, as an example of the subtlety of algebraic structures compatible with algebraic varieties.

    Gröbner Bases 

    First recall Buchberger’s algorithm, which computes a Gröbner basis of an ideal {I} of a polynomial ring {R}. It proceeds as follows (pseudo-code generated by Claude based on the Wikipedia page).

    Input A set of polynomials {F} that generates {I}

    Output A Gröbner basis {G} for {I}

    1. {G := F} 
    2. For every {f_i, f_j} in {G}, denote by {g_i} the leading term of {f_i} with respect to the given monomial ordering, and by {a_{ij}} the least common multiple of {g_i} and {g_j}
    3. Choose two polynomials in {G} and let {S_{ij} = \frac{a_{ij}}{g_i} f_i - \frac{a_{ij}}{g_j} f_j} (Note that the leading terms here will cancel by construction). 
    4. Reduce {S_{ij}} with the multivariate division algorithm relative to the set {G} until the result is not further reducible. If the result is non-zero, add it to {G}
    5. Repeat steps 2–4 until all possible pairs are considered, including those involving the new polynomials added in step 4. 
    6. Output {G}
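    As a sanity check on the pseudo-code, here is a compact (and deliberately unoptimized) Python sketch of the same loop, using sympy for leading terms and multivariate division; `spoly` and `buchberger` are my own helper names, and the output is cross-checked against sympy’s built-in `groebner`:

```python
from sympy import symbols, LT, lcm, expand, reduced, groebner

def spoly(f, g, gens, order):
    """S-polynomial of f and g: the leading terms cancel by construction (step 3)."""
    ltf = LT(f, *gens, order=order)
    ltg = LT(g, *gens, order=order)
    l = lcm(ltf, ltg)
    return expand(l / ltf * f - l / ltg * g)

def buchberger(F, gens, order='lex'):
    """Naive Buchberger loop: reduce S-polynomials and grow G until stable (steps 2-5)."""
    G = list(F)
    pairs = [(i, j) for i in range(len(G)) for j in range(i)]
    while pairs:
        i, j = pairs.pop()
        # Step 4: multivariate division of the S-polynomial by the current G.
        _, r = reduced(spoly(G[i], G[j], gens, order), G, *gens, order=order)
        if r != 0:
            pairs += [(len(G), k) for k in range(len(G))]
            G.append(r)
    return G

x, y = symbols('x y')
F = [x**2 + y, x*y - 1]
G = buchberger(F, (x, y), order='lex')
Gref = groebner(F, x, y, order='lex')   # sympy's built-in reduced Groebner basis
print(G)
print(Gref)
```

    The basis returned this way is not reduced or minimal, but it generates the same ideal, and every element of sympy’s reduced basis reduces to zero against it.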

    When I first learnt about this, it was extremely unclear to me what exactly was going on. Why do we choose a monomial ordering? How does the choice of a different monomial ordering affect the runtime? The geometric insight in [1] is to relate the monomial ordering to a 1-parameter family of automorphisms of the projective space in which the variety defined by the set of polynomials {F} embeds.

    In particular, let {X} be the algebraic variety defined by {F} (after homogenization) in some projective space {\mathbb{P}^n}, and consider a one-parameter subgroup {\lambda(t) \subset GL(n + 1)} of the diagonal form

    \displaystyle \lambda(t) = \begin{bmatrix} t^{w_0} & & & \\ & t^{w_1} & & \\ & & \ddots & \\ & & & t^{w_n} \end{bmatrix},

    where each {\lambda(t)} induces an automorphism of {\mathbb{P}^n}, and we write {X_t} for the image of {X}. Let {x^A = x_0^{a_0}\ldots x_n^{a_n}} be a monomial in the homogeneous coordinate ring of {\mathbb{P}^n}, and write {W = (w_0, \ldots, w_n)} for the weight vector. Then the automorphism {\lambda(t)} maps a homogeneous polynomial {f = ax^A + bx^B + \ldots} to {\lambda f = at^{W\cdot A}x^A + bt^{W\cdot B}x^B + \ldots}

    From here, one may take limits of {\lambda f} and {X_t} in projective space and ponder what happens at {t = 0}. The upshot is that with a particular choice of the one-parameter subgroup, the limit of {\lambda f} as {t \to 0} (after rescaling by an appropriate power of {t}) recovers exactly the initial term of {f} with respect to the monomial ordering (exercise: recover the lexicographic and reverse lexicographic orderings this way; the reverse lexicographic order described in [1] is actually the graded reverse lexicographic order, but they coincide here since we consider homogeneous polynomials).
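    One can watch this degeneration happen in sympy (a toy sketch; the weight vector {w = (-1, 0)} is chosen by hand so that the fiber at {t = 0} picks out the lex-leading monomial of this particular {f}):

```python
from sympy import symbols, expand, limit

x, y, t = symbols('x y t', positive=True)

f = x**2*y + x*y**2 + y**3   # a homogeneous cubic in the coordinate ring of P^2
w = (-1, 0)                  # weights of the one-parameter subgroup lambda(t)

# lambda(t) acts by x -> t^w0 * x, y -> t^w1 * y, scaling each monomial x^A by t^(W.A)
ft = expand(f.subs({x: t**w[0] * x, y: t**w[1] * y}))

# Rescale so the lowest power of t is t^0, then look at the fiber at t = 0:
# only the initial term of f (here, the lex-leading monomial) survives.
lead = limit(expand(ft * t**2), t, 0)
print(lead)
```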

    Back to the question of what happens to {X_t} as {t} approaches {0}: the “right” perspective is to define a family {X'} over {\operatorname{Spec} k[t]}, sitting inside {\mathbb{P}^n \times \mathbb{A}^1}, whose fiber at {t} recovers {X_t}. An obvious attempt is to let {X'} be cut out by the polynomials {g_t}, where {g_t} is {\lambda f} divided by the largest power of {t} dividing it; this recovers {X_t} at fibers away from {0}. However, at {t = 0} the situation is not nice, as the projection from {X'} to {\mathbb{A}^1} is not necessarily flat (equivalently, the syzygies upstairs do not generate the ones downstairs). To make it flat, we lift the syzygies upstairs, which amounts to adding “{t}-torsion” elements to the coordinate ring, and this recovers exactly the “multivariate division” part of Buchberger’s algorithm! Therefore a Gröbner basis in some sense defines a flat family of varieties whose fiber at general position recovers the original variety {X}.

    Connections to Regularity 

    It turns out Gröbner bases relate to Hilbert functions, and their computational complexity relates to Castelnuovo-Mumford regularity (they can also be used to construct Hilbert schemes). I strongly recommend working through the example in [1] to gain intuition. One particularly nice fact is that reducing by the lexicographic order or the reverse lexicographic order is related to projecting the original variety to certain subspaces ([1] discusses really cool connections to flag varieties in birational geometry). The upshot is that lexicographic order reduction corresponds to the projection {\mathbb{P}^n \rightarrow \mathbb{P}^{n-1}} where {\mathbb{P}^n} is viewed as the Thom space of the tautological bundle over {\mathbb{P}^{n-1}}, and reverse lexicographic order reduction corresponds to projection from the point at infinity. This intuitively explains why reverse lexicographic order reduction works better in practice, as the relevant projection maps are simpler – which is pretty cool, as the hidden geometry is not obvious at all if one only thinks about the ordering.