Gram-Schmidt Orthogonalization

Overview

Gram-Schmidt orthogonalization is an algorithm that takes a basis $\left\{b_i\right\}_{i=1}^n$ as input and returns a basis $\left\{b_i^*\right\}_{i=1}^n$ in which all vectors are orthogonal, i.e. at right angles. This new basis is defined as

$$b_i^*=b_i-\sum_{j=1}^{i-1}\mu_{i,j}b_j^*\qquad\mu_{i,j}=\frac{\langle b_i,b_j^*\rangle}{\langle b_j^*,b_j^*\rangle}$$

where the $\mu_{i,j}$ are the Gram-Schmidt coefficients.
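The defining formula can be sketched directly in plain Python. This is an illustrative implementation (not from the original text), using exact rational arithmetic so the coefficients $\mu_{i,j}$ come out exactly; the sample basis $(3,1),(2,2)$ is an arbitrary choice:

```python
from fractions import Fraction

def gram_schmidt(basis):
    """Compute b_i* = b_i - sum_{j<i} mu_{i,j} b_j*,
    where mu_{i,j} = <b_i, b_j*> / <b_j*, b_j*>."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    ortho = []
    for b in basis:
        b = [Fraction(x) for x in b]
        for b_star in ortho:
            mu = dot(b, b_star) / dot(b_star, b_star)  # Gram-Schmidt coefficient
            b = [x - mu * y for x, y in zip(b, b_star)]
        ortho.append(b)
    return ortho

b1s, b2s = gram_schmidt([[3, 1], [2, 2]])
print(b1s, b2s)  # b2* = (-2/5, 6/5), orthogonal to b1* = (3, 1)
```

The inner product of the two returned vectors is exactly zero, matching the orthogonality claim.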

One can immediately check that this new basis is orthogonal, meaning

$$\langle b_i^*,b_j^*\rangle=\begin{cases}0&i\neq j\\\left\lVert b_i^*\right\rVert^2&i=j\end{cases}$$

Let $\mathcal B$ be the matrix whose $i$th row is given by $b_i$ and let $\mathcal B^*$ be the matrix whose $i$th row is given by $b_i^*$. Then the Gram-Schmidt orthogonalization gives us $\mathcal B=\mu\mathcal B^*$, where $\mu_{i,i}=1$, $\mu_{i,j}=0$ for $j>i$, and $\mu_{i,j}$ for $j<i$ is the Gram-Schmidt coefficient. As an example, consider this basis of a subspace of $\mathbb R^4$:

$$\begin{matrix}b_1&=&(&-1&-2&3&1&)\\b_2&=&(&-6&-4&5&1&)\\b_3&=&(&5&5&1&-3&)\end{matrix}$$

Instead of doing the Gram-Schmidt orthogonalization by hand, we can get sage to do it for us:

B = Matrix([
[-1, -2, 3, 1],
[-6, -4, 5, 1],
[5, 5, 1, -3]])

B.gram_schmidt()

This outputs two matrices, $\mathcal B^*$ and $\mu$:

(
[-1 -2  3  1]  [ 1  0  0]
[-4  0 -1 -1]  [ 2  1  0]
[ 0  3  3 -3], [-1 -1  1]
)

One can quickly verify that $\mathcal B=\mu\mathcal B^*$ and that the rows of $\mathcal B^*$ are orthogonal to each other.

A useful result is that

$$\det\left(\mathcal B\mathcal B^T\right)=\det\left(\mathcal B^*\mathcal B^{*T}\right)=\prod_i\left\lVert b_i^*\right\rVert^2$$

Intuitively, this tells us that the more orthogonal a basis of a lattice is, the shorter its vectors must be, as the volume has to stay constant.

Exercises

1) Show that the basis $\left\{b_i^*\right\}_{i=1}^n$ is orthogonal.

2) Verify that the output of sage is indeed correct.

3) Show that $\det(\mu)=1$ and that $\mathcal B^*\mathcal B^{*T}$ is a diagonal matrix whose entries are $\left\lVert b_i^*\right\rVert^2$. Conclude that $\det\left(\mathcal B\mathcal B^T\right)=\det\left(\mathcal B^*\mathcal B^{*T}\right)=\prod_i\left\lVert b_i^*\right\rVert^2$.

4*) Given the Iwasawa decomposition $\mathcal B=LDO$, where $L$ is a lower triangular matrix with $1$ on its diagonal, $D$ is a diagonal matrix, and $O$ is an orthogonal matrix, meaning $OO^T=1$, show that $\mathcal B^*=DO$ and $\mu=L$. Furthermore, prove that such a decomposition is unique.
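As a quick numerical check of the determinant identity, we can reuse the example basis together with the matrix $\mathcal B^*$ that sage printed; this sketch is plain Python and assumes nothing beyond the numbers already shown:

```python
# B is the example basis; Bs is the orthogonalized matrix B* printed by sage.
B  = [[-1, -2, 3, 1], [-6, -4, 5, 1], [5, 5, 1, -3]]
Bs = [[-1, -2, 3, 1], [-4, 0, -1, -1], [0, 3, 3, -3]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram(M):
    # M M^T: the matrix of pairwise inner products of the rows of M
    return [[dot(u, v) for v in M] for u in M]

def det3(M):
    # cofactor expansion of a 3x3 determinant
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

prod_sq_norms = 1
for row in Bs:
    prod_sq_norms *= dot(row, row)  # ||b_i*||^2

print(det3(gram(B)), det3(gram(Bs)), prod_sq_norms)  # prints 7290 7290 7290
```

All three quantities agree, as the identity predicts.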

Lagrange's algorithm

Overview

Lagrange's algorithm, often incorrectly called Gaussian reduction, is the 2D analogue of the Euclidean algorithm and is used for lattice reduction. Intuitively, lattice reduction is the idea of finding a new basis that consists of shorter vectors. Before going into Lagrange's algorithm, we first recap the Euclidean algorithm:

def euclid(m,n):
    while n!=0:
        q = round(m/n)
        m -= q*n
        if abs(n) > abs(m):
            m, n = n, m
    return abs(m)

The algorithm primarily consists of two steps: a reduction step, where the size of $m$ is brought down by a multiple of $n$, and a swapping step, which ensures $m$ is always the larger number. We can adapt this idea for lattices:

def lagrange(b1,b2):
    mu = 1
    while mu != 0:
        mu = round((b1*b2) / (b1*b1))
        b2 -= mu*b1
        if b1*b1 > b2*b2:
            b1, b2 = b2, b1
    return b1, b2

Here `mu` is the rounded Gram-Schmidt coefficient $\mu_{2,1}$, and it turns out that this algorithm will always find the shortest possible basis! Using the basis

$$\begin{matrix}b_1&=&(-1.8,1.2)\\b_2&=&(-3.6,2.3)\end{matrix}$$

the Lagrange reduction repeatedly shortens $b_2$ by multiples of $b_1$ and swaps the two vectors, and here we see it clearly gives the shortest vectors.

Optimality proof

Let $L$ be a lattice. The basis $b_1,b_2$ is defined to be the shortest if, for any other basis $b_1',b_2'$ with $\left\lVert b_1'\right\rVert\leq\left\lVert b_2'\right\rVert$, we have $\left\lVert b_1\right\rVert\leq\left\lVert b_1'\right\rVert$ and $\left\lVert b_2\right\rVert\leq\left\lVert b_2'\right\rVert$. Note that finding such a basis generally cannot be done in other dimensions; however, in dimension 2 it is possible, and the basis is given by Lagrange's algorithm. The proof is a somewhat messy sequence of inequalities that eventually lead to the conclusion we want.

Let $b_1,b_2$ be the output of the Lagrange reduction for some lattice $L$. To prove that Lagrange reduction gives the shortest basis, we first show that $b_1$ is a shortest vector in $L$.

We know that $\frac{\left|\langle b_1,b_2\rangle\right|}{\left\lVert b_1\right\rVert^2}\leq\frac12$ from the algorithm directly. Let $v=mb_1+nb_2$ be any element in $L$. We first show that $\left\lVert b_1\right\rVert\leq\left\lVert v\right\rVert$:

$$\begin{align*}\left\lVert v\right\rVert^2&=\left\lVert mb_1+nb_2\right\rVert^2\\&=m^2\left\lVert b_1\right\rVert^2+2mn\langle b_1,b_2\rangle+n^2\left\lVert b_2\right\rVert^2\\&\geq m^2\left\lVert b_1\right\rVert^2-|mn|\left\lVert b_1\right\rVert^2+n^2\left\lVert b_1\right\rVert^2\\&=\left(m^2-|mn|+n^2\right)\left\lVert b_1\right\rVert^2\end{align*}$$

Since $m^2-|mn|+n^2=\left(|m|-\frac{|n|}2\right)^2+\frac34n^2$, this quantity is $0$ only when $m=n=0$ and is a positive integer in all other cases, hence $\left\lVert v\right\rVert\geq\left\lVert b_1\right\rVert$ and $b_1$ is a shortest vector of $L$. Note that we can have multiple vectors with the same norm as $b_1$, for instance $-b_1$, so this is not a unique shortest vector.

Suppose there exists some basis $b_1',b_2'$ for $L$ such that $\left\lVert b_1'\right\rVert\leq\left\lVert b_2'\right\rVert$. We show that $\left\lVert b_2\right\rVert\leq\left\lVert b_2'\right\rVert$. Let $b_2'=mb_1+nb_2$.

If $n=0$, then $b_2'=\pm b_1$ as $b_1',b_2'$ must form a basis. This means that $\left\lVert b_1\right\rVert=\left\lVert b_1'\right\rVert=\left\lVert b_2'\right\rVert$, and by the inequality above, we must have $\pm b_1'=b_2$ or $\pm b_1'=b_1+b_2$. The first case tells us that $\left\lVert b_1'\right\rVert=\left\lVert b_2\right\rVert$. By squaring the second case, we get

$$\begin{align*}\left\lVert b_1'\right\rVert^2&=\left\lVert b_1+b_2\right\rVert^2\\\left\lVert b_1\right\rVert^2&=\left\lVert b_1\right\rVert^2+2\langle b_1,b_2\rangle+\left\lVert b_2\right\rVert^2\\0&=2\langle b_1,b_2\rangle+\left\lVert b_2\right\rVert^2\\\left\lVert b_2\right\rVert^2&\leq\left\lVert b_1\right\rVert^2\end{align*}$$

where the last line uses $\left|\langle b_1,b_2\rangle\right|\leq\frac12\left\lVert b_1\right\rVert^2$. But since $b_1$ is a shortest vector, $\left\lVert b_1\right\rVert=\left\lVert b_2\right\rVert$.

Otherwise, we have $m,n\neq0$, so $m^2-|mn|+n^2\geq1$ and

$$\begin{align*}\left\lVert b_2'\right\rVert^2&=m^2\left\lVert b_1\right\rVert^2+2mn\langle b_1,b_2\rangle+n^2\left\lVert b_2\right\rVert^2\\&\geq m^2\left\lVert b_1\right\rVert^2-|mn|\left\lVert b_1\right\rVert^2+n^2\left\lVert b_2\right\rVert^2\\&=n^2\left(\left\lVert b_2\right\rVert^2-\left\lVert b_1\right\rVert^2\right)+\left(m^2-|mn|+n^2\right)\left\lVert b_1\right\rVert^2\\&\geq\left(n^2-1\right)\left(\left\lVert b_2\right\rVert^2-\left\lVert b_1\right\rVert^2\right)+\left\lVert b_2\right\rVert^2\\&\geq\left\lVert b_2\right\rVert^2\end{align*}$$

Hence Lagrange's algorithm indeed gives us the shortest basis vectors.

Exercises

1) Show that the output of Lagrange's algorithm generates the same lattice as the input.

2) Find a case where $\left\lVert b_1\right\rVert=\left\lVert b_2\right\rVert=\left\lVert b_1+b_2\right\rVert$. Notice that these vectors achieve the equality case for the bound given in Exercise 4 of the introduction; this actually tells us that the optimal lattice circle packing in 2D is given by this precise lattice! It turns out that this is the optimal circle packing in 2D in general, but the proof is significantly more involved. (See https://arxiv.org/abs/1009.4322 for the details.)

3*) Let $\mu_{2,1}=\lfloor\mu_{2,1}\rceil+\varepsilon=\mu+\varepsilon$. Show that

$$\left\lVert b_2\right\rVert^2\geq\left(\left(|\mu|-\frac12\right)^2-\varepsilon^2\right)\left\lVert b_1\right\rVert^2+\left\lVert b_2-\mu b_1\right\rVert^2$$

and show that $|\mu|\geq2$ for all steps in the algorithm except the first and last. Hence $\left\lVert b_1\right\rVert\left\lVert b_2\right\rVert$ decreases by a factor of at least $\sqrt3$ at each loop and the algorithm runs in polynomial time.

LLL reduction

Overview

There are a few issues that one may encounter when attempting to generalize Lagrange's algorithm to higher dimensions. Most importantly, one needs to figure out the proper way to swap the vectors around, and when to terminate, ideally in polynomial time. A rough sketch of how the algorithm should look is

def LLL(B):
    d = B.nrows()
    i = 1
    while i<d:
        size_reduce(B)
        if swap_condition(B):
            i += 1
        else:
            B[i],B[i-1] = B[i-1],B[i]
            i = max(i-1,1)
    return B

There are two things we need to figure out: in what order we should reduce the basis elements, and how we should decide when to swap. Ideally, we also want the basis to be ordered such that the smallest basis vectors come first. Intuitively, it would also be better to reduce a vector by the larger vectors first before reducing by the smaller vectors, a very vague analogy to filling up a jar with big stones first before putting in the sand. This leads us to the following size reduction algorithm:

def size_reduce(B):
    d = B.nrows()
    i = 1
    while i<d:
        Bs,M = B.gram_schmidt()
        for j in reversed(range(i)):
            B[i] -= round(M[i,j])*B[j]
            Bs,M = B.gram_schmidt()
        i += 1
    return B

We can further improve this by optimizing the Gram-Schmidt computation, as this algorithm does not modify $\mathcal B^*$ at all. Furthermore, $\mu$ changes in a very predictable fashion, and when vectors are swapped, one can write explicit formulas for how $\mathcal B^*$ changes as well.

Next, we need to figure out a swapping condition. Naively, we want

$$\left\lVert b_i\right\rVert\leq\left\lVert b_{i+1}\right\rVert$$

for all $i$. However, such a condition does not guarantee termination in polynomial time. As short basis vectors should be almost orthogonal, we may also want to incorporate this notion. Concretely, we want $\left|\mu_{i,j}\right|$ to be somewhat small for all pairs $i,j$, i.e. we may want something like

$$\left|\mu_{i,j}\right|\leq c$$

However, since $\mu_{i,j}=\frac{\langle b_i,b_j^*\rangle}{\langle b_j^*,b_j^*\rangle}$, this condition is easily satisfied for a sufficiently long $b_j^*$, which is not what we want. The key idea, first noticed by Lovász, is to merge these two conditions into one, named the Lovász condition:

$$\delta\left\lVert b_i^*\right\rVert^2\leq\left\lVert b_{i+1}^*+\mu_{i+1,i}b_i^*\right\rVert^2\qquad\delta\in\left(\frac14,1\right)$$

It turns out that using this condition, the algorithm above terminates in polynomial time! More specifically, it has a time complexity of $O\left(d^5n\log^3B\right)$, where we have $d$ basis vectors as a subset of $\mathbb R^n$ and $B$ is a bound for the largest norm of $b_i$. The constraint $\frac14<\delta$ ensures that the lattice vectors are ordered roughly by size, and $\delta<1$ ensures that the algorithm terminates.

Polynomial time proof

This follows the proof provided by the authors of the LLL paper. We first prove that the algorithm terminates by showing that it swaps the vectors finitely many times. Let $d$ be the number of basis vectors as a subset of $\mathbb R^n$. Let $d_i$ be the volume of the lattice generated by $\left\{b_j\right\}_{j=1}^i$ at each step of the algorithm; we have $d_i=\prod_{j=1}^i\left\lVert b_j^*\right\rVert$. Now consider the quantity

$$D=\prod_{i=1}^dd_i$$

This quantity only changes whenever some $b_i^*$ changes, i.e. when swaps happen. Let's consider what happens when we swap $b_i$ and $b_{i+1}$. Recall the Gram-Schmidt algorithm:

$$b_i^*=b_i-\sum_{j=1}^{i-1}\mu_{i,j}b_j^*\qquad\mu_{i,j}=\frac{\langle b_i,b_j^*\rangle}{\langle b_j^*,b_j^*\rangle}$$

From this, we see that when we swap $b_i$ and $b_{i+1}$, $b_i^*$ is replaced by $b_{i+1}^*+\mu_{i+1,i}b_i^*$. Since a swap only happens when the Lovász condition fails, we have $\left\lVert b_{i+1}^*+\mu_{i+1,i}b_i^*\right\rVert^2<\delta\left\lVert b_i^*\right\rVert^2$, hence the new $d_i$ is less than $\sqrt\delta\,d_i$. All other $d_j$, $j\neq i$, remain the same, as the volume stays fixed when we swap basis vectors around. Hence at each swap, $D$ decreases by a factor of at least $\sqrt\delta$. This is why we need $\delta<1$. Now we are left with showing that $d_i$ is bounded from below, and then we are done.

Let $\lambda_1(L)$ be the length of the shortest (nonzero) vector in a lattice $L$. We can treat $d_i$ as the volume of the lattice $L_i$ generated by $\left\{b_j\right\}_{j=1}^i$. Let $x_i$ be a shortest vector in $L_i$. By using Minkowski's lattice point theorem, we have

$$\begin{align*}\lambda_1(L)\leq\left\lVert x_i\right\rVert&\leq\underbrace{\frac2{\sqrt\pi}\Gamma\left(\frac i2+1\right)^{\frac1i}}_{C_i}d_i^{\frac1i}\\d_i&\geq\frac{\lambda_1(L)^i}{C_i^i}=d_{i,\min}\end{align*}$$

(Note that the exact value of $C_i$ isn't particularly important; one can use a simpler bound like $\sqrt i$.)

Hence we see that $d_i$, and hence $D$, has a (loose) lower bound $D_{\min}=\prod_{i=1}^dd_{i,\min}$, meaning that there are at most $\log_{1/\sqrt\delta}\left(D/D_{\min}\right)$ swaps. Since at each iteration the index $i$ either increases by $1$ when there is no swap or decreases by at most $1$ when there is a swap, and $i$ ranges from $1$ to $d$, the number of times the loop runs is at most $2\log_{1/\sqrt\delta}\left(D/D_{\min}\right)+d$, hence the algorithm terminates.

This proof also gives us a handle on the time complexity of the operation. Let $B$ be the length of the longest input basis vector. Since we have $d_i\leq B^i$, we get $D\leq B^{\frac{d^2+d}2}$ and the algorithm loops $O\left(d^2\log B\right)$ times. The Gram-Schmidt orthogonalization is the most expensive part of the entire process, taking up $O\left(d^2n\right)$ arithmetic operations per loop. Accounting for the cost of classical arithmetic on the integers involved, whose size is bounded by $O\left(d\log B\right)$ bits, we arrive at the overall time complexity of $O\left(d^5n\log^3B\right)$, a somewhat reasonable polynomial time algorithm.

Let $b_1$ be the first vector output by the LLL algorithm; it turns out that we have the bound

$$\left\lVert b_1\right\rVert\leq\left(\frac4{4\delta-1}\right)^{\frac{d-1}4}\text{vol}(L)^{\frac1d}$$

which requires $\delta>\frac14$. Such bounds for the shortest vector will be elaborated in more detail in the section on reduced bases.

Exercises

1) Implement the LLL algorithm in sage and experimentally verify that $D$ does indeed decrease by a factor of at least $\sqrt\delta$ at each swap.

2) Show that the time complexity analysis is correct, and that each loop indeed takes at most $O\left(d^2n\right)$ arithmetic operations.
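To see Lagrange's algorithm from the previous section in action outside of sage, here is a sketch that ports the `lagrange` function to plain Python tuples, representing the example basis $(-1.8,1.2)$, $(-3.6,2.3)$ as exact fractions to avoid floating-point rounding issues:

```python
from fractions import Fraction as F

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def lagrange(b1, b2):
    """2D Lagrange reduction: reduce b2 by multiples of b1, swap, repeat."""
    mu = 1
    while mu != 0:
        mu = round(dot(b1, b2) / dot(b1, b1))  # rounded Gram-Schmidt coefficient
        b2 = (b2[0] - mu * b1[0], b2[1] - mu * b1[1])
        if dot(b1, b1) > dot(b2, b2):
            b1, b2 = b2, b1
    return b1, b2

# exact version of the example basis (-1.8, 1.2), (-3.6, 2.3)
r1, r2 = lagrange((F(-9, 5), F(6, 5)), (F(-18, 5), F(23, 10)))
print(r1, r2)  # the reduced vectors (0, -1/10) and (-9/5, 0)
```

The reduced vectors are far shorter than the input basis, matching the behaviour described in the Lagrange section.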

LLL reduction

Introduction

In this section, we hope to bring some intuitive understanding to the LLL algorithm and how it works. The LLL algorithm is a lattice reduction algorithm, meaning it takes in a basis for some lattice and returns another basis for the same lattice with, hopefully, shorter basis vectors. Before introducing LLL reduction, we'll introduce two key algorithms that LLL is built from: Gram-Schmidt orthogonalization and Lagrange's algorithm (often called Gaussian reduction). We give a brief overview of why these are used to build LLL.

As the volume of a lattice is fixed and is given by the determinant of the basis vectors, whenever our basis vectors get shorter, they must, in some intuitive sense, become more orthogonal to each other in order for the determinant to remain the same. Hence, Gram-Schmidt orthogonalization is used as an approximation of the shortest basis vectors. However, the vectors that we get are in general not in the lattice, so we only use this as a rough idea of what the shortest vectors would look like.

Lagrange's algorithm can be thought of as the GCD algorithm for two numbers generalized to lattices. It iteratively reduces the length of each vector by subtracting some multiple of one vector from the other until we cannot do so anymore. Such an algorithm actually gives the shortest possible vectors in 2 dimensions! Unfortunately, it may not terminate in higher dimensions, even in 3 dimensions, hence it needs to be modified a bit to ensure that the algorithm halts.
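The point that the Gram-Schmidt vectors are generally not lattice points can be seen with a two-line computation; the basis $(2,1),(1,1)$ is an arbitrary illustrative choice:

```python
from fractions import Fraction

b1 = (Fraction(2), Fraction(1))
b2 = (Fraction(1), Fraction(1))

# mu = <b2, b1> / <b1, b1>, the Gram-Schmidt coefficient
mu = (b2[0] * b1[0] + b2[1] * b1[1]) / (b1[0] * b1[0] + b1[1] * b1[1])
b2_star = (b2[0] - mu * b1[0], b2[1] - mu * b1[1])
print(mu, b2_star)  # mu = 3/5, so b2* = (-1/5, 2/5) is not an integer vector
```

Since $b_2^*$ has non-integer coordinates, it cannot be an integer combination of $(2,1)$ and $(1,1)$, so it lies outside the lattice.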