Saturday, March 30, 2013

Ostrowski's theorem

In this blog post, we look at yet another fascinating way in which prime numbers appear in mathematics. The result below is called the Ostrowski's theorem, named after the Russian mathematician Alexander Ostrowski.

Before we go into the theorem, let us attempt to define absolute value on the set of rationals. We call a function $f: \mathbb{Q} \mapsto \mathbb{R}^+ \cup \{0\}$ an absolute value, if the following properties $(\star)$ are satisfied:
  1. Non-negativity: $f(0) = 0$ and $f(x) > 0$ for all $x > \mathbb{Q} \backslash \{0\}$.
  2. Multiplicativity: $f(x \cdot y) = f(x) \cdot f(y)$.
  3. Sub-additivity (or) Triangle inequality: $f(x+y) \leq f(x) + f(y)$.
Note: From $(1)$ and $(2)$, we have that $f(-1) = f(1) = 1$. This also gives us $f(-x) = f(x)$ and $f \left(x^k \right) = f \left(x \right)^k$. In addition, from $(3)$, we also get that $f(n) \leq \vert n \vert$, for $n \in \mathbb{Z}$.

The natural question is can be categorize/classify functions that satisfy the properties described above. To do so, let us look at some examples of function that satisfy the properties $(\star)$.

Examples:

$\mathbf{1}$: The usual absolute value function, $f(x) = \vert x \vert$, which measures the distance of a point on the real line from the origin, satisfies the properties $(\star)$.

$\mathbf{2}$: The function given by $f(x) = \vert x \vert^{\alpha}$, where $\alpha \in [0,1]$, also satisfies the properties $(\star)$.  Note that $\alpha = 0$ is the trivial function $$f(x) = \begin{cases} 0 & \text{if } x = 0\\ 1 & \text{if } x \neq 0\end{cases}$$ The case $\alpha = 1$, gives us the usual absolute value.

$\mathbf{3}$: $p$-adic absolute value: We shall now define a function on rational numbers as follows. Consider $a \in \mathbb{Q}$ and any prime number $p$. We have $a = \dfrac{m}n$, where $m,n \in \mathbb{Z}$. Note that we can always write any rational number as
$$a = \dfrac{m}n = p^{\beta} \dfrac{r}s$$where $(r,p) = 1 = (s,p)$. Essentially, we are pulling out all the powers of $p$ from both the numerator and denominator. Now define the function $f_p(a)$ as $f_p(a) = p^{-\beta}$ for $a\in \mathbb{Q} \backslash \{0\}$ and $f_p(0) = 0$.

Claim: The function described above satisfies the properties $(\star)$.
Proof:
Even before we check the properties, we need to check if the function is well-defined. It is not hard to check this, i.e., if $a = \dfrac{m_1}{n_1} = \dfrac{m_2}{n_2}$, where $\dfrac{m_1}{n_1} = p^{\beta_1} \dfrac{r_1}{s_1}$, $(r_1,p) = 1 = (s_1,p)$ and $\dfrac{m_2}{n_2} = p^{\beta_2} \dfrac{r_2}{s_2}$, $(r_2,p) = 1 = (s_2,p)$, then $\beta_1 = \beta_2$.

Trivially non-negativity is satisfied. Multiplicativity follows immediately from the fact that if $(k,l) = 1$ and $(t,l) = 1$, then $(tk,l) = 1$. Sub-additivity requires to be checked. Let $a = p^{\beta_a} \dfrac{m_a}{n_a}$ and $b = p^{\beta_b} \dfrac{m_b}{n_b}$. Since the function is well-defined, we can always choose $m_a,n_a$ and $m_b, n_b$ such that $(m_a,n_a) = 1$ and $(m_b,n_b) = 1$. We have
$$f_p(a) = \dfrac1{p^{\beta_a}} \text{ and } f_p(b) = \dfrac1{p^{\beta_b}}$$
Without loss of generality, let us assume $\beta_a \leq \beta_b$. We then have $$a+b = p^{\beta_a} \dfrac{m_a}{n_a} + p^{\beta_b} \dfrac{m_b}{n_b} = p^{\beta_a} \dfrac{m_a \cdot n_b + p^{\beta_b - \beta_a} m_b \cdot n_a}{n_a \cdot n_b}$$ Now consider $\dfrac{m_a \cdot n_b + p^{\beta_b - \beta_a} m_b \cdot n_a}{n_a \cdot n_b}$. No power of $p$ divides the denominator always. If $\beta_a < \beta_b$, then no power of $p$ divides the numerator as well. Hence, $f_p(a + b) = \dfrac1{p^{\beta_a}} = f_p(a)$. If $\beta_a = \beta_b$, then it is possible that some power of $p$ could divide the numerator. Then we have $f_p(a+b) = \dfrac1{p^{\beta_a + r}}$ where $r > 0$. Hence, either way, we have $$f_p(a+b) \leq \dfrac1{p_a} = \max\{f_p(a),f_p(b)\}.$$ Hence, we in-fact have a strong sub-additivity condition that
$$f_p(a+b) \leq \max \{f_p(a), f_p(b)\}$$Absolute values that satisfy this stronger property are called ultra-metric or non-Archimedean absolute values. We shall denote the $p$-adic absolute by the $\vert \cdot \vert_p$.

$\mathbf{4}$: The $p$-adic absolute value raised to the power $\alpha$, where $\alpha \geq 0$ also satisfies the properties $(\star)$, i.e., $\vert x \vert_p^{\alpha}$. The only thing that needs some work to be checked is the sub-additivity. The argument for sub-additivity is the same as before and this absolute value is also a non-Archimedean absolute value.

We now have $4$ examples of functions satisfying the properties $(\star)$. Strictly speaking, we only have $2$ examples. Examples $1$ and $3$ are special cases of examples $2$ and $4$ respectively, with $\alpha = 1$.

Ostrowski's theorem states that examples $2$ and $4$ gives us all the functions that satisfy $(\star)$, i.e., all possible absolute values on $\mathbb{Q}$ are given by examples $2$ and $4$.

Ostrowski's theorem is a powerful statement. In general, all classification theorems are powerful statements. If we have a property $(\dagger)$, we can construct objects satisfying property $(\dagger)$. Classification theorems enable us to classify all possible objects satisfying $(\dagger)$. This is usually done by finding example objects satisfying $(\dagger)$ and once we feel that there can be no more objects, we set above proving that any object satisfying property $(\dagger)$ falls in one of the categories/examples.

Now let us go above proving Ostrowski's theorem. We will prove that examples $2$ and $4$ are the only possible absolute values for the set of natural numbers. Hence, these could only be the possible set of absolute values for the rationals as well. We will split the theorem into two cases.

Case $1$: $f(n) \geq 1$, $\forall n \in \mathbb{Z}^+$. For this case, we shall show that $f(x) = \vert x \vert^{\alpha}$, where $\alpha \in [0,1]$ is the only such function.

We know that $f(1)$ has to be $1$. Consider $n \in \mathbb{Z}^+$ and not equal to $1$. Consider another $m \in \mathbb{Z}^+$. We can write $m$ in base $n$ as $$m = m_0 + m_1 n + m_2 n^2 + \cdots + m_k n^k$$where $m_l \in \{0,1,2,\ldots,n-1\}$. From sub-additivity of $f$, we have $$f(m) \leq f(m_0) + f(m_1) f(n) + f(m_2) f(n)^2 + \cdots + f(m_k) f(n)^k$$ Since $m_l \in \{0,1,2,\ldots,n-1\}$, we have $f(m_l) \leq n-1$. Hence, we get that \begin{align*} f(m) & \leq (n-1) \left(1 + f(n) + f(n)^2 + \cdots + (f(n))^k\right)\\ & < (n-1)(k+1) f(n)^k \leq (n-1)(\log_n(m)+1) f(n)^{\log_n(m)} \end{align*} Now replacing $m$ by $m^l$, we get that $$f(m^l) = f(m)^l \leq (n-1)(l \cdot \log_n(m)+1) f(n)^{l \cdot \log_n(m)}$$ Taking the $l^{th}$ root on both sides, we get that $$f(m) \leq (n-1)^{1/l} (1+l \cdot \log_n(m))^{1/l}f(n)^{\log_n(m)}$$ Now let $l \to \infty$, to get that $$f(m) \leq f(n)^{\log_n(m)}, \text{i.e.,} f(m)^{1/\log(m)} \leq f(n)^{1/\log(n)}$$ Since this inequality is symmetric in $n$ and $m$, we obtain $$f(n)^{1/\log(n)} = \text{constant}$$ for all $n \in \mathbb{Z}^+$. This gives us $$f(n) = \text{constant}^{\log(n)} = e^{\alpha \cdot \log(n)} = n^{\alpha}$$ Triangle inequality and the fact that $f(n) \geq 1$, $\forall n \in \mathbb{Z}^+$ gives us $\alpha \in [0,1]$.

Case $2$: $f(n) < 1$, for some $n \in \mathbb{Z}^+$. For this case, we shall show that $f(x) = \vert x \vert_p^{\alpha}$, where $\alpha \in \mathbb{R}^+$ is the only such function.

By well-ordering principle, we consider the smallest such $n \in \mathbb{Z}^+$ such that $f(n) < 1$. If $n$ is not a prime, then $n= ab$ for some $a,b \in \{2,3,\ldots,n-1,n\}$. This means $f(a) \cdot f(b) = f(n) < 1$. This means either $f(a) \in (0,1)$ or $f(b) \in (0,1)$, which contradicts the minimality of $n$. Hence, $n$ has to be a prime. Now consider any $m \in \mathbb{Z}^+$. We write $m$ in base $p$ as $$m = m_0 + m_1 p + m_2 p^2 + \cdots m_k p^k$$ We then have \begin{align*}f(m) & \leq f(m_0) + f(m_1) f(p) + f(m_2) f(p^2) + \cdots f(m_k) f(p^k)\\ & < (p-1) + (p-1) f(p) + (p-1) f(p) + \cdots + (p-1) f(p) = (k+1)(p-1) f(p)\\ & \leq (1+\log_p(m))(p-1)f(p) \end{align*} As before replace $m$ by $m^l$ to get $$f(m^l) = f(m)^l < (1+ l \log_p(m))(p-1) f(p)$$ This gives us $$f(m)^l < (1+l \log_p(m)) (p-1) f(p)$$ and taking the $l^{th}$ root on both sides and letting $l \to \infty$, gives us $$f(m) \leq 1$$ for all $m \in \mathbb{Z}^+$ and since $f(-m) = f(m)$, we also have $f(m) \leq 1, \forall m \in \mathbb{Z}$. The only thing left for us to show is that if $(m,p) = 1$, then $f(m) = 1$. (This is so since if $(m,p) \neq 1$, we can always write $m = p^{\gamma} \times u$, where $(u,p) = 1$ and make use of multiplicativity.) If $(m,p) = 1$, we have $(m^n,p^n) = 1$ for all integers $n$. Hence, there exists sequence of integer pairs $(x_n,y_n) \in \mathbb{Z}^2$ such that $$m^n x_n + p^n y_n = 1$$ We then have $$f(m^n x_n + p^n y_n) = f(1) = 1$$ This gives us $$1 \leq f(m)^n f(x_n) + f(p)^n f(y_n) \leq f(m)^n + f(p)^n$$ Letting $n \to \infty$, we have $f(p)^n \to 0$, since $f(p) < 1$. Hence, we have $f(m) = 1$, whenever $(m,p) = 1$. Hence, if $n = p_1^{\beta_1} p_2^{\beta_2} \cdots p_k^{\beta_k}$, and if $p_1$ is the smallest prime such that $f(p_1) < 1$, we then have $$f(n) = f(p_1^{\beta_1}) \cdot f(p_2^{\beta_2} p_3^{\beta_3} \cdots p_k^{\beta_k}) = f(p_1)^{\beta_1} \times 1 = f(p_1)^{\beta_1},$$ where $f(p_1)$ is some quantity less than $1$ given by $f(p_1) = p_1^{-\alpha}$, where $\alpha > 0$. This gives the $p$-adic absolute value.

Monday, March 25, 2013

$\displaystyle \int_0^{2\pi} \frac{\sin^2\theta}{a+b\cos\theta}\ d\theta$ using Catalan numbers

A recent post on math.stackexchange asked for a proof of the following statement:
$$\int_0^{2\pi} \frac{\sin^2\theta}{a+b\cos\theta}\ d\theta = \frac{2\pi}{b^2} \left(a-\sqrt{a^2-b^2} \right) $$
A typical way to evaluate this integral is to rewrite the integral as
$$\int_0^{2\pi} \frac{\sin^2\theta}{a+b\cos\theta}\ d\theta = \oint_{\vert z \vert = 1} \dfrac{\left( \dfrac{z-1/z}{2i}\right)^2}{a+b \left(\dfrac{z+1/z}2\right)} \dfrac{dz}{iz}$$ and then do some algebraic massaging and make use of residue theorem to evaluate this integral. In this post, we look at an elementary method to evaluate this integral. (I wrote this as an answer to that question on math.stackexchange.)

Proof:

$$I = \int_0^{2 \pi} \dfrac{\sin^2(x)}{a+b \cos(x)} dx \implies aI = \int_0^{2 \pi} \dfrac{\sin^2(x)}{1+\dfrac{b \cos(x)}a}dx$$
\begin{align}
aI & = \sum_{k=0}^{\infty}\left(\dfrac{(-b)^k}{a^k} \int_0^{2 \pi}\sin^2(x) \cos^k(x) dx \right) = \sum_{k=0}^{\infty}\left(\dfrac{b^{2k}}{a^{2k}} \int_0^{2 \pi}\sin^2(x) \cos^{2k}(x) dx \right)
\end{align}
Note that we have thrown away the odd terms, since for odd $k$, the integral $\displaystyle \int_0^{2 \pi}\sin^2(x) \cos^k(x) dx$ is zero.
\begin{align}
\dfrac{\displaystyle \int_0^{2 \pi}\sin^2(x) \cos^{2k}(x) dx}4 & = \int_0^{\pi/2}\sin^2(x) \cos^{2k}(x) dx\\
&
\int_0^{\pi/2}\cos^{2k}(x) dx - \int_0^{\pi/2}\cos^{2k+2}(x) dx\\
& = \dfrac{2k-1}{2k} \dfrac{2k-3}{2k-2} \cdots \dfrac12 \dfrac{\pi}2 - \dfrac{2k+1}{2k+2} \dfrac{2k-1}{2k} \dfrac{2k-3}{2k-2} \cdots \dfrac12 \dfrac{\pi}2\\
& = \dfrac1{2k+2} \dfrac{2k-1}{2k} \dfrac{2k-3}{2k-2} \cdots \dfrac12 \dfrac{\pi}2\\
& = \dfrac{\pi}{2^{k+2}} \dfrac{(2k-1)(2k-3)\cdots3 \cdot 1}{(k+1)!} = \dfrac{\pi}{2^{2k+2}} \dfrac{(2k)!}{k! (k+1)!}
\end{align}
Hence,
$$\dfrac{aI}{\pi} = \sum_{k=0}^{\infty} \left(\dfrac{b}{2a} \right)^{2k} \underbrace{\dfrac{(2k)!}{k! (k+1)!}}_{\text{Catalan numbers}}$$
Now $$\sum_{k=0}^{\infty} \dfrac{\dbinom{2k}k x^{k}}{k+1} = \dfrac{1-\sqrt{1-4x}}{2x} \,\,\,\,\,\, \forall \vert x \vert < \dfrac14$$ This is the generating function for the Catalan numbers. Mike Spivey has a nice little post on Catalan numbers, where you can find the expression and generating function of the Catalan numbers. Hence, in our case, we get that
$$\sum_{k=0}^{\infty} \left(\dfrac{b}{2a} \right)^{2k} \underbrace{\dfrac{(2k)!}{k! (k+1)!}}_{\text{Catalan numbers}} = \dfrac{1-\sqrt{1-4 \cdot \left(\dfrac{b}{2a} \right)^2}}{2 \cdot \left(\dfrac{b}{2a} \right)^2} \,\,\,\,\,\,\,\,\forall \dfrac{b}{2a} < \dfrac12$$
Hence,
$$\dfrac{aI}{\pi} = \dfrac{1-\sqrt{1-\left(\dfrac{b}a\right)^2}}{\dfrac{b^2}{2a^2}} = \dfrac{a-\sqrt{a^2-b^2}}{\dfrac{b^2}{2a}} \implies I = \dfrac{2\pi}{b^2} (a-\sqrt{a^2-b^2})$$

Friday, March 15, 2013

Generalization of Millin series

The series $$\sum_{k=0}^{\infty} \dfrac1{F_{2^k}} = \dfrac{7-\sqrt5}2$$where $F_m$ denotes the $m^{th}$ Fibonacci number is called the Millin series. In this post, we will look at a generalization of this result for recurrences of the form
$$a_{n+2} = b a_{n+1} + a_n$$
where $a_0=0$, $a_1 = 1$ and $b \in \mathbb{R}$. Note that in the case of the Fibonacci numbers, we have $b = 1$.

Claim: For any $b \in \mathbb{R}$, we have $$\sum_{k=0}^{\infty} \dfrac1{a_{2^k}} = 1 + \dfrac2b + \dfrac{b - \sqrt{b^2+4}}2 \,\,\,\,\, (\heartsuit)$$

Note that if we prove the above claim, then setting $b=1$ gives us
$$\sum_{k=0}^{\infty} \dfrac1{F_{2^k}} = 1 + 2 + \dfrac{1-\sqrt5}2 = 3 + \dfrac{1-\sqrt5}2 = \dfrac{7-\sqrt5}2$$which is the Millin series.

Proof:  Before we prove this, let us first obtain a result for the finite sum counterpart in $(\heartsuit)$.
----------------------------------------------------------------------------------------------------------------------------------------------------------
Lemma: If we have a sequence given by the recurrence,
$$a_{n+2} = ba_{n+1} + a_n,$$ with $a_0 =0 $ and $a_1 = 1$, we then have
$$\boxed{\color{blue}{\displaystyle \sum_{k=0}^{N} \dfrac1{a_{2^k}} = 1 + \dfrac2b - \dfrac{a_{2^N-1}}{a_{2^N}}}}$$

Proof: Let us write out a few terms of this sequence, we get
$$a_0 = 0, a_1 = 1, a_2 = b, a_3 = b^2 + 1, a_4 = b^3 + 2b, \cdots$$
The proof is by induction on $N$. For $N=1$, we have the left hand side to be $$\dfrac1{a_1} + \dfrac1{a_2} = 1 + \dfrac1b$$ while the right hand side is $$1 + \dfrac2b - \dfrac{a_1}{a_2} = 1 + \dfrac2b - \dfrac1{b} = 1 + \dfrac1b$$
For $N=2$, we have the left hand side to be $$\dfrac1{a_1} + \dfrac1{a_2} + \dfrac1{a_4} = 1 + \dfrac1b + \dfrac1{b^3 + 2b}$$ while the right hand side is $$1 + \dfrac2b - \dfrac{a_3}{a_4} = 1 + \dfrac2b - \dfrac{b^2+1}{b^3+2b} = 1 + \dfrac1b + \dfrac1b - \dfrac{b^2+1}{b^3+2b} = 1 + \dfrac1b + \dfrac1{b^3+2b}$$ Hence, it holds for $N=1$ and $N=2$. Now lets go ahead with induction now. Assume the result is true for $N=m$, i.e., we have
$$\sum_{k=0}^{m} \dfrac1{a_{2^k}} = 1 + \dfrac2b - \dfrac{a_{2^m-1}}{a_{2^m}}$$
Now $$\sum_{k=0}^{m+1} \dfrac1{a_{2^k}} = 1 + \dfrac2b - \dfrac{a_{2^m-1}}{a_{2^m}} + \dfrac1{a_{2^{m+1}}}$$ Hence, we want to show that
$$ - \dfrac{a_{2^m-1}}{a_{2^m}} + \dfrac1{a_{2^{m+1}}} = -\dfrac{a_{2^{m+1}-1}}{a_{2^{m+1}}}, \,\, \text{i.e.}, \,\, \dfrac1{a_{2^{m+1}}} + \dfrac{a_{2^{m+1}-1}}{a_{2^{m+1}}} =  \dfrac{a_{2^m-1}}{a_{2^m}}, \,\, \text{i.e.}, \,\, a_{2^m}(1+a_{2^{m+1}-1}) = a_{2^m-1} a_{2^{m+1}} \,\,\,\, (\star)$$
which can be verified using the recurrence. In fact $(\dagger)$, a slightly more general version of $(\star)$, which is easier to check is true.
$$a_{2k}(1+a_{4k-1}) = a_{2k-1} a_{4k} \,\,\,\, (\dagger),\,\, \text{i.e.}, \,\, a_{2k-1} a_{4k} - a_{2k} a_{4k-1} = a_{2k} \,\,\,\, (\dagger)$$
Hence, we get that
$$\boxed{\color{red}{\displaystyle \sum_{k=0}^{N} \dfrac1{a_{2^k}} = 1 + \dfrac2b - \dfrac{a_{2^N-1}}{a_{2^N}}}}$$
----------------------------------------------------------------------------------------------------------------------------------------------------------
Now letting $N \to \infty$, we get what we want immediately. This is so since from the recurrence we get that
$$\dfrac{a_{n+2}}{a_{n+1}} = b + \dfrac{a_n}{a_{n+1}}$$
If we have $\displaystyle \lim_{n \to \infty} \dfrac{a_n}{a_{n+1}} = L$, then we get that
$$\dfrac1L = b + L$$ and since $L>0$, we have $L = \dfrac{\sqrt{b^2+4}-b}2$. Hence,
$$\boxed{\color{red}{\displaystyle \sum_{k=0}^{\infty} \dfrac1{a_{2^k}} = \lim_{N \to \infty} \displaystyle \sum_{k=0}^{N} \dfrac1{a_{2^k}} = 1 + \dfrac2b - \lim_{N \to \infty} \dfrac{a_{2^N-1}}{a_{2^N}} = 1 + \dfrac2b - L = 1 + \dfrac2b + \dfrac{b-\sqrt{b^2+4}}2}}$$

Saturday, March 2, 2013

Euler's Theorem

Euler's theorem in elementary number theory goes as
$$c^{\phi(n)} \equiv 1 \pmod{n}$$
where $\gcd(c,n) = 1$. This is one of first classical results one gets introduced to in elementary number theory is Euler's theorem. This post look at a well-known proof of this theorem.

Proof: Consider the set of all numbers less than $n$ and relatively prime to it. Let $\{a_1,a_2,...,a_{\varphi(n)}\}$ be this set.

Consider a number $c < n$ and relatively prime to it i.e. $c \in \{a_1,a_2,\ldots,a_{\varphi(n)}\}$.

First observe that for any $a_i$, $c a_{i} \equiv a_{j} (\bmod n)$ for some $j$. This is true since $c$ and $a_i$ are themselves relatively prime to $n$, their product has to be relatively prime to $n$.

If $c a_{i} \equiv c a_{j} (\bmod n)$ then $a_i = a_j$. Note that cancellation can be done since $c$ is relatively prime to $n$.

Hence, if we now consider the set $\{ca_1,ca_2,...,ca_{\varphi(n)}\}$ this is just a permutation of the set $\{a_1,a_2,...,a_{\varphi(n)}\}$.

Thereby, we have $\displaystyle \prod_{k=1}^{\varphi(n)} ca_k \equiv \prod_{k=1}^{\varphi(n)} a_k (\bmod n)$.

Hence, we get $\displaystyle c^{\varphi(n)} \prod_{k=1}^{\varphi(n)} a_k \equiv \prod_{k=1}^{\varphi(n)} a_k (\bmod n)$.

Now, note that $\displaystyle \prod_{k=1}^{\varphi(n)} a_k$ is relatively prime to $n$ and hence you can cancel them on both sides to get

$$c^{\varphi(n)} \equiv 1 (\bmod n)$$ whenever $\gcd(c,n) = 1$