In this blog post, we look at yet another fascinating way in which prime numbers appear in mathematics. The result below is called the Ostrowski's theorem, named after the Russian mathematician Alexander Ostrowski.
Before we go into the theorem, let us attempt to define absolute value on the set of rationals. We call a function $f: \mathbb{Q} \mapsto \mathbb{R}^+ \cup \{0\}$ an absolute value, if the following properties $(\star)$ are satisfied:
Before we go into the theorem, let us attempt to define absolute value on the set of rationals. We call a function $f: \mathbb{Q} \mapsto \mathbb{R}^+ \cup \{0\}$ an absolute value, if the following properties $(\star)$ are satisfied:
- Non-negativity: $f(0) = 0$ and $f(x) > 0$ for all $x > \mathbb{Q} \backslash \{0\}$.
- Multiplicativity: $f(x \cdot y) = f(x) \cdot f(y)$.
- Sub-additivity (or) Triangle inequality: $f(x+y) \leq f(x) + f(y)$.
Note: From $(1)$ and $(2)$, we have that $f(-1) = f(1) = 1$. This also gives us $f(-x) = f(x)$ and $f \left(x^k \right) = f \left(x \right)^k$. In addition, from $(3)$, we also get that $f(n) \leq \vert n \vert$, for $n \in \mathbb{Z}$.
Examples:
$\mathbf{1}$: The usual absolute value function, $f(x) = \vert x \vert$, which measures the distance of a point on the real line from the origin, satisfies the properties $(\star)$.
$\mathbf{2}$: The function given by $f(x) = \vert x \vert^{\alpha}$, where $\alpha \in [0,1]$, also satisfies the properties $(\star)$. Note that $\alpha = 0$ is the trivial function $$f(x) = \begin{cases} 0 & \text{if } x = 0\\ 1 & \text{if } x \neq 0\end{cases}$$ The case $\alpha = 1$, gives us the usual absolute value.
The natural question is can be categorize/classify functions that satisfy the properties described above. To do so, let us look at some examples of function that satisfy the properties $(\star)$.
Examples:
$\mathbf{1}$: The usual absolute value function, $f(x) = \vert x \vert$, which measures the distance of a point on the real line from the origin, satisfies the properties $(\star)$.
$\mathbf{2}$: The function given by $f(x) = \vert x \vert^{\alpha}$, where $\alpha \in [0,1]$, also satisfies the properties $(\star)$. Note that $\alpha = 0$ is the trivial function $$f(x) = \begin{cases} 0 & \text{if } x = 0\\ 1 & \text{if } x \neq 0\end{cases}$$ The case $\alpha = 1$, gives us the usual absolute value.
$\mathbf{3}$: $p$-adic absolute value: We shall now define a function on rational numbers as follows. Consider $a \in \mathbb{Q}$ and any prime number $p$. We have $a = \dfrac{m}n$, where $m,n \in \mathbb{Z}$. Note that we can always write any rational number as
$$a = \dfrac{m}n = p^{\beta} \dfrac{r}s$$where $(r,p) = 1 = (s,p)$. Essentially, we are pulling out all the powers of $p$ from both the numerator and denominator. Now define the function $f_p(a)$ as $f_p(a) = p^{-\beta}$ for $a\in \mathbb{Q} \backslash \{0\}$ and $f_p(0) = 0$.
Claim: The function described above satisfies the properties $(\star)$.
Proof:
Even before we check the properties, we need to check if the function is well-defined. It is not hard to check this, i.e., if $a = \dfrac{m_1}{n_1} = \dfrac{m_2}{n_2}$, where $\dfrac{m_1}{n_1} = p^{\beta_1} \dfrac{r_1}{s_1}$, $(r_1,p) = 1 = (s_1,p)$ and $\dfrac{m_2}{n_2} = p^{\beta_2} \dfrac{r_2}{s_2}$, $(r_2,p) = 1 = (s_2,p)$, then $\beta_1 = \beta_2$.
Trivially non-negativity is satisfied. Multiplicativity follows immediately from the fact that if $(k,l) = 1$ and $(t,l) = 1$, then $(tk,l) = 1$. Sub-additivity requires to be checked. Let $a = p^{\beta_a} \dfrac{m_a}{n_a}$ and $b = p^{\beta_b} \dfrac{m_b}{n_b}$. Since the function is well-defined, we can always choose $m_a,n_a$ and $m_b, n_b$ such that $(m_a,n_a) = 1$ and $(m_b,n_b) = 1$. We have
$$f_p(a) = \dfrac1{p^{\beta_a}} \text{ and } f_p(b) = \dfrac1{p^{\beta_b}}$$
Without loss of generality, let us assume $\beta_a \leq \beta_b$. We then have $$a+b = p^{\beta_a} \dfrac{m_a}{n_a} + p^{\beta_b} \dfrac{m_b}{n_b} = p^{\beta_a} \dfrac{m_a \cdot n_b + p^{\beta_b - \beta_a} m_b \cdot n_a}{n_a \cdot n_b}$$ Now consider $\dfrac{m_a \cdot n_b + p^{\beta_b - \beta_a} m_b \cdot n_a}{n_a \cdot n_b}$. No power of $p$ divides the denominator always. If $\beta_a < \beta_b$, then no power of $p$ divides the numerator as well. Hence, $f_p(a + b) = \dfrac1{p^{\beta_a}} = f_p(a)$. If $\beta_a = \beta_b$, then it is possible that some power of $p$ could divide the numerator. Then we have $f_p(a+b) = \dfrac1{p^{\beta_a + r}}$ where $r > 0$. Hence, either way, we have $$f_p(a+b) \leq \dfrac1{p_a} = \max\{f_p(a),f_p(b)\}.$$ Hence, we in-fact have a strong sub-additivity condition that
$$f_p(a+b) \leq \max \{f_p(a), f_p(b)\}$$Absolute values that satisfy this stronger property are called ultra-metric or non-Archimedean absolute values. We shall denote the $p$-adic absolute by the $\vert \cdot \vert_p$.
$\mathbf{4}$: The $p$-adic absolute value raised to the power $\alpha$, where $\alpha \geq 0$ also satisfies the properties $(\star)$, i.e., $\vert x \vert_p^{\alpha}$. The only thing that needs some work to be checked is the sub-additivity. The argument for sub-additivity is the same as before and this absolute value is also a non-Archimedean absolute value.
We now have $4$ examples of functions satisfying the properties $(\star)$. Strictly speaking, we only have $2$ examples. Examples $1$ and $3$ are special cases of examples $2$ and $4$ respectively, with $\alpha = 1$.
Ostrowski's theorem states that examples $2$ and $4$ gives us all the functions that satisfy $(\star)$, i.e., all possible absolute values on $\mathbb{Q}$ are given by examples $2$ and $4$.
Ostrowski's theorem is a powerful statement. In general, all classification theorems are powerful statements. If we have a property $(\dagger)$, we can construct objects satisfying property $(\dagger)$. Classification theorems enable us to classify all possible objects satisfying $(\dagger)$. This is usually done by finding example objects satisfying $(\dagger)$ and once we feel that there can be no more objects, we set above proving that any object satisfying property $(\dagger)$ falls in one of the categories/examples.
Now let us go above proving Ostrowski's theorem. We will prove that examples $2$ and $4$ are the only possible absolute values for the set of natural numbers. Hence, these could only be the possible set of absolute values for the rationals as well. We will split the theorem into two cases.
Case $1$: $f(n) \geq 1$, $\forall n \in \mathbb{Z}^+$. For this case, we shall show that $f(x) = \vert x \vert^{\alpha}$, where $\alpha \in [0,1]$ is the only such function.
We know that $f(1)$ has to be $1$. Consider $n \in \mathbb{Z}^+$ and not equal to $1$. Consider another $m \in \mathbb{Z}^+$. We can write $m$ in base $n$ as $$m = m_0 + m_1 n + m_2 n^2 + \cdots + m_k n^k$$where $m_l \in \{0,1,2,\ldots,n-1\}$. From sub-additivity of $f$, we have $$f(m) \leq f(m_0) + f(m_1) f(n) + f(m_2) f(n)^2 + \cdots + f(m_k) f(n)^k$$ Since $m_l \in \{0,1,2,\ldots,n-1\}$, we have $f(m_l) \leq n-1$. Hence, we get that \begin{align*} f(m) & \leq (n-1) \left(1 + f(n) + f(n)^2 + \cdots + (f(n))^k\right)\\ & < (n-1)(k+1) f(n)^k \leq (n-1)(\log_n(m)+1) f(n)^{\log_n(m)} \end{align*} Now replacing $m$ by $m^l$, we get that $$f(m^l) = f(m)^l \leq (n-1)(l \cdot \log_n(m)+1) f(n)^{l \cdot \log_n(m)}$$ Taking the $l^{th}$ root on both sides, we get that $$f(m) \leq (n-1)^{1/l} (1+l \cdot \log_n(m))^{1/l}f(n)^{\log_n(m)}$$ Now let $l \to \infty$, to get that $$f(m) \leq f(n)^{\log_n(m)}, \text{i.e.,} f(m)^{1/\log(m)} \leq f(n)^{1/\log(n)}$$ Since this inequality is symmetric in $n$ and $m$, we obtain $$f(n)^{1/\log(n)} = \text{constant}$$ for all $n \in \mathbb{Z}^+$. This gives us $$f(n) = \text{constant}^{\log(n)} = e^{\alpha \cdot \log(n)} = n^{\alpha}$$ Triangle inequality and the fact that $f(n) \geq 1$, $\forall n \in \mathbb{Z}^+$ gives us $\alpha \in [0,1]$.
Case $2$: $f(n) < 1$, for some $n \in \mathbb{Z}^+$. For this case, we shall show that $f(x) = \vert x \vert_p^{\alpha}$, where $\alpha \in \mathbb{R}^+$ is the only such function.
By well-ordering principle, we consider the smallest such $n \in \mathbb{Z}^+$ such that $f(n) < 1$. If $n$ is not a prime, then $n= ab$ for some $a,b \in \{2,3,\ldots,n-1,n\}$. This means $f(a) \cdot f(b) = f(n) < 1$. This means either $f(a) \in (0,1)$ or $f(b) \in (0,1)$, which contradicts the minimality of $n$. Hence, $n$ has to be a prime. Now consider any $m \in \mathbb{Z}^+$. We write $m$ in base $p$ as $$m = m_0 + m_1 p + m_2 p^2 + \cdots m_k p^k$$ We then have \begin{align*}f(m) & \leq f(m_0) + f(m_1) f(p) + f(m_2) f(p^2) + \cdots f(m_k) f(p^k)\\ & < (p-1) + (p-1) f(p) + (p-1) f(p) + \cdots + (p-1) f(p) = (k+1)(p-1) f(p)\\ & \leq (1+\log_p(m))(p-1)f(p) \end{align*} As before replace $m$ by $m^l$ to get $$f(m^l) = f(m)^l < (1+ l \log_p(m))(p-1) f(p)$$ This gives us $$f(m)^l < (1+l \log_p(m)) (p-1) f(p)$$ and taking the $l^{th}$ root on both sides and letting $l \to \infty$, gives us $$f(m) \leq 1$$ for all $m \in \mathbb{Z}^+$ and since $f(-m) = f(m)$, we also have $f(m) \leq 1, \forall m \in \mathbb{Z}$. The only thing left for us to show is that if $(m,p) = 1$, then $f(m) = 1$. (This is so since if $(m,p) \neq 1$, we can always write $m = p^{\gamma} \times u$, where $(u,p) = 1$ and make use of multiplicativity.) If $(m,p) = 1$, we have $(m^n,p^n) = 1$ for all integers $n$. Hence, there exists sequence of integer pairs $(x_n,y_n) \in \mathbb{Z}^2$ such that $$m^n x_n + p^n y_n = 1$$ We then have $$f(m^n x_n + p^n y_n) = f(1) = 1$$ This gives us $$1 \leq f(m)^n f(x_n) + f(p)^n f(y_n) \leq f(m)^n + f(p)^n$$ Letting $n \to \infty$, we have $f(p)^n \to 0$, since $f(p) < 1$. Hence, we have $f(m) = 1$, whenever $(m,p) = 1$. Hence, if $n = p_1^{\beta_1} p_2^{\beta_2} \cdots p_k^{\beta_k}$, and if $p_1$ is the smallest prime such that $f(p_1) < 1$, we then have $$f(n) = f(p_1^{\beta_1}) \cdot f(p_2^{\beta_2} p_3^{\beta_3} \cdots p_k^{\beta_k}) = f(p_1)^{\beta_1} \times 1 = f(p_1)^{\beta_1},$$ where $f(p_1)$ is some quantity less than $1$ given by $f(p_1) = p_1^{-\alpha}$, where $\alpha > 0$. This gives the $p$-adic absolute value.
Claim: The function described above satisfies the properties $(\star)$.
Proof:
Even before we check the properties, we need to check if the function is well-defined. It is not hard to check this, i.e., if $a = \dfrac{m_1}{n_1} = \dfrac{m_2}{n_2}$, where $\dfrac{m_1}{n_1} = p^{\beta_1} \dfrac{r_1}{s_1}$, $(r_1,p) = 1 = (s_1,p)$ and $\dfrac{m_2}{n_2} = p^{\beta_2} \dfrac{r_2}{s_2}$, $(r_2,p) = 1 = (s_2,p)$, then $\beta_1 = \beta_2$.
Trivially non-negativity is satisfied. Multiplicativity follows immediately from the fact that if $(k,l) = 1$ and $(t,l) = 1$, then $(tk,l) = 1$. Sub-additivity requires to be checked. Let $a = p^{\beta_a} \dfrac{m_a}{n_a}$ and $b = p^{\beta_b} \dfrac{m_b}{n_b}$. Since the function is well-defined, we can always choose $m_a,n_a$ and $m_b, n_b$ such that $(m_a,n_a) = 1$ and $(m_b,n_b) = 1$. We have
$$f_p(a) = \dfrac1{p^{\beta_a}} \text{ and } f_p(b) = \dfrac1{p^{\beta_b}}$$
Without loss of generality, let us assume $\beta_a \leq \beta_b$. We then have $$a+b = p^{\beta_a} \dfrac{m_a}{n_a} + p^{\beta_b} \dfrac{m_b}{n_b} = p^{\beta_a} \dfrac{m_a \cdot n_b + p^{\beta_b - \beta_a} m_b \cdot n_a}{n_a \cdot n_b}$$ Now consider $\dfrac{m_a \cdot n_b + p^{\beta_b - \beta_a} m_b \cdot n_a}{n_a \cdot n_b}$. No power of $p$ divides the denominator always. If $\beta_a < \beta_b$, then no power of $p$ divides the numerator as well. Hence, $f_p(a + b) = \dfrac1{p^{\beta_a}} = f_p(a)$. If $\beta_a = \beta_b$, then it is possible that some power of $p$ could divide the numerator. Then we have $f_p(a+b) = \dfrac1{p^{\beta_a + r}}$ where $r > 0$. Hence, either way, we have $$f_p(a+b) \leq \dfrac1{p_a} = \max\{f_p(a),f_p(b)\}.$$ Hence, we in-fact have a strong sub-additivity condition that
$$f_p(a+b) \leq \max \{f_p(a), f_p(b)\}$$Absolute values that satisfy this stronger property are called ultra-metric or non-Archimedean absolute values. We shall denote the $p$-adic absolute by the $\vert \cdot \vert_p$.
$\mathbf{4}$: The $p$-adic absolute value raised to the power $\alpha$, where $\alpha \geq 0$ also satisfies the properties $(\star)$, i.e., $\vert x \vert_p^{\alpha}$. The only thing that needs some work to be checked is the sub-additivity. The argument for sub-additivity is the same as before and this absolute value is also a non-Archimedean absolute value.
We now have $4$ examples of functions satisfying the properties $(\star)$. Strictly speaking, we only have $2$ examples. Examples $1$ and $3$ are special cases of examples $2$ and $4$ respectively, with $\alpha = 1$.
Ostrowski's theorem states that examples $2$ and $4$ gives us all the functions that satisfy $(\star)$, i.e., all possible absolute values on $\mathbb{Q}$ are given by examples $2$ and $4$.
Ostrowski's theorem is a powerful statement. In general, all classification theorems are powerful statements. If we have a property $(\dagger)$, we can construct objects satisfying property $(\dagger)$. Classification theorems enable us to classify all possible objects satisfying $(\dagger)$. This is usually done by finding example objects satisfying $(\dagger)$ and once we feel that there can be no more objects, we set above proving that any object satisfying property $(\dagger)$ falls in one of the categories/examples.
Now let us go above proving Ostrowski's theorem. We will prove that examples $2$ and $4$ are the only possible absolute values for the set of natural numbers. Hence, these could only be the possible set of absolute values for the rationals as well. We will split the theorem into two cases.
Case $1$: $f(n) \geq 1$, $\forall n \in \mathbb{Z}^+$. For this case, we shall show that $f(x) = \vert x \vert^{\alpha}$, where $\alpha \in [0,1]$ is the only such function.
We know that $f(1)$ has to be $1$. Consider $n \in \mathbb{Z}^+$ and not equal to $1$. Consider another $m \in \mathbb{Z}^+$. We can write $m$ in base $n$ as $$m = m_0 + m_1 n + m_2 n^2 + \cdots + m_k n^k$$where $m_l \in \{0,1,2,\ldots,n-1\}$. From sub-additivity of $f$, we have $$f(m) \leq f(m_0) + f(m_1) f(n) + f(m_2) f(n)^2 + \cdots + f(m_k) f(n)^k$$ Since $m_l \in \{0,1,2,\ldots,n-1\}$, we have $f(m_l) \leq n-1$. Hence, we get that \begin{align*} f(m) & \leq (n-1) \left(1 + f(n) + f(n)^2 + \cdots + (f(n))^k\right)\\ & < (n-1)(k+1) f(n)^k \leq (n-1)(\log_n(m)+1) f(n)^{\log_n(m)} \end{align*} Now replacing $m$ by $m^l$, we get that $$f(m^l) = f(m)^l \leq (n-1)(l \cdot \log_n(m)+1) f(n)^{l \cdot \log_n(m)}$$ Taking the $l^{th}$ root on both sides, we get that $$f(m) \leq (n-1)^{1/l} (1+l \cdot \log_n(m))^{1/l}f(n)^{\log_n(m)}$$ Now let $l \to \infty$, to get that $$f(m) \leq f(n)^{\log_n(m)}, \text{i.e.,} f(m)^{1/\log(m)} \leq f(n)^{1/\log(n)}$$ Since this inequality is symmetric in $n$ and $m$, we obtain $$f(n)^{1/\log(n)} = \text{constant}$$ for all $n \in \mathbb{Z}^+$. This gives us $$f(n) = \text{constant}^{\log(n)} = e^{\alpha \cdot \log(n)} = n^{\alpha}$$ Triangle inequality and the fact that $f(n) \geq 1$, $\forall n \in \mathbb{Z}^+$ gives us $\alpha \in [0,1]$.
Case $2$: $f(n) < 1$, for some $n \in \mathbb{Z}^+$. For this case, we shall show that $f(x) = \vert x \vert_p^{\alpha}$, where $\alpha \in \mathbb{R}^+$ is the only such function.
By well-ordering principle, we consider the smallest such $n \in \mathbb{Z}^+$ such that $f(n) < 1$. If $n$ is not a prime, then $n= ab$ for some $a,b \in \{2,3,\ldots,n-1,n\}$. This means $f(a) \cdot f(b) = f(n) < 1$. This means either $f(a) \in (0,1)$ or $f(b) \in (0,1)$, which contradicts the minimality of $n$. Hence, $n$ has to be a prime. Now consider any $m \in \mathbb{Z}^+$. We write $m$ in base $p$ as $$m = m_0 + m_1 p + m_2 p^2 + \cdots m_k p^k$$ We then have \begin{align*}f(m) & \leq f(m_0) + f(m_1) f(p) + f(m_2) f(p^2) + \cdots f(m_k) f(p^k)\\ & < (p-1) + (p-1) f(p) + (p-1) f(p) + \cdots + (p-1) f(p) = (k+1)(p-1) f(p)\\ & \leq (1+\log_p(m))(p-1)f(p) \end{align*} As before replace $m$ by $m^l$ to get $$f(m^l) = f(m)^l < (1+ l \log_p(m))(p-1) f(p)$$ This gives us $$f(m)^l < (1+l \log_p(m)) (p-1) f(p)$$ and taking the $l^{th}$ root on both sides and letting $l \to \infty$, gives us $$f(m) \leq 1$$ for all $m \in \mathbb{Z}^+$ and since $f(-m) = f(m)$, we also have $f(m) \leq 1, \forall m \in \mathbb{Z}$. The only thing left for us to show is that if $(m,p) = 1$, then $f(m) = 1$. (This is so since if $(m,p) \neq 1$, we can always write $m = p^{\gamma} \times u$, where $(u,p) = 1$ and make use of multiplicativity.) If $(m,p) = 1$, we have $(m^n,p^n) = 1$ for all integers $n$. Hence, there exists sequence of integer pairs $(x_n,y_n) \in \mathbb{Z}^2$ such that $$m^n x_n + p^n y_n = 1$$ We then have $$f(m^n x_n + p^n y_n) = f(1) = 1$$ This gives us $$1 \leq f(m)^n f(x_n) + f(p)^n f(y_n) \leq f(m)^n + f(p)^n$$ Letting $n \to \infty$, we have $f(p)^n \to 0$, since $f(p) < 1$. Hence, we have $f(m) = 1$, whenever $(m,p) = 1$. Hence, if $n = p_1^{\beta_1} p_2^{\beta_2} \cdots p_k^{\beta_k}$, and if $p_1$ is the smallest prime such that $f(p_1) < 1$, we then have $$f(n) = f(p_1^{\beta_1}) \cdot f(p_2^{\beta_2} p_3^{\beta_3} \cdots p_k^{\beta_k}) = f(p_1)^{\beta_1} \times 1 = f(p_1)^{\beta_1},$$ where $f(p_1)$ is some quantity less than $1$ given by $f(p_1) = p_1^{-\alpha}$, where $\alpha > 0$. This gives the $p$-adic absolute value.