From: Christoph Fuerst Date: Fri, 19 May 2017 20:08:54 +0000 (+0200) Subject: Edited report X-Git-Url: http://git.risc.jku.at/gitweb/?a=commitdiff_plain;h=329cebc19ecf258abe1a1c31b1fffd067755bdf0;p=cfuerst%2Fformal-numbers.git Edited report --- diff --git a/report/formal.pdf b/report/formal.pdf index a7d5b0a..bfe3efc 100644 Binary files a/report/formal.pdf and b/report/formal.pdf differ diff --git a/report/formal.tex b/report/formal.tex index 4c0cc8c..13ae054 100644 --- a/report/formal.tex +++ b/report/formal.tex @@ -59,12 +59,12 @@ stepnumber=1, % the step between two line-numbers. If it's 1, each line will be numbered tabsize=2, % sets default tabsize to 2 spaces } - +\usepackage{hyperref} \newcommand{\CC}{C\nolinebreak\hspace{-.05em}\raisebox{.4ex}{\tiny\bf +}\nolinebreak\hspace{-.10em}\raisebox{.4ex}{\tiny\bf +}} \def\CC{{C\nolinebreak[4]\hspace{-.05em}\raisebox{.4ex}{\tiny\bf ++}}} %opening -\title{Formal Verification of Algorithms arising in Cryptography} +\title{Formalization of Two Algorithms arising in Number Theory} \author{Christoph F\"urst\qquad Wolfgang Schreiner\\ Research Institute for Symbolic Computation (RISC) Linz} \date{\today} @@ -73,23 +73,56 @@ \begin{abstract} We present a formal verification of algorithms that are used -in applied Cryptography. Classic cryptographic algorithms rest -on the mathematical field of number theory, where abstract -mathematical problems are presented and algorithms to solve -them are discussed. These mathematical preliminaries are then +in Number Theory. Classic cryptographic algorithms rest +on number theory, where abstract mathematical problems are presented +and algorithms to solve them are discussed. These mathematical preliminaries are then used to ensure secure encryption of secret messages. We prove formal correctness by considering concrete implementations and use the recently developed verification system {\rm RISCAL} to prove exactness of the treated algorithms. We present the theories -developed in RISCAL, and put them into context of formal verification +developed in {\rm RISCAL}, and put them into context of formal verification systems. \end{abstract} \tableofcontents \section{Introduction} - -\section{Mathematical and Cryptographic Preliminaries} +In mathematics and computer science, we are in position to design and implement algorithms, +that operate on certain inputs and compute outputs that satisfy certain constraints on the +input parameters. We have building blocks, that connect like a puzzle, and allow us to +compute the output without further justification. However, already the input domain +is often rather general, e.g. we deal with unbounded domains of values. Aesthethic +beauty in mathematics lies in its generality, this is, we try to formulate algorithms +and theorems in full generality, on the largest domain, such as the category of rings, +vector spaces and even more general constructions. Unfortunately, the price we pay for +generality is that we have to apply more human insight in proving statements, and make it +hard for the computer to formally verify statements and prove formal (partial) correctness.\\ + +To ease the job of formal verification, we split large problems into smaller and clear subroutines +which can be verified independently. A concrete approach is the \emph{Design by Contract}-principle, which +specifies the input parameters in their full form, and describes the output computed by the algorithm in +detail. Design by Contract allows to prove partial correctness of the considered algorithms. +In its essence, its a concrete instance of abstract mathematical theories that are of value for +the implementation of a real-life task.\\ + +Such Design by Contracts have been incoorporated into several programming languages, such as Java (by the +Java Modelling Language JML \cite{JML}) and C and \CC (for example, by the Larch project \cite{Larch}), in +one way or the other. Instrumenting the source code by requiring pre-conditions and ensuring post-conditions, +allow a formal verification of the implemented algorithm. Further, additional provers such as the KeY-system +investigate this formal correctness of program code.\\ + +Recently, at RISC the algorithm language {\rm RISCAL} \cite{Schreiner} has been developed. {\rm RISCAL} allows to +formulate an algorithm in pseudo-code, and deduce and verify certain properties of this pseudo-code. +To that end, verification conditions, logical formulas, preconditions, postconditions and invariants +are valitated at each step of execution. Still, {\rm RISCAL} is an algorithm formalization rather than +a concrete programming language, and deals with (potentially nondeterministic) inputs. The validity is +verified by instantiating concrete instances, this is, checking on finite bounded domains. It might be +a reasonable assumption, that counter-examples are expected at reasonable small inputs, although there +are plenty of exceptions, and it is possible to construct certain examples where algorithms will fail at +certain sizes of their inputs. + +\section{Mathematical and Number Theoretic Preliminaries} +Classic textbooks on number theory and finite fields are \cite{hardylittlewood,LidlNiederreiter}. Throughout this paper, let $\mathbb{N}$ denote the non-negative integers including zero, i.e. $\mathbb{N} = \{0,1,2,\ldots\}$. Further, let $\mathbb{Z}$ denote the integers $\mathbb{Z} = \{\ldots,-2,-1,0,1,2,\ldots\}$. @@ -100,21 +133,21 @@ such that $c = a\cdot b$. Every number $n\in\mathbb{N}$ has at least two divisors, a \emph{prime number} $p\in\mathbb{N}$ is a number $p$ that has exactly two divisors, $1$ and $p$. Let now be given two numbers $a,b\in\mathbb{N}$. Among all divisors of two numbers $a$,$b$ there is a -unique \emph{greatest} common divisor of $a$ and $b$, denoted by $\gcd(a,b)$, +unique \emph{greatest common divisor} of $a$ and $b$, denoted by $\gcd(a,b)$, that is computed by the \emph{Euclidean algorithm}. The \emph{extended Euclidean algorithm} allows to compute integers $s,t\in\mathbb{Z}$ such that $a\cdot s + b\cdot t = \gcd(a,b)$. -An implementation of the extended Euclidean algorithm in \CC\, is formulated as follows: -{\tiny \lstinputlisting{../src/xGCD.cpp}} Two numbers $a,b$ with $\gcd(a,b)=1$ are sometimes called \emph{relative prime}. In particular, if $p$ is prime and $a\in\mathbb{N}$, there are integers $s,t\in\mathbb{Z}$ such that $$ p\cdot s + a\cdot t = 1 \Longleftrightarrow a\cdot t\equiv 1\pmod p \Longleftrightarrow p|(a\cdot t-1). $$ -The last equivalence introduces the notion of 'multiplicative inverse mod $p$', i.e. -$a\cdot t \equiv 1\pmod p$ means that $t = a^{-1}$ if we consider the integers modulo -a prime number $p$. The integers modulo a prime $p$ form a field, as introduced in the -next subsection. +The last equivalence introduces the notion of \emph{multiplicative inverse mod $p$}, i.e. +$$ +a\cdot t \equiv 1\pmod p \Longleftrightarrow: t = a^{-1} \pmod p +$$ +The integers modulo a prime $p$, with addition and multiplication modulo $p$ form a field, +as we will see in the next subsection. \subsection{Finite Fields} An elementary notion in cryptographic mathematical theories, is the notion @@ -127,17 +160,17 @@ $p$ is prime, is usually written as $\mathbb{Z}_p = \mathbb{Z}/p\mathbb{Z}$, and consists of the elements $\{0,1,\ldots,p-1\}$. The classic operations are carried out modulo $p$, and hence, instead of $a\in\mathbb{Z}$, we consider the residue class, denoted by $[a]_p := \{a+k\cdot p: 0\leq a < p \wedge k\in\mathbb{Z}\}$. -For residue classes, there are two representations available, either +On the other hand, for residue classes, there is a second representations available, $$ -\left\{-\frac{p-1}{2},-\frac{p-3}{2}\ldots,\frac{p-3}{2},\frac{p-1}{2}\right\}\qquad \text{ or }\qquad \{0,\ldots,p-1\} = \mathbb{Z}_p, +\left\{-\frac{p-1}{2},-\frac{p-3}{2}\ldots,\frac{p-3}{2},\frac{p-1}{2}\right\} $$ -which carry equivalent information. We focus on the second representation, i.e. -we demand the result of the operation modulo $p$ to be in $\mathbb{Z}_p = \{0,\ldots,p-1\}$.\\ +which carries equivalent information as $\{0,1,\ldots,p-1\}$. +We focus on the classic representation, i.e. view $\mathbb{Z}_p = \{0,\ldots,p-1\}$.\\ The number $p$ is called the \emph{characteristic} of $\mathbb{Z}_p$. In fact, the characteristic of every finite field is a prime power. Throughout this paper, -we restrict our attention to the set $\mathbb{Z}_p$, i.e. the integers modulo a prime $p$ -(rather than a prime power).\\ +we restrict our attention to the set $\mathbb{Z}_p$, where $p$ is an odd prime, +this is we considers the integers modulo a prime $p$ (rather than a prime power).\\ Let $p$ be an odd prime. In finite sets of the form $\mathbb{Z}_p$, addition, subtraction and multiplication are performed modulo a prime $p$ by $[a]_p$, and we have @@ -153,240 +186,77 @@ $$ Hence, the finite set $\mathbb{Z}_p$ with the operations of addition, subtraction, multiplication and division as defined above, forms a finite field. - - -\subsection{Elementary Number Theory} +\subsection{Elementary Number Theory}\label{sec:math} With the preliminaries introduced in the last subsection, we can already formulate the first non-trivial Theorems in number theory. \begin{thm}[Fundamental Theorem of Arithmetic] Every natural number can be written as product of primes uniquely up to order. \end{thm} - -TODO: Describe algorithms for computing prime decomposition - -\subsubsection*{Euler's totient Function} -A fundamental notion for a natural number $n$ is the set of elements relatively prime -to $n$. The cardinality of this set is referred in literature as \emph{Euler's $\phi$-function} -(sometimes also as \emph{Euler's totient}-function) $\phi(n)$, i.e. -$$ -\phi(n) := |\{a\in\mathbb{N}: 1\leq a\leq n \gcd(a,n) = 1\}|. -$$ -Let us characterize $\phi(n)$. -\begin{lem}\label{lem:eulerfermat} - Let $k,m,n\in\mathbb{N}$, $p,p_1,p_2$ be prime. - \begin{itemize} - \item $\phi(p) = p-1$; - \item $\phi(p^k) = p^{k}-p^{k-1} = p^k \left(1-\frac{1}{p}\right)$; - \item $\phi(m\cdot n) = \phi(m)\cdot\phi(n)\cdot \frac{\gcd(m,n)}{\phi(\gcd(m,n))}$ in particular $\phi(p_1\cdot p_2) = \phi(p_1)\cdot\phi(p_2)$; - \item $\phi(n) = n\cdot \prod_{p|n}\limits\left(1-\frac{1}{p}\right)$ ; - \item $\sum_{d|n}\limits \phi(d) = n$. - \end{itemize} -Moreover if $a, n\in\mathbb{N}$ are relatively prime, then -\begin{itemize} - \item $a^{\phi(n)} \equiv 1 \pmod n$, in particular $a^{p-1}\equiv 1\pmod p$ for all primes $p$. -\end{itemize} -The last point is known as the Euler-Fermat theorem. -\end{lem} - -\subsubsection*{The Legendre/Jacobi Symbol} -In cryptographic applications, the Legendre/Jacobi symbol appears. Let $p$ be a prime. An element $a\in\mathbb{Z}_p$ is -called a \emph{quadratic residue modulo $p$} if and only if $x^2-a$ has a zero in $\mathbb{Z}_p$. Otherwise, $a$ is called -a quadratic nonresidue modulo $p$. -\begin{defn}[Legendre/Jacobi Symbol]~\\ -Let $p$ be a prime. The \emph{Legendre-Symbol} $\left(\frac{a}{p}\right)$ is defined as -$$ -\left(\frac{a}{p}\right) = -\begin{cases} - 0,\qquad & \text{if $p|a$}\\ - 1,\qquad & \text{if $a$ is quadratic residue modulo $p$}\\ - -1,\qquad & \text{if $a$ is quadratic nonresidue modulo $p$} -\end{cases} -$$ -Let $a$ be an integer and $n$ be a positive odd number such that $n = p_1^{\alpha_1}\ldots p_s^{\alpha_s}$. -The \emph{Jacobi-Symbol} $\left(\frac{a}{n}\right)$ is defined as -$$ -\left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right)^{\alpha_1}\ldots \left(\frac{a}{p_s}\right)^{\alpha_s}. -$$ -\end{defn} -\begin{lem} - Let $p,q$ be a primes, $a,b\in\mathbb{Z}_p$. Then, +A constructive proof of the fundamental theorem is obtained, by showing that the +smallest divisor $p_1$ of $n\in\mathbb{N}$ (with both, $n$ and $p_1$, greater than one) +is prime. An algorithm for computing the prime decomposition proceeds recursively considering +$n/p_1$. This steps are performed until $n/(p_1\ldots p_k) = 1$.\\ + +The first algorithm we will be concerned with the (more or less) opposite problem. Given a natural +number $n$, we are interested in answering the question: +\begin{equation}\label{eq:q1} +\text{Describe the set }\, P(n) := \{p\in\mathbb{N}: 2 \leq p\leq n \wedge p \text{ is prime}\}.\tag{{\rm Q1}} +\end{equation} +The cardinality of the set $P$ is in classic textbooks referred by $\pi(n)$. +The famous prime number theorem gives information about the asymptotic growth +$$ +\lim\limits_{n \rightarrow \infty}\pi(n) \sim \frac{n}{\log(n)} +$$ +Given an natural number $n\geq 2$, the answer to question \eqref{eq:q1} is given by the \emph{Sieve of Erathostenes}. +The classic algorithm proceeds by starting with the set $N := \{2,\ldots,n\}$ successively erasing non-prime numbers. +Although this algorithm is not particular performant (compared to tuned versions, such as \emph{Aitken's Sieve}), +it is of theoretical importance.\\ + +Obviously, the largest prime factor of $n$ is less equal $\lfloor \sqrt{n}\rfloor$, where +$$ +\lfloor x\rfloor := \min\{k\in \mathbb{Z}: k\leq x\},\qquad x\in \mathbb{R} +$$ +We initialize $P \leftarrow \emptyset$. The smallest number contained in $N$ is the prime number 2. +In the first iteration, we put $P' \leftarrow P\cup \{2\}$ and exclude from $N$ all even numbers, i.e. all numbers divisible by 2. +The next step is repeated until the least element of the set $N$ exceeds $\lfloor \sqrt{n}\rfloor$. +We put +$$ +P' \leftarrow P\cup \{\min(N)\},\qquad \text{ and }\qquad N' \leftarrow N\backslash \{m:\, \min(N)|m\}. +$$ +Hence, for the formalization we will discuss in section \ref{sec:formal}, we need an algorithm for computing +the integer square-root $\lfloor \sqrt{n}\rfloor$ and perform the described algorithm. In fact, we will discuss +two possible implementations of the Sieve of Erathostenes, and view the difference in the formalization. + +\subsection{(Slightly more) Advanced Number Theory} +Describe the algorithm for discrete logarithm here + +\section{The Formalization} +The first step in formalization is to do the necessary paper-work. We start by analyzing the expected input and output +behavior, and put advance payment into the task of formalizing, to obtain two possible (simple) implementations +of the \emph{Sieve of Erathostenes} algorithm, in order to answer \eqref{eq:q1} properly. Each approach comes with its +own facet of difficulties, but provide additional insight on the nature of the algorithm. + +\subsection{Integer Roots} +As already noted earlier, the largest prime divisor of a non-negative integer $n$ is less than or equal to +$m := \lfloor \sqrt{n}\rfloor$. To compute the value $m$, we solve the so called \emph{integer root problem}. +The integer root problem, is the problem of \begin{itemize} - \item $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod p$; - \item $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$; - \item $\left(\frac{a^2}{p}\right) = 1$; - \item If $a\equiv b\pmod p$ then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$; - \item $\sum_{a=0}^{p-1}\limits\left(\frac{a}{p}\right) = 0$; - \item $\left(\frac{p}{q}\right) = (-1)^{(p-1)(q-1)/4} \left(\frac{q}{p}\right)$; - \item $\left(\frac{-1}{p}\right) = \begin{cases} - 1\, \qquad & p\equiv 1\pmod 4;\\ - -1\, \qquad & p\equiv 3\pmod 4; - \end{cases}$ - \item $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$; - \end{itemize} - In particular, the last point specializes to - \begin{itemize} - \item $\left(\frac{2}{p}\right) = \begin{cases}1\,\qquad & \text{if $p\equiv 1$ or $p\equiv 7\pmod 8$};\\ - -1\,\qquad & \text{if $p\equiv 3$ or $p\equiv 5\pmod 8$}. - \end{cases}$ - + \item \textbf{Given} an non-negative integer $n>2$, + \item \textbf{Find} a non-negative integer $m\in \mathbb{N}$ such that +\begin{equation}\label{eq:sqr} + m^2 \leq n<(m+1)^2. +\end{equation} \end{itemize} - -\end{lem} - - -\subsubsection*{The Discrete Logarithm problem} -Consider an odd prime $p$. Let $b\in\mathbb{Z}_p$ and $0\leq x\leq p-1$. -We consider the equality $a\equiv b^x\pmod p$, and call $x$ the \emph{discrete -logarithm} (or \emph{index}) of $a$ with respect to the basis $b$. It is indicated -by $x={\rm ind}_b(a)$.\\ - -The Diffie-Hellmann key exchange algorithm relies on the assumption, that given $a^x$ and -$a^y$ (without knowing $x$ and $y$), the expression $a^{xy}$ can not be efficiently computed, -i.e. the discrete logarithm problem is not effectively solvable. Effective solvable in this -context means that there is an algorithm which proceeds more effective than the brute-force -method, which proceeds by computing $b^k\pmod p$ until $a\equiv b^x\pmod p$ is reached.\\ - -For computing the discrete logarithm, the main algorithm is the Baby-Step Giant-Step algorithm. -Suppose we are given an odd prime $p$, an element $g\in\mathbb{Z}_p$ and $a = a_0\in\mathbb{Z}$. -The algorithm proceeds by first compiling a table with entries $(i,t_i)$ where $t_i = g^i \pmod p$ -(computed by repeated multiplying modulo $p$). -The table consists of $m := \lceil \sqrt{p-1}\,\rceil$ entries from 0 to $m-1$. -In the next step, one computes $g^{-m}\pmod p$. Initialize an index $i$ with zero. -If $a_i$ happens to coincide with some $t_j$ where $0\leq j\leq m-1$, then the -algorithm terminates with $x=i\cdot m + j$, otherwise it computes $a_{i+1} = a_i g^{-m}\pmod p$. -An implementation of this algorithm in \CC\, might be written as follows: -{\tiny \lstinputlisting{../src/discrete_log.cpp}} - -\subsection{The RSA cryptosystem} -At the RSA cryptosystem, named after its authors Rivest, Shamir and Adleman, the two -protagonists \textbf{A}lice and \textbf{B}ob want to exchange secret messages. To that -end, A generates two primes $p$ and $q$ which are approximately of the same size. -Then, A calculates the product $n=pq$ and $\phi(n) = (p-1)(q-1)$, and proceeds by -choosing $1 0$} -\If{$n$ is odd} -\State $y\leftarrow x\cdot y$; -\EndIf -\State $x\leftarrow x^2$; -\State $n \leftarrow \lfloor n/2\rfloor$; -\EndWhile -\State\Return $y$. -\end{algorithmic} - \caption{Left-to-Right Exponentation} - \label{alg:1} -\end{algorithm} -Let us analyze this in more detail. To gain insight, we might consider some concrete examples. -If we apply Algorithm \ref{alg:1} to a handful of numbers, we realize the following relationship: -Formulated as function ${\rm Exp}(x,n)$, we find that algorithm is equivalent to the recursive description -$$ -{\rm Exp}(x,n) = -\begin{cases} - 1,\qquad & n =0\\ - {\rm Exp}(x^2,n/2),\qquad & n \text{ is even}\\ - x\cdot {\rm Exp}(x^2,(n-1)/2),\qquad & n \text{ is odd}. -\end{cases} -$$ -The binary representation of $d\in\mathbb{N}$ has at most $\lfloor \log_2(d)\rfloor +1$ digits. -Let us consider two extreme cases: If $d$ is a power of 2, say $2^m$, then its binary representation -equals -$$ -2^m = (1\underbrace{0\ldots0}_{m \text{ zeros}})_2, -$$ -and hence consists of exactly $\log_2(d) +1 = m+1$ positions. The If-condition is met only -once, after $m$ divisions by 2, after which the remainder is one.\\ - -The other extreme case is when $d$ is $2^m-1 = (1,\ldots,1)_2$ i.e. the binary representation -consists of exactly $m$ ones. When this is the case, the If-condition \emph{always} evaluates -to true, and hence we gain nothing compared to the classic algorithm.\\ - -For the While-loop, we start with values $(x,y,n)$ and update them to $(x',y',n')$. -We observe that regardless if $n$ is even or odd, that $n'2$} +\Ensure{$m\in\mathbb{N}$ such that $m \leq \sqrt{n} < m+1$ or equivalently $m= \lfloor \sqrt{n}\rfloor$} \State $(x,y,z) \leftarrow (0,1,1)$; -\While{$y \leq a$} +\While{$y \leq n$} \State $x\leftarrow x+ 1$; \State $z\leftarrow z+ 2$; \State $y\leftarrow y+ z$; @@ -394,13 +264,13 @@ We present an algorithm that computes the integer-square-root, and analyze it af \State\Return $x$. \end{algorithmic} \caption{Integer-Square-Root} - \label{alg:2} + \label{alg:1} \end{algorithm} -The first step of Algorithm \ref{alg:2} is the initialization of the result $x$ and two +The first step of Algorithm \ref{alg:1} is the initialization of the result $x$ and two temporary variables $y$ and $z$. As for the While-Loop, we observe that initially \begin{align} -x = x^2 &\leq a\quad &\wedge& \quad y = x^2 + z \quad &\wedge& \quad z = 2x+1\label{eq:inv}\\ -0 &\leq a\quad &\wedge& \quad 1 = 0+1 \quad &\wedge& \quad 1 = 2\cdot 0 +1.\nonumber +x = x^2 &\leq n\quad &\wedge& \quad y = x^2 + z \quad &\wedge& \quad z = 2x+1\label{eq:inv}\\ +0 &\leq n\quad &\wedge& \quad 1 = 0+1 \quad &\wedge& \quad 1 = 2\cdot 0 +1.\nonumber \end{align} We claim, that \eqref{eq:inv} is an invariant that is kept at each step of the loop. This is seen as follows: The tuple $(x,y,z)$ is updated to $(x',y',z')$, where @@ -418,79 +288,49 @@ z' = 2\cdot x' + 1 \Leftrightarrow z+2 = 2\cdot (x+1) + 1 = (2x + 1) + 2 \Leftrightarrow z = 2x + 1. $$ -After finitely many iterations, we have $y > a$ and we terminate the algorithm with -$x^2\leq a$ (as it holds in each step of the loop), and we conclude +After finitely many iterations, we have $y > n$ and we terminate the algorithm with +$x^2\leq n$ (as it holds in each step of the loop), and we conclude $$ -y > a \Leftrightarrow (x^2+z) > a \Leftrightarrow x^2+2x+1 > a \Leftrightarrow (x+1)^2 > a, +y > n \Leftrightarrow (x^2+z) > n \Leftrightarrow x^2+2x+1 > n \Leftrightarrow (x+1)^2 > n, $$ -hence we have shown $x^2\leq a < (x+1)^2$, and taking square roots shows the claim. - -\subsection*{The RISCAL Formalization} -%{\scriptsize \verbatiminput{../integerroot.txt}} -{\scriptsize \verbatimboxed{../integerroot.txt}} - -\subsection{The Baby-Step Giant-Step Algorithm} -At the Diffie-Hellmann Key-Exchange algorithm, we have encounterd the Discrete -Logarithm problem. An algorithm to compute the discrete logarithm is the Baby-Step -Giant-Step algorithm as sketeched before. A possible formalization might be formulated -as follows: -{\scriptsize \verbatimboxed{../discrete_log.txt}} -Some comments on the formalization: -\begin{itemize} - \item The first argument is the characteristic of the finite field, which is a prime. - Below 10, the primes are 2,3,5,7. Obviously, for $p=2$ we have no elements of - order $p-1$; - \item The condition that $g$ has to be of order $p-1$ is a major restriction. - The only elements that satisfy this are given by -\end{itemize} -\begin{center} -\begin{tabular}{|c|c|} -\hline -$p$ & Elements of order $p-1$\\ -\hline -3 & 2 \\ -5 & 2,3\\ -7 & 3,5\\ -\hline -\end{tabular} -\end{center} +hence we have shown $x^2\leq n < (x+1)^2$, and taking square roots shows the claim. + +\subsection{The Sieve of Erathostenes I} +We have described the algorithm for computing the set of prime numbers less equal $n$ in section \ref{sec:math}. +While this description is rather general, we have several different ways to implement it. First, we will show an +implementation that is based on the mathematical category of sets. In mathematics, we learn that the natural numbers +are well-ordered, meaning, that every non-empty (possibly infinite) subset of $\mathbb{N}$ has a least element. +Let us treat this formally. If we consider a non-empty (and lets restrict for the moment to finite sets) set +$N:=\{n_1,\ldots,n_k\}$, its minimal element $h$ is characterized by two properties: \begin{itemize} - \item The element $a$ has obviously to be non-zero in $\mathbb{Z}_p$. + \item $h\in N$; + \item $\forall k\in N: h\leq k$. \end{itemize} -Having this thoughts in mind, we expect that the set of possible inputs might -be shrinked dramatically. Indeed, the tuples with values less than 10, which -are in our scope might be explictly enumerated by -\begin{center} -\begin{tabular}{c} -$(p,g,a)$\\ -\hline -$(3,2,1)$, $(3,2,2)$\\ -$(5,2,1)$,$(5,2,2)$,$(5,2,3)$,$(5,2,4)$\\ -$(5,3,1)$, $(5,3,2)$, $(5,3,3)$, $(5,3,4)$\\ - $(7,3,1)$, $(7,3,2)$, $(7,3,3)$, $(7,3,4)$, $(7,3,5)$, $(7,3,6)$ \\ -$(7,5,1)$, $(7,5,2)$, $(7,5,3)$, $(7,5,4)$, $(7,5,5)$, $(7,5,6)$ -\end{tabular} -\end{center} -which gives a total of 22 elements to check. - -\subsection{Prime Number Generation} -A classic algorithm for generating prime-numbers less equal a given number, is the Sieve of Erathostenes. -Suppose, we want to produce a list of all primes less equal $m$, the algorithm proceeds by enumerating -all integers from 2 to $m$, we set $S^{(0)}:=\{2,\ldots,m\}$. The first prime, is obviously given by 2, i.e. -the minimal element of the set $S^{(0)}$. Of course, any number greater and divisible by 2 is not prime, -hence, we remove the \emph{even} elements from the set $S^{(0)}$ to obtain the set $S^{(1)}$. The next prime -number is the least number of the updated set $S^{(1)}$, i.e. 3. We remove all multiples of 3 to obtain -$S^{(2)}$. Proceedings likewise, up to $\sqrt{m}$, by selecting the minimal element $p$ (which is prime) -of $S^{(k)}$ and updating $S^{(k)}$ to $S^{(k+1)}$, that is, removing all multiples of $p$, we obtain a list -of primes less equal $m$.\\ - -We formalize this algorithm in two ways. While the first formalization treats the numbers 2..$m$ as boolean -array, we can for the second formalization the data structure of sets (that corresponds to sets in the -mathematical sense). Obviously, we should obtain the same results. - -\subsection*{The RISCAL Formalization} -{\scriptsize \verbatimboxed{../sieveEratosthenes.txt}} +While this seems obvious when written, we must not forget, that the computer can not apply the insight that +mathematically inclined people would already provide \emph{per se}, as a formal proof system could not use +intuition.\\ +However, we now present one possible answer to question \eqref{eq:q1}, by considering the following algorithm. +\begin{algorithm}[H] +\begin{algorithmic}[1] +\Require{$n\in\mathbb{N}$ such that $n>2$} +\Ensure{$P(n)$ where $P(n)$ is as in \eqref{eq:q1}} +\State $P \leftarrow \emptyset$; +\State $Q \leftarrow \{2,\ldots,n\}$; +\While{$Q \neq \emptyset$} +\State $P\leftarrow P \cup \{\min(Q)\}$; +\State $Q\leftarrow Q - \{m\in Q: \min(Q)|m\}$; +\EndWhile +\State\Return $P$. +\end{algorithmic} + \caption{Sieve of Erathostenes I} + \label{alg:2} +\end{algorithm} +Algorithm \ref{alg:2} provides several points for discussion. First of all, both sets, $P$ and $Q$ are +at all steps obviously subsets of $\{2,\ldots,n\}$. As this condition holds at each execution of the while +loop, we identify this as a \emph{loop invariant}. + +\subsection{The Sieve of Erathostenes II} \appendix @@ -502,6 +342,14 @@ mathematical sense). Obviously, we should obtain the same results. G.H. Hardy and E.M. Wright. \textit{An Introduction to the Theory of Numbers}. Oxford University Press, 6. edition. 2008. +\bibitem{JML} +\text{Java Modelling Language JML} +\url{http://www.eecs.ucf.edu/~leavens/JML//index.shtml} + +\bibitem{Larch} +\text{The Larch System}. +\url{https://www.cs.cmu.edu/afs/cs/project/larch/www/home.html} + \bibitem{LidlNiederreiter} R. Lidl and H. Niederreiter. \textit{Introduction to finite fields and their applications}. Cambridge University Press, 1986. diff --git a/report/saved.tex b/report/saved.tex new file mode 100644 index 0000000..4a725ee --- /dev/null +++ b/report/saved.tex @@ -0,0 +1,326 @@ + +\subsubsection*{Euler's totient Function} +A fundamental notion for a natural number $n$ is the set of elements relatively prime +to $n$. The cardinality of this set is referred in literature as \emph{Euler's $\phi$-function} +(sometimes also as \emph{Euler's totient}-function) $\phi(n)$, i.e. +$$ +\phi(n) := |\{a\in\mathbb{N}: 1\leq a\leq n \gcd(a,n) = 1\}|. +$$ +Let us characterize $\phi(n)$. +\begin{lem}\label{lem:eulerfermat} + Let $k,m,n\in\mathbb{N}$, $p,p_1,p_2$ be prime. + \begin{itemize} + \item $\phi(p) = p-1$; + \item $\phi(p^k) = p^{k}-p^{k-1} = p^k \left(1-\frac{1}{p}\right)$; + \item $\phi(m\cdot n) = \phi(m)\cdot\phi(n)\cdot \frac{\gcd(m,n)}{\phi(\gcd(m,n))}$ in particular $\phi(p_1\cdot p_2) = \phi(p_1)\cdot\phi(p_2)$; + \item $\phi(n) = n\cdot \prod_{p|n}\limits\left(1-\frac{1}{p}\right)$ ; + \item $\sum_{d|n}\limits \phi(d) = n$. + \end{itemize} +Moreover if $a, n\in\mathbb{N}$ are relatively prime, then +\begin{itemize} + \item $a^{\phi(n)} \equiv 1 \pmod n$, in particular $a^{p-1}\equiv 1\pmod p$ for all primes $p$. +\end{itemize} +The last point is known as the Euler-Fermat theorem. +\end{lem} + +\subsubsection*{The Legendre/Jacobi Symbol} +In cryptographic applications, the Legendre/Jacobi symbol appears. Let $p$ be a prime. An element $a\in\mathbb{Z}_p$ is +called a \emph{quadratic residue modulo $p$} if and only if $x^2-a$ has a zero in $\mathbb{Z}_p$. Otherwise, $a$ is called +a quadratic nonresidue modulo $p$. +\begin{defn}[Legendre/Jacobi Symbol]~\\ +Let $p$ be a prime. The \emph{Legendre-Symbol} $\left(\frac{a}{p}\right)$ is defined as +$$ +\left(\frac{a}{p}\right) = +\begin{cases} + 0,\qquad & \text{if $p|a$}\\ + 1,\qquad & \text{if $a$ is quadratic residue modulo $p$}\\ + -1,\qquad & \text{if $a$ is quadratic nonresidue modulo $p$} +\end{cases} +$$ +Let $a$ be an integer and $n$ be a positive odd number such that $n = p_1^{\alpha_1}\ldots p_s^{\alpha_s}$. +The \emph{Jacobi-Symbol} $\left(\frac{a}{n}\right)$ is defined as +$$ +\left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right)^{\alpha_1}\ldots \left(\frac{a}{p_s}\right)^{\alpha_s}. +$$ +\end{defn} +\begin{lem} + Let $p,q$ be a primes, $a,b\in\mathbb{Z}_p$. Then, +\begin{itemize} + \item $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod p$; + \item $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$; + \item $\left(\frac{a^2}{p}\right) = 1$; + \item If $a\equiv b\pmod p$ then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$; + \item $\sum_{a=0}^{p-1}\limits\left(\frac{a}{p}\right) = 0$; + \item $\left(\frac{p}{q}\right) = (-1)^{(p-1)(q-1)/4} \left(\frac{q}{p}\right)$; + \item $\left(\frac{-1}{p}\right) = \begin{cases} + 1\, \qquad & p\equiv 1\pmod 4;\\ + -1\, \qquad & p\equiv 3\pmod 4; + \end{cases}$ + \item $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$; + \end{itemize} + In particular, the last point specializes to + \begin{itemize} + \item $\left(\frac{2}{p}\right) = \begin{cases}1\,\qquad & \text{if $p\equiv 1$ or $p\equiv 7\pmod 8$};\\ + -1\,\qquad & \text{if $p\equiv 3$ or $p\equiv 5\pmod 8$}. + \end{cases}$ + +\end{itemize} + +\end{lem} + + +\subsubsection*{The Discrete Logarithm problem} +Consider an odd prime $p$. Let $b\in\mathbb{Z}_p$ and $0\leq x\leq p-1$. +We consider the equality $a\equiv b^x\pmod p$, and call $x$ the \emph{discrete +logarithm} (or \emph{index}) of $a$ with respect to the basis $b$. It is indicated +by $x={\rm ind}_b(a)$.\\ + +The Diffie-Hellmann key exchange algorithm relies on the assumption, that given $a^x$ and +$a^y$ (without knowing $x$ and $y$), the expression $a^{xy}$ can not be efficiently computed, +i.e. the discrete logarithm problem is not effectively solvable. Effective solvable in this +context means that there is an algorithm which proceeds more effective than the brute-force +method, which proceeds by computing $b^k\pmod p$ until $a\equiv b^x\pmod p$ is reached.\\ + +For computing the discrete logarithm, the main algorithm is the Baby-Step Giant-Step algorithm. +Suppose we are given an odd prime $p$, an element $g\in\mathbb{Z}_p$ and $a = a_0\in\mathbb{Z}$. +The algorithm proceeds by first compiling a table with entries $(i,t_i)$ where $t_i = g^i \pmod p$ +(computed by repeated multiplying modulo $p$). +The table consists of $m := \lceil \sqrt{p-1}\,\rceil$ entries from 0 to $m-1$. +In the next step, one computes $g^{-m}\pmod p$. Initialize an index $i$ with zero. +If $a_i$ happens to coincide with some $t_j$ where $0\leq j\leq m-1$, then the +algorithm terminates with $x=i\cdot m + j$, otherwise it computes $a_{i+1} = a_i g^{-m}\pmod p$. +An implementation of this algorithm in \CC\, might be written as follows: +{\tiny \lstinputlisting{../src/discrete_log.cpp}} + +\subsection{The RSA cryptosystem} +At the RSA cryptosystem, named after its authors Rivest, Shamir and Adleman, the two +protagonists \textbf{A}lice and \textbf{B}ob want to exchange secret messages. To that +end, A generates two primes $p$ and $q$ which are approximately of the same size. +Then, A calculates the product $n=pq$ and $\phi(n) = (p-1)(q-1)$, and proceeds by +choosing $1 0$} +\If{$n$ is odd} +\State $y\leftarrow x\cdot y$; +\EndIf +\State $x\leftarrow x^2$; +\State $n \leftarrow \lfloor n/2\rfloor$; +\EndWhile +\State\Return $y$. +\end{algorithmic} + \caption{Left-to-Right Exponentation} + \label{alg:1} +\end{algorithm} +Let us analyze this in more detail. To gain insight, we might consider some concrete examples. +If we apply Algorithm \ref{alg:1} to a handful of numbers, we realize the following relationship: +Formulated as function ${\rm Exp}(x,n)$, we find that algorithm is equivalent to the recursive description +$$ +{\rm Exp}(x,n) = +\begin{cases} + 1,\qquad & n =0\\ + {\rm Exp}(x^2,n/2),\qquad & n \text{ is even}\\ + x\cdot {\rm Exp}(x^2,(n-1)/2),\qquad & n \text{ is odd}. +\end{cases} +$$ +The binary representation of $d\in\mathbb{N}$ has at most $\lfloor \log_2(d)\rfloor +1$ digits. +Let us consider two extreme cases: If $d$ is a power of 2, say $2^m$, then its binary representation +equals +$$ +2^m = (1\underbrace{0\ldots0}_{m \text{ zeros}})_2, +$$ +and hence consists of exactly $\log_2(d) +1 = m+1$ positions. The If-condition is met only +once, after $m$ divisions by 2, after which the remainder is one.\\ + +The other extreme case is when $d$ is $2^m-1 = (1,\ldots,1)_2$ i.e. the binary representation +consists of exactly $m$ ones. When this is the case, the If-condition \emph{always} evaluates +to true, and hence we gain nothing compared to the classic algorithm.\\ + +For the While-loop, we start with values $(x,y,n)$ and update them to $(x',y',n')$. +We observe that regardless if $n$ is even or odd, that $n' a$ and we terminate the algorithm with +$x^2\leq a$ (as it holds in each step of the loop), and we conclude +$$ +y > a \Leftrightarrow (x^2+z) > a \Leftrightarrow x^2+2x+1 > a \Leftrightarrow (x+1)^2 > a, +$$ +hence we have shown $x^2\leq a < (x+1)^2$, and taking square roots shows the claim. + +\subsection*{The RISCAL Formalization} +%{\scriptsize \verbatiminput{../integerroot.txt}} +{\scriptsize \verbatimboxed{../integerroot.txt}} + +\subsection{The Baby-Step Giant-Step Algorithm} +At the Diffie-Hellmann Key-Exchange algorithm, we have encounterd the Discrete +Logarithm problem. An algorithm to compute the discrete logarithm is the Baby-Step +Giant-Step algorithm as sketeched before. A possible formalization might be formulated +as follows: +{\scriptsize \verbatimboxed{../discrete_log.txt}} +Some comments on the formalization: +\begin{itemize} + \item The first argument is the characteristic of the finite field, which is a prime. + Below 10, the primes are 2,3,5,7. Obviously, for $p=2$ we have no elements of + order $p-1$; + \item The condition that $g$ has to be of order $p-1$ is a major restriction. + The only elements that satisfy this are given by +\end{itemize} +\begin{center} +\begin{tabular}{|c|c|} +\hline +$p$ & Elements of order $p-1$\\ +\hline +3 & 2 \\ +5 & 2,3\\ +7 & 3,5\\ +\hline +\end{tabular} +\end{center} +\begin{itemize} + \item The element $a$ has obviously to be non-zero in $\mathbb{Z}_p$. +\end{itemize} +Having this thoughts in mind, we expect that the set of possible inputs might +be shrinked dramatically. Indeed, the tuples with values less than 10, which +are in our scope might be explictly enumerated by +\begin{center} +\begin{tabular}{c} +$(p,g,a)$\\ +\hline +$(3,2,1)$, $(3,2,2)$\\ +$(5,2,1)$,$(5,2,2)$,$(5,2,3)$,$(5,2,4)$\\ +$(5,3,1)$, $(5,3,2)$, $(5,3,3)$, $(5,3,4)$\\ + $(7,3,1)$, $(7,3,2)$, $(7,3,3)$, $(7,3,4)$, $(7,3,5)$, $(7,3,6)$ \\ +$(7,5,1)$, $(7,5,2)$, $(7,5,3)$, $(7,5,4)$, $(7,5,5)$, $(7,5,6)$ +\end{tabular} +\end{center} +which gives a total of 22 elements to check. + +\subsection{Prime Number Generation} +A classic algorithm for generating prime-numbers less equal a given number, is the Sieve of Erathostenes. +Suppose, we want to produce a list of all primes less equal $m$, the algorithm proceeds by enumerating +all integers from 2 to $m$, we set $S^{(0)}:=\{2,\ldots,m\}$. The first prime, is obviously given by 2, i.e. +the minimal element of the set $S^{(0)}$. Of course, any number greater and divisible by 2 is not prime, +hence, we remove the \emph{even} elements from the set $S^{(0)}$ to obtain the set $S^{(1)}$. The next prime +number is the least number of the updated set $S^{(1)}$, i.e. 3. We remove all multiples of 3 to obtain +$S^{(2)}$. Proceedings likewise, up to $\sqrt{m}$, by selecting the minimal element $p$ (which is prime) +of $S^{(k)}$ and updating $S^{(k)}$ to $S^{(k+1)}$, that is, removing all multiples of $p$, we obtain a list +of primes less equal $m$.\\ + +We formalize this algorithm in two ways. While the first formalization treats the numbers 2..$m$ as boolean +array, we can for the second formalization the data structure of sets (that corresponds to sets in the +mathematical sense). Obviously, we should obtain the same results. + +\subsection*{The RISCAL Formalization} +{\scriptsize \verbatimboxed{../sieveEratosthenes.txt}}