“Finite Field Arithmetic.” Chapter 16A: The MillerRabin Test.
This article is part of a series of handson tutorials introducing FFA, or the Finite Field Arithmetic library. FFA differs from the typical "Open Sores" abomination, in that  rather than trusting the author blindly with their lives  prospective users are expected to read and fully understand every single line. In exactly the same manner that you would understand and pack your own parachute. The reader will assemble and test a working FFA with his own hands, and at the same time grasp the purpose of each moving part therein.
 Chapter 1: Genesis.
 Chapter 2: Logical and Bitwise Operations.
 Chapter 3: Shifts.
 Chapter 4: Interlude: FFACalc.
 Chapter 5: "Egyptological" Multiplication and Division.
 Chapter 6: "Geological" RSA.
 Chapter 7: "Turbo Egyptians."
 Chapter 8: Interlude: Randomism.
 Chapter 9: "Exodus from Egypt" with Comba's Algorithm.
 Chapter 10: Introducing Karatsuba's Multiplication.
 Chapter 11: Tuning and Unified API.
 Chapter 12A: Karatsuba Redux. (Part 1 of 2)
 Chapter 12B: Karatsuba Redux. (Part 2 of 2)
 Chapter 13: "WidthMeasure" and "Quiet Shifts."
 Chapter 14A: Barrett's Modular Reduction. (Part 1 of 2)
 Chapter 14ABis: Barrett's Modular Reduction. (Physical Bounds Proof.)
 Chapter 14B: Barrett's Modular Reduction. (Part 2 of 2.)
 Chapter 15: Greatest Common Divisor.
 Chapter 16A: The MillerRabin Test.
You will need:
 A Keccakbased VTron (for this and all subsequent chapters.)
 All of the materials from Chapters 1  15.
 ffa_ch16_miller_rabin.kv.vpatch
 ffa_ch16_miller_rabin.kv.vpatch.asciilifeform.sig
Add the above vpatches and seals to your Vset, and press to ffa_ch16_miller_rabin.kv.vpatch.
You should end up with the same directory structure as previously.
As of Chapter 16, the versions of FFACalc and FFA are 253 and 253, respectively.
Now compile ffacalc:
cd ffacalc gprbuild
But do not run it quite yet.
First, the mail bag!
Reader bvt has given me to know that he has read and signed Chapters 1  6:
 ffa_ch1_genesis.kv.vpatch.bvt.sig
 ffa_ch2_logicals.kv.vpatch.bvt.sig
 ffa_ch3_shifts.kv.vpatch.bvt.sig
 ffa_ch4_ffacalc.kv.vpatch.bvt.sig
 ffa_ch5_egypt.kv.vpatch.bvt.sig
 ffa_ch6_simplest_rsa.kv.vpatch.bvt.sig
He also published a report of his interesting experiment with a variant of the Karatsuba multiplication algorithm.
Thank you, reader bvt!
Now, let's eat the meat of this Chapter.
We will add a very useful feature to FFA and FFACalc, a constantspacetime MillerRabin Monte Carlo Primality Test:
Op  Description  # Ins  # Outs  Notes 

P  Perform a single shot of the MillerRabin Monte Carlo Primality Test on N, the second item on the stack, using the top item as the Witness parameter for the test. Places 1 on the stack if N was found to be composite; or 0, if N was not found to be composite.  2  1  If the supplied Witness does not satisfy the inequality 2 ≤ Witness ≤ N  2 , it will be mapped via modular arithmetic to a value which satisfies it. N ∈ {0, 1} will be pronounced composite under any Witness; N ∈ {2, 3} will be judged not composite under any Witness. Any N which was found to be composite under any particular Witness, is in fact composite. The converse, is however, not true; see Ch. 16 discussion. 
The proof of the MillerRabin method will be given in the next chapter, 16B.
Presently, the reader is invited to satisfy himself that the given program conforms to the stated variant of MillerRabin, and executes it in constantspacetime:
Algorithm 1: MillerRabin Monte Carlo Primality Test.
For an odd integer N ≥ 5 and a positive integer Witness where 2 ≤ W ≤ N  2:
 Find the unique values R ≥ 1 and odd K such that 2^{R} × K = N  1.
 T := W^{K} mod N.

If T ∈ {1, N  1}:
Return PossiblyPrime.  Iterate R  1 times:
 T := T^{2} mod N

If T = N  1:
Return PossiblyPrime.  Return Composite.
The astute reader will immediately observe that Algorithm 1 is not suitable for use in FFA, as it does not execute in constant time, and does not correctly handle all possible values of N: N < 5 and even N do not meet the stated constraints.
Additionally, given as most practical uses of the MillerRabin method involve a sequence of invocations with a random Witness parameter, it is necessary to force a potentially outofrange Witness into the expected range, with a minimal loss of entropy.
So, let's generalize the algorithm:
Algorithm 2: Generalized MillerRabin Monte Carlo Primality Test.
For any integers N and W:

ProbPrime := 0.

Degen_{Comp} := {N ∈ {0, 1, even and ≠ 2} → 1, else 0}.

Degen_{Prime} := {N ∈ {2, 3} → 1, else 0}.

BW := W mod (N  1).

If BW < 2:
BW := 2;
 Find the unique values R ≥ 1 and odd K such that 2^{R} × K = N  1.
 T := BW^{K} mod N.

If T ∈ {1, N  1}:
ProbPrime := 1.  Iterate R  1 times:
 T := T^{2} mod N

If T = N  1:
ProbPrime := 1. 
If Degen_{Comp} = 1:
Return Composite. 
If Degen_{Prime} = 1 or ProbPrime = 1:
Return PossiblyPrime.  Return Composite.
Still not a constanttime algorithm; but the sharp reader will be able to identify the missing ingredients.
Now let's rewrite it into a form suitable for use in FFA:
Algorithm 3: ConstantTime Generalized MillerRabin Monte Carlo Primality Test.
For any integers N and W:

Degen_{Comp} := (N = 0) OR (N = 1) OR ((N ≠ 2) AND (N mod 2 = 0))

Degen_{Prime} := (N = 2) OR (N = 3)

ProbPrime := Degen_{Prime}.

BW := W mod (N  1).

BW := Mux( 2 ← (BW < 2) , BW ← (BW ≥ 2) )
 R := Count_Bottom_Zeros(N).
 K := N >> R.
 T := BW^{K} mod N.
 ProbPrime := ProbPrime OR (T = 1) OR (T = N  1)
 I := 1
 Iterate FZ_Bitness(N)  1 times:
 T := T^{2} mod N
 ProbPrime := ProbPrime OR ((I < R) AND (T = N  1))
 I := I + 1
 Return Degen_{Comp} OR (1  ProbPrime).
 Returned value 0 corresponds to PossiblyPrime; 1 to Composite.
This variant will carry out the MillerRabin test on any possible candidate integer of a given FFA Bitness, without leaking the integer or the Witness via timing sidechannel, and the Witness will be forced into the valid range. To my knowledge, this is the first publication of an algorithm which has these characteristics.
Without further delay, let's implement Algorithm 3 as a FFA subroutine:
package body FZ_Prime is  Find the highest power of 2 which divides N. ( or 0 if N is zero. ) function FZ_Count_Bottom_Zeros(N : in FZ) return FZBit_Index is  The result (default : 0, will remain 0 if N is in fact zero) Index : Word := 0;  The currentlyexamined Word, starting from the highest Ni : Word;  The most recentlyseen nonzero Word, if indeed any exist W : Word := 0;  1 if currentlyexamined Word is zero, otherwise 0 NiZ : WBool; begin  Find lowest nonzero Word (or simply stop at lowest, if N = 0) for i in reverse 0 .. Indices(N'Length  1) loop Ni := N(N'First + i);  Ni := current Word; NiZ := W_ZeroP(Ni);  ... is this Word = 0? Index := W_Mux(Word(i), Index, NiZ);  If NO, save its Index, W := W_Mux(Ni, W, NiZ);  ... and save the Word. end loop;  Set Index to be the bitposition of the lowest nonzero Word: Index := Shift_Left(Index, BitnessLog2);  Index := Index * Bitness  Handle degenerate case: if W is equal to 0, Index is not changed;  Otherwise, start by advancing Index by an entire Word Bitness: Index := Index + ((0  W_NZeroP(W)) and Word(Bitness));  Now crank back the Index by the number of high bits of W (i.e.  starting from its top) that must be discarded for W to become zero: for b in 1 .. Bitness loop  If W is nonzero at this time, decrement the Index: Index := Index  W_NZeroP(W);  Advance W left, i.e. discard the top bit of it: W := Shift_Left(W, 1); end loop;  If N = 0, result will be 0; otherwise: length of bottom zeros in N. return FZBit_Index(Index); end FZ_Count_Bottom_Zeros;  ConstantTime MillerRabin Test on N using the given Witness.  Witness will be used asis if it conforms to the valid bounds,  i.e. 2 < = Witness <= N  2; otherwise will be transformed into a  valid Witness via modular arithmetic.  Outputs ONE if N WAS FOUND composite; ZERO if NOT FOUND.  Handles degenerate cases of N that MR per se cannot eat:  N=0, N=1: ALWAYS 'FOUND COMPOSITE'; 2, 3  ALWAYS 'NOT FOUND'.  If N is Even and not equal to 2, N is ALWAYS 'FOUND COMPOSITE.'  For ALL other N, the output is equal to that of the MR test. function FZ_MR_Composite_On_Witness(N : in FZ; Witness : in FZ) return WBool is  Widths of N, Witness, and Result are equal : subtype Width is Word_Index range N'Range;  Whether N is 'small' (i.e. 1 Word) and therefore possibly degenerate : N_Is_Small : constant WBool := FZ_OneWordP(N);  Head of N, for detecting (and handling) the degenerateN case : N_Head : constant Word := FZ_Get_Head(N);  Zero and One are defined as COMPOSITE degenerate cases of N : N_Is_Zero_Or_One : constant WBool := N_Is_Small and (W_EqP(N_Head, 0) or W_EqP(N_Head, 1));  Two and Three are PRIME degenerate cases of N : N_Is_Two : constant WBool := N_Is_Small and W_EqP(N_Head, 2); N_Is_Three : constant WBool := N_Is_Small and W_EqP(N_Head, 3);  The COMPOSITE degenerate case of N where N != 2 and N is Even : N_Ne_2_Is_Even : constant WBool := (1  N_Is_Two) and (1  W_OddP(N_Head));  Degeneracy latch. If N is Zero or One, then set immediately : Degen_Composite : constant WBool := N_Is_Zero_Or_One or N_Ne_2_Is_Even;  Possibleprimality latch. If N is Two or Three, then set immediately : Possibly_Prime : WBool := N_Is_Two or N_Is_Three;  The writable copy of N that we will operate on : X : FZ(Width) := N;  Potentiallyreplaced head of X (if degenerate N) : X_Head : Word := N_Head;  Will be Barrettoid(X), for operations modulo X : XBar : Barretoid(ZXMLength => X'Length + 1, BarretoidLength => 2 * X'Length);  The Bound Witness, constrained to valid range 2 < = BW <= X  2 : BW : FZ(Width);  Head of BW, for detecting crossing of the lower bound : BW_Head : Word;  X  1 (for MR proper, and to constrain BW) : X_Minus_One : FZ(Width);  X  1 = 2^R * K, with K odd : K : FZ(Width); R : FZBit_Index;  Working register for all MR modular operations : T : FZ(Width);  For subtraction where no underflow can happen, barring cosmic ray: NoCarry : WZeroOrDie := 0; begin  First: X, which will remain equal to N unless N is degenerate:  If N is one of the two prohibited small primes (2,3) X will become 5: X_Head := W_Mux(A => X_Head, B => 5, Sel => Possibly_Prime);  If one of the two prohibited small composites (0,1), or even, then 9: X_Head := W_Mux(A => X_Head, B => 9, Sel => Degen_Composite);  Given as we're stuck carrying out MR even if N is degenerate case,  then let the MR do The Right Thing, for added cosmic ray resistance.  X gets a new head, if N was a degenerate case; else keeps old head: FZ_Set_Head(X, X_Head);  Compute X  1. As now X > 3, underflow is not permitted: FZ_Sub_W(X => X, W => 1, Difference => X_Minus_One, Underflow => NoCarry);  Now, compute BW (Bound Witness), which satisfies 2 < = BW <= X  2:  First, BW := Witness mod (X  1). After this, 0 <= BW <= X  2: FZ_Mod(Dividend => Witness, Divisor => X_Minus_One, Remainder => BW);  Now, adjust BW if it is found to be below Two (the lower bound) :  Get head of BW: BW_Head := FZ_Get_Head(BW);  If BW is equal to Zero or One, it will be forced to equal Two: BW_Head := W_Mux(A => BW_Head, B => 2, Sel => FZ_OneWordP(BW) and (W_EqP(BW_Head, 0) or W_EqP(BW_Head, 1)));  BW gets the new head, if it must; otherwise keeps its old head: FZ_Set_Head(BW, BW_Head);  We finished adjusting X and BW for degeneracies, and can now MR:  Generate Barrettoid(X) to use in all of the moduloX operations: FZ_Make_Barrettoid(Modulus => X, Result => XBar);  Find R >= 1, and odd K, where X − 1 = 2^R * K :  ... first, find R, the largest power of two which divides X  1 : R := FZ_Count_Bottom_Zeros(X_Minus_One);  ... and now obtain K := X / 2^R, i.e., K := X >> R : FZ_Quiet_ShiftRight(N => X_Minus_One, ShiftedN => K, Count => R);  T := BW^K mod X FZ_Mod_Exp_Barrett(Base => BW, Exponent => K, Bar => XBar, Result => T);  if T = 1 or T = X  1, then X is possiblyprime: Possibly_Prime := Possibly_Prime or FZ_EqP_W(T, 1) or FZ_EqP(T, X_Minus_One);  Needs R  1 times, but perform for max possible count (gated): for i in 1 .. FZ_Bitness(N)  1 loop  T := T^2 mod X FZ_Mod_Sqr_Barrett(X => T, Bar => XBar, Product => T);  if T = X  1, and i < R, then X is possiblyprime: Possibly_Prime := Possibly_Prime or (FZ_EqP(T, X_Minus_One) and W_LtP(Word(i), Word(R))); end loop;  The output, which will be 1 iff X WAS FOUND to be composite via BW,  ... or if X was found to equal Zero or One (without regard to BW.) return (1  Possibly_Prime) or Degen_Composite;  If X was found to equal Two or Three, output will be 0 (under any BW). end FZ_MR_Composite_On_Witness; end FZ_Prime;
Observe that degenerate values of N result in a reassignment, such that the MR procedure (which is carried out regardless of degeneracy, for constanttime operation) is forced to arrive at a correct output, acting as "anticosmicray" backup to the degeneracy latches.
Now, let's run the program, and get an idea of how it works. Let's create a FFA Tape which executes MR, via random witnesses, on the first 12 Mersenne Primes:
.2 .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .3 .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .5 .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .7 .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .d .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .11 .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .13 .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .1f .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .3d .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .59 .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .6b .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z .7f .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z
Now, run the tape as follows:
$ cat 12_mersenne.tape  ./bin/ffa_calc 256 3
... and if successful, the output will be:
MR(2^0000000000000000000000000000000000000000000000000000000000000002  1)=Prime MR(2^0000000000000000000000000000000000000000000000000000000000000003  1)=Prime MR(2^0000000000000000000000000000000000000000000000000000000000000005  1)=Prime MR(2^0000000000000000000000000000000000000000000000000000000000000007  1)=Prime MR(2^000000000000000000000000000000000000000000000000000000000000000D  1)=Prime MR(2^0000000000000000000000000000000000000000000000000000000000000011  1)=Prime MR(2^0000000000000000000000000000000000000000000000000000000000000013  1)=Prime MR(2^000000000000000000000000000000000000000000000000000000000000001F  1)=Prime MR(2^000000000000000000000000000000000000000000000000000000000000003D  1)=Prime MR(2^0000000000000000000000000000000000000000000000000000000000000059  1)=Prime MR(2^000000000000000000000000000000000000000000000000000000000000006B  1)=Prime MR(2^000000000000000000000000000000000000000000000000000000000000007F  1)=Prime
MR is, of course, not the optimal means of confirming the primality of a suspected Mersenne prime. But it does give us a basic test: regardless of the output of your RNG, you will always see the above output.
Let's now try a large number known to be prime, the 18^{th} Mersenne prime, 2^{3217}  1:
.c91 .1`LS.1?P[MR(2^]`#[ 1)=]{[Composite ]}{[Prime ]}Z
Run this tape as follows:
$ cat 18th_mersenne.tape  ./bin/ffa_calc 4096 3
... and the output will always be:
MR(2^ 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 000000000000000000000000000000000000000000000000000000000000000000000000000000 0000000C91  1)=Prime
Now, let's test the MR mechanism on more cryptographicallytypical inputs. Let's take one of the braindamaged RSA moduli from the Phuctor zoo (the stars of the SecLab incident), and a Witness of 3:
.c08b0693f9ae0854829cd88d6538756df69ff8067d1556678f7e45b17437014374174c4 aca94bf1f83640928832b398f88c935c6a08177c4cbaa8b85002fee95068bd68487f286f e3b814d92d6147b3d90fba606701f72e1f205c3e06dba55f5e180e45c2225a6ca2061d2d 638ef42609c5d8225620107519628b35983e92e0930ff2e2b8a3a0d9da57a4f50aaefe21 c0b02f8a91587f3ea2337df593f2faea40cb0d6359fee2df45b14b4e8f20988c542b81c7 862f74ea3a3761c22f6ecef64efb2014ccdcf13fb251ed3160ee20f392d0a2200db105c4 5bc12badbaa53a00a1371a77e12de455824c10dafd87f9c150f1e3fb622a8bb68134764a 77a939371bbe63ede53591d1b2bf35ff2f15776a2e1670c8c0006973782c52e97ded5ad1 e4cc96cc4bffd73061e14059aa40dbbc89d46ea1e20500a0e5ac7ac374c277e8d745dc45 449505d1c1bafecd9df8aa75096fffe4cd2f164e2a12d35000782dd73a5b58f8064ea4c0 afe2066f31d336fe65c50a9dff8e3db8a379b182e6d440cb8903fad5ed8477bde7ac2c13 1a7cd47d94630e92f98f68b86d6288607d1ef03880ca18f4176caf08869df93e93433a08 20af7e82e5eed7fd39a2480d98c34f5862dd7ceb4f8382a84acad97d1ee8da685d2e4aa5 f26167a3385f0a3412e168162916dd7ec1a864431f649e610d0ed2593d1be2abda31bb48 a66214a3f8e0ba011 .3 P #
$ cat seclab_1.tape  ./bin/ffa_calc 4096 3
... and the output will be 1 (Composite), given as 3 is a MR witness for that integer (which, like all other RSA moduli, is in fact composite.)
Let's now execute MR on the shared factor of the two Seclab moduli, via a random witness:
.f59cc31339d001d37570dc0ccd986f3f5ea737fa9185c15dbc17e6bfef29435c79a7c22 e8616738947cab8711b6a6e7b5704e5283b57892adad3b170c726f34d3a9859b1504e005 ee4b69d4803cd56773c50ab01d6546ce66dcdb2be4a34e15160d8e0eb69184b699246b42 28f6f25bfcc91970fa99ea3123409f6865b161423581a5f9522ef774f09818bfef6c2b1c 51d06218a07dc717ec94bb231b062936bfd8794cb39bdf8dc05cd2c8bd74b1d0acb14d39 bc293deb45fa52de89af30e4bc5688fb8116be7e7ad4332810c04903939ee2a356ea254b 83fb811c76898672d24997d8647f969a8e02aa2f2016cb1e0c8a9afe99760cd37bf2794d 4ea58951f ? P #
$ cat seclab_gcd.tape  ./bin/ffa_calc 2048 3
Do this as many times as you care to (in the next Chapter, we will discuss exactly what is gained from repeatedly invoking MR on randomlygenerated Witnesses) and the output will remain 0 (i.e. ProbablyPrime. I personally gave it 40 shots, and have not found a witness of compositivity for it. At this point I would bet money that it is in fact prime  but there is no way to settle the bet, as AFAIK there does not presently exist a deterministic primality test that will operate on numbers of this length reasonably quickly.)
Now, let's exhibit the reason why our MR test is formulated in such a way that the operator is free to specify an arbitrary valid MR Witness, rather than the nonsense seen in heathen arithmetrons, where the witness is either drawn from a set of fixed values, or taken directly from RNG.
The reason for this is so that the operator is able to confirm, at arbitrary times and with arbitrary values of his choosing, that the MR mechanism actually performs MR and not something else.
Let's perform MR on the smallest composite integer where 2, 3, 5, 7 are "Liars", i.e. do not function as MR Witnesses: 3215031751 (equal to 151 × 751 × 28351) :
.bfa17dc7 .2 P # .bfa17dc7 .3 P # .bfa17dc7 .5 P # .bfa17dc7 .7 P #
$ cat liars.tape  ./bin/ffa_calc 256 3
Produces:
0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000 0000000000000000000000000000000000000000000000000000000000000000
However we know that 11 is a MR Witness for 3215031751:
.bfa17dc7 .B P #
0000000000000000000000000000000000000000000000000000000000000001
We thereby confirm that the given program in fact behaves as MR, for the given audit input. You will want to generate your own audit tape if using FFA in anger, for this as well as other commands.
In Chapter 16B, we will demonstrate a proof that the supplied algorithm is in fact a Monte Carlo Primality Test, and discuss the statistical facts governing its proper use. We will also discuss the CPU cost of the algorithm, as a function of FFA Bitness.
In Chapter 17, we will... but let's not spoil it!
~To be continued!~
It seems counterintuitive to me that an operator called P checks for composites. Is this related to the way it is used in later chapters?
Dear semz,
MR is really a compositivity test, and the semantics reflect this.
Yours,
S