Numeric Algorithms

Basics

A numeric algorithm does some computation given one or more numeric values. When measuring complexity we almost always care about the bit complexity: the complexity is relative to the number of bits in the numeric inputs, not to the values of the integers themselves.

Note the base of the numerals does not matter when computing asymptotic complexity. There is always a linear relationship between the number of digits required to represent a number in base a and base b.

Exercise: Let bits(n) be the number of bits required to represent the number n in base 2, and digits(n) be the number of digits required to represent in n in base 10. Prove that bits ∈ Θ(digits).
Exercise: Why does this mean, technically, that we can disregard the base when giving asymptotic complexity?

Wikipedia has a nice collection of links to articles on numeric algorithms.

Addition

The addition algorithm you already know works in any base ≥ 2, because the sum of any three digits is at most a two-digit number.

Example:

The algorithm goes from right to left, propagating carries. In binary:

     101011010
     100010111
    ----------
    1001110001

In decimal:

     939972
     654108
    -------
    1594080

The complexity is Θ(n). We can't do better because we have to touch each bit anyway.

Subtraction is similar; you can do it in linear time.

Exercise: Give an algorithm for subtraction.

Multiplication and Division

You could multiply by repeated addition...

    39053 × 46987 = 39053 + 39053 + 39053 + ... + 39053
...but uggggh! That was 46986 additions for 5-digit integers!
Exercise: Prove that this algorithm has exponental time complexity.

You probably learned this algorithm before (shown here in base 2):

        1011
      x 1101
        ----
        1011
       0000
      1011
     1011
    --------
    10001111

The complexity of that algorithm in Θ(n2), because multiplying an n-bit number by a one-bit number can be done in linear time, and we basically do that n times. Then we do n+1 additions of 2n bit numbers: also quadratic.

Did you notice that all we did was just shifting and adding? And that shifting is really multiplying by 2? Look at the above example: it is saying 11×13 = 11 + 44 + 88 = 143. Some people actually carry out this binary-ish algorithm in decimal numbers like this:

     11    13
     22     6
     44     3
     88     1
    ---
    143
Exercise: Try multiplying 4823 × 1089 this way.

Let's generalize this to a real algorithm. In pseudocode, assuming y >= 0:

def times(x, y)
  result = 0
  while y > 0
    result += x if y.odd
    y >>= 1
    x <<= 1
  end
  return result
end

There is also a quadratic time division algorithm. As long as y >= 1, this returns the quotient and the remainder:

def divide(x, y)
  return (0, 0) if x == 0
  (q, r) = divide(x >> 1, y)
  (q, r) = (q << 1, r << 1)
  r++ if x.odd
  (q, r) = (q + 1, r - y) if r >= y
  return (q, r)
end

Exponentiation

You can do exponentiation by repeated multiplication:

    31400 = 31 × 31 × 31 × 31 × ... × 31

Do we really have to do 399 multiplications? No!

    31400 = 3116 × 31128 × 31256

Why does that matter? Because:

a = 1      //              a == 1
p = 31     // p == 31^1
p *= p     // p == 31^2
p *= p     // p == 31^4
p *= p     // p == 31^8
p *= p     // p == 31^16
a *= p     //              a == 31^16
p *= p     // p == 31^32
p *= p     // p == 31^64
p *= p     // p == 31^128
a *= p     //              a == 31^144
p *= p     // p == 31^256
a *= p     //              a == 31^400

Looks like the number of multiplications is linear in the number of bits of the exponent, but what's the complexity of the whole thing? You still have to compute (that is, at least write out), the whole answer, and how many bits are in that? First let's find out what the answer is. Ruby says:

>> 31**400
=> 35049153521902777114708986131739468304466934398225961899541896904817914996246
23354936652173201312485109671587265639812785215391138217944131481131092898719746
85894570874283007085730451124921973944672872304632607933438457819117259406119399
66676143051741472986147114119144281388547152293994389062523413380947354017864265
23926432321776970921620521452184049782817872783269059536200968913247547651336539
79081932865951491386227358704384254668823602537295120143399193347260309636777807
86780733041458754894712578986705032469879343454522360163745820435322202794898190
2971920726384470581625600419245578432001

Okay now lets estimate the number of bits:

    31400 ≅ (24.95)400 ≅ 21980

Practically a 2000-bit number (more or less). Hey, that's pretty big, considering 31 is a 5-bit number, and 400 is a 9-bit number!

Exercise: How many bits are in the number (er, numeral) xy where x and y were both 5000 bit numbers?
Exercise: What is the complexity of that algorithm above for computing xy, where x and y are both n-bit numbers. Remember you have to do at most n multiplications, but don't forget to consider the size of the numbers being multiplied.

We're usually a lot more interested in modular exponentation than plain old exponentiation. This is coming up soon.

Integer Square Root

See Wikipedia for this one.

Greatest Common Divisor

Euclid's algorithm is, for y >= 0, shown here in pseudocode:

def gcd(x, y)
  return y == 0 ? x : gcd(y, x mod y)
end
Exercise: Prove that this is correct.

Examples

    gcd(253, 113)
    = gcd(113, 27)
    = gcd(27, 5)
    = gcd(5, 2)
    = gcd(2, 1)
    = gcd(1, 0)
    = 1

    gcd(2535, 1130)
    = gcd(1130, 275)
    = gcd(275, 30)
    = gcd(30, 5)
    = gcd(5, 0)
    = 5

    gcd(33031, 182845)
    = gcd(182845, 33031)
    = gcd(33031, 17690)
    = gcd(17690, 15341)
    = gcd(15341, 2349)
    = gcd(2349, 1247)
    = gcd(1247, 1102)
    = gcd(1102, 145)
    = gcd(145, 87)
    = gcd(87, 58)
    = gcd(58, 29)
    = gcd(29, 0)
    = 29
Exercise: Show why this algorithm has worst-case Θ(n3) time complexity.

Modular Arithmetic

When numbers get too big, start counting over.

Example: 5395 seconds = 89 minutes, 55 seconds = 1 hour, 29 minutes, 55 seconds.
Example: As I write this, my computer says the time is 1202955781593. That's the number of milliseconds since the epoch. Does that number mean much to you?

Let's look at numbers "mod 6":

    -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
    ----------------------------------------------------------------------------
      0   1   2  3  4  5  0  1  2  3  4  5 0 1 2 3 4 5 0 1 2 3  4  5  0  1  2  3

People like the notation "x ≡ y (mod N)", examples:

What do addition, multiplication, and exponentiation look like in a modular context?

Modular Addition

If the values to be added are in 0..N−1, you don't have to divide to get the remainder:

Example:
Exercise: Prove that modular addition has linear complexity.

If adding a lot of numbers, do the "mod" on the intermediate results:

      (30 + 26 + 8 + 29 + 28 + 5) mod 31
    = (((((30 + 26) mod 31 + 8) mod 31 + 29) mod 31 + 28) mod 31 + 5) mod 31
    = ((((25 + 8) mod 31 + 29) mod 31 + 28) mod 31 + 5) mod 31
    = (((2 + 29) mod 31 + 28) mod 31 + 5) mod 31
    = ((0 + 28) mod 31 + 5) mod 31
    = (28 + 5) mod 31
    = 2

Modular Multiplication

To compute x × y mod N, multiply (that's quadratic) then divide to get the remainder (also quadratic). You can also do the "mods" on intermediate results:

      (30 × 26 × 8 × 29 × 28 × 5) mod 31
      (((((30 × 26) mod 31 × 8) mod 31 × 29) mod 31 × 28) mod 31 × 5) mod 31
    = ((((5 × 8) mod 31 × 29) mod 31 × 28) mod 31 × 5) mod 31
    = (((9 × 29) mod 31 × 28) mod 31 × 5) mod 31
    = ((13 × 28) mod 31 × 5) mod 31
    = (23 × 5) mod 31
    = 22

Modular Exponentiation

To compute xy mod N, we can use the exponentiation algorithm above, doing the modulos for all intermediate results.

Let's try 31400 mod 100:

a = 1                //              a == 1
p = 31               // p == 31^1 == 31
(p *= p) mod 100     // p == 31^2 == 61
(p *= p) mod 100     // p == 31^4 == 21
(p *= p) mod 100     // p == 31^8 == 41
(p *= p) mod 100     // p == 31^16 == 81
(a *= p) mod 100     //              a == 31^16 == 81
(p *= p) mod 100     // p == 31^32 == 61
(p *= p) mod 100     // p == 31^64 == 21
(p *= p) mod 100     // p == 31^128 == 41
(a *= p) mod 100     //              a == 31^144 == 21
(p *= p) mod 100     // p == 31^256 == 81
(a *= p) mod 100     //              a == 31^400 == 1

Also see the Wikipedia article; it includes a nice code fragment. In pseudocode:

def modpow(x, y, n)
  result = 1
  while y > 0
    result = (result * x) mod n if e.odd
    y >>= 1
    x = (x * x) % n
  end
  return result
end

Modular Exponentiation is built into Java (and therefore Groovy):

groovy> 8G.modPow(3, 100)
groovy> go
===> 12
groovy> 23454G.modPow(2311,145332)
groovy> go
===> 77904
groovy> 5438394857757488G.modPow(3424255654452323, 234235256666421)
groovy> go
===> 189325407946433

The fact that you can do the modulos at any intermediate results lets you do some cool computations without a calculator:

    2345 mod 31
    = (25)69 mod 31
    = 3269 mod 31
    = (32 mod 31)69 mod 31
    = 169 mod 31
    = (1 mod 31)69 mod 31
    = 1

Once you get good at this stuff you just "pull out" the mod 31 and write only this:

    2345 ≡ (25)69 ≡ 3269 ≡ 169 ≡ 1 (mod 31)
Note: We've just seen a polynomial time algorithm for computing z = xy mod N, but there currently exists no known efficient algorithm for finding y given x, z, and N. So modular exponentiation is a kind of one-way function.

Modular Multiplicative Inverse

The (regular) multiplicative inverse of a non-zero real number x is the number y such that xy = 1. We usually write it as x−1 or 1/x.

In the modular world the same idea holds.

Example: The modular multiplicative inverse of 11, mod 25, is 16, because (16 × 11) mod 25 = 1.

But wait.... You don't always have a modular inverse. There is no number y such that 12 × y mod 34 = 1. There is a theorem that says x has an inverse mod N if and only if x is relatively prime to N. (x and y are relatively prime means that gcd(x, y) = 1.)

Code to find the modular inverse is part of the standard library in Java, and therefore in Groovy:

groovy> 11G.modInverse(25)
groovy> go
===> 16
groovy> 234535235234G.modInverse(2345665654331)
groovy> go
===> 146170270779
groovy> 12G.modInverse(34)
groovy> go
Caught: java.lang.ArithmeticException: BigInteger not invertible.

To compute the modular inverse, yourself, from scratch, you can use the "extended Euclidean algorithm". Details are in any good book on number theory, algorithms, or cryptography. You might even find a trustworthy implementation online.

Exercise: Write a modular inverse function in C.

Primality Testing

There are lots of algorithms out there to determine if a number is prime. Two main classes of algorithms:

Even the fastest deterministic algorithms are really, really, slow compared to the probabilistic ones. Besides, you can get probabilistic algorithms to have a probability of error less than that of hardware failures or getting hit by lightning and winning the lottery on the same day.

Some Algorithms

Solovay-Strassen Test

It turns out that, for an odd positive n, if n is prime then

    a(n-1)/2 ≡ Jacobi(a,n) (mod n)

is true for all a &isin {1...n-1}, and if n is composite then that congruence holds true for at most half of the values in {1...n-1}.

So if you pick a random a less than n, and the congruence fails, you definitely know n is composite. But if it holds, n has a 50% chance (1 out of 2) of being composite. So pick another value for a. If it holds again, you have a 1/4 chance of being composite. If another one passes, your odds of being composite go down to 1 in 8. Pass 30 straight tests and you have only a 1 in a billion chance.

# Returns whether n is probably prime, with a 1/2^k chance of being wrong
def is_probably_prime(n, k)
  assert n is odd and >= 3
  k.times do
    a = random(1, k-1)
    j = Jacobi(a, n)
    if j == -1 then j = n - 1
    if j == 0 or modpow(a, (n-1)/2, n) != j then return false
  end
  return true
end

Lehmann Test

This one doesn't require computing the Jacobi symbol. You simply compute a(p−1)/2 mod p for k randomly selected as (all less than p). If you ever get a value other than 1 or p-1, then p is definitely composite. Otherwise, if you always get 1 and p-1 and there is at least one p-1 then the probability that p is prime is 1−2−k.

Miller-Rabin Test

For self-study.

Integer Factorization

No one as yet knows any efficient general-purpose algorithm for determining the prime factors of an integer, but a number of special-purpose algorithms (besides the simple-minded "direct search" approach) have been invented.

MathWorld has good articles on Prime Factorization and Prime Factorization Algorithms. Start your self-study tour there.

Why care? The fact we can generate primes efficiently but not (yet???) factor integers efficiently gives us one way to implement public-key cryptography.

Oh yeah, you might want to read about Shor's Algorithm.