For small, randomly chosen composites, naive factoring, i.e. sequential divisibility checking, is many times quicker than Pollard’s p-1 method.

I implemented both algorithms in C++ using GMP to handle the big numbers. I generated numbers with 5-50 digits, three numbers for each digit-class. I filtered the numbers to be composite and odd. I evaluated both algorithms on each list of numbers, as seen in table 1. A couple numbers took so many times longer than the others to factor – hours compared to the usual seconds – that I left them blank.

Comparing Naive (Nx) and Pollard (Px) algorithms via clock time in microseconds, over three lists of numbers.
Digits N1 (\(\mu\)s) P1 (\(\mu\)s) N2 (\(\mu\)s) P2 (\(\mu\)s) P3 (\(\mu\)s) N3 (\(\mu\)s)
5 50 430 27 335 398 30
6 2 290 2 289 372 6
7 2 280 5 312 350 4
8 2 273 2 276 329 2
9 2 277 1 275 348 2
10 3 275 17 305 347 3
11 2 276 2 277 5641 2026
12 1 277 1 283 424 3
13 1 278 1 301 366 3
14 2 275 7 339 1023 75
15 2 276 17 316 297 2
16 22 553 2 276 302 2
17 3 290 1 279 327 2
18 3 277 4 367 327 5
19 8 280 58 1136 367 30
20 2 277 2 283 320 4
21 2 291 11 363 330 5
22 34 411 184 835 3948 417
23 2 287 80 1074 306 2
24 64501813 2107606294 2 291 290 3
25 12 429 264611 1810016 292 2
26 2 278 3 496 286 2
27 2 275 136139284 1152083995 288 2
28 2 347 2419 6599 1539853 141744
29 73 1916 21 1324 342 4
30     9 1299 285 2
31 26 630 11 1411 271 2
32 87235 1090536 9 1265 276 2
33 6 505 45 1614 313 22
34 2 432 8 1305 2127 109
35 2 395 9 1297 310 2
36 3 336 8 1300 327 4
37 1 361 8 1334 369 3
38 8 386 9 1291 343 2
39 8 484 26 1566 419 8
40 9 595 10 1293 316 6
41 2 446 40 1298 358 2
42 3 357 12 1358 320 2
43 8 669 147963 877520 54677 1538
44 8 405 17 1386 384 4
45 2 365 23 1415 304 2
46 2 330 10 1319    
47 2 418 12 1350 29 495
48 5 550 738 2472 3 441
49 2 330 13 1322 3 519
50 2 420 18 1324 462 959

In absolute terms, a few numbers stand out as taking way longer than the average. We can see in table 2 these are the ones with larger smaller factors. For example, the first 24-digit number, 776618064461348516680447, takes 5,375,151 times longer to factor with the naive algorithm than the first 25-digit number (4,912,835 times with Pollard’s). When choosing a number to be difficult to factor, the size of the number itself doesn’t matter – only the size of the smallest factor matters.

Naive factoring pretty much directly corresponds to the smallest factor of the number (c.f. table 2), which makes sense, but there is clearly a relatively large constant time factor for Pollard’s. Where does the time go? I profiled Pollard’s running on the first 24-digit number, and it turns out the program spent a whopping 64% of its time in the gcd function performing assignments and modulos.

/*
 * Use the Euclidean algorithm to find the GCD.
 */
mpz_class Factorer::gcd(mpz_class a, mpz_class b) {
    if (b < a) {
        mpz_class temp;
        temp = b;
        b = a;
        a = temp;
    }

    mpz_class r = a, old_r;
    do {
        old_r = r;
        r = b % a;
        b = a;
        a = r;
    } while (r != 0);

    return old_r;
}

As a matter of fact, the naive method’s main speed advantage came from not having to call the gcd function. While we think of the EEA as having linear time complexity, in this case it was a major drag on performance.

All these results make sense when we compare the algorithms. The naive obviously just depends on the size of the smallest factor:

int naive(n) {
    for (i=2; i <= sqrt(n); i++)
        if (n % i == 0)
            return i;
}

Pollard’s is a little more subtle. First let’s describe the algorithm.

Assume we factor \(n\). Choose an integer \(a \in (1, n)\). If \(\gcd(a, n) \neq 1\), we’ve gotten lucky and found a divisor of \(n\).

Otherwise, we iteratively compute \(a^{k!} \mod n\) for increasing values of \(k\):

\[\begin{aligned} a_1 &= a^{1!} = a \mod n \\ a_2 &= a_{1}^2 \mod n \\ &\vdots \\ a_i &= a_{i-1}^i \mod n \end{aligned}\] Notice how we use the previous number in the sequence to easily find the next. For each \(a_k\), we compute \(g = \gcd(a_k - 1, n)\). If \(g \in (1, n)\), \(g\) is a proper factor of \(n\). If \(g = n\), then the algorithm fails (we can retry with a different \(a\)). Otherwise we keep calculating \(a_k\) values.

At first, it’s not very clear what’s happening here. We seem to be cycling through arbitrary numbers until we stumble on a factor. And why take \(\gcd(a_k-1, n)\)?

Theorem 1. If \(p\) is a prime factor of \(n\) and \(p-1 \ | \ B!\) for some integer \(B\), then Pollard’s p-1 algorithm terminates in at most \(B\) steps.

Proof. We assume \(p\) is a factor of \(n\) and \(p-1 \ | \ B!\). When \(a_k = a^{k!} \mod n \iff \gcd(a_k - 1, n) \neq 1\), we want to show \(k \leq B\). Since by assumption \(B! = l(p-1)\),

\[\begin{aligned} &a^{B!} \equiv (a^{p-1})^l \equiv 1^l \mod p &\text{ by Fermat’s little theorem} \\ &\implies a^{B!} - 1 \equiv 0 \mod p \\ &\implies p \ | \ a^{B!} - 1 \\ &\implies \gcd(a^{B!} - 1, n) \neq 1 &\text{ since by assumption p divides n} \end{aligned}\] ◻

Now we can see the efficiency of Pollard’s depends on the factors of \(p-1\). If \(p-1\) has small factors, there will be a small \(B!\) for it it divide. Hence Wikipedia lists Pollard’s as a special-purpose algorithm; a well-chosen prime will be strong against it.

I didn’t implement elliptic curve factorization, but that would be the next obvious step. It would be interesting to see how EC would perform on the numbers I couldn’t factor here, and test EC factorization on even bigger numbers.

Listing the smallest factor found via naive factoring.
Digits Num1 Smallest Factor Num2 Smallest Factor Num3 Smallest Factor
5 103 3 5
6 5 5 17
7 3 19 11
8 3 7 3
9 3 3 3
10 7 71 7
11 3 3 7529
12 3 3 3
13 3 3 3
14 3 17 239
15 3 71 3
16 89 3 5
17 7 3 3
18 7 13 11
19 29 233 17
20 3 3 7
21 5 43 17
22 137 787 1669
23 5 337 3
24 270609551 5 7
25 41 1103779 3
26 3 3 3
27 3 569306393 3
28 5 2251 561109
29 307 13 3
30   3 3
31 17 5 3
32 342179 3 3
33 17 37 89
34 5 3 257
35 3 3 3
36 7 3 13
37 3 3 11
38 29 3 3
39 23 17 29
40 31 3 19
41 3 31 3
42 3 5 5
43 29 139291 5849
44 29 5 3
45 3 13 3
46 3 3  
47 3 5 3
48 17 673 3
49 5 5 3
50 5 11 1327

Acknowledgment: I learned about cryptography and factoring from Prof. Daniel Hernandez’s excellent course at KU.