Primes to One Trillion
The Gutenberg text "The First 100,000 Prime Numbers", EBook #65, lists the primes up to 1,318,699. This somewhat more ambitious version lists the primes up to one trillion (1,000,000,000,000 or 1E12).
Introduction
I became interested in prime numbers after hearing about [Goldbach's Conjecture,]
"Every even integer greater than 2 can be expressed as the sum of two primes".
Verifying this requires a source of primes. Short lists (or programs to generate them) are widely available. Really long lists are scarce, except for
[primos.mat.br.]
To make these lists more accessible, I have reformatted them to a size easily manageable by ordinary text editors and viewers--about 55MB. The file names correspond to the range of primes the file contains:
| 00000000000_to_00100000000.txt | Zero to 100 Million |
| 00100000000_to_00200000000.txt | 100 Million to 200 Million |
| etc. |
The leading zero digits cause the file names to collate in order of their content. Longer lists can be composed with the DOS copy command. Move the required prime txt files to a temporary directory and use:
copy *.txt longList.txt
Prime Text Files
This is a collection of 10,000 files, occupying about 486GB of disk space in their unzipped native txt format. Since adjacent primes have about 90% identical leading digits, the compressed (zip) versions total 61GB. Each zip file contains 100 txt files.
Do not use Windows Explorer to copy or move large numbers of files at a time. Use DOS copy or xcopy for large copies. I find Beyond Compare (Scooter Software) handy for keeping track of large numbers of files.
The following zip file names denotes the start of a range of primes in billions (1,000,000,000), and covers a range of 10,000,000,000. Each zip file contains 100 txt files, each covering the primes in a range of 100,000,000.
[0000.zip] 000G to 010G (0 to 1E10)
[0010.zip] 010G to 020G
[0020.zip] 020G to 030G
[0030.zip] 030G to 040G
[0040.zip] 040G to 050G
[0050.zip] 050G to 060G
[0060.zip] 060G to 070G
[0070.zip] 070G to 080G
[0080.zip] 080G to 090G
[0090.zip] 090G to 100G
[0100.zip] 100G to 110G
[0110.zip] 110G to 120G
[0120.zip] 120G to 130G
[0130.zip] 130G to 140G
[0140.zip] 140G to 150G
[0150.zip] 150G to 160G
[0160.zip] 160G to 170G
[0170.zip] 170G to 180G
[0180.zip] 180G to 190G
[0190.zip] 190G to 100G
[0200.zip] 200G to 210G
[0210.zip] 210G to 220G
[0220.zip] 220G to 230G
[0230.zip] 230G to 240G
[0240.zip] 240G to 250G
[0250.zip] 250G to 260G
[0260.zip] 260G to 270G
[0270.zip] 270G to 280G
[0280.zip] 280G to 290G
[0290.zip] 290G to 200G
[0300.zip] 300G to 310G
[0310.zip] 310G to 320G
[0320.zip] 320G to 330G
[0330.zip] 330G to 340G
[0340.zip] 340G to 350G
[0350.zip] 350G to 360G
[0360.zip] 360G to 370G
[0370.zip] 370G to 380G
[0380.zip] 380G to 390G
[0390.zip] 390G to 300G
[0400.zip] 400G to 410G
[0410.zip] 410G to 420G
[0420.zip] 420G to 430G
[0430.zip] 430G to 440G
[0440.zip] 440G to 450G
[0450.zip] 450G to 460G
[0460.zip] 460G to 470G
[0470.zip] 470G to 480G
[0480.zip] 480G to 490G
[0490.zip] 490G to 400G
[0500.zip] 500G to 510G
[0510.zip] 510G to 520G
[0520.zip] 520G to 530G
[0530.zip] 530G to 540G
[0540.zip] 540G to 550G
[0550.zip] 550G to 560G
[0560.zip] 560G to 570G
[0570.zip] 570G to 580G
[0580.zip] 580G to 590G
[0590.zip] 590G to 500G
[0600.zip] 600G to 610G
[0610.zip] 610G to 620G
[0620.zip] 620G to 630G
[0630.zip] 630G to 640G
[0640.zip] 640G to 650G
[0650.zip] 650G to 660G
[0660.zip] 660G to 670G
[0670.zip] 670G to 680G
[0680.zip] 680G to 690G
[0690.zip] 690G to 600G
[0700.zip] 700G to 710G
[0710.zip] 710G to 720G
[0720.zip] 720G to 730G
[0730.zip] 730G to 740G
[0740.zip] 740G to 750G
[0750.zip] 750G to 760G
[0760.zip] 760G to 770G
[0770.zip] 770G to 780G
[0780.zip] 780G to 790G
[0790.zip] 790G to 700G
[0800.zip] 800G to 810G
[0810.zip] 810G to 820G
[0820.zip] 820G to 830G
[0830.zip] 830G to 840G
[0840.zip] 840G to 850G
[0850.zip] 850G to 860G
[0860.zip] 860G to 870G
[0870.zip] 870G to 880G
[0880.zip] 880G to 890G
[0890.zip] 890G to 800G
[0900.zip] 900G to 910G
[0910.zip] 910G to 920G
[0920.zip] 920G to 930G
[0930.zip] 930G to 940G
[0940.zip] 940G to 950G
[0950.zip] 950G to 960G
[0960.zip] 960G to 970G
[0970.zip] 970G to 980G
[0980.zip] 980G to 990G
[0990.zip] 990G to 1000G
PrimeC File Format and
Miscellaneous C++ Programs
While working with primes, I developed the primec format, a file or array representation for primes that is roughly the same size of the compressed (zipped) txt representation, and supports fast access, both sequential and direct. The exact location of the primality specification of any number in the file (or memory array) is computed with a few instructions and no search.
If you wish to examine and experiment with the C++ programs used to reformat these prime lists and test the Goldbach Conjecture, download the "programs.zip" package. It contains Generating and Analyzing Prime Numbers, a description of the content and use of these files, including the primec file format.
Primec Format
The primec format exploits the fact that all primes greater than 5 end in the decimal digits 1, 3, 7, or 9. Thus, the primality of 20 successive numbers can be specified in one 8 bit byte. The file begins with the complete binary representation of:
The beginning of the sequence
The end of the sequence
A check sum of all data bytes
(All three are 8 bytes for this implementation).
The first and last values are a multiple of twenty, thus are never primes. There is no overlap of primes between successive files that use the same number for the upper boundary of the first file, and the lower boundary of the second file.
The primality of any number in the range of the file is determined as follows:
If the number ends in 0, 2, 4, 5, 6, or 8, it is not prime.
Otherwise, the location of the specifying byte is at offset:
( value - start ) / 20
Within that byte, the primality of the value is specified by the bit as shown in the following table.
The only tedious programming tasks were:
Special case code for values less than 20, which include 2 and 5, and exclude 1 and 9. All larger values follow the same simple pattern.
The increment and decrement operators for the corresponding iterators must search forward (or backward) for the next true bit, specifying the next prime number.
This table shows the layout and content for a file containing 20 to 60. The first 24 bytes (start value, end value, check sum) are not shown.
| Byte | 0 | 1 | ||||||||||||||
| Bit | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| Value | 21 | 23 | 27 | 29 | 31 | 33 | 37 | 39 | 41 | 43 | 47 | 49 | 51 | 53 | 57 | 59 |
| Prime | F | T | F | T | T | F | T | F | T | T | T | F | F | T | F | T |
| Hex | 5A | E5 | ||||||||||||||
As primes become larger, the density of primes becomes smaller as 1/ln(n). Thus the density of true bits also falls off. The number of digits (binary or decimal) to represent the primes grows as ln(n). Thus, a sequence of primes represented as primec is always competitive in size with the corresponding sequence in ASCI text or binary, besides providing fast direct access by value:
bool isPrime(value).
The results for the sequence of the largest 64 bit primes (18446744073707000000 to 18446744073709551558) is:
| Format | Size (KB) |
| 64 Bit Binary | 445 |
| Txt | 1150 |
| Zip Txt | 114 |
| PrimeC | 125 |
Programs
Among the programs in the "program.zip" package are:
| BuildTxtPrime | Create file of primes, txt format |
| TxtToPrimeC | Convert txt to primec format. |
| Goldbach | Verify Goldbach's conjecture for zero to 1E12 |
Among the more than 15 classes and utilities are:
| PrimeGenerator | Create prime numbers in a given range. |
| Directory | A vector of strings containing the names of files in a file directory. |
| Progress | A class to manage the periodic reporting of program activity. |
| PrimeCVector | Abstract class providing the algorithms to access primec data. |
| PrimeCFileWriter | Create a primec file. |
| PrimeCFileReader | Read a primec file. |
I hope you find them useful.
If you have any questions, observations or bug reports concerning the C++ programming or the content of the prime files, send an email (after changing "at" to "@".
primes1e12 at earthlink.net
I embarked on this project as a programming challenge. I am not a mathematician. I have no deep insight into prime number theory. Please confine messages to programming issues. Here are some references:
[Prime Numbers]
[Wikipedia: List of Prime Numbers (with numerous references)]
[The Math Forum]
[The Prime Pages]
The program files are also posted on
[http://home.earthlink.net/~primes1E12].
Corrections and additions will be posted there as they occur.
Don Kostuch
October, 2018.