How many clones are needed to ensure that every part of the human genome is represented by at least one clone?
Consider the probability that a specific fragment is not represented in any clone:
Let f = K/G, where K is the fragment size and G is the genome size.
If we select a clone at random from the library, the probability that it does not contain the fragment of interest is 1 - f.
If we select N clones, the probability that the fragment is not in any of the N clones is (1 - f)^N.
Let P be the probability that the fragment is represented at least once in N clones. The probability that the fragment is not represented in N clones is then
1 - P = (1 - f)^N
Taking logarithms of both sides and solving for N gives
N = log(1 - P) / log(1 - f)
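Written out step by step, the rearrangement might look as follows. The final approximation is an addition that is not part of the original notes; it uses ln(1 - f) ≈ -f, which holds when f is much smaller than 1.

```latex
\begin{align*}
1 - P &= (1 - f)^N \\
\log(1 - P) &= N \log(1 - f) \\
N &= \frac{\log(1 - P)}{\log(1 - f)} \approx \frac{-\ln(1 - P)}{f} \quad (f \ll 1)
\end{align*}
```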
Suppose you want to ensure that a specific fragment will be represented at least once in a human clone library with a probability of 0.99.
Substituting P = 0.99 and f = 6.666... × 10^-5,
we find that N = 69,075 clones.
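The substitution can be checked numerically. The following is a minimal sketch, not code from the original notes; the function name `clones_needed` is hypothetical, and f is taken as exactly 1/15000, which is the repeating decimal 6.666... × 10^-5.

```python
import math

def clones_needed(p, f):
    """Return N such that a given fragment appears at least once
    in N clones with probability p, where f = K/G."""
    if not (0 < p < 1 and 0 < f < 1):
        raise ValueError("p and f must lie strictly between 0 and 1")
    # log1p(-x) evaluates log(1 - x) accurately when x is small
    return math.log1p(-p) / math.log1p(-f)

f = 1 / 15000                # 6.666... * 10^-5, the value used in the text
n = clones_needed(0.99, f)
print(round(n))              # 69075
```

The result is not a whole number (it is a little above 69,075), so rounding up instead would give 69,076 clones.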
But the BAC fragments are still too big for DNA sequencing, so mechanical shearing is applied to the whole genome to produce smaller fragments of, say, 10 kb.
With 10 kb fragments and a genome of about 3 × 10^9 bp, f is about 3.3 × 10^-6, so the number of clones needed to include a specific fragment with 99% probability is about 1.4 million.
Several runs with fragments of different sizes, down to 2 kb, were used.
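To see how the required library size grows as fragments shrink, the same formula can be evaluated for several fragment sizes. This is a rough sketch under stated assumptions: a genome size of 3 × 10^9 bp (not given in this section, but consistent with f = 6.666... × 10^-5 for 200 kb fragments), P = 0.99, and a hypothetical helper named `library_size`.

```python
import math

GENOME_BP = 3_000_000_000  # assumed human genome size in bp; not stated in the text

def library_size(fragment_bp, p=0.99, genome_bp=GENOME_BP):
    """Smallest whole number of clones giving probability >= p
    that a specific fragment is represented at least once."""
    f = fragment_bp / genome_bp          # f = K/G
    return math.ceil(math.log1p(-p) / math.log1p(-f))

# 200 kb is an assumed BAC-scale size; 10 kb and 2 kb come from the text
for size in (200_000, 10_000, 2_000):
    print(f"{size:>7} bp fragments: {library_size(size):>9,} clones")
```

Because N is close to -ln(1 - P)/f when f is small, halving the fragment size roughly doubles the number of clones required.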
It took 20,000 CPU hours on a supercomputer to assemble the human genome sequence from these fragments.