New software calculates the probability of generating functional proteins by chance

Apologetics and the progress of science

Here’s an article sent to me by JoeCoder about a new computer program written by Kirk Durston.

About Kirk:

Kirk Durston is a scientist, a philosopher, and a clergyman with a Ph.D. in Biophysics, an M.A. in Philosophy, a B.Sc. in Mechanical Engineering, and a B.Sc. in Physics. His work involves a significant amount of time thinking, writing and speaking about the interaction of science, theology and philosophy within the context of authentic Christianity. He has been married for 34 years to Patti and they have six children and three grandchildren. He enjoys landscape photography, antiques of various types, wilderness canoeing and camping, fly fishing, amateur astronomy, reading, music, playing the saxophone (alto), and enjoying family and friends.

Kirk grew up on a cattle and grain farm in central Manitoba, Canada, where he spent countless hours wandering around on his own in the forest as a young boy, fascinated with the plants and animals that are native to that region of the province. Throughout his teen years he spent six days a week in the summer working as a farm hand with cattle and grain. He left his father’s farm at the age of 19 to go to university.

Canada? Can anything good come out of Canada? Oh well, at least he’s not from Scotland. Anyway, on to the research, that’s what we care about. Code!

Summary of the article:

  • Biological life requires proteins
  • Proteins are sequences of amino acids, chained together
  • The order of amino acids determines whether the sequence has biological function
  • Sequences that have biological function are rare, compared to the total number of possible sequences
  • Durston wrote a program to calculate the probability of getting a functional sequence by random chance
  • The probability for getting a functional protein by chance is incredibly low

With that said, we can understand what he wrote:

This program can compute an upper limit for the probability of obtaining a protein family from a wealth of actual data contained in the Pfam database. The first step computes the lower limit for the functional complexity or functional information required to code for a particular protein family, using a method published by Durston et al. This value for I(Ex) can then be plugged into an equation published by Hazen et al. in order to solve the probability M(Ex)/N of ‘finding’ a functional sequence in a single trial.
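To make that first step a little more concrete, here is a minimal sketch of one way to estimate functional information from an alignment. It is not Durston's program. It assumes the estimate is the sum, over aligned sites, of the drop from the maximum entropy of 20 equally likely amino acids to the entropy actually observed at that site. It also assumes no gap characters, and the function names and the toy alignment are made up for illustration.

```python
import math
from collections import Counter

def site_entropy(column):
    """Shannon entropy (in bits) of the residues observed at one aligned site."""
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in Counter(column).values())

def functional_bits(alignment, alphabet_size=20):
    """Sum over sites of (maximum entropy - observed entropy), in bits."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    ground = math.log2(alphabet_size)  # about 4.32 bits per site for 20 amino acids
    return sum(
        ground - site_entropy([seq[i] for seq in alignment])
        for i in range(length)
    )

# Made-up toy alignment (not Pfam data): two conserved sites, one site split 50/50.
toy_alignment = ["ACD", "ACE", "ACD", "ACE"]
toy_fits = functional_bits(toy_alignment)
print(f"{toy_fits:.2f} Fits")
```

In the toy alignment, each fully conserved site contributes log2(20) bits, and the site split evenly between two residues contributes one bit less. A real run would use thousands of aligned sequences, as in the examples that follow.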

I downloaded 3,751 aligned sequences for the Ribosomal S7 domain, part of a universal protein essential for all life. When the data was run through the program, it revealed that the lower limit for the amount of functional information required to code for this domain is 332 Fits (Functional Bits). The extreme upper limit for the number of sequences that might be functional for this domain is around 10^92. In a single trial, the probability of obtaining a sequence that would be functional for the Ribosomal S7 domain is 1 chance in 10^100 … and this is only for a 148 amino acid structural domain, much smaller than an average protein.

For another example, I downloaded 4,986 aligned sequences for the ABC-3 family of proteins and ran it through the program. The results indicate that the probability of obtaining, in a single trial, a functional ABC-3 sequence is around 1 chance in 10^128. This method ignores pairwise and higher order relationships within the sequence that would vastly limit the number of functional sequences by many orders of magnitude, reducing the probability even further by many orders of magnitude – so this gives us a best-case estimate.
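The Ribosomal S7 figures can be checked with a few lines of arithmetic. This is a rough sketch that assumes the Hazen relation has the form I(Ex) = -log2(M(Ex)/N) and that N is the full sequence space of 20^148 for a 148 amino acid domain. The variable and function names are my own. Everything is done in log10 to keep the numbers readable.

```python
import math

AMINO_ACIDS = 20

def log10_probability(fits):
    """log10 of M(Ex)/N, assuming I(Ex) = -log2(M(Ex)/N)."""
    return -fits * math.log10(2)

def log10_sequence_space(length):
    """log10 of N, the number of possible sequences of the given length."""
    return length * math.log10(AMINO_ACIDS)

# Figures quoted above for the Ribosomal S7 domain.
s7_fits = 332
s7_length = 148

log_p = log10_probability(s7_fits)        # log10 of the single-trial probability
log_n = log10_sequence_space(s7_length)   # log10 of all possible sequences
log_m = log_n + log_p                     # log10 of the functional sequence count

print(f"probability          ~ 10^{log_p:.1f}")
print(f"sequence space       ~ 10^{log_n:.1f}")
print(f"functional sequences ~ 10^{log_m:.1f}")
```

Under those assumptions, 332 Fits works out to roughly 1 chance in 10^100, and the upper limit on functional sequences lands between 10^92 and 10^93, which lines up with the numbers in the quote.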

There are only about 10^80 particles in the entire physical universe – 10^85 at the most. These are long odds. But maybe if we expand the probabilistic resources by buying more slot machines, and we pull the slot machine lever at a much faster rate… can we win the jackpot then?

Nope:

What are the implications of these results, obtained from actual data, for the fundamental prediction of neo-Darwinian theory mentioned above? If we assume 10^30 life forms with a fast replication rate of 30 minutes and a huge genome with a very high mutation rate over a period of 10 billion years, an extreme upper limit for the total number of mutations for all of life’s history would be around 10^43. Unfortunately, a protein domain such as Ribosomal S7 would require a minimum average of 10^100 trials, about 10^57 trials more than the entire theoretical history of life could provide – and this is only for one domain. Forget about ‘finding’ an average sized protein, not to mention thousands.

So even if you have lots of probabilistic resources, and lots of time, you’re still not going to get your protein.
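Here is a quick sanity check on that shortfall, using only the two figures from the quote: about 10^43 total trials and a single-trial probability of 1 in 10^100. It is a sketch that assumes every trial is independent, and the variable names are mine.

```python
import math

total_trials = 1e43      # quoted upper limit on mutations in life's history
p_single_trial = 1e-100  # quoted single-trial probability for Ribosomal S7

# Chance of at least one success in n independent trials: 1 - (1 - p)^n.
# log1p and expm1 keep precision when p is this small.
p_at_least_one = -math.expm1(total_trials * math.log1p(-p_single_trial))

# Orders of magnitude separating the trials needed from the trials available.
shortfall_exponent = -math.log10(p_single_trial) - math.log10(total_trials)

print(f"chance of at least one success: {p_at_least_one:.1e}")
print(f"shortfall: a factor of about 10^{shortfall_exponent:.0f}")
```

With numbers this lopsided, the chance of even one success is essentially the number of trials times the single-trial probability, or about 1 in 10^57.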

Compare these numbers with the 1 in 10^77 number that I posted about yesterday from Doug Axe. There is just no way to account for proteins if there is no intelligent agent to place the amino acids in sequence. When it comes to writing code, writing blog posts, writing music, or placing Scrabble letters, you need an intelligence. Sequencing amino acids into proteins? You need an intelligence.