The reemergence of ancient notions in the modern field of bioinformatics
Aristotle, in his zoological opus Historia Animalium (The History of Animals), launches into his analysis of the animal kingdom by observing differences and similarities between species. For example, he observes that bats and birds both have wings, and surmises that they should be grouped together, as should fish and dolphins. By examining animal anatomy and comparing features such as the number or shape of legs (or their absence), wings, types of skin, habitats, etc., Aristotle put together a logically coherent taxonomy of animal life that remained virtually unchallenged until Linnaeus. This idea of comparative anatomy, as systematized by Aristotle, is essentially the study of homology (from the Greek word “hómoios”: “similar”) – i.e. of similarities. The idea flowed naturally from Aristotelian logic, and in particular from his theory of syllogisms: if A equals B and C equals B, then A equals C. If one replaces “equal” with “similar”, then homology is the logical corollary of equality.
In ancient Greek thought, and consequently in medieval European thought, homology was explained by ideal archetypes: timeless blueprints designed by a heavenly architect, into which the objects of perceived reality were molded. Darwin’s revolutionary idea was to provide a naturalistic explanation for animal homology, thus ushering in the era of the scientific study of life.
One and a half centuries after the publication of Darwin’s Origin of Species, the modern brethren of his Victorian genius spend much of their time, alas, not aboard adventurous sailing yachts roaming the southern seas, but in front of computer monitors, applying an ever-expanding arsenal of mathematical and computational techniques to the analysis of living organisms.
One of the most significant application areas of bioinformatics – as this contemporary fusion of biology, computer science and mathematics is termed – is in the study of complex molecules, such as proteins.
Proteins, the building blocks of cells, have structures determined by their particular sequence of amino acids (which are, in turn, the building blocks of proteins); the way the amino acid chain folds in three-dimensional space is what determines the function of a protein. So it is very important for biologists to be able to predict the structure of proteins. What we know is that a protein’s structure is generally determined by the sequence of the gene that codes for it. And here is where the notion of homology reemerges. It is used to predict the function of a gene. If the sequence of gene A, whose function is known, is homologous to the sequence of gene B, whose function is unknown, one could infer that B may share A’s function. In a technique called homology modeling, this information is used to predict the structure of a protein once the structure of a homologous protein is known.
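To make the inference concrete, here is a minimal sketch in Python of the reasoning "B resembles A, so B may share A's function". It rests on several assumptions that are not part of the discussion above: the sequences are already aligned (gaps written as "-"), sequence similarity is used as a rough stand-in for homology, and the function names, the toy sequences and the 0.8 threshold are purely illustrative. Real tools use far more careful alignment and statistics.

```python
from typing import Dict, Optional


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Fraction of gap-free aligned columns in which both sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Only compare columns where neither sequence has a gap ('-').
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return matches / len(pairs)


def infer_function(query: str,
                   annotated: Dict[str, str],
                   threshold: float = 0.8) -> Optional[str]:
    """Return the function label of the most similar annotated sequence,
    or None if nothing reaches the (hypothetical) similarity threshold."""
    best_label, best_score = None, 0.0
    for label, sequence in annotated.items():
        score = percent_identity(query, sequence)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


if __name__ == "__main__":
    # Toy data: labels and sequences are invented for illustration only.
    known = {"kinase": "ACGTACGA", "transporter": "TTTTTTTT"}
    print(infer_function("ACGTACGT", known))
```

The result is only a hypothesis, in keeping with the "may share" above: a high score suggests a function for B, and a score below the threshold yields no suggestion at all.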
Caveat Lector: biologists beware! Meddling with mathematicians who are, secretly, Platonic devotees, may one day lead you to the defense of positivist naturalism against subversive philosophical attacks from the musical spheres of perfect, ideal, proteins-out-there. Ancient ideas, as you should know, are very hard to beat.