Genes versus Proteins

From BCCD 3.0


As mentioned in About mpiBLAST, BLAST can search databases of either genes or proteins. This raises the question: what's the difference? If a gene codes for an amino acid sequence, and that sequence makes up the protein, the two should be the same, right? Not exactly.

Some variation comes into play. Each group of three nucleotides (A, T, C, and G in DNA; in the RNA copy that is actually translated, U takes the place of T) codes for a specific amino acid. For example, the sequence UUA codes for the amino acid leucine. However, UUG, CUU, CUC, CUA, and CUG also code for leucine! So a mutation in a gene does not necessarily change the protein that the gene produces. If a mutation causes CUA to become CUG, leucine will still be placed at the same position in the protein.
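The redundancy described above can be checked with a short script. This is only an illustrative sketch: the names BASES, AMINO_ACIDS, and CODON_TABLE are made up for this example, and the table is the standard genetic code written with RNA bases and one-letter amino acid codes (L is leucine, * is a stop codon).

```python
import itertools

# Standard genetic code with RNA bases. The 64 one-letter amino acid codes
# are listed in the conventional order: first base varies slowest, third
# base fastest, each running through U, C, A, G. '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-letter codon (e.g. "UUA") to its amino acid.
CODON_TABLE = {
    b1 + b2 + b3: amino_acid
    for (b1, b2, b3), amino_acid in zip(
        itertools.product(BASES, repeat=3), AMINO_ACIDS
    )
}

# Collect every codon that codes for leucine ("L").
leucine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")
print(leucine_codons)

# A CUA -> CUG mutation leaves the amino acid unchanged.
print(CODON_TABLE["CUA"] == CODON_TABLE["CUG"])
```

The script should list the six leucine codons named above and confirm that CUA and CUG give the same amino acid.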

On the other hand, a single base change in the nucleotide sequence of a gene can have drastic effects on the protein. Since the gene is read three nucleotides (one codon) at a time, deleting or adding one base shifts the reading frame and can change every codon downstream of it. This is called a frameshift mutation. On a smaller scale, substituting one base in a codon can change the amino acid inserted at that position in the protein. If a UUG codon (which normally codes for leucine) becomes UUC, phenylalanine will be added instead of leucine.
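To see both effects side by side, here is a minimal sketch that translates a short sequence before and after each kind of mutation. The translate function and the example sequence are invented for illustration; the codon table is the standard genetic code with one-letter amino acid codes (M methionine, L leucine, F phenylalanine, and so on).

```python
import itertools

# Standard genetic code with RNA bases; '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(itertools.product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an RNA string codon by codon, stopping at a stop codon.

    Leftover bases that do not fill a whole codon are ignored.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical example sequence: AUG UUG CCA GAU UAA
original = "AUGUUGCCAGAUUAA"

# Point mutation: the UUG codon becomes UUC (leucine -> phenylalanine).
point_mutant = original[:5] + "C" + original[6:]

# Frameshift: delete the first base of the UUG codon, shifting every
# codon after it.
frameshift_mutant = original[:3] + original[4:]

print(translate(original))           # MLPD
print(translate(point_mutant))       # MFPD
print(translate(frameshift_mutant))  # MCQI
```

The point mutation changes only one amino acid, while the single-base deletion changes every amino acid after the start codon and, in this example, also removes the stop codon from the reading frame.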

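So, depending on what you're investigating, choosing to look at the gene (nucleotide) sequence or the protein (amino acid) sequence can have a big impact on what you find: a nucleotide comparison registers mutations that leave the protein unchanged, while a protein comparison does not. (For more specifics on the DNA sequences that code for amino acids, please check out Wikipedia's article on the genetic code.)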
