Mitochondrial DNA is a long, circular strand of DNA. (Bacterial DNA is also circular.) It is composed of 16,569 smaller units, called base pairs. Each base pair is composed of two nucleotides. There are only four possible nucleotides: adenine (A), thymine (T), cytosine (C) and guanine (G). Each nucleotide has a complementary nucleotide. So, along the strand of DNA, adenine always appears paired with thymine, and cytosine always appears paired with guanine. Because each nucleotide can only appear with its complement, it is not necessary to report both sides of the chain. So, the DNA chain can be expressed as a chain of nucleotides, for example, GATCACAGGT…
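As a rough illustration of why one side of the chain is enough, the pairing rule can be written as a lookup table. This is only a sketch: the function name complement_strand is my own, and it pairs nucleotides position by position without modelling the direction in which each strand is read.

```python
# Pairing rule described above: adenine with thymine, cytosine with guanine.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand: str) -> str:
    """Return the nucleotide that pairs with each position of `strand`."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


# The example chain from the text and the side that is left unreported.
print(complement_strand("GATCACAGGT"))  # CTAGTGTCCA
```

Applying the function twice returns the original chain, which is the sense in which the second side carries no extra information.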
Taking a DNA Sample
A DNA sample consists of human cells. The most common method of taking a sample is to use a cotton swab to brush the inside of a person’s cheek. Some labs use mouthwash or chewing gum. Older procedures often required a blood sample. The sample is then sent to a lab for testing.
Lab Procedure
When a lab tests mtDNA, it looks for mutations. Mutations can take three forms:
1. Substitutions — the base pair at a particular location can change. This is the most common form of mutation, and the only form I discuss here.
2. Deletions — the base pair at a location can be deleted.
3. Insertions — a new base pair can be inserted between existing locations.
To find mutations, the lab determines which nucleotides appear at each location on the mtDNA molecule. To save time, it tests only hyper-variable segments, that is, areas where mutations are most likely to occur. One common segment to test is HVS-1 (also called HVR-1), which starts at base pair 16,001 and ends at base pair 16,568. Another common region to test is HVS-2 (also called HVR-2), which starts at base pair 1 and ends at base pair 574. (Note: the actual range for each hyper-variable region varies slightly from lab to lab.)
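The two ranges can be expressed as a small lookup. This is a sketch only: it uses the boundaries quoted above, which vary slightly from lab to lab, and the name region_of is my own.

```python
from typing import Optional

# Ranges as quoted in the text; individual labs use slightly different limits.
REGIONS = {
    "HVS-1": (16001, 16568),
    "HVS-2": (1, 574),
}


def region_of(position: int) -> Optional[str]:
    """Return the hyper-variable segment containing `position`, if any."""
    for name, (start, end) in REGIONS.items():
        if start <= position <= end:
            return name
    return None  # outside both tested segments


for position in (16270, 263, 4580):
    print(position, region_of(position))
```

A location such as 4580 falls outside both segments, so a test limited to these segments says nothing about it.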
Understanding Test Results
The convention for reporting mtDNA results is not difficult, but it requires some explanation:
The lab compares test results to the Cambridge Reference Sequence (CRS). The reference sequence is arbitrary. It is the mtDNA sequence for the first person whose mtDNA was analyzed, not the original sequence for Homo sapiens.
Locations on the DNA molecule are numbered. As a shorthand, the lab uses location numbers, then adds the abbreviation for the nucleotide at each location. The nucleotides are abbreviated as A (adenine), T (thymine), C (cytosine) and G (guanine). For example, in this shorthand 16270T means the nucleotide at location number 16,270 is thymine.
Each nucleotide can only appear with its complement, so the lab reports only one nucleotide at each location. For example, 16270T means that the nucleotide at location number 16,270 is thymine, which is understood to be one side of a base pair composed of thymine and its complement adenine.
The lab reports only differences from the Cambridge Reference Sequence. If the result at a particular location matches the reference sequence, it is not reported. If it is different from the reference, it is reported. For example, a test result of 16270T means the test sample matches the reference sequence, except at location number 16,270. The reference sequence has a cytosine/guanine base pair at this location, but the test subject has thymine/adenine.
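The reporting convention amounts to comparing two sequences location by location and listing only the mismatches. The sketch below assumes substitutions only, and the two ten-letter fragments are made up for illustration; they are not real reference data. The function name report_differences is likewise my own.

```python
def report_differences(reference, sample, first_position=1):
    """List substitutions in the shorthand described above, e.g. '16270T'.

    Only substitutions are handled, so both sequences must be equally long.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length (substitutions only)")
    differences = []
    for offset, (ref_base, sample_base) in enumerate(zip(reference, sample)):
        if ref_base != sample_base:
            # Report the location number and the nucleotide found in the sample.
            differences.append(f"{first_position + offset}{sample_base}")
    return differences


# Hypothetical fragments, numbered so that the sixth letter is location 16270.
toy_reference = "AAGTACAGCA"
toy_sample = "AAGTATAGCA"
print(report_differences(toy_reference, toy_sample, first_position=16265))  # ['16270T']
```

Locations where the sample agrees with the reference produce no output, which is why a result list can be very short.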
My mtDNA Test Results
The test results are:
16270T 16292A 16298C 00072C 00195C 00263G 00309.1C 00315.1C (Haplogroup V)
These codes are shorthand for the mutations in my individual family line. The first three codes, all in HVS-1, mean that my mtDNA matches the reference sequence in that segment, except at those locations. The letters indicate the difference. The reference sequence has 16270C (cytosine/guanine), 16292C (cytosine/guanine) and 16298T (thymine/adenine). In my mtDNA, those locations are 16270T (thymine/adenine), 16292A (adenine/thymine) and 16298C (cytosine/guanine). Only one nucleotide of each base pair is reported, because its complement can be assumed.
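The result line can be split into its parts mechanically. The sketch below is an assumption on my part rather than any lab's software: in particular, it reads the ".1" suffix (as in 00309.1C) in the commonly used way, as a nucleotide inserted after the numbered location, which is a form of mutation not otherwise discussed here. The names Variant and parse_code are my own.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    position: int   # location number on the mtDNA molecule
    insertion: int  # 0 for a substitution; n for the nth inserted nucleotide (assumed meaning)
    base: str       # nucleotide reported at that location


# Digits, an optional ".n" suffix, then one nucleotide letter.
CODE = re.compile(r"^(\d+)(?:\.(\d+))?([ACGT])$")


def parse_code(code: str) -> Variant:
    match = CODE.match(code)
    if match is None:
        raise ValueError(f"unrecognised code: {code!r}")
    position, insertion, base = match.groups()
    return Variant(int(position), int(insertion or 0), base)


results = "16270T 16292A 16298C 00072C 00195C 00263G 00309.1C 00315.1C"
variants = [parse_code(code) for code in results.split()]
substitutions = [v for v in variants if v.insertion == 0]
insertions = [v for v in variants if v.insertion > 0]

for variant in variants:
    print(variant)
```

Under that reading, the line contains six substitutions and two insertions.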
Haplogroup Assignment
I belong to Haplogroup V2, although that was not clear initially. I had my mtDNA tested by Oxford Ancestors in 1999. They got it wrong. The error came to light in 2007 when I was re-tested at Family Tree DNA.
Based on my (erroneous) test results, Oxford Ancestors (1999) predicted that I belong to Haplogroup U5b. Their prediction was problematic. Both Haplogroups H and U match the reference sequence at HVR-1. To distinguish between them, HVR-2 must be tested. I was tested only at HVR-1. Nevertheless, a mutation at 16270 is a defining characteristic (“motif”) of subgroup U5, so it seemed likely that I would be U5. (Not U5b: I do not have a mutation at 16189, which is the motif for U5b.) (See Macaulay, Table of Haplogroup Motifs.)
Family Tree DNA (2007) disagreed, and on the basis of their tests, assigned me to Haplogroup V. Oxford Ancestors then explained, “A mutation at position 270 is characteristic of clade U and a position at 298 is a characteristic of clade V. It was always believed in the early days that as clade U was the more common that it over rode the clade V, but more recent research has in fact confirmed that this is not in fact the case and the a [sic] mutation at position 298 is the defining one and you are therefore more correctly assigned to clade V and indeed this is where we would now place you.” (Personal Communication, October 11, 2007).
My haplogroup assignment could change again, slightly. It is not possible to distinguish between Haplogroups Pre-V and V on the basis of results only from HVR-1 and HVR-2. About 23% of those assigned to Haplogroup V actually belong to Haplogroup Pre-V.
Haplogroup V evolved from Haplogroup Pre-V, which evolved from Haplogroup HV. HV has 14766C, which matches the reference sequence (which is in Haplogroup H). Mutations 16298 T>C and 00072 T>C define Pre-V. Then 15904 C>T leads further into Pre-V. Finally, 04580 G>A defines Haplogroup V.
The full motif for Haplogroup V is:
16298C, 00072C, 04580A, 14766C, 15904T
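To see how much of this motif my own results can confirm, the motif can be compared with the reported substitutions. This is a sketch under stated assumptions: the reference nucleotides are the ones given in the text as I read it, the tested ranges are the HVS-1 and HVS-2 boundaries quoted earlier, and the name check_motif is my own.

```python
# Motif for Haplogroup V as listed above (location -> nucleotide).
MOTIF_V = {16298: "C", 72: "C", 4580: "A", 14766: "C", 15904: "T"}

# Reference nucleotides at those locations, as described in the text.
REFERENCE = {16298: "T", 72: "T", 4580: "G", 14766: "C", 15904: "C"}

# Segments actually tested (boundaries vary slightly from lab to lab).
TESTED_RANGES = [(16001, 16568), (1, 574)]

# Substitutions reported in my results.
REPORTED = {16270: "T", 16292: "A", 16298: "C", 72: "C", 195: "C", 263: "G"}


def check_motif(motif, reported, reference, tested_ranges):
    """Label each motif location as 'match', 'mismatch' or 'untested'."""
    status = {}
    for position, wanted in motif.items():
        if not any(low <= position <= high for low, high in tested_ranges):
            status[position] = "untested"
            continue
        # An unreported location inside a tested range matches the reference.
        actual = reported.get(position, reference[position])
        status[position] = "match" if actual == wanted else "mismatch"
    return status


for position, state in check_motif(MOTIF_V, REPORTED, REFERENCE, TESTED_RANGES).items():
    print(f"{position:05d}: {state}")
```

With these inputs, 16298 and 00072 come back as matches and the other three locations as untested, which is consistent with the point that HVR-1 and HVR-2 results alone cannot separate Pre-V from V.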
According to Whit Athey, when Family Tree DNA does RFLP tests for Haplogroup V, they check 04580, 07028 and 14766. 04580A defines Haplogroup V, but 14766C assures that the haplotype lies somewhere in the HV complex, and 07028T confirms that it is not in H.