The Evolution of the Gordon Surname:
New Insight From Y-DNA Correlations and Genealogical Pedigrees
Tei A. Gordon and William E. Howard III
Abstract
In the R1b1 haplogroup, the ISOGG time estimates, the RCC time scale, the Y-DNA evidence and our results
are consistent with an origin of the Gordon surname in areas near modern Turkey
and Greece. Comparison of the ISOGG dates with those determined using the RCC
time scale shows good agreement between the RCC- and ISOGG-derived estimates.
The times derived from the RCC matrix for the early
migrations of the I1 haplogroup into the British
Isles from Scandinavia and from Western Europe agree well with the history of
the area derived from archaeological excavations, genetics and anthropologic
studies.
Address for correspondence: Tei Gordon, [email protected]
Received:
Introduction
The
power of the correlation approach will be demonstrated as well as the degree to
which genealogical pedigrees contribute to the analysis. The limitations of the
approach will be explored.
The Gordon DNA Project accepts results from various labs; Kit Numbers referenced in this paper are
from Family Tree DNA, which accounts for the bulk of the testees in the project[2]. All identifiable information has
been removed to protect their privacy.
The
Gordon surname was chosen as the focus of the analysis for reasons that included:
We
decided to do the groupings by pedigree and marker string comparisons
separately from the cluster groupings using RCC matrix analysis. By keeping the
processes scrupulously separate, we could better compare the two group/cluster
determinations, lending more credibility to the final comparisons of the two
approaches. In this paper we refer to Gordon families derived by pedigree
comparisons and traditional haplotype matching as groups. We refer to
Gordon families derived by RCC matching as clusters.
The favorable outcome of using the correlation approach in conjunction with good
pedigrees strongly suggests that the methodology can be applied with success
to other surnames under similar selection criteria.
ONOMASTICS OF THE GORDON SURNAME[4]
Though theories on the origin of surnames abound for almost every Scottish family,
a rare combination of tradition, legend and references by some of the most respected
names of antiquity gives Y-DNA and RCC an opportunity to help test these theories
and unravel the onomastics of the modern Gordons in Scotland.
Early Romans were using up to four names during the expansion of their empire.
The practice of passing one of their names on to their children no doubt left its
mark in the territories they conquered. Such naming traditions were likely
brought back by crusaders returning from the Holy Land.
In the 12th century given and family names were already in widespread use,
first by the aristocracy, nobility and the wealthy, followed by the merchant classes.
By the 18th century, hereditary surnames had become the norm among the general
population, which had previously used a single given name. It was common Western
European practice to adopt a surname based upon one’s location, occupation,
patronymic name or even a physical characteristic.
According to early-19th century Scottish
historian George Chalmers in his series Caledonia, “the founder
of the Gordon family came from England in the reign of David the First
(1124-53), and was granted the lands of Gordon (anciently Gordun, or Gordyn or
the Gaelic Gordin, "on the hill")” from which the family may have derived its
name (Chalmers (1742-1825)).
Yet other ancient historians, and even Chalmers in his own research, indicate
that the Gordons likely already had their surname before arriving in Scotland.
Chalmers surmises that it possibly came from France, and from Macedonia before
that, and that the family gave its name to the town of Gordon in the borderlands.
Thus, Chalmers acknowledges that prior to Scotland, the Gordons may
have originated in Gordonia in Macedonia and migrated to ancient Gaul, with their
seat in the present-day city of Gent, Belgium (Chalmers (1742-1825)).
Chalmers indicates in his writings that generally the
eleventh and twelfth century Normans and other foreigners who came to Scotland
and England had no family name when they received lands from William the
Conqueror or from other noble families as many Scottish families did.
Extending this generalization to all early Scottish families, however, is perhaps
going too far. Allowing for a little phonetic flexibility reflecting the diversity
of the period (Gordon versus Gourdon, Gordoun, Gordun, de Gordoun, de Gourdon,
Gurdon, etc.), changing geographic boundaries (medieval Normandy), as well as the
origin of the base of arms of the Gordons (stars and boars’ heads) and badges (Ivy),
the Gordon researcher learns that there are plenty of clues to an older family
history that RCC can help reveal. Throughout this paper, we
follow the spelling variation given in the original source.
Among the Clans, a few larger Scottish families enticed fragmented, smaller clans
to align interests during economic downturns by offering food in exchange for
changing their surname. For military families, such as the Gordons, this was an
especially useful means of increasing the ranks during the tumultuous 15th to 18th
centuries; as a result, many Gordons were sometimes referred to as the Bowl o’ Meal
Gordons (Dickens 1887, p. 58).
Leveraging the internet to access voluminous libraries of books, and by
continuously monitoring new Y-DNA results, while applying new and evolving tools such as the
RCC technique of analysis, we find that we may be able to finally prove some of
these ancient traditions, theories and early citations.
To date, no studies have focused upon the time
relationships between the haplogroups of the Gordons (R1b1, I1 and I2
groupings), nor has any previous study revealed the relationships of the Gordons and their
subclusters.
DETERMINING GORDON HAPLOTYPE GROUPINGS USING TRADITIONAL MARKER AND PEDIGREE COMPARISONS
As is likely the case in many surname projects,
the current groupings for the Gordon surname DNA project were produced by the
following steps:
DESCRIPTIONS OF THE GORDON PEDIGREE BRANCHES
Historians note that, before the Scottish Gordon families, the French families
of de Gourdon were already using the ivy leaf as their badge and the same base of
arms as their Scottish cousins, and they continue to do so to this day. This lends
credibility to the French and Norman origins of the Gordons (the R1b and I1
branches, respectively).
In line with a descent from the French
and/or Norman de Gourdons, the following tree outlines the generally
accepted Gordon family history, highlighting its three major branches[5]:
1. Jock & Tam Gordon branch
2. Sir William Gordon branch
3. Seton-Gordon, the Ducal line branch
Descriptions of these branches follow.
The Jock and Tam Gordon Branch
This is the largest, oldest documented branch
of the Gordons, dating back to the 11th century and the first recorded
and single male progenitor of modern Gordon lines in the Gordon surname DNA project
– the Laird of Gordon – credited as the
founder of the House of Gordon in Scotland. Given the
limitations of FTDNA algorithms, this large group cannot be broken down into
sub-groups. Several kits have
verifiable documented pedigrees among those in the Jock and Tam branch.
John “Jock” Gordon of Scurdargue and Thomas
“Tam” Gordon of Ruthven (Jock and Tam) represent one of two unbroken lines of
Gordon males back to the Laird of Gordon (died at the Battle of the Standard, 1138);
this line held its seat in the Highlands.
The Jock & Tam Gordons belong to the I1
Haplogroup, typically found in northern Scandinavian countries, which supports
the theory that this branch of Gordons came from Normandy to Scotland.
The Sir William Gordon Branch
This branch is documented to have branched from
ancestors of the above Jock & Tam Gordons back in the 13th century (Bulloch, op cit). Prior to this analysis, it was thought more than likely that some
of the kits currently listed under the above Jock & Tam Branch actually
belong in this branch. Several other testees in this group and of the R1b1
Haplogroup have family claims of descending from this line, but the I1 Haplogroup
would seem to support generally accepted Gordon family history. Therefore, the
R1b1 sub-grouping is likely not in an unbroken line. One explanation could be that a Seton-Gordon
married into the R1b1 branch of the Sir William Gordon family, thereby producing
this anomaly. The benchmark kits for this branch are Kit Nos. 89515 and 93333.
Both have well-documented pedigrees[6].
We have yet to determine a unique genetic sequence
for the Sir William Gordon descendants, as the results are almost identical to
those of the Jock & Tam Gordon descendants. However, as we see later in
this paper, RCC analysis has revealed a branching of the Gordon Cluster AA
cluster at around 1300 CE, about the time of the
branching of this Sir William Gordon line. As further testees with
documentation back to each of these lines are found, we should eventually be
able to determine unique sequences for each branch.
The Seton-Gordon Branch
The single verifiable pedigree for the Seton-Gordon
branch is Kit No. 35045. As will be seen in a later section, this kit has not been
included in the RCC analysis because it closely matched only one other test
kit and thus did not qualify as a major Gordon RCC cluster member.
Small Groupings of Gordons
Groups of two or more
testees with the Gordon surname that match each other but that do not match one
of the three major Gordon branches within the last 1,000 years (modern origin
of the Gordon surname) have been included in this section. These Gordons may yet
ultimately be attached to one of the 150+ Gordon branches. However, members of
this grouping should also consider the possibility of an undocumented
non-paternal event in their family history.
Gordon Septs
Not all Gordons were
Gordons by birth. Many were Gordons by bond, pledging allegiance to the House
of Gordon, some even taking the Gordon surname.
There are 136 unique
haplotypes and 45 different surnames/Septs in the House of Gordon. Definitions of a Sept are
ambiguous at best and open to interpretation. Since no documented ties have
been found between Gordons and their Septs, one of the goals of the DNA project
is to determine whether there may sometimes be a genetic link between any of
its Septs with one of the three major Gordon pedigree branches
within a genealogically significant timeframe.
All Septs, even if
matching any Gordons, are listed in the project only on the Sept section of the
webpage and a separate notation is made when a match with any Gordon is
determined.
Ungrouped Gordons
Gordon testees who do not genetically match any other Gordons
for the last 1,000+ years are included in this grouping. Furthermore,
documentation for these singular Gordon testees does not indicate any relation to other Scottish Gordon branches or Gordon Septs, and often the documentation does not extend more than a few hundred years, or about
16 generations. In other words, these independent testees do not have a linear paternal
heritage, genetic or documented, leading to the progenitors of the three major
branches.
Since there are a total
of 150+ documented branches of the Gordons, including the three major branches,
it is reasonable to expect new genetic branches will be found. However, we must
also consider the possibility of a non-paternal event, such as a male from a
closely aligned family of a different surname, marrying into the Gordons and
assuming the Gordon surname, similar to the Seton-Gordons. Likewise, adoption
or another undocumented non-paternal event must also be considered.
It should be expected
that ungrouped testees will ultimately match other testees and be moved to
the Small Gordon Groupings, which include two or more matching kits, or
that a non-paternal event will be uncovered.
THE RCC CORRELATION MATRIX ANALYSIS: BACKGROUND AND APPLICATION TO A GORDON SURNAME RCC MATRIX
A. The Evolution of Surname Clusters Inferred from the Distribution of Intercluster Data
A1. Background
To
understand the relationship between the RCC matrix and the evolution of surname
clusters and how interclusters in the matrix add to that understanding, the
following observations are important:
Appendix A outlines a useful method for identifying clusters in an RCC matrix.
A2. How Surname Evolution Produces the RCC Matrix
A schematic matrix of a surname cluster and the intercluster regions of a pair of
surname clusters were presented in Figures 1A and 1B of Howard (2009b). A
method was developed to estimate the time to the common ancestor (TCA) of a
cluster from the values of RCC of pairs of cluster members1. In addition, a method to estimate the time to
the (earlier) common ancestor of a pair of surname clusters (i.e., the TCA of
the two cluster CAs) was developed using the averages
of the RCC of the cluster members in the intercluster region. This section
extends the rationale of Figure 5 of Howard (2009b). It starts with a schematic
evolutionary diagram of a hypothetical group of surname clusters, A, B, …, F, shown in the following figure.
Figure
1: RCC Values of the Intersections of Six Hypothetical Surname Clusters (Upper
Graph a) and How the RCC Values Appear in the RCC Intercluster Matrix (Lower
Matrix b)
Consider the schematic evolutionary upper plot (a). The RCCs of the TMRCA of surname
Clusters A, B, …, F are
plotted at the bottom of the diagram in the interval between RCC 0 and 10. The
earliest CA pair is AF at RCC=50, where the MRCA of Cluster A shares an MRCA with
Cluster F. The common ancestor at AF has the “starting” haplotype, which then
experiences marker changes as its lines evolve down the diagram on various
evolutionary paths from RCC=50 to the present.
The
downward connecting arrows in the diagram show the evolution of the clusters. Although
AF is the oldest of the paired interclusters, and the lines of both Cluster A
and Cluster F can be traced backward in time from the bottom of the diagram up
to AF, we cannot tell which of the two clusters is the older since the
schematic contains no information that would indicate their relative ages.
The
CA of each pair of clusters is given on the graph as AB, DE,
…. Clusters A and B will each have different most recent common
ancestors. The common ancestor of those two common ancestors will appear
at AB, etc. This is the MRCA of the Intercluster AB. Its haplotype will be the “starting”
haplotype for the two lines that evolve to Clusters A and B. We have
arbitrarily plotted the intercluster CAs of AB and DE at RCC = 20. The CA of
Clusters D and F will appear at DF, and so on up the graph.
As
time passes, the evolutionary lines (viz., the varying haplotypes) to Clusters
A and F evolve down the timelines to the left and to the right, respectively.
As this evolution takes place, the line to Cluster A develops a Subcluster B
which eventually produces a progenitor at AB who is the common ancestor of the
CA of both Cluster A and Cluster B. While that evolution takes place, the line
to Cluster F spins off the Cluster C progenitor (at RCC=40), then Cluster D’s
progenitor (at RCC=30). The line to Cluster C then evolves directly to the present,
while the line toward Cluster E spins off Cluster D (at RCC=20). From the
figure, the rank order of surname cluster age, from oldest to youngest, is
A&F, C, D, and finally B&E.
This schematic evolutionary sequence is mapped into the intercluster matrix in
Figure 1 (b). The boxed intersections AF, CF, DF, …, AB and DE
have the RCCs indicated at the row and column intersections of A and
F, C and F, D and F, etc. in the graph. The entries for the remaining intersections are
the RCC values where the two cluster lines intersect. Thus, for example, the
entries for intersections AC and AD are both at the intersection AF (at RCC=50)
and the entries for intersections CE and EF are at the intersections CF
and DF, respectively.
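The mapping from the schematic diagram to the matrix can be illustrated with a short sketch. It assumes one reading of Figure 1(a), built only from the junctions named in the text (AB and DE at RCC=20, DF at 30, CF at 40, AF at 50); the dictionary layout and function names are illustrative and are not taken from the paper.

```python
from itertools import combinations

# Assumed reading of Figure 1(a): each node points to the junction above it.
# Junction names and RCC heights follow the text; the layout is illustrative.
PARENT = {
    "A": "AB", "B": "AB",
    "D": "DE", "E": "DE",
    "DE": "DF", "F": "DF",
    "C": "CF", "DF": "CF",
    "AB": "AF", "CF": "AF",
}
NODE_RCC = {"AB": 20, "DE": 20, "DF": 30, "CF": 40, "AF": 50}


def ancestors(node):
    """Return the junctions above `node`, ordered from most recent to oldest."""
    path = []
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def intercluster_rcc(x, y):
    """RCC of the most recent junction shared by the lines to clusters x and y."""
    above_y = set(ancestors(y))
    for node in ancestors(x):
        if node in above_y:
            return NODE_RCC[node]
    raise ValueError("clusters share no junction")


# One entry per intercluster intersection of the schematic matrix.
matrix = {pair: intercluster_rcc(*pair) for pair in combinations("ABCDEF", 2)}
for (x, y), rcc in matrix.items():
    print(f"{x}{y}: RCC = {rcc}")
```

Under this reading, AC and AD inherit the RCC of AF, while CE and EF inherit the RCCs of CF and DF, as described in the text.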
The
challenge, described in the next section, is to turn the analysis around, and
derive the upper evolutionary plot in Figure 1 from an RCC matrix that results
from a correlation of pairs of Y-DNA testee results.
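As a rough illustration of what turning the analysis around involves, the sketch below recovers the junction order from the schematic matrix values described above. It uses generic single-linkage agglomerative clustering, which is a stand-in chosen for this example and not the procedure used in this paper; with real data the intercluster averages carry uncertainties that this sketch ignores.

```python
from itertools import combinations

# Intercluster RCC values of the schematic matrix, as described in the text.
RCC = {
    "AB": 20, "AC": 50, "AD": 50, "AE": 50, "AF": 50,
    "BC": 50, "BD": 50, "BE": 50, "BF": 50,
    "CD": 40, "CE": 40, "CF": 40,
    "DE": 20, "DF": 30, "EF": 30,
}


def pair_rcc(x, y):
    """Look up the RCC of two single clusters regardless of their order."""
    return RCC["".join(sorted(x + y))]


def linkage(g1, g2):
    """Smallest RCC between any member of group g1 and any member of g2."""
    return min(pair_rcc(x, y) for x in g1 for y in g2)


def junctions(clusters):
    """Merge groups from the most recent junction back to the oldest one."""
    groups = [frozenset(c) for c in clusters]
    merges = []
    while len(groups) > 1:
        g1, g2 = min(combinations(groups, 2), key=lambda p: linkage(*p))
        merges.append((linkage(g1, g2), "".join(sorted(g1 | g2))))
        groups = [g for g in groups if g not in (g1, g2)] + [g1 | g2]
    return merges


for rcc, members in junctions("ABCDEF"):
    print(f"RCC = {rcc}: lines to {members} share a common ancestor")
```

Reading the output from last to first gives the schematic diagram from the top down: the oldest junction joins all six clusters, and the most recent junctions join A with B and D with E.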
B. Derivation and Analysis of RCC Matrix Parameters for Gordon Clusters
In April 2010, the cutoff date of this analysis, 242 Y-DNA results were available
in the Gordon surname project2. We selected only those results where testees
had been tested at 37 or more markers, and we used only the first 37 markers to
form the correlation matrix and then the RCC matrix (Howard 2009a).
This
process narrowed the analysis to 187 individuals from which we were able to
group 119 testees (64%) into well-defined Gordon clusters and subclusters
(viz., clusters within a cluster) in the RCC matrix. Of this RCC grouping, 104
(87%) were later matched with one of ten specific Gordon pedigree lines of
which one category was “ungrouped”. Pedigree lines for the remaining 68 testees
were available, but they were not assigned to specific Gordon clusters using
the RCC matrix.
B1. The Histogram of the RCC Matrix for All Gordon Testees
The first step in an RCC matrix analysis is to assess the distribution of surname
pairs. Figure 2 presents a histogram of the entire RCC Gordon matrix before
the testees were grouped into clusters.
Figure
2: Gordon Surname Histogram (187 individuals, each with 37 markers tested)
This
histogram shows three prominent peaks. The first peak results from pairs of
cluster members in subclusters and clusters; each pair is in the same
haplogroup, but different haplogroups are present. The second peak is composed
of pairs of testees who are either in different clusters or who are not in a
cluster, but who are in the same haplogroup. The third peak is composed of
pairs of testees who belong to different haplogroups.
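The interpretation of the three peaks amounts to sorting every testee pair into one of three categories. A minimal sketch follows; the kit labels and assignments are hypothetical and serve only to show the rule.

```python
from itertools import combinations

# Hypothetical testees: label -> (haplogroup, cluster, or None if unclustered).
TESTEES = {
    "kit1": ("I1", "A"),
    "kit2": ("I1", "A"),
    "kit3": ("I1", "C"),
    "kit4": ("R1b1b2", "H"),
    "kit5": ("R1b1b2", None),
}


def peak_for_pair(a, b):
    """Name the histogram peak to which the pair (a, b) would contribute."""
    hap_a, clu_a = TESTEES[a]
    hap_b, clu_b = TESTEES[b]
    if hap_a != hap_b:
        return "third"   # different haplogroups
    if clu_a is not None and clu_a == clu_b:
        return "first"   # same cluster or subcluster
    return "second"      # same haplogroup, different cluster or unclustered


counts = {"first": 0, "second": 0, "third": 0}
for a, b in combinations(TESTEES, 2):
    counts[peak_for_pair(a, b)] += 1
print(counts)
```

With only five hypothetical testees the counts are tiny, but the same rule applied to all pairs of a full matrix would separate the pairs that populate each peak.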
B2. Identification of Gordon Clusters from the Gordon RCC Matrix
The clusters we found ran the gamut from sparse to well-populated. Since we
wanted a number of well-populated examples to afford us good
statistical samples, we chose to study only major clusters, which we defined as
RCC groupings that contain at least four different testees, so that at
least 6 pairs ((4 x 3)/2 = 6) would be available for comparison. This process
led to the identification of a reasonable sample of 10 major Gordon Clusters: A,
C, D, and E (in Haplogroup I1), H, K, L, Q and T (in Haplogroup R1b1b2) and
Cluster G (in Haplogroup I2b1). The members of each major Gordon cluster are
given in Appendix B.
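The pair counts used in this selection rule follow from n(n-1)/2. A small sketch of the arithmetic, with helper names chosen only for illustration:

```python
from math import comb, isqrt


def pair_count(n):
    """Number of distinct testee pairs among n testees: n(n-1)/2."""
    return comb(n, 2)


def members_from_pairs(pairs):
    """Invert n(n-1)/2 = pairs; raise if `pairs` is not a valid pair count."""
    n = (1 + isqrt(1 + 8 * pairs)) // 2
    if comb(n, 2) != pairs:
        raise ValueError(f"{pairs} is not of the form n(n-1)/2")
    return n


print(pair_count(4))    # minimum size of a major cluster
print(pair_count(47))   # size of Gordon Cluster A
print(pair_count(187))  # all testees in the 37-marker RCC matrix
```

The inverse helper is useful when only a pair count is reported and the number of members has to be recovered from it.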
After
averaging the RCCs of the individual members of each cluster and each
intercluster region, and after determining the standard deviations (SD) of
their means, we get the results given in Table 1. Entries along the diagonal
show the average RCC of each Gordon cluster and the SD of that average. The
averages of intercluster pairs are listed in their intercluster intersections
above the diagonal and the SDs of their means are listed at the appropriate
intersection below the diagonal.
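The construction of the diagonal and off-diagonal entries can be sketched as follows. The RCC values and testee labels are hypothetical, and the sketch assumes that "SD of the mean" denotes the sample standard deviation divided by the square root of the number of pairs, which is not spelled out in this section.

```python
from itertools import combinations, product
from math import sqrt
from statistics import mean, stdev

# Hypothetical RCC values for pairs of testees (not data from the paper).
PAIR_VALUES = {
    ("a1", "a2"): 4.0, ("a1", "a3"): 6.0, ("a2", "a3"): 5.0,
    ("c1", "c2"): 8.0,
    ("a1", "c1"): 46.0, ("a1", "c2"): 49.0,
    ("a2", "c1"): 47.0, ("a2", "c2"): 50.0,
    ("a3", "c1"): 48.0, ("a3", "c2"): 48.0,
}
RCC = {frozenset(pair): value for pair, value in PAIR_VALUES.items()}
CLUSTERS = {"A": ["a1", "a2", "a3"], "C": ["c1", "c2"]}


def summarize(values):
    """Return (mean, SD of the mean); the SD is None for a single value."""
    if len(values) < 2:
        return mean(values), None
    return mean(values), stdev(values) / sqrt(len(values))


def cluster_entry(name):
    """Diagonal entry: pairs of testees within one cluster."""
    pairs = combinations(CLUSTERS[name], 2)
    return summarize([RCC[frozenset(p)] for p in pairs])


def intercluster_entry(name1, name2):
    """Off-diagonal entry: one testee from each of two clusters."""
    pairs = product(CLUSTERS[name1], CLUSTERS[name2])
    return summarize([RCC[frozenset(p)] for p in pairs])


print("A:", cluster_entry("A"))
print("C:", cluster_entry("C"))
print("AC:", intercluster_entry("A", "C"))
```

A cluster with a single pair yields no SD in this sketch, which is one reason sparsely populated clusters give poorly constrained entries.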
TABLE
1: Average Values of RCC for Gordon Clusters and Interclusters, Identified by
Haplogroup, and their Standard Deviations
Gordon Cluster A contains 47 testees, resulting in (47 x 46)/2 = 1081 testee pairs. This
cluster is a special case since it contains subclusters Aa, Ab, …, Af.
Most of these subclusters are sparsely
populated. Subclusters are important because the RCC values of members paired
with other members of the subcluster indicate that their TMRCAs probably fall
within the time range of available pedigrees.
While
this study concentrates on the major clusters, we kept the subclusters because
the RCC values of their intercluster intersections with the major clusters
might give us additional insight into cluster evolution. Further information on
these Gordon subclusters, clusters, and interclusters can be found in Appendix
B.
C. Locating the Points in the RCC Matrix that Share Identical Common Ancestors
In
the schematic RCC matrix (Figure 1b) there are points that share identical
common ancestors (e.g., the common ancestor at CF and RCC=40 is the same for CD
and CE). By inspection, we recognize in Table 1 that many interclusters have
average values of RCC that are nearly identical. Those points are the leading
candidates where the common ancestors of Gordon clusters are shared. Scarcity
of data causes uncertainties in those averages, which often vary greatly,
producing problems of interpretation. To meet the challenge of mapping the
results of Table 1 into an evolutionary diagram, we must identify those
junction points – the times when the progenitor of a new cluster line was
formed from an existing cluster line.
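One simple way to flag candidate junction points is to chain together intercluster averages whose error intervals overlap. The sketch below does this with hypothetical averages and SDs; the chaining rule and the choice of two SDs are assumptions made for illustration, not the inspection procedure used in this paper.

```python
def group_by_overlap(points, k=2.0):
    """Group points whose k-SD intervals overlap the running group interval.

    points: list of (label, mean_rcc, sd). Returns a list of label groups,
    ordered from the smallest to the largest average RCC.
    """
    groups = []
    upper = None
    for label, avg, sd in sorted(points, key=lambda p: p[1]):
        low, high = avg - k * sd, avg + k * sd
        if groups and low <= upper:
            groups[-1].append(label)
            upper = max(upper, high)
        else:
            groups.append([label])
            upper = high
    return groups


# Hypothetical intercluster averages and SDs of the mean (illustrative only).
POINTS = [
    ("AD", 52.6, 0.5), ("CE", 52.1, 0.6),
    ("AC", 47.5, 0.6), ("AE", 48.0, 0.7), ("DE", 47.9, 0.7),
    ("CD", 65.0, 1.0),
]
for group in group_by_overlap(POINTS):
    print(" & ".join(group))
```

Widening k merges more points into a shared junction and narrowing it splits them, so the choice of k is a direct expression of how much weight is given to the error bars.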
There
are three haplogroups to which the main Gordon clusters belong. In Table 1 they
are identified by different colors. The two inner boxes in the table contain
the clusters in Haplogroup I1 and R1b1b2. Only one major cluster, Gordon G, was
identified in Haplogroup I2b1.
To
simplify forming an evolutionary diagram, we treat each of the three
haplogroups separately. Figure 3 shows plots of the average value of RCC for
each of the intercluster CAs for Haplogroups I1 and R1b1b2, taken from Table 1.
The SD of the mean of each point is given by the error bars.
Figure
3: The Average Value of RCC for Gordon Intercluster Pairs in Haplogroups I1 and
R1b1b2.
These
two plots strongly suggest that the common ancestors of several interclusters
lived at the same time[7].
Table 2 gives the details.
Table 2: Common Ancestor Locations of Gordon Intercluster Pairs1
Haplogroup | Intercluster Pair | RCC | SD | Years Ago | Date (BCE)
R1b1b2 | HK & KT | 121.1 | 1.8 | 5250 | 3300
R1b1b2 | KL & KQ | 91.7 | 1.1 | 4000 | 2000
R1b1b2 | HQ, LT, HT & QT | 52.3 | 0.6 | 2250 | 300
I1 | AD & CE | 52.4 | 0.4 | 2250 | 300
I1 | AC, AE, & DE | 47.8 | 0.4 | 2100 | 130
D. Gordon Evolutionary Diagrams
Within
each haplogroup we start with the oldest pairs of clusters that appear in the
intercluster regions of Table 1. We plot the pairs from the oldest to the
youngest in time, taking into account the locations of the Gordon intercluster
pairs in Table 2. Each time a new cluster appears in the
evolutionary track, we interpret it to mean that one of the original pairs has
split off, producing the progenitor of a new line, which then proceeds to
evolve, through mutations, down the evolutionary diagram to the present.
The results are presented in Figures 4A, 4B and 4C. There are two RCC breaks in
Figure 4A and one in Figure 4B. The dates below and above the lower breaks use
the factors 43.3 and 52.7 in computing the dates for the common ancestors of
the clusters and interclusters, respectively (Howard 2009a). The upper break in
Figure 4A reduces the space needed to present the figure.
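The conversion from RCC to a time estimate is a single multiplication. The sketch below is an illustration under stated assumptions: it takes the conversion factor as a parameter (43.3 or 52.7, as quoted above), counts years back from 1945, the assumed average birth year of the testees given later in the paper, and uses the paper's 27 years per generation. The function name and return layout are not from the source.

```python
REFERENCE_YEAR = 1945        # assumed average birth year of the testees
YEARS_PER_GENERATION = 27    # generation length assumed in the paper


def rcc_to_time(rcc, factor=43.3):
    """Convert an RCC value to (years ago, calendar year, generations).

    A negative calendar year denotes BCE; the absence of a year zero is
    ignored, which is negligible at the precision of these estimates.
    """
    years_ago = rcc * factor
    year = REFERENCE_YEAR - years_ago
    generations = years_ago / YEARS_PER_GENERATION
    return years_ago, year, generations


for label, rcc in [("HK & KT", 121.1), ("KL & KQ", 91.7), ("AC, AE & DE", 47.8)]:
    years_ago, year, generations = rcc_to_time(rcc)
    era = "BCE" if year < 0 else "CE"
    print(f"{label}: about {years_ago:.0f} years ago, "
          f"{abs(year):.0f} {era}, {generations:.0f} generations")
```

With the factor 43.3, the results land close to the rounded values listed for these intercluster pairs in Table 2.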
Figure
4A and 4B: The Evolutionary Diagrams of the Major Gordon Surname Clusters and
Interclusters in Haplogroups I1 and R1b1b2
Haplogroup I1:
The intercluster value of Gordon Clusters C and D appears at RCC ~ 65, indicating
that they shared a common ancestor (CA) about 2800 years ago. The SDs of these
age estimates are indicated by the green zones around each plotted point. We do
not know which cluster is older; we know only the location of their CA.
Clusters C and D evolved into Clusters A and E at approximately the same time,
2300 years ago. From that time, the Gordon Clusters E, A, D and C evolved
separately, with the TMRCAs of those clusters appearing in about 1050, 1360,
and 1580 CE, respectively. Cluster A has formed subclusters, among which are Aa and Ae. Their TMRCA as an intersubcluster lived about 1630,
and the individual subclusters have CAs that are much
more recent, in the 19th century.
Haplogroup R1b1b2:
There are three Gordon clusters, H, K and T, that have a CA who lived about 5000
years ago (RCC ~ 118). Again, we do not know which cluster is the oldest.
Gordon Clusters L and Q evolve along Cluster K’s evolutionary line and they
appear at RCC ~ 92, 4000 years ago. Cluster K then evolves directly to its CA
at about 1500 CE. The earlier evolutionary lines of H, K, L and Q evolve to the
shared intercluster locations of HT, HQ, LT and QT about 2250 years ago, or
about 300 BCE. From there, the lines H and L evolve to RCC ~ 40, 1700 years
ago, when the CA of Intercluster HL lived. From there the two lines evolve to
their own CAs who lived near 1000 CE. The lines to Clusters Q and T evolve from
their common intercluster ancestors at RCC~52 to their own CAs in the 17th and
8th centuries, respectively.
Major
Branching within Haplogroups I1 and R1b:
Figures
4A and 4B show that major branching occurred in both Gordon haplogroups at RCC
= 52, about 2250 years ago. This date, near 333 BCE, was a tumultuous time when Alexander first invaded Gordion (near Ankara,
Turkey) in Anatolia, then Gaul from Thrace, followed by the Romans.
Between 800 and 1000 CE, between the times of Charlemagne and William the
Conqueror, R1b1 seems to have undergone considerable branching, corresponding to
tumultuous times in French, English, and Scottish history.
Given the relatively small number (~1%) of I1 in Anatolia, the probability increases
that the paternal Gordon ancestors originating in Macedonia
and present-day Turkey are R1b1 (Cinnioğlu et al, 2004).
Haplogroup
I2b1 (Gordon Cluster G):
The
intercluster regions between Haplogroups R1b1b2 and I2b1 occur much earlier in
time. In Figure 4C we plot all the positions of the three haplogroup
intersections with Gordon Cluster G.
Figure
4C: The Evolutionary Diagram of the Major Gordon Surname Cluster G (Haplogroup
I2b1) and Its Intersections with Other Gordon Cluster Haplogroups
The
common ancestor of the interclusters of Gordon G, H and T is located at RCC ~
405, or about 17,500 years ago. Since Clusters H and T (red in the figure) are
in Haplogroup R1b1b2, these two haplogroups had a common ancestor at least this
far in the distant past. As evolution took place, the CAs of the interclusters
of Gordon Clusters G, L and K appeared at RCCs ~ 345, or 15,000 years ago.
Later, the CAs of the interclusters of Gordon Clusters G and Q appeared at RCCs
~ 290, or 12,500 years ago. As the line to Cluster G evolved, it intersected
with Haplogroup I1 (yellow in the figure) where it shared a common ancestor
with Cluster A at RCC ~265 (9500 BCE), with Cluster D at RCC ~ 245 (8600 BCE),
and with shared CAs of Clusters C and E at RCC ~220 (7600 BCE)1.
E. Time Relationships Between Gordon Haplogroups I1 and R1b1b2
Figure
5: RCC-Derived TMRCAs Among Gordon Clusters of
Haplogroups I1 and R1b1b2
Time
Relationships Derived from the Gordon RCC Matrix Results
Insight into the evolutionary relationships among the Gordon clusters of Haplogroups I1
and R1b1b2 can be gained from a study of Figure 5. In the figure, the RCC time
scale is given at the far left of the diagram. The next column lists pairs of
Gordon clusters that belong to different haplogroups. The bottom row in the
Figure gives today’s haplogroup designation of the Gordon clusters E, K, A, …, L.
Haplogroup I1 clusters are colored yellow; Haplogroup R1b1b2 clusters are
colored red.
At
the top left of the figure, the intersection EK at RCC~292 is where the present
Gordon Clusters E and K shared a most recent common ancestor, 12,600 years ago.
This is the earliest common ancestor resulting from a pairing of a cluster in
Haplogroup I1 with a cluster in Haplogroup R1b1b2. Their common ancestor’s date
is determined by finding the average RCC of the intercluster region between
Gordon Clusters E and K.
At
RCC~292 that common ancestor has the progenitor haplotype of what will be
Clusters E and K, and an assignment of a haplogroup to them at that time
would be meaningless. Only as their haplotypes begin to evolve downward in the
diagram to the present time do their haplogroup assignments become meaningful.
Evolution
has separated the cluster pairings into six distinct RCC intervals – at RCC~
292, 276, 248, 216, 192 and 176. The green zones surrounding each set of cluster
pairs represent two standard deviation error bars and the figure shows that
most of the clustered pairings are in distinct groups. For example, the common
ancestor of the cluster pairs of Gordon CT, AT, CK and ET all share the same
haplotype at RCC~248, about 10700 years ago, or 8800 BCE on the corresponding
date scale at the far right of the figure. The four sigma range of uncertainty
in RCC for these pairs goes from 244-252, or (252-244) x 43.3= 350 years1.
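The width calculation above can be written out as a short sketch. It assumes the same factor of 43.3 used in the text for this range; the function names are illustrative only.

```python
def rcc_width_in_years(rcc_low, rcc_high, factor=43.3):
    """Width in years of an RCC interval, using a linear conversion factor."""
    return (rcc_high - rcc_low) * factor


def two_sigma_bounds_in_years(mean_rcc, sd, factor=43.3):
    """Lower and upper age bounds (years ago) of a mean RCC plus or minus 2 SD."""
    return (mean_rcc - 2 * sd) * factor, (mean_rcc + 2 * sd) * factor


width = rcc_width_in_years(244, 252)
younger, older = two_sigma_bounds_in_years(248, 2)
print(f"four-sigma width: about {width:.0f} years")
print(f"age bounds: about {younger:.0f} to {older:.0f} years ago")
```

Because the conversion is linear, the uncertainty in years scales directly with the SD in RCC, so a pairing with twice the SD has twice the spread in its estimated date.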
The
intersections of each Gordon cluster with a cluster in a different haplogroup
are given in the center part of the figure. For example, Gordon Clusters E and
K share a common ancestor at RCC~292 having the same 37-marker haplotype. As
Cluster E evolves, its haplotype mutates in such a way that at RCC~248, it
shares a common haplotype with Cluster T and at RCC~ 216 its haplotype is the
same as Clusters H, L and Q. The other vertical columns represent the evolution
of Clusters K, A, D, etc.
Another example of the evolutionary sequence is shown in the vertical column beginning
with Cluster K. It starts with a common ancestor with Cluster E, but evolves so
that at RCC~276 it shares a common ancestor with Clusters A and D. Cluster K
then evolves so that it shares a common haplotype with
Cluster C.
Figure 5 suggests that when Cluster K evolves to RCC~276, Clusters A and D form, since
the most recent common ancestors of AK and DK are the same and the Gordon A and
D clusters are not evident in the data earlier in time. Then, as Clusters K and
A evolve, they spin off Cluster C and Cluster T, respectively. This activity
occurs at RCC~248, at which point Clusters T and C share a common ancestor and
haplotype, and from which they continue to evolve.
Further insight into the evolution of these Gordon clusters comes from the earliest
dates at which they appear paired with another cluster in Figure 5, where they
appear in a box. Table 3 shows the cluster pairings. Clusters that are
underlined and in a larger font appear at the earliest dates of the cluster
pairs in the Gordon clusters found in this work.
Table 3: Earliest Dates Found for the Common Ancestors of Pairs of Gordon Clusters
Cluster Pair | Earliest Paired RCC | Corresponding Years Ago | Corresponding Date (BCE)
E-K | 292.9 | 12700 | 10700
AK-DK | 276.8 | 12000 | 10000
CT-AT-CK-ET | 248.7 | 10800 | 8800
EH-EL-AQ-EQ-DT | 214.7 | 9300 | 7350
DL-CL-AH-AL-CQ-CH | 191.2 | 8300 | 6300
DQ-DH | 177.0 | 7650 | 5700
F. The Subclusters and Intersubclusters in Gordon Cluster A
Just as clusters and interclusters in the RCC matrix give us insight into the evolution
of a surname, so do subclusters and their intersections, the intersubclusters.
The RCCs of subclusters are very small, so the probability of linking them to
testee pedigrees is high. Militating against success, however, is that unknown
mutations at low RCCs cause major uncertainties in discovering such links.
Nevertheless, in this Section we will investigate what information can be drawn
from the six subclusters a, b, …, f
within the major Gordon A surname cluster.
Subclusters
a and e contained 55 and 15 pairs of testees, but Subclusters b, c, d and f
contained only one pair. However, intersubcluster pairs for the latter four
subclusters give an indication of how the lines evolved, so we retained them in
the study. The approach for subclusters will be the same as that applied in
Section C to clusters. Table 4 summarizes these data for the subclusters and
their intersections within the main Gordon A cluster.
TABLE
4: Average Values of RCC for the Gordon A Subclusters
and Intersubclusters, and their Standard Deviations
The RCC of the most recent common ancestor of each subcluster appears in the
diagonal of Table 4 along with the SD of the mean in parentheses. The averages
of intersubcluster pairs are listed in their intersections above the diagonal
and the SDs of their means are listed at the appropriate intersection below the
diagonal. Yellow entries for the SDs (0.6) are estimates based on the average
SD of other entries.
To
simplify forming the evolutionary diagram, Figure 6, derived from Table 4, can
be used to identify, by both inspection of the error bars and analysis, the
pairs of intersubclusters that overlapped in time. They are (1) ab, ac and ad;
(2) bc, ae, bd, df and af; and (3) ce, bf, be and de.
The SD of the mean of each point is given by the error bars in Figure 6.
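The overlap check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the RCC values are invented, the function names are hypothetical, and treating two points as overlapping when their one-SD error bars touch is an assumed criterion. The fallback of 0.6 for an entry with a single pair follows the estimate quoted for Table 4.

```python
from math import sqrt
from statistics import mean, stdev

FALLBACK_SD = 0.6  # estimate the text quotes for entries with only one pair


def mean_and_sd_of_mean(rcc_values):
    """Return (average RCC, SD of the mean) for one subcluster or intersubcluster."""
    avg = mean(rcc_values)
    if len(rcc_values) < 2:
        # A single pair has no sample SD, so fall back to the quoted estimate.
        return avg, FALLBACK_SD
    return avg, stdev(rcc_values) / sqrt(len(rcc_values))


def overlap(point_a, point_b):
    """True when the one-SD error bars of two (average, SD) points touch (assumed criterion)."""
    (avg_a, sd_a), (avg_b, sd_b) = point_a, point_b
    return abs(avg_a - avg_b) <= sd_a + sd_b


# Invented RCC values, for illustration only.
ab = mean_and_sd_of_mean([9.0, 10.0, 11.0])
ac = mean_and_sd_of_mean([10.5])
ce = mean_and_sd_of_mean([3.0, 4.0, 5.0])

print(overlap(ab, ac))
print(overlap(ab, ce))
```

Points that pass this test would be drawn at the same level of the evolutionary diagram; a stricter or looser criterion (for example two SDs) would change the groupings.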
Figure
6: The Average Value of RCC for the Major Gordon A
Subcluster and Intersubcluster Pairs
Within
each intersubcluster we start with the oldest pairs that appear in the
intersubcluster regions of Table 4. We plot the pairs from the oldest to the
youngest in time, taking into account the locations of the Gordon
intersubcluster pairs in Table 4. Each time a new subcluster appears in the
evolutionary track, we interpret it to mean that one of the original pairs has
split off the progenitor of a new line, which then proceeds to evolve, through
mutations, down the evolutionary diagram to the present. The results are
presented in Figure 7.
Figure 7: The Evolutionary Diagram of the Subclusters and Intersubclusters of the
Major Gordon Cluster A (Haplogroup I1)
The columns to the left in Figure 7 give the RCC value of the common ancestor at
each subcluster and intersubcluster intersection, followed by estimates of the
number of years ago (from 1945, the assumed average birth year of the testees),
the corresponding date, and the number of generations (assuming 27 years per
generation) when the events to the right occurred[1,3].
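The three derived columns can be reproduced with simple arithmetic. The sketch below assumes a conversion of about 43.3 years per unit of RCC; that factor is not stated in this section and is inferred from Table 3 (for example, RCC 292.9 listed against 12700 years). The reference year 1945 and the 27 years per generation come from the text; the function name is hypothetical. Note that the paper quotes a different factor (57.2) for cluster TMRCAs.

```python
PAIR_YEARS_PER_RCC = 43.3   # assumption: inferred from the rows of Table 3
REFERENCE_YEAR = 1945       # assumed average birth year of the testees (from the text)
YEARS_PER_GENERATION = 27   # from the text


def rcc_to_timeline(rcc, years_per_rcc=PAIR_YEARS_PER_RCC):
    """Convert an RCC value to (years ago, calendar year, generations).

    A negative calendar year stands for a BCE date; the missing year zero
    is ignored, which is negligible at this precision.
    """
    years_ago = rcc * years_per_rcc
    calendar_year = REFERENCE_YEAR - years_ago
    generations = years_ago / YEARS_PER_GENERATION
    return years_ago, calendar_year, generations


for rcc in (20.0, 46.2, 292.9):
    years_ago, calendar_year, generations = rcc_to_timeline(rcc)
    print(f"RCC {rcc:6.1f}: {years_ago:8.0f} years ago, "
          f"year {calendar_year:8.0f}, {generations:6.1f} generations")
```

Because the factor is an inference, the printed values should be read as approximate; the published tables round to the nearest 50 or 100 years.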
Figure
7 indicates that the suggested evolutions of these subclusters took place after
the invasion of England by William the Conqueror in 1066. Since pedigrees
sometimes extend back that far, the presence of subclusters may offer valuable
insights into the formation of individual surname lines within periods of time
covered by pedigrees. We must still stress that the presence of non-average
mutations introduces uncertainties that might amount to several hundred
years, so these lines must be approached with caution[1]. As more Gordons are tested, these subclusters,
hence their evolutionary relationships, will become better defined. At this
time, we can probably trust only the results that separate the oldest from the
youngest subclusters.
The
members of these major Gordon Cluster A subclusters
(Aa through Af) are among those listed in Appendix B. Other members of Gordon A
do not appear to be members of a subcluster.
A COMPARISON OF INDEPENDENT GORDON SURNAME GROUPINGS DONE BY PEDIGREES AND Y-DNA MARKER
VALUES AND DONE BY RCC MATRIX ANALYSIS
Gordon
surname groupings were made independently by one of us (Gordon) in the
traditional way, using Y-DNA results and comparing them with pedigrees, when
known, and by one of us (Howard) by sorting the RCC matrix so that small values
of RCC appeared in different Gordon clusters. The following two histograms in
Figure 8 show a comparison of the results from both methods (top), and the
difference between the results of the two methods (bottom). Only pairs of testees
who were matched by each method are included; no unmatched or ungrouped testees
appear in the histograms. No changes in matchings or groupings were made by
either author prior to deriving Figure 8.
Figure
8: Histograms of the Frequency of Occurrence of RCC values for Gordon Groupings
from Pedigree and Haplotype Markers and the Difference Between
the Two Methods
The histograms for the two methods are remarkably similar. The total numbers
of pairs of testees in the pedigree and RCC matrix groups are 2352 and 2336,
respectively, showing that both methods make approximately the same
groupings using the same sample of testees. It is evident from the difference
histogram that more pairs fall into RCC-derived Gordon clusters between RCC
0 and 20 than into the pedigree groupings. Moreover, the traditional approach
includes some testees in groupings that the RCC method does not include in a
cluster, and those inclusions, when paired, result in values of RCC between 30
and 70 and between 85 and 100. RCC values between 30 and 70
are more typical of intercluster relationships between those pairs of testees.
Fewer pairs fall between 85 and 100.
Figure 9 shows the difference in the cumulative distributions of the two approaches. It
shows that 95 percent of testees have been included in matrix-derived clusters
by RCC ~ 28, whereas that same percentage of pedigree groupings is not
reached until RCC ~ 60. These results are consistent with the contention
(Howard 2009a) that the use of both methods will yield better results than
using the traditional method alone.
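The 95 percent points quoted above are read off a cumulative distribution. A minimal sketch of that reading, assuming the RCC values for each method are available as a plain list, is given below; the function name is hypothetical and the two lists are invented, so the printed numbers do not reproduce Figure 9.

```python
def rcc_at_cumulative_percent(rcc_values, percent=95):
    """Smallest RCC at which at least `percent` of the values are included."""
    ordered = sorted(rcc_values)
    if not ordered:
        raise ValueError("no RCC values supplied")
    # Integer ceiling of percent * n / 100 avoids floating-point rounding.
    needed = max(1, (percent * len(ordered) + 99) // 100)
    return ordered[needed - 1]


# Invented RCC values for the two methods, for illustration only.
matrix_rcc = [4, 6, 7, 9, 11, 12, 14, 15, 17, 18,
              19, 20, 21, 22, 23, 24, 25, 26, 27, 28]
pedigree_rcc = [5, 6, 8, 9, 12, 13, 15, 16, 18, 20,
                22, 25, 28, 33, 38, 44, 50, 55, 58, 90]

print(rcc_at_cumulative_percent(matrix_rcc))
print(rcc_at_cumulative_percent(pedigree_rcc))
```

A long tail of high RCC values in one list, as in the invented pedigree sample, pushes its 95 percent point well above that of the other list, which is the pattern the text describes.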
Figure
9: The Cumulative Percentage of Gordon Testees Who Are Grouped by Haplotype
Matching and Pedigrees and by RCC Matrix Matching as a Function of RCC
PRESENTATION OF THE RELATIONSHIPS BETWEEN THE GORDON PEDIGREE LINES AND THE GORDON CLUSTERS
After
comparing the groupings based on haplotypes and pedigrees with the clusters
derived from the RCC matrix, it became evident that some of the pedigree
designations were not correct or could be refined. In Table 5 we show the
original pedigree designation as well as the final name of the pedigree-cluster
association, and we refer to the final designations in the rest of this
paper.
Appendix
B identifies, by Kit Number, the Gordon clusters derived from the Gordon RCC
Matrix and the Gordon lines derived from pedigrees and traditional marker
analysis. The statistics resulting from Appendix B are shown in Table 5.
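A tally of this kind amounts to a cross-tabulation of pedigree line against cluster. The sketch below shows one way to build it, assuming each testee is recorded as a (kit, pedigree line, cluster) triple; the kit labels are placeholders rather than real Kit Numbers, the records are invented, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (kit label, pedigree line, RCC cluster). The labels are
# placeholders, not real Kit Numbers from Appendix B.
assignments = [
    ("kit-01", "Jock and Tam Gordon", "A"),
    ("kit-02", "Jock and Tam Gordon", "A"),
    ("kit-03", "Sir William Gordon Branch", "A"),
    ("kit-04", "Seton-Gordon", "K"),
    ("kit-05", "Seton-Gordon", "Q"),
    ("kit-06", "Stewart-Gordon", "G"),
]


def cross_tabulate(records):
    """Count testees for every (pedigree line, cluster) combination."""
    return Counter((line, cluster) for _kit, line, cluster in records)


def clusters_per_line(records):
    """Map each pedigree line to the set of clusters its testees fall in."""
    spread = defaultdict(set)
    for _kit, line, cluster in records:
        spread[line].add(cluster)
    return dict(spread)


table = cross_tabulate(assignments)
for (line, cluster), count in sorted(table.items()):
    print(f"{line:28s} {cluster}  {count}")
```

A pedigree line whose set in `clusters_per_line` has a single element is confined to one cluster; a line with several elements is spread over more than one.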
Table
5: The Distribution of Gordon Testees Among Gordon
Pedigree Lines and Gordon Clusters Derived from the Gordon RCC Matrix
OBSERVATIONS BASED ON CLUSTER GROUPINGS (From Table 5)
· The individual Gordon Clusters belonging to Haplogroups I1, I2b1 or R1b1b2 have no
  overlap in membership.
· All 43 testees in the Jock and Tam Gordon pedigree lines belong only to Gordon
  Cluster A and its subclusters.
· The 47 members of Gordon Cluster A are distributed only in the Jock and Tam Gordon
  and the Sir William Gordon Branch. Gordon Cluster A also contains all 4 testees
  in the Sir William Gordon Branch. The subcluster Gordon Aa partially overlaps
  these two lines but its one member may be a transition between subclusters Aa and Ab,
  or may actually belong to Ab.
· Gordon subclusters Ac, Ad, Ae and Af only appear in the
  pedigree line of Jock and Tam Gordons.
  o These observations indicate that when more Y-DNA test results become available, the
    RCC matrix will show more subclusters for the major Gordon clusters, which
    will, in turn, contain more members, just like Cluster A.
  o The close association of the Sir William Gordon Branch with members of Cluster A
    indicates a shared recent common ancestor.
· The 16 testees in the Seton-Gordon pedigree line are distributed among four
  separate Gordon Clusters (H, K, O and Q) in Haplogroup R1b1b2.
· The 13 testees in the Progenitor Branch are evenly distributed in Gordon Clusters D
  and E in Haplogroup I1.
· The 4 testees in the pedigree line Small Grouping-Gordons - Subgroup 10 are the
  only members of, and hence define, Gordon Cluster T in Haplogroup R1b1b2.
· The 7 testees in the Stewart-Gordon line are all located in Gordon Cluster G in
  Haplogroup I2b1.
· The same 5 members of Gordon Cluster C belong only to Subgroup 2 of the Small
  Gordon Groupings.
· Gordon Cluster O and Subgroup 8 of the Small Gordon Groupings line each contain only
  the same 2 members.
One of the goals of this analysis was to see how closely members of particular
pedigree lines could be placed in a Gordon cluster derived from the values of
paired testees in the RCC matrix.
Table
5 shows the Gordon clusters derived from Y-DNA results that have pedigrees.
There were 68 testees who had pedigrees who were not assigned to RCC clusters.
Of those 68, only one had a value of RCC less than 20, 28% had an RCC under 25,
44% had an RCC under 30, and 56% had an RCC under 34. We can therefore conclude
that an RCC ~ 20 represents a practical limit for the identification of a Gordon
cluster from the RCC matrix, and Table 5 indicates that such clusters can be matched
with available pedigrees. An RCC ~ 20 corresponds to ~ 900-1100 CE, a date
consistent with the first use of surnames (Howard 2009a,b).
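To make the role of the RCC ~ 20 limit concrete, the sketch below groups testees whose pairwise RCC falls under a threshold. It is a stand-in technique (single-linkage grouping with a union-find structure), not the matrix-sorting procedure used in this paper; the kit labels, RCC values and function name are all hypothetical.

```python
def group_by_rcc(pair_rcc, limit=20.0):
    """Group testees linked, directly or through others, by pairs with RCC < limit.

    pair_rcc maps (kit_a, kit_b) to the RCC of that pair.
    """
    parent = {}

    def find(kit):
        parent.setdefault(kit, kit)
        while parent[kit] != kit:
            parent[kit] = parent[parent[kit]]  # path compression
            kit = parent[kit]
        return kit

    for (kit_a, kit_b), rcc in pair_rcc.items():
        root_a, root_b = find(kit_a), find(kit_b)
        if rcc < limit and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for kit in list(parent):
        groups.setdefault(find(kit), set()).add(kit)
    return sorted((sorted(members) for members in groups.values()),
                  key=lambda members: members[0])


# Invented pairwise RCC values, for illustration only.
pairs = {
    ("k1", "k2"): 5.0,
    ("k2", "k3"): 12.0,
    ("k3", "k4"): 48.0,
    ("k1", "k4"): 50.0,
}
print(group_by_rcc(pairs))
```

Single linkage can chain distant testees together through intermediates, so a result like this would still need to be checked against the full matrix and the pedigrees.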
Of
the 68 testees who had pedigrees but who were not assigned to a Gordon cluster,
121 pairs (2.6%) had values of RCC between 20 and 30, just over the practical
cluster definition. Eight of these had a total of 76 RCC values between 20 and
30.
OBSERVATIONS BASED ON PEDIGREE GROUPINGS (From Table 5)
When relationships between pedigrees and subcluster
membership are compared, they reveal totally new information that could have
been derived only from this approach. The presence of subclusters was
discovered in Howard 2009b, but the remarkable relationships among pedigrees,
cluster and subcluster membership, geographical location, and their evolution
has become apparent only through this study.
The Sir William Gordon Branch 2 has been renamed the
“Progenitor Branch” of Haplogroup I1 because it is the earliest pair among the
Haplogroup I1 major Gordon Clusters (D and E) (see Figure 5). Gordon Clusters D and E have a common ancestor at approximately RCC = 46.2
(about 2000 years ago), and their TMRCAs with Gordon Subcluster Ab lie at RCC = 41 and 34, respectively. Subcluster Ab thus has TMRCAs with Clusters D and E
approximately 1800 and 1500 years ago, respectively.
Second, the Sir William Branch 3 has been renamed
“Small Grouping-Gordon – Subgroup 10.”
This group comprises Gordon Cluster T but fits neither the documented
history nor the I1 Haplogroup. Subgroup
10 may actually be more closely related to the Seton-Gordons at an RCC of approximately
54 (2300 years ago).
It is noteworthy that subclusters Aa,
Ab, Ad, Af and possibly Ae, all appear to have documentation connecting them to
the Lowland Gordons. Moreover, subclusters Ab, Ad, and Af
possibly have subcluster connections to the former larger southwestern region
of Galloway. [Note: Tradition states that the original Lowland Gordon
stronghold in the 11th century was in Berwickshire on the eastern
seaboard]
It should also be noted that we do not have enough
family information on Ae to determine where in Scotland they originated. Due to
insufficient documentation, we were also unable to draw any Lowland connections
for subcluster Ac.
The intersection of subcluster ef occurs during the
origins of the Gordon family in Scotland.
The intersubcluster cf occurs about the time of Sir
William Gordon’s death at about 1370.
Subclusters ce, bf, be, de
occur during a time of much upheaval in the Gordon family, when rival Gordon
family factions were fighting for titles.
The intersection of subclusters bc, ae, bd, df, af at
about 1629 occurs at the time when the two Gordon houses of Troquhain and Crogo in the South of Scotland came together through marriage of James Gordon of
Troquhain and his cousin Janet Gordon of Crogo and Dalquharm, as well as the
beginning of the house of Kenmure and continuation of Lochinvar. See: http://www.thegordondnaproject.com/93333.html
CLUSTER A
No
analysis is provided for Cluster A as a whole, due to complexities inherent in
analyzing such a large group. The fact that we are unable to break it down into
further subclusters may have one or more interpretations.
1. Our sample is not large enough to identify further subclusters.
With time and more testees, more subclusters will appear.
2. Too few mutations have occurred in the first 37 markers since the origin of Cluster
A to permit identification of more subclusters.
However, expanding the RCC analysis to 67 markers may reveal further mutations,
and thus further subclusters and insight.
3. Subclusters may reflect a bias towards the higher number of testees from outside the
UK. This may be attributed to the difficulties in recruiting testees from Western
Europe, where DNA testing has yet to gain popularity as an extension of genealogical
research.
4. We found a very strong correlation between
membership in a Y-DNA subcluster and membership in a pedigree group, indicating
that if two testees share an RCC value of the order of 10 or less, then it is
highly probable that they can be found in a pedigree group. Thus, using the RCC
correlation technique, we have linked near-term genetics to a genealogical
pedigree.
MERGING THE RESULTS OF GORDON CLUSTER GROUPINGS AND
THEIR ASSOCIATED DATES WITH THE RESULTS OF PEDIGREE/HAPLOTYPE GROUPINGS. DATING THE GORDON
PEDIGREE AND CLUSTER LINES.
Table
5 shows the distribution of Gordon testees among Gordon pedigree lines and
Gordon clusters derived from the Gordon RCC matrix. The evolutionary diagrams
in Figures 4, 5, and 7 show how these Gordon clusters evolved and give
estimates of when their most recent common cluster ancestor lived. When the
results of Table 5 are convolved with the evolutionary diagrams of the same clusters,
we can show the various times when the TMRCAs of each pedigree line of testees
lived. That convolution is shown in Table 6.
Table
6: Date Groups Within Which the TMRCAs of Gordon Clusters Having Identified
Pedigrees Lived.
*
See Footnote 1
The TMRCA is calculated by multiplying the RCC of the TMRCA by 57.2, the
factor appropriate for the derivation of the TMRCA for a cluster (Howard
2009a). Clusters in red belong to Haplogroup R1b1b2; clusters in yellow belong
to Haplogroup I1; the cluster in green belongs to Haplogroup I2b1.
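The multiplication just described can be written out directly. The sketch below applies the quoted factor of 57.2 to a few cluster RCC values and orders the clusters by age; the cluster labels and RCC values are invented, the function names are hypothetical, and no rounding or other adjustment that may lie behind the published Table 6 dates is attempted.

```python
CLUSTER_YEARS_PER_RCC = 57.2  # factor quoted in the text for cluster TMRCAs


def cluster_tmrca_years(rcc):
    """Years before the testees' era for a cluster TMRCA at the given RCC."""
    return rcc * CLUSTER_YEARS_PER_RCC


def order_by_age(cluster_rcc):
    """Return (cluster, years) pairs sorted from oldest to youngest TMRCA."""
    ages = {name: cluster_tmrca_years(rcc) for name, rcc in cluster_rcc.items()}
    return sorted(ages.items(), key=lambda item: item[1], reverse=True)


# Hypothetical cluster labels and RCC values, for illustration only.
example = {"X": 10.2, "Y": 12.6, "Z": 4.0}
for name, years in order_by_age(example):
    print(f"{name}: about {years:.0f} years")
```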
OBSERVATIONS BASED ON CLUSTER DATES AND PEDIGREE LINES
(From Table 6)
In this section, the Gordon Septs and the subclusters
of Cluster A are not discussed further because the association of the Septs with the Gordons
is not proven and because the dating of the TMRCAs of subclusters has
uncertainties that dominate the dating process.
Of the 104 testees, 69 percent are located in clusters
that have TMRCAs who lived in the late 13th and early 14th
centuries. Their RCC values cluster tightly between RCC 10.2 and 12.6. We
identify an “intermediate date group” that dominates the data set. We note that
these dates are estimates of the TMRCA of the Gordons tested. They probably
point to the times when their earliest identified ancestor adopted the name
Gordon. Each of these clusters has earlier ancestors, of course, but the
convergence in 1280-1410 is in agreement with other hypotheses about when most
of our early ancestors adopted their name. Unlike many other surnames (e.g., Cook(e) or Cooper which were adopted from occupations that
occurred throughout Europe), names like Gordon, based on titles or places, are
more tightly grouped in location.
We identify an “older date group” containing 16
percent of the testees whose TMRCAs lived before about 700 CE. They are all in
Haplogroup R1b1b2 and have RCC values that extend beyond the usual cluster
boundary of ~20-25. It is unusual for pedigree/cluster identifications to be
made for testees in clusters whose TMRCAs are located
so far back in time.
We identify a “younger date group” containing
approximately the same percentage as members in the older date group. This
group lies in a time interval where pedigrees are useful, but due to
uncertainties in random mutations, the RCC time scale is not as useful when
applied to this group. In fact, the presence of subclusters indicates only a
close relationship among the testees in the subcluster, many of whom may know
each other.
RELATIONSHIPS BETWEEN HISTORICAL EVENTS AND THE CHRONOLOGICAL EVOLUTION OF THE GORDON SURNAME
Events in Scottish and European History Compared with Events in the Evolution of
Gordon RCC Clusters
The approximate ages of Gordon clusters and interclusters were derived in the
previous sections. Figures 10A and 10B present the events in history over the
time intervals derived for the major Gordon clusters and interclusters.
Figure
10A: A Comparison of Events in the Evolution of Gordon Clusters and Events in
European and Scottish History from the Maximum of the Last Glaciation to 2500
BCE
The right-hand side of Figure 10A lists the points in time when the ancestors of
various pairs of Gordon interclusters lived. During this period, following
Figures 4A, B, and C, there were only intersecting haplogroups. For example,
Gordon G in Haplogroup I2b1 intersects with Gordons H and T in Haplogroup R1b1b2 at RCC ~
405, or 17,500 years ago. In other words, the shared common ancestor in those
haplogroups had a ‘beginning’ haplotype that mutated down the lines of the
Gordon G, H and T clusters.
The
earliest Gordon haplogroup pairs appeared just after the time of the last
glacial maximum in Europe. Then the pairings of Gordon Clusters G, L and K
(L and K splitting from G) occurred when humans began to populate Europe as the
glaciers melted. The common ancestor of Gordon G and Q lived at the end of the
glacial period. When Scotland became habitable in about 9500 BCE, the common
ancestor of Gordon A and Gordon G lived. This was the first appearance of
paired Gordon clusters in Haplogroup I1. Gordon G and D had a joint CA at about
8700 BCE. The common ancestor of Gordon C and E had the beginning haplotype of
their lines at about 7800 BCE when there was a post-glacial isostatic rebound
after the glaciers melted.
About
3700 BCE farming and framed buildings appeared. Gordon Clusters H, K and T
shared a common ancestor haplotype when stone houses appeared in the Orkney
Islands. By that time common Gordon ancestors had appeared within a single
haplogroup, R1b1b2.
Figure
10B: A Comparison of Events in the Evolution of Gordon Clusters and Events in
Scottish History Between 2000 BCE and the Present
The first instance of a joint ancestor of major Gordon surname clusters (C and D)
in Haplogroup I1 occurs about the year 1000 BCE, when hill forts were first
built, when the Celtic culture and language were introduced into Southern
Scotland and when Late Bronze Age material was being used at Edinburgh Castle.
Gordon
Clusters A and E first appeared when their common ancestor-progenitors were
paired as AD and AE near the year 400 BCE at about the time that tribes in
Scotland became quarrelsome. The first intercluster progenitors that involved a
pairing of Clusters AC, AE, and DE lived at about the time of the Roman
invasion of Britain and their entry into Scotland.
The
Viking raids began in about 800 CE. At that time, or shortly afterwards, when
Scotland was assuming its modern identity, the common ancestors of the
currently defined Gordon Clusters lived, first those in Clusters T, L, and H in
Haplogroup R1b1b2 and Cluster E in Haplogroup I1 in about 1000 CE, and next
those in Cluster A about 600 years ago, followed by K, C and D, G and Q. The
common ancestors of the subclusters of Gordon Cluster A lived more recently.
TIME ESTIMATE COMPARISONS:
Comparison of the Gordon RCC Time Estimates and ISOGG Time Estimates
Figure 4C indicates that at RCC~405 (17,500 years ago) the earliest pairs of Gordon
clusters (GT and GH) in different haplogroups, I2b1 and R1b1b2, had a most recent
common ancestor. Figure 5 indicates that at RCC~292 (12,600 years ago) the earliest
pair of Gordon clusters (E and K) in different haplogroups, I1 and R1b1b2, had a
most recent common ancestor, with Gordon Clusters A and D paired with Cluster K at
about 12,000 years ago.
Table 7a: Summary of Times when MRCAs of Three Gordon Haplogroups Lived (Kyrs ago)
Cluster | G(I2b1) | H(R1b1b2) | T(R1b1b2) | K(R1b1b2) | A(I1) | D(I1) | E(I1)
G(I2b1) |  | 17.5 | 17.5 | 14.9 | 11.5 | 10.6 | 9.5
H(R1b1b2) | 17.5 |  | 2.25 | 5.3 | 8.3 | 7.6 | 9.35
T(R1b1b2) | 17.5 | 2.25 |  | 5.3 | 10.7 | 9.35 | 10.7
K(R1b1b2) | 14.9 | 5.3 | 5.3 |  | 11.95 | 11.95 | 12.65
A(I1) | 11.5 | 8.3 | 10.7 | 11.95 |  | 2.25 | 2.1
D(I1) | 10.6 | 7.6 | 9.35 | 11.95 | 2.25 |  | 2.1
E(I1) | 9.5 | 9.35 | 10.7 | 12.65 | 2.1 | 2.1 |
Table 7b: Estimated ISOGG Dates for the Origins or Splits of Haplogroups I and R
(Kyrs ago)
ISOGG Event | Time of Event (Kyrs ago) | Comparison with Results in Table 7a
ISOGG I | Before 18-20 | cf. I2b1 at 17.5
ISOGG I1-I2 split | 28 | cf. I2b1 at 17.5
ISOGG R | 27 |
ISOGG R1 | ~ 18-22 | cf. R1b1b2 at 17.5
ISOGG R1b1b2 | 4-8 | cf. internals at 2.1-5.3
Estimated
dates for the origin of haplogroups are given in the International Society of
Genetic Genealogy’s Y-DNA Haplogroup Tree[8].
In its 2010 version, it is suggested that Haplogroup I likely divided into Haplogroups I1 and I2 approximately 28,000 years ago. Additionally,
Haplogroup R is believed to have arisen about 27,000 years ago in Asia, but its
subgroups, R1 and R2 arose more recently. R1 is estimated to have arisen during
the height of the last glacial maximum, with R1b arising in southwest Asia.
Haplogroup R1b1b2 also originated in southwest Asia and is observed most
frequently now in Europe, especially western Europe.
This branch of R holds the Gordon Clusters in Haplogroup R1b1b2 and the ISOGG
estimates that it originated approximately 4000-8000 years ago. These estimates
are summarized in Table 7b.
With
a larger sample of Gordon testees, an earlier date might be found for the
cluster intersections between haplogroups. Nevertheless, a comparison of the
ISOGG dates with those determined using the RCC time scale shows good agreement
and no inconsistency between the RCC- and ISOGG-derived estimates. The ISOGG
estimates that R1b1b2 arose approximately
4-8 Kyears ago in southwest Asia and that it spread into Europe from there. The
TMRCAs of Haplogroup R1b1b2 Clusters H, T and K, when paired, show a date range
of 2.25 to 5.3 Kyrs. Since these are lower limits to the date, there is good
agreement and no inconsistency between the two date estimates. In fact, this may
suggest that the progenitors of the Gordons within R1b1b2 formed when the
cluster progenitors had already reached Western Europe or even the Scottish
Highlands.
The ISOGG time estimates, the RCC time scale, the Y-DNA
evidence and our results are consistent with an origin of the Gordon surname in
areas near modern Turkey and Greece. However, given the relatively
small proportion (~1%) of I1 in Anatolia, it becomes more probable
that Gordon ancestors whose Y-DNA haplotypes trace to Macedonia and
present-day Turkey were R1b1 (Cinnioğlu et al., 2004).
Comparison of the Gordon RCC Time Estimates and Historical and Pedigree Records
The
Gordon surname was probably chosen because of location and its association with
a famous contemporary and not by occupation or physical characteristic. The
origin of the Gordon surname may tie to early BCE settlements below 42 degrees
North latitude in modern day Turkey, Greece and Crete and/or to individuals
with names like Gordian, Gordias, Gortys, Gordus, Gordinis.
The historical and chronological record may trace the evolution of the name
from these areas into France during the first CE millennium and from there to
the British Isles at the time of William the Conqueror[9].
Gordon
clusters in Haplogroups I1 and I2b1 shared common ancestors as recently as 12.6
Kyears ago, and this places a lower limit on the epoch of their pairing, well
within the ISOGG estimate that the Haplogroups I1 and I2 split about 28 Kyears
ago. The origin of the Gordons, at latitudes below 42 degrees North,
was comfortably south of the southernmost extension of the last glaciation. The
earliest common ancestors of all Gordons, again a lower limit, lived about 17.5
Kyears ago, as the glaciers began receding. Members of the Gordon Haplogroups
I1 and I2 then probably migrated northward, following the glacier melt. These
dates are consistent with the contention that “Human
site occupation density was most prevalent in the Crimea region and increased
as early as ca. 16,000 years before the present. However, reoccupation of
northern territories of the East European Plain did not occur until 13,000
years before the present”[10].
The earliest common ancestor found between Haplogroups I1 and R1b1b2 (Gordon
Clusters E and K), lived about 12.6 Kyears ago according to the RCC Time Scale,
in good agreement with the reoccupation of those northern territories after the
glacier receded. It is then consistent with the DNA record that the Gordon
Clusters in Haplogroup I migrated to the northern regions of Scandinavia while
the Gordon Clusters in Haplogroup R migrated into France and other regions of
Western Europe. This activity occurred at times well before they could be
compared with pedigrees.
The first inhabitants of Britain probably came
from France, across a much shallower English Channel or by boat from the
seacoasts of Western Europe. Archeologists have found and dated artifacts near
lakes and seashores used by hunters who first visited only during the warm
season of the Mesolithic in 3500-8000 BCE, dates that are consistent with the
TMRCAs of the Gordon clusters in R1b1b2 haplogroups when H, K, L, T, and Q
members shared a common ancestor. Nomadic animal herders arrived some time
after 5000 BCE and became Britain’s first farmers. By the Neolithic period
(3500-2500 BCE) Britain had become an island. At its end, starting in the
Bronze Age (2500-500 BCE), farmers were settling, clearing forests and
beginning to use stone tools that transitioned to bronze after about 2300 BCE.
Weapons developed from bronze became more effective in the Iron Age (500 BCE-70
CE), a period when population pressures and the growth of the ruling class
prompted the need for defensive structures. Powerful chiefs formed the nucleus
of what was to develop into the Scottish clans, with the farmers transitioning
to vassal status, serving the chiefs in exchange for protection. This time
period was also before individuals would appear in pedigrees, but the ties of
the vassals to the chiefs probably resulted in the choice of the chiefs’
surnames when it was time to choose surnames.
While it is consistent with modern history and
the DNA record that Gordons within Haplogroup R came to Britain and Scotland
across the English Channel, Gordons within Haplogroup I probably came to
Britain and Scotland as Viking raiders from Normandy, married, stayed, and were
assimilated into Britain in the epoch between 500 and 1000 CE.
The first Gordon on record and in a pedigree, Richard of the Barony of Gordon, lived in the mid-12th
century. Crude pedigrees and the formation of the House of Gordon go back only
to the 1300s, when Sir Adam Gordon led the family in the Battle of Halidon Hill in 1333. More
trustworthy pedigrees date only from the 14th century, when the House of Gordon first
appeared.
RELATIONSHIPS BETWEEN THE GORDON PEDIGREE LINES AND THE GORDON CLUSTERS (see Table 5).
A. The Gordon Septs:
One of the goals of The
Gordon DNA Project has been to determine whether there might be any genetic
links between the Gordon Septs and other Gordons. Indeed, RCC reveals that
there are some genetic ties: three Gordon Septs belonging to the
Lawrie, Todd, Atkinson and Craig families are represented by Clusters N (2
testees) and T (1) in Table 5. These clusters are in Haplogroup R1b1b2, as are
the Seton-Gordons.
Cluster N is amorphous
and contains only four members: two have pedigrees of a Gordon Sept,
one is classified in Subgroup 0 of the Small Grouping of Gordons, and one is ungrouped.
Four out of the five members of Cluster T appear to belong to
the Small Grouping-Gordons - Subgroup 10; the other belongs to the Gordon Septs.
One
testee, Kit No. 127855, was assigned to a specific Gordon group but was not
included in a Gordon cluster. His haplotype and pedigree indicated that he was
a member of the Gordon Septs and he is in Haplogroup I. But in Table 5, all
three Gordon Septs belong to Haplogroup R1b1, an inconsistency that warranted
closer inspection since such “outliers” may offer valuable information on
differences between the traditional/pedigree and the RCC matrix approaches. He
definitely belongs to the Gordons since his RCC associations are under 50, near
the cluster edges of A, C, D, and E; he is in Haplogroup I1. The inconsistency
is that he has been assigned to the Gordon Septs, whose other members are in Haplogroup R1b1 rather than
Haplogroup I. Since the Spring of 2010 when the
composition of Gordons we studied was ‘frozen’, there have been more testees
assigned to the Gordon Septs. Since September 2010 the
number of Gordon Septs has grown from 4 to 41. There are 9 more Gordon Septs in
Haplogroup I, although 75 percent of all Septs are in Haplogroup R1b1. The fact
that the Gordon Septs appear to have membership in both Haplogroups I and R
means that two Gordon Sept lines in two different haplogroups will share no
common ancestor within 7,000 years, while two Gordon Septs within the same
haplogroup may share a common ancestor more recent than about 5,000 years. The
lesson to be learned here is that small sample statistics can be misleading and
that care must be exercised when broad, significant conclusions are drawn from
insufficient data.
B. The Jock and Tam Gordons and the Sir William Gordon Branch
Of
the 43 testees assigned to the Jock and Tam Gordon group, all, without
exception, were assigned to Gordon Cluster A. About half of the members in
Cluster A are in closer associations within subclusters. Four testees assigned
to the Sir William Gordon Branch also appear to belong to Gordon Cluster A.
This overlap between the Jock and Tam Gordons and the members of the Sir
William Gordon Branch suggests the sharing of a common ancestor within the
genealogical time frame.
C. Small Grouping-Gordons
Clusters in this group
of Small Gordon Groupings are fragmented, with no documented or otherwise
identifiable non-genetic connection between clusters; however, it is worth
noting that most clustering occurs in the R1b1 groupings, possibly attributable
to the high occurrence of that haplogroup in Western Europe.
Through FTDNA, and
ySearch.org, the project has identified several groupings that do have
high-resolution 67-marker matches, such as the Stewart-Gordons. Thus, the
project has hyphenated the name Stewart-Gordon to reflect the likely Stewart
ties and hyphenates other small groupings if a high-resolution genetic link is
found with other surnames.
D. Small Grouping-Gordons – The Stewart-Gordons
This Gordon group had a
documented common Gordon ancestor in the mid-1700s. Although it matched no other
Gordons, this group matched 65+ markers at the 67-marker level with the Stewart
families.
All
seven testees studied who belonged to Haplogroup I2b1 were also assigned to
Gordon Cluster G, the oldest Gordon cluster studied here. The Gordons in
Haplogroup I2b1 share a MRCA who is the best candidate yet to be the progenitor
of the Gordons – at least of the Gordons included in our study. Intersections
of Gordon Cluster G with Gordon members of other clusters and haplogroups are
estimated to have occurred about 17,500 years ago, at the end of the last
glacial maximum.
E. Subgroups 0, 2, and 8 of the Small Grouping-Gordons
All
members of subgroup 2 are members of Gordon Cluster C in Haplogroup I1, and all
members of Cluster C are in subgroup 2. The five members of subgroup 0 are in
Clusters L and N (Haplogroup R1b1b2). Cluster L contains only subgroup 0
members, while Cluster N contains membership from two Gordon Septs, one in
subgroup 0 and one ungrouped Gordon. There are only two members of subgroup 8
and both are in Cluster O, which also contains a member assigned to the
Seton-Gordon branch.
F. The Seton-Gordon Branch – All are in Haplogroup R1b1b2
The
16 members of this pedigree branch appear in four different Gordon Clusters, H,
K, O and Q. The four members of Gordon Cluster K and the seven members of
Gordon Cluster Q are the only members of their clusters that contain a member
of the Seton-Gordon Branch; other members of the branch are members of Clusters
H and O. Figure 4B indicates that Clusters K and Q have
similar haplotypes, so that each cluster shares a common ancestor 300-400 years ago,
making them leading cluster candidates for the Seton-Gordon relation.
One
might expect this branch to be confined to a single Gordon cluster, as the Jock
and Tam Gordons are confined to Gordon Cluster A, but this is not the case. It
shows the difficulty of assigning a pedigree to an RCC cluster (or the
converse). More testees are needed to resolve this difficulty.
G. The
Ungrouped Gordons
Only
two testees not grouped by the traditional approach appeared in a Gordon
cluster, and they were in different clusters, H and N. Table 5 shows their
association with other members of those clusters. Clearly, more data are needed
to see whether the ungrouped Gordons might belong to other, as yet undiscovered, Gordon
clusters.
In
the foregoing discussion and in assessing the results shown in Table 5, one
must be careful not to over-interpret situations where only a small number of
testees and their pedigree groupings have been assigned to a Gordon Cluster.
When the entries in Table 5 are low, perhaps below five, they should be viewed
as suggestive. Thus, the most valid conclusions that can be drawn from Table 5
are:
CONCLUSIONS:
A. General Conclusions:
This study has revealed new information showing close
relationships among pedigrees, cluster membership and subcluster
membership. Our results yield insight into the
evolution and time sequences of haplotypes. The analysis
suggests how a surname may be traced to a geographical location.
The
presence of subclusters within large surname clusters in the RCC matrix was
noted in Howard 2009b. Detailed study of subclusters (e.g., in Gordon Cluster A)
shows that available pedigrees correlate highly with membership in a subcluster.
The close association of subcluster haplotypes within the RCC matrix, combined
with the RCC time scale, indicates how subclusters and pedigree lines may be
tied together.
For many testees who do not yet know how they
connect to others with a shared surname around the world, these correlations
offer a significant new clue for focusing their research.
A
comparison of the ISOGG dates with those determined using the RCC time scale
shows good agreement and no inconsistency between the RCC- and ISOGG-derived
estimates.
B. Conclusions Applicable to the Gordon Surname
Our study has uncovered correlations between
recent historical activity and the formation of subclusters. For example, there
is activity in the Gordon A subclusters around the time of the Jacobite
Rebellions and internal family feuds over titles. It may be significant that we
see subclusters develop after such events, when families are torn apart.
The
ISOGG time estimates, the RCC time scale, the Y-DNA evidence and our results
are consistent with an origin of the Gordon surname in areas near modern Turkey
and Greece, with one major branch (the I Haplogroup) migrating to areas near
Scandinavia and then into the modern-day UK, and the other major branch (the R
Haplogroup) migrating into western Europe and then into Britain. The times
derived from the RCC matrix for the early migrations into the British Isles
from Scandinavia and from Western Europe agree well with the history of the
area derived from archaeological excavations, genetics and anthropologic
studies like the Genographic Project[11].
The theory that the Gordons originated in Normandy
and came to Scotland with Malcolm Canmore fits the time scale of the I1
profile.
The Gordons became Scots and lived together while belonging to different I and R
haplogroups. In modern times many Gordons have populated other areas around the
earth, but their haplogroups give good clues as to the origins of many of their
individual Gordon branches.
When
surnames were adopted, those choosing the Gordon surname probably had place-name
roots or adopted the name of a clan chieftain.
The
assignment of a testee to a grouping based on traditional haplotype matching
and existing pedigree information correlates highly with the assignment of that
testee to a Gordon cluster through a comparison of his RCC value with those of
other testees in the sample. When about five or more testees are assigned to a
cluster and when the RCC values of the cluster testees are less than about 20,
there is a remarkable agreement between the cluster identity and its pedigree
assignment.
The
“value-added” feature of the RCC approach is to add a time dimension to the
analysis of the Gordon clusters. Application of that time dimension to the
pedigree line should not only suggest the proper pedigree line to which a
testee belongs, but also the time frame where his most recent common ancestor
with other cluster members may have lived.
Further exploration of French, Spanish and
Latin documents should be made for first-hand accounts of the Gordons prior to
their arrival in Scotland. Male Gordon testees with well-documented pedigrees
should be recruited for each of the Gordon branches, especially
the Seton-Gordons.
The testee group should be broadened to include Gordons and surname variations from the Anatolian, Macedonian,
Ghent, French and Spanish regions. Additional studies are suggested on the
House of Gordon USA website.
ACKNOWLEDGEMENTS:
We
wish to acknowledge discussions with Mark A. Gordon, whose results prompted this
paper; with David E. Hogg about the approach; and with Sidney Sachs
about the relation of the cluster and intercluster TMRCAs.
Discussions with James H. Gordon and Lois Todd of the House of Gordon USA have
been particularly valuable.
REFERENCES:
Bulloch,
John Malcolm (1903), The House of Gordon, Volume 1, New Spalding
Club, Aberdeen.
Bulloch,
John Malcolm (1906), Scottish
Notes and Queries. D. Wyllie and Son.
Bulloch,
John Malcolm (1907), The House of Gordon, Volume 2, New Spalding
Club, Aberdeen.
Bulletin de la Société
scientifique, historique et archéologique de la
Corrèze, Volume 32. M. Roche, impr., 1910
Chalmers, George [b.1742-d.1825], (1887), Caledonia, or an Account, Historical and
Topographic, of North Britain, from the Most Ancient to the Present Times:
with a dictionary of places, chorographical and philological, Volumes I-IV.
Downloadable from University of California Libraries.
Charles, 11th Marquess of Huntly, ed. (1894), The Records of Aboyne,
1230-1681, New Spalding Club. Aberdeen.
Cinnioğlu, C; King, R; Kivisild, T;
Kalfoğlu, E; Atasoy, S; Cavalleri, GL; Lillie, AS; Roseman, CC et al., (2004)
Excavating Y-Chromosome Haplotype Strata
in Anatolia, Human genetics 114 (2): 127–48.
Dickens, Charles (1887), All
the year round,
Volume 60, published by Charles Dickens.
Eutropius (370 CE), Book IX of Abridgement of
Roman History.
Fordun, John of, (ca. 1384), Chronica
Gentis Scotorum (The
Historians of Scotland).
Howard, William E. III,
(2009a), The Use of Correlation Techniques for the
Analysis of Pairs of Y-Chromosome DNA Haplotypes, Part I: Rationale,
Methodology and Genealogy Time Scale, J. Genet. Geneal., 5: 256-270.
Howard, William E. III,
(2009b), The Use of Correlation Techniques for the
Analysis of Pairs of Y-Chromosome DNA Haplotypes, Part II: Application to
Surname and Other Haplotype Clusters,
J. Genet. Geneal., 5: 271-288.
Howard, William E. III, and Schwab, Frederic R.,
(2012), Dating Y-DNA Haplotypes on a
Phylogenetic Tree: Tying the Genealogy of Pedigrees and Surname Clusters into
Genetic Time Scales, J. Genet. Geneal., this issue.
McBride, Nancy S., (1973), Gordon Kinship. McClure Printing Company.
Rymer, Thomas, (1704-1735), Foedera, Vol. 20, A. & J.
Churchill.
Seaton, Oren
Andrew (1906), The Seaton Family, with Genealogy and
Biographies, Crane & Company, pp. 55-56
Seton, Robert, Monsignor, (1899), An Old Family, History of the Setons of Scotland and America, Brentanos,
New York, pp. 44-49
Skelton,
Constance Oliver and Bulloch, John Malcolm (1912), The House of Gordon, Volume 3: Gordons Under Arms, a Biographical Muster
Roll of Officers named Gordon in the Navies and Armies of Britain, Europe,
America and in the Jacobite Risings, New Spalding Club,
Aberdeen.
Wyntoun, Androw (c. 1350
– c. 1423) (1872), Edited by David Laing, The Orygynale Cronykil
of Scotland, Syllabus 1-9 (Vol. 1-3), Edmonston and Douglas.
READING
LIST:
Chalmers, George, Letters (1784-1816) to George
Chalmers regarding the genealogy of the families of Gordon and Gregory. Personal and estate papers, Heritage Division of Aberdeen
University.
Ferrerius, John, (1545), MS. Historiae compendium de origine et incremento
Gordoniae familiae, Joanne Ferrerio Pedemontano authore, apud Kinlos, fideliter collectum.
Fordun, John (Skene, William F. ed.), (1877), The Historians of Scotland, Edinburgh.
Gordon, Charles, (11th
Marquess of Huntly), (1894), The Records of
Aboyne, MCCXXX-MDCLXXXI, New Spalding Club.
Lythe, S.G.E., and J.
Butt, (1975), An Economic History of
Scotland, 1100-1939. Blackie, Glasgow.
Maitland, Richard, Sir, of Lethington, Knight, The History of the House of Seytoun to the Year
MDLIX, with the Continuation. Alexander Viscount Kingston, to
MDCLXXXVII. Printed at Glasgow. MDCCCXXIX.
Seton, George, (1896), A History of the Family of Seton during Eight
Centuries, Vol. 1 & 2, Edinburgh.
Smith, William, ed., (1870),
Dictionary of Greek and Roman Geography, Little Brown & Co., Boston.
Most above Referenced and
Further Reading books have electronic, downloadable copies available at:
WEB PAGES:
The
Gordon DNA Project
House of
Gordon research
http://www.houseofgordonusa.org/
Seton and Winton family research
http://www2.thesetonfamily.com:8080/cadets/Winton_Family.htm
DNA processing lab
http://www.FamilyTreeDNA.com
APPENDIX
A – IDENTIFYING CLUSTERS IN THE RCC MATRIX:
We
have found the following method to be the most useful and straightforward for
identifying the clusters in the RCC matrix:
APPENDIX
B – Data on Kit Number associations with Gordon Subclusters, Clusters and
Interclusters, with Haplogroup and Pedigree designation, and with associated
statistics.
APPENDIX
B1: List of 119 Subcluster and Cluster members of the Gordon RCC Matrix by Kit
Number, Haplogroup and Gordon Cluster membership.
APPENDIX
B2: For each subcluster, major cluster and intercluster, the average RCC,
standard deviation (SD) of the distribution, number of testee pairs and the SD
of the mean, with conversions to time in the past and year in the past based on
1 RCC = 43.3 years for intercluster TCAs and 1 RCC = 52.7 years for the TCA of
clusters.
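The quantities tabulated in Appendix B2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function name summarize_rcc and the reference_year parameter are hypothetical, the SD is taken to be the sample standard deviation, and the SD of the mean is taken to be SD divided by the square root of the number of testee pairs. Only the two conversion factors (52.7 and 43.3 years per RCC) come from the paper.

```python
from math import sqrt
from statistics import mean, stdev

# Conversion factors quoted in the paper (years per unit of RCC).
YEARS_PER_RCC_CLUSTER = 52.7
YEARS_PER_RCC_INTERCLUSTER = 43.3


def summarize_rcc(pair_rccs, reference_year, intercluster=False):
    """Summarize the RCC values of all testee pairs in one (inter)cluster.

    Assumptions (not from the paper): sample SD, SD of the mean = SD / sqrt(N),
    and a caller-supplied reference_year from which to count back.
    """
    n = len(pair_rccs)
    if n == 0:
        raise ValueError("at least one pairwise RCC value is required")
    avg = mean(pair_rccs)
    sd = stdev(pair_rccs) if n > 1 else 0.0
    sd_of_mean = sd / sqrt(n)
    factor = YEARS_PER_RCC_INTERCLUSTER if intercluster else YEARS_PER_RCC_CLUSTER
    years_ago = avg * factor
    return {
        "pairs": n,
        "mean_rcc": avg,
        "sd": sd,
        "sd_of_mean": sd_of_mean,
        "years_ago": years_ago,                    # time in the past
        "years_ago_sd_of_mean": sd_of_mean * factor,
        "calendar_year": reference_year - years_ago,  # year in the past
    }


if __name__ == "__main__":
    # Hypothetical pairwise RCC values for a small cluster.
    print(summarize_rcc([10.0, 12.0, 14.0], reference_year=2000))
```

Because the pairs within a cluster share testees, they are not statistically independent, so the SD of the mean computed this way should be read as a rough lower bound on the uncertainty.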
APPENDIX
B3: 104 Gordon Clusters (from RCC Surname Matrix) and Corresponding Gordon
Lines (from Pedigrees and Traditional Marker Groupings)
[1]
Despite the apparent precision of dates in this analysis, they are probably
uncertain by about 300 years (SD) because, for recent times,
differences in mutations, which average out over long periods of time, will
cause unpredictable uncertainties in the times when the common ancestor of
recent clusters lived (see Howard 2009a). An uncertainty of 300 years (~10
generations) is 30 percent when we deal with genealogically interesting times
of 1000 years. The RCC time scale has random, mutation-induced errors that are
about the same as those assigned by the testing agencies. However, an uncertainty
of 300 years that results from analyses of cluster intersections or haplogroup
intersections that often range above RCC ~100 translates to smaller
uncertainties when many testees are involved in the time determination.
[2] http://www.familytreedna.com/public/gordondna/default.aspx?section=yresults.
Ties between Kit numbers, haplotypes, Gordon cluster designations and pedigree
associations can be found in Appendix B.
[3] When two Y-DNA
haplotype 37-marker strings are correlated, the resulting correlation
coefficient is usually a number greater than 0.9. In order to simplify the
analysis, we define the Revised Correlation Coefficient (RCC) as the reciprocal
of the correlation coefficient, minus one, times 10,000; that is,
RCC = (1/CC - 1) x 10,000, where CC is the correlation coefficient. Thus RCC will typically
be a number between 0 and 1200. It is proportional to the time elapsed since
the MRCA of the pair of haplotypes lived. If TCA is the time when the common
ancestor of a cluster or intercluster lived, we found (Howard 2009a):
TCA, cluster = Average RCC of all pairs of cluster members x 52.7 years.
TCA, intercluster = Average RCC of all pairs of intercluster members x 43.3 years.
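The definition above can be illustrated with a short Python sketch. It assumes that the correlation coefficient is the ordinary Pearson coefficient, which this footnote does not state explicitly, and it uses two made-up 12-marker strings rather than real 37-marker haplotypes. The function names pearson, rcc and tca_years are hypothetical; only the factor of 10,000 and the conversion factors of 52.7 and 43.3 years come from the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length marker strings."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("haplotypes must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def rcc(h1, h2):
    """Revised Correlation Coefficient: (1/CC - 1) x 10,000."""
    return (1.0 / pearson(h1, h2) - 1.0) * 10_000


def tca_years(average_rcc, intercluster=False):
    """Convert an average RCC to years using the factors quoted above."""
    return average_rcc * (43.3 if intercluster else 52.7)


if __name__ == "__main__":
    # Hypothetical marker values, for illustration only.
    h_a = [13, 24, 14, 11, 11, 14, 12, 12, 12, 13, 13, 29]
    h_b = [13, 24, 14, 10, 11, 14, 12, 12, 12, 13, 13, 30]
    value = rcc(h_a, h_b)
    print(f"RCC = {value:.1f}, about {tca_years(value):.0f} years (cluster scale)")
```

Identical haplotypes give a correlation coefficient of 1 and therefore an RCC of 0; each additional marker difference lowers the correlation and raises the RCC.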
[4]
It must be emphasized
that the theories of the pre-pedigree ancestry of the Gordons, as well as their
familial origins cited throughout this paper, are often conjectural and most
may never be convincingly proved. The veracity of ancient records is nearly
impossible to verify. Because parts of these conjectures may have a factual
basis, we present them here in one place because they may be useful to future
researchers. New genetic tools such as the RCC method, to be applied to the
Gordons later in this paper, may serve to support or cast more doubt on these
theories, and we hope that applying the correlation approach will further our
understanding of the Gordon ancestry and its origins. Our paper attempts to
draw substantive conclusions only from the most credible available
information; its purpose is to stimulate dialogue among future generations of
researchers as more extensive DNA testing and new analytical tools become
available.
[5]
After 75 years this type of information is in the Public Domain; see Bulloch 1903
and 1907.
[6] William Gordon of Crogo see: http://www.thegordondnaproject.com/93333.html
Kit No.
89515 has a well-documented
pedigree, but its haplotype and RCC values are close to members of other
subclusters and other members of Cluster A that have not been assigned to a
subcluster. A similar anomaly appears with Kit No. 93333. We
believe that these uncertainties indicate a very close association with the
TMRCAs of subclusters Aa, Ab and Ae within major Gordon
Cluster A, with these two testees having ancestors near the transition point
into individual subclusters.
[7] It
is evident that when an application program (e.g., Mathematica) is used to
form a phylogenetic tree from the same data, the common ancestors of
interclusters whose RCC values differ by less than about 10% lived
at the same time (see Howard and Schwab 2012, this issue).
[8]
The ISOGG’s Y-DNA Haplogroup Tree can be found at
<http://www.isogg.org/tree/index.html>. It is being continually updated.
[9] See the
genealogical section of the House of Gordon <http://www.houseofgordonusa.org>