Conclusions from the use of autosomal DNA

Most of the relationships we sought to prove lie too many generations back to give reliable DNA matches through conventional means. The potential link between, for example, the English Lousadas and their 10th and 11th cousins among the USA Lousadas goes back 9 generations to different siblings of Amador de Lousada of Vinhais. Even 4th cousins do not show up well, for I have only one (25cM) segment in common with my 4th cousin Jeremy Lousada, and only three 3cM matches with my 5th cousin John Griffiths. From the outset, we were well aware of the poor match-prediction powers of small (3cM) segment matches. Even at 7cM there is significant uncertainty as to their reliability, and only going to 15cM removes it. Therefore we sought to rely on maximising the number of matches and then using statistical and related techniques to discern patterns lying within our datasets.

As GEDmatch had recommended Qmatch to us for small matches, and as we still wished to accumulate many matches, we set about using Qmatch (set at 3cM, P=3) to compile all 2255 segment matches (mostly off-target matches and false positives, of course) from our 13-relative sample; see here the 1963 matches found before ELL was added to the sample of relatives. We found 46 RSBCs, 25 lefthand and 21 righthand, and most were new to us. We spent much time looking at RSBCs, at how frequently they occur across chromosomes and in the match-rich areas on chromosomes. But they proved quite difficult to work with, and were relatively unproductive. At this point we checked Qmatch at the default 7cM P=7 settings, and were surprised to find this most worthwhile, for with our set of 12 relatives it yielded inter-branch matches that were totally absent with our 7-relative set.

We then moved from RSBCs to ASBs, but they proved misleading. Just as we had prematurely claimed in 'Fun with Autosomal DNA', we again thought that we might have established a genetic link between the USA Lousadas, the English Lousadas, the Barrows, Scott's wife (and hence the Fischls) and Randy's parents. The problem with ASBs was that, instead of an ASB being associated with a unique crossover, many unrelated crossovers can all report to the same ASB (the same applies to RSBCs as well). This is because a crossover is only measured as lying between 2 particular SNPs - these SNPs being the first non-matching SNP beyond the crossover at one end of a segment and the last SNP within the segment at the other end of the segment. Typically there are 5000 base-pair positions between adjacent SNPs, so the potential for unrelated and (from the family viewpoint) spurious crossovers at the same ASB is large. That is, ASBs are not the amazingly precise way of penetrating the fog of unreliable small matches that we had hoped for. (An unresolved thought about ASBs, in the light of their appearance below, is that the 3pASBs which generate the 3cM P=7 matches shown below have an inbuilt strength that 4pASBs do not have, in that the common relative in both abutting matches in a 3pASB is an assurance that the crossovers are genuine.)

Timely advice from GEDmatch in February 2026 helped us out of this impasse. We thus came to use Qmatch to look again at all 3cM matches, not at P=3 but at P=7. These conditions give reasonable quality matches, almost as good as 7cM matches from some other providers. With our set of 13 relatives (and our comparison set of 13 randoms) we were able to arrive at the position summarised as follows:

From this we can see the futility of the 3cM P=3 results, for at this setting the randoms show more matches! While the RSBC totals indicate that a family signal may exist, the ASB numbers seem odd (and perhaps can only be explained by a combination of spurious crossovers and the much greater possible ancestor pool where there is no genealogy). But at 3cM P=7, the relatives show 187 more matches than the randoms, the strongest such signal we have seen in comparing relatives with randoms. Looking at the first of the following 2 tables (in which each segment match is counted twice) we see the match-numbers for relative pairs. For example, among the 14 B/Je matches is one of 5cM on Cr10 which is presumably their Ancestry.com 7cM match. Missing matches (shown in grey) align with a family branch; JG lacks the DNA which would allow a match with the 2 Barrow-Lousadas A and J, whose specific ancestry may have been responsible, though in different ways:

From the next table we see that the randoms show a greater dispersion - with 4 people in fewer than 4% of matches (compared with 1 relative - J, who despite proven close Lousada genetic and genealogical links, has anomalously low match numbers), and 4 people in more than 10% of matches (compared with 1 relative). That is, as expected, family connections are tighter, despite some stochastic phenomena also being present. The presence of 5cM and 7cM matches is indicated by colours as above, as are (in pale blue) the pairs showing 6 segment-matches (for the reason explained below). The average size of the relatives' segments is marginally smaller than that of the randoms' segments, but not if we include the exceptionally large (1st cousin) match between A/Ju. But the random set could contain its own (but unexpected/unknown) internal family links, which must be allowed for when we consider the real match count and the false (off-target) match count.

We consider that the 20 7cM matches shown by the randoms reflect their true match position in comparison with the 43 shown by the relatives, and we thus make the assumption that the randoms show 47% (that is 20/43) of the real matches in the relatives' set. Combining this with the 187-match excess of relatives over randoms noted above, elementary algebra gives a false match component of 207 in each case (assuming equality of false matches as the set size is 13 in each case), with real matches in the random set being 166 and the number of real relatives' matches being 353. In any case, almost certainly the real matches between relatives (just estimated to be 353) include the 43 relative matches of 7cM plus the 39 relative matches between 5-7cM. Where these relatively strong matches are located there are 202 matches.
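The algebra behind these figures can be sketched as follows. This is an illustration only, under stated assumptions: the totals of 560 relative matches and 373 random matches are not quoted directly above but are inferred from the component figures (353 + 207 and 166 + 207), the share is rounded to 0.47 as in the text, and the function name is hypothetical.

```python
def split_real_and_false(total_relatives, total_randoms, random_share):
    """Split two match totals into real and false components.

    Assumes both sets contain the same number of false matches, and that
    the randoms' real matches are random_share times the relatives' real
    matches.
    """
    # relatives = real + false
    # randoms   = random_share * real + false
    # Subtracting: relatives - randoms = (1 - random_share) * real
    real_relatives = (total_relatives - total_randoms) / (1 - random_share)
    real_randoms = random_share * real_relatives
    false_matches = total_relatives - real_relatives
    return real_relatives, real_randoms, false_matches


# Assumed totals (inferred from the text): 560 relative and 373 random matches.
real_rel, real_rand, false_each = split_real_and_false(560, 373, 0.47)
print(round(real_rel), round(real_rand), round(false_each))  # 353 166 207
```

The result is sensitive to the assumed share: the closer the randoms' share is to 100%, the larger the estimate of real matches that the same 187-match excess implies.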

We now start to draw together the probably proven matches from different sources. In the chart below the handful of RSBC matches and ASB matches are all that remains after an extensive winnowing process. Neither can be considered definitive. The strongest data arises from the 10 links established by Qmatch at 7cM and P=7; these are marked yellow as in the chart above. The 13 5-7cM matches also established at P=7, marked green in both charts, are included (actually below only 11 are, because ELL was not included when the table below was first drawn up), but they may be a little less certain than the 7cM data.

Not included is the balance of the 353 estimated real matches in the relatives' match data; we don't know which are the real ones! But we can make an estimation. In the randoms' data, the 3 outstandingly low-match people (MB, M and KB) show no pair with more than 5 segment-matches. As these 3 people are unlikely to have a family connection in the same timeframe as shown for the other 10 randoms, we take 6 segment-matches as indicating a family connection. Applying this to our relatives' set, we find that 36 pairs have 6 or more matches (apart from those pairs already showing at least one match of 5cM or more). These 6-match pairs (coloured pale blue above) total 308 segment-matches, compared with the 203 segment-matches for the pairs containing the 5cM and 7cM matches, giving 511 in all. (The pairs showing 5 or fewer segment-matches total 49, presumably mainly false matches, which of course leaves 511.) Thus 158 (that is 511 - 353) of these nominal 511 true matches are false, or 31%. The probability of all 6 segments being false is 0.31 to the power 6, or about 0.09%, if each segment is treated as independently false; this is tantamount to proving that a 6-segment match is a real match. Of the 36 blue pairs, 30 of them (that is, without ELL) are included in another version of the previous chart (below). Here it will be seen that our 6 segment-match rule eliminates the need for RSBC and ASB data.
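The 6-segment arithmetic can be checked with a few lines of Python. This is a sketch only: it assumes, as the calculation above implicitly does, that each segment in a pair is false independently of the others, and the variable names are illustrative.

```python
# Figures taken from the text above.
blue_pair_segments = 308     # segments in pairs with 6 or more matches
strong_pair_segments = 203   # segments in pairs holding a 5cM or 7cM match
estimated_real = 353         # estimated real matches among the relatives

nominal_true = blue_pair_segments + strong_pair_segments  # 511
false_among_nominal = nominal_true - estimated_real       # 158
false_fraction = false_among_nominal / nominal_true       # about 0.31

# Chance that all 6 segments of a pair are false, assuming independence
# and using the rounded 31% figure as in the text.
p_all_six_false = round(false_fraction, 2) ** 6

print(nominal_true, false_among_nominal)               # 511 158
print(f"{false_fraction:.0%}, {p_all_six_false:.2%}")  # 31%, 0.09%
```

If false segments tend to cluster in the same pairs, the independence assumption fails and the true probability would be higher than this figure.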

Above in the chart of relative matches, we can see the 4 categories of relative - English Lousada (MD), Barrow-Lousada (Ju, A, J), Barrow (E, JG, RM, RM, SW), and US Lousada (ELL, B, Je, TP). There is so much inter-branch matching that we may feel confident that despite the small segment sizes, our genealogy (in which the various branches originate in the ancestral family of Amador de Lousada) is supported. This is also despite the fact that some relatives (especially J, followed by JG and ELL) match more reluctantly than others. Illustrations of the inter-connectedness of the branches are:

The sole English Lousada (MD) has a satisfactory set of matches with all 3 other categories. Remarkably, these are much more numerous with the Barrows and the US Lousadas (with both of whom he has no conceivable genetic link other than the target Lousada link) than with the Barrow-Lousadas, with whom he has a proven Lousada link!
Triangulations are rare and precious - not so much that shown by the 1st and 4th cousins A, Ju and J at Cr16 (3219600 - 6259081) but those shown by Ju/RM/Je at Cr2 (217m - 220m), B/Je/E at Cr5 (79m - 81m), J/TP/E at Cr10 (116m - 119m), RM/E/A at Cr17 (31978888 - 33622774) and RM/B/Je at Cr22 (25640628 - 26225384). At the Cr2 site just noted the triangulations Ju/Je/TP and B/Je/TP are likely to be false (containing at least 2 non-Lousada matches, as Ju does not match B here). By comparison the set of 13 randoms produced 2 triangulations, nicely further distinguishing the randoms from the relatives, while confirming our cautious interpretation of small-match triangulations as just seen. The absence of a triangulation can also be informative - on Cr18 (6.6m - 8.1m) each of the 1st cousins Ju and A has a good match with TP, but they do not match each other here, which shows that at least one of the 2 'good' matches is in fact false.
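The overlap test behind a triangulation can be sketched as below. This is a minimal illustration, not the procedure actually used here: it assumes that a triangulation simply means that all three pairs among three people share one common stretch of a chromosome, and the labels, segment coordinates and function name are all hypothetical.

```python
from itertools import combinations


def triangulation_region(people, pair_segments):
    """Return the (start, end) region shared by all three pairs.

    Returns None if any pair has no recorded match or if the three
    segments have no common overlap.
    """
    segments = []
    for a, b in combinations(sorted(people), 2):
        seg = pair_segments.get((a, b))
        if seg is None:
            return None  # one pair does not match here: no triangulation
        segments.append(seg)
    start = max(s for s, _ in segments)
    end = min(e for _, e in segments)
    return (start, end) if start < end else None


# Hypothetical pairwise matches on one chromosome (base-pair positions).
matches = {
    ("X", "Y"): (31_900_000, 33_700_000),
    ("X", "Z"): (31_950_000, 33_500_000),
    ("Y", "Z"): (32_000_000, 33_600_000),
    # P and Q each match T, but there is no P/Q match at this site.
    ("P", "T"): (6_600_000, 8_100_000),
    ("Q", "T"): (6_700_000, 8_000_000),
}

print(triangulation_region(["X", "Y", "Z"], matches))  # (32000000, 33500000)
print(triangulation_region(["P", "Q", "T"], matches))  # None
```

The second call mirrors the kind of situation described for Cr18, where two people each match a third but not each other.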

Finally, out of respect to those whose kit numbers went into generating the random sample, we re-present the table of random matches using our 6-match criterion. This shows perhaps 3 family linkages around 4-8 generations ago (A/H/J, MW/DG and N/M/P), together with an outlying group contributed by Julian Land (MB, MMC, KB). The random 'family' group of 10 contains kits contributed by John Griffiths (C, N, A, H, J) and Julian Land (S, MW, DG, M, P). If all the 56 matches contributed by the outlying group are false, then 151 matches (that is 207 - 56) in the 'family' group are false, or 48% (151 of 317), compared with 31% of the relatives' matches. That is, the blue matches are less likely to be real in this case than for our Baruch Lousada relatives above, and there are proportionally fewer of them (10/45 = 22%) than for the Lousada relatives (30/78 = 38%).
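These closing percentages can be reproduced as follows. The sketch is illustrative only; in particular, the denominators 45 and 78 coincide with the number of possible pairs among 10 and 13 people, and it is assumed here that this is where they come from.

```python
from math import comb

# Figures taken from the text above.
false_total = 207      # estimated false matches in the random set
outlier_matches = 56   # matches contributed by the outlying group
family_matches = 317   # matches within the random 'family' group of 10

false_in_family = false_total - outlier_matches  # 151
false_share = false_in_family / family_matches   # about 0.48

# Assumption: the denominators are counts of possible pairs.
random_pairs = comb(10, 2)     # 45 pairs among 10 randoms
relative_pairs = comb(13, 2)   # 78 pairs among 13 relatives

print(false_in_family, f"{false_share:.0%}")                      # 151 48%
print(f"{10 / random_pairs:.0%}", f"{30 / relative_pairs:.0%}")   # 22% 38%
```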