How to Use SQL to Query Your Family Tree

If you know some basic SQL and you have a family tree, this tutorial will show you how to:

  • Convert your family tree into a SQL database on your local computer
  • Run SQL queries to create a detailed genealogy report of everyone in your tree

All you need is some free software and a Windows or Mac computer.

Don’t worry if you’re a SQL beginner. We’ll show you how to create and combine simple queries to make a complex list.

And if you don’t want too much explanation, you can just download the fully completed script and get the insights from your tree.

YouTube Walkthrough

If you prefer a video walkthrough of this tutorial, then here it is:

YouTube player

Step 1: Export Your Family Tree to a GEDCOM File

A GEDCOM file is a text file that contains your family tree in a format that can be imported into a wide range of genealogy applications and websites.

If your family tree is online on a website like Ancestry, MyHeritage, or Geni, you can easily export it to a GEDCOM File. Here are detailed instructions for these websites:

If your tree is in desktop software like Family Tree Maker or Family Tree Builder, you’ll find the option to export it to GEDCOM under the File menu.

If your tree is in RootsMagic, then you have a nice shortcut in that you don’t need to export it for this tutorial. Just continue on.

Step 2: Install RootsMagic (Free)

RootsMagic is a desktop family tree application that can be installed on Windows or a Mac. We are going to use it to convert your GEDCOM into a SQL database.

You just need the free version (known as RootsMagic Essentials).

Follow this link to download the installer to your machine. You don’t need to sign up to anything (just unclick the little checkbox underneath the registration details).

Double-click on the installer and accept all the defaults in the wizard.

Step 3: Convert Your GEDCOM File into a Database

Now, you’ll use RootsMagic to get your tree into a SQL database.

Launch RootsMagic on your local machine and choose the free version.

You’ll land on the Home screen which has several tiles with different features.

The top left tile is headed “RootsMagic Files”. Click on “Import File” here.

The next screen presents several options for the type of file to import. Choose “GEDCOM” (it’s the last option in RootsMagic version 9).

The next screen lets you choose the GEDCOM file that you exported in Step One.

This screen actually searches your folders looking for files with the .ged extension. You may see it listed near the top of the screen. If you do, just click on the name.

Otherwise, click the “Browse for File” button at the bottom of the screen and go find your file.

The next screen lets you choose where to put your new family tree SQL database. The “Browse for Destination” button at the bottom will let you go to your chosen location.

The database will have the same name as your GEDCOM file (but a different extension) – you can change that here if you like.

The next screen isn’t very important. You can leave all the default options and click OK.

There’s just one more pop-up screen to navigate (whew!). Again, it’s not important – just leave the default options and click OK.

You now have your family tree displayed in the RootsMagic software.

At this point, we don’t need RootsMagic anymore because it has also converted your GEDCOM into a SQLite database. That’s what we’re interested in.

You can close the RootsMagic now, using the x button on the top right of the screen. Oh wait, there’s one more pop-up screen about a backup.

Don’t worry about that – click the “Skip Backup” button and get outta here.

Finding The Database

So, where’s the database? It’s where you put it using the RootsMagic screens.

When you chose the destination for your tree, the RootsMagic software created two files. You can ignore the one with the .LST extension.

Your database is in the file with the extension of “.rmtree”.

Step 4: Install SQLiteExpert or DBeaver (Free)

You need a SQL browser that can connect to your family tree database.

You may know that there are several types of database systems: SQL Server and Oracle would be two big names. The system that will hold our family tree is known as a SQLite database. This is an open source (i.e. free) database system.

There are several free browser options – I recommend SQLiteExpert if you’re using Windows. That’s the one I use in the video tutorial. You can download it from here.

DBeaver is another good choice and it will run on a Mac.

Double-click on the installer you downloaded and accept all the defaults in the wizard.

Step 5: Connect to your Family Tree SQL Database

After you install the SQL browser, you want to connect to your database.

Using SQLite Expert

If you’re using SQLite Expert, the option to open a database is under the File menu as in this picture:

You’ll get an explorer to go find the database but there is one challenge – SQLite helpfully defaults to only showing files with standard SQL extensions.

But your RootsMagic file has the extension of .rmtree so it won’t show in the explorer. The solution is to toggle the list of extensions to show “All Files”.

Using DBeaver

DBeaver is a little different – expand the “Database” menu in the top ribbon and choose “New Database Connection” (the top option).

Now, you need to find the blue SQLite icon on the screen. You may need to scroll down a little.

Choose SQLite and click the “Next” button. Here, the “Open” button will give you an explorer to pick your .rmtree file.

First Look at Your Family Tree Database

Whoo-hoo! Regardless of your browser, you should now see a list of tables in the family tree database that holds the data from your GEDCOM file.

It’s displayed in the left pane in SQLite Expert.

There are nearly thirty tables. If that seems like a lot, I’ll tell you that you’ll mostly be interested in about six of them.

You’ll spot immediately that RootsMagic likes to name all the tables with “Table” at the end. Yeah, that’s a bit odd, but we’ll live with it.

Some of the names are a bit cryptic (“PayLoadTable”?) but for the most part, the names make sense. If you think you’ll find the names you need in NameTable and the locations in AddressTable, then you’ll be half right.

Let’s take our first look. Your SQL browser will have a tab or page where you can type and execute SQL Statements against your connected database.

Here’s the SQL to look at all the data in NameTable.

SELECT	*
FROM	NameTable

Here’s the display of a few records from my family tree database.

If you’re curious about the AddressTable, you’ll probably find that it doesn’t have much data that relates to your family tree.

The PlaceTable is actually where you’ll find the birth and death places you’ve entered.

Extra Resources

If you’re used to using SQL professionally, at this point you’d be looking for a formal document that describes what each table is for and the relationships between them.

Unfortunately, RootsMagic doesn’t provide one. There’s no reason why it should. We’re lucky that their application uses an open source database that we can get at.

However, a collection of volunteers have pooled their analysis and knowledge on an independent website. You can find a really useful document there where hobbyists add their findings to describe the tables and relationships.

The Basic Person Report

Every person in your tree has a row in the PersonTable.

Each row has an ID that is used in other tables – such as the NameTable (it’s called the OwnerID in that table).

If you run some counts (SELECT COUNT(*) FROM [table]), you may find that there are more rows in the NameTable than the PersonTable.

That will be because you’ve got additional names for some persons e.g. nicknames or legal names changes.

If you want a simple way to pull back one First Name/Last Name for each person, you can use the IsPrimary flag in the NameTable.

SELECT	Given, 
                Surname
FROM	NameTable
WHERE	IsPrimary = True

I often find that I don’t need the PersonTable in my queries. However, that table has the gender of the individual. If you want to display that in a list, your query could look like this:

SELECT	nt.Given, 
        	nt.Surname,
        	pt.Sex
FROM	PersonTable pt
        	INNER JOIN NameTable nt ON nt.OwnerID = pt.PersonID
WHERE	nt.IsPrimary = True

If you run this query, you’ll see that the gender attribute is 0 or 1. You can infer from your own data as to which is male or female, I’ll tell you that 0 is male.

A more helpful report would look like this:

SELECT	nt.Given, 
        	nt.Surname,
        	CASE WHEN pt.Sex = False THEN 'Male' ELSE 'Female' End Gender
FROM	PersonTable pt
        	INNER JOIN NameTable nt ON nt.OwnerID = pt.PersonID
WHERE	nt.IsPrimary = True

Retrieving Birth and Death Dates

Any one person in your tree will likely have multiple dates associated with them: birth, death, marriage, and residence – just to name a few.

These dates are part of an event in the life of a person. So, you’ll find them in the EventTable.

Take a look at this query and result for a single person in my database. Four rows are returned with a rather cryptic looking date.

I’ll tell you that the date is formatted as YYYYMMDD, so that first date is the 22nd of September 1864.

I notice that I have zeroes in the second date – that’s what happens when the record only has a year and is missing the day and month in this case.

We’ll do a bit of formatting later, but the main problem is that I don’t know what these dates represent.

The solution is to use a category attribute in the EventTable. The column is called EventType and this is a numeric lookup to a list of categories in the FactType table.

This query pulls back the full set of data:

SELECT     nt.Given,
                nt.Surname,
        	ftt.Name,
        	et.Date
FROM      EventTable et 
        	INNER JOIN NameTable nt ON nt.OwnerID = et.OwnerID
        	INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType

I’ve highlighted the category field (the column name is “Name”) in the display below. The four records displayed show that this person has one birth and death date and two residence dates (likely found in census records).

I want to keep things simple for my first report, so I’m going to start with getting the birth date only.

You may think that your SQL should look like this:

SELECT    nt.Given,
        	nt.Surname,
        	et.Date
FROM      EventTable et 
        	INNER JOIN NameTable nt ON nt.OwnerID = et.OwnerID
        	INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType 
WHERE   ftt.Name = 'Birth'

The problem is that you’ll get a cryptic error when you run this query in the RootsMagic database.

Depending on the browser, it’ll be a pop-up or the error may be displayed beneath the query. It could look like this:

How To Deal With the RMNOCASE Error

The error message is “no such collation sequence: RMNOCASE”.

The problem for us is that RootsMagic have a special way of dealing with different languages and characters in the data. They have called this RMNOCASE and we don’t have access to it.

One solution is to override this named collation sequence with the standard sequence, which is called NOCASE.

You can use the syntax “COLLATE NOCASE” to do this. Here is our amended query:

SELECT    nt.Given,
        	nt.Surname,
        	et.Date
FROM      EventTable et 
        	INNER JOIN NameTable nt ON nt.OwnerID = et.OwnerID
        	INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType 
WHERE   ftt.Name = 'Birth' COLLATE NOCASE

How To Format the Date Field

Here is an example date from my data: “D.+18640922..+00000000..”

As I mentioned, the useful information is embedded within the string in the format of YYYYMMDD.

Here is one way to extract the relevant details:

SELECT    nt.Given,
        	nt.Surname,
        	SUBSTRING(et.Date, 4, 8) AS AS BirthDate
FROM      EventTable et 
        	INNER JOIN NameTable nt ON nt.OwnerID = et.OwnerID
        	INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType 
WHERE   ftt.Name = 'Birth' COLLATE NOCASE

Let’s break that down.

The year starts at the fourth character in the field and is itself four characters long. The month and day are two characters each:

 SUBSTRING(et.Date, 4, 8)

It still doesn’t look great when there are zeroes in the date. You may also want to split the date into separate day, month, and year fields. I’ll address that at the end of this tutorial.

Adding The Birth Place Details

The location details for each person are found in PlaceTable.

A particular event usually occurs at a specific place and there is a PlaceID lookup in EventTable to get the associated location.

But the place can be unknown. It’s a quirk of this database that PlaceID has a zero (0) value in EventTable in that case.

There’s no 0 value PlaceID in the PlaceTable so if you do a standard join (INNER) between the tables, the query won’t return anything for event records with a 0 PlaceID.

Instead, your query should be: “give me all the event records and show place details when they exist”. For this, we use a LEFT JOIN:

SELECT    nt.Given,
        	nt.Surname,
        	SUBSTRING(et.Date, 4, 8) AS AS BirthDate,
       	        pt.Name as BirthPlace
FROM      EventTable et 
        	INNER JOIN NameTable nt ON nt.OwnerID = et.OwnerID
        	INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType 
                LEFT JOIN PlaceTable pt ON pt.PlaceId = et.PlaceId
WHERE   ftt.Name = 'Birth' COLLATE NOCASE

Getting The Death Details

So, we’ve got a query that retrieves the birth date. What about the death date?

Just change the filter to look for “Death” instead of “Birth”, and I also change the column aliases:

SELECT    nt.Given,
        	nt.Surname,
        	SUBSTRING(et.Date, 4, 8) AS AS DeathDate,
       	        pt.Name as DeathPlace
FROM      EventTable et 
        	INNER JOIN NameTable nt ON nt.OwnerID = et.OwnerID
        	INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType 
                LEFT JOIN PlaceTable pt ON pt.PlaceId = et.PlaceId
WHERE   ftt.Name = 'Death' COLLATE NOCASE

How to Use Subqueries to Form a Single Query Report

We don’t want to have two lists – one with birth dates and one with death dates.

Instead, we want a single query that shows the birth and death details on the same line.

To achieve this, we’ll take the separate queries we’ve built up previously and use them as subqueries.

There’s nothing new in the giant query below – we’re just putting the pieces together.

SELECT 	nt.Given, 
		nt.Surname,
                bd.BirthDate,
                bd.BirthPlace,
                dd.DeathDate,
                dd.DeathPlace
FROM 	NameTable nt
                LEFT JOIN (
                         SELECT  et.OwnerID,
                                      SUBSTRING(et.Date, 4, 8) AS BirthDate,
                                      pt.Name COLLATE NOCASE AS BirthPlace
                        FROM    EventTable et 
                                      LEFT JOIN PlaceTable pt ON pt.PlaceId = et.PlaceId
                                      INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType                         
                        WHERE  ftt.Name = 'Birth' COLLATE NOCASE	
                         ) bd ON bd.OwnerID = nt.OwnerID         
                LEFT JOIN (
                        SELECT  et.OwnerID,
                                      SUBSTRING(et.Date, 4, 8) AS DeathDate,
                                       pt.Name COLLATE NOCASE AS DeathPlace
                        FROM    EventTable et 
                                      LEFT JOIN PlaceTable pt ON pt.PlaceId = et.PlaceId
                                      INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType                        
                        WHERE ftt.Name = 'Death' COLLATE NOCASE	
                         ) dd ON dd.OwnerID = nt.OwnerID 

FROM     NameTable nt

        LEFT JOIN (

             SELECT  et.OwnerID,

                     SUBSTRING(et.Date, 4, 8) AS BirthDate,

                     pt.Name COLLATE NOCASE AS BirthPlace

             FROM    EventTable et

                     LEFT JOIN PlaceTable pt ON pt.PlaceId = et.PlaceId

                     INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType                        

             WHERE   ftt.Name = ‘Birth’ COLLATE NOCASE   

                   ) bd ON bd.OwnerID = nt.OwnerID        

        LEFT JOIN (

             SELECT  et.OwnerID,

                     SUBSTRING(et.Date, 4, 8) AS DeathDate,

                     pt.Name COLLATE NOCASE AS DeathPlace

             FROM    EventTable et

                     LEFT JOIN PlaceTable pt ON pt.PlaceId = et.PlaceId

                     INNER JOIN FactTypeTable ftt ON ftt.FactTypeID = et.EventType                       

             WHERE ftt.Name = ‘Death’ COLLATE NOCASE  

             ) dd ON dd.OwnerID = nt.OwnerID

Improvements And Better Queries?

You now have a SQL report of the name, birth, and death details of everybody in your family tree.

That may be exactly what you want. But there are some things I don’t love about the display.

I don’t like the zeroes when the day or month are missing. And I don’t like to see NULLs in the display. Usually, I’d prefer to see blanks or empty space in those circumstances.

We have a free script available for download that formats the data more cleanly using various SQL functions.

I also like to see the parents’ names alongside every individual. With my Irish heritage, sometimes it’s good to have the extra context when cousins in the same village have the same name.

This is a more complex query as it pulls in the details from more tables. You can also download this upgraded “Super Report” for a few bucks at this link.

Coming Soon – Our “SQL For Your Family Tree” Course

This tutorial is a shortened version of Lesson 1 of our “SQL For Genealogy” video course.

Each lesson shows you how to build a different detailed genealogy report from your family tree data.

Here is a full breakdown of the course.

Register (for free) at this link to get notified of when the course is available.

Margaret created a family tree on a genealogy website in 2012. She purchased her first DNA kit in 2017. She created this website to share insights and how-to guides on DNA, genealogy, and family research.

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.