
Computer Genealogy Specialists


Secrets of a Successful Match-Merge

(Article 26)


For a printer friendly version of this document complete with illustrations contact CGS by email and request this document by number and title.  It will be returned to you by email as soon as possible.



Sources of GEDCOM files

GEDCOM files come from family members, other researchers, and repositories (libraries where people store their genealogy files).  GEDCOM (Genealogical Data Communication) is a file format that genealogical programs use to communicate with one another.  It is specially prepared to remember names, dates, places, relationships, notes, and sources.  Anyone with a genealogy program can copy all or part of their genealogical information to a GEDCOM file and share it with others.  So you may receive one on a diskette, as an attachment to an email message, or you may copy it from a GEDCOM repository (genealogical CD or Internet download).
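To give a feel for what is inside a GEDCOM file, here is a minimal Python sketch that splits GEDCOM lines into their parts.  Each line starts with a level number, optionally followed by a cross-reference ID (such as @I1@), then a tag (NAME, BIRT, DATE, PLAC) and a value.  The sample record and the function name parse_gedcom_line are made up for illustration; this is not how PAF itself reads the file.

```python
def parse_gedcom_line(line):
    """Split one GEDCOM line into (level, xref_id, tag, value)."""
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@"):
        # Record line such as "0 @I1@ INDI": the cross-reference ID comes before the tag.
        xref = parts[1]
        rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
        tag = rest[0]
        value = rest[1] if len(rest) > 1 else ""
    else:
        # Ordinary line such as "2 DATE 12 MAR 1820": tag, then optional value.
        xref = None
        tag = parts[1]
        value = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, value


# Made-up sample individual, for illustration only.
SAMPLE = """0 @I1@ INDI
1 NAME John /McGregor/
1 BIRT
2 DATE 12 MAR 1820
2 PLAC Perth, Scotland"""

records = [parse_gedcom_line(line) for line in SAMPLE.splitlines()]
for record in records:
    print(record)
```

The level numbers show nesting: the DATE and PLAC lines at level 2 belong to the BIRT event at level 1, which belongs to the individual at level 0.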


GEDCOM repositories are a way to publish your genealogical work and have cousins and researchers contact you for further information.  Likewise you can find the work of others and contact them.  Repositories are collections of compiled genealogies submitted by other family researchers like yourself.  These sites are used in the initial survey stage of research.  To use them successfully, always try to contact the submitter, and remember that without primary source documentation the information given is only a set of clues for further research.  Ask submitters for the specific source of the specific information you are interested in and offer to share with them the information you have.  Never ask them to send you everything they have.  Start with small requests, share your information, and build a relationship.  The following Internet sites and CDs are some of the largest and most popular.  There are over 65,000 Internet sites dealing with genealogy and more than 1,000 genealogical CDs published.


FamilySearch.org (free - 35 million names in the Ancestral File)

Ancestry.com (free for surname searches - 25 million names)

Genealogy.com (free for surname searches - 10 million names)

GenealogyPortal.com (free)

KindredKonnection.com (free for surname index searches - 44 million names to view)

     (Downloading the actual pedigree requires a subscription $15/month or $100/year)

WorldConnect.genealogy.rootsweb.com  (free - 15.7 million names)

RootsWeb.com (free - surname and county searches, immigrant ship passenger lists, and more.  Over 150 million names.)

Everton.com (free for limited searches)

Geneanet.org (free - 6.5 million names)

GenServ.com ($12/year plus your GEDCOM; seniors and students $6/year - 18.9 million names)

LDS International Genealogical Index on CD (400 million names)

LDS Ancestral File on CD (35 million names)

LDS Pedigree Resource File on CD (9 million names)

Family Tree Maker World Family Tree on CD (44 million names)


What to do with GEDCOM files you receive


How you work with GEDCOM files depends on where they came from.  Several options are discussed in the following sections.  Regardless of the option, it is a good idea to import a GEDCOM file into an empty PAF database and look at it before you decide to import any part of it into your own pedigree.  Loading a GEDCOM file into PAF lets you use all the display, search, and reporting options within PAF to scrutinize the new material before you deem it worthy to be included with your best work.  It also gives you an opportunity to select and prepare a portion of the import for match-merge with your pedigree.  It is this advance preparation that will give you full confidence that the match-merge will go smoothly, and that you will be in full control of every step of the process.


A special note about the traditional match-merge option in PAF: unless you are familiar with all the individuals to be match-merged, the process can lead to unexpected results.  Many individuals in a pedigree are poorly identified, having only a name and a relationship (John, the son of John).  You may be tempted to merge the records because all fields of information (as few as they are) are the same.  Be careful!  Unless 5 to 7 pieces of information match for each individual, they may not be the same person.  You may need to consider information beyond what is displayed in the match-merge window, including children's names, other parents, or other relatives in the pedigree.  The process below will give you some guidelines for a safe match-merge.
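The 5-to-7 rule of thumb can be pictured with a small Python sketch.  It counts only fields that are filled in on both records and agree, so two sparse records that merely share a name do not pass.  The field names, the sample records, and the threshold of 5 are assumptions chosen for illustration; PAF does not expose such a function.

```python
# Hypothetical field names; a real pedigree record holds more than this.
FIELDS = ["name", "birth_date", "birth_place", "death_date",
          "father", "mother", "spouse"]
MIN_MATCHES = 5  # lower end of the 5-to-7 guideline


def count_matching_fields(a, b, fields=FIELDS):
    """Count fields that are present in both records and agree."""
    matches = 0
    for field in fields:
        left, right = a.get(field), b.get(field)
        # An empty field on either side is not evidence of a match.
        if left and right and left.strip().lower() == right.strip().lower():
            matches += 1
    return matches


def safe_to_merge(a, b):
    """True only when enough independent pieces of information agree."""
    return count_matching_fields(a, b) >= MIN_MATCHES


# Two poorly identified records: "John, the son of John".
sparse_a = {"name": "John McGregor", "father": "John McGregor"}
sparse_b = {"name": "John McGregor", "father": "John McGregor"}

# Two well-documented records (made-up data).
full_a = {"name": "John McGregor", "birth_date": "12 Mar 1820",
          "birth_place": "Perth, Scotland", "death_date": "3 Jan 1891",
          "father": "John McGregor", "spouse": "Mary Stewart"}
full_b = dict(full_a)

print(safe_to_merge(sparse_a, sparse_b))
print(safe_to_merge(full_a, full_b))
```

The sparse pair agrees on everything it contains, yet only two fields match, which is exactly the situation the warning above describes.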


GEDCOM downloads from Ancestral File - special case


This option is mentioned first, because it is an especially simple case.  PAF and Ancestral File were designed to work together. They both understand Ancestral File Numbers (the number assigned to each individual in the Ancestral File).  For example, say you download a long pedigree from Ancestral File in several overlapping GEDCOM files (because of the limitation of the number of generations allowed with each download).  You can import all the downloads into an empty PAF database, and in one automatic operation, match-merge all overlapping (duplicate) individuals to make one united pedigree as though you had downloaded from the Ancestral File in a single large GEDCOM file.  To do this select Tools from the menu bar, Merge on AFNs from the drop-down list, answer No to making a backup (your import GEDCOM files are your backup), and answer Yes to All to "Merge these individuals?"  In one automatic operation all duplicate individuals are merged.


This works because every individual has an Ancestral File Number, and the information about each duplicate individual is the same (having come from the same Ancestral File source).  Furthermore, no notes come with Ancestral File downloads, so we are not creating two sets of notes for an individual when we merge records together.  There will be multiple identical sources for each merged individual (one for each GEDCOM file imported) indicating the Ancestral File as the source and the Family History Library as the repository of the original information.  To merge duplicate sources and repositories, select Tools in the menu-bar at the top of the PAF screen, and click on Merge duplicate sources and citations from the drop-down menu.  In one automatic operation all duplicate sources and citations with identical information are merged together.  Pretty neat, huh?
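The idea behind merging on AFNs, and the reason duplicate sources pile up afterward, can be sketched in Python.  This is only an illustration of the logic described above, with made-up records and AFN values; the function names are hypothetical and PAF's internal method is not documented here.

```python
def merge_on_afn(individuals):
    """Collapse records that share an AFN, keeping the lowest RIN."""
    merged = {}
    for person in sorted(individuals, key=lambda p: p["rin"]):
        afn = person.get("afn")
        # Records without an AFN are never merged with anything.
        key = afn if afn else ("RIN", person["rin"])
        if key not in merged:
            merged[key] = dict(person, sources=list(person.get("sources", [])))
        else:
            # Same AFN: the data is identical, but each import brings its own source.
            merged[key]["sources"].extend(person.get("sources", []))
    return list(merged.values())


def merge_duplicate_sources(person):
    """Drop sources that are exact repeats, keeping the first of each."""
    unique = []
    for source in person["sources"]:
        if source not in unique:
            unique.append(source)
    person["sources"] = unique
    return person


# The same individual downloaded in two overlapping GEDCOM files (made-up data).
imported = [
    {"rin": 1, "afn": "1ABC-2D", "name": "John McGregor", "sources": ["Ancestral File"]},
    {"rin": 57, "afn": "1ABC-2D", "name": "John McGregor", "sources": ["Ancestral File"]},
    {"rin": 2, "afn": "3XYZ-4F", "name": "Mary Stewart", "sources": ["Ancestral File"]},
]

people = merge_on_afn(imported)
john = next(p for p in people if p["afn"] == "1ABC-2D")
print(len(people), john["sources"])
merge_duplicate_sources(john)
print(john["sources"])
```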


Import a GEDCOM file - the basics


There are two things to know when you are importing a GEDCOM file into an empty PAF database.  (1) Where is the GEDCOM file stored on your computer (A-drive diskette, C-drive folder name)?  (2) What are you going to call the PAF database that will store the import (for example, McGregors from cousin Minni Oct 2000.paf)?  In Windows 95/98/ME/XP you can be a bit wordy with your file names, including spaces.  Giving fuller descriptions in pedigree file names will avoid confusion in the future.


If you look at a file listing from within My Computer or Windows Explorer and there is a PAF icon next to the file name, you can double-click on the icon and PAF will begin the import process.  This technique only works for a single GEDCOM file being imported into an empty PAF database.  PAF will tell you that it needs to create a new PAF database for the import; click OK.  In the Create New Family File window PAF will select the default file folder to store the new database and ask you for a file name.  After entering a file name, click on Save.  PAF then presents a screen for your name, address, and phone number.  Fill in at least the name and address information (otherwise PAF will ask for it again when you export records to be merged with your primary pedigree), then click OK.  Next PAF asks whether you want to import notes, include listing file data in notes, and reuse deleted records.  Make sure the first two are checked (the last doesn't matter at this point), and click OK.  PAF then performs the import and presents you with a GEDCOM Import Log report.  If the report lists any problems, print it and resolve the problems in this newly created database before you import the data into any other database and propagate the problems further.


Alternately, from within PAF, you can create a new empty database (File, New) and go through the same procedure as above to complete the file name, your name and address, etc.


If you are importing a GEDCOM file into an existing PAF database (one that already contains names), open the existing database first, then start the import process (File, Import).  On the Import GEDCOM file window, specify the folder where the GEDCOM file is stored, click once on the name of the GEDCOM file, and click Import.  On the next GEDCOM Import window note the Ending RIN number and copy it to a piece of paper.  This is the highest RIN number used within your PAF database before the GEDCOM file is imported.  Check the first two boxes to import notes and include listings in notes, but deselect (no check mark) the third box labeled Reuse deleted records.  We want all of the individuals imported to be assigned RIN numbers that are higher than the Ending RIN, instead of reusing the RINs from records that have previously been deleted from your pedigree.  When we match and merge, we can check that only records from the GEDCOM file (RINs above the Ending RIN) are being merged into your pedigree (RINs at or below the Ending RIN).  Click OK to begin the import.  PAF then presents you with a GEDCOM Import Log report.  If any problems are listed, print the report and resolve them before you match-merge and compound them.


The match-merge process


Once the GEDCOM file has been imported into an empty PAF database, your next objective is to discover which individuals are of interest to you (pertain to your pedigree), and whether any of these contain new or different information that you want to include in your pedigree.  Use the PAF display, search, and report features to evaluate the new information and decide if you want it.  If you are familiar with the imported information and it is small in volume, key it directly into your PAF database (your best work).


If, however, there are many names and you need to discover if any are names in your pedigree, consider one of the following options.


A) Print a full set of pedigree charts and family group sheets of the imported data with notes and sources.  Then compare information to your pedigree and make updates as appropriate.  Check off each name on the paper reports as you go to keep track of your progress.  This will allow you to break up the work into several sessions and know where you last left off.


B) Instead of printing reports on paper you can display two pedigrees side by side and compare them visually on the screen.  This technique works best if you have your screen resolution set to 800 or 1024 with a large monitor, otherwise the characters are too blurry to view properly.  Ask for technical directions if this technique proves difficult to read.


First open and display your pedigree (File, Open, click on the Name of your pedigree, click on Open).  Second, do it again for the GEDCOM file you imported into an empty PAF database.  With the second database open and displayed, select Window from the menu-bar at the top of the screen, and select Tile Vertically from the drop-down list.  This causes the two open databases to be displayed side by side on your screen.  Notice that if you click on the left pedigree display its title bar changes color indicating it is the active pedigree (likewise for the right).  The PAF menu-bar and task-bar items will operate on whichever pedigree is the active one.  You may use this dual display option to compare pedigrees and family groups side by side.  It works best when only limited comparisons between pedigrees are needed.  One or more printed reports may be helpful to keep track of the individuals that have been compared.  You may use the Windows copy-paste feature to copy a field of information from the GEDCOM pedigree and update the corresponding field in your pedigree.  It's a slow operation, but it works.


This next technique works in simple as well as complex matching situations.  It requires that all of the individuals in your pedigree be assigned unique numbers in the Ancestral File Number (AFN) field in the Edit Individual window.  If any individual does not have an AFN assigned, you may substitute some unique value in its place (for example RIN 156, assuming 156 is the RIN number PAF assigned to that individual).  Including the word RIN along with the number 156 will remind you that it is a temporary number and not a real Ancestral File Number.


Create a temporary copy of your pedigree (File, Save-as), and give the new pedigree a different name (McGregor temporary match.paf).  This pedigree now becomes the active one, and its file name will appear at the top of the PAF screen in the title-bar.  Import the GEDCOM file into the renamed PAF database as described above.  Do not perform a match-merge at this time.  Instead use the print-report feature of PAF to print a Duplicate Individuals report (File, Print-reports, Lists, Duplicate individuals, Print).  Alternately, you can Preview the report on the screen instead of printing it to paper.  The report identifies potential duplicate individuals on consecutive lines with their RIN, Name, Birth/Christening, Death/Burial, MRIN, and closest relative (Spouse, Parent, or Child).  Any pair of individuals that does not contain a RIN higher than the Ending RIN recorded at the time of import can be bypassed (it does not include an individual from the imported GEDCOM file).  Check all events, relationships, notes, and sources for each pair of individuals in the Duplicate Individuals report, printing family group sheets for easier comparison if necessary.  You are looking for any new or changed information that is of value.  Unfortunately, there is as yet no easier way to accomplish this comparison and find the differences between the two pedigrees.
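The Ending RIN test for the Duplicate Individuals report can be expressed as a short Python sketch.  The RIN pairs and the Ending RIN of 250 are made up, and the function name is hypothetical; in practice you apply this check by eye as you read the report.

```python
def pairs_worth_checking(duplicate_pairs, ending_rin):
    """Keep only pairs that involve at least one imported individual.

    A pair whose RINs are both at or below the Ending RIN lies entirely
    inside your original pedigree and can be bypassed.
    """
    return [(a, b) for a, b in duplicate_pairs if max(a, b) > ending_rin]


ENDING_RIN = 250  # written down at import time (made-up value)

# RIN pairs as they might appear on consecutive report lines (made-up data).
report_pairs = [(12, 40), (101, 260), (300, 310)]

print(pairs_worth_checking(report_pairs, ENDING_RIN))
```

The pair (12, 40) is dropped because both records were already in the pedigree before the import; the other two pairs each include at least one imported record and deserve a closer look.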


When you discover information in the imported data that you want to include in your pedigree, copy the Ancestral File Number of the individual in your pedigree to the corresponding individual in the imported GEDCOM data.  You may use the Windows copy-paste feature to assure an accurate copy with no keystroke errors.  The AFN will be used later during the match-merge process to assure a positive match of individuals, so that only those individuals with corresponding AFNs are merged together.  Also copy AFNs for spouses, children, and/or parents if they are to be match-merged into your pedigree.  For each of these individuals also type the word Export (or some other key word of your choice) in the Custom ID field just below the AFN field.  The word Export will allow you to easily identify these records, export them from the temporary PAF database, then import and match-merge them into your pedigree.


When you are finished matching individuals between the two pedigrees (using corresponding AFNs to indicate a positive match), start the export of the changed records.  Select File from the PAF menu-bar, and Export from the drop-down menu.  In the GEDCOM Export window click on Standard GEDCOM 5.5 (with Ansel) if it is not already selected, and Partial at the bottom of the left-hand column.  Check all boxes in the right-hand column.  Click on Select in the lower-right corner.


In the lower left corner of the Find Individual window click on Clear so that the number of selected individuals is zero.  Next click on Define in the Field Filter section (bottom center).  Scroll down the left-hand list and click on Custom ID.  Click the right arrow between the two columns.  In the Custom ID Field Filter window make sure the upper field reads Matches and type the word Export in the Text field, click OK, and OK again.  PAF now displays the Find Individual window and has placed a mark next to the names of all selected individuals.  The total number of individuals selected is also displayed in the lower right corner.  If you want to browse through the selected individuals you may place a check in the Show results only box (lower right).  When you are finished browsing, click OK (lower left) and Export (lower left).  Give a name to the export file (McGregor GEDCOM with AFNs assigned.ged), click Export.  PAF completes the export and reports the number of records in the export file.


Open your pedigree (your best work), the one into which the selected GEDCOM records will be imported.  If you do not have a backup of this pedigree place a diskette in the A-drive and make one at this time (File, Backup, click on Backup, then OK).  Making a backup is a precaution that allows you to restore your pedigree (un-merged) in the event that the final match-merge does not go as planned.


Import the newly created export file from above, making sure not to reuse any deleted RIN numbers (see Import a GEDCOM file - the basics above for instructions on importing a GEDCOM file into an existing PAF database).  Note on paper the Ending RIN (described in the same section) that separates the individuals of your pedigree from the imported GEDCOM individuals.


Begin the match-merge process.  Select Tools from the PAF menu-bar at the top of the screen.  Select Match/Merge from the drop-down menu.  Answer No to making a backup.  In the Merge Individuals window click the Options button.  Make sure all option boxes are checked.  This will assure that all possible matching individuals are found and that all possible pieces of information are merged.  The last box, Confirm when Merge button is pressed, may be left unchecked.  Click OK.  Each pair of individuals to be match-merged will go through the following process.


Click on Next Match (bottom) to display a pair of individuals.  The individual with the lowest RIN is displayed on the left.  Visually verify that both individuals have the same Ancestral File Number.  If they do not, click Next Match to go on to the next pair of individuals.  If the AFNs are the same, look over each field of information.  When you press the Merge button (lower left) all values on the left side will be included in the combined record unless there is a check on the right side indicating that that value should be included instead of the value on the left.  You may Edit either the left side or right side record before the Merge if needed to get the value exactly the way you want it.  When you are ready, click Merge to combine the records and then click Next Match to display a new pair of individuals.  Notes of both individuals will be combined and all sources from each of the individuals will be added to the combined individual.
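The merge rules just described (left-side values win unless a right-side value is checked, notes are combined, and sources from both records are kept) can be summarized in a Python sketch.  The records, field names, and the function merge_pair are hypothetical; this models the behavior described above rather than PAF's actual code.

```python
def merge_pair(left, right, use_right=()):
    """Combine two records following the Merge button rules described above.

    left      -- record with the lower RIN (your pedigree)
    right     -- record from the imported GEDCOM file
    use_right -- fields "checked" on the right side, taken instead of the left
    """
    if left["afn"] != right["afn"]:
        # The article's visual check: different AFNs means click Next Match.
        raise ValueError("AFNs differ; skip this pair")
    merged = dict(left)                  # left-side values are the default
    for field in use_right:
        merged[field] = right[field]     # checked right-side values replace them
    # Notes are combined and every source from both records is kept.
    merged["notes"] = list(left.get("notes", [])) + list(right.get("notes", []))
    merged["sources"] = list(left.get("sources", [])) + list(right.get("sources", []))
    return merged


# Made-up records sharing the temporary AFN "RIN 156".
mine = {"rin": 156, "afn": "RIN 156", "name": "John McGregor",
        "birth_date": "1820", "notes": ["Family Bible entry"],
        "sources": ["1851 census"]}
theirs = {"rin": 412, "afn": "RIN 156", "name": "John MacGregor",
          "birth_date": "12 Mar 1820", "notes": ["From cousin Minni"],
          "sources": ["GEDCOM from cousin Minni"]}

combined = merge_pair(mine, theirs, use_right=["birth_date"])
print(combined["name"], combined["birth_date"], len(combined["sources"]))
```

Here the fuller birth date from the imported record is taken, while the name spelling from your own pedigree is kept.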


The match-merge process may be interrupted (Close) at any time and resumed at a later time.  However, any individuals that have been merged cannot be un-merged.  There is no un-merge function.  If you have made a mistake you must start again from your pedigree backup, import the GEDCOM file, and start the match-merge anew.  When you have completed the match-merge sequence and PAF indicates that there are no more records to match, close the match-merge window and start the process again.  Not all matching individuals are found in the first pass.  It may take 3 to 5 passes or more to get them all.  You can do a custom report including AFN and RIN (sorted by AFN) to assure that all intended duplicates have been merged.
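The check suggested above, a report sorted by AFN showing which RINs still share an AFN, amounts to the following Python sketch.  The data and the function name unmerged_afns are made up for illustration; in PAF you would read this off the custom report.

```python
from collections import defaultdict


def unmerged_afns(individuals):
    """Return {afn: [rins]} for every AFN still held by more than one record."""
    by_afn = defaultdict(list)
    for person in individuals:
        if person.get("afn"):
            by_afn[person["afn"]].append(person["rin"])
    # Sorted by AFN, like the custom report; only AFNs with leftovers are listed.
    return {afn: sorted(rins)
            for afn, rins in sorted(by_afn.items())
            if len(rins) > 1}


# State of a pedigree after one match-merge pass (made-up data).
after_first_pass = [
    {"rin": 5, "afn": "RIN 5"},
    {"rin": 300, "afn": "RIN 5"},   # an intended duplicate not yet merged
    {"rin": 6, "afn": "RIN 6"},
    {"rin": 7, "afn": None},
]

print(unmerged_afns(after_first_pass))
```

An empty result means every intended duplicate has been merged; anything listed calls for another pass.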


When you are finished with the above, combine any duplicate sources. To merge duplicate sources select Tools in the menu-bar at the top of the PAF screen, and click on Merge duplicate sources and citations from the drop-down menu.  In one automatic operation all duplicate sources and citations with identical information are merged together.  You can do a report on sources (File, Print reports, Lists, Sources) to assure that duplicate sources have been merged.  If minor variations need to be corrected to allow sources or repositories to merge, use the Edit, Source List or Edit, Repository List to make
