The GEDCOM parser this web site uses was written 100% from scratch. I used a GEDCOM export from Legacy (http://www.legacyfamilytree.com) and the GEDCOM 5.5 standard (http://homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gcch1.htm) as the basis for the GEDCOM structure, and the GEDCOM test files from GEDITCOM (http://www.geditcom.com/gedcom.html) to test the full reading capabilities. It should be able to cope with any GEDCOM file that's thrown at it.
Before writing this I found several different parsers that were designed to work with SQL databases, but since this website has no access to an SQL database and I wanted more flexibility, these were of no use. I came close with this PEAR package (http://pear.php.net/package/Genealogy_Gedcom) but, alas, it failed to work with the GEDCOM file I was working with and failed miserably with the torture test files (see above). The PEAR package was nice, but it lacked several major features that I needed.
There are others, but these were enough to show that an existing parser wasn't going to work for me, so I wrote my own.
It's not as elegant as the PEAR package, and probably a bit more 'clunky', but it provides far more flexibility and it's far easier to adapt to your needs.
I've tried to comment the more confusing parts of the code and have started using phpDoc comments as well, but writing a full manual is not something I'm good at ;}
Still, in its "default" state the parser should do for just about all cases.
This parser has gone through several re-writes in the time it's been used and so far this is the best one yet, in my not so humble opinion.
The first versions read the entire GEDCOM into memory. This was fine if you only had a few hundred individuals in your GEDCOM, which is all it needed to cope with then,
but it wasn't long before that expanded to a good thousand, and it is currently hovering around 1490. This really strained my parser: it took about 5 seconds to parse the file
and ate just over 19 MB of memory. Not good. This re-write aims to remedy most of that.
The new method is 'parse on demand', if you will. When the GEDCOM parser is created it merely indexes the records in the GEDCOM file and then, when a record is asked for,
it uses those indexes to read only the required parts of the file. Brilliant. This reduces memory usage to just under 2 MB and also cuts the time it takes to read the file,
since it only reads what it has to.
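To give a rough idea of what 'parse on demand' means, here is a minimal sketch in Python. It is not the parser's actual PHP code: the function names build_index and read_record and the sample data are made up for illustration, and it assumes that every record starts with a level-0 line such as "0 @I1@ INDI". An in-memory stream stands in for a GEDCOM file opened in binary mode.

```python
import io


def build_index(stream):
    """Map each record's xref id (e.g. '@I1@') to its byte offset in the stream."""
    index = {}
    while True:
        offset = stream.tell()          # position of the line about to be read
        line = stream.readline()
        if not line:                    # end of file
            break
        parts = line.split()
        # A record header looks like: 0 @I1@ INDI
        if len(parts) >= 3 and parts[0] == b"0" and parts[1].startswith(b"@"):
            index[parts[1].decode("ascii")] = offset
    return index


def read_record(stream, index, xref):
    """Seek to one record and read its lines, stopping at the next level-0 line."""
    stream.seek(index[xref])
    lines = [stream.readline().decode("utf-8").rstrip("\r\n")]
    while True:
        line = stream.readline()
        if not line or line.split(None, 1)[:1] == [b"0"]:
            break                       # end of file or start of the next record
        lines.append(line.decode("utf-8").rstrip("\r\n"))
    return lines


# Hypothetical sample data standing in for a real GEDCOM file.
SAMPLE = (
    b"0 HEAD\n"
    b"1 CHAR UTF-8\n"
    b"0 @I1@ INDI\n"
    b"1 NAME John /Smith/\n"
    b"0 @I2@ INDI\n"
    b"1 NAME Jane /Doe/\n"
    b"1 SEX F\n"
    b"0 TRLR\n"
)

gedcom = io.BytesIO(SAMPLE)
index = build_index(gedcom)                    # one cheap pass, stores offsets only
record = read_record(gedcom, index, "@I2@")    # reads just the lines of one record
```

Only the index stays in memory between requests, which is why this approach stays small even when the file grows.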
However, when testing I did find one small problem. I wrote the reader so that it caches the items that are read. This is fine until you start making name lists, where you end up reading most of the records in the GEDCOM and the memory usage starts going through the roof again. To combat this, the functions that retrieve the records from the GEDCOM provide the option to not cache the record, bringing the memory back to a more down-to-earth usage level. I would suggest you use this option more often than not.
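The caching trade-off can be sketched like this, again in Python and again not the parser's real API: the RecordStore class, its get method, the cache flag and the sample records are all hypothetical names chosen for the example. The idea is simply that a bulk read, such as building a name list, passes cache=False so the records are thrown away once used.

```python
class RecordStore:
    """Fetches records through a loader and optionally keeps them in a cache."""

    def __init__(self, loader):
        self._loader = loader   # function that reads one record from the file
        self._cache = {}

    def get(self, xref, cache=True):
        """Return a record; store it in the cache only when cache is True."""
        if xref in self._cache:
            return self._cache[xref]
        record = self._loader(xref)
        if cache:
            self._cache[xref] = record
        return record

    def cached_count(self):
        """Number of records currently held in memory."""
        return len(self._cache)


# Hypothetical stand-in for reading records from a GEDCOM file.
RECORDS = {
    "@I1@": {"name": "John /Smith/"},
    "@I2@": {"name": "Jane /Doe/"},
    "@I3@": {"name": "Mary /Jones/"},
}


def load_record(xref):
    return RECORDS[xref]


store = RecordStore(load_record)

# Building a name list touches every record, so skip the cache here.
names = [store.get(xref, cache=False)["name"] for xref in sorted(RECORDS)]

# A single record that will be used repeatedly can still be cached.
person = store.get("@I1@")
```

After the name list is built the cache is still empty, and only the one record fetched with the default setting is kept.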
The code for this parser is released under the LGPL license (v3.0).
For full details of the LGPL license see:
plain text gpl-3.0.txt (COPYING.txt) and
plain text lgpl-3.0.txt (COPYING.LESSER.txt)
Start viewing the code here - Genealogy_Gedcom.html
Download a zip file of all required files here - Gedcom_Parser.zip
THE CODE THAT CONVERTS THE GEDCOM TO HTML IS NOT PART OF THE PARSER
~~~Supported features:~~~
~~~Features that are not yet implemented~~~