perl.gedcom
Re: Announce: DateTime::Format::Gedcom V 1.00
From: Mike Elston
September 15, 2011 10:41
Message ID: F5188B96-2141-4CE8-9851-C554781DCACC@one-name.org
I have now had a chance for a first look at DateTime::Format::Gedcom.
I applaud the intention, and we all owe a great debt to Ron for his
work on Perl Gedcom.
But in my humble opinion, this class shows a complete
misunderstanding both of how dates are often presented in GEDCOM
files (and especially in files that claim to be GEDCOM but may not be
strictly so), and of the idea of a GEDCOM date.
(1) You cannot separate the <DATE_CALENDAR_ESCAPE> from the
<DATE_CALENDAR> when parsing a GEDCOM file: the
<DATE_CALENDAR_ESCAPE> specifies which calendar the date is from
(nothing, as I've said before, to do with languages), and according
to the GEDCOM specification it defaults to @#DGREGORIAN@.
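To illustrate point (1), here is a minimal Python sketch of keeping
the calendar escape attached to the date while parsing. It is not how
DateTime::Format::Gedcom works; the function name and the regular
expression are assumptions made for illustration only.

```python
import re

# Matches a leading escape such as @#DJULIAN@ or @#DFRENCH R@.
ESCAPE_RE = re.compile(r"^\s*@#D([A-Z ]+)@\s*(.*)$")

# The GEDCOM default when no escape is present.
DEFAULT_CALENDAR = "GREGORIAN"


def split_calendar_escape(value):
    """Return (calendar, rest_of_date) for the value of a DATE line."""
    match = ESCAPE_RE.match(value)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return DEFAULT_CALENDAR, value.strip()


if __name__ == "__main__":
    for text in ("20 FEB 1675", "@#DJULIAN@ 20 FEB 1675", "@#DFRENCH@14 GERM 3"):
        print(split_calendar_escape(text))
```

The calendar name travels with the date text, so later code never
has to guess which calendar a bare "20 FEB 1675" came from.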
(2) Although the GEDCOM specification pays lip-service to it, it has
no consistent way of differentiating between dates specified
according to the 'old-style' calendar (in which the year started on
Lady Day, March 25th) in use in countries such as England before they
adopted the Gregorian calendar, and dates specified according to the
'new-style' calendar where the year starts on January 1st.
For example, when reading a GEDCOM file presented, say, by the LDS's
own website familysearch.com, one has to remember that dates such as
christenings from English parish registers up to the middle of the
18th century, for example, are invariably written in the register as
(Julian) dates in old style, not (Gregorian) dates in the modern
style. The LDS's own transcribers, as far as my research has
indicated, generally (but not always) copied the dates as they were
entered, but many dates contributed to the IGI have been "converted"
to the new-style calendar. The IGI never uses (at least, I've never
seen it) the @#DJULIAN@ calendar escape, nor does it use the
'old-style/new-style' format for specifying dates. For example, it is
quite possible for an entry on the IGI from a transcribed parish
register to state that someone was christened on 20 FEB 1675, having
been born on 23 DEC 1675, and there is no inconsistency, since in the
old calendar in use at the time 23 DEC preceded 20 FEB in the year
1675. The year began on 25 MAR 1675 and ended on 24 MAR 1675, which
is the day before 25 MAR 1676: in England (and most of its
colonies), the new year began on March 25th until the year 1752, which
began on 1 JAN 1752. Yet someone contributing a record to the
Ancestral File or to the IGI may have converted the date to the
new-style calendar, and reported the christening as 20 FEB 1676.
Note that for this reason, 20 FEB 1675 will often be written by
genealogists as 20 FEB 1675/76, a format GEDCOM recognises, implying
that it means 20 FEB 1675 by the old calendar (when the year started
on 25 MAR), which we would now think of as 20 FEB 1676, because we
would treat the year 1676 as starting on 1 JAN (ie the day after 31
DEC 1675).
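The old-style/new-style year arithmetic can be sketched in a few
lines of Python. The function names are hypothetical, and the sketch
assumes the English convention of a year starting on 25 MAR; it says
nothing about the separate Julian/Gregorian day shift.

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def new_style_year(day, month, old_style_year):
    """Year number when the year is taken to start on 1 JAN, not 25 MAR."""
    month_number = MONTHS.index(month) + 1
    # 1 JAN to 24 MAR belong to the END of the old-style year.
    if (month_number, day) < (3, 25):
        return old_style_year + 1
    return old_style_year


def dual_year(day, month, old_style_year):
    """Format an old-style date, adding the /yy suffix where the years differ."""
    ns = new_style_year(day, month, old_style_year)
    if ns == old_style_year:
        return f"{day} {month} {old_style_year}"
    return f"{day} {month} {old_style_year}/{ns % 100:02d}"


def sort_key(day, month, old_style_year):
    """Chronological ordering key for old-style dates."""
    return (new_style_year(day, month, old_style_year),
            MONTHS.index(month) + 1, day)


if __name__ == "__main__":
    print(dual_year(20, "FEB", 1675))   # 20 FEB 1675/76
    print(dual_year(23, "DEC", 1675))   # 23 DEC 1675
    # The birth on 23 DEC 1675 sorts before the christening on 20 FEB 1675.
    print(sort_key(23, "DEC", 1675) < sort_key(20, "FEB", 1675))  # True
```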
In other words, GEDCOM files produced by familysearch.com do not
strictly obey the GEDCOM date rules, which are themselves incomplete.
Confused already? Try this...
Another case: a date from a Scottish parish register for 1695 would
be entered according to the Gregorian calendar (which was in use in
Scotland by then, but not in England), so (at that time) 23 JUL 1695
in Scotland was not the same day as 23 JUL 1695 in England (which, if
my arithmetic is correct, was 2 AUG 1695 on the Gregorian calendar,
the difference then being 10 days). By 1752, when England changed to
the Gregorian calendar, a difference of 11 days had accumulated,
which is why 2 SEP 1752 was followed by 14 SEP 1752 in England to
effect the correction, which gave rise to the cry "give us back our
11 days" by many of the masses who believed the government of the day
had shortened their lives by 11 days.
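For checking conversions like the ones in this paragraph, here is a
rough Python sketch using the century rule for the Julian/Gregorian
gap (10 days in the 1600s, 11 days from March 1700). The function
names are made up for illustration. It assumes the year is already in
new style (starting 1 JAN), and it does not handle a Julian 29 FEB in
a year such as 1700, which has no Gregorian counterpart of that name.

```python
from datetime import date, timedelta


def julian_gregorian_offset(year, month):
    """Days to add to a Julian date to obtain the Gregorian date."""
    # January and February still carry the previous year's offset.
    y = year - 1 if month <= 2 else year
    return y // 100 - y // 400 - 2


def julian_to_gregorian(year, month, day):
    """Convert a (new-style year) Julian date to a Gregorian date."""
    # Python's date is proleptic Gregorian, so build the same-named
    # date and shift it by the offset.
    shift = julian_gregorian_offset(year, month)
    return date(year, month, day) + timedelta(days=shift)


if __name__ == "__main__":
    print(julian_to_gregorian(1695, 7, 23))  # 1695-08-02
    print(julian_to_gregorian(1752, 9, 3))   # 1752-09-14
```

The second call shows why the day after 2 SEP 1752 was 14 SEP 1752
in England: what would have been Julian 3 SEP is Gregorian 14 SEP.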
In England, 1752 was also the first year which officially began on 1
JAN; thus 1751 was only 282 days long, lasting from 25 MAR 1751 to 31
DEC 1751. It is a matter of debate (and sometimes impossible to
decide) which calendar is being used on contemporary documents about
that time, as some sources started using the 'new' calendar year (1
Jan-31 Dec) before 1752, and some not until later.
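The 282-day figure is easy to check. This small Python sketch counts
the days inclusively; it uses Python's proleptic Gregorian date type
purely as a day counter, which is safe here because the months from
March to December have the same lengths in both calendars.

```python
from datetime import date


def days_inclusive(first, last):
    """Number of days from first to last, counting both end days."""
    return (last - first).days + 1


if __name__ == "__main__":
    print(days_inclusive(date(1751, 3, 25), date(1751, 12, 31)))  # 282
```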
However, strictly speaking, the 'old-style/new-style' calendar
dichotomy is separate from the 'Julian/Gregorian' calendar
distinction; the changes from one to t'other simply happen to have
been effected in England in the same year.
(To confuse matters further, the year in Saxon and Norman times began
on 25 December rather than 25 March or 1 January!)
If you're researching old French or Jewish documents, you may have a
much greater problem.
Suppose you know that someone was born on 14e Germinal in the year 3
by the French Republican calendar (ie "@#DFRENCH@14 GERM 3"), and
that they died on 17e Avril 1846 ("17 APR 1846"). How would you
calculate how old they were when they died? Surely this is something
DateTime::Format::Gedcom should be capable of doing for us?
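As a rough illustration of what such a calculation involves, here is
a Python sketch for the French Republican example. The names are
hypothetical. It assumes the historical calendar as actually used
(years 1 to 14, with years 3, 7 and 11 as the leap years), an epoch
of 22 SEP 1792 (Gregorian), and the GEDCOM month codes.

```python
from datetime import date, timedelta

# GEDCOM month codes; COMP holds the 5 or 6 complementary days.
FRENCH_MONTHS = ["VEND", "BRUM", "FRIM", "NIVO", "PLUV", "VENT", "GERM",
                 "FLOR", "PRAI", "MESS", "THER", "FRUC", "COMP"]
SEXTILE_YEARS = {3, 7, 11}           # leap years while the calendar was in use
FRENCH_EPOCH = date(1792, 9, 22)     # 1 VEND 1


def french_to_gregorian(day, month_code, year):
    """Convert a French Republican date to a Gregorian date."""
    if not 1 <= year <= 14:
        raise ValueError("sketch only covers years 1 to 14")
    days_before_year = sum(366 if y in SEXTILE_YEARS else 365
                           for y in range(1, year))
    month_index = FRENCH_MONTHS.index(month_code)   # every month has 30 days
    offset = days_before_year + 30 * month_index + (day - 1)
    return FRENCH_EPOCH + timedelta(days=offset)


def age_in_years(born, on):
    """Completed years between two dates in the same calendar."""
    return on.year - born.year - ((on.month, on.day) < (born.month, born.day))


if __name__ == "__main__":
    born = french_to_gregorian(14, "GERM", 3)
    died = date(1846, 4, 17)
    print(born)                      # 1795-04-03
    print(age_in_years(born, died))  # 51
```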
Or suppose a Jewish register states that a child was born on Kislev
23 5445 ("@#DHEBREW@23 KSL 5445"). How old were they when their
family emigrated to another country on 14th April 1734 (according to
the local records which were probably using the Julian calendar)? You
need to know what is the Julian date corresponding to Rosh-Hashana
(the first day of the Jewish year, Tishri 1) in the year 5445. And if
you want to do exact date calculations, Kislev is the 3rd month in
the Jewish calendar, but how many days were there in the preceding
month (Cheshvan) in that year? 29 or 30? (it's variable from one year
to the next.)
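Both questions (when the year began, and how long Cheshvan was) can
be answered arithmetically. The Python sketch below follows the
widely published arithmetic rules for the Hebrew calendar (mean
lunation plus postponements); the function names and the epoch
constant expressed as a Python date ordinal are assumptions of this
sketch. The result is a proleptic Gregorian date, so a further
conversion is still needed to get the Julian date.

```python
from datetime import date, timedelta

# Assumed epoch: Python date ordinal corresponding to the start of
# the Hebrew day count used by the arithmetic below.
HEBREW_EPOCH = -1373427


def elapsed_days(year):
    """Days from the epoch to the (uncorrected) new year of a Hebrew year."""
    months = (235 * year - 234) // 19
    parts = 12084 + 13753 * months
    days = 29 * months + parts // 25920
    if (3 * (days + 1)) % 7 < 3:     # new year may not fall on Sun, Wed, Fri
        days += 1
    return days


def new_year_delay(year):
    """Extra delay that keeps year lengths within the permitted values."""
    ny0, ny1, ny2 = (elapsed_days(y) for y in (year - 1, year, year + 1))
    if ny2 - ny1 == 356:
        return 2
    if ny1 - ny0 == 382:
        return 1
    return 0


def rosh_hashanah(year):
    """Proleptic Gregorian date of 1 Tishri in the given Hebrew year."""
    return date.fromordinal(
        HEBREW_EPOCH + elapsed_days(year) + new_year_delay(year))


def year_length(year):
    return (rosh_hashanah(year + 1) - rosh_hashanah(year)).days


def cheshvan_days(year):
    """Cheshvan has 30 days only in 'complete' years (355 or 385 days)."""
    return 30 if year_length(year) % 10 == 5 else 29


def kislev_date(day, year):
    """Proleptic Gregorian date of a day in Kislev (Tishri has 30 days)."""
    return rosh_hashanah(year) + timedelta(
        days=30 + cheshvan_days(year) + day - 1)


if __name__ == "__main__":
    print(rosh_hashanah(5445), cheshvan_days(5445), kislev_date(23, 5445))
```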
OK, maybe I'm being pedantic. The most important thing for most
researchers is to get the Gregorian calendar right, and the next most
important thing is to recognise the differences between Julian dates
and Gregorian dates, and between the old-style and new-style Julian/
Gregorian calendars.
A proper GEDCOM date class ought to be able to represent a date
specified according to the Julian or the Gregorian calendar (or the
French Republican or Hebrew calendars) as a standard internal date
(say, the number of days since some arbitrary epoch), and present it
according to any requested calendar or style.
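One possible shape for such a class, sketched in Python: store a
Julian Day Number internally and convert on the way in and out. The
class name GenDateSketch and its methods are invented for this
example and are not taken from any existing module. Only the Julian
and Gregorian calendars are covered, using the standard integer
formulas, and years are assumed to be new style.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class GenDateSketch:
    jdn: int  # Julian Day Number: the calendar-neutral internal form

    @staticmethod
    def _shifted(year, month):
        a = (14 - month) // 12          # 1 for JAN and FEB, else 0
        return year + 4800 - a, month + 12 * a - 3

    @classmethod
    def from_gregorian(cls, year, month, day):
        y, m = cls._shifted(year, month)
        return cls(day + (153 * m + 2) // 5 + 365 * y
                   + y // 4 - y // 100 + y // 400 - 32045)

    @classmethod
    def from_julian(cls, year, month, day):
        y, m = cls._shifted(year, month)
        return cls(day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083)

    @staticmethod
    def _unpack(b, c):
        d = (4 * c + 3) // 1461
        e = c - (1461 * d) // 4
        m = (5 * e + 2) // 153
        day = e - (153 * m + 2) // 5 + 1
        month = m + 3 - 12 * (m // 10)
        year = 100 * b + d - 4800 + m // 10
        return year, month, day

    def as_gregorian(self):
        a = self.jdn + 32044
        b = (4 * a + 3) // 146097
        return self._unpack(b, a - (146097 * b) // 4)

    def as_julian(self):
        return self._unpack(0, self.jdn + 32082)


if __name__ == "__main__":
    d = GenDateSketch.from_julian(1695, 7, 23)
    print(d.as_gregorian())   # (1695, 8, 2)
    print(d.as_julian())      # (1695, 7, 23)
```

Because the stored value is a plain day count, subtracting two such
objects' jdn fields gives an interval in days regardless of which
calendar each date was originally written in.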
I think that's enough for a start :-) I will take up the issues of
approximate dates, of date periods and ranges, of non-exact dates, of
month names in different languages, and the fundamental job of
parsing DATE lines in GEDCOM-style files, in subsequent emails. (Some
of these issues have already been raised by other users...)
(Note: much of the above was the result of my research when I was
writing an Objective-C date class GenDate for genealogical dates for
use in my own GEDCOM-compatible application).