Re: [Hampshire] Database design for an address book

Author: Jacqui Caren
Date:  
To: hampshire
Subject: Re: [Hampshire] Database design for an address book
On 20/12/2011 12:56, James Courtier-Dutton wrote:
> On 20 December 2011 11:15, Jacqui Caren<jacqui.caren@???> wrote:
>>
>> I don't have any free time right now but I will see if $boss is OK with
>> publishing the address dedupe code that was so useful in this app.
>>
>
> I bet that dedupe code turned out to be not quite as simple as you thought.


Actually it was reasonably simple. I did not bother normalising address details
but left that to more complex batch jobs.

The code kept ref counts and deleted a record when its ref count dropped to zero.

The find_address function would take address details, check whether an existing
entry matched, and return its id. If no entry matched, it would create one and
return the new id.
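
A minimal sketch of that find-or-create step with ref counting, using Python and
SQLite purely for illustration. This is not the original code: the table layout,
the column names, the release_address helper and the choice to bump the ref count
inside find_address are all assumptions. Matching is exact, with no normalisation.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical address table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE address ("
        " id INTEGER PRIMARY KEY,"
        " line1 TEXT NOT NULL,"
        " city TEXT NOT NULL,"
        " postcode TEXT NOT NULL,"
        " country TEXT NOT NULL,"
        " ref_count INTEGER NOT NULL DEFAULT 0,"
        " UNIQUE (line1, city, postcode, country))"
    )
    return db


def find_address(db, line1, city, postcode, country):
    """Return the id of an exactly matching address, creating it if needed.

    Assumption: every call counts as one new reference to the row.
    """
    key = (line1, city, postcode, country)
    row = db.execute(
        "SELECT id FROM address"
        " WHERE line1 = ? AND city = ? AND postcode = ? AND country = ?",
        key,
    ).fetchone()
    if row is None:
        cur = db.execute(
            "INSERT INTO address (line1, city, postcode, country)"
            " VALUES (?, ?, ?, ?)",
            key,
        )
        addr_id = cur.lastrowid
    else:
        addr_id = row[0]
    db.execute(
        "UPDATE address SET ref_count = ref_count + 1 WHERE id = ?", (addr_id,)
    )
    return addr_id


def release_address(db, addr_id):
    """Drop one reference and delete the row once nothing points at it."""
    db.execute(
        "UPDATE address SET ref_count = ref_count - 1 WHERE id = ?", (addr_id,)
    )
    db.execute(
        "DELETE FROM address WHERE id = ? AND ref_count <= 0", (addr_id,)
    )
```

Calling find_address twice with the same details returns the same id and leaves
ref_count at 2; two matching release_address calls then remove the row.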

The batch code would clean up address entries using country specific normalisers
and then check whether that had produced any dupes. For each pair of dupes it
would merge the two entries into one by doing an
"update ... set addr_id = ? where addr_id = ?".
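
The batch merge might look roughly like the sketch below, again in Python with
SQLite. It is an illustration under assumptions, not the original code: the
contact table, the single text column, and the trivial normalise function (which
only collapses whitespace and upper-cases, standing in for the real country
specific normalisers) are all hypothetical. Carrying the ref count over to the
surviving entry is also my guess at how the counts stayed consistent.

```python
import sqlite3


def normalise(text):
    """Stand-in normaliser: collapse whitespace and upper-case."""
    return " ".join(text.split()).upper()


def make_demo_db():
    """Build a hypothetical schema with two addresses that are really one."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE address (id INTEGER PRIMARY KEY, text TEXT NOT NULL,"
        " ref_count INTEGER NOT NULL DEFAULT 0)"
    )
    db.execute(
        "CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT NOT NULL,"
        " addr_id INTEGER NOT NULL REFERENCES address(id))"
    )
    db.executemany(
        "INSERT INTO address (id, text, ref_count) VALUES (?, ?, ?)",
        [
            (1, "1 High St, Winchester", 1),
            (2, "1  high st,  winchester", 1),
            (3, "2 Low Rd, Andover", 1),
        ],
    )
    db.executemany(
        "INSERT INTO contact (id, name, addr_id) VALUES (?, ?, ?)",
        [(1, "A", 1), (2, "B", 2), (3, "C", 3)],
    )
    return db


def merge_duplicates(db):
    """Normalise every address, then fold each dupe into the lowest id.

    Returns the number of address rows that were merged away.
    """
    seen = {}  # normalised text -> id of the entry we keep
    merged = 0
    rows = db.execute(
        "SELECT id, text, ref_count FROM address ORDER BY id"
    ).fetchall()
    for addr_id, text, ref_count in rows:
        norm = normalise(text)
        keep_id = seen.get(norm)
        if keep_id is None:
            # First time we see this address: keep it, store the clean form.
            seen[norm] = addr_id
            db.execute(
                "UPDATE address SET text = ? WHERE id = ?", (norm, addr_id)
            )
            continue
        # Repoint everything that referenced the dupe at the kept entry.
        db.execute(
            "UPDATE contact SET addr_id = ? WHERE addr_id = ?",
            (keep_id, addr_id),
        )
        # Carry the dupe's references over, then drop the dupe.
        db.execute(
            "UPDATE address SET ref_count = ref_count + ? WHERE id = ?",
            (ref_count, keep_id),
        )
        db.execute("DELETE FROM address WHERE id = ?", (addr_id,))
        merged += 1
    db.commit()
    return merged
```

With the demo data, addresses 1 and 2 normalise to the same text, so contact B is
repointed at address 1 and address 2 is deleted.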

As I said, noddy stuff, apart from the address lexers and validators, which were
and still are a real PITB.

Jacqui
