Welcome to SPN

Register and Join the most happening forum of Sikh community & intellectuals from around the world.

Sign Up Now!

Multiple Spellings

Discussion in 'Information Technology' started by Rich, Jul 28, 2006.

  1. Rich

    Rich
    Expand Collapse
    Guest

    Hello Group,

    can I design a query that will give me the first instance of an address? I
    have acontact database with multiple instances of the same person. I need to
    come up with one address per person. I cannot take the lates date or any
    other field to some how make the record different. The problem lays in that
    people tend to spell words like "Park" differntly.

    Here is the data:

    Stacie Plant 1525 E Hyde Pk Blvd
    Stacie Plant 1525 E Hyde Park Blvd
    Stacie Plant 1525 East Hyde Pk Bldg 3rd Fl

    Stacie Plant has three records in the database. I want the last one but I
    have no way to isolate that record. Technically, all three records are
    different but our minds can see this really is hte same person, they just
    filled out their contact record three times.

    Rich
     
  2. Loading...


  3. Ken Sheridan

    Ken Sheridan
    Expand Collapse
    Guest

    Rich:

    In the example you give the only objective criterion which distinguishes the
    last address is its length. If this criterion distinguishes all 'duplicated'
    addresses in the table then you might be able to do it by extracting all
    those with the same FirstName and LastName values but the longest address. I
    suspect this is not going to be very reliable, but the query would go like
    this:

    SELECT DISTINCT *
    FROM Contacts As C1
    WHERE LEN(Address) =
    (SELECT MAX(LEN(Address))
    FROM Contacts AS C2
    WHERE C2.LastName = C1.LastName
    AND C2.FirstName = C1.FirstName);

    This relies on the contact's first and last names being identical in each
    row of course, so if there any typos in these they would be treated as
    different people (it does happen; I once worked on a database in which I
    found three versions of me as the author of technical articles). Also if two
    addresses were entered differently but were of the same length then each
    would be returned.

    How big is the table? Would it be feasible to add a Boolean (Yes/No)
    KeepMe field to the table and just go through it in a query sorted by
    LastName,FirstName and check each one you want to keep. You can then delete
    the redundant ones with:

    DELETE *
    FROM Contacts
    WHERE NOT KeepMe;

    Having backed up the table first of course!

    Ken Sheridan
    Stafford, England

    "Rich" wrote:

    > Hello Group,
    >
    > can I design a query that will give me the first instance of an address? I
    > have acontact database with multiple instances of the same person. I need to
    > come up with one address per person. I cannot take the lates date or any
    > other field to some how make the record different. The problem lays in that
    > people tend to spell words like "Park" differntly.
    >
    > Here is the data:
    >
    > Stacie Plant 1525 E Hyde Pk Blvd
    > Stacie Plant 1525 E Hyde Park Blvd
    > Stacie Plant 1525 East Hyde Pk Bldg 3rd Fl
    >
    > Stacie Plant has three records in the database. I want the last one but I
    > have no way to isolate that record. Technically, all three records are
    > different but our minds can see this really is hte same person, they just
    > filled out their contact record three times.
    >
    > Rich
     
  4. Rich

    Rich
    Expand Collapse
    Guest

    Hello Ken,
    you are right, the only parameter to work with is physical length of data.
    The assuption that the longest would be the best seems good. Most of the
    first names are the same however I have a few that say "Bob Smith" and
    "Robert Smith". They end up being the same person! Ouch.

    I am tryin gyour suggested query now.

    Rich

    "Ken Sheridan" wrote:

    > Rich:
    >
    > In the example you give the only objective criterion which distinguishes the
    > last address is its length. If this criterion distinguishes all 'duplicated'
    > addresses in the table then you might be able to do it by extracting all
    > those with the same FirstName and LastName values but the longest address. I
    > suspect this is not going to be very reliable, but the query would go like
    > this:
    >
    > SELECT DISTINCT *
    > FROM Contacts As C1
    > WHERE LEN(Address) =
    > (SELECT MAX(LEN(Address))
    > FROM Contacts AS C2
    > WHERE C2.LastName = C1.LastName
    > AND C2.FirstName = C1.FirstName);
    >
    > This relies on the contact's first and last names being identical in each
    > row of course, so if there any typos in these they would be treated as
    > different people (it does happen; I once worked on a database in which I
    > found three versions of me as the author of technical articles). Also if two
    > addresses were entered differently but were of the same length then each
    > would be returned.
    >
    > How big is the table? Would it be feasible to add a Boolean (Yes/No)
    > KeepMe field to the table and just go through it in a query sorted by
    > LastName,FirstName and check each one you want to keep. You can then delete
    > the redundant ones with:
    >
    > DELETE *
    > FROM Contacts
    > WHERE NOT KeepMe;
    >
    > Having backed up the table first of course!
    >
    > Ken Sheridan
    > Stafford, England
    >
    > "Rich" wrote:
    >
    > > Hello Group,
    > >
    > > can I design a query that will give me the first instance of an address? I
    > > have acontact database with multiple instances of the same person. I need to
    > > come up with one address per person. I cannot take the lates date or any
    > > other field to some how make the record different. The problem lays in that
    > > people tend to spell words like "Park" differntly.
    > >
    > > Here is the data:
    > >
    > > Stacie Plant 1525 E Hyde Pk Blvd
    > > Stacie Plant 1525 E Hyde Park Blvd
    > > Stacie Plant 1525 East Hyde Pk Bldg 3rd Fl
    > >
    > > Stacie Plant has three records in the database. I want the last one but I
    > > have no way to isolate that record. Technically, all three records are
    > > different but our minds can see this really is hte same person, they just
    > > filled out their contact record three times.
    > >
    > > Rich

    >
     

Share This Page