Welcome to SPN

Register and Join the most happening forum of Sikh community & intellectuals from around the world.

Sign Up Now!

Northwind Mystery

Discussion in 'Information Technology' started by Tom Ellison, Jul 28, 2006.

  1. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    In the Northwind database, there are several indexes for which I do not
    understand the purpose.

    In the Products table, there are indexes on CategoryID and SupplierID. For
    what purpose are these created? I haven't seen anywhere they are used. If
    there is a reason for them, I'd like to know.

    These are indexes of surrogate key values that are the autonumbers in other
    tables.

    Thanks.

    Tom Ellison
     
  2. Loading...

    Similar Threads Forum Date
    USA How Police Identified Severed Head After 24-Year Mystery Breaking News Mar 29, 2013
    Mystery Of Meaning Of Death In Sikhism Questions and Answers Dec 15, 2012
    Pace Bowling: Indian Mystery, Indian Obsession Sports & Fitness Aug 29, 2011
    Judaism What is the Greatest Mystery? Interfaith Dialogues Jul 18, 2011
    Ganga Sagar: History, Mystery, Legend, Devotion History of Sikhism Oct 3, 2010

  3. Norman Yuan

    Norman Yuan
    Expand Collapse
    Guest

    Those fields (CategoryID, SupplierID) are the foreign keys, which, along
    with primary keys, are used to establish relationships between tables.
    Primary/foriegn key is a very basic concept of relational database. One must
    understand these basic things in order to do some meaningful work with
    relational database system.

    "Tom Ellison" <tellison@jcdoyle.com> wrote in message
    news:uyaZi5UdGHA.1656@TK2MSFTNGP02.phx.gbl...
    > In the Northwind database, there are several indexes for which I do not
    > understand the purpose.
    >
    > In the Products table, there are indexes on CategoryID and SupplierID.
    > For what purpose are these created? I haven't seen anywhere they are
    > used. If there is a reason for them, I'd like to know.
    >
    > These are indexes of surrogate key values that are the autonumbers in
    > other tables.
    >
    > Thanks.
    >
    > Tom Ellison
    >
    >
     
  4. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    Dear Norman:

    OK. I won't recite my credentials, but I'm a senior developer, and I'm
    extremely familiar with these concepts. There MUST be a unique index on the
    column(s) of a relationship on the ONE side of the relationship. But why
    index them on the MANY side of the relationship? They certainly aren't
    unique there!

    I have generally not seen an index on the related columns placed in the
    table on the MANY side!

    I suppose my question presumes some familiarity with this common database
    example.

    Thanks for offering your suggestion, Norman!

    Tom Ellison


    "Norman Yuan" <NotReal@NotReal.not> wrote in message
    news:OzQ0wbVdGHA.1204@TK2MSFTNGP02.phx.gbl...
    > Those fields (CategoryID, SupplierID) are the foreign keys, which, along
    > with primary keys, are used to establish relationships between tables.
    > Primary/foriegn key is a very basic concept of relational database. One
    > must understand these basic things in order to do some meaningful work
    > with relational database system.
    >
    > "Tom Ellison" <tellison@jcdoyle.com> wrote in message
    > news:uyaZi5UdGHA.1656@TK2MSFTNGP02.phx.gbl...
    >> In the Northwind database, there are several indexes for which I do not
    >> understand the purpose.
    >>
    >> In the Products table, there are indexes on CategoryID and SupplierID.
    >> For what purpose are these created? I haven't seen anywhere they are
    >> used. If there is a reason for them, I'd like to know.
    >>
    >> These are indexes of surrogate key values that are the autonumbers in
    >> other tables.
    >>
    >> Thanks.
    >>
    >> Tom Ellison
    >>
    >>

    >
    >
     
  5. John Vinson

    John Vinson
    Expand Collapse
    Guest

    On Thu, 11 May 2006 17:58:13 -0500, "Tom Ellison"
    <tellison@jcdoyle.com> wrote:

    >In the Products table, there are indexes on CategoryID and SupplierID. For
    >what purpose are these created? I haven't seen anywhere they are used. If
    >there is a reason for them, I'd like to know.
    >
    >These are indexes of surrogate key values that are the autonumbers in other
    >tables.


    They're not "surrogate" key values - they are foreign keys. For
    example, there is an enforced Relationship defined between the primary
    key of Categories and the corresponding foreign key CategoryID in the
    Products table; the way that Access enforces this relationship is...
    tadaa!... with an Index.

    John W. Vinson[MVP]
     
  6. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    Dear John:

    An autonumber used for a relationship between two tables is certainly what
    I'd call a surrogate key. If you have another definition, I'd be glad to
    here it. But that's not the point here.

    The question is, why would there be an index on the CategoryID column in the
    Products table, that is, in the table on the MANY side of the relationshkp.
    I know there must be a unique index on the key column on the ONE side of the
    relationship. I have created many one-to-many relationships, but having an
    index on the key column on the MANY side has never been needed. Don't you
    concur?

    Which brings me back to the question. Why index the CategoryID and the
    SupplierID in the table on the MANY side of the relationshkp, the Products
    table?

    Thanks.

    Tom Ellison


    "John Vinson" <jvinson@STOP_SPAM.WysardOfInfo.com> wrote in message
    news:7rl762do6j7oc7vbinbrp1anjj93nud70a@4ax.com...
    > On Thu, 11 May 2006 17:58:13 -0500, "Tom Ellison"
    > <tellison@jcdoyle.com> wrote:
    >
    >>In the Products table, there are indexes on CategoryID and SupplierID.
    >>For
    >>what purpose are these created? I haven't seen anywhere they are used.
    >>If
    >>there is a reason for them, I'd like to know.
    >>
    >>These are indexes of surrogate key values that are the autonumbers in
    >>other
    >>tables.

    >
    > They're not "surrogate" key values - they are foreign keys. For
    > example, there is an enforced Relationship defined between the primary
    > key of Categories and the corresponding foreign key CategoryID in the
    > Products table; the way that Access enforces this relationship is...
    > tadaa!... with an Index.
    >
    > John W. Vinson[MVP]
     
  7. Albert D.Kallal

    Albert D.Kallal
    Expand Collapse
    Guest

    "Tom Ellison" <tellison@jcdoyle.com> wrote in message
    news:uyaZi5UdGHA.1656@TK2MSFTNGP02.phx.gbl...
    > In the Northwind database, there are several indexes for which I do not
    > understand the purpose.
    >
    > In the Products table, there are indexes on CategoryID and SupplierID.
    > For what purpose are these created? I haven't seen anywhere they are
    > used. If there is a reason for them, I'd like to know.


    Well, you are correct, they are not really needed, and the database will
    function without them.

    However, if you plan run any report based on products, and want to select
    (filter) only particular products of a given category, then a index would
    make a very large difference in performance. The same idea applies to the
    CatagryID.

    If I filter, or want to find a particular product by catagoryID, then a
    index would again make a VERY large difference in performance.

    As a side note, any time you create a field name of number type, and the
    field name ends with "ID" , ms-access will default to indexing that column
    (so, actually, you have to be careful to not accident create indexes when
    you don't need them!)

    I have to say for the most part, that indexing these types of "catgalogry"
    fields is a very good idea from a performance point of view.

    So, likely, they are there because when you enter field names that end in
    id...ms-access will automatic default to indexed (so, perhaps someone did
    nothing here (lazy), .and just let ms-access default to having a index).

    However, from a sorting, or selecting point of view...having a index here
    will make a large difference in performance....


    --
    Albert D. Kallal (Access MVP)
    Edmonton, Alberta Canada
    pleaseNOOSpamKallal@msn.com
    http://www.members.shaw.ca/AlbertKallal
     
  8. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    Dear Albert:

    Now, that's some good stuff! Thanks!

    I certainly didn't know Access would create indexes based on column names.
    That seems weird indeed. But it probably explains why these indexes are
    there. Whoever built Northwind may not have known there were there.

    Isn't it true that having extra indexes can also be a bad thing? It takes
    time to change the index whenever a new row is added, a row is deleted, or
    when the subject column is updated. This is NOT so good for performance.

    Your suggestion that this column might well be filtered in a query or form
    is quite reasonable. The amount of performance gained with an index would
    depend on what proportion of the table is filtered. If a filter includes
    80% of the table, then a table scan is not so inefficient - you're going to
    have to read 80% of the rows anyway. If the filter includes 1% of the
    table, an index could be a big boost. Makes sense here.

    To put this into Northwind without any explanation doesn't tend to serve the
    rationale behind Northwind, to teach and give examples. I could not learn
    this from Northwind, but fortunately there is Albert.

    Now, the question is, are these indexes used somewhere in Northwind,
    justifying their existence? Well, you might come along and filter by just
    about any column in the database, right? So, let's start out with all the
    columns indexed, just for good measure. Somehow, that's the impression I'm
    getting here. Not really a good idea.

    I'll be sure none of my numeric columns ends in ID. That's for sure. I
    like to control the design of my tables, not have it taken from me.

    Tom Ellison


    "Albert D.Kallal" <PleaseNOOOsPAMmkallal@msn.com> wrote in message
    news:urVU89VdGHA.2188@TK2MSFTNGP05.phx.gbl...
    > "Tom Ellison" <tellison@jcdoyle.com> wrote in message
    > news:uyaZi5UdGHA.1656@TK2MSFTNGP02.phx.gbl...
    >> In the Northwind database, there are several indexes for which I do not
    >> understand the purpose.
    >>
    >> In the Products table, there are indexes on CategoryID and SupplierID.
    >> For what purpose are these created? I haven't seen anywhere they are
    >> used. If there is a reason for them, I'd like to know.

    >
    > Well, you are correct, they are not really needed, and the database will
    > function without them.
    >
    > However, if you plan run any report based on products, and want to select
    > (filter) only particular products of a given category, then a index would
    > make a very large difference in performance. The same idea applies to the
    > CatagryID.
    >
    > If I filter, or want to find a particular product by catagoryID, then a
    > index would again make a VERY large difference in performance.
    >
    > As a side note, any time you create a field name of number type, and the
    > field name ends with "ID" , ms-access will default to indexing that
    > column (so, actually, you have to be careful to not accident create
    > indexes when you don't need them!)
    >
    > I have to say for the most part, that indexing these types of "catgalogry"
    > fields is a very good idea from a performance point of view.
    >
    > So, likely, they are there because when you enter field names that end in
    > id...ms-access will automatic default to indexed (so, perhaps someone did
    > nothing here (lazy), .and just let ms-access default to having a index).
    >
    > However, from a sorting, or selecting point of view...having a index here
    > will make a large difference in performance....
    >
    >
    > --
    > Albert D. Kallal (Access MVP)
    > Edmonton, Alberta Canada
    > pleaseNOOSpamKallal@msn.com
    > http://www.members.shaw.ca/AlbertKallal
    >
    >
     
  9. Arvin Meyer [MVP]

    Arvin Meyer [MVP]
    Expand Collapse
    Guest

    "Tom Ellison" <tellison@jcdoyle.com> wrote in message
    news:OWIn5LWdGHA.3352@TK2MSFTNGP03.phx.gbl...
    > Dear Albert:
    >
    > Now, that's some good stuff! Thanks!
    >
    > I certainly didn't know Access would create indexes based on column names.
    > That seems weird indeed. But it probably explains why these indexes are
    > there. Whoever built Northwind may not have known there were there.
    >
    > I'll be sure none of my numeric columns ends in ID. That's for sure. I
    > like to control the design of my tables, not have it taken from me.


    Hi Tom,

    Not just ID, Access automatically indexes fields ending in:

    ID; key; code; num

    unless you disable this in Options.
    --
    Arvin Meyer, MCP, MVP
    Microsoft Access
    Free Access downloads
    http://www.datastrat.com
    http://www.mvps.org/access
     
  10. Jeff Conrad

    Jeff Conrad
    Expand Collapse
    Guest

    Hi Tom,

    > I'll be sure none of my numeric columns ends in ID. That's for sure. I like to control the
    > design of my tables, not have it taken from me.


    If you go to Tools | Options | Tables/Queries
    look for the text box called "AutoIndex on Import/Create:"
    You'll see the default settings that Access uses to automatically
    create indexes without your permission. On a default installation
    you can see that Access has four entries in that text box. Any field
    names that end with ID, key, code, or num will be "graciously"
    indexed automatically for you.
    <g>
    --
    Jeff Conrad
    Access Junkie - MVP
    http://home.bendbroadband.com/conradsystems/accessjunkie.html
    http://www.access.qbuilt.com/html/articles.html

    "Tom Ellison" wrote in message:
    news:OWIn5LWdGHA.3352@TK2MSFTNGP03.phx.gbl...

    > Dear Albert:
    >
    > Now, that's some good stuff! Thanks!
    >
    > I certainly didn't know Access would create indexes based on column names. That seems weird
    > indeed. But it probably explains why these indexes are there. Whoever built Northwind may not
    > have known there were there.
    >
    > Isn't it true that having extra indexes can also be a bad thing? It takes time to change the
    > index whenever a new row is added, a row is deleted, or when the subject column is updated. This
    > is NOT so good for performance.
    >
    > Your suggestion that this column might well be filtered in a query or form is quite reasonable.
    > The amount of performance gained with an index would depend on what proportion of the table is
    > filtered. If a filter includes 80% of the table, then a table scan is not so inefficient - you're
    > going to have to read 80% of the rows anyway. If the filter includes 1% of the table, an index
    > could be a big boost. Makes sense here.
    >
    > To put this into Northwind without any explanation doesn't tend to serve the rationale behind
    > Northwind, to teach and give examples. I could not learn this from Northwind, but fortunately
    > there is Albert.
    >
    > Now, the question is, are these indexes used somewhere in Northwind, justifying their existence?
    > Well, you might come along and filter by just about any column in the database, right? So, let's
    > start out with all the columns indexed, just for good measure. Somehow, that's the impression I'm
    > getting here. Not really a good idea.
    >
    > I'll be sure none of my numeric columns ends in ID. That's for sure. I like to control the
    > design of my tables, not have it taken from me.
    >
    > Tom Ellison
    >
    >
    > "Albert D.Kallal" <PleaseNOOOsPAMmkallal@msn.com> wrote in message
    > news:urVU89VdGHA.2188@TK2MSFTNGP05.phx.gbl...
    >> "Tom Ellison" <tellison@jcdoyle.com> wrote in message
    >> news:uyaZi5UdGHA.1656@TK2MSFTNGP02.phx.gbl...
    >>> In the Northwind database, there are several indexes for which I do not understand the purpose.
    >>>
    >>> In the Products table, there are indexes on CategoryID and SupplierID. For what purpose are
    >>> these created? I haven't seen anywhere they are used. If there is a reason for them, I'd like
    >>> to know.

    >>
    >> Well, you are correct, they are not really needed, and the database will function without them.
    >>
    >> However, if you plan run any report based on products, and want to select (filter) only
    >> particular products of a given category, then a index would make a very large difference in
    >> performance. The same idea applies to the CatagryID.
    >>
    >> If I filter, or want to find a particular product by catagoryID, then a index would again make a
    >> VERY large difference in performance.
    >>
    >> As a side note, any time you create a field name of number type, and the field name ends with
    >> "ID" , ms-access will default to indexing that column (so, actually, you have to be careful to
    >> not accident create indexes when you don't need them!)
    >>
    >> I have to say for the most part, that indexing these types of "catgalogry" fields is a very good
    >> idea from a performance point of view.
    >>
    >> So, likely, they are there because when you enter field names that end in id...ms-access will
    >> automatic default to indexed (so, perhaps someone did nothing here (lazy), .and just let
    >> ms-access default to having a index).
    >>
    >> However, from a sorting, or selecting point of view...having a index here will make a large
    >> difference in performance....
    >>
    >>
    >> --
    >> Albert D. Kallal (Access MVP)
    >> Edmonton, Alberta Canada
    >> pleaseNOOSpamKallal@msn.com
    >> http://www.members.shaw.ca/AlbertKallal
    >>
    >>

    >
    >
     
  11. Albert D.Kallal

    Albert D.Kallal
    Expand Collapse
    Guest

    > I certainly didn't know Access would create indexes based on column names.
    > That seems weird indeed. But it probably explains why these indexes are
    > there. Whoever built Northwind may not have known there were there.


    You can *see* this then you are in table design mode...if you type in a
    field name..and choose the type as number *and* the field name ends in
    id...notice how the "index" property is set...you can if you please at this
    point set the field to no index. So, this is NOT a invisible
    occurrence...but just simply a default that access has, and you can EASILY
    see this during the field design/creating process when in table design mode.

    >
    > Isn't it true that having extra indexes can also be a bad thing? It takes
    > time to change the index whenever a new row is added, a row is deleted, or
    > when the subject column is updated. This is NOT so good for performance.


    Yes...have to 100% agree. Just throwing up a index on every field with blind
    eyes is just as bad (or worse!!) as a approach of not having any indexes at
    all!!

    So, that default is just a guess that MOST fields that are number type, and
    end in ID likely are VERY good candidates to be indexed...but, not always...

    This decision to index or not....often comes down to the black art of
    performance. For example, if we were NEVER to search, or build reports by
    productID, or category, then those indexes would most certainly be not
    needed, and would be a performance liability here. (however, those fields
    are part of relationships...and thus they *really* do need to be indexed!).

    In the case of relational data joins, again the query processor can, and
    *often* will use the index on the foreign key field.

    Having met you in the past, I am aware that a considerable amount of your
    work is with sql server. It turns out that missing a index here or there
    when using sql server is not so bad from a network point of view. If you
    search for a productID with sql server, and no index is present, then sql
    server will perform a table scan to get that match. However, during this
    scan, NO NETWORK LOAD occurs. Only AFTER sql server finds the records, does
    it send them down the wire. However, the sql server disk drive certainly did
    get a workout..and more load was placed on the server.

    So, often missing a index here, or there does not really effect *NETWORK*
    traffic and performance very much with sql server.

    With a file share/jet on a network, then this is a much more critical issue.

    So, sql server can do the search locally, and THEN send the records. With a
    file share/jet system, that index is REALLY important, since without the
    index, all records will travel across the network when searching. With a
    index, only those matching records will travel. down the network wire.

    So, good indexing is much more needed when you don't have sql server. In
    fact, you have to be MORE aggressive in the use of indexes for a file share
    as compared to sql server...

    > is quite reasonable. The amount of performance gained with an index would
    > depend on what proportion of the table is filtered. If a filter includes
    > 80% of the table, then a table scan is not so inefficient - you're going
    > to have to read 80% of the rows anyway. If the filter includes 1% of the
    > table, an index could be a big boost. Makes sense here.


    Yup..agreed...

    >
    > Now, the question is, are these indexes used somewhere in Northwind,
    > justifying their existence? Well, you might come along and filter by just
    > about any column in the database, right? So, let's start out with all the
    > columns indexed, just for good measure. Somehow, that's the impression
    > I'm getting here. Not really a good idea.


    Well, they did not index all of the fields. Another *Very* important point
    here is that those fields are part of a relationship.

    If you build a form based on Suppliers, and then have a sub-form to
    display/show all of the products, how will access retrieve ONLY just the
    records that belong to a given SupplerID? We load one suppler...but have a
    sub-form of products. In this case, with a with NO index on the
    [Products].[SupplerID] field, then table scans will occur. So, a index is
    very impoarant here.

    In addition to sub-forms, the issue of building a relational sql join from
    Suppliers to products also *really* needs a index to perform a join with any
    amount of performance. We might restive two Suppliers...and want to see all
    of their products. JET (and sql server) can thus restive the two
    records..and then based on the index field...grab the child records here
    (say, the 12, or 15 products). Without a index on the products.SupplerID
    field....jet (or sql server) can't just retrieve the few needed records.

    In fact, one the BEST performance gains you can get in ms-access is to
    ensure that ALL Forign key fields are indexed for relations. I index ALL of
    these in my applications.

    >
    > I'll be sure none of my numeric columns ends in ID. That's for sure. I
    > like to control the design of my tables, not have it taken from me.


    As mentioned, this is only a default presented *during* table design mode.
    You can see/choose the index setting. So, it is more a default during design
    mode, and NOT a hidden default that you don't know about...(

    Try creating a test mdb file...and start a new table in design mode. .I
    believe access also scans for "code" and a few other types of keywords that
    will "default" the index setting to change from none to "indexed:....but,
    note it is ALWAYS in plain view to see, and this ONLY occurs when you are
    adding a new field.........it is never done under the covers hidden...

    --
    Albert D. Kallal (Access MVP)
    Edmonton, Alberta Canada
    pleaseNOOSpamKallal@msn.com
    http://www.members.shaw.ca/AlbertKallal
     
  12. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    Dear Albert:

    Well, you are right on, as usual. Mostly, that is.

    Now, I do not agree that having an index on these columns in the table on
    the MANY side of the relationship are needed. I rather doubt that's what
    you meant, however.

    Your comments about how this works with SQL Server are quite appropriate. I
    try hard not to create table scans there. There are usually a good number
    of indexes in that work.

    Because the SQL Server applications tend to have more users, and more
    traffic, the need for indexing is probably greater. The analyzer tools help
    point this out for us. If all things were equal, then you're right, this
    would be more critical for Jet.

    Finally, here's the nasty part. You can, of course, have only one Primary
    Key. When the database is Compact/Repaired, the rows are put in the order
    of that PK.

    When you build a filter for a form, possibly based on its controls, you
    would be filtering by the natural key. When you have a form/subform setup,
    and you navigate successively through the FK table, the dependent table is
    accessed successively in the natural key order, that is, the order of the
    foreign key table in its form. Most reports would likely follow the same
    path. The more so for my applications, where I commonly have a continuous
    form for the FK table as well as the dependent table (I rewrite the SQL for
    the subform using the Current event of the FK table's form, and assign a new
    RecordSource each time).

    Now, if the full natural key is placed into the dependent table, and is its
    primary key, then when the database is compacted you are actually traversing
    the dependent table in its physical sequence, which is very, very good for
    performance. Again, the same thing happens in reports. If instead you have
    an autonumber key for the FK table, and that is in the dependent table
    instead of the natural key from the FK table, then there is no opportunity
    to put the rows of the dependent table in natural key order, and the disk
    access will be random instead of sequential. Considering that the segments
    of the database are 4K, this means that you will be reading all over the
    database frequently, instead of processing 20 to 50 rows or so sequentially
    before another HD access. Remember, each access takes 10 mS or so, and they
    really add up when compared to processing what is in memory.

    I'm creating a version of Northwind without autonumbers, but just all
    natural keys. I'll do some extensive side-by-side testing of NW as is with
    my natural key version and try to demonstrate what happens. I've seen this
    debated enough. I don't want to phony up some database that just makes this
    test appear one way or the other. I've picked NW as a "standard" not of my
    own making. So, knowing why these indexes are there is important, so we can
    compare apples to apples.

    Thanks for your contribution, Albert. Hope to be seeing you soon!

    Tom Ellison


    "Albert D.Kallal" <PleaseNOOOsPAMmkallal@msn.com> wrote in message
    news:%23yPDXtWdGHA.5116@TK2MSFTNGP02.phx.gbl...
    >> I certainly didn't know Access would create indexes based on column
    >> names. That seems weird indeed. But it probably explains why these
    >> indexes are there. Whoever built Northwind may not have known there were
    >> there.

    >
    > You can *see* this then you are in table design mode...if you type in a
    > field name..and choose the type as number *and* the field name ends in
    > id...notice how the "index" property is set...you can if you please at
    > this point set the field to no index. So, this is NOT a invisible
    > occurrence...but just simply a default that access has, and you can EASILY
    > see this during the field design/creating process when in table design
    > mode.
    >
    >>
    >> Isn't it true that having extra indexes can also be a bad thing? It
    >> takes time to change the index whenever a new row is added, a row is
    >> deleted, or when the subject column is updated. This is NOT so good for
    >> performance.

    >
    > Yes...have to 100% agree. Just throwing up a index on every field with
    > blind eyes is just as bad (or worse!!) as a approach of not having any
    > indexes at all!!
    >
    > So, that default is just a guess that MOST fields that are number type,
    > and end in ID likely are VERY good candidates to be indexed...but, not
    > always...
    >
    > This decision to index or not....often comes down to the black art of
    > performance. For example, if we were NEVER to search, or build reports by
    > productID, or category, then those indexes would most certainly be not
    > needed, and would be a performance liability here. (however, those fields
    > are part of relationships...and thus they *really* do need to be
    > indexed!).
    >
    > In the case of relational data joins, again the query processor can, and
    > *often* will use the index on the foreign key field.
    >
    > Having met you in the past, I am aware that a considerable amount of your
    > work is with sql server. It turns out that missing a index here or there
    > when using sql server is not so bad from a network point of view. If you
    > search for a productID with sql server, and no index is present, then sql
    > server will perform a table scan to get that match. However, during this
    > scan, NO NETWORK LOAD occurs. Only AFTER sql server finds the records,
    > does it send them down the wire. However, the sql server disk drive
    > certainly did get a workout..and more load was placed on the server.
    >
    > So, often missing a index here, or there does not really effect *NETWORK*
    > traffic and performance very much with sql server.
    >
    > With a file share/jet on a network, then this is a much more critical
    > issue.
    >
    > So, sql server can do the search locally, and THEN send the records. With
    > a file share/jet system, that index is REALLY important, since without the
    > index, all records will travel across the network when searching. With a
    > index, only those matching records will travel. down the network wire.
    >
    > So, good indexing is much more needed when you don't have sql server. In
    > fact, you have to be MORE aggressive in the use of indexes for a file
    > share as compared to sql server...
    >
    >> is quite reasonable. The amount of performance gained with an index
    >> would depend on what proportion of the table is filtered. If a filter
    >> includes 80% of the table, then a table scan is not so inefficient -
    >> you're going to have to read 80% of the rows anyway. If the filter
    >> includes 1% of the table, an index could be a big boost. Makes sense
    >> here.

    >
    > Yup..agreed...
    >
    >>
    >> Now, the question is, are these indexes used somewhere in Northwind,
    >> justifying their existence? Well, you might come along and filter by
    >> just about any column in the database, right? So, let's start out with
    >> all the columns indexed, just for good measure. Somehow, that's the
    >> impression I'm getting here. Not really a good idea.

    >
    > Well, they did not index all of the fields. Another *Very* important point
    > here is that those fields are part of a relationship.
    >
    > If you build a form based on Suppliers, and then have a sub-form to
    > display/show all of the products, how will access retrieve ONLY just the
    > records that belong to a given SupplerID? We load one suppler...but have a
    > sub-form of products. In this case, with a with NO index on the
    > [Products].[SupplerID] field, then table scans will occur. So, a index is
    > very impoarant here.
    >
    > In addition to sub-forms, the issue of building a relational sql join from
    > Suppliers to products also *really* needs a index to perform a join with
    > any amount of performance. We might restive two Suppliers...and want to
    > see all of their products. JET (and sql server) can thus restive the two
    > records..and then based on the index field...grab the child records here
    > (say, the 12, or 15 products). Without a index on the products.SupplerID
    > field....jet (or sql server) can't just retrieve the few needed records.
    >
    > In fact, one the BEST performance gains you can get in ms-access is to
    > ensure that ALL Forign key fields are indexed for relations. I index ALL
    > of these in my applications.
    >
    >>
    >> I'll be sure none of my numeric columns ends in ID. That's for sure. I
    >> like to control the design of my tables, not have it taken from me.

    >
    > As mentioned, this is only a default presented *during* table design mode.
    > You can see/choose the index setting. So, it is more a default during
    > design mode, and NOT a hidden default that you don't know about...(
    >
    > Try creating a test mdb file...and start a new table in design mode. .I
    > believe access also scans for "code" and a few other types of keywords
    > that will "default" the index setting to change from none to
    > "indexed:....but, note it is ALWAYS in plain view to see, and this ONLY
    > occurs when you are adding a new field.........it is never done under the
    > covers hidden...
    >
    > --
    > Albert D. Kallal (Access MVP)
    > Edmonton, Alberta Canada
    > pleaseNOOSpamKallal@msn.com
    > http://www.members.shaw.ca/AlbertKallal
    >
     
  13. John Vinson

    John Vinson
    Expand Collapse
    Guest

    On Thu, 11 May 2006 19:49:26 -0500, "Tom Ellison"
    <tellison@jcdoyle.com> wrote:

    >Dear John:
    >
    >An autonumber used for a relationship between two tables is certainly what
    >I'd call a surrogate key. If you have another definition, I'd be glad to
    >here it. But that's not the point here.


    Well, yes - it's a surrogate key in the "one" side table. Sorry!

    >The question is, why would there be an index on the CategoryID column in the
    >Products table, that is, in the table on the MANY side of the relationshkp.
    >I know there must be a unique index on the key column on the ONE side of the
    >relationship. I have created many one-to-many relationships, but having an
    >index on the key column on the MANY side has never been needed. Don't you
    >concur?


    >Which brings me back to the question. Why index the CategoryID and the
    >SupplierID in the table on the MANY side of the relationshkp, the Products
    >table?


    I'm not certain just HOW the JET database engine uses these indexes in
    the process of enforcing relationships, but that's my understanding.
    If you use the Relationships Window (or even the Lookup Wizard) to
    relate two tables, you must have the unique index already set in the
    "one" table; the program will create a nonunique index in the "many"
    table as part of the process.

    Again, I'm not privy to the details of the code so I can't say why;
    but I can speculate that it's more efficient to determine whether an
    orphan record would be created by joining two indexes.

    John W. Vinson[MVP]
     
  14. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    Dear John:

    You can drop these indexes and the relationships survive just fine. They
    are not needed for a relationship. A UNIQUE index on the One side is
    required, but, the best I understand, no index is required on the MANY side
    table. That makes sense to me.

    I believe Albert has exposed this accurately. Access can be creating these
    automatically. Now that he mentions it, I'm just sure I've seen it. Never
    knew why, though.

    Tom Ellison


    "John Vinson" <jvinson@STOP_SPAM.WysardOfInfo.com> wrote in message
    news:eek:p2862546ias26a93vc546q93niileadjo@4ax.com...
    > On Thu, 11 May 2006 19:49:26 -0500, "Tom Ellison"
    > <tellison@jcdoyle.com> wrote:
    >
    >>Dear John:
    >>
    >>An autonumber used for a relationship between two tables is certainly what
    >>I'd call a surrogate key. If you have another definition, I'd be glad to
    >>here it. But that's not the point here.

    >
    > Well, yes - it's a surrogate key in the "one" side table. Sorry!
    >
    >>The question is, why would there be an index on the CategoryID column in
    >>the
    >>Products table, that is, in the table on the MANY side of the
    >>relationshkp.
    >>I know there must be a unique index on the key column on the ONE side of
    >>the
    >>relationship. I have created many one-to-many relationships, but having
    >>an
    >>index on the key column on the MANY side has never been needed. Don't you
    >>concur?

    >
    >>Which brings me back to the question. Why index the CategoryID and the
    >>SupplierID in the table on the MANY side of the relationshkp, the Products
    >>table?

    >
    > I'm not certain just HOW the JET database engine uses these indexes in
    > the process of enforcing relationships, but that's my understanding.
    > If you use the Relationships Window (or even the Lookup Wizard) to
    > relate two tables, you must have the unique index already set in the
    > "one" table; the program will create a nonunique index in the "many"
    > table as part of the process.
    >
    > Again, I'm not privy to the details of the code so I can't say why;
    > but I can speculate that it's more efficient to determine whether an
    > orphan record would be created by joining two indexes.
    >
    > John W. Vinson[MVP]
     
  15. Wolfgang Kais

    Wolfgang Kais
    Expand Collapse
    Guest

    Hello Tom.

    Tom Ellison wrote:
    > In the Northwind database, there are several indexes for which I do
    > not understand the purpose.
    >
    > In the Products table, there are indexes on CategoryID and SupplierID.
    > For what purpose are these created? I haven't seen anywhere they are
    > used. If there is a reason for them, I'd like to know.
    >
    > These are indexes of surrogate key values that are the autonumbers in
    > other tables.


    Indeed, these two indexes you see in the indexes window from the table
    design window are unnecessary, they even are redundant.
    Unlike SQL Server, Jet always creates "foreign" indexes when you create
    foreign key contraints (relationships with referential integrity enabled).
    These indexes aren't listed in the indexes window, but you'll find them
    using vba (dao) or using the database documentor (Tools, Analyze).
    They should be named CategoriesProducts and SuppliersProducts.

    --
    Regards,
    Wolfgang
     
  16. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    Dear Wolfgang:

    Thanks very much. Although I hadn't understood such indexes were being
    created automatically, as Albert also explained, I had strong doubt they are
    unnecessary.

    I'm working on a paper to critique Northwind. It seems that many Access
    users have studied Northwind, and it's held up as an example of good
    practice. I propose to provide a version of Northwind that is more worthy
    of that. I appreciate your help in doing this.

    Tom Ellison


    "Wolfgang Kais" <firstoffirstname.lastname@gmx.de> wrote in message
    news:Ok3d4gadGHA.1656@TK2MSFTNGP02.phx.gbl...
    > Hello Tom.
    >
    > Tom Ellison wrote:
    >> In the Northwind database, there are several indexes for which I do
    >> not understand the purpose.
    >>
    >> In the Products table, there are indexes on CategoryID and SupplierID.
    >> For what purpose are these created? I haven't seen anywhere they are
    >> used. If there is a reason for them, I'd like to know.
    >>
    >> These are indexes of surrogate key values that are the autonumbers in
    >> other tables.

    >
    > Indeed, these two indexes you see in the indexes window from the table
    > design window are unnecessary, they even are redundant.
    > Unlike SQL Server, Jet always creates "foreign" indexes when you create
    > foreign key contraints (relationships with referential integrity enabled).
    > These indexes aren't listed in the indexes window, but you'll find them
    > using vba (dao) or using the database documentor (Tools, Analyze).
    > They should be named CategoriesProducts and SuppliersProducts.
    >
    > --
    > Regards,
    > Wolfgang
    >
    >
     
  17. Albert D.Kallal

    Albert D.Kallal
    Expand Collapse
    Guest

    >
    > Finally, here's the nasty part. You can, of course, have only one Primary
    > Key. When the database is Compact/Repaired, the rows are put in the order
    > of that PK.


    Yes, they are. And, unfortunately, as a design criteria, in those child
    tables...I *always* include a primary key autonumber field
    (before even reading farther...I know where this is going!!).

    > When you build a filter for a form, possibly based on its controls, you
    > would be filtering by the natural key. When you have a form/subform
    > setup, and you navigate successively through the FK table, the dependent
    > table is accessed successively in the natural key order, that is, the
    > order of the foreign key table in its form.


    Without question, we now have a dilemma. It would seem *better* to NOT have
    a primary key (autonumber) in the child table, but ONLY have the foreign key
    value..and, if we make it the index, then the child table will be
    "clustered" for us (to steal a sql server term). In a sense, just like sql
    server can only have one clustered index....we kind of have the same thing
    in jet (however, with sql server, you can have both a primary key, and the
    clustered index does NOT have to be this PK).

    > If instead you have an autonumber key for the FK table, and that is in the
    > dependent table instead of the natural key from the FK table, then there
    > is no opportunity to put the rows of the dependent table in natural key
    > order, and the disk access will be random instead of sequential.
    > Considering that the segments of the database are 4K, this means that you
    > will be reading all over the database frequently, instead of processing 20
    > to 50 rows or so sequentially before another HD access. Remember, each
    > access takes 10 mS or so, and they really add up when compared to
    > processing what is in memory.


    *Excellent* observation on your part!!! (just great!!).

    Ok, so, now I going to give my snottily answer to the fact that I *usually*
    include a PK field (autonumber) in that child table!!!
    I have a PK in those child tables for two reasons

    1) - well, obviously, if this table needs a child table...then I
    will need a PK
    (so, for flexibility, and future design consideration...I have a
    pk)
    2) - in invoice type stuff, I often want to retrieve the *last*
    invoice date...and that also includes the case where a person might have
    more then ONE invoice in the same day...so, I want the last one...the only
    way to get this is grab the top 1 date....but throw in a "order by ID desc".
    With the addition of the autonumber field..then each row is unique.
    3) not in love with CODD..but I do believe that each table should
    have a unique key to identify it.

    However, as you point out, the instant I throw in that autonumber PK in this
    child table, then I mess up the clustering I would normally get if I just
    had a index on the FK field..and NO autonumber....

    However, it turns out that *much* of the time, you do actually get the child
    table in the correct order. For example, take a order taking application,
    and I start to type in order details (obviously in a sub-form that is a
    child table). It turns out that in *MOST* cases, I will be adding those
    child tables one after another. So, while I don't get clustering by FK
    order, the PK order is *useally* the same!!! In other words, if I add 5 new
    child records, they all have the same FK, but the PK is actually also
    auto-numbered one after another. Thus even after compacting...my frame/data
    order is in fact clustered by PK value order, but they WERE entered in that
    order!! Thus OFTEN the PK order will group values together by FK values.(at
    least the times when I entered data in a sub-form in one shot...and that
    occurs a LOT of the time!!).

    Of course, this luck of the same order is not the same all the time. For
    example, adding a new customer invoice would happen over time. I do loose
    out on clustering in that case. However, for each invoice I create, and
    other types of data entry, surprisingly often, the PK order follows the FK
    order very close.

    So, I can try and make a case to point out that having a PK is not ideal in
    a child table...but, it is also not as bad as one might think!!


    --
    Albert D. Kallal (Access MVP)
    Edmonton, Alberta Canada
    pleaseNOOSpamKallal@msn.com
    http://www.members.shaw.ca/AlbertKallal
     
  18. Tom Wickerath

    Tom Wickerath
    Expand Collapse
    Guest

    Tom,

    > I'm working on a paper to critique Northwind. It seems that many Access
    > users have studied Northwind, and it's held up as an example of good
    > practice.


    While many Access users, including myself, have studied Northwind, I've
    never heard of anyone holding it up as a example of good practices. It's
    simply convenient to use in examples, since this database is so widely
    available. The fact is that there simply are not m(any) high quality Access
    samples put out by Microsoft. A lot of them have Option Explicit missing in
    one or more code modules, some use reserved words, naming conventions are
    usually not used, duplicate indexes, etc. etc.

    > I propose to provide a version of Northwind that is more worthy
    > of that.


    That shouldn't be too hard. However, you might want to provide two versions
    that are more worthy. One that has only the natural keys that you have such a
    professed love of using, and another copy for those of us who choose to use
    pseudo keys (autonumber). I'm not going to get in that religous argument with
    you, other than to say I am on the pseudo key side of the fence. From your
    past posts, I already know that we are on opposite sides of the fence on this
    issue.

    Wolfgang is 100% correct in his statements. Here are some statements made by
    author and MVP John Viescas on this subject that you might want to check out:

    http://groups.google.com/group/micr..._frm/thread/a61c485120605fb9/cfb61e61c63c7b35

    http://groups.google.com/group/micr..._frm/thread/ad42632ff4ab30bb/08fdb992fef47676

    and

    http://groups.google.com/group/micr..._frm/thread/8a2e6a43362d23cd/c11330eaafa3bf45

    In answer to your initial question about why the duplicate indexes are
    present, here is an short tip that I wrote a few years ago that explains the
    situation:

    http://www.access.qbuilt.com/html/gem_tips.html#AutoIndex


    Tom Wickerath, Microsoft Access MVP

    http://www.access.qbuilt.com/html/expert_contributors.html
    http://www.access.qbuilt.com/html/search.html
    __________________________________________

    "Tom Ellison" wrote:

    > Dear Wolfgang:
    >
    > Thanks very much. Although I hadn't understood such indexes were being
    > created automatically, as Albert also explained, I had strong doubt they are
    > unnecessary.
    >
    > I'm working on a paper to critique Northwind. It seems that many Access
    > users have studied Northwind, and it's held up as an example of good
    > practice. I propose to provide a version of Northwind that is more worthy
    > of that. I appreciate your help in doing this.
    >
    > Tom Ellison
     
  19. John Vinson

    John Vinson
    Expand Collapse
    Guest

    On Thu, 11 May 2006 23:49:33 -0500, "Tom Ellison"
    <tellison@jcdoyle.com> wrote:

    >You can drop these indexes and the relationships survive just fine. They
    >are not needed for a relationship. A UNIQUE index on the One side is
    >required, but, the best I understand, no index is required on the MANY side
    >table. That makes sense to me.


    Interesting! No, I was not aware of that. Thanks!

    John W. Vinson[MVP]
     
  20. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    Dear Albert:

    Your analysis is following my concepts very, very well.

    Now, not every application we write is order entry. Consider what happens
    in an inventory application. The foreign key table is the list of all
    inventroy items, the dependent table is the "activity" on that item,
    purchases, sales, shrinkage, etc.

    In this case, the rows in the dependent table are not grouped together in
    time, and therefore not in their numbering. Doesn't this blow away the
    organization you would have if the autonumbers were assigned "close together
    in time"? And isn't this an example of what happens in most table
    relationships in most applications written?

    As I understand the analysis you've undertaken (and you seem to have lept
    gracefully to the conclusion before I every tried to make it) then having
    the natural key be the PK would be the solution. This does not preclude
    having a surrogate autonumber key, and making it a unique key of the table,
    which also allows it to be used in relationships. At this point, the
    discussion is not whether surrogate key relationships are superior or not.
    It is just about which key should be the PK.

    This presumes you would typically have BOTH the identity key and a unique
    natural key. That's OK, isn't it? So, which should be the PK? I do not
    believe you lose anything making the natural key the PK, and you would often
    gain significantly. Is that what your analysis tell you, too?

    Thanks very much for the discussion. I'll heat up some more vegetable soup
    for you, and pour a glass of wine. (What a fine, stimulating evening that
    was!)

    See you again at a Summit I hope!

    Tom


    "Albert D.Kallal" <PleaseNOOOsPAMmkallal@msn.com> wrote in message
    news:O2IAqbfdGHA.4576@TK2MSFTNGP05.phx.gbl...
    > >
    >> Finally, here's the nasty part. You can, of course, have only one
    >> Primary Key. When the database is Compact/Repaired, the rows are put in
    >> the order of that PK.

    >
    > Yes, they are. And, unfortunately, as a design criteria, in those child
    > tables...I *always* include a primary key autonumber field
    > (before even reading farther...I know where this is going!!).
    >
    >> When you build a filter for a form, possibly based on its controls, you
    >> would be filtering by the natural key. When you have a form/subform
    >> setup, and you navigate successively through the FK table, the dependent
    >> table is accessed successively in the natural key order, that is, the
    >> order of the foreign key table in its form.

    >
    > Without question, we now have a dilemma. It would seem *better* to NOT
    > have a primary key (autonumber) in the child table, but ONLY have the
    > foreign key value..and, if we make it the index, then the child table will
    > be "clustered" for us (to steal a sql server term). In a sense, just like
    > sql server can only have one clustered index....we kind of have the same
    > thing in jet (however, with sql server, you can have both a primary key,
    > and the clustered index does NOT have to be this PK).
    >
    >> If instead you have an autonumber key for the FK table, and that is in
    >> the dependent table instead of the natural key from the FK table, then
    >> there is no opportunity to put the rows of the dependent table in natural
    >> key order, and the disk access will be random instead of sequential.
    >> Considering that the segments of the database are 4K, this means that you
    >> will be reading all over the database frequently, instead of processing
    >> 20 to 50 rows or so sequentially before another HD access. Remember,
    >> each access takes 10 mS or so, and they really add up when compared to
    >> processing what is in memory.

    >
    > *Excellent* observation on your part!!! (just great!!).
    >
    > Ok, so, now I going to give my snottily answer to the fact that I
    > *usually* include a PK field (autonumber) in that child table!!!
    > I have a PK in those child tables for two reasons
    >
    > 1) - well, obviously, if this table needs a child table...then I
    > will need a PK
    > (so, for flexibility, and future design consideration...I have
    > a pk)
    > 2) - in invoice type stuff, I often want to retrieve the *last*
    > invoice date...and that also includes the case where a person might have
    > more then ONE invoice in the same day...so, I want the last one...the only
    > way to get this is grab the top 1 date....but throw in a "order by ID
    > desc". With the addition of the autonumber field..then each row is unique.
    > 3) not in love with CODD..but I do believe that each table should
    > have a unique key to identify it.
    >
    > However, as you point out, the instant I throw in that autonumber PK in
    > this child table, then I mess up the clustering I would normally get if I
    > just had a index on the FK field..and NO autonumber....
    >
    > However, it turns out that *much* of the time, you do actually get the
    > child table in the correct order. For example, take a order taking
    > application, and I start to type in order details (obviously in a sub-form
    > that is a child table). It turns out that in *MOST* cases, I will be
    > adding those child tables one after another. So, while I don't get
    > clustering by FK order, the PK order is *useally* the same!!! In other
    > words, if I add 5 new child records, they all have the same FK, but the PK
    > is actually also auto-numbered one after another. Thus even after
    > compacting...my frame/data order is in fact clustered by PK value order,
    > but they WERE entered in that order!! Thus OFTEN the PK order will group
    > values together by FK values.(at least the times when I entered data in a
    > sub-form in one shot...and that occurs a LOT of the time!!).
    >
    > Of course, this luck of the same order is not the same all the time. For
    > example, adding a new customer invoice would happen over time. I do loose
    > out on clustering in that case. However, for each invoice I create, and
    > other types of data entry, surprisingly often, the PK order follows the FK
    > order very close.
    >
    > So, I can try and make a case to point out that having a PK is not ideal
    > in a child table...but, it is also not as bad as one might think!!
    >
    >
    > --
    > Albert D. Kallal (Access MVP)
    > Edmonton, Alberta Canada
    > pleaseNOOSpamKallal@msn.com
    > http://www.members.shaw.ca/AlbertKallal
    >
    >
     
  21. Tom Ellison

    Tom Ellison
    Expand Collapse
    Guest

    Dear Tom:

    I propose to demonstrate the results of the two different database design
    methods, with and without autonumber surrogate keys, testing performance
    under conditions of inserts, updates, deletes, and straight SELECT queries.
    I chose Northwind as the basis for this. I will soon have two copies the
    table, indexes, and relationships, each built on the different paradigms.
    I'll create a large amount of randomly generated data and insert it in each
    database, then run timings of various operations.

    Tom Ellison


    "Tom Wickerath" <AOS168b AT comcast DOT net> wrote in message
    news:2663BB79-4D85-4D34-88A2-63DFFAA3017E@microsoft.com...
    > Tom,
    >
    >> I'm working on a paper to critique Northwind. It seems that many Access
    >> users have studied Northwind, and it's held up as an example of good
    >> practice.

    >
    > While many Access users, including myself, have studied Northwind, I've
    > never heard of anyone holding it up as a example of good practices. It's
    > simply convenient to use in examples, since this database is so widely
    > available. The fact is that there simply are not m(any) high quality
    > Access
    > samples put out by Microsoft. A lot of them have Option Explicit missing
    > in
    > one or more code modules, some use reserved words, naming conventions are
    > usually not used, duplicate indexes, etc. etc.
    >
    >> I propose to provide a version of Northwind that is more worthy
    >> of that.

    >
    > That shouldn't be too hard. However, you might want to provide two
    > versions
    > that are more worthy. One that has only the natural keys that you have
    > such a
    > professed love of using, and another copy for those of us who choose to
    > use
    > pseudo keys (autonumber). I'm not going to get in that religous argument
    > with
    > you, other than to say I am on the pseudo key side of the fence. From your
    > past posts, I already know that we are on opposite sides of the fence on
    > this
    > issue.
    >
    > Wolfgang is 100% correct in his statements. Here are some statements made
    > by
    > author and MVP John Viescas on this subject that you might want to check
    > out:
    >
    > http://groups.google.com/group/micr..._frm/thread/a61c485120605fb9/cfb61e61c63c7b35
    >
    > http://groups.google.com/group/micr..._frm/thread/ad42632ff4ab30bb/08fdb992fef47676
    >
    > and
    >
    > http://groups.google.com/group/micr..._frm/thread/8a2e6a43362d23cd/c11330eaafa3bf45
    >
    > In answer to your initial question about why the duplicate indexes are
    > present, here is an short tip that I wrote a few years ago that explains
    > the
    > situation:
    >
    > http://www.access.qbuilt.com/html/gem_tips.html#AutoIndex
    >
    >
    > Tom Wickerath, Microsoft Access MVP
    >
    > http://www.access.qbuilt.com/html/expert_contributors.html
    > http://www.access.qbuilt.com/html/search.html
    > __________________________________________
    >
    > "Tom Ellison" wrote:
    >
    >> Dear Wolfgang:
    >>
    >> Thanks very much. Although I hadn't understood such indexes were being
    >> created automatically, as Albert also explained, I had strong doubt they
    >> are
    >> unnecessary.
    >>
    >> I'm working on a paper to critique Northwind. It seems that many Access
    >> users have studied Northwind, and it's held up as an example of good
    >> practice. I propose to provide a version of Northwind that is more
    >> worthy
    >> of that. I appreciate your help in doing this.
    >>
    >> Tom Ellison
     

Share This Page