Asp Forum - Advice needed: Bigint (identities) - Am i overdoing it?

Lim Siew Yin

3/16/2007 2:55:00 AM

Hi ppl,

Its me again. Sorry for sending so much noobie questions but since we're
scraping the previous project and redoing it all over... i wanted to make
sure we do it right this time round.

k... Using bigint identities is soooooo convenient (instant primary key)
that i am worried we're overdoing it. Almost all of our tables have it as
primary key... is this bad?
(Its ok as long as I am indexing important searchable columns ya?)

Thanks again.

Limsy the worried noobie... obviously.

9 Answers

unknown

3/16/2007 3:13:00 AM

"Lim Siew Yin" <limsyn.nospam@gmail.com> wrote in message
news:eV0P6Z3ZHHA.4420@TK2MSFTNGP02.phx.gbl...
> Hi ppl,
>
> Its me again. Sorry for sending so much noobie questions but since we're
> scraping the previous project and redoing it all over... i wanted to make
> sure we do it right this time round.
>
> k... Using bigint identities is soooooo convenient (instant primary key)
> that i am worried we're overdoing it. Almost all of our tables have it as
> primary key... is this bad?
> (Its ok as long as I am indexing important searchable columns ya?)

IMO it's the best way to do it.

Michael

David Portas

3/16/2007 4:17:00 AM

On 16 Mar, 02:54, "Lim Siew Yin" <limsyn.nos...@gmail.com> wrote:
> Hi ppl,
>
> Its me again. Sorry for sending so much noobie questions but since we're
> scraping the previous project and redoing it all over... i wanted to make
> sure we do it right this time round.
>
> k... Using bigint identities is soooooo convenient (instant primary key)
> that i am worried we're overdoing it. Almost all of our tables have it as
> primary key... is this bad?
> (Its ok as long as I am indexing important searchable columns ya?)
>
> Thanks again.
>
> Limsy the worried noobie... obviously.

Naively adding IDENTITY columns as an "instant primary key" has the
potential to cause a lot of trouble for you. Convenient they are NOT.
It's far more important that you normalize your data model first. Do
you have alternate keys on every table for which IDENTITY is the
surrogate key? If not, then you already have problems.

Do you need to support more than 4 billion rows in these tables? If
not, then you could save 4 bytes per row by using INT instead of
BIGINT.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/librar...(en-US,SQL.90).aspx
--

Lim Siew Yin

3/16/2007 7:08:00 AM

> Naively adding IDENTITY columns as an "instant primary key" has the
> potential to cause a lot of trouble for you. Convenient they are NOT.
> It's far more important that you normalize your data model first. Do
Don't worry... i already have the data model ready. What i was struggling
was whether to use alphanumeric for more easy-to-understand data or bigint
identity for my primary keys.
(eg. CustomerID .... bigint identity or alphanumeric? Bigint identity is
convenient to implement whereas I can make more meaningful coding by using
alphanumeric ... 100102 vs AKL11012 ... something along the line)
I usually use the alphanumeric method.. which ended up me needing a table to
store the running numbers and a stored proc like gimmeNewCustomerID
So basically, before i take the plunge, i wanted to know what is the norm
out there and if there are pitfalls ... what are they.

Err.. lastly, i don't really understand the danger you tried to describe
(sorry.. i am really really bad with technical terms...)
Can you explain further?
> you have alternate keys on every table for which IDENTITY is the
> surrogate key? If not, then you already have problems.
Alternate keys as in Secondary keys? or just key fields? (fields to be
indexed but are not indicated as keys or constraints)
Surrogate key? as in Secondary keys? Foreign keys?

> Do you need to support more than 4 billion rows in these tables? If
> not, then you could save 4 bytes per row by using INT instead of
> BIGINT.
No... we're small, i don't think we will be that big but i have no worries
on disk space. Just wanna make sure i have a decent design with good
performance.
(I avoided INT whenever possible as i ran into problems previously when the
values exceeded INT's limit.. of course that was in the OLD days when INT is
small but still... call it phobia)
[INT for me is only reserved for things like AGE *grin*, QUANTITY and such]

Thanks for your time and feedback.

Limsy the still trying to figure out Alternate and Surrogate keys one....
*scratches head*

Tibor Karaszi

3/16/2007 7:15:00 AM

> Err.. lastly, i don't really understand the danger you tried to describe (sorry.. i am really
> really bad with technical terms...)

I have a feeling that what David was aiming at is if you don't protect your natural keys. Say that
you stores customers and you need a way to identify a customer, and you don't want the same customer
to be stored twice. If you only slap on an identity or home grown key (surrogate key), then you have
nothing that hinders you from having several rows for the same customer.

--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/d...
http://sqlblog.com/blogs/tib...

"Lim Siew Yin" <limsyn.nospam@gmail.com> wrote in message
news:%23J44Xn5ZHHA.4008@TK2MSFTNGP05.phx.gbl...
>
>> Naively adding IDENTITY columns as an "instant primary key" has the
>> potential to cause a lot of trouble for you. Convenient they are NOT.
>> It's far more important that you normalize your data model first. Do
> Don't worry... i already have the data model ready. What i was struggling was whether to use
> alphanumeric for more easy-to-understand data or bigint identity for my primary keys.
> (eg. CustomerID .... bigint identity or alphanumeric? Bigint identity is convenient to implement
> whereas I can make more meaningful coding by using alphanumeric ... 100102 vs AKL11012 ...
> something along the line)
> I usually use the alphanumeric method.. which ended up me needing a table to store the running
> numbers and a stored proc like gimmeNewCustomerID
> So basically, before i take the plunge, i wanted to know what is the norm out there and if there
> are pitfalls ... what are they.
>
> Err.. lastly, i don't really understand the danger you tried to describe (sorry.. i am really
> really bad with technical terms...)
> Can you explain further?
>> you have alternate keys on every table for which IDENTITY is the
>> surrogate key? If not, then you already have problems.
> Alternate keys as in Secondary keys? or just key fields? (fields to be indexed but are not
> indicated as keys or constraints)
> Surrogate key? as in Secondary keys? Foreign keys?
>
>> Do you need to support more than 4 billion rows in these tables? If
>> not, then you could save 4 bytes per row by using INT instead of
>> BIGINT.
> No... we're small, i don't think we will be that big but i have no worries on disk space. Just
> wanna make sure i have a decent design with good performance.
> (I avoided INT whenever possible as i ran into problems previously when the values exceeded INT's
> limit.. of course that was in the OLD days when INT is small but still... call it phobia)
> [INT for me is only reserved for things like AGE *grin*, QUANTITY and such]
>
> Thanks for your time and feedback.
>
>
> Limsy the still trying to figure out Alternate and Surrogate keys one.... *scratches head*
>
>
>

Razvan Socol

3/16/2007 7:24:00 AM

On Mar 16, 9:07 am, "Lim Siew Yin" <limsyn.nos...@gmail.com> wrote:
> > Naively adding IDENTITY columns as an "instant primary key" has the
> > potential to cause a lot of trouble for you. Convenient they are NOT.
> > It's far more important that you normalize your data model first. Do
>
> Don't worry... i already have the data model ready. What i was struggling
> was whether to use alphanumeric for more easy-to-understand data or bigint
> identity for my primary keys.
> (eg. CustomerID .... bigint identity or alphanumeric? Bigint identity is
> convenient to implement whereas I can make more meaningful coding by using
> alphanumeric ... 100102 vs AKL11012 ... something along the line)
> I usually use the alphanumeric method.. which ended up me needing a table to
> store the running numbers and a stored proc like gimmeNewCustomerID
> So basically, before i take the plunge, i wanted to know what is the norm
> out there and if there are pitfalls ... what are they.
>
> Err.. lastly, i don't really understand the danger you tried to describe
> (sorry.. i am really really bad with technical terms...)
> Can you explain further?> you have alternate keys on every table for which IDENTITY is the
> > surrogate key? If not, then you already have problems.
>
> Alternate keys as in Secondary keys? or just key fields? (fields to be
> indexed but are not indicated as keys or constraints)
> Surrogate key? as in Secondary keys? Foreign keys?
>
> [...]

Considering your above example, before worrying what data type you
should use for the CustomerID column (which is a surrogate key,
because it's generated by the database), you should think if you
already have a unique key in the Customers table (like CustomerName or
EmailAddress) which is entered by the user. If you don't, then the
user may enter the "Foobar Company Inc" customer twice (and the system
will generate the CustomerID-s FBR001 and FBR002, for example, without
complaining that the user is entering duplicate data). Unwanted
duplicates in a table can be a very serious problem, as they can
denaturate the results of a lot of reports (besides allowing
contradictory values for other columns).

Razvan

Lim Siew Yin

3/16/2007 7:56:00 AM

Ahh... thanks for the reply. That cleared up my confusion. :)
Well.. most of my tables (other than customers) have proper unique key.
As for customers, that is the problem. I dont have any unique identifier
for them. The system proceeds to generate a unique identifier for them
(customer code) and all further references of that customer is based on
that. Controls are in placed in Customer Registration whereby the admin are
responsible to check the existance of the customer before creating them in
the database.
This is a limitation that i have to live with for now. Our business model
allows us to function without having full detailed data of our customers
(dont ask me.. not my decision), this means.. i cant just create a mandatory
unique identifier because all my existing customer data do not have it.
(and ya.. our customer need not have valid business reg no and they have the
right not to give us their personal id number as well... so....)
But i believe eventually, i can implement some proper identifier .. but it
will need time.
Anyone having the problem i have? How do you overcome this problem?

Thanks again guys.

So to summarize, using Bigint identities is ok, just need to make sure my
database is properly design and have unique keys.

unknown

3/16/2007 9:11:00 AM

"David Portas" <REMOVE_BEFORE_REPLYING_dportas@acm.org> wrote in message
news:1174018593.340868.194620@l77g2000hsb.googlegroups.com...
> Naively adding IDENTITY columns as an "instant primary key" has the
> potential to cause a lot of trouble for you.

Hi David,

What are these troubles? I've been using IDENTITY columns for 10 years now
and still don't see what all the fuss is about. The only problem i've had is
that you need to get the new ID value back but that is well and truly
managable and I would call it just a minor disadvantage.

On the other hand I've encounter many real life problems with natural
primary keys.

> Convenient they are NOT.

I don't know. I reckon they are about as convenient as it gets. Having a
consistent pk that you can use throughout your app is very convenient.

> It's far more important that you normalize your data model first. Do
> you have alternate keys on every table for which IDENTITY is the
> surrogate key? If not, then you already have problems.

This is the most common issue raised about int pks (as can be seen by the
fact that 2 other people in this thread raised the same issue). But to me
this a completely seperate and totally irrelevant issue. To say int pks are
bad because the rest of the model has a fault is ludicrous.

> Do you need to support more than 4 billion rows in these tables? If
> not, then you could save 4 bytes per row by using INT instead of
> BIGINT.

I was going to suggest the same thing.

Michael

Someone Else

3/16/2007 9:33:00 AM

It is more than just 4 bytes per row, the reason being that I would
assume you've made keys like CustomerID your clustered index, as you
are likely to be joining on that a lot, and would want to reduce
bookmark lookups. Your clustered key is stored in every single non-
clustered index that you have on that table (so that it can perform
those bookmark lookups that we love so much). So it is in fact
4x(numIndexes) bytes per row.

On 16 Mar, 07:07, "Lim Siew Yin" <limsyn.nos...@gmail.com> wrote:
> > Naively adding IDENTITY columns as an "instant primary key" has the
> > potential to cause a lot of trouble for you. Convenient they are NOT.
> > It's far more important that you normalize your data model first. Do
>
> Don't worry... i already have the data model ready. What i was struggling
> was whether to use alphanumeric for more easy-to-understand data or bigint
> identity for my primary keys.
> (eg. CustomerID .... bigint identity or alphanumeric? Bigint identity is
> convenient to implement whereas I can make more meaningful coding by using
> alphanumeric ... 100102 vs AKL11012 ... something along the line)
> I usually use the alphanumeric method.. which ended up me needing a table to
> store the running numbers and a stored proc like gimmeNewCustomerID
> So basically, before i take the plunge, i wanted to know what is the norm
> out there and if there are pitfalls ... what are they.
>
> Err.. lastly, i don't really understand the danger you tried to describe
> (sorry.. i am really really bad with technical terms...)
> Can you explain further?> you have alternate keys on every table for which IDENTITY is the
> > surrogate key? If not, then you already have problems.
>
> Alternate keys as in Secondary keys? or just key fields? (fields to be
> indexed but are not indicated as keys or constraints)
> Surrogate key? as in Secondary keys? Foreign keys?
>
> > Do you need to support more than 4 billion rows in these tables? If
> > not, then you could save 4 bytes per row by using INT instead of
> > BIGINT.
>
> No... we're small, i don't think we will be that big but i have no worries
> on disk space. Just wanna make sure i have a decent design with good
> performance.
> (I avoided INT whenever possible as i ran into problems previously when the
> values exceeded INT's limit.. of course that was in the OLD days when INT is
> small but still... call it phobia)
> [INT for me is only reserved for things like AGE *grin*, QUANTITY and such]
>
> Thanks for your time and feedback.
>
> Limsy the still trying to figure out Alternate and Surrogate keys one....
> *scratches head*

David Portas

3/16/2007 4:34:00 PM

On 16 Mar, 09:11, "Michael C" <nos...@nospam.com> wrote:
>
> > It's far more important that you normalize your data model first. Do
> > you have alternate keys on every table for which IDENTITY is the
> > surrogate key? If not, then you already have problems.
>
> This is the most common issue raised about int pks (as can be seen by the
> fact that 2 other people in this thread raised the same issue). But to me
> this a completely seperate and totally irrelevant issue. To say int pks are
> bad because the rest of the model has a fault is ludicrous.
>

I didn't say INTEGER primary keys were bad. I said that "naively"
adding IDENTITY columns as a way of creating primary keys will cause
problems. What I meant by not doing it "naively" was what I explained
further on: you need a normalized data model first - i.e. you *already
have* the primary keys. You then add surrogate keys. I didn't consider
this an irrelevant and separate issue because the OP said "instant
primary key", which I (wrongly) took to mean that his tables didn't
have other keys.

IDENTITY columns undeniably have big disadvantages: they can't be
updated; they can't easily be assigned at runtime; they are not table-
independent; the IDENTITY property can't ever be dropped or added;
they can't easily be merged; you can't necessarily retrieve the
inserted values at runtime without requerying the table. These
limitations cause big problems during replication, data integration or
data deployment scenarios. Other DBMSs do surrogate keys much better.
The decision to add an IDENTITY column therefore should not be taken
without forethought and without good reason.

--
David Portas, SQL Server MVP

Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.

SQL Server Books Online:
http://msdn2.microsoft.com/librar...(en-US,SQL.90).aspx
--

microsoft.public.sqlserver.programming

Advice needed: Bigint (identities) - Am i overdoing it?

Lim Siew Yin

unknown

David Portas

Lim Siew Yin

Tibor Karaszi

Razvan Socol

Lim Siew Yin

unknown

Someone Else

David Portas

x Login to ForumsZone