Thursday, June 7, 2012

Difference between Unique Key and Primary Key in Database that are most important and most infrequently discussed

I am going to discuss some of the most important differences between Unique Key and Primary key in databases. Typically this is a very common topic but the most misleading as well. I am not going to write down the already known differences but will discuss the things that one should understand before deciding the primary key and unique key in database design.
Fig 1

By definition , Primary key and Unique key technically means the same thing that is , a column or group of columns that can identify a uniqueness in a row. The role of both Unique key and primary key is the same in  this term (so technically they are same). The most common mistake that people do during database design is to create a primary key as ID int not null column and think that it is a unique identifier for the row. Although this is a correct statement but this primary key (Id column) is just a number added as an extra attribute to the table and technically does not belong to the definition of the data. Let me give you an example(See Fig 1):
The Fig 1 shows a table with Id column as PK . Now, this table consists of the Employee FName and LName. By looking at an example you can see that 1,John,Neville ; 2,John,Neville and so on is just a same thing. Adding Id column really did not make the row unique. So, Just by creating an Id column does not mean that you have  found a way to uniquely identify your row(Does not solve the problem). Basically what we are looking to do is to create a key that belongs to a data that can truly identify uniqueness.
A better way of designing a same table would be to add some more attributes to the table to make the row unique that also conforms to the meaning of the data. May be a add an SSN or something.
So the takeaway from this discussion is that always think about the unique key on the table in terms of the data related to the table and not by adding a value such as Id.

Creating a unique key that is data relevant helps the query statistics to remain consistent and helps the query optimizer to come with an optimal query execution plan that results in faster processing.

Another misconception that people have about Unique and Primary key is that Primary key always creates a Clustered Index and Unique key always creates a Non Clustered index. However this statement is true in a way that when you use a SQL server management studio and use a User Interface to create keys , it by default create clustered index on Primary key and Non Clustered index on Unique key  but it is not mandatory to do that. I would suggest always use scripts to create your keys.
So, it is not mandatory to create clustered index on primary or non clustered on unique key . You can create a clustered index on unique key and  non clustered on primary key.

[Id] ASC
This is a perfectly valid statement in SQL. So don't be confused with this indexing for keys.

So to sum up, Following are the things that needs to be keep in mind before thinking about UK and PK
1) How can I make this row unique without introducing the fake Id column. 
2) Technically Primary key and Unique key is the same thing
3) You can typically save the extra joins if there are some look up tables by figuring out the unique values. I will show this thing in my later blogs. 
4) Sql server by default does not allow nulls in PK but you can put nulls in a UK column. 

I hope you have enjoyed reading this blog. Please post on the comments if you have any thoughts to share.