Sunday, July 13, 2008

Database object naming conventions - SQLSERVER

Database object naming conventions

There is no such hard and fast rule to define database names, but I prefer to use some naming conventions which will polish your task(s):

I always usee the following(s):-

1. Tables
Tables represent the instances of an entity. For example, you store all your customer information in a table. Here, 'customer' is an entity and all the rows in the customers table represent the instances of the entity 'customer'. So, why not name your table using the entity it represents, 'Customer'. Since the table is storing 'multiple instances' of customers, make your table name a plural word.

Rules: Pascal notation; end with an ‘s’
Examples: Employees, Customers, Headlines, Groups etc.

This is a more natural way of naming tables, when compared to approaches which name tables as tblCustomers, tbl_Orders. Further, when you look at your queries it's very obvious that a particular name refers to a table, as table names are always preceded by FROM clause of the SELECT statement.

If your database deals with different logical functions and you want to group your tables according to the logical group they belong to, it won't hurt prefixing your table name with a two or three character prefix that can identify the group.

2. Views
A view is nothing but a table, for any application that is accessing it. So, the same naming convention defined above for tables, applies to views as well, but not always. Here are some exceptions:

a) Views not always represent a single entity. A view can be a combination of two tables based on a join condition, thus, effectively representing two entities. In this case, consider combining the names of both the base tables. Here's an example:

If there is a view combining two tables 'Customers' and 'Addresses', name the view as 'CustomersAddresses'. Same naming convention can be used with junction tables that are used to link two many-to-many related base tables. Most popular example is the 'TitleAuthor' table from 'Pubs' database of SQL Server.

b) Views can summarize data from existing base tables in the form of reports. You can see this type of views in the 'Northwind' database that ships with SQL Server 7.0 and above. Here's the convention that database follows. (I prefer this):

'Product Sales for 1997'
'Summary of Sales by Quarter'
'Summary of Sales by Year'

However, try to stay away from spaces within object names.

3. Stored Procedure
Stored procedures always do some work for you, they are action oriented. So, let their name describe the work they do. So, use a verb to describe the work.

This is how I would name a stored procedure that fetches me the customer details given the customer identification number:
'GetCustomerDetails'. Similarly, you could name a procedure that inserts a new customer information as 'InsertCustomerInfo'. Here are some more names based on the same convention: 'WriteAuditRecord', 'ArchiveTransactions', 'AuthorizeUser' etc.

As explained above in the case of tables, you could use a prefix, to group stored procedures also, depending upon the logical group they belong to. For example, all stored procedures that deal with 'Order processing' could be prefixed with ORD_ as shown below:

ORD_InsertOrder
ORD_InsertOrderDetails
ORD_ValidateOrder

If you are using Microsoft SQL Server, never prefix your stored procedures with 'sp_', unless you are storing the procedure in the master database. If you call a stored procedure prefixed with sp_, SQL Server always looks for this procedure in the master database. Only after checking in the master database (if not found) it searches the current database.

I do not agree with the approach of prefixing stored procedures with prefixes like 'sproc_' just to make it obvious that the object is a stored procedure. Any database developer/DBA can identify stored procedures as the procedures are always preceded by EXEC or EXECUTE keyword

Rules: sp_[_]<table/logical instance>
Examples: spOrders_GetNewOrders, spProducts_UpdateProduct

4. User Defined Functions
In Microsoft SQL Server 2000, User Defined Functions [UDFs] are almost similar to stored procedures, except for the fact that UDFs can be used in SELECT statements. Otherwise, both stored procedures and UDFs are similar. So, the naming conventions discussed above for stored procedures, apply to UDFs as well. You could even use a prefix to logically group your UDFs. For example, you could name all your string manipulation UDFs as shown below:

str_MakeProperCase
str_ParseString

5. Triggers
Though triggers are a special kind of stored procedures, it won't make sense to follow the same naming convention as we do for stored procedures.

While naming triggers we have to extend the stored procedure naming convention in two ways:

a) Triggers always depend on a base table and can't exist on their own. So, it's better to link the base table's name with the trigger name

b) Triggers are associated with one or more of the following operations: Insert, Update, Delete. So, the name of the trigger should reflect it's nature

Rules: TR_<TableName>_
Examples: TR_Orders_UpdateProducts
Notes: The use of triggers is discouraged

6. Indexes
Just like triggers, indexes also can't exist on their own and they are dependent on the underlying base tables. So, again it makes sense to include the 'name of the table' and 'column on which it's built' in the index name. Further, indexes can be of two types, clustered and nonclustered. These two types of indexes could be either unique or non-unique. So, the naming convention should take care of the index types too.
Rules: IX_<TableName>_
Examples: IX_Products_ProductID

7. Columns
Columns are attributes of an entity, that is, columns describe the properties of an entity. So, let the column names be meaningful and natural.

Here's a simplest way of naming the columns of the Customers table:

CustomerID
CustomerFirstName
CustomerAddress

As shown above, it'll be a good idea to prefix the column names with the entity that they are representing.

Here's another idea. Decide on a standard two to four character code for each table in your database and make sure it's unique in the database. For example 'Cust' for Customers table, 'Ord' for Orders tables, 'OrdD' for OrderDetails table, 'Adt' for Audit tables etc. Use this table code to prefix all the column names in that table. Advantage of this convention is that in multi-table queries involving complex joins, you don't have to worry about ambiguous column names, and don't have to use table aliases to prefix the columns. It also makes your queries more readable.

If you have to name the columns in a junction/mapping table, concatenate the table codes of mapped tables, or come up with a new code for that combination of tables.

So, here's how the CustomerID column would appear in Customers table:

Cust_CustomerID

The same CustomerID column appears in the Orders table too, but in Orders table, here's how it's named:

Ord_CustomerID

Some naming conventions even go to the extent of prefixing the column name with it's data type. But I don't like this approach, as I feel, the DBA or the developer dealing with these columns should be familiar with the data types these columns belong to.

If a column references another table’s column, name it <table name>ID
Example: The Customers table has an ID column
The Orders table should have a CustomerID column

8. Uer Defined DataTypes
User defined data types are just a wrapper around the base types provided by the database management system. They are used to maintain consistency of data types across different tables for the same attribute. For example, if the CustomerID column appears half a dozen tables, you must use the same data type for all the occurrences of the CustomerID column. This is where user defined data types come in handy. Just create a user defined data type for CustomerID and use it as the data type for all the occurrences of CustomerID column.

So, the simplest way of naming these user defined data types would be:
Column_Name + '_type'.
So, I would name the CustoerID type as:

CustomerID_type

9. Primary Keys
Primary key is the column(s) that can uniquely identify each row in a table. So, just use the column name prefixed with 'pk_' + 'Table name' for naming primary keys.

Rules: PK_<TableName>
Examples: PK_Products

10. Foreign Keys
Foreign key are used to represent the relationships between tables which are related. So, a foreign key can be considered as a link between the 'column of a referencing table' and the 'primary key column of the referenced table'.

I prefer the following naming convention for foreign keys:

fk_referencing table + referencing column_referenced table + referenced column.

Based on the above convention, I would name the foreign key which references the CustomerID column of the Customers table from the Order's tables CustomerID column as:

fk_OrdersCustomerID_CustomersCustomerID

Foreign key can be composite too, in that case, consider concatenating the column names of referencing and referenced tables while naming the foreign key. This might make the name of the foreign key lengthy, but you shouldn't be worried about it, as you will never reference this name from your code, except while creating/dropping these constraints.

Rules: FK_<TableName1>_<TableName2>
Example: FK_Products_Orderss

11. Defaults and Check Constrains
Use the column name to which these defaults/check constraints are bound to and prefix it with 'def' and 'chk' prefixes respectively for Default and Check constraints.
I would name the default constraint for OrderDate Column as def_OrderDate and the check constraint for OrderDate column as chk_OrderDate.

Rules: DF_<TableName>_
Example: DF_Products_Quantity

12. Variable
For variables that store the contents of columns, you could use the same naming convention that we used for Column names.

13. General Rules
a)Do not use spaces in the name of database objects.
b)Do not use SQL keywords as the name of database objects. In cases where this is necessary, surround the object name with brackets, such as [Year]
c)Do not prefix stored procedures with ‘sp_’ Prefix table names with the owner name.

13. Strucure
a) Each table must have a primary key
o In most cases it should be an IDENTITY column named ID
b) Normalize data to third normal form
o Do not compromise on performance to reach third normal form. Sometimes, a little denormalization results in better performance.
c) Do not use TEXT as a data type; use the maximum allowed characters of VARCHAR instead
d) In VARCHAR data columns, do not default to NULL; use an empty string instead
e) Columns with default values should not allow NULLs
f) As much as possible, create stored procedures on the same database as the main tables they will be accessing.

14. Formatting
· Use upper case for all SQL keywords
o SELECT, INSERT, UPDATE, WHERE, AND, OR, LIKE, etc.
· Indent code to improve readability
· Comment code blocks that are not easily understandable
o Use single-line comment markers(--)
o Reserve multi-line comments (/*.. ..*/) for blocking out sections of code
· Use single quote characters to delimit strings.
o Nest single quotes to express a single quote or apostrophe within a string
 For example, SET @sExample = 'SQL''s Authority'
· Use parentheses to increase readability
o WHERE (color=’red’ AND (size = 1 OR size = 2))
· Use BEGIN..END blocks only when multiple statements are present within a conditional code segment.
· Use one blank line to separate code sections.
· Use spaces so that expressions read like sentences.
o fillfactor = 25, not fillfactor=25
· Format JOIN operations using indents
o Also, use ANSI Joins instead of old style joins4
· Place SET statements before any executing code in the procedure.


15. Coding
· Optimize queries using the tools provided by SQL Server5
· Do not use SELECT *
· Return multiple result sets from one stored procedure to avoid trips from the application server to SQL server
· Avoid unnecessary use of temporary tables
o Use 'Derived tables' or CTE (Common Table Expressions) wherever possible, as they
perform better.
· Avoid using <> as a comparison operator
o Use ID IN(1,3,4,5) instead of ID <> 2
· Use SET NOCOUNT ON at the beginning of stored procedures7
· Do not use cursors or application loops to do inserts8
o Instead, use INSERT INTO
· Fully qualify tables and column names in JOINs
· Fully qualify all stored procedure and table references in stored procedures.
· Do not define default values for parameters.
o If a default is needed, the front end will supply the value.
· Do not use the RECOMPILE option for stored procedures.
· Place all DECLARE statements before any other code in the procedure.
· Do not use column numbers in the ORDER BY clause.
· Do not use GOTO.
· Check the global variable @@ERROR immediately after executing a data manipulation statement (like INSERT/UPDATE/DELETE), so that you can rollback the transaction if an error occurs
o Or use TRY/CATCH
· Do basic validations in the front-end itself during data entry
· Off-load tasks, like string manipulations, concatenations, row numbering, case conversions, type conversions etc., to the front-end applications if these operations are going to consume more CPU cycles on the database server
· Always use a column list in your INSERT statements.
o This helps avoid problems when the table structure changes (like adding or dropping a column).
· Minimize the use of NULLs, as they often confuse front-end applications, unless the applications are coded intelligently to eliminate NULLs or convert the NULLs into some other form.
o Any expression that deals with NULL results in a NULL output.
o The ISNULL and COALESCE functions are helpful in dealing with NULL values.
· Do not use the identitycol or rowguidcol.
· Avoid the use of cross joins, if possible.
· When executing an UPDATE or DELETE statement, use the primary key in the WHERE condition, if possible. This reduces error possibilities.
· Avoid using TEXT or NTEXT datatypes for storing large textual data.9
o Use the maximum allowed characters of VARCHAR instead
· Avoid dynamic SQL statements as much as possible.10
· Access tables in the same order in your stored procedures and triggers consistently.
· Do not call functions repeatedly within your stored procedures, triggers, functions and batches.
· Default constraints must be defined at the column level.
· Avoid wild-card characters at the beginning of a word while searching using the LIKE keyword, as these results in an index scan, which defeats the purpose of an index.
· Define all constraints, other than defaults, at the table level.
· When a result set is not needed, use syntax that does not return a result set.
· Avoid rules, database level defaults that must be bound or user-defined data types. While these are legitimate database constructs, opt for constraints and column defaults to hold the database consistent for development and conversion coding.
· Constraints that apply to more than one column must be defined at the table level.
· Use the CHAR data type for a column only when the column is non-nullable.14
· Do not use white space in identifiers.
· The RETURN statement is meant for returning the execution status only, but not data.

7 comments:

  1. hi gaurav a very good article the same i have recommended to my team kee it up.

    this article will light our task and tell the way of writting to steamline the coding convensions

    ReplyDelete
  2. a good thought came into mind when read your blog i appreciate your way of writting there are many helpful posts in this blog please post some important questions on the blog thanks

    ReplyDelete
  3. Gaurav,

    You have copied lots of stuff from my website http://blog.sqlauthority.com .
    I suggest you do not copy my stuff as all of them are copyrighted.

    Additionally, if you like my article, you can post link to my article but you can not just copy paste them.

    This is polite reminder.

    Kind Regards,
    Pinal Dave

    ReplyDelete
  4. Thanks Gaurav

    This is really nice stuff.you sumrized all these big topics.

    please add some advaced topic with them.

    My present sql querry is how do we find total no of rows in a table witout usig select statement.

    ReplyDelete
  5. Thanks pinal for your co-operation I have sent you a mail regarding and I will place the links as you suggested.

    ReplyDelete
  6. Hi Gaurav,
    Its very good article, if followed, will lead the peer(s) to understand actual meaning of the object(s), their scope and easy maintenance of the same.

    Could you please suggest me solution to following requirement. Requirement is "To display set of records in different order each time page refreshes". Currently I am trying to solve it with the help of random function, but no luck.

    Thanks in Advance.
    Ram Nath Rao

    ReplyDelete
  7. Thanks Ram!

    Its my pleasure to describe your request. I have published a post for the same.

    Thanks!

    ReplyDelete