I’ve always found that when defining a database table it is always best to create a integer primary key, even if a unique key such as social security number, ISBN, or some other business value. Recently I had to go through the unpleasant process of updating a database table and associated scripts, resources, and code that had used a business attribute as the primary key but because of business requirements it was no longer unique and had grown to have different meanings. Because your business requirements will change, don’t use business attributes as primary keys for you database design. In addition, don’t overload attributes to more than one meaning. For example, the database that I was working with had a database column called Sequence that functioned as the primary key, the line number, and a workflow execution order to process the data. It is a source of confusion and bugs to overload one attribute to so many different meanings.
Most applications require a database of some sorts, and when working with a database you most likely will have a production, development, and test database to work with in addition to backups. A common problem I found is that sometimes I want to move rows of data for a particular database table between the different databases I work with. For example, I found I need to copy a subset of data from a table from the production database to the test database so that I had good data to test with. A typical approach is to create a backup from the production database and restore it as a test database, but if you just want to copy a subset of rows from a particular table that approach is overkill.
In SQL Server, you can use a insert statement to insert new rows in a database where you select and filter the data from a different database. This is also true for other database vendors but the SQL might be slightly different.
To further illustrate this, imagine I have a database table blog_posts in a production database called BlogProduction and a test database is called BlogTest and they both reside on the same SQL Server instance. From SQL Server Management Studio or SQLCMD or from where ever you run your SQL statement you can run the following statement.
INSERT INTO blog_posts SELECT * FROM BlogProduction.dbo.blog_posts
If you run the above statement from the perspective of the BlogTest database, you are will insert all the records from the blog_posts table in the BlogProduction database into the blog_posts table in BlogTest. The above example, assumes the database owner is dbo. It is not required to name the database owner and the following does the same.
INSERT INTO blog_posts SELECT * FROM BlogProduction..blog_posts
You can also add a where clause to the select statement to filter out certain data to insert to your current database.
INSERT INTO blog_posts SELECT * FROM BlogProduction..blog_posts WHERE id > 100
In this way, you can move data from different database tables from different databases.
When working with a database, it’s is always handy to develop a series of scripts to backup, restore, create, move databases around. As of late, I’ve been working with SQL Server a lot so I’ve come to appreciate the sqlcmd utility.
You can run the sqlcmd utility from the command line and run a input sql script and capture the output in a file. The input sql script file can run any sql statement, such as creating a new databases, views, tables, or procedures. You can also restore and backup any given database or whatever else you require.
To run the sqlcmd utility and have it read a file, createTables.sql, you can run something like the following from your command line, of course if you have SQL Server Management Studio and sqlcmd installed.
sqlcmd -S localhost -d dbName -U user -P password -i createTables.sql -o output.out
There are many situations when you need to copy data from one database table to another one. You may want to migrate data from one database table to a new table, sometimes you need to do so to copy data to a backup table, other times to bulk load data from from different databases. To copy date between tables in a SQL Server database you can execute the following.
INSERT INTO Note (Id, Name) SELECT Id, Name FROM Notes WHERE name IS NOT NULL
The above state will copy date from a database table Notes to a new table Note where some condition is meet.
A typical database server instance hosts multiple databases, such as production and several test databases. You can copy data not only between tables within a database, but between databases within a database server instance. To copy between databases you need to use the fully qualified database name. For example, the following SQL snippet copies data from a tables called Notes in a database called TestDB.
INSERT INTO Note (Id, Name) SELECT Id, Name FROM TestDB.dbo.Notes WHERE name IS NOT NULL
When working with SQL Server, two procedures that I use from time to time is sp_who and sp_lock. The sp_who procedure provides information as to which loginame, hostname, dbname are for current users. Usually I am just interested the activity on a given user, which you can use the following SQL command.
sp_who @loginame = 'sa'
The sp_lock procedure reports on which sessions ids are locked in the database. There are different level of locks such as database, table, and data locks.
Every production application I have been involved with has made use of a database. As a programmer, you don’t have to be a DBA, but just as with software development you need to adhere to a core set of database best practices. Below are a few of the best practices I try to follow.
- Use a database modeling tool that is vendor neutral.
- Decide upon a database naming convention, standardize it across your organization and be consistent in following through with it. Always be consistent with the use of case and plural vs singular.
- Have a similar naming between your table columns and object properties.
- Try not to use system tables directly as they are not SQL standard and may change between releases.
- Do not use SELECT * in your queries, explicitly write the required columns names in the select statement.
- Always use a column list in your insert statements. This helps in avoiding problems when the table structure changes, like adding or removing columns.
- Try to avoid wildcard characters at the beginning of the word while searching using the LIKE keyword.
- Use the more readable ANSI standard join clause instead of he told style join.
- If you have a choice, do not store binary files in the database. Instead store the path to the binary file in the database and store the file in the disk, preferably another disk drive.
- Offload tasks like string manipulations, concatenations, row numbering, case conversions, type conversions, etc to the front-end application.
- Store computed string manipulations, concatenation, case conversion, type conversions, etc in the database so you don’t have to waste CPU cycles recalculating.
- Avoid leaking your database design details, such as primary keys.
- Just like your code, be sure to comment your database design.
- Use the bulk/multi insert statement.
- Study the SQL execution plan prior to a major release.
Again, as a software developer you don’t have to be a database tuning expert but you also can’t be excused for not following the above best practices.