Showing posts with label T-Sql. Show all posts

SQL and MSBI Practice Questions

Hey folks! A few days back I attended an interview. I got a few interesting questions, and I am sharing the ones I remember:

Q1 : The interviewer wanted the count of each character in a table. {Let's say there is a table with one VARCHAR column; you need to find the count of each character across its values.}

Q2 : There is a txt file. It has only one column containing some strings and some numbers. You need to send the strings to one table and the integers to another table using SSIS.

Q3 : There are two tables, both having one column named id. What are the minimum and maximum numbers of rows an inner join can return if the first table has 2 rows and the second table has 3 rows? {The data could be anything.}

Q4 : Write a query to find the name of the bank or banks whose revenue never dropped from the previous year. {The data is shown below.}

BankName   Year   Revenue(Cr)
ICICI      2013   515
ICICI      2014   602
BOI        2015   477
BOI        2016   564
Axis       2013   497
Hdfc       2013   503
Hdfc       2014   580
Axis       2016   666
BOI        2013   369
BOI        2014   366
ICICI      2015   715
ICICI      2016   780
Hdfc       2015   740
Hdfc       2016   760
Axis       2014   560
Axis       2015   480

---------------------------------------------------------------------
Ans1 :
CREATE TABLE #t (val NVARCHAR(10))

INSERT INTO #t
VALUES ('vimal')
 ,('kamal')

SELECT *
FROM #t;

WITH cte
AS (
 SELECT Val
  ,len(val) AS lnt
  ,val RemainingStr
  ,1 no
 FROM #t
 
 UNION ALL
 
 SELECT Cast(Right(RemainingStr, 1) AS NVARCHAR(10))
  ,lnt - 1
  ,SUBSTRING(RemainingStr, 1, len(RemainingStr) - 1)
  ,no + 1
 FROM cte
 WHERE lnt > 0
 )
 ,ct
AS (
 SELECT Val
 FROM cte
 WHERE no > 1
 )
SELECT Val
 ,COUNT(Val) Occurrence
FROM CT
GROUP BY Val

DROP TABLE #t



Ans2 :
In SSIS, read the txt file with a Flat File Source, then route the rows with a Conditional Split (or a Script Component that tests whether the value is numeric), sending numeric rows to one destination table and string rows to the other.



Ans 3 : 
Minimum : 0 rows [no join match]
Maximum : 6 rows [every value matches, e.g. all ids are 1, giving 2 x 3 = 6]
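A quick sketch to verify the maximum case, using table variables (the names are illustrative):

```sql
-- Both tables hold the same repeated id, so every row of the first
-- table matches every row of the second: 2 x 3 = 6 result rows.
DECLARE @t1 TABLE (id INT);
DECLARE @t2 TABLE (id INT);

INSERT INTO @t1 VALUES (1), (1);
INSERT INTO @t2 VALUES (1), (1), (1);

SELECT COUNT(*) AS MatchedRows   -- returns 6
FROM @t1 a
INNER JOIN @t2 b ON a.id = b.id;
```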

Ans 4 : 
CREATE TABLE Bank (
 BankName VARCHAR(100)
 ,Year INT
 ,Revenue MONEY
 )

INSERT INTO Bank
VALUES ('ICICI', 2013, 515)
 ,('ICICI', 2014, 602)
 ,('BOI', 2015, 477)
 ,('BOI', 2016, 564)
 ,('Axis', 2013, 497)
 ,('Hdfc', 2013, 503)
 ,('Hdfc', 2014, 580)
 ,('Axis', 2016, 666)
 ,('BOI', 2013, 369)
 ,('BOI', 2014, 366)
 ,('ICICI', 2015, 715)
 ,('ICICI', 2016, 780)
 ,('Hdfc', 2015, 740)
 ,('Hdfc', 2016, 760)
 ,('Axis', 2014, 560)
 ,('Axis', 2015, 480)

---------------------------
SELECT *
FROM Bank
ORDER BY BankName
 ,Year ASC;

WITH BankCTE
AS (
 SELECT *
  ,LEAD(Revenue) OVER (
   PARTITION BY BankName ORDER BY Year ASC
   ) NextYearRevenue
 FROM Bank
 )
 ,FindBank
AS (
 SELECT BankName
 FROM BankCTE
 WHERE NextYearRevenue < Revenue
  AND NextYearRevenue IS NOT NULL
 )
SELECT BankName
FROM Bank
GROUP BY BankName

EXCEPT

SELECT BankName
FROM FindBank

SCD Type 2 with the SQL Server MERGE statement or an SSIS package

Hi folks! A few days back I was working on SCD Type 2. SCD stands for Slowly Changing Dimension, and Type 2 is the variant used for managing history. I implemented it through an SSIS package, but I am also describing how to implement it with the SQL MERGE statement.
The code below is split into a few execution segments. I am explaining the order in which to execute these segments; this matters because some of the segment code is reused in the SSIS approach as well.

/*Segment1 */
Create table StagingEmployee(EmployeeCode int, EmployeeName varchar(100), Address varchar(100), 
 ChecksumData int Default -1);

Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1001,'Vijay','Noida');
Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1002,'Manoj','Gurugram');
Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1003,'Jai','Delhi');

Update StagingEmployee Set ChecksumData=BINARY_CHECKSUM(EmployeeCode,EmployeeName,Address);

Create table dimEmployee(Employeeid int identity(1,1),Employeecode int,EmployeeName varchar(100),
  Address varchar(100),CreatedDate datetime null Default Getdate(),ExpirationDate datetime null,
  CurrentRecord char(1) Default 'Y',DimChecksum int Default -1);

Select * from StagingEmployee
Select * from dimEmployee
/*Segment2 */
Insert into dimEmployee(EmployeeCode,EmployeeName,Address,CreatedDate,ExpirationDate,DimChecksum)
Select EmployeeCode,EmployeeName,Address,CreatedDate,ExpirationDate,DimChecksum
From
(
 Merge into dimEmployee as T
 Using StagingEmployee as S
  ON T.EmployeeCode=S.EmployeeCode
 When Matched and T.DimChecksum<>S.ChecksumData and T.CurrentRecord='Y'
 Then
  Update Set
  ExpirationDate=GetDate()-1,
  CurrentRecord= 'N'
 When Not Matched
 Then 
  Insert(EmployeeCode,EmployeeName,Address,DimChecksum)
  Values(S.EmployeeCode,S.EmployeeName,S.Address,S.ChecksumData)
 Output
  $action,S.EmployeeCode,S.EmployeeName,S.Address,GetDate(),Null,S.ChecksumData
) as Tab 
  (action,EmployeeCode,EmployeeName,Address,CreatedDate,ExpirationDate,DimChecksum)
where action='UPDATE';

Select * from StagingEmployee
Select * from dimEmployee
/*Segment3 */
Truncate table StagingEmployee;
Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1001,'Vijay','Pune');
Insert into StagingEmployee(EmployeeCode,EmployeeName,Address) Values(1005,'Varun','Mumbai');
Update StagingEmployee Set ChecksumData=BINARY_CHECKSUM(EmployeeCode,EmployeeName,Address);

/*Segment4 */ -- verify the results; cleanup statements are commented out below
Select * from StagingEmployee
Select * from dimEmployee
/*
Truncate table StagingEmployee;
Truncate table dimEmployee;
Drop table StagingEmployee;
Drop table dimEmployee;
*/

SQL SERVER :

Execute Segment1 ---> Segment2 ---> Segment3 ---> Segment2 ---> Segment4
Check the result after each segment.

Finally, run the commented cleanup statements in Segment4.


SSIS :

Execute Segment1 from above code.
Open BIDS or SSDT ---> open or create an SSIS project ---> create a new package
Create a connection manager, drag in a Data Flow Task, and rename it as below:


Click on SCD Type2 (the Data Flow Task) --> drag in a Source Assistant, connect it to the table StagingEmployee, and name it StagingEmployee
 

Now drag in a Slowly Changing Dimension transformation and connect it to StagingEmployee (the source).
Click on the Slowly Changing Dimension and a wizard will open; follow it as below:
 


 

Now it will look like this :

Now run the package and verify the result in SQL Server Management Studio.

 





If you check the result, the DimChecksum value is not correct, so we can either remove the column (we don't strictly need it) or correct it with some changes to the package.
You also need to correct some other components, as below:

Correction in the OLE DB Command, as below (in the SQL query section; no change in the mapping)

Correction in the Insert destination: map the ChecksumData column, then execute Segment4


Now execute Segment1 ---> run the package and see the results


Now execute Segment3 ---> run the package, check the results, and then execute Segment4



Interview Question : Adding each digit of a number in a table

CREATE TABLE addnum ( id INT identity(1, 1) ,
                             val INT )
GO

INSERT INTO addnum
SELECT ABS(CAST(NEWID() AS BINARY (6)) % 1000) + 1 randomNumber
GO 1000
SELECT *
FROM addnum
;

WITH cte AS
  (SELECT id,
          cast(val AS varchar(10)) val,
          len(val) leng
   FROM addnum),
     cte1 AS
  (SELECT substring(val,1,1) x,
          id,
          1 num
   FROM cte
   UNION ALL SELECT Isnull(substring(cte.val,num+1,1),'0'),
                    cte.id,
                    num+1
   FROM cte1
   JOIN cte ON cte1.id=cte.id
   AND cte1.num<cte.leng)
SELECT cte.val,
       cte.id,
       sum(cast(x AS int)) addition
FROM cte1
JOIN cte ON cte.id=cte1.id
GROUP BY cte.val,
         cte.id
ORDER BY id




Dynamic Restore Script

Hi Folks! A few days back I was working on a project and trying to automate a few things; I will share most of that work in upcoming posts. One of them is an automated restore process for databases having multiple files. It works with any number of files; you only need to provide the database name and the backup location.

This script is for SQL Server.

SET NOCOUNT ON

-- Provide the database name and the backup file location here
DECLARE @Dbname NVARCHAR(128) = N'YourDatabase'
DECLARE @backuppath NVARCHAR(260) = N'C:\Backup\YourDatabase.bak'
DECLARE @Query NVARCHAR(MAX)

DECLARE @restoreTable TABLE (
 [LogicalName] NVARCHAR(128)
 ,[PhysicalName] NVARCHAR(260)
 ,[Type] CHAR(1)
 ,[FileGroupName] NVARCHAR(128)
 ,[Size] NUMERIC(20, 0)
 ,[MaxSize] NUMERIC(20, 0)
 ,[FileID] BIGINT
 ,[CreateLSN] NUMERIC(25, 0)
 ,[DropLSN] NUMERIC(25, 0)
 ,[UniqueID] UNIQUEIDENTIFIER
 ,[ReadOnlyLSN] NUMERIC(25, 0)
 ,[ReadWriteLSN] NUMERIC(25, 0)
 ,[BackupSizeInBytes] BIGINT
 ,[SourceBlockSize] INT
 ,[FileGroupID] INT
 ,[LogGroupGUID] UNIQUEIDENTIFIER
 ,[DifferentialBaseLSN] NUMERIC(25, 0)
 ,[DifferentialBaseGUID] UNIQUEIDENTIFIER
 ,[IsReadOnly] BIT
 ,[IsPresent] BIT
 ,[TDEThumbprint] VARBINARY(32) -- remove this column if using SQL 2005
 )

INSERT INTO @restoreTable
EXEC ('RESTORE FILELISTONLY FROM DISK = ''' + @backuppath + '''')

--SELECT * FROM @restoreTable
SET @Query = 'RESTORE DATABASE [' + @Dbname + '] FROM  DISK = N''' + @backuppath + ''' 
WITH  FILE = 1,'

SELECT @Query = @Query + ' 
MOVE N''' + LogicalName + ''' TO N''' + PhysicalName + ''','
FROM @restoreTable

SET @Query = @Query + ' 
NOUNLOAD,  REPLACE,  STATS = 5'

PRINT @Query
--EXEC (@Query)

Splitting Data Query


DECLARE @STR VARCHAR(100) = 'Hi how are you doing? I am good.'
DECLARE @dev VARCHAR(1) = ' '

SELECT Split.a.value('.', 'VARCHAR(100)') AS E
FROM (
 SELECT Cast('<M> ' + Replace(Replace(@str, @dev, @dev + '</M><M>'), '.', '') + ' </M>' AS XML) AS yo
 ) a
CROSS APPLY yo.nodes('/M') AS SPLIT(a)
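On SQL Server 2016 and later, the built-in STRING_SPLIT function gives the same result without the XML trick (a sketch; note that STRING_SPLIT does not strip punctuation, so the period is still removed with REPLACE):

```sql
DECLARE @STR VARCHAR(100) = 'Hi how are you doing? I am good.'
DECLARE @dev VARCHAR(1) = ' '

-- One row per word; empty strings can appear for doubled delimiters
SELECT value AS E
FROM STRING_SPLIT(REPLACE(@STR, '.', ''), @dev)
```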

Encrypt And Decrypt Data Using Certificate In SQL Server

CREATE DATABASE Dbavimal
GO

USE Dbavimal
GO

--Create MasterKey
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'DBAVimal!';
GO

-- Create Certificate
CREATE CERTIFICATE [EncryptionCert]
 WITH SUBJECT = 'DBAEncryption'
GO

-- Symmetric Key
CREATE SYMMETRIC KEY SymmetricDBAVimalKey
 WITH ALGORITHM = AES_192 ENCRYPTION BY CERTIFICATE [EncryptionCert]
GO

--Use Symmetric Key
OPEN SYMMETRIC KEY SymmetricDBAVimalKey DECRYPTION BY CERTIFICATE [EncryptionCert]
GO

-----------------------------
CREATE FUNCTION [dbo].[fn_Encrypt] (@Data VARCHAR(max))
RETURNS VARBINARY(256)
AS
BEGIN
 DECLARE @Result VARBINARY(256)

 SET @Result = EncryptByKey(Key_GUID('SymmetricDBAVimalKey'), @Data)

 RETURN @Result
END
GO

-------------------------------
CREATE FUNCTION [dbo].[fn_Decrypt] (@Data VARBINARY(256))
RETURNS VARCHAR(max)
AS
BEGIN
 DECLARE @Result VARCHAR(max)

 SET @Result = DecryptByKey(@Data)

 RETURN @Result
END
GO

-----------------------------------
--Test the result (Same way you can store your data on tables)
DECLARE @Str VARCHAR(500) = 'Hello'

PRINT '====Original Data============'
PRINT @Str
PRINT '============================='

DECLARE @EncryptedData VARBINARY(256) -- ciphertext should be stored as VARBINARY, not VARCHAR
DECLARE @DecryptedData VARCHAR(256)

SET @EncryptedData = [dbo].[fn_Encrypt](@Str)

PRINT '====Encrypted Data==========='
PRINT @EncryptedData
PRINT '============================='

SET @DecryptedData = [dbo].[fn_Decrypt](@EncryptedData)

PRINT '====Data After Decryption===='
PRINT @DecryptedData
PRINT '============================='

CLOSE SYMMETRIC KEY SymmetricDBAVimalKey
GO

USE master
GO

DROP DATABASE Dbavimal
 
 
 

SQL Server T-Sql Performance Tuning Tips : Sargable


In a WHERE clause, the various operators used directly affect how fast a query is run. This is because some operators lend themselves to speed over other operators. Of course, you may not have any choice of which operator you use in your WHERE clauses, but sometimes you do.
Here are the key operators used in the WHERE clause, ordered by their performance. Those operators at the top will produce results faster than those listed at the bottom.
  • =
  • >, >=, <, <=
  • LIKE
  • <>
Likewise, the operands used in a WHERE clause can be ranked by performance, from fastest to slowest:
  • A single literal used by itself on one side of an operator
  • A single column name used by itself on one side of an operator; a single parameter used by itself on one side of an operator
  • A multi-operand expression on one side of an operator
  • A single exact number on one side of an operator
  • Other numeric types (other than exact), date and time
  • Character data, NULLs
The simpler the operand, and the more exact the number, the better the overall performance.
Try to avoid WHERE clauses that are non-sargable. The term "sargable" (which is in effect a made-up word) comes from the pseudo-acronym "SARG", which stands for "Search ARGument," which refers to a WHERE clause that compares a column to a constant value. If a WHERE clause is sargable, this means that it can take advantage of an index (assuming one is available) to speed completion of the query. If a WHERE clause is non-sargable, this means that the WHERE clause (or at least part of it) cannot take advantage of an index, instead performing a table/index scan, which may cause the query's performance to suffer.
Non-sargable search arguments in the WHERE clause, such as "IS NULL", "<>", "!=", "!>", "!<", "NOT", "NOT EXISTS", "NOT IN", "NOT LIKE", and "LIKE '%500'" generally (but not always) prevent the query optimizer from using an index to perform a search. In addition, expressions that include a function on a column, expressions that have the same column on both sides of the operator, or comparisons against another column (not a constant) are not sargable.
But not every WHERE clause that has a non-sargable expression in it is doomed to a table/index scan. If the WHERE clause includes both sargable and non-sargable clauses, then at least the sargable clauses can use an index (if one exists) to help access the data quickly.
In many cases, if there is a covering index on the table, which includes all of the columns in the SELECT, JOIN, and WHERE clauses in a query, then the covering index can be used instead of a table/index scan to return a query's data, even if it has a non-sargable WHERE clause. But keep in mind that covering indexes have their own drawbacks, such as producing very wide indexes that increase disk I/O when they are read.
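For instance, a covering index for the `members` query shown further below might be sketched like this (the index name is illustrative); the INCLUDE clause adds the SELECT-list columns at the leaf level without widening the index key:

```sql
-- Covers the WHERE on dateofbirth and the SELECT of the other three columns
CREATE NONCLUSTERED INDEX IX_members_dateofbirth_covering
ON members (dateofbirth)
INCLUDE (member_number, first_name, last_name);
```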
In some cases, it may be possible to rewrite a non-sargable WHERE clause into one that is sargable. For example, the clause:
WHERE SUBSTRING(firstname,1,1) = 'm'
can be rewritten like this:
WHERE firstname like 'm%'
Both of these WHERE clauses produce the same result, but the first one is non-sargable (it uses a function) and will run slow, while the second one is sargable, and will run much faster.
WHERE clauses that perform some function on a column are non-sargable. On the other hand, if you can rewrite the WHERE clause so that the column and function are separate, then the query can use an available index, greatly boosting performance. For example:
Function Acts Directly on Column, and Index Cannot Be Used:
SELECT member_number, first_name, last_name
FROM members
WHERE DATEDIFF(yy,dateofbirth,GETDATE()) > 21
Function Has Been Separated From Column, and an Index Can Be Used:
SELECT member_number, first_name, last_name
FROM members
WHERE dateofbirth < DATEADD(yy,-21,GETDATE())
Each of the above queries produces the same results, but the second query will use an index because the function is not performed directly on the column, as it is in the first example. The moral of this story is to try to rewrite WHERE clauses that have functions so that the function does not act directly on the column.
WHERE clauses that use NOT are not sargable, but can often be rewritten to remove the NOT from the WHERE clause, for example:
WHERE NOT column_name > 5
to
WHERE column_name <= 5
Each of the above clauses produce the same results, but the second one is sargable.
If you don't know if a particular WHERE clause is sargable or non-sargable, check out the query's execution plan in Query Analyzer. Doing this, you can very quickly see if the query will be using index lookups or table/index scans to return your results.
With some careful analysis, and some clever thought, many non-sargable queries can be written so that they are sargable. Your goal for best performance (assuming it is possible) is to get the left side of a search condition to be a single column name, and the right side an easy to look up value.
If you currently have a query that uses NOT IN, which offers poor performance because the SQL Server optimizer has to use a nested table scan to perform this activity, instead try to use one of the following options instead, all of which offer better performance:
  • Use EXISTS or NOT EXISTS

     
  • Use IN

     
  • Perform a LEFT OUTER JOIN and check for a NULL condition
When you have a choice of using the IN or the BETWEEN clauses in your Transact-SQL, you will generally want to use the BETWEEN clause, as it is much more efficient. For example:
SELECT customer_number, customer_name
FROM customer
WHERE customer_number in (1000, 1001, 1002, 1003, 1004)
is much less efficient than this:
SELECT customer_number, customer_name
FROM customer
WHERE customer_number BETWEEN 1000 and 1004
Assuming there is a useful index on customer_number, the Query Optimizer can locate a range of numbers much faster (using BETWEEN) than it can find a series of numbers using the IN clause (which is really just another form of the OR clause).
If you have a WHERE clause that includes expressions connected by two or more AND operators, SQL Server will generally evaluate them from left to right in the order they are written (although the optimizer is free to reorder them). This assumes that no parentheses have been used to change the order of evaluation. Because of this, you may want to consider one of the following when using AND:
  • Locate the least likely true AND expression first. This way, if the AND expression is false, the clause will end immediately, saving time.

     
  • If both parts of an AND expression are equally likely to be false, put the least complex AND expression first. This way, if it is false, less work will have to be done to evaluate the expression.
You may want to consider using Query Analyzer to look at the execution plans of your queries to see which is best for your situation.
If you want to boost the performance of a query that includes an AND operator in the WHERE clause, consider the following:
  • Of the search criteria in the WHERE clause, at least one of them should be based on a highly selective column that has an index.

     
  • If at least one of the search criteria in the WHERE clause is not highly selective, consider adding indexes to all of the columns referenced in the WHERE clause.

     
  • If none of the columns in the WHERE clause are selective enough to use an index on their own, consider creating a covering index for this query.
The Query Optimizer will perform a table scan or a clustered index scan on a table if the WHERE clause in the query contains an OR operator and if any of the referenced columns in the OR clause are not indexed (or do not have a useful index). Because of this, if you use many queries with OR clauses, you will want to ensure that each referenced column in the WHERE clause has a useful index.

Whenever SQL Server has to perform a sorting operation, additional resources have to be used to perform this task. Sorting often occurs when any of the following Transact-SQL statements are executed:
  • ORDER BY
  • GROUP BY
  • SELECT DISTINCT
  • UNION
  • CREATE INDEX (generally not as critical, as it happens much less often)
In many cases, these commands cannot be avoided. On the other hand, there are few ways that sorting overhead can be reduced. These include:
  • Keep the number of rows to be sorted to a minimum. Do this by only returning those rows that absolutely need to be sorted.

     
  • Keep the number of columns to be sorted to the minimum. In other words, don't sort more columns than required.

     
  • Keep the width (physical size) of the columns to be sorted to a minimum.

     
  • Sort columns with numeric datatypes instead of character datatypes.
When using any of the above Transact-SQL commands, try to keep the above performance-boosting suggestions in mind.
If your SELECT statement contains a HAVING clause, write your query so that the WHERE clause does most of the work (removing undesired rows) instead of making the HAVING clause remove them. Using the WHERE clause appropriately can eliminate unnecessary rows before they get to the GROUP BY and HAVING clauses, saving unnecessary work and boosting performance.
For example, in a SELECT statement with WHERE, GROUP BY, and HAVING clauses, here's what happens. First, the WHERE clause is used to select the appropriate rows that need to be grouped. Next, the GROUP BY clause divides the rows into sets of grouped rows, and then aggregates their values. And last, the HAVING clause then eliminates undesired aggregated groups. If the WHERE clause is used to eliminate as many of the undesired rows as possible, this means the GROUP BY and the HAVING clauses will have less work to do, boosting the overall performance of the query.
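As an illustration (the `Sales` table and its columns are hypothetical), a row-level condition belongs in WHERE, while only conditions on aggregates need to stay in HAVING:

```sql
SELECT Region, SUM(Amount) AS TotalAmount
FROM Sales
WHERE SaleDate >= '20160101'     -- row filter: applied before grouping
GROUP BY Region
HAVING SUM(Amount) > 100000;     -- group filter: needs the aggregate, so it must stay in HAVING
```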
If your application performs many wildcard (LIKE %) text searches on CHAR or VARCHAR columns, consider using SQL Server's full-text search option. The Search Service can significantly speed up wildcard searches of text stored in a database.
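With a full-text catalog and index in place, a wildcard LIKE search can be replaced by a CONTAINS predicate, which uses the full-text engine instead of scanning every row (a sketch against the earlier `members` table):

```sql
-- Requires a full-text index on members(last_name)
SELECT member_number, first_name, last_name
FROM members
WHERE CONTAINS(last_name, 'smith');
```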

The GROUP BY clause can be sped up if you follow these suggestions:
  • Keep the number of rows returned by the query as small as possible.

     
  • Keep the number of groupings as few as possible.

     
  • Don't group redundant columns.

     
  • If there is a JOIN in the same SELECT statement that has a GROUP BY, try to rewrite the query to use a subquery instead of using a JOIN. If this is possible, performance will be faster. If you have to use a JOIN, try to make the GROUP BY column from the same table as the column or columns on which the set function is used.

     
  • Consider adding an ORDER BY clause to the SELECT statement that orders by the same column as the GROUP BY. This may cause the GROUP BY to perform faster. Test this to see if is true in your particular situation.
Instead of using temporary tables, consider using a derived table. A derived table is the result of using a SELECT statement in the FROM clause of an existing SELECT statement. By using derived tables instead of temporary tables, you can reduce I/O and boost your application's performance.
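For example, instead of materializing an intermediate result in a temporary table, the same SELECT can be embedded directly in the FROM clause (table and column names are hypothetical):

```sql
-- Temp-table version would be: SELECT ... INTO #RegionTotals, then query #RegionTotals.
-- Derived-table version avoids the tempdb round trip for the intermediate result:
SELECT d.Region, d.TotalAmount
FROM (
    SELECT Region, SUM(Amount) AS TotalAmount
    FROM Sales
    GROUP BY Region
) AS d
WHERE d.TotalAmount > 100000;
```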

It is a fairly common request to write a Transact-SQL query to compare a parent table and a child table and find out whether there are any parent records that don't have a match in the child table. Generally, there are three ways this can be done:
Using a NOT EXISTS

SELECT a.hdr_key
FROM hdr_tbl a
WHERE NOT EXISTS (SELECT * FROM dtl_tbl b WHERE a.hdr_key = b.hdr_key)
Using a LEFT JOIN
SELECT a.hdr_key
FROM hdr_tbl a
LEFT JOIN dtl_tbl b ON a.hdr_key = b.hdr_key
WHERE b.hdr_key IS NULL
Using a NOT IN

SELECT hdr_key
FROM hdr_tbl
WHERE hdr_key NOT IN (SELECT hdr_key FROM dtl_tbl)
In each case, the above query will return identical results. But, which of these three variations of the same query produces the best performance? Assuming everything else is equal, the best performing version through the worst performing version will be from top to bottom, as displayed above. In other words, the NOT EXISTS variation of this query is generally the most efficient.
I say generally, because the indexes found on the tables, along with the number of rows in each table, can influence the results. If you are not sure which variation to try yourself, you can try them all and see which produces the best results in your particular circumstances.
If you need to verify the existence of a record in a table, don't use SELECT COUNT(*) in your Transact-SQL code to identify it, which is very inefficient and wastes server resources. Instead, use the Transact-SQL IF EXISTS to determine if the record in question exists, which is much more efficient. For example:
Here's how you might use COUNT(*):
IF (SELECT COUNT(*) FROM table_name WHERE column_name = 'xxx')
Here's a faster way, using IF EXISTS:
IF EXISTS (SELECT * FROM table_name WHERE column_name = 'xxx')
The reason IF EXISTS is faster than COUNT(*) is that the query can end immediately once the test is proven true, while COUNT(*) must go through every matching record, whether there is only one or thousands, before the condition can be evaluated.
Let's say that you often need to INSERT the same value into a column. For example, perhaps you have to perform 100,000 INSERTs a day into a particular table, and that 90% of the time the data INSERTed into one of the columns of the table is the same value.
If this is the case, you can reduce network traffic (along with some SQL Server overhead) by creating this particular column with a default value of the most common value. This way, when you INSERT your data and the data is the default value, you don't INSERT anything into this column, instead letting the default value be filled in automatically. But when the value needs to be different, you will of course INSERT that value into the column.
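A minimal sketch of the idea (the table and the default value are hypothetical):

```sql
-- 'Pending' is the value inserted ~90% of the time, so make it the column default
CREATE TABLE Orders (
    OrderID INT IDENTITY(1, 1),
    OrderDate DATETIME DEFAULT GETDATE(),
    Status VARCHAR(20) DEFAULT 'Pending'
);

-- Common case: omit Status and let the default fill it in
INSERT INTO Orders (OrderDate) VALUES (GETDATE());

-- Exceptional case: supply the value explicitly
INSERT INTO Orders (OrderDate, Status) VALUES (GETDATE(), 'Shipped');
```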
If you have created a complex transaction that includes several parts, one of which has a higher probability of rolling back than the others, performance will be better if you locate the most-likely-to-fail part at the front of the larger transaction. This way, if that part has to roll back because of a failure, no resources have been wasted on the other, less-likely-to-fail parts.


Guidelines for Using Joins

*  If you perform regular joins between two or more tables in your queries, performance will be optimized if each of the joined columns has its own index. This includes adding indexes to the columns in each table used to join the tables.

*  If you have to regularly join four or more tables to get the recordset you need, consider denormalizing the tables so that the number of joined tables is reduced. Often, by adding one or two columns from one table to another, joins can be reduced.

*  Don't use CROSS JOINS, unless this is the only way to accomplish your goal. What some inexperienced developers do is to join two tables using a CROSS JOIN, then they use either the DISTINCT or the GROUP BY clauses to "clean up" the mess they have created. This, as you might imagine, can be a huge waste of SQL Server resources.

*  For maximum performance when joining two or more tables, the indexes on the columns to be joined should have the same data type. This also means that you shouldn't mix non-Unicode and Unicode datatypes (e.g. VARCHAR and NVARCHAR) when using SQL Server 7.0 or later. If SQL Server has to implicitly convert the data types to perform the join, this not only slows the joining process, but it also could mean that SQL Server may not use available indexes, performing a table scan instead.


*  Use joins in preference to sub or nested queries for improved performance.

If you have a query that contains two subselects containing an aggregate function (SUM, COUNT, etc.) in the SELECT part, the query may perform sluggishly. To rectify the problem, you can replace the subselects with a series of JOINs.
As a rule, you should use JOINs in lieu of subselects when the subselect contains aggregate functions.
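A sketch of such a rewrite, using the hdr_tbl/dtl_tbl tables from the earlier example: an aggregate subselect evaluated per parent row can become a single grouped JOIN:

```sql
-- Subselect version: one aggregate lookup per header row
SELECT a.hdr_key,
       (SELECT COUNT(*) FROM dtl_tbl b WHERE b.hdr_key = a.hdr_key) AS dtl_count
FROM hdr_tbl a;

-- JOIN version: a single grouped pass over the detail table
SELECT a.hdr_key, COUNT(b.hdr_key) AS dtl_count
FROM hdr_tbl a
LEFT JOIN dtl_tbl b ON b.hdr_key = a.hdr_key
GROUP BY a.hdr_key;
```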