Logins with no default database

This can happen when databases are dropped or renamed.

In Azure, cross database connections are disabled so its important to connect to the actual database you will be working on, rather than -say- master.

This will list all logins where the default database is missing:-

SELECT [name] Login,
       default_database_name MissingDefaultDB
FROM sys.server_principals
WHERE DATABASEPROPERTYEX(default_database_name, 'Status') IS NULL
AND default_database_name IS NOT NULL
ORDER BY [name];

Removing duplicates from a large table before adding a clustered primary key

The challenge with this one, was to remove over 2,000 duplicate rows from a 600,000,000 row heap, before a multi-column clustered primary key could be added.

(I am so happy to finally be able to document this, from a while ago. The phrase “uniquely challenging” was never so apt.)

My first idea was to create a empty copy of the source table, add a “ignore duplicates” index, then start an INSERT [dbo].[targettable] SELECT * FROM [dbo].[sourcetable]. However, I stopped after an hour as it was going to take too long.

My second idea was to use SSIS to populate the target-table. Again this was just too slow.

My third idea was to just remove the duplicates (2,000 being a much smaller number than 600,000,000). First though, a disclaimer, I only considered this ** high risk ** solution because the data was in a QA environment (not Live), and people were waiting for me.

Step one then, was to populate a temp-table with the duplicates …

SELECT column1, column2, column3, column4
INTO #tmp1
FROM [dbo].[sometablename]
GROUP BY column1, column2, column3, column4
HAVING COUNT(*) > 1;

(“column1” etc were not the real column names by-the-way 🙂 ). Step two was to loop through this list removing duplicates from the permanent table

WHILE exists (SELECT 1 FROM #tmp1)
BEGIN

	DECLARE @1 BIGINT, @2 INT, @3 BIGINT
	DECLARE @4 VARCHAR(10)

	SELECT TOP(1) 
		@1 = column1, 
		@2 = column2, 
		@3 = column3, 
		@4 = column4 
	FROM #tmp1

	DELETE TOP(1) 
	FROM [dbo].[sometablename] 
	WHERE column1 = @1
	AND column2 = @2
	AND column3 = @3
	AND column4 = @4

	DELETE 
	FROM #tmp1
	WHERE column1 = @1
	AND column2 = @2
	AND column3 = @3
	AND column4 = @4

END

Step three was to implement the clustered primary key. (by-the-way I was half expecting this to fail with a duplicates error, but happily the above loop had cleared a single duplicate from each pair, and thats all that was needed).

ALTER TABLE [dbo].[sometablename] 
ADD CONSTRAINT pk_sometablename
 PRIMARY KEY CLUSTERED
 ([column1], [column2], [column3], [column4]);

** just to point out – I consider the loop “high risk” because, an error message, a typo, accidentally running it twice, or faulty logic, could have resulted in disaster 🙂