Fun with numbers in Transact-SQL queries
Last updated: June 9th '02 | Best viewed with: All popular browsers | Best viewed at: 1024x768 | Links to external sites will open in a new window

About myself
My technical skills
My favorites
My picture album

Shortcut keys
My code library

VB resources
SQLServer resources
SQLServer books
Replication FAQ
Scripting resources
ASP resources

Search my site
Sign my guestbook
Contact information

SQL Server Articles New

 
NEW!!! Subscribe to my newsletter:
Want to keep in touch with the latest in SQL Server world? Email vyaskn@hotmail.com with 'subscribe' in the subject line
See also:  
Database coding conventions SQL Server interview questions
Evaluation of federated database servers SQL Server FAQ on programming, administration, replication and tools
SQL Server security best practices SQL Server administration best practices
Related books:
The guru's guide to Transact SQL Advanced Transact-SQL for SQL Server 2000 SQL Server 2000 Programming by example

Fun with numbers in Transact-SQL queries


Recently, someone asked, how to sort the characters, within a string? For example, a string contains 'CBA', and he wanted to sort the characters within the string and make it 'ABC'. Instead of going for a procedural solution, just for fun, I tried approaching it in a relational way (T-SQL specific). In this article, I'll show you how we can make use of a numbers table for writing this kind of innovative queries.

Let's first create a numbers table called 'Numbers', with only one column called 'Number'. The 'Number' column is an IDENTITY column with a primary key and clustered index defined on it. A clustered index in this case improves the performance of the queries as we will be querying this numbers table for ranges of numbers.

The following script will drop the 'Numbers' table if it already exists, creates it and populates the table with numbers 1 to 8000 (We need only 8000 numbers as a char/varchar variable or column can have a maximum of 8000 characters):


SET NOCOUNT ON
GO

IF EXISTS
(
	SELECT	1
	FROM 	INFORMATION_SCHEMA.TABLES
	WHERE		TABLE_NAME 	= 'Numbers'
	    	AND 	TABLE_SCHEMA 	= 'dbo'
	    	AND 	TABLE_TYPE 	= 'BASE TABLE'
)
BEGIN
	DROP TABLE dbo.Numbers
END
GO

CREATE TABLE dbo.Numbers
(
	Number smallint IDENTITY(1, 1) PRIMARY KEY
)
GO

WHILE 1 = 1
BEGIN
	INSERT INTO dbo.Numbers DEFAULT VALUES
	
	IF @@IDENTITY = 8000 
	BEGIN
		BREAK
	END
END
GO

Now let us address the problem at hand: "Sorting the characters within a string"

The following script will use the Numbers table to split the string into individual characters and create a derived table, from which it creates a sorted string using an aggregate concatenation query:

DECLARE @input varchar(100), @output varchar(100), @len smallint
SET @input = 'CDBEA'
SET @output = ''
SET @len = LEN(@input)

SELECT @output = @output + Val 
FROM 
(
	SELECT TOP 100 PERCENT SUBSTRING(@input, Number, 1) AS Val
	FROM dbo.Numbers (NOLOCK)
	WHERE Number <= @len
	ORDER BY Val
) AS Derived

SELECT	@input AS 'Original string', 
	@output AS 'Sorted string'

This idea can be extended for other scenarios. How about extracting the unique characters from a given string? That is, given an input of 'abbcccdddd', output 'abcd'. The following script will do exactly that, using the DISTINCT keyword:

DECLARE @input varchar(100), @output varchar(100), @len smallint
SET @input = 'ABBCCCDDDD'
SET @output = ''
SET @len = LEN(@input)

SELECT @output = @output +  Val
FROM
(
	SELECT DISTINCT TOP 100 PERCENT SUBSTRING(@input, Number, 1) AS Val
	FROM dbo.Numbers (NOLOCK)
	WHERE Number <= @len
	ORDER BY Val
) AS Derived

SELECT	@input AS 'Original string',
	@output AS 'Sorted string with UNIQUE characters only'

Here's another scenario! How about getting just the numbers from a given string? Remember those cellphones pulling out the numbers from SMS/Text messages? :-) Here's the code:

DECLARE @input varchar(100), @output varchar(100), @len smallint
SET @input = 'My Number is: 0771 543 2360'
SET @output = ''
SET @len = LEN(@input)

SELECT @output = @output +  Val
FROM
(
	SELECT TOP 100 PERCENT Number, SUBSTRING(@input, Number, 1) AS Val
	FROM dbo.Numbers (NOLOCK)
	WHERE Number <= @len
	ORDER BY Number
) AS Derived
WHERE Val LIKE '[0-9]' 

SELECT	@input AS 'Original string',
	@output AS 'Extracted numbers'

If you are running SQL Server 2000, the above scripts can be converted to handy User Defined Functions (UDF), as shown below:

User Defined Function 1: SortString()


IF EXISTS
(
	SELECT	1
	FROM	INFORMATION_SCHEMA.ROUTINES
	WHERE		ROUTINE_NAME 	= 'SortString'
		AND	ROUTINE_SCHEMA	= 'dbo'
		AND	ROUTINE_TYPE	= 'FUNCTION'
)
BEGIN
	DROP FUNCTION dbo.SortString
END
GO

CREATE FUNCTION dbo.SortString
(
	@input varchar(8000)
)
RETURNS varchar(8000)
AS
BEGIN
	DECLARE @output varchar(8000), @len smallint
	SET @output = ''
	SET @len = LEN(@input)

	SELECT @output = @output + Val 
	FROM 
	(
		SELECT TOP 100 PERCENT SUBSTRING(@input, Number, 1) AS Val
		FROM dbo.Numbers (NOLOCK)
		WHERE Number <= @len
		ORDER BY Val
	) AS Derived

	RETURN @output
END
GO

SELECT dbo.SortString('911Abcdzyxfghjie999') AS 'Sorted string'
GO

User Defined Function 2: ExtractUniqueChars()


IF EXISTS
(
	SELECT	1
	FROM	INFORMATION_SCHEMA.ROUTINES
	WHERE		ROUTINE_NAME 	= 'ExtractUniqueChars'
		AND	ROUTINE_SCHEMA	= 'dbo'
		AND	ROUTINE_TYPE	= 'FUNCTION'
)
BEGIN
	DROP FUNCTION dbo.ExtractUniqueChars
END
GO

CREATE FUNCTION dbo.ExtractUniqueChars
(
	@input varchar(8000)
)
RETURNS varchar(8000)
AS
BEGIN
	DECLARE @output varchar(8000), @len smallint
	SET @output = ''
	SET @len = LEN(@input)

	SELECT @output = @output +  Val
	FROM
	(
		SELECT DISTINCT TOP 100 PERCENT SUBSTRING(@input, Number, 1) AS Val
		FROM dbo.Numbers (NOLOCK)
		WHERE Number <= @len
		ORDER BY Val
	) AS Derived

	RETURN @output
END
GO

SELECT dbo.ExtractUniqueChars('cba abc bac') AS 'Unique characters (Sorted)'
GO

User Defined Function 3: ExtractNumbers()


IF EXISTS
(
	SELECT	1
	FROM	INFORMATION_SCHEMA.ROUTINES
	WHERE		ROUTINE_NAME 	= 'ExtractNumbers'
		AND	ROUTINE_SCHEMA	= 'dbo'
		AND	ROUTINE_TYPE	= 'FUNCTION'
)
BEGIN
	DROP FUNCTION dbo.ExtractNumbers
END
GO

CREATE FUNCTION dbo.ExtractNumbers
(
	@input varchar(8000)
)
RETURNS varchar(8000)
AS
BEGIN
	DECLARE @output varchar(8000), @len smallint
	SET @output = ''
	SET @len = LEN(@input)

	SELECT @output = @output +  Val
	FROM
	(
		SELECT TOP 100 PERCENT Number, SUBSTRING(@input, Number, 1) AS Val
		FROM dbo.Numbers (NOLOCK)
		WHERE Number <= @len
		ORDER BY Number
	) AS Derived
	WHERE Val LIKE '[0-9]' 
	
	RETURN @output
END
GO
	
SELECT dbo.ExtractNumbers('My Number is: 0771 574 0609')
GO

If you can spare some 16 KB of memory, you can pin the Numbers table into memory, so that the pages of this table remain in memory, once read into memory. This is okay with smaller tables like the 'Numbers' table, but do not try this with larger tables as that can negatively impact SQL Server performance The following command can be used to pin the 'Numbers' table in memory (Also see DBCC PINTABLE in SQL Server Books Online (BOL)):

EXEC sp_tableoption 'Numbers', 'pintable', 'true'
GO


That concludes this article. I concentrated only on string manipulations, but for sure, there's more that can be done with a table of numbers. Watch out for more articles, code samples in the near future. Have fun!