Evaluation of Federated Database Servers and Distributed Partitioned Views of SQL Server 2000
Last updated: March 26th '01 | Best viewed with: All popular browsers | Best viewed at: 1024x768 | Links to external sites will open in a new window

About myself
My technical skills
My favorites
My picture album

Shortcut keys
My code library

VB resources
SQLServer resources
SQLServer books
Replication FAQ
Scripting resources
ASP resources

Search my site
Sign my guestbook
Contact information

SQL Server Articles New

NEW!!! Subscribe to my newsletter:
Want to keep in touch with the latest in SQL Server world? Email vyaskn@hotmail.com with 'subscribe' in the subject line
 
See also: What are federated databases?

Okay, there are no high end systems, no quad processor boxes, no RAID, and no dedicated network. It's an evaluation scenario at it's simplest form.

I horizontally partitioned a huge table (5 Million rows, not so huge anyways!) across two database servers with 2.5 million rows on each server. Idea is to compare the response times in the following two scenarios:

  • Having the table on one server (5 million rows)
  • Having the table spread across two servers (2.5 million rows on each server)


Here is the hardware configuration of both the servers:

Server1: Server1
OS: Microsoft Windows 2000 Advanced Server with SP1
System model: DELL OptiPlex GX1
Processor: Pentium III, 500MHz
RAM: 128 MB


Server2: Server2
OS: Microsoft Windows 2000 Advanced Server with SP1
System model: DELL OptiPlex GX1
Processor: Pentium III, 500MHz
RAM: 128 MB


Here is the table structure:

CREATE TABLE [dbo].[tblPriceDefn] (
[ConfigID] [numeric](19, 0) NOT NULL ,
[PriceListID] [varchar] (40) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[ValidityDate] [datetime] NOT NULL ,
[Price] [money] NOT NULL
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblPriceDefn] WITH NOCHECK ADD
CONSTRAINT [PK_PriceDefn] PRIMARY KEY CLUSTERED
(
[ConfigID],
[PriceListID],
[ValidityDate]
) ON [PRIMARY]


Horizontal partitioning of the data:
For this evaluation purpose I took the tblPriceDefn table which has 5 Million rows. I divided the table into two equal halves depending upon the value of the PriceListID column. Stored the first half on the first server (Server1) and the second half on the second server (Server2). Then I created a distributed partioned view named View1 in the first server (Server1) which cobmines both the halves of the table tblPriceDefn. Here is the view definition:

CREATE VIEW View1
AS
SELECT * FROM [Server1].test.dbo.table1
UNION ALL
SELECT * FROM [Server2\inst1].test.dbo.table1


Performance testing:
The following queries are executed on the first server locally on both the base table and the partitioned view and the response times were recorded. Each query is executed 5 times and the average response time is recorded

Query 1:

--On the base table:
SELECT
PriceListID,
COUNT(*) [# of rows]
FROM
ArenaDBRel2.dbo.tblPriceDefn
GROUP BY
PriceListID
ORDER BY
PriceListID

Average response time on the base table with 5 million rows: 59 seconds

--On the partitioned view:
SELECT
PriceListID,
COUNT(*) [# of rows]
FROM
test.dbo.View1
GROUP BY
PriceListID
ORDER BY
PriceListID

Average response time on the partitioned view : 41 seconds

A gain of 18 seconds which is a 30% improvement in performance



Query 2:

--On the base table:
SELECT
DISTINCT PriceListID
FROM
ArenaDBRel2.dbo.tblPriceDefn

Average response time on the base table with 5 million rows: 45 seconds

--On the partitioned view:
SELECT
DISTINCT PriceListID
FROM
test.dbo.View1

Average response time on the partitioned view: 32 seconds

A gain of 13 seconds which is a 28% improvement in performance


Query 3:

--On the base table:
SELECT
PriceListID,
ConfigID,
Price
FROM
ArenaDBRel2.dbo.tblPriceDefn
WHERE
PriceListID = 'M05-05'
OR
PriceListID = 'M10-01'

Average response time on the base table with 5 million rows: 59 seconds

--On the partitioned view:
SELECT
PriceListID,
ConfigID,
Price
FROM
test.dbo.View1
WHERE
PriceListID = 'M05-05'
OR
PriceListID = 'M10-01'

Average response time on the partitioned view: 44 seconds

A gain of 15 seconds which is a 25% improvement in performance

Conclusion:
The federated database servers feature in SQL Server 2000 is a very useful one and we can be sure to see performance gains in the query response times. As I mentioned earlier, this is a very simple evaluation of the feature. I am going to try it out with more concurrent connections and complicated queries and will post more information on this site. Be sure to check back after a while! Please let me know if you have any suggestions regarding this page!