What are federated databases?
Last updated: December 2nd '01 | Best viewed with: All popular browsers | Best viewed at: 1024x768 | Links to external sites will open in a new window

About myself
My technical skills
My favorites
My picture album

Shortcut keys
My code library

VB resources
SQLServer resources
SQLServer books
Replication FAQ
Scripting resources
ASP resources

Search my site
Sign my guestbook
Contact information

SQL Server Articles New

NEW!!! Subscribe to my newsletter:
Want to keep in touch with the latest in SQL Server world? Email vyaskn@hotmail.com with 'subscribe' in the subject line
 
See also: Evaluation of federated database servers

What are federated database servers?

"Federated database servers" is a feature introduced in SQL Server 2000. A federation is a group of SQL Servers that cooperate to share the processing load of a system. Federated database servers let you scale out a set of servers to support the processing needs of large systems and websites.

Scaling out is the process of increasing the processing power of a system by adding one or more additional computers, or nodes, instead of beefing up the hardware of a single computer (Scaling up).

Federated database servers can be implemented using "Distributed Partitioned Views" (DPV). You can partition tables horizontally across several servers, and define a distributed partitioned view on one server, covering all member server partitions. This view makes it appear as if a full copy of the original table is stored on one server.

SQL Server 7.0 supports partitioned views too, but SQL Server 2000 came up with the following enhancements, that allow the views to scale out and form federations of database servers: 

  • In SQL Server 2000, partitioned views are updateable
  •  The SQL Server 2000 query optimizer supports new optimizations that minimize the amount of distributed data that has to be transferred. The distributed execution plans generated by SQL Server 2000 result in good performance for a larger set of queries than the plans generated by SQL Server 7.0 

Why federated database servers and when to scale out?

When websites, applications generate processing loads that exceed the capacity of large individual servers, scaling out is the best option for increasing the processing capacity of the system. When the server for a particular application is at its maximum potential, and is no longer able to meet user demands, you should consider scaling out.

According to Microsoft, a federation of servers running SQL Server 2000 is capable of supporting the growth requirements of any Web site, or of the largest enterprise systems.

How are we able to gain performance by scaling out?

  • Distributed processing of data, which means individual member servers of the federation are working with smaller subsets of data, instead of the complete data
  • More CPUs working in parallel
  • Parallel disk I/O
  • Availability of more RAM, as each member server is working with a smaller subset of data

How to create distributed partitioned views?

Creation of partitioned views is explained in SQL Server Books Online, in the page titled "Creating a Partitioned View". The rules for creating updateable partitioned views are also explained in this page.

What is the impact of partitioned views on my front-end applications?

Instead of accessing the base tables directly, your front-end applications or stored procedures will access the partitioned view. SQL Server will take care of getting the right data from right servers, transparent to your application. Even in OLTP scenarios SQL Server manages the INSERT, UPDATE and DELETE commands transparently and sends them to the right partition. 

What other things I should consider before implementing federated database servers?

  • Availability: Make sure all table partitions spread across different servers are accessible all the time, or else you will not be able to use the partitioned view.
  • Backup/Restore: If transactional consistency across the partitioned is not a concern, you can backup and restore some or all of your partitions individually. If you must achieve transactional consistency, perform coordinated backup and restore operations. This means that you backup all your partitions simultaneously and in case you have to restore them, you will have to restore all your partitions to a certain point in time.
  • Security: Since you have your data partitioned across multiple database servers, you have to follow consistent security practices across all your servers.

What is the performance gain I can expect by scaling out?

There is no easy way to calculate the performance gain, without actually implementing and testing the scenario, as the performance depends on a lot of other factors too. But in general, in a simulated DSS system, I achieved a performance gain of 25 to 30% by scaling out to just two SQL Servers.