Database Migration: The Key to Unlocking your Business Assets
Database Migration: The Key to Unlocking your Business Assets
In today’s hyper competitive business environment, business success hinges on the ready availability of accurate and timely information derived from corporate information assets. The mission-critical data from which that information is derived predominately resides in relational database systems distributed across the organization, in it’s departments and corporate data centers. To improve competitiveness it is often necessary to upgrade legacy systems to newer technologies to achieve more flexible use of the data, or better performance or reliability. In addition, as IT costs increase and budgets are tightened, consolidation of systems is often seen as a way both to improve operational efficiency and to significantly reduce the total cost of ownership of the various hardware and software platforms.
In spite of the need to achieve these critical business objectives, quite often the cost required to achieve them can be prohibitive. One reason for such prohibitive costs, and the subject of this article, is the high cost of migrating databases from one vendor’s relational DBMS to another’s. Either through poor programming practice in which proprietary DBMS features are used [CELKO], or where the use of such features are required to achieve performance objectives, many databases are virtually locked in to a vendor’s DBMS. While SQL standards [SQL92] have been developed precisely to avoid vendor lock-in, every major vendor has added useful features to their SQL dialect that are outside the scope of the standard. As a result of these proprietary extensions, there are numerous differences between each vendor’s SQL dialect, spanning just about every language feature. Because of these differences, applications that extensively use business logic in the database, such as stored procedures, functions, triggers, etc., are difficult to migrate to another database. Until very recently, there have been essentially no tools to assist in converting the business logic from one vendor’s SQL dialect to another’s. Consequently, migrating a medium to large size database is a manual activity that may require hundreds or even thousands of staff-hours to complete. Given that a large enterprise can easily have tens to hundreds of databases to migrate, and given the current cost of labor, even using off-shore resources, the migration expense quickly becomes prohibitive, running into the millions or even tens of millions of dollars.
Rather than being locked-in to a particular DBMS, if technology were available to significantly reduce the time and cost of database migration, then corporate databases could be “liberated,” allowing the business to move them to the database platform that best meets their needs. Then, corporate executives would be better able to access the information they need to make informed decisions, and IT executives would be able to reduce costs and improve operational efficiency.
To provide an understanding of the difficulty of database migration, and why it can be prohibitively expensive, we explain each step of the database migration process, and the difficulties involved. We also discuss technology that exists today to significantly speed up and cost reduce almost every step of the migration process. We focus primarily on the software products recently introduced to the market by DB Best, (www.dbbest.com), specifically to reduce the time and effort required to migrate databases from one DBMS to another. The initial products address Oracle to Microsoft SQL Server migration, and additional migrations will be added in the future as the market dictates.
Database Migration Process
Database migrations are quite common within a brand, e.g., when it becomes necessary to upgrade to the vendor’s latest release either to gain access to the new features or because the current version has been designated at “end of life.” Migrations within a brand are typically non-trivial, but they don’t compare in difficulty to migrations from one brand to another, e.g., say, from Oracle to Microsoft SQL Server.
The main steps in the migration of a database from one vendor’s DBMS to another’s are discussed in the subsections below and are depicted in the flow diagram shown in Figure 1.
Assess the Migration Project. Suppose that IT management has determined that it would be beneficial to migrate some number of databases from, say, Oracle to SQL Server. The first step in the migration is to determine what would be involved, namely to assess the difficulty, time and cost of the migration. Factors that contribute directly to this are the “complexity” of the databases and of the applications that use them. While there is no formal measure of database complexity, a database will be increasingly complex (and costly) to migrate the more business logic objects (stored procedures,
functions, triggers, packages, etc.) it contains, the larger the size of those objects, and the choice of language features used to implement them. Adding to the migration complexity are system function differences between the two platforms. Functions provided on one platform may not be provided on the other, or there may be syntactic or semantic differences in the implementation of common functions.
The complexity of migrating the applications to the new database also need to be assessed. Contributing to the complexity are the number of SQL requests made by the application, since all of them must be converted. More importantly, the degree to which dynamic SQL is used adds significantly to the conversion complexity, since a major code rewrite may be necessary to convert dynamic SQL queries.
The result of the migration assessment is (hopefully) a good estimate of the amount of effort that is required to accomplish each step in the migration process, and the overall expense of the project.
It should be noted that conducting a migration assessment can itself be a time consuming and costly effort. It’s not unusual for an assessment to take several months for a large migration project, and to cost tens of thousands of dollars.
Once the assessment is completed, it is not unusual to decide not to migrate because of the time and cost that would be required. Either of these factors, time or cost, could be the show stopper. The cost to migrate may be too high or the ROI too low to justify the project. Or possibly the project would take too long or be too disruptive of ongoing business operations. Thus, it may cost thousands of dollars just to determine that you don’t want to migrate after all.
Fortunately, there’s a new migration assessment tool [DBB1] that performs a deep analysis to obtain very good estimates of the time and effort required to migrate a database. Currently this tool, the DB Best Migration Analyzer, is limited to assessing Oracle to SQL Server migration, but this limitation will be removed in the future. Typically, an assessment will take less than a day to complete. The tool provides an estimate of the labor required to migrate the database manually, as well as the percentage of the database that the DB Best SQL Converter [DBB2] can convert automatically. The Migration Analyzer computes approximately 100 specific statistics that comprehensively characterizes the database. The SQL Converter typically converts over 90% of the SQL code, automatically, and it has accomplished up to 99% automatic code conversion. In addition to the comprehensive profile of the Oracle database, the assessment report indicates the database objects that the SQL Converter can’t convert together with the specific constructs in the code that cause the problem.
It should be noted that the Migration Analyzer entirely eliminates the need to incur a significant expense just to find out whether migration is a viable option. Notice also that since the migration has been largely automated, the migration expense and time can both be reduced significantly, by dramatically reducing the human effort both to perform the conversion and to test it.
Migrate the Schema and the Data. The schema is the basis for defining the target database. In migrating from Oracle to SQL Server, the new schema can be derived in a fairly straightforward manner from the original schema. Because of feature differences between the two platforms, the mapping between the schemas may not be one to one. An example of where this arises is in dealing with long-valued attributes such as BLOBs.
Oracle supports multiple BLOB-valued attributes per table while SQL Server supports only one per table. In this case, a single table in Oracle may map to multiple tables in SQL Server. Because of such differences, migrating the data between databases may not be a simple table copy, since the new table structures may be somewhat different from the old ones. Tools to help in the manual definition of database schemas have been available for many years [CAER, EMBER, SYPD]. Another set of tools exist to transform and physically move large databases between platforms [ASDS, CAIP]. The DB Best Schema and Data Migrator [DBB3] automatically converts an Oracle database schema (tables, constraints, indexes) to a Microsoft SQL Server schema and physically migrates the data to populate the new database.
Migrate the Business Logic. As we noted above, the complexity of a database from a migration perspective is a function of the number of business logic objects it contains, the size of those objects, and the particular language features used to implement them. Additional complexity derives from differences in the system function libraries provided in the two platforms.
For example, Oracle’s stored procedure language, PL/SQL, has many features not supported in Transact-SQL (T-SQL), the stored procedure language of SQL Server, and even where there are similar features, the languages contain many syntactic and semantic differences. To migrate a PL/SQL object that uses a feature that has no T-SQL counterpart requires that that capability be emulated in T-SQL using the available features. Developing a good emulation can be quite subtle and time consuming. In fact many subtleties need to be addressed in the migration whether developing emulations for missing features or accounting for semantic differences in corresponding features. Just a few small examples of these differences are shown in Table 1, below:
Table 1. Examples of differences between Oracle and SQL Server
In addition to the language differences, system library differences need to be dealt with. There are functions in one library that either don’t exist in the other, or that exhibit semantic differences. For example, the Oracle system function BITAND(p1, p2), that does a bit by bit AND operation on the arguments, has no counterpart in SQL Server. As a result, a corresponding function must be developed for use on SQL Server. There are approximately 18 Oracle functions that have no SQL Server counterpart and must be developed to support the migration. In addition, about 17 Oracle functions are only partially supported by SQL Server functions, e.g., some use cases are not supported, so if any of the unsupported cases are needed, these must be developed as well. SQL Server supports 29 of the Oracle functions directly.
DB Best provides a set of functions for SQL Server to support the Oracle functions, or Oracle function use cases that are absent from the SQL Server system function library.
Until very recently, most of the database conversion effort had to be done by hand. IBM and Oracle provide some tools to assist in migrating to their DBMSs, but their tools are not of product quality and are incomplete, leaving much work to be done by hand. Microsoft doesn’t provide any code migration tools. For any sizable database, there may be thousands of objects that must be converted by hand. Given the knowledge level needed to do this work successfully, this labor-intensive activity is very expensive, and even if offshore labor is utilized for such a project, it can require months of effort and millions of dollars to complete.
Fortunately, tools are beginning to emerge to assist with code migration. DB Best’s SQL Converter [DBB2] for converting PL/SQL to T-SQL, has consistently been able to achieve 90% to 99% code conversion on the databases encountered so far. This is good news for corporate and IT executives since this tool dramatically shortens the time needed to convert a database, and it eliminates most of the labor to do so. Thus, this represents an important first step towards liberating corporate database assets.
Test the Converted Database. Once the converted schema, data, and code have been loaded into the new DBMS, it is necessary to ensure that the new database is semantically equivalent to the old database. For this to be so, the results of all operations on the new database must be identical to those on the old one. There’s no better way to gain such assurance than to exhaustively perform the same operations on both databases and to compare the results. After sufficient testing, when it is clear that the new database behaves as required, the database may be considered migrated and ready for operational use.
To facilitate the comprehensive testing of a converted database, DB Best is about to release its Migration Tester that automatically generates two sets of test data conforming respectively to the old and new schemas. It also constructs a set of test operations to exhaustively apply against both databases. The test operations closely model the current operations on the old database. The goal of the Migration Tester is to provide, through comprehensive testing with a credible set of test queries, a high degree of assurance that the new database is semantically equivalent to the old one. This product is scheduled for release during the summer, 2004.
Migrate the Application. In addition to migrating the SQL code in the database, the SQL queries issued by mid-tier or client applications must also be converted. Even when a standards-based database interface is used, such as ODBC or JDBC, the SQL dialect issued by the application to interact with the old database must be converted to interact with the new database. Since the new database schema is unlikely to be identical to the old schema, the converted queries must reflect the new schema.
Notice that if the application uses a standard driver such as ODBC or JDBC, and if it does not use dynamic SQL, then converting the application to use the new database will be relatively straightforward – scan the application code for all the SQL calls and substitute equivalent calls on the new database. On the other hand, if dynamic SQL is used, then the problem becomes much more difficult since the queries are constructed at run time. In this case, tracking down all the SQL calls requires considerable code analysis and rewriting. Depending on the size of the application, this can be quite time consuming and there is risk that the application won’t work exactly like the old one.
In addition, if the application was implemented in a proprietary language or if it uses a proprietary database API, then it may be necessary or desirable to rewrite the application entirely. This clearly can be a very time-consuming step in the migration process, both to rewrite the application and to do comprehensive testing.
As an alternative to rewriting an application, DB Best is developing a unique technology, called the Runtime Query Converter [DBB 2] that completely eliminates the need to convert the application. The Runtime Query Converter works regardless of whether the application uses static or dynamic SQL. The product wraps the database driver and catches every SQL call which it dynamically converts to the new database, using the new schema. It also implements a caching scheme to minimize the overhead of this process, with the result that the processing overhead is negligible. This product, like the other DB Best products are currently aimed at Oracle to SQL Server migration. At it’s initial release in 2004, the Runtime Query Converter will support JDBC-based applications, followed also in 2004 with releases to support ODBC and OCI, Oracle’s proprietary database driver.
Test, Integrate and Deploy. This is intrinsically a manual step in which the converted application is integrated with the overall system and tested extensively under load against the new database to ensure that it’s ready for operational deployment. Since the database by this time should already have been tested and accepted as correct, if the Runtime Query Converter is used to convert the application’s database calls, then this step, although important, should not be very time consuming. Otherwise, the time to perform this step will be proportional to how much of the application needed to be rewritten.
Optimize the Performance. Once everything is working correctly, it may be desirable to tune the database to improve the overall performance. This could entail performing benchmarking runs to try to find operations that seem to consume excessive time. Once the database performance is better understood, the choice of algorithms in some of the stored procedures might be changed, or the indexing strategy might be modified, or the physical layout of the database on disk might be altered. If the performance is inadequate after the previous migration steps, then performance tuning and optimization will be needed to achieve acceptable performance. Otherwise, this optimization step can be deferred to normal DBA activity.
DB Best is planning a future release of a database optimizer that will provide automation in support of improving the logical and physical design of the migrated database.
Migration Process Flow Summary
In the previous subsections we discussed what is involved in each step of the migration process. In addition, we noted the current and planned migration products developed by DB Best to greatly reduce the time and effort of migrating a database from Oracle to SQL Server, and ultimately, from any major database to another. DB Best currently supports Oracle to SQL Server migration, preferring to add depth to this particular migration path rather than breadth of platforms served. Other migration paths will be supported in the future, depending on market demand. Table 2 summarizes the DB Best technologies that apply to each step of the migration process. The first four products in the table are included in the DB Best Migration Lifecycle Suite that also includes a powerful GUI-based IDE that integrates the four products. The Runtime Query Converter is a runtime module that is licensed separately from the suite.
Table 2. DB Best Products Applied to Migration Process
Distribution of Effort
Above we described the steps involved in database migration. In Table 3 below, the left hand columns present the tasks Accenture uses to describe a migration project and the distribution of effort they have observed in such projects [HAWK]. The table is based on Accenture’s experience with a “typical” Oracle to SQL Server migration, performed in Accenture’s “migration factory” [ACCDMS]. The right hand columns of the table indicate the DB Best product that can be applied to each task, and an estimate of the percentage of the task that is automated by the product. Based on these estimates, we conclude that the DB Best Migration Lifecycle Suite will reduce the total effort to migrate a typical enterprise database by approximately 75%. In fact, we were recently told by Accenture [MCGA] that using just the Migration Assessor and the SQL Converter, are likely to result in a 75% reduction of effort. Thus we expect that the actual effort reduction, once the full Migration Lifecycle Suite and the Runtime SQL Converter are available, will be greater than 75%, thereby making database migrations much more practical than they have been to date.
The days of being locked in to a particular database system or vendor are rapidly nearing an end. The technologies available today to automate the migration of a database from one DBMS to another can provide sufficient time and cost savings to enable IT executives to migrate to the DBMS platform that best meets their needs. DB Best Technologies provides or soon will provide the capability to automate almost every aspect of database migration. As a result, the time, cost, and risk associated with migrating databases between vendor systems will be greatly reduced. We expect a time and cost saving of at least 75% over the almost purely manual methods that are still employed today, and these expectations have been borne out by our experiences to date.
Table 3. Breakdown of Database Conversion Effort
[MCGA] Steve McGarr. Personal communication.
[ACCDMS] Database Migration Solution. Accenture White Paper.
[ASDS] Ascential DataStage.
[CAER] Computer Associates ERwin.
[CAIP] Computer Associates InfoPump.
[CIOM] CIO Editorial Study, IDG Research, CIO Magazine, October 2001.
[CELKO] Joe Celko, “I Will Never Have To Port This Code,” Intel. Enterprise, April 2003.
[DBB1] DB Best Migration Analyzer.
[DBB2] DB Best SQL Converter.
[DBB3] DB Best Schema and Data Migrator.
[EMBER] Embarcadero ER/Studio.
[HAWK] Gerald Hawkins, Accenture. Personal communication, April 8, 2004.
[SQL92] ANSI X3.135-1992, "Database Language SQL."
[SYPD] Sybase PowerDesigner.