Database Reengineering: Some Examples

13 March 2018 | Vasyl Soloshchuk

There are several situations in which database reengineering may be required:

Software performance problems caused by an inefficient database.
Changed requirements to software products demanding updates to data structure.
Significant differences in the latest database version compared to earlier versions, where these differences demand changes in the database.

When database reengineering is on the agenda, there are a number of possible approaches. These are based on the following changes to the database:

Redesign. If the existing data structure is deficient, its redesign can result in higher performance.
Migration to or addition of a database of the same type. Using another database that has more acceptable indicators compared to the existing one can improve performance of the software system or solve other issues.
Migration to a database of another type. In some cases, using a non relational database instead of a relational one may solve performance problems. In other cases, a relational database might be more efficient.

Today, I would like to discuss database reengineering, and to provide some examples of this.

Redesign

If, when analyzing a database structure, you see that it is not relevant to the first and second normal forms, it should be redesigned. This means that you may need to change essences, tables, or columns, or carry out decomposition and aggregation.

Under database redesign, significant code upgrades may be required. Any changes or redesign of the data structure will cause code modification on the server side. This is a complicated task that becomes easier if you are able to benefit from an ORM (e.g., Hibernate). Then, when you modify code, the database is automatically renewed and all the changes are represented there.

Example 1

A scheme in a database included about 450 tables. Data profiling showed problems with some of these tables. During reengineering, we provided the following changes:

added some columns and removed others;
cut all essentials that had been designed incorrectly, and added improved essentials instead.

As a result, the scheme included more than 450 tables. The system performance was improved as it was able to work more quickly thanks to the addition of new essentials (which included cascade operations, indexes, connections between tables, and primary keys), and the improvements to incorrect essentials (for example, where possible, simpler data types had been implemented).

Example 2

In this case, database reengineering was needed not due to performance problems, but to changes in business logic, which had caused new data and data types to appear. The database structure had thus changed, and some tables appeared to be redundant.

As a result of reengineering, the modified database became relevant to the new business logic.

Migration to or addition of a database of the same type

If some other database has better indicators, migrating to it may result in your software working more efficiently. The following table compares some of the indicators of popular relational databases.

Database	Max DB size	Max table size	Max row size	Average execution time⁽⁵⁾	Average CPU utilization⁽⁵⁾	Average memory usage⁽⁵⁾
MySQL	Determined by operating system ⁽¹⁾	MyISAM storage limits: 2TB; Innodb storage limits: 64TB ⁽¹⁾	64KB ⁽¹⁾	1222.3 ms	24%	12.5 MB
Oracle	Determined by operating system ⁽²⁾	4GB * block size ⁽²⁾	8KB ⁽²⁾	1275.3 ms	34%	17.9 MB
SQL Server	524,272TB ⁽³⁾	524,272TB ⁽³⁾	8060 bytes ⁽³⁾	1101.3 ms	27%	13.2 MB
BD2	2⁴⁸ bytes ⁽⁴⁾	128TB ⁽⁴⁾	32KB ⁽⁴⁾	1032.3 ms	28%	22.6 MB

Data sources: (1) dev.mysql.com; (2) docs.oracle.com; (3) msdn.microsoft.com; (4) www.ibm.com; (5) Youssef Bassil. A Comparative Study on the Performance of the Top DBMS Systems. Journal of Computer Science & Research (JCSCR) – ISSN 2227-328X Vol. 1, No. 1, Pages. 20-31, February 2012

Depending on which parameters are the most important, the most suitable database can be selected.

Another reason for adding a new database to your software: the new software client stores data in a different database from the original. Thus, to enable the new client to use your software, the software product should work with the database.

Example 3

A software product was developed to process data stored in a Microsoft SQL Server. A new client had all data stored in PostgreSQL, and used the same data in own application, meaning that the data could not be migrated to another database. The software product was to be modified so that it could work with PostgreSQL as well. Each database has its own specific features. Thus, to minimize changes in code we had to withdraw some features and use only general ones. In addition, we changed the data structures in both databases to make them compatible. Still, we had to create some exclusive code for each database.

Migration to a database of another type

Because relational databases were not created for OOP languages, you may need to consider using a non-relational database.

Conclusion

Databases are one of the most important components of every software product, since they store all data used in the software. Database reengineering may be required for various reasons; depending on the need, one of the approaches mentioned above can be used. I hope the examples of database reengineering provided help to demonstrate some of these reasons, objectives, and challenges, as well as the results of our actions.

database reengineering

Database Reengineering: Some Examples

CTO Insights

CTO Insights #1 with Christian Longberg— “Processes stop people from being creative”

Tags & Labels

Follow Us

Redesign

Example 1

Example 2

Migration to or addition of a database of the same type

Example 3

Migration to a database of another type

Conclusion

Featured Articles

The Legacy Pain of Late-Stage Software Startups and How to Cope With It

Merging companies and integrating platforms

Frontend–Backend VS Full-stack Development: “Divide and Conquer” or “Join Together”?

Multi-tier (8-Layer) Architecture for Enterprise Solutions

How to Organize Efficient Work of a Distributed Development Team