Denormalization

In a relational database, denormalization is an approach to speeding up read performance (data retrieval) in which the administrator selectively adds back specific instances of redundant data after the data structure has been normalized. A denormalized database should not be confused with a database that has not been normalized.

During normalization, the database designer stores different but related types of data in separate logical tables called relations. When a query combines data from multiple tables into a single result table, the operation is called a join. Multiple joins in the same query can slow the query down, particularly when the tables involved are large. Adding back a small number of redundancies through denormalization can be a useful way to cut down on the number of joins.
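
As a rough illustration of the trade-off, the sketch below uses Python's built-in sqlite3 module and a hypothetical customers/orders schema (the table names, column names, and sample rows are assumptions made for this example, not part of any particular system). It reads the same information twice: once from the normalized tables with a join, and once from a redundant customer_name column added back to orders.

```python
import sqlite3

# Hypothetical schema for illustration only: customers and their orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL REFERENCES customers(id),
        total         REAL NOT NULL,
        customer_name TEXT  -- redundant copy added back by denormalization
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (id, customer_id, total, customer_name) VALUES
        (10, 1, 25.0, 'Ada'),
        (11, 2, 40.0, 'Grace');
""")

# Normalized read: the name lives only in customers, so a join is required.
joined = conn.execute("""
    SELECT o.id, c.name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()

# Denormalized read: the redundant column answers the same question
# from a single table, with no join.
flat = conn.execute("""
    SELECT id, customer_name, total
    FROM orders
    ORDER BY id
""").fetchall()

print(joined == flat)  # both reads return the same rows
```

The second query touches one table instead of two, which is the saving denormalization aims for. The price is that every customer name is now stored in two places.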

After data has been duplicated, the database designer must decide how the multiple instances of the data will be kept in sync. One way to denormalize a database is to keep the logical design normalized and let the database management system (DBMS) store redundant information on disk. This has the added benefit that the DBMS itself takes responsibility for keeping the redundant copies consistent. Another approach is to denormalize the logical data design itself, but this can quickly lead to inconsistent data unless the copies are actively synchronized. Rules called constraints can be used to specify how redundant copies of information are synchronized, but they increase the complexity of the database design and can slow down writes.
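
One possible way to express such a rule, sketched here with Python's built-in sqlite3 module, is a composite foreign key with ON UPDATE CASCADE. The customers/orders schema and all names in it are hypothetical, and other database systems may offer different or better-suited mechanisms; this is only a minimal sketch of the idea that a constraint can keep a redundant copy tied to its source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite does not enforce foreign keys unless this is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        UNIQUE (id, name)  -- lets (id, name) act as a parent key
    );
    CREATE TABLE orders (
        id            INTEGER PRIMARY KEY,
        customer_id   INTEGER NOT NULL,
        customer_name TEXT NOT NULL,  -- redundant copy of customers.name
        total         REAL NOT NULL,
        FOREIGN KEY (customer_id, customer_name)
            REFERENCES customers (id, name)
            ON UPDATE CASCADE
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada');
    INSERT INTO orders (id, customer_id, customer_name, total)
        VALUES (10, 1, 'Ada', 25.0);
""")

# Renaming the customer cascades to the redundant copy in orders.
with conn:
    conn.execute("UPDATE customers SET name = 'Ada Lovelace' WHERE id = 1")
synced_name = conn.execute(
    "SELECT customer_name FROM orders WHERE id = 10"
).fetchone()[0]

# A write that would make the copy disagree with its source is rejected.
try:
    with conn:
        conn.execute(
            "INSERT INTO orders (id, customer_id, customer_name, total) "
            "VALUES (11, 1, 'Someone Else', 5.0)"
        )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(synced_name, rejected)
```

The extra UNIQUE constraint and the cascading update are examples of the added design complexity and write cost mentioned above: each rename of a customer now also rewrites every matching row in orders.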