The Origins of Databases

Let’s trace the evolution of databases, from IBM’s early IMS to the revolutionary relational model, and see how DBMSs transformed data management and shaped modern systems.

This article is based on the lessons I teach about DBMS at Fondazione PIN in Prato in 2025.

If you’re new to the world of databases, I recommend checking out my Introduction to Database post for a beginner-friendly overview of key concepts before we explore the history of DBMS.

The Origins of Databases

In the 60s and 70s, the first data management systems were closely tied to the hardware they ran on. Companies used hierarchical and network databases, which were strictly bound to the physical structure of the data. A well-known example is IMS (Information Management System) by IBM, developed to manage data for NASA’s Apollo mission.
These systems had a series of problems:

Lack of flexibility: every change to the data structure required significant rewrites of the application code.
High management cost: developers had to know the details of the physical implementation to query the data efficiently.
Maintenance difficulties: as the volume of data grew, these systems became increasingly complex and hard to manage.

The Relational Model Revolution

In the 70s, Edgar F. Codd, an IBM researcher, proposed a new approach: the relational model.

Published in 1970 in the famous article "A Relational Model of Data for Large Shared Data Banks", the relational model suggested representing data through relations (tables) instead of rigid structures based on trees or networks.
The advantages of the relational model included:

Separation between data and physical implementation: users did not have to worry about how the data was physically stored.
Flexibility: tables could be modified more easily than hierarchical or network structures.
Ease of querying: data could be queried with a declarative language, which later took shape as SQL (Structured Query Language).

At first, IBM ignored Codd’s idea because it already had a successful database system, IMS. Over time, however, the relational model proved more flexible and easier to maintain. In the following years, IBM developed System R, one of the first relational database systems, and with it created SQL, the language that is today the standard for querying relational databases. In the 80s, the relational model became firmly established, thanks to products such as:

IBM DB2
Oracle Database
Ingres (whose successor project, Postgres, later became PostgreSQL)
Microsoft SQL Server

The relational model became the foundation for most commercial and open-source databases, thanks to its ability to manage data in a structured and efficient way.

The Relational Model

The relational model defines a database abstraction based on relations, so that applications no longer carry the maintenance burden of tracking how the data is physically stored. The three key ideas of the relational model are:

Data organized in relations (tables) without exposing physical details to the application.
Physical independence: data can be stored on different devices or structures without affecting the application.
Declarative queries in SQL, which allow you to specify “what” to search for rather than “how” to find it, leaving the database to optimize data access.
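
To make the "what" versus "how" distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The Album table and its rows are illustrative assumptions chosen for this example; the same result is computed once declaratively in SQL and once with a hand-written loop.

```python
import sqlite3

# In-memory database; the Album table and its rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Album (name TEXT, artist TEXT, year INTEGER)")
albums = [
    ("Hybrid Theory", "Linkin Park", 2000),
    ("21", "Adele", 2011),
    ("Fine Line", "Harry Styles", 2019),
]
conn.executemany("INSERT INTO Album VALUES (?, ?, ?)", albums)

# Declarative: state WHAT we want; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT name FROM Album WHERE year > 2005 ORDER BY name"
).fetchall()

# Procedural equivalent: we spell out HOW to scan, filter, and sort.
procedural = sorted((name,) for name, _artist, year in albums if year > 2005)

print(declarative)
print(procedural)
```

Both approaches return the same rows, but only the procedural version has to change if the data moves to a different structure or a faster access path becomes available.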

There are three components of the relational model:

Structure: The definition of the database’s relations and their contents, independent of their physical representation.
Integrity: Ensures that the database content meets the set constraints.
Manipulation: The programming interface used to access and modify the database content.

Data Independence

The idea of data independence is to avoid exposing the low-level data representation to the user or application. A database schema can be viewed at different levels:

At the lowest level, there is the data storage, where our data is physically saved.
The second level is the physical schema, which represents how our database will be stored in the underlying layer; for example, how the files are organized, whether there is one or more, how the pages are structured, etc.
The third level is the logical schema, which defines the schema of the database and its constraints.
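
As a rough illustration of physical independence, the sketch below (again using Python's sqlite3 module, with an assumed Album table) changes only the physical side of the database by adding an index. The index name idx_album_year is hypothetical. The logical schema and the query text are untouched, and the query returns the same result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Album (name TEXT, artist TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO Album VALUES (?, ?, ?)",
    [
        ("Hybrid Theory", "Linkin Park", 2000),
        ("21", "Adele", 2011),
        ("Fine Line", "Harry Styles", 2019),
    ],
)

QUERY = "SELECT name FROM Album WHERE year = 2011"

before = conn.execute(QUERY).fetchall()

# Physical change only: add an index on year. The logical schema
# (table and columns) and the query text stay exactly the same.
conn.execute("CREATE INDEX idx_album_year ON Album (year)")

after = conn.execute(QUERY).fetchall()

print(before == after)
```

The application never mentions the index; whether the DBMS uses it is a decision left to the query optimizer.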

Example: The Relational Model

Let’s take an example to better understand the relational model. Consider a relation Album(name, artist, year):

name	artist	year
Hybrid Theory	Linkin Park	2000
21	Adele	2011
Fine Line	Harry Styles	2019

A relation is an unordered set of tuples that captures the relationship among the attributes representing an entity.
A tuple is a set of attribute values in the relation; each value is drawn from its attribute’s domain.
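
One way to picture these definitions is to model a relation as a Python set of tuples. This is only a sketch of the mathematical idea, not how a DBMS stores data: the order of the tuples carries no meaning, and the same tuple cannot appear twice.

```python
# A relation modeled as a set of tuples (name, artist, year).
album = {
    ("Hybrid Theory", "Linkin Park", 2000),
    ("21", "Adele", 2011),
    ("Fine Line", "Harry Styles", 2019),
}

# Same tuples listed in a different order: still the same relation.
album_reordered = {
    ("Fine Line", "Harry Styles", 2019),
    ("Hybrid Theory", "Linkin Park", 2000),
    ("21", "Adele", 2011),
}
same_relation = album == album_reordered

# Adding a tuple that is already present does not create a duplicate.
album.add(("21", "Adele", 2011))
cardinality = len(album)

print(same_relation, cardinality)
```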

Values are normally “atomic”, i.e. scalar (for example, not a nested structure such as JSON).
The special value NULL is a member of every domain (if allowed).
A primary key uniquely identifies a single tuple.
Album(id, name, artist, year)

id	name	artist	year
1	Hybrid Theory	Linkin Park	2000
2	21	Adele	2011
3	Fine Line	Harry Styles	2019
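
A minimal sketch of how a primary key behaves in practice, using Python's sqlite3 module. The table definition is an assumption made for this example; the point is that the DBMS rejects a second tuple with an id that already exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Album ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT, artist TEXT, year INTEGER)"
)
conn.execute("INSERT INTO Album VALUES (1, 'Hybrid Theory', 'Linkin Park', 2000)")

try:
    # Same id as the existing tuple: violates the primary key.
    conn.execute("INSERT INTO Album VALUES (1, '21', 'Adele', 2011)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Album").fetchone()[0]
print(duplicate_rejected, row_count)
```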

A foreign key specifies that an attribute of one relation must map to an existing tuple in another relation.
Artist(id, name, year, country)

id	name	year	country
101	Linkin Park	1996	USA
102	Adele	2006	UK
103	Harry Styles	2010	UK

Album(id, name, artist, year)

id	name	artist	year
1	Hybrid Theory	101	2000
2	21	102	2011
3	Fine Line	103	2019
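
The following sketch shows a foreign key being enforced, assuming SQLite via Python's sqlite3 module. Note that SQLite checks foreign keys only when the foreign_keys pragma is enabled on the connection. The "Unknown" album and artist id 999 are made-up values used to trigger the violation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE Artist ("
    " id INTEGER PRIMARY KEY, name TEXT, year INTEGER, country TEXT)"
)
conn.execute(
    "CREATE TABLE Album ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " artist INTEGER REFERENCES Artist(id),"
    " year INTEGER)"
)
conn.execute("INSERT INTO Artist VALUES (101, 'Linkin Park', 1996, 'USA')")
conn.execute("INSERT INTO Album VALUES (1, 'Hybrid Theory', 101, 2000)")

try:
    # Artist 999 does not exist, so this album would be an orphan.
    conn.execute("INSERT INTO Album VALUES (99, 'Unknown', 999, 2020)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# The foreign key lets us follow the reference back to the artist.
joined = conn.execute(
    "SELECT Album.name, Artist.name"
    " FROM Album JOIN Artist ON Album.artist = Artist.id"
).fetchall()
print(orphan_rejected, joined)
```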

Constraints are conditions defined by the user or application that must be met in every instance of the database.

They can validate data in a single tuple or across entire relations.
DBMSs prevent modifications that violate these constraints.
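
As a hedged example of both kinds of constraint, the sketch below declares a NOT NULL and a CHECK constraint (validated per tuple) and a UNIQUE constraint (validated across the whole relation) in SQLite. The specific rules, such as year >= 1900, and the helper try_insert are assumptions invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Album ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL UNIQUE,"
    " year INTEGER CHECK (year >= 1900))"
)
conn.execute("INSERT INTO Album VALUES (1, 'Hybrid Theory', 2000)")

def try_insert(row):
    """Return True if the DBMS accepts the row, False if it rejects it."""
    try:
        conn.execute("INSERT INTO Album VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok_valid = try_insert((2, "21", 2011))         # satisfies every constraint
ok_bad_year = try_insert((3, "Fine Line", 19)) # fails CHECK (single tuple)
ok_null_name = try_insert((4, None, 2019))     # fails NOT NULL (single tuple)
ok_dup_name = try_insert((5, "21", 2011))      # fails UNIQUE (whole relation)

row_count = conn.execute("SELECT COUNT(*) FROM Album").fetchone()[0]
print(ok_valid, ok_bad_year, ok_null_name, ok_dup_name, row_count)
```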

The API that the DBMS exposes to applications to store and retrieve information from the database can be:

Procedural (Relational Algebra): The request defines the (high-level) strategy for retrieving the desired results based on sets or multisets.
Declarative (Relational Calculus): The query specifies only the data you want and not how to find it.
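
To give a feel for the procedural style, here is a small sketch of three relational algebra operators (selection, projection, and join) written as plain Python functions over sets of tuples. The function names and the positional addressing of attributes are simplifications assumed for this example, not a standard API.

```python
# Relations as sets of tuples; attribute positions are fixed by convention.
artist = {(101, "Linkin Park"), (102, "Adele"), (103, "Harry Styles")}
album = {
    (1, "Hybrid Theory", 101, 2000),
    (2, "21", 102, 2011),
    (3, "Fine Line", 103, 2019),
}

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}

def project(relation, *positions):
    """Projection: keep only the attributes at the given positions."""
    return {tuple(t[i] for i in positions) for t in relation}

def join(left, right, left_pos, right_pos):
    """Equi-join: pair tuples whose chosen attributes are equal."""
    return {l + r for l in left for r in right if l[left_pos] == r[right_pos]}

# Procedural plan: join albums with artists, keep albums released
# after 2005, then keep only the album name and the artist name.
plan = project(
    select(join(album, artist, 2, 0), lambda t: t[3] > 2005),
    1, 5,
)
print(sorted(plan))
```

The nesting of the calls fixes an order of operations. A declarative query would state only the desired result and leave that ordering to the DBMS.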

Understanding relational algebra is important for understanding how SQL and query processing work. The next lesson covers relational algebra.
