What is the name of the individual pieces of information that are stored in a database table?

What is Data?

Before we get into the concept of a database, we should first understand what data is. Put simply, data are pieces of information or facts related to the object being considered. For example, examples of data relating to an individual would be the person’s name, age, height, weight, ethnicity, hair color, and birthdate. Data is not limited to facts themselves, as pictures, images, and files are also considered data.

Índice Show

What is Data?
What is the Role of Databases in an Enterprise?
Components of a Database
What is a Database Management System (DBMS)?
More Resources
1-5 Data is stored in tables
1-6 A row represents an object and the information about it
1-7 A column represents one type of information
1-8 A cell is the smallest part of a table
1-9 Each cell should express just one thing
1-10 Primary key columns identify each row
1-11 Most tables are tall and thin

There are a few key terms that would be useful to help one understand data more, particularly the relation between data and databases.

Fields: Within a database, a field contains the most detailed information about events, people, objects, and transactions.

Record: A record is a collection of related fields.

Table: A table is a collection of related records with a unique table name

Database: A database is a collection of related tables.

What is the Role of Databases in an Enterprise?

Enterprises typically make use of both internal databases and external databases. Internal databases typically include operational databases and data warehouses. The former, operational databases, refer to databases that are actively used in the operations of the business, such as accounting, sales, finance, and HR.

Data warehouses contain data collected from several sources, and the data contained within are generally not used for routine business activities. Instead, data warehouses are usually used for business intelligence purposes. External databases refer to databases external to an organization and are generally accessed over the Internet and are owned by other organizations. An example of an external database is the SEC database.

Components of a Database

The five major components of a database are:

1. Hardware

Hardware refers to the physical, electronic devices such as computers and hard disks that offer the interface between computers and real-world systems.

3. Software

Software is a set of programs used to manage and control the database and includes the database software, operating system, network software used to share the data with other users, and the applications used to access the data.

3. Data

Data are raw facts and information that need to be organized and processed to make it more meaningful. Database dictionaries are used to centralize, document, control, and coordinate the use of data within an organization. A database is a repository of information about a database (also called metadata).

4. Procedures

Procedures refer to the instructions used in a database management system and encompass everything from instructions to setup and install, login and logout, manage the day-to-day operations, take backups of data, and generate reports.

5. Database Access Language

Database Access Language is a language used to write commands to access, update, and delete data stored in a database. Users can write commands using Database Access Language before submitting them to the database for execution. Through utilizing the language, users can create new databases, tables, insert data, and delete data.

What is a Database Management System (DBMS)?

A Database Management System (DBMS) is a well-known term in data analysis. It refers to a collection of programs that enable users to access databases and manipulate, maintain, report, and relate data. A DBMS is often used to reduce data redundancy, share data in a controlled way, and reduce data integrity problems. DBMS is not an information system but is simply software.

The relational model, which saves data in table formats, is the most widely used DBMS. The relational DBMS organizes information into rows, columns, and tables, making it easier to find relevant information. Relational databases are popular because they are easy to extend, and new data categories can be added after the original database is created without large amounts of modification.

The Structured Query Language (SQL) is considered the standard user and application program interface for a relational database, and all relational DBMS software supports SQL. Examples include FileMaker Pro, Microsoft Access, Microsoft SQL Server, MySQL, and Oracle.

More Resources

To keep learning and developing your knowledge of business intelligence and data analysis, we highly recommend the additional resources below:

In order to continue enjoying our site, we ask that you confirm your identity as a human. Thank you very much for your cooperation.

This chapter is from the book 

SQL always deals with data that is in tables. You probably understand tables already on an informal level. The tables used in a relational database have a few unusual features. Because computers need precise definitions, the description of a table must be formalized. In this section, I define what a table is and what its parts are.

1-5 Data is stored in tables

In a relational database, all the data is stored in tables. A table is a two-dimensional structure that has columns and rows. Using more traditional computer terminology, the columns are called fields and the rows are called records. You can use either terminology.

Most people are familiar with seeing information in tables. Bus schedules are usually presented in tables. Newspapers use tables to list stock values. We all know how to use these tables. They are a good way to present a lot of information in a very condensed format. The tables in a relational database are very similar to these tables, which we all understand and use every day.

All the information in a relational database is kept in tables. There is no other type of container to keep it in — there are no other data structures. Even the most complex information is stored in tables. Someone once said that there are three types of data structures in a relational database: tables, tables, and tables. In a relational database, we have nothing but tables; there are no numbers, no words, no letters, and no dates unless they are stored in a table.

You might think that this restricts what a relational database can do and the data it can represent. Is it a limitation? The answer is no. All data is capable of being represented in this format. Sometimes you have to do some work to put it in this format. It doesn’t always just fall into this format by itself. But you can always succeed at putting data into tables, no matter how complex the data is. This has been proven in mathematics. The proof is long and complex and I do not show it to you here, but you can have confidence that tables are versatile enough to handle all types of data.

The following two depictions show a basic table structure and how a table might store information.

A conceptual diagram of a table.

An example of a table that stores information about children.

Each row contains information about one child. Each column contains one type of information for all the children. As always, this table contains only a limited amount of information about each child. It does not say, for instance, how much each child weighs.

1-6 A row represents an object and the information about it

Each row of a table represents one object, event, or relationship. I call them all objects for now, so I do not have to keep repeating the phrase “object, event, or relationship.”

All the rows within a table represent the same type of object. If you have 100 doctors in a hospital, you might keep all the information about them in a single table. If you also want to keep information about 1,000 patients who are in the hospital, you would use a separate table for that information.

The tables in a relational database may contain hundreds or thousands of rows. Some tables even contain many millions of rows. In theory, there is no limit to the number of rows a table can have. In practice, your computer will limit the number of rows you can have. Today, business databases running on large computers sometimes reach billions of rows.

There are also some tables with only one row of data. You can even have an empty table with no rows of data in it. This is something like an empty box. Usually, a table is only empty when you first build it. After it is created, you start to put rows of data into it.

In a relational database, the rows of a table are considered to be in no particular order so they are an unordered set. This is different from the tables most people are familiar with. In a bus schedule, the rows are in a definite and logical order. They are not scrambled in a random order.

Database administrators (DBAs) are allowed to change the order of the rows in a table to make the computer more efficient. In some products, such as Access, this can be done automatically by the computer. As a result, you, the end user seeking information, cannot count on the rows being in a particular order.

A conceptual diagram of a row.

1-7 A column represents one type of information

A column contains one particular type of information that is kept about all the rows in the table. A column cannot, or should not, contain one type of information for one row and another type for another row. Each column usually contains a separate type of information.

Each column has a name, for instance “favorite game,” and a datatype. We discuss datatypes in chapter 6, but for now let’s keep it simple. There are three main datatypes: text, numbers, and dates. This means that there are three types of columns: columns containing text, columns containing numbers, and columns containing dates.

Some columns allow nulls, which are unknown values. Other columns do not allow them. If a column does not allow nulls, then data is required in the column for every row of the table. This means it is a required field. When a column does allow nulls, the field is optional.

Most tables contain 5 to 40 columns. A table can contain more columns, 250 or more, depending on the relational database product you are using, but this is unusual.

Each column has a position within the table. That is, the columns are an ordered set. This contrasts with the rows, which have no fixed order.

Information about the columns — their names, datatypes, positions, and whether they accept nulls — is all considered to be part of the definition of the table itself. In contrast, information about the rows is considered to be part of the data and not part of the definition of the table.

A conceptual diagram of a column.

1-8 A cell is the smallest part of a table

A cell occurs where one row meets with one column. It is the smallest part of a table and it cannot be broken down into smaller parts.

A cell contains one single piece of data, a single unit of information. At least that is the way it is in theory, and this is how you should begin to think about it. In practice, sometimes a cell can contain several pieces of information. In some applications a cell can contain an entire sentence, a paragraph, or an entire document with hundreds of pages. For now we will consider that a cell can contain one of the following:

One word
One letter
One number
One date, which includes the time
A null, which indicates that there is no data in the cell

For the first few chapters of this book, we consider the information in a cell to be atomic, which means that it is a single indivisible unit of information. We gather and arrange information from a table by manipulating its cells. We either use all the information within a cell or we do not use that cell at all. Later, when we discuss row functions, you will see how to use only part of the data from a cell.

A column is a collection of cells. These cells have the same datatype and represent the same type of information. A row is a collection of cells. Together, they represent information about the same object, event, or relationship.

A conceptual diagram of a cell.

1-9 Each cell should express just one thing

Each cell expresses just one thing — one piece of information. That is the intent of the theory of relational databases. In practice, it is not always clear what this means. The problem, partly, is that English and other spoken languages do not always express information clearly. Another part of the problem is that information does not always come in separate units.

Let’s examine one case in detail. A person in America usually has two names — a first name and a last name. Now that is a bit of a problem to me when I want to put information in the computer. There is one person, but there are two names. How should I identify the person? Should I put both names together in one cell? Should I put the names into two separate cells? The answer is not clear.

Both methods are valid. The designers of the database usually decide questions like this. If the database designers think that both names will always be used together, they will usually put both names in a single cell. But if they think that the names will be used separately, they will put each name in a separate cell.

The problem with this is that the way a database is used may change over time, so even if a decision is correct when it is made, it might become incorrect later on.

1-10 Primary key columns identify each row

Most tables contain a primary key that identifies each row in the table and gives it a name. Each row must have its own identity, so no two rows are allowed to have the same primary key.

The primary key consists of several columns of the table. By convention, these are usually the first few columns. The primary key may be one column or more than one. We say that there is only one primary key, even when it consists of several columns, so it is the collection of these columns, taken as a single unit, that is the primary key and serves to identify each row.

The primary key is like a noun because it names the object of each row. The other columns are like adjectives because they give additional information about the object.

A table can only contain a single primary key, even if it consists of several columns. This makes sense because there is no point in identifying a row twice — those identities could conflict with each other. Suppose, for example, that we have a table of employees. Each employee can be identified by an employee number or a Social Security number. The database designers would need to choose which column to make the primary key of the table. They could choose either one to be the primary key of the table, or they could choose to use both together to make a primary key. However, they are not allowed to say that each column by itself is a primary key.

The name of a column is considered to be part of the definition of the table. In contrast, the name of a row, which is the primary key of the row, is considered to be part of the data in the table.

There are two rules that regulate the columns of the primary key of a table:

None of the columns of the primary key can contain a null. This makes sense because a null is an unknown value. Therefore, a null in any part of the primary key would mean we do not know the identity of the object or the row. In databases, we do not want to enter information about unidentified rows.
Each row must have an identity that is different from every other row in the table. That is, no two rows can have the same identity — the same values in all the columns of the primary key. For any two rows of the table, there must be at least one column of the primary key where the values are different.

The first column is usually the primary key of the table.

Sometimes the primary key is the first several columns of the table.

1-11 Most tables are tall and thin

Many books on SQL give the impression that tables are usually square — that they have about the same number of rows as they have columns. This false impression is left because the tables in most SQL books are approximately square. In any book, the tables must be kept small. In a book, when you run SQL code you must be able to examine the results in full detail.

However, the tables that are used in real production systems usually have a different shape. They are tall and thin. They may have 30 columns, but 1,000,000 rows.

Not all tables have this shape, but most do. Some tables have only one row.

I tell you this because I like to visualize the data and the tables I am working with. If you like to visualize them, too, then at least I have provided you with the correct picture. If you are not inclined to visualize these things, do not worry about it. Just go on to the next page.

Most tables have many more rows than columns.