What is keys in sql?

Question

Lotus Aster · Answer

Pre-Requisite: DBMS | Relational Model Introduction and Codd Rules

Keys are one of the basic requirements of a relational database model. It is widely used to identify the tuples(rows) uniquely in the table. We also use keys to set up relations amongst various columns and tables of a relational database.

1. Candidate Key: The minimal set of attributes that can uniquely identify a tuple is known as a candidate key. For Example, STUD_NO in STUDENT relation.

Example:

Table STUDENT

Example:

Table STUDENT_COURSE

Note: In SQL Server a unique constraint that has a nullable column, allows the value ‘null‘ in that column only once. That’s why the STUD_PHONE attribute is a candidate here, but can not be a ‘null’ value in the primary key attribute.

2. Primary Key: There can be more than one candidate key in relation out of which one can be chosen as the primary key. For Example, STUD_NO, as well as STUD_PHONE, are candidate keys for relation STUDENT but STUD_NO can be chosen as the primary key (only one out of many candidate keys).

Example:

Table STUDENT

3. Super Key: The set of attributes that can uniquely identify a tuple is known as Super Key. For Example, STUD_NO, (STUD_NO, STUD_NAME), etc. A super key is a group of single or multiple keys that identifies rows in a table. It supports NULL values.

Example:

4. Alternate Key: The candidate key other than the primary key is called an alternate key.

Example:

5. Foreign Key: If an attribute can only take the values which are present as values of some other attribute, it will be a foreign key to the attribute to which it refers. The relation which is being referenced is called referenced relation and the corresponding attribute is called referenced attribute the relation which refers to the referenced relation is called referencing relation and the corresponding attribute is called referencing attribute. The referenced attribute of the referenced relation should be the primary key to it.

Example:

Table STUDENT_COURSE

It may be worth noting that, unlike the Primary Key of any given relation, Foreign Key can be NULL as well as may contain duplicate tuples i.e. it need not follow uniqueness constraint. For Example, STUD_NO in the STUDENT_COURSE relation is not unique. It has been repeated for the first and third tuples. However, the STUD_NO in STUDENT relation is a primary key and it needs to be always unique, and it cannot be null.

6. Composite Key: Sometimes, a table might not have a single column/attribute that uniquely identifies all the records of a table. To uniquely identify rows of a table, a combination of two or more columns/attributes can be used. It still can give duplicate values in rare cases. So, we need to find the optimal set of attributes that can uniquely identify rows in a table.

Example:

sxspo Bisht · Answer

The PRIMARY KEY constraint uniquely identifies each record in a table. Primary keys must contain UNIQUE values, and cannot contain NULL values. A table can have only ONE primary key; and in the table, this primary key can consist of single or multiple columns (fields).

Geir Verdon · Answer

Databases are used to store massive amounts of information stored across multiple tables. Each table might be running into thousands of rows. Needless to say, there will be many duplicate rows with redundant information. How do we deal with that? How do we manage records so that we store only unique data? And how do we relate the multiple tables that are present in the database?

SQL keys are the answer to all these queries.

SQL keys ensure that there are no rows with duplicate values. Not only that, but they also help in establishing a relationship between multiple tables in the database. Therefore, it becomes imperative to learn about the different keys in SQL.

Super Key can contain multiple attributes that might not be able to identify tuples in a table independently, but when grouped with certain keys, they can identify tuples uniquely.

Let me take an example to clarify the above statement. Have a look at the following table with the schema employees(Id, Name, Gender, City, Email, Dep_Id)

Consider that the Id attribute here corresponds to the employee id. It is unique to every employee at the table. In that case, we can say that the Id attribute can uniquely identify the tuples of this table. So, Id is a Super key of this table. Note that we can have other Super Keys, too, in this table.

For instance – (Id, Name), (Id, Email), (Id, Name, Email), etc. can all be Super keys as they can all uniquely identify the tuples of the table. This is so because of the presence of the Id attribute, which can identify the tuples uniquely. The other attributes in the keys are unnecessary. Nevertheless, they can still identify tuples.

A candidate key is a column or a combination of columns that uniquely identifies each row in a table. It is used to ensure that there are no duplicate or ambiguous records in a table.

A Candidate key is a subset of Super keys and devoid of unnecessary attributes that are not important for uniquely identifying tuples. For this reason, you can also call the Candidate key a minimal Super key.

The value for the Candidate key is unique and non-null for all tuples. It encapsulates two important constraints – the unique key constraint and the not null constraint. This ensures that values in the Candidate key do not contain any duplicate values. And every table has to have at least one Candidate key. But there can be more than one Candidate Key too.

For example, in the example that we took earlier, both Id and Email can act as a Candidate for the table as they contain unique and non-null values.

On the other hand, we cannot use the attributes like City or Gender to retrieve tuples from the table, as they have no unique values.

Whereas querying the table on the Id attribute will help us to retrieve unique tuples.

Out of all the Candidate keys that can be possible for a table, only one key will be used to retrieve unique tuples from the table. This Candidate key is called the Primary Key.

There can be only one Primary key for a table. Depending on how the Candidate Key is constructed, the primary key can be a single attribute or a group of attributes. But the important point is that the Primary key should be a unique and non-null attribute(s).

There can be two ways to create a Primary key for the table. The first way is to alter an already created to add the Primary key constraint on an attribute. This is shown below:

Here, I have chosen the Id as the primary key attribute.

Now if I try to add a new row with a duplicate Id value, it gives me an error message.

The second way of adding a Primary key is during the creation of the table itself. All you have to do is add the Primary Key constraint at the end after defining all the attributes in the table.

To define a Primary Key constraint on multiple attributes, you can list all the attributes in the parenthesis, as shown below.

But remember that these attributes should be defined as non-null values; otherwise, the whole purpose of using the Primary key to identify tuples uniquely gets defeated.

Note: Knowing the difference between the Candidate key and the Primary key is extremely important as it is a popular interview question!

There can be only one Primary key for a table. Therefore all the remaining Candidate keys are known as Alternate or Secondary keys. They can also uniquely identify tuples in a table, but the database administrator chose a different key as the Primary key.

If we look at the Employee table once again, since I have chosen Id as the Primary key, the other Candidate Key (Email) becomes the Alternate key for the table.

A Foreign key generates a relationship between the parent and host tables. For example, in addition to the Employee table containing the employees’ personal details, we might have another table, Department containing information related to the department of the employee with the following schema Department(Id, Name, Location).

The Primary key in this table is the Department Id. We can add this attribute to the Employee by making it the Foreign key in the table. We can either do this when we are creating the table or we can alter the table later to add the Foreign Key constraint. Here I have altered the table, but creating Foreign Key during table creation is similar to that for Primary Key.

Here, Dep_Id is now the Foreign Key in table, Employee while it is a Primary Key in the Department table.

The Foreign key allows you to create a relationship between two tables in the database, thereby ensuring normalization in relational databases. Each of these tables describes data related to a particular field (employee and department here). Using the Foreign key, we can easily retrieve data from both tables.

Note: To operate on Foreign keys, you need to know about Joins, which you can find out in detail in this article.

Using Foreign keys makes it easier to update the database when required. This is so because we only have to make the necessary changes in limited rows. For example, if the Marketing department shifts from Kolkata to Pune, instead of updating it for all the relevant rows in the Employee table, we can simply update the location in the Department table. This ensures that there are only a few places to update and less risk of having different data in different places.

The concept of Foreign key is fundamental to understanding the referential integrity constraint in RDBMS or relational databases. Referential integrity ensures that any column which is declared as a foreign key in a table can contain only null values or the values which are present in the primary key of the base table. For example, the Dep_Id foreign key in the Employees table can contain only the department ids present in the Department table or the null value. Anything beyond that would break the referential integrity in a relational model.

Sometimes it is possible that no single attribute will have the property to identify tuples in a table uniquely. In such cases, we can use a group of attributes to guarantee uniqueness. Combining these attributes will uniquely identify tuples in the table.

Consider the following table:

Here, neither of the attributes contains unique values to identify the tuples. Therefore, we can combine two or more attributes to create a key uniquely identifying the tuples. For example, we can group Transaction_Id and Product_Id to create a key that can uniquely identify the tuples. These are called composite keys.

In this article, we understood the importance of keys in databases. We covered the most common and widely used SQL keys that any professional looking to work with databases should know about. We also saw how to implement each of those keys in SQL and understood the differences between each type of key.

I hope this gave you a good background on keys in SQL, whether you are from computer science, data science, data analysis, or even data engineering background.

Key Takeaways

If you are looking to work with SQL in Python, I suggest going through this article. Or if you are someone who is looking for some SQL techniques to employ for better data analysis, then you shouldn’t miss this great article.

Vega Ahmed · Answer

Super Key. A super key is a set of one or more than one key that can be used to identify a record uniquely in a table. Candidate Key. Primary Key. Alternate key. Composite/Compound Key. Unique Key. Foreign Key.

Madalen Marcoux · Answer

In SQL, keys are fields in a table that are used to identify specific row(s) in a table and also to find or create relationship between tables.

Keys are considered useful because of the following

SQL supports various types of keys:

Primary Key, Candidate Key, Unique Key, Composite Key, Super Key, Alternate Key, Foreign Key

A key selected to identify each record uniquely in a table. Columns marked as primary keys aren’t allowed to have null values. It keeps unique values throughout the column.

A table is expected to have only one primary key, although there are cases where a table can derive its primary key using two columns.

Hint: See below — Composite key.

Primary keys can be defined using the create statement when creating a table

Another way to create a table with the primary key

Suppose we didn’t define the primary key when the table was created, we can define it later like this:

Candidate key is a key of a table which can be used as the primary key. A table can consist of multiple candidate keys, and out of these, one can be selected as a primary key. Any column that can be selected as primary key and it still makes sense, that column or columns are candidate keys.

Id, registration_number, ssn are candidate keys. We have identified id to serve as the primary key. Should we drop/remove id from the table, registration_number or ssn can serve as primary key. This is because each attribute is unique to each student. It wouldn’t be right to have two students that have the same registration number or social security number (ssn).

Unique key can identify each row in a table uniquely like a Primary Key. But, unlike a primary key, a unique key can have only a single null value and it does not allow for duplicate values in the column. In the students table, registration_nos is identified as the unique key. We can’t have two students having the same registration number.

When creating a table, situations may arise where a single column doesn’t provide enough unique information to serve as primary key. In this case, two columns in the table would be combined to serve as the primary key.

When the primary key in the table consists of two or more columns, we have a composite key. Composite key can also be called compound or concatenated key.

None of these columns alone can play a role as primary key.

customer_id alone cannot become a key as the same customer can place multiple orders, thus the same customer can have multiple entries.

order_id alone cannot be a primary key as a same order can contain the order of multiple products, thus same order_id can be present multiple times.

product_code cannot be a primary key as more than one customer can place order for the same product.

Based on this, it is safe to assume that the primary key should be having more than one attribute: Primary Key in above table: {customer_id, product_code}

It is a set of columns that uniquely identifies each row in a table. It is the combination of two or more column that can be used to identify a record uniquely in a table.

Alternate key is a candidate key, currently not selected as primary key. It can work as a primary key, but currently isn’t the primary key. Alternate keys can also be called secondary keys.

ssn, registration_number in the above defined student table are examples of an alternate key.

Foreign Key is a column in a table that is used as the Primary key in another table. It can accept multiple nulls and duplicate values. In the students table, the dept_id column identifies as the foreign key, because it is the primary key in the department table and it relates each student to a department (Each student must belong to a department).

Knowledge of keys in relational databases, how they are defined and used is an important concept for any SQL developer or administrator.

In this article, we have looked at the different types of keys and how the relate to each other.

I do believe you now have an understanding of the different types of keys and how they are used.

Please leave your comments and reviews, it would be helpful to me.

Ask Sawal

What is keys in sql?

Related Questions

More Questions

Contact