When working with a real dataset, you will see many features that hold different types of data, such as integers, floats, categorical data, date-time data, and so on. Each feature is a variable that can take a different value in each row.
A variable is a characteristic, number, or quantity that can change within the context of a mathematical problem or an experiment. In other words, a variable is something whose value is not fixed and can take different values. It is usually denoted by a letter such as X, Y, a, or b.
For example:
- x = 20
- x = 23
- x = 53
Here, "x" is a variable whose value changes each time a new value is assigned to it.
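The same idea can be sketched in Python. This is only a minimal illustration; the list name `history` is made up here to record each value that x takes.

```python
# The same name can hold different values over time.
x = 20
history = [x]      # remember the first value

x = 23             # reassign x
history.append(x)

x = 53             # reassign x again
history.append(x)

print(history)     # [20, 23, 53]
```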
Different types of variables serve different purposes, and as a data scientist you need to know all of them. Variables are mainly classified into two categories:
- Numerical Variable
- Categorical Variable
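As a rough sketch of how this split looks in practice, pandas can separate columns by their data type. The toy dataset below (column names and values) is invented for illustration, and the data type is only a first hint: a column of numeric codes may still be categorical in meaning.

```python
import pandas as pd

# Hypothetical toy dataset; names and values are invented for illustration.
df = pd.DataFrame({
    "age": [25, 32, 47],
    "height_cm": [170.2, 165.0, 180.5],
    "city": ["Paris", "Delhi", "Lima"],
})

# Columns with a numeric dtype are treated as numerical variables.
numerical_cols = df.select_dtypes(include="number").columns.tolist()

# Everything else is treated as categorical in this simple sketch.
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print(numerical_cols)
print(categorical_cols)
```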
As the name suggests, a numerical variable deals with all types of numeric data, such as integer, float, double, and so on. Numerical variables are further classified into two categories.
Discrete Numerical Variable:
This type of variable deals with whole numbers, that is, integer data. Examples are 1, 2, 3, the number of cars a person has, or the number of family members. The value can't be fractional; it must be a whole number.
Continuous Numerical Variable:
This type of variable can contain any numeric value, whole or fractional. Examples are 2.50, 5.10, the height of a person, or the weight of a person.
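One simple way to tell the two numerical types apart in pandas is to look at the column's data type. The sketch below assumes a made-up dataset and treats integer columns as discrete. This is a heuristic only, since a float column can also happen to hold whole numbers.

```python
import pandas as pd

# Hypothetical data: a count (discrete) and a measurement (continuous).
df = pd.DataFrame({
    "num_cars": [0, 2, 1],
    "height_cm": [170.2, 165.0, 180.5],
})

# Heuristic: an integer dtype suggests a discrete variable,
# while a float dtype suggests a continuous one.
is_discrete = {
    col: pd.api.types.is_integer_dtype(df[col]) for col in df.columns
}

print(is_discrete)
```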
As the name suggests, a categorical variable deals with categories, for example gender or the color of cars. Categorical variables are classified into three categories:
Ordinal Categorical Variable:
These variables have a meaningful order. For example, the days of the week come in a fixed sequence: Monday is the 1st day of the week and Sunday is the 7th.
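In pandas, an ordinal variable can be represented as an ordered categorical. The sketch below follows the Monday-first ordering used above; the names `day_order` and `days_cat` are chosen here for illustration.

```python
import pandas as pd

# The order of this list defines the order of the categories.
day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

days = pd.Series(["Wednesday", "Monday", "Sunday"])
days_cat = pd.Categorical(days, categories=day_order, ordered=True)

# Each value is stored as its position in day_order (0-based).
codes = days_cat.codes.tolist()

# Sorting follows the declared order, not alphabetical order.
sorted_days = pd.Series(days_cat).sort_values().tolist()

print(codes)
print(sorted_days)
```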
Nominal Categorical Variable:
This type of variable has no intrinsic order; it is unordered categorical data. For example, the color of a car can be red, blue, green, black, or something else. There is no order among the color names, so color is a nominal categorical variable.
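A nominal variable can be sketched in pandas as an unordered categorical. In the example below (values invented for illustration), counting the categories works, but asking whether one color is "less than" another raises an error because no order is defined.

```python
import pandas as pd

# Unordered categorical: pandas does not assume any ranking.
colors = pd.Categorical(["red", "blue", "green", "red"])

# Counting how often each category occurs is meaningful.
counts = pd.Series(colors).value_counts().to_dict()

# An order comparison is not meaningful and is rejected by pandas.
try:
    colors < "green"
    comparison_allowed = True
except TypeError:
    comparison_allowed = False

print(counts)
print(comparison_allowed)
```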
Date-Time Variable:
This is a special type of categorical variable that deals with dates and times. It can hold only a date, only a time, or both. This type of data is often used for time series analysis or for finding trends in a dataset.
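As a small sketch, pandas can parse text into date-time values and then split them into date, time, or other parts. The timestamps below are invented for illustration.

```python
import pandas as pd

# Hypothetical raw values stored as text.
raw = pd.Series(["2023-01-15 08:30:00", "2023-02-20 17:45:00"])

# Convert the text to proper date-time values.
timestamps = pd.to_datetime(raw)

# Extract only the date, only the time, or a single component.
dates = timestamps.dt.date.tolist()
times = timestamps.dt.time.tolist()
months = timestamps.dt.month.tolist()

print(dates)
print(times)
print(months)
```

Components such as the month are often the starting point for trend or seasonality analysis.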
Note: There is another special type of discrete variable, the Boolean variable, which takes only the values 0 and 1. Although it represents categories, it is treated as a discrete variable. For instance, male and female are categories, but in a dataset they are sometimes encoded as 0 for male and 1 for female.
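The encoding described in the note can be sketched with a simple mapping. The values and the `encoding` dictionary are assumptions for illustration, following the 0 for male and 1 for female convention mentioned above.

```python
import pandas as pd

# Hypothetical categorical column.
gender = pd.Series(["male", "female", "female", "male"])

# Map each category to 0 or 1, as described in the note.
encoding = {"male": 0, "female": 1}
gender_encoded = gender.map(encoding)

print(gender_encoded.tolist())  # [0, 1, 1, 0]
```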