Research/Blog
Feature Engineering vs Feature Scaling
- February 22, 2018
- Posted by: CellStrat
- Category: Artificial Intelligence, Machine Learning
What is Feature Engineering?
Feature engineering is an informal topic, and there are many possible definitions. We define feature engineering as the creation of new features from existing ones in order to improve model performance.
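As a minimal sketch of this definition, the snippet below derives two new features from a small, purely hypothetical housing table (the column names are illustrative, not from any real dataset):

```python
import pandas as pd

# Hypothetical housing data -- columns and values are illustrative only.
df = pd.DataFrame({
    "price": [250000, 420000, 180000],
    "sqft": [1000, 2100, 900],
    "bedrooms": [2, 4, 2],
})

# Feature engineering: build new features out of existing ones.
df["price_per_sqft"] = df["price"] / df["sqft"]        # ratio feature
df["sqft_per_bedroom"] = df["sqft"] / df["bedrooms"]   # interaction-style feature
```

Ratio and interaction features like these often expose relationships (e.g. price per unit area) that a model would otherwise have to learn from the raw columns on its own.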
A typical data science process might look like this:
- Project Scoping / Data Collection
- Exploratory Analysis
- Data Cleaning
- Feature Engineering
- Model Training (including cross-validation to tune hyper-parameters)
- Project Delivery / Insights
What does not count as Feature Engineering?
Initial data collection is not feature engineering.
Creating the target variable is not feature engineering.
Removing duplicates, handling missing values, and fixing mis-labelled classes are not feature engineering; these tasks fall under data cleaning.
Scaling or normalization is not feature engineering because these steps belong inside the cross-validation loop (i.e. after we’ve already built our analytical base table).
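To illustrate why scaling belongs inside the cross-validation loop, here is a sketch (using NumPy and randomly generated data) of fitting the scaling parameters on the training fold only, so that no information from the validation fold leaks into them:

```python
import numpy as np

# Synthetic data standing in for one cross-validation split.
rng = np.random.default_rng(0)
X = rng.normal(loc=10.0, scale=3.0, size=(100, 2))
X_train, X_val = X[:80], X[80:]

# Fit the scaling parameters on the training fold ONLY...
mu = X_train.mean(axis=0)
sigma = X_train.std(axis=0)

# ...then apply them to both folds. The validation fold never
# influences mu or sigma, so nothing leaks across the split.
X_train_scaled = (X_train - mu) / sigma
X_val_scaled = (X_val - mu) / sigma
```

Computing mu and sigma over the full dataset before splitting would let validation rows influence the transform applied to the training rows, which is exactly the leakage that keeping scaling inside the loop avoids.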
Lastly, feature selection is not feature engineering; it, too, belongs inside the cross-validation loop.
What is Feature Scaling?
Feature scaling is a method used to standardize the range of independent variables or features of data. In data processing, it is also known as data normalization and is generally performed during the data pre-processing step.
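The two most common forms of feature scaling can be sketched in a few lines of NumPy (the sample values are arbitrary):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Min-max scaling: rescales the feature to the [0, 1] range.
x_minmax = (x - x.min()) / (x.max() - x.min())

# Standardization (z-score): zero mean and unit variance.
x_zscore = (x - x.mean()) / x.std()
```

Min-max scaling preserves the shape of the original distribution within a fixed range, while standardization centers the feature and expresses each value in units of standard deviations.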