From Data Wrangling to Feature Engineering

Every year, insights from business analytics and machine learning (ML) play a larger role in how organizations solve business problems with data. However, a business dashboard or an ML model's predictions are only as valuable as the quality of the data behind them.

Building high-quality data sets is a multi-step process known as data wrangling, which includes cleaning, mapping, and transforming data into a workable format.

These activities commonly involve the following:

  • Merging multiple data sources into a single data set
  • Identifying gaps in the data (for example, empty cells in a table) and either filling them or deleting the affected records
  • Deleting data that’s unnecessary or irrelevant to the project at hand, such as duplicate records
  • Identifying extreme outliers in the data
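
As a rough illustration, the four activities above could be sketched in plain Python as follows. Everything here is hypothetical and not taken from the ebook: the sample records, the field names (customer_id, amount, region), the choice to fill gaps with the median, and the 1.5 × IQR rule for flagging outliers are all assumptions chosen to keep the example small.

```python
from statistics import median, quantiles

# Hypothetical sample data: two sources that share a customer_id key.
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 2, "amount": None},   # a gap (empty cell)
    {"customer_id": 3, "amount": 25.0},
    {"customer_id": 3, "amount": 25.0},   # an exact duplicate
    {"customer_id": 4, "amount": 22.0},
    {"customer_id": 5, "amount": 900.0},  # an extreme value
]
regions = {1: "EMEA", 2: "APAC", 3: "AMER", 4: "EMEA", 5: "APAC"}


def wrangle(orders, regions):
    """Merge, fill gaps, drop duplicates, and flag outliers (illustrative only)."""
    # 1. Merge: attach the region from the second source to each order.
    merged = [{**row, "region": regions.get(row["customer_id"])} for row in orders]

    # 2. Fill gaps: replace missing amounts with the median of the known ones
    #    (an assumed strategy; deleting the rows is the other option).
    known = [r["amount"] for r in merged if r["amount"] is not None]
    fill_value = median(known)
    filled = [
        {**r, "amount": fill_value if r["amount"] is None else r["amount"]}
        for r in merged
    ]

    # 3. Remove duplicates while preserving the original order.
    seen = set()
    deduped = []
    for r in filled:
        key = (r["customer_id"], r["amount"], r["region"])
        if key not in seen:
            seen.add(key)
            deduped.append(r)

    # 4. Flag (not delete) outliers with the 1.5 * IQR rule.
    amounts = [r["amount"] for r in deduped]
    q1, _, q3 = quantiles(amounts, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [{**r, "is_outlier": not (low <= r["amount"] <= high)} for r in deduped]


clean = wrangle(orders, regions)
```

In this sketch outliers are only flagged rather than removed, since the list above speaks of identifying them; whether to drop, cap, or keep them is a project-specific decision.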

This ebook describes how analytics and data science teams can maximize efficiency by leveraging a cloud data platform to unify and govern both data wrangling and feature engineering activities.