Profiling data

After loading data from a file or database, you may want to run a few checks to confirm that the data makes sense and has no quality issues. If you do find quality issues, you may want to investigate more thoroughly in order to understand the scale and patterns of the problem. This process is frequently called data profiling. EasyMorph provides comprehensive means to profile data:

  • The "Cell Metadata" dialog is used to profile cell values.
  • The "Column Profiler" dialog is for profiling individual columns.
  • Finally, the "Analysis View" is a powerful tool for associative filtering and analyzing relationships between table values.

Cell Metadata

The "Cell Metadata" dialog is invoked by right-clicking a cell and choosing "Cell metadata". It displays the cell value's data type and additional metadata. For instance, in the screenshot below, the cell profiler shows that the cell value is actually text, not a number.

Cell profiler

Note that the dialog is floating — you can keep it open while clicking different cells.
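The text-versus-number distinction can also be illustrated outside EasyMorph. The following is a minimal Python sketch, not EasyMorph's implementation: the helper name `describe_cell` and the fields it returns are hypothetical and chosen only for illustration. It reports the type of a single value and flags text that merely looks like a number.

```python
def describe_cell(value):
    """Return simple metadata for one cell value (hypothetical helper)."""
    # bool is checked before int because bool is a subclass of int in Python.
    if value is None:
        kind = "empty"
    elif isinstance(value, bool):
        kind = "boolean"
    elif isinstance(value, (int, float)):
        kind = "number"
    elif isinstance(value, str):
        kind = "text"
    else:
        kind = "other"

    # A text value may still look like a number, e.g. "42" loaded from a CSV file.
    looks_numeric = False
    if isinstance(value, str):
        try:
            float(value.strip())
            looks_numeric = True
        except ValueError:
            pass

    return {
        "type": kind,
        "looks_numeric": looks_numeric,
        "length": len(value) if isinstance(value, str) else None,
    }


text_cell = describe_cell("42")    # text that only looks like a number
number_cell = describe_cell(42)    # a genuine number
```

In this sketch, `"42"` is reported as text with `looks_numeric` set to true, while `42` is reported as a number. This is the kind of mismatch that the "Cell Metadata" dialog makes visible.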

Column Profiler

The Column Profiler is invoked by double-clicking a column header. Alternatively, right-click the column header and choose "Filter/Profile".

Column profiler

The tab "Values" shows a list of unique values in the column. The list is searchable. Also, you can select particular values and create a filter action with them right from the profiler with a single click.

The tab "Profile" shows various counts and metadata that help you understand what kinds of values are present in the column.

Column profiler
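To make the idea of a column profile concrete, here is a rough Python sketch of the kind of counts such a profile could contain. It is an illustration under stated assumptions, not a description of how EasyMorph computes its metrics: the function `profile_column`, the category names, and the sample column are all hypothetical.

```python
from collections import Counter


def profile_column(values):
    """Count values by broad type and count distinct values (hypothetical helper)."""
    by_type = Counter()
    for v in values:
        if v is None or v == "":
            by_type["empty"] += 1
        elif isinstance(v, bool):
            by_type["boolean"] += 1
        elif isinstance(v, (int, float)):
            by_type["number"] += 1
        elif isinstance(v, str):
            by_type["text"] += 1
        else:
            by_type["other"] += 1

    # Pair each value with its type name so that the text "10" and the
    # number 10 are counted as different unique values.
    unique = {(type(v).__name__, v) for v in values}

    return {"total": len(values), "unique": len(unique), "by_type": dict(by_type)}


# Made-up sample column mixing numbers, text, and empty values.
sample = ["10", 10, 12.5, "", None, "N/A", 10]
profile = profile_column(sample)
```

For the sample column, the sketch counts 7 values in total, 6 of them unique, split into 3 numbers, 2 text values, and 2 empty values. A mixed result like this is a typical sign that a column needs cleaning before further transformation.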

Note that dates are numbers in EasyMorph (the type system of EasyMorph is explained later in the tutorial). Therefore, the Profiler shows counts for possible dates among the number counts. Each count/metadata metric has a button for quick filtering.

Hint: The "Column Profiler" dialog is floating too. When the header of another column is clicked, the column metadata is automatically displayed in the Profiler window.

Watch a relevant video in: French, Spanish.

Advanced topics

Analysis View

When you maximize a table, EasyMorph automatically switches to the Analysis View. To maximize a table, double-click its title bar or click the "Maximize" button on the title bar.

Maximize table

The Analysis View lets you instantly filter the result dataset of any action in a table. To create an instant filter for a column, drag the column's header into the filtering pane (turned on by default). Instant filters are searchable and sortable, and they retain selections when you switch between actions. This makes them well suited to exploratory data analysis and profiling.

Analysis View

Hint: In the Analysis View you can still add/remove actions, and edit action properties in the left sidebar (collapsed by default).

Table metadata

To see the table metadata summary, press the "Table metadata" button in the ribbon menu of the Analysis View. The summary shows the column profiles (the same as in the Column Profiler described above) for all columns in the current dataset. Note that all numbers in the metadata summary table are clickable. When you double-click a number (or Ctrl + double-click for exclusion), the corresponding selection is immediately applied in the respective instant filter.

Table metadata summary

To exit the Analysis View, press the "Exit Analysis View" button in the ribbon menu, or simply double-click the table's title bar again.

Read next: Applying transformations