Profiling data

After loading data from a file or database, it helps to run a few quick checks to confirm that the data makes sense and has no obvious quality issues. This process is often called data profiling. EasyMorph offers a few ways to profile data.

Column Profiler

To open the Column Profiler, double-click a column header. Alternatively, right-click the column header and choose "Filter/Profile". The Column Profiler has three tabs: Values, Profile, and Histogram.

The "Values" tab shows a searchable list of the unique values in the column. You can select particular values and create a filter action from them with a single click. The same one-click filter creation is available throughout the Column Profiler.

The "Profile" tab shows various counts and metadata that help you understand what kinds of values are present in the column.

Note that dates are numbers in EasyMorph (the type system of EasyMorph is explained later in the tutorial). Therefore, the Profiler shows the count of possible dates among the number counts. Each count or metadata metric has a button for quick filtering.
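
For readers who think in code, the idea of counting value kinds in a column can be sketched in Python. This is only an illustration and not how EasyMorph works internally: the function name `profile_column`, the category names, and the `date_range` bounds used to flag "possible dates" are all assumptions made for this example.

```python
from collections import Counter


def profile_column(values, date_range=(20000, 60000)):
    """Count the kinds of values in a column (illustrative sketch only).

    date_range is a hypothetical interval: a number inside it is counted
    as a possible date in addition to being counted as a number.
    """
    counts = Counter()
    low, high = date_range
    for v in values:
        if v is None or v == "":
            counts["empty"] += 1
        elif isinstance(v, bool):
            counts["boolean"] += 1
        elif isinstance(v, (int, float)):
            counts["number"] += 1
            # A possible date is still a number, so it is counted twice.
            if low <= v <= high:
                counts["possible_date"] += 1
        else:
            counts["text"] += 1
    counts["unique"] = len(set(values))
    return dict(counts)


column = [44197, 44198, 3.5, "n/a", "", None, 44197]
profile = profile_column(column)
print(profile)
```

In this sketch, possible dates are a subset of the numbers, which mirrors why the Profiler lists possible dates among the number counts.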

The "Histogram" tab displays the distribution of numbers in the column. Resize the Profiler window to increase the number of bins in the histogram. To filter a particular range of numbers, set the range with the sliders and click the "Filter range" button.
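
As a rough illustration of what binning and range filtering mean, here is a minimal Python sketch. It assumes equal-width bins, which is a choice made for this example and not a statement about how EasyMorph builds its histogram; the names `histogram` and `filter_range` are hypothetical.

```python
def histogram(numbers, bins):
    """Count how many numbers fall into each of `bins` equal-width bins."""
    lo, hi = min(numbers), max(numbers)
    # Fall back to a width of 1 when all values are equal.
    width = (hi - lo) / bins or 1
    counts = [0] * bins
    for x in numbers:
        # The maximum value is placed in the last bin.
        i = min(int((x - lo) / width), bins - 1)
        counts[i] += 1
    return counts


def filter_range(numbers, low, high):
    """Keep only the numbers inside the inclusive range [low, high]."""
    return [x for x in numbers if low <= x <= high]


data = [1, 2, 2, 3, 5, 8, 13, 21]
coarse = histogram(data, 2)      # few bins: little detail
fine = histogram(data, 4)        # more bins: finer detail
kept = filter_range(data, 2, 8)
print(coarse, fine, kept)
```

More bins show the shape of the distribution in finer detail, which is the effect of enlarging the Profiler window.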

Hint: You can click other columns without closing the Column Profiler. When another column is clicked, its metadata is automatically displayed in the Profiler window.

Read next: Applying transformations