Data Catalog basics

Pre-requisites

The actionable Data Catalog (or just Catalog) is a feature of EasyMorph Server. It can be accessed from EasyMorph Desktop or via a web browser (with limited functionality).

If you are a Desktop user, make sure you have configured the Server Link in your EasyMorph Desktop (see the tutorial chapter "Server Link" for more details).

If you're an EasyMorph Server administrator, enable the Data Catalog in the Server space settings (it may be disabled by default).

Overview

The Data Catalog lists various data assets that your team works with on a daily basis: reference lists, tabular datasets, data extracts, reports, etc. The Catalog can be seen as a "data supermarket" where you can find, retrieve, and manipulate any of these data assets no matter where they are located. For instance, your tabular data can be in a database, reference lists can be in SharePoint, CSV datasets can be on Google Drive, and so on. Nevertheless, you can access all these assets from the Catalog. Note that the Catalog doesn't store the assets. You can think of the Catalog as a library of "smart bookmarks" that lead to various data used in your organization. Here is what it looks like:

Data Catalog items

In the Catalog, assets are organized into hierarchical directories (essentially, virtual folders). Each directory can contain items (assets) and sub-directories (see the image above). You can perform various operations on your data assets right from the Catalog. The available operations depend on the asset type. For instance:

  • Metrics can be displayed
  • Datasets can be retrieved and, in some cases, edited
  • Web resources (such as BI reports) can be opened
  • Files can be downloaded

Let's look closer at various asset types available in the Catalog.

Metrics

A metric is a number (such as a key performance indicator, KPI) that you or your colleagues might be interested in. In the Catalog, metrics are easy to create and modify. You can create metrics in EasyMorph Desktop and also in the web UI of EasyMorph Server. To create a metric, go to the Catalog, press the "Add item(asset)" button, and choose "Metric".

Metrics in EasyMorph Desktop

Under the hood, the value of each metric is stored in Shared Memory, so you don't need an external database or file to store it. You can modify (update) the value of a metric manually, or in a workflow, using an action.

To modify the value of a metric manually, just open its settings and press the "Change value" button.

To modify the value of a metric in a workflow, use the "Set metric value" command of the "Catalog command" action. It is also possible to change the value of a metric by changing the associated shared memory value directly, using the "Shared Memory" action, although this is not recommended.

Datasets

Datasets are, basically, unformatted tables (a table can have one or many columns, each with a header and data). Any business data that can be represented in a tabular form can be a dataset in the Catalog: lists of orders, customers, inventory, transactions, employees, products, and accounts, to name a few. Technically, such data can be stored in various forms: spreadsheet tables, database tables, text (CSV) files, records in enterprise applications, SharePoint lists, Salesforce objects, and so on. However, where the data is actually stored doesn't matter. In the Catalog, every dataset has a standard look and feel no matter how it's stored because the Catalog abstracts away the source of data from the user. Therefore, with the Catalog, you can retrieve any business data regardless of its location and source format and regardless of your technical skills.

How does it do it?

When you find the necessary dataset in the Catalog and press the "Retrieve" button (visible in the image above), EasyMorph Server runs a workflow under the hood that retrieves the requested data on the fly. The Catalog then delivers the data from the Server to you and displays it in the Dataset Viewer (see image below), where you can explore it, analyze it, and export it to a spreadsheet or one of the other supported file formats. In a way, computed Catalog items can be seen as "smart" Server tasks with a result table that is automatically delivered to the user when the task finishes.

Therefore, datasets in the Catalog are dynamic in the sense that they are computed on the fly when retrieved. Also, because EasyMorph workflows can be parameterized, you can specify parameters (such as start and end dates) when retrieving a dataset. Parameters can reduce the retrieved data down to only the needed records. Excluding unnecessary data helps retrieve data faster and reduce the workload on the underlying systems.
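The Catalog itself is configured visually, without code, but the idea of a parameterized, computed dataset can be sketched in a few lines of Python. This is only a conceptual illustration, not EasyMorph's implementation: the sample data, the function name `retrieve_orders`, and its parameters are all hypothetical.

```python
from datetime import date

# Hypothetical sample rows standing in for a source system (not real EasyMorph data).
ORDERS = [
    {"order_id": 1, "order_date": date(2024, 1, 5), "amount": 120.0},
    {"order_id": 2, "order_date": date(2024, 2, 10), "amount": 75.5},
    {"order_id": 3, "order_date": date(2024, 3, 1), "amount": 210.0},
]


def retrieve_orders(start_date, end_date):
    """Compute the dataset on demand, keeping only rows inside the date range."""
    return [
        row for row in ORDERS
        if start_date <= row["order_date"] <= end_date
    ]


# Only the rows that match the parameters are returned to the caller.
result = retrieve_orders(date(2024, 2, 1), date(2024, 3, 31))
print([row["order_id"] for row in result])  # prints [2, 3]
```

In a real setup, the filtering would typically be pushed down to the source system (for example, as a database query condition), which is what reduces the workload on it.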

Dataset Viewer

Dataset viewer

When you retrieve a dataset, it opens in the Dataset Viewer. Here, you can do quite a lot with the data:

  • Export data to an Excel spreadsheet, a CSV file, or a few other formats
  • Filter rows (drag a column header into the filtering area above)
  • Sort rows
  • Open it in the Workflow Editor for data transformation
  • Re-retrieve the dataset with different parameters (e.g. for another date range)
  • Update Catalog item properties, such as field metadata and description
  • Trigger commands and retrieve related data (to be explained later in this tutorial)

Filtering is an extremely powerful feature of the Dataset Viewer. It allows finding records, exploring relationships in data, and identifying data quality issues. It is especially convenient that the filters show not only the column values included in the current selection, but also the excluded ones. To add a filter, click a column header and drag it into the filtering area (above the table). Select one or more values in the filter, and press "Apply".

Dataset filtering

Hint: To quickly create an instant filter with a table value, simply double-click it in the table. To quickly create an instant filter that excludes a table value, Ctrl + double-click the value.
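To make the "included and excluded values" idea concrete, here is a minimal Python sketch of a value filter on one column. It is a conceptual illustration only, not how the Dataset Viewer is implemented; the sample rows and the helper names `filter_rows` and `filter_values` are hypothetical.

```python
def filter_rows(rows, column, selected):
    """Keep the rows whose value in `column` is among the selected values."""
    return [row for row in rows if row[column] in selected]


def filter_values(rows, column, selected):
    """Split a column's distinct values into included and excluded lists."""
    distinct = sorted({row[column] for row in rows})
    included = [value for value in distinct if value in selected]
    excluded = [value for value in distinct if value not in selected]
    return included, excluded


# Hypothetical sample table.
ROWS = [
    {"customer": "Acme", "region": "East"},
    {"customer": "Bolt", "region": "West"},
    {"customer": "Core", "region": "East"},
    {"customer": "Dune", "region": "North"},
]

selected = {"East"}
kept = filter_rows(ROWS, "region", selected)
included, excluded = filter_values(ROWS, "region", selected)
print(len(kept), included, excluded)  # prints 2 ['East'] ['North', 'West']
```

Seeing the excluded values next to the included ones is what makes it easy to spot unexpected or misspelled entries in a column.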

Web pages

Besides datasets, you can add references to various web resources (pages) to the Catalog. Retrieving a web resource is different from retrieving a dataset (described above). When you retrieve a web resource, EasyMorph opens the requested web page in the default web browser on your computer. You can add pretty much any web resource to the Catalog. Just a few examples:

  • Business Intelligence reports (e.g. Power BI or Tableau)
  • Corporate wiki pages (e.g. Confluence)
  • Government portals
  • Web applications (e.g. Google Sheets)

Since every web resource has a web address (a.k.a. URL), the Catalog can also calculate it on the fly using a visual EasyMorph workflow. This provides a lot of flexibility: one Catalog item can lead to many web pages depending on the provided values of the workflow parameters. For instance, when opening a Power BI report, its date range can be preset from the computed web address using URL parameters. As a result, the BI report will display exactly the data you need.
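As a rough illustration of computing a web address from parameters, consider the Python sketch below. In the Catalog this is done with a visual workflow rather than code; the base address and the query parameter names `from` and `to` are hypothetical, and real BI tools define their own URL parameter syntax.

```python
from urllib.parse import urlencode

# Hypothetical report address; not a real endpoint.
BASE_URL = "https://bi.example.com/reports/sales"


def build_report_url(start_date, end_date):
    """Build a report URL whose query string carries the requested date range."""
    query = urlencode({"from": start_date, "to": end_date})
    return f"{BASE_URL}?{query}"


print(build_report_url("2024-01-01", "2024-01-31"))
# prints https://bi.example.com/reports/sales?from=2024-01-01&to=2024-01-31
```

The same item can therefore open a different view of the report each time, depending only on the parameter values supplied at retrieval.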

Files

Finally, you can retrieve any file with the Catalog. This can be convenient if your organization works with many uniform files (PDF invoices, images, ZIP archives, etc.) that need to be retrieved occasionally. When you retrieve a file item from the Catalog, the file is delivered and saved to the local folder that you specify. Just like with datasets and web pages, the location and path of the requested file can be calculated dynamically with an EasyMorph workflow depending on the provided parameters.
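A dynamically calculated file path might look like the following Python sketch. It only illustrates the concept; the storage root, the folder layout, and the file naming scheme are hypothetical and not something the Catalog prescribes.

```python
from pathlib import PurePosixPath

# Hypothetical storage root and naming scheme for invoice files.
INVOICE_ROOT = PurePosixPath("/data/invoices")


def invoice_path(year, invoice_number):
    """Derive the location of an invoice PDF from the retrieval parameters."""
    return INVOICE_ROOT / str(year) / f"INV-{invoice_number:06d}.pdf"


print(invoice_path(2024, 1234))  # prints /data/invoices/2024/INV-001234.pdf
```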

Advanced topics

Static items

All three main Catalog item types (datasets, web resources, and files) can be static. A static item is not computed with a workflow; instead, its result is either hardcoded or pre-calculated. A static dataset is simply a .dset file stored on the Server. When retrieved, it opens in the Dataset Viewer. A static web resource is a constant (hardcoded) URL. Finally, a static file is any file that is stored on the Server. It's downloaded to the user's computer when retrieved.

Charts in Dataset Viewer

To help understand data, a dataset can come with charts that are also displayed in the Dataset Viewer. To open the chart pane, press the "Chart pane" button (1) on the main toolbar. The chart pane can contain one or multiple charts and cross-tables that are updated instantly when a filter is applied or changed.

Chart pane in Dataset viewer

To add or edit charts and cross-tables, press the "Add" and "Edit" buttons (2) in the chart pane toolbar. You can drag column headers right into the dropzones (3) of a chart or a cross-table. If you wish to save the current configuration of filters and charts, press the "Update item" button (4) and tick the "Update filters and charts configuration" checkbox.

Regular User license

While holders of the Professional User license can create and retrieve any Catalog items, providing a Professional User license to everyone who needs access to the Catalog can be overly expensive. For users who only consume (retrieve) Catalog items, there is a special license: "Regular User". Holders of the Regular User license can retrieve any Catalog item as well as trigger workflows. Additionally, if permitted by the Server administrator, they can add static items (explained above) to the Catalog. Unlike Professional User licenses, the Regular User license covers an unlimited number of users per space rather than being issued per person. For more details, see Catalog Add-On pricing.

The "Retrieve Catalog item" action

Catalog items can be used in workflows too. The "Retrieve Catalog item" action imports datasets from "Dataset" items, downloads "File" items, and retrieves URLs from "URL" items.

The "Catalog command" action

Maintenance of Catalog items can be automated using the "Catalog command" action. The action can update descriptions of Catalog items, add and delete fields, and update field names and descriptions. It can also move items between directories, create new directories, and delete existing ones.
