Loading data from files

There are three ways to load data from files into EasyMorph:
  • Load one file at a time
  • Load multiple uniform files
  • Load a list of files

Loading a file

To load a file, press the "Add data" button, select "Load file", and choose the type of file you want to load.

Below are the file formats supported in EasyMorph:

Description                       Extensions
Delimited text file (e.g. CSV)    .csv, .psv, .tsv, .txt or other
Text with fixed width columns     .txt or other
Excel spreadsheets                .xls, .xlsx, .xlsm
XML files                         .xml
Qlik QVD files                    .qvd
SAS data files                    .sas7bdat
SPSS/PSPP data files              .sav
EasyMorph datasets                .dset
SQLite data files*                .sqlite, .sqlite3 or other
* To load from an SQLite data file, create a database connector first. See Loading from databases for details.

Example: US Census 2012

Alternatively, you can load a file by simply dragging it into EasyMorph. In this case EasyMorph automatically creates a new import action depending on the file extension. Extensions that are recognized automatically: xls, xlsx, txt, csv, psv, tsv, qvd, sas7bdat, dset. You may need to adjust settings of the created import action in order to load the file correctly — e.g. choose a separator, or pick only particular columns.

Hint: Loading multiple files into multiple tables may clutter your workspace. To reduce clutter, move tables to different tabs. To move a table to another tab, create a new tab, then right-click the table's title bar and choose "Move to another tab", or press Ctrl+M.

Loading multiple uniform files as one table

While it's always possible to concatenate multiple tables into one using the "Append" action, doing so may be inconvenient when many files have to be loaded. In this case, instead of appending tables explicitly, it is possible to load multiple files and automatically concatenate them into one table. The files must be uniform: they must be of the same type and have the same set of columns. To load several uniform files, use the multiple load mode, which is available in any file import action.

Loading multiple files

In this mode multiple files in a particular folder are loaded and automatically concatenated into one table in EasyMorph. The files to load can be specified in two ways:

  • Explicitly select particular files to load.
  • Select all files that match a search criterion such as search string, wildcard, or regular expression.
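
For readers who think in code, the logic of this mode can be sketched in Python. This is only an illustration of the idea, not how EasyMorph implements it: the function name `load_uniform_csv_files` is hypothetical, the files are assumed to be comma-delimited with a header row, and "uniform" is checked more strictly than described above (same columns in the same order).

```python
import csv
import fnmatch
from pathlib import Path


def load_uniform_csv_files(folder, pattern="*.csv"):
    """Load every file in `folder` whose name matches the wildcard
    `pattern` and concatenate all rows into one list of dicts.

    Hypothetical helper for illustration only.
    """
    rows = []
    expected_columns = None
    # Sort the folder listing so the load order is deterministic.
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or not fnmatch.fnmatch(path.name, pattern):
            continue
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.DictReader(f)
            columns = reader.fieldnames
            if expected_columns is None:
                # The first matching file defines the expected columns.
                expected_columns = columns
            elif columns != expected_columns:
                raise ValueError(
                    f"{path.name} has columns {columns}, "
                    f"expected {expected_columns}"
                )
            rows.extend(reader)
    return rows
```

Calling `load_uniform_csv_files("reports", "sales_*.csv")` would return the rows of all matching files as one table-like list; a file with a different header raises an error instead of being silently appended.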

Hint: To quickly load multiple files from a folder, drag and drop the folder into EasyMorph. The necessary file import action will be created automatically. This works even if the folder contains files of different types: EasyMorph will prompt you to pick a file type to load.

Advanced topics

Loading a list of files

Loading a list of files is a powerful feature that allows loading and automatically concatenating multiple files defined by a list of file paths. The list of files to load can be defined by various rules based not only on file name, but also on file size, file creation date, or file extension. A few examples:

  • Load only the latest 10 files based on file creation date
  • Extract timestamps from file names, load only the latest 10 files based on the timestamps
  • When multiple files per day exist, load only the latest file
  • Exclude files with zero length
  • Load only files with timestamps that aren't already loaded
  • Include files from subfolders
  • Extract timestamps from subfolder names
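
To make a couple of these rules concrete, here is a rough Python sketch of two of them: keeping only the latest file per day, and keeping the latest N files, both based on timestamps extracted from file names. It is not EasyMorph functionality. The naming scheme (`sales_20240131_235900.csv`) and all function names are assumptions made for the example.

```python
import re
from datetime import datetime

# Assumed naming scheme: <anything>_YYYYMMDD_HHMMSS.<ext>
TIMESTAMP_RE = re.compile(r"(\d{8})_(\d{6})")


def parse_timestamp(file_name):
    """Return the timestamp embedded in the file name, or None."""
    match = TIMESTAMP_RE.search(file_name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")
    except ValueError:
        # Digits are present but do not form a valid date/time.
        return None


def latest_per_day(file_names):
    """When several files exist per day, keep only the latest one."""
    latest = {}  # date -> (timestamp, file name)
    for name in file_names:
        ts = parse_timestamp(name)
        if ts is None:
            continue
        day = ts.date()
        if day not in latest or ts > latest[day][0]:
            latest[day] = (ts, name)
    # Return the survivors in chronological order.
    return [name for _, name in sorted(latest.values())]


def latest_n(file_names, n):
    """Keep the n files with the most recent timestamps, newest first."""
    stamped = [(parse_timestamp(name), name) for name in file_names]
    stamped = [(ts, name) for ts, name in stamped if ts is not None]
    stamped.sort(reverse=True)
    return [name for _, name in stamped[:n]]
```

Files whose names carry no recognizable timestamp are skipped by both helpers.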

Loading a list of files typically requires 3 steps:

  • Obtain a list of files using the "List of files" action.
  • Filter the list of files.
  • Load all files from the list into one table using a file import action in the "Load list of files" mode.
Steps of loading a list of files

Obtaining a list of files is most frequently done using the "List of files" action. This action generates a table with a list of files in a specified folder. The list can include not just file paths, but also file size, creation date, and extension. There are other ways to produce a list of files in EasyMorph. For instance, the "Fetch email" action receives emails and saves their attachments into a designated folder, returning a list of saved files.

Hint: To quickly produce a list of files in a folder in EasyMorph, drag and drop the folder into EasyMorph. The "List of files" action will be created automatically.

The list of files to load can be filtered further using one or more filtering actions. For instance, you can keep only files larger than 10,000 bytes. See also the chapter "Filtering data" in this tutorial.

Finally, use an appropriate file import action in the "Load list of files" mode and specify the column that contains the list of files. Note that the list should consist of full file paths, not just names.
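
The three steps map loosely onto the following Python sketch, offered only as an analogy for the list, filter, load pattern. All function names are hypothetical, the files are assumed to be comma-delimited with identical headers, and the sketch records modification time rather than creation date because the latter is not available portably.

```python
import csv
from pathlib import Path


def list_files(folder):
    """Step 1: build a table-like list of files with their metadata."""
    records = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            stat = path.stat()
            records.append({
                "path": str(path.resolve()),       # full path, not just the name
                "size": stat.st_size,              # bytes
                "modified": stat.st_mtime,         # stand-in for creation date
                "extension": path.suffix.lower(),
            })
    return records


def load_file_list(paths):
    """Step 3: load every file in the list and concatenate the rows."""
    rows = []
    for file_path in paths:
        with open(file_path, newline="", encoding="utf-8") as f:
            rows.extend(csv.DictReader(f))
    return rows


def load_big_csv_files(folder, min_size=10_000):
    """Run all three steps: list, filter by extension and size, load."""
    files = list_files(folder)
    # Step 2: keep only CSV files larger than `min_size` bytes.
    selected = [
        record["path"]
        for record in files
        if record["extension"] == ".csv" and record["size"] > min_size
    ]
    return load_file_list(selected)
```

Keeping the file list as data between steps is what makes arbitrary filtering possible: any rule that can be expressed over the metadata columns can decide which files are loaded.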

Processing very big text files

If the files to load are too big to fit in memory (e.g. hundreds of gigabytes), you can split them into chunks first and process one chunk at a time. The "Split delimited files" action splits text files into chunks based either on the number of rows (used for files with fixed width columns), or on unique values in a particular field (i.e. partitioning of delimited text files). The action returns a list of the created files (the chunks), which can then be used for iterating.
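
The two splitting strategies can be illustrated with a short Python sketch. It is not the "Split delimited files" action itself: the function names and the chunk naming scheme are invented for the example, and both functions assume a comma-delimited file with a header row that is repeated in every chunk. Each function streams the source row by row, so memory use does not grow with file size.

```python
import csv
from contextlib import ExitStack
from pathlib import Path


def split_by_rows(source, out_dir, rows_per_chunk):
    """Split into chunks of at most `rows_per_chunk` data rows each."""
    if rows_per_chunk < 1:
        raise ValueError("rows_per_chunk must be at least 1")
    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    out = None
    writer = None
    with source.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return chunk_paths  # empty source file, nothing to split
        try:
            for i, row in enumerate(reader):
                if i % rows_per_chunk == 0:
                    # Current chunk is full (or none is open yet): start a new one.
                    if out is not None:
                        out.close()
                    path = out_dir / f"{source.stem}_{len(chunk_paths) + 1:04d}{source.suffix}"
                    out = path.open("w", newline="", encoding="utf-8")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    chunk_paths.append(str(path))
                writer.writerow(row)
        finally:
            if out is not None:
                out.close()
    return chunk_paths


def split_by_field(source, out_dir, field):
    """Partition rows into one chunk per unique value of `field`.

    Keeps one output file open per unique value, so this sketch only
    suits fields with a modest number of distinct values.
    """
    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    writers = {}  # field value -> csv writer for that partition
    chunk_paths = []
    with ExitStack() as stack:
        f = stack.enter_context(source.open(newline="", encoding="utf-8"))
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return chunk_paths
        column = header.index(field)
        for row in reader:
            key = row[column]
            if key not in writers:
                path = out_dir / f"{source.stem}_part{len(writers) + 1:04d}{source.suffix}"
                out = stack.enter_context(path.open("w", newline="", encoding="utf-8"))
                writers[key] = csv.writer(out)
                writers[key].writerow(header)
                chunk_paths.append(str(path))
            writers[key].writerow(row)
    return chunk_paths
```

Both functions return the list of chunk paths, mirroring the way the returned file list can drive a later per-chunk iteration.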

Read next: Load data from database