To load a file press the "Add data" button, select "Load file" and choose the type of file you want to load.
Below are the file formats supported in EasyMorph:
Description | Extensions |
---|---|
Delimited text file (e.g. CSV) | .csv, .psv, .tsv, .txt or other |
Text with fixed width columns | .txt or other |
Excel spreadsheets | .xls, .xlsx, .xlsm |
XML files | .xml |
Qlik QVD files | .qvd |
SAS data files | .sas7bdat |
SPSS/PSPP data files | .sav |
EasyMorph datasets | .dset |
SQLite data files* | .sqlite, .sqlite3 or other |
Example: US Census 2012
Alternatively, you can load a file by simply dragging it into EasyMorph. In this case EasyMorph automatically creates a new import action depending on the file extension. Extensions that are recognized automatically: xls, xlsx, txt, csv, psv, tsv, qvd, sas7bdat, dset. You may need to adjust settings of the created import action in order to load the file correctly — e.g. choose a separator, or pick only particular columns.
Hint: When you load multiple files into multiple tables it may clutter your workspace. To reduce clutter move tables to different tabs. To move a table to another tab create a new tab, then right-click the table's tile bar and choose "Move to another tab", or press Ctrl+M.
While it's always possible to concatenate multiple tables into one using the "Append" action, this becomes inconvenient when many files have to be loaded. In this case, instead of appending tables explicitly, it is possible to load multiple files and automatically concatenate them into one table. The files must be uniform, i.e. they must be of the same type and have the same set of columns. To load several uniform files, use the multiple load mode, which is available in any file import action.
In this mode multiple files in a particular folder are loaded and automatically concatenated into one table in EasyMorph. The files to load can be specified in two ways:
Hint: To quickly load multiple files from a folder, drag and drop the folder into EasyMorph. The necessary file import action will be created automatically. This works even if the folder contains files of different types; in that case EasyMorph will prompt you to pick a file type to load.
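For readers who think in code, the idea behind the multiple load mode can be sketched in Python. This is only an illustration of "load uniform files from a folder and concatenate them", not EasyMorph's implementation; the function name `load_and_concatenate` and the `pattern` parameter are hypothetical, and the files are assumed to be comma-delimited with a header row.

```python
import csv
from pathlib import Path


def load_and_concatenate(folder, pattern="*.csv"):
    """Read every matching delimited file in `folder` and stack the rows into one table.

    Hypothetical helper: assumes comma-delimited UTF-8 files with a header row.
    """
    header = None
    rows = []
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip empty files
            if header is None:
                header = file_header  # the first file defines the column set
            elif file_header != header:
                # Uniformity check: all files must share the same columns
                raise ValueError(f"{path.name} has different columns: {file_header}")
            rows.extend(reader)
    return header, rows
```

The explicit header comparison mirrors the uniformity requirement described above: a file with a different set of columns is rejected rather than silently appended.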
Loading a list of files is a powerful feature that allows loading and automatically concatenating multiple files defined by a list of file paths. The list of files to load can be defined by various rules that not only include file name, but also file size, file creation date, or file extension. A few examples:
Loading a list of files typically requires 3 steps:
Obtaining a list of files is most frequently done using the "List of files" action. This action generates a table with a list of files in a specified folder. The list can include not just file paths, but also file size, creation date and extension. There are other ways to produce a list of files in EasyMorph. For instance, the "Fetch email" action receives emails and saves their attachments into a designated folder, returning a list of saved files.
Hint: To quickly produce a list of files in a folder in EasyMorph, drag and drop the folder into EasyMorph. The "List of files" action will be created automatically.
The list of files to load can be filtered further using one or more filtering actions. For instance, you can keep only files larger than 10,000 bytes. See also the chapter "Filtering data" in this tutorial.
Finally, use an appropriate file import action in the "Load list of files" mode and specify the column that contains the list of files. Note that the list should consist of full file paths, not just names.
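The three steps described above (obtain a list of files, filter it, load the files in the list) can be approximated outside EasyMorph with a short Python sketch. It is meant only to make the workflow concrete; the function names `list_files`, `filter_files` and `load_file_list` are hypothetical, and the files are assumed to be comma-delimited with a header row.

```python
import csv
from pathlib import Path


def list_files(folder):
    """Step 1: build a table (list of dicts) describing each file in `folder`."""
    table = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            table.append({
                "path": str(path.resolve()),  # full path, not just the file name
                "size": path.stat().st_size,  # size in bytes
                "extension": path.suffix.lower(),
            })
    return table


def filter_files(file_table, extension=".csv", min_size=10_000):
    """Step 2: keep only files with the given extension that are larger than `min_size` bytes."""
    return [f for f in file_table
            if f["extension"] == extension and f["size"] > min_size]


def load_file_list(paths):
    """Step 3: load every file in the list and concatenate the rows into one table."""
    rows = []
    for p in paths:
        with open(p, newline="", encoding="utf-8") as f:
            rows.extend(csv.DictReader(f))
    return rows
```

A possible usage, with a hypothetical folder name: `load_file_list([f["path"] for f in filter_files(list_files("data"))])`. As in the text above, the loader receives full file paths rather than bare names.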
If the file(s) to load are too big (e.g. hundreds of gigabytes) and can't fit in memory then you can split them into chunks first, and process one chunk at a time. The "Split delimited files" action splits text files into chunks based either on the number of rows (used for files with fixed width columns), or unique values in a particular field (i.e. partitioning of delimited text files). The action returns a list of created files (the chunks), which can further be used for iterating.
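The two splitting strategies can be illustrated with a minimal Python sketch. It is not how the "Split delimited files" action is implemented; `split_by_rows` and `partition_by_field` are hypothetical names, chunk file names are invented for the example, and the delimited input is assumed to be comma-separated with a header row. Both functions stream the input, so the whole file never has to fit in memory.

```python
import csv
from pathlib import Path


def split_by_rows(source, out_dir, rows_per_chunk):
    """Split a text file into chunks of at most `rows_per_chunk` lines; return chunk paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    out = None
    with open(source, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i % rows_per_chunk == 0:  # time to start a new chunk
                if out is not None:
                    out.close()
                path = out_dir / f"chunk_{i // rows_per_chunk:04d}.txt"
                out = path.open("w", encoding="utf-8")
                chunks.append(str(path))
            out.write(line)
    if out is not None:
        out.close()
    return chunks


def partition_by_field(source, out_dir, field):
    """Write one delimited file per unique value of `field`; return chunk paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = {}  # field value -> (file handle, csv writer)
    chunks = []
    with open(source, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        try:
            for row in reader:
                key = row[field]
                if key not in outputs:
                    # Name chunks by index so field values never end up in file names
                    path = out_dir / f"part_{len(outputs):04d}.csv"
                    handle = path.open("w", newline="", encoding="utf-8")
                    writer = csv.DictWriter(handle, fieldnames=reader.fieldnames)
                    writer.writeheader()
                    outputs[key] = (handle, writer)
                    chunks.append(str(path))
                outputs[key][1].writerow(row)
        finally:
            for handle, _ in outputs.values():
                handle.close()
    return chunks
```

Each function returns the list of created chunk files, which can then be iterated over one chunk at a time. Note that this sketch keeps one open file per distinct field value, so it is only suitable when the partitioning field has a modest number of unique values.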