The easiest way to load a file is to drag it into EasyMorph. EasyMorph then automatically creates a new import action based on the file extension. The following extensions are recognized automatically: xls, xlsx, txt, csv, psv, tsv, qvd, sas7bdat. You may need to adjust the settings of the created import action to load the file correctly, e.g. choose a separator or pick only particular columns.
Another way is to create an import action explicitly. You can select an appropriate import action from the Start screen, or go to the Main (or Design) menu and press the "Add action" button. Then choose the action category "Import" and pick the necessary file import action.
Below are the file formats supported in EasyMorph:
|File format|Extensions|
|---|---|
|Delimited text file (e.g. CSV)|.csv, .psv, .tsv, .txt or other|
|Text with fixed width columns|.txt or other|
|Excel spreadsheets|.xls, .xlsx, .xlsm|
|Qlik QVD files|.qvd|
|SAS data files|.sas7bdat|
|SPSS/PSPP data files|.sav|
|SQLite data files*|.sqlite, .sqlite3 or other|
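The automatic recognition by file extension described above can be pictured as a simple lookup. The Python sketch below is only an illustration of that idea, not EasyMorph's implementation: the function name and the action labels are hypothetical, and mapping .txt to a delimited text import is an assumption (a .txt file may just as well have fixed width columns).

```python
from pathlib import Path

# Extensions the tutorial lists as recognized on drag and drop.
# The labels are illustrative, not EasyMorph identifiers.
IMPORT_ACTION_BY_EXTENSION = {
    ".xls": "Excel spreadsheet",
    ".xlsx": "Excel spreadsheet",
    ".txt": "Delimited text file",  # assumption: could also be fixed width
    ".csv": "Delimited text file",
    ".psv": "Delimited text file",
    ".tsv": "Delimited text file",
    ".qvd": "Qlik QVD file",
    ".sas7bdat": "SAS data file",
}


def guess_import_action(path):
    """Return the import label for a file's extension, or None if unrecognized."""
    return IMPORT_ACTION_BY_EXTENSION.get(Path(path).suffix.lower())
```

A file whose extension is not in the lookup (for example .sav) returns None here, which corresponds to creating the import action explicitly.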
Example: US Census 2012
To load a few files, drag them into EasyMorph one at a time, or create a separate import action for each file. One import action creates one table. You can later use the "Append" action to concatenate the tables into one, if needed.
Hint: Loading multiple files into multiple tables may clutter your workspace. To reduce clutter, move tables to different tabs. To move a table to another tab, create a new tab, then right-click the table title bar and choose "Move to another tab", or press Ctrl+M.
While it's always possible to concatenate multiple tables into one using the "Append" action, this becomes inconvenient when many files have to be loaded. In this case, instead of appending tables explicitly, it is possible to load multiple files and automatically concatenate them into one table. The files must be uniform, i.e. they must be of the same type and have the same set of columns. To load several uniform files, use the multiple load mode, which is available in any file import action.
In this mode multiple files in a particular folder are loaded and automatically concatenated into one table in EasyMorph. The files to load can be defined in two ways:
Hint: To quickly load multiple files from a folder, drag and drop the folder into EasyMorph. The necessary file import action will be created automatically. This works even if the folder contains files of different types: EasyMorph will prompt you to pick a file type to load.
Loading a list of files is a powerful feature that allows loading and automatically concatenating multiple files defined by a list of file paths. The list of files to load can be defined by various rules that include not only file name, but also file size, file creation date, or file extension. A few examples:
Loading a list of files typically requires 3 steps:
Obtaining a list of files is most frequently done using the "List of files" action. This action generates a table with a list of files in a specified folder. The list can include not just file paths, but also file size, creation date and extension. There are other ways to obtain a list of files in EasyMorph. For instance, the "Fetch email" action receives emails and saves their attachments into a designated folder, returning a list of saved files.
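For readers who think in code, the multiple load mode is conceptually similar to the following Python sketch: every matching file in a folder is read and its rows are appended to one table. This is an illustration under assumptions, not how EasyMorph works internally. The function name `load_uniform_files` is hypothetical, the files are assumed to be UTF-8 CSV files with a header row, and the uniformity check here is stricter than "the same set of columns" because it also requires the same column order.

```python
import csv
from pathlib import Path


def load_uniform_files(folder, pattern="*.csv"):
    """Load every file matching `pattern` in `folder` into one table.

    Returns (header, rows). Raises ValueError if a file's header
    differs from the header of the first file loaded.
    """
    header = None
    rows = []
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="", encoding="utf-8") as f:
            reader = csv.reader(f)
            file_header = next(reader, None)
            if file_header is None:
                continue  # skip empty files
            if header is None:
                header = file_header
            elif file_header != header:
                raise ValueError(f"{path.name}: columns differ from the first file")
            rows.extend(reader)  # append this file's rows to the combined table
    return header, rows
```

Failing loudly on a mismatched header mirrors the requirement that the files be uniform; silently concatenating files with different columns would produce a misaligned table.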
Hint: To quickly produce a list of files in a folder in EasyMorph, drag and drop the folder into EasyMorph. The "List of files" action will be created automatically.
The list of files to load can be further filtered using one or more filtering actions. For instance, you can keep only files whose size is greater than 10,000 bytes. See also chapter "Filtering data" in this tutorial.
Finally, use an appropriate file import action in the "Load list of files" mode and specify the column that contains the list of files. Note that the list should consist of full file paths, not just names.
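The three steps (obtain a list of files, filter it, load the listed files) can be sketched in Python as follows. This is a rough analogy rather than EasyMorph functionality: the function names, the dictionary keys, and the choice of UTF-8 CSV files are all assumptions made for illustration. Note that the list carries full file paths, as required above.

```python
import csv
from pathlib import Path


def list_files(folder):
    """Step 1: build a table describing the files in a folder."""
    table = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            table.append({
                "path": str(path.resolve()),  # full path, not just the name
                "size": path.stat().st_size,  # size in bytes
                "extension": path.suffix.lower(),
            })
    return table


def filter_files(table, min_size=10_000, extension=".csv"):
    """Step 2: keep files larger than `min_size` bytes with the given extension."""
    return [r for r in table if r["size"] > min_size and r["extension"] == extension]


def load_list(paths):
    """Step 3: load each listed file and concatenate the rows into one table."""
    rows = []
    for p in paths:
        with open(p, newline="", encoding="utf-8") as f:
            rows.extend(csv.DictReader(f))
    return rows
```

A typical call chain would be `load_list([r["path"] for r in filter_files(list_files(folder))])`, where `folder` is any directory of interest.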
If the file(s) to load are too big (e.g. hundreds of gigabytes) and can't fit in memory, you can split them into chunks first and process one chunk at a time. The "File splitter" action splits text files into chunks based either on the number of rows (used for files with fixed width columns) or on unique values in a particular field (i.e. partitioning of delimited text files). The "File splitter" action returns a list of the created files (the chunks), which can then be used for iterating.
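The two splitting strategies can be illustrated with a short Python sketch. It is not the "File splitter" action itself: the function names and chunk file names are hypothetical, input is assumed to be UTF-8 text, and the partitioning variant keeps one output file open per distinct value, so it assumes the field has a moderate number of distinct values. Both functions stream the source line by line, so the whole file is never held in memory, and both return the list of created files.

```python
import csv
from pathlib import Path


def split_by_rows(source, out_dir, rows_per_chunk):
    """Split a text file into chunks of at most `rows_per_chunk` lines."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    created = []
    out = None
    with open(source, encoding="utf-8", newline="") as src:
        for i, line in enumerate(src):
            if i % rows_per_chunk == 0:  # time to start a new chunk
                if out:
                    out.close()
                chunk_path = out_dir / f"chunk_{i // rows_per_chunk:05d}.txt"
                out = chunk_path.open("w", encoding="utf-8", newline="")
                created.append(str(chunk_path))
            out.write(line)
    if out:
        out.close()
    return created


def split_by_field(source, out_dir, field):
    """Partition a delimited file: one output file per distinct value of `field`."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    writers = {}  # field value -> (file handle, csv writer)
    created = []
    with open(source, encoding="utf-8", newline="") as src:
        reader = csv.DictReader(src)
        for row in reader:
            key = row[field]
            if key not in writers:
                # Name parts by index so unusual values never end up in file names.
                part_path = out_dir / f"part_{len(writers):05d}.csv"
                handle = part_path.open("w", encoding="utf-8", newline="")
                writer = csv.DictWriter(handle, fieldnames=reader.fieldnames)
                writer.writeheader()
                writers[key] = (handle, writer)
                created.append(str(part_path))
            writers[key][1].writerow(row)
    for handle, _ in writers.values():
        handle.close()
    return created
```

The returned list plays the same role as the list of chunks described above: each path can be processed in turn, one chunk at a time.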