Edit dateAuthorDescription
8/30/2017M.PetersonAdded annotations

Project US census

A demo project that shows import from a 100MB text file, lookups and aggregations, export to a TDE, and a report.

Location: D:\Documents\Projects\morph3\Examples\US census 2012\US census.morph

Overview

The project requires 4 sources of data:

  1. Text file "fips.csv"
  2. Text file "empclass.csv"
  3. Text file "lfo codes.csv"
  4. Text file {Source file name}

It exports data to 1 destination:

  1. Tableau extract file "US Census.tde"

The project creates 8 tables that are later transformed using 28 transformations.

Parameters

The project has the following parameters:

ParameterTypeValueNote
Source file nameFile namecbp12st.txt
Top NText or number20

Table: "FIPS"

Lookup table to replace state codes with state names.

Step 1: Import 2 columns (listed below) from a text file.. Other parameters: the separator is comma, encoding is ANSI (current system codepage 1252 is used), decimal separator is point.

Load file "fips.csv".

  • Name
  • Numeric code

Table: "Emp. class"

Lookup table for organization size.

Step 1: Import all columns from a text file.. Other parameters: the separator is comma, encoding is ASCII, decimal separator is point.

Load file "empclass.csv".

Step 2: Modify these column(s) using following expressions:

Column NameExpression
Classupper([Class])

Table: "LFO Codes"

Lookup table for LFO codes.

Step 1: Import all columns from a text file.. Other parameters: the separator is comma, encoding is ASCII, decimal separator is point.

Load file "lfo codes.csv".

Table: "Census data"

US Census data imported from a file specified by a parameter.

Step 1: Import 9 columns (listed below) from a text file.. Other parameters: the separator is comma, encoding is ASCII, decimal separator is point.

Load file which path is defined by parameter {Source file name}.

  • fipstate
  • naics
  • lfo
  • empflag
  • emp_nf
  • emp
  • ap_nf
  • ap
  • est

Step 2: Modify these column(s) using following expressions:

Column NameExpression
fipstateasnumber([fipstate])

Field 'fipstate' contains number that start from 0 which are imported as text. We need to make them numbers too.

Step 3: Look up values from [fipstate] in lookup table "FIPS", column [Numeric code], and replace them with [Name]. If a value is not found in the lookup table then keep it unchanged.

Step 4: Look up values from [empflag] in lookup table "Emp. class", column [Class], and replace them with [Description]. If a value is not found in the lookup table then keep it unchanged.

Step 5: Look up values from [lfo] in lookup table "LFO Codes", column [LFO Code], and replace them with [LFO Description]. If a value is not found in the lookup table then keep it unchanged.

Step 6: Modify these column(s) using following expressions:

Column NameExpression
empflagifempty([empflag],'Other')

Step 7: Rename 5 columns:

Old NameNew Name
apPayroll, MM
empEmployees
estEstablishments
fipstateState
lfoLFO Description

Step 8: Remove 4 columns:

  • ap_nf
  • emp_nf
  • empflag
  • naics

Step 9: Remove rows with the following values in column [LFO Description]:

  • All Establishments

Remove rows with 'All Establishments' totals to avoid double-counting.

Table: "Export to Tableau"

Helper table to export to TDE. It's convenient to keep export in derived tables so that it doesn't block calculations in Auto-run mode.

Step 1: Take table "Census data"

Step 2: Export table into Tableau Data Extract file "US Census.tde".

Column NameType
StateText
LFO DescriptionText
EmployeesNumber
Payroll, MMNumber
EstablishmentsNumber

Create Tableau Data Extract which then is used by Dashboard.twb (comes in the example package). Press Run Project to perform export.

Table: "Payroll by State: Totals"

Calculate national average across all states.

Step 1: Take table "Census data"

Step 2: Aggregate data. Calculate:

  • Sum of [Payroll, MM]
  • Sum of [Employees]

without grouping

Step 3: Calculate new column(s) using following expressions:

Column NameExpression
Avg. Payroll[Payroll, MM] / [Employees] * 1000

Step 4: Rename 1 column:

Old NameNew Name
Avg. PayrollNational Avg.

Table: "Top states by Avg. Payroll (proxy)"

Calculate metrics by state.

Step 1: Take table "Census data"

Step 2: Aggregate data. Calculate:

  • Sum of [Payroll, MM]
  • Sum of [Employees]

grouped by

  • State

Step 3: Calculate new column(s) using following expressions:

Column NameExpression
Avg. Payroll[Payroll, MM] / [Employees] * 1000

Step 4: Take 1st value from column [National Avg.] of table "Payroll by State: Totals", and populate with it a new column.

Bring in the national average from another table.

Step 5: Sort table

ColumnOrder
Avg. PayrollDescending

Step 6: Trim table. Keep only top {Top N} rows

Use a parameter to keep only the top states.

Table: "Top states by Avg. Payroll"

Step 2: Remove 1 column:

  • National Avg.

Remove columns that we don't need in the PDF report.