EasyMorph and Python - a powerful combination

If you’ve frequented the EasyMorph community or followed our social media posts over the last few months, you may have noticed that we’ve been working on an integration between EasyMorph and Python.

For those who don’t know, Python is a high-level programming language that has become the de facto standard for implementing data science and machine learning models.

“A programming language? But I thought EasyMorph was designed to be no-code where possible?”

It is certainly true that we want EasyMorph to be as simple and easy to use as possible for the widest range of people, whatever their technical ability. That means that, where possible, we avoid making users write any code. Take our visual SQL query builder, for example, or our recent introduction of the Generate Expression feature, which uses AI to help people write valid EasyMorph expressions by simply describing the calculation they wish to perform.

Generate expressions with AI assistance

Why is Python used for Data Science and Machine Learning?

Python has become so popular in data science for a few reasons. Firstly, Python is relatively simple to write and understand compared to a lot of other programming languages. Most data scientists or machine learning engineers are far more likely to come from a mathematical background than from a software engineering one. Python does away with much of the nested brackets and unfathomable syntax found in many programming languages, so its learning curve is a lot less steep. This makes it a great entry point for people to begin programming, including mathematicians who wish to begin programming complex statistical analysis of data. In fact, many software development courses now begin with Python.

Partly due to this reduced learning curve, a strong online community and ecosystem have sprung up around Python. This means it's usually possible to find an example online which you can quickly adapt to your needs. And if not, there are near-endless websites where you can ask questions and get helpful advice and tips, much like the EasyMorph Community.

But most of all, there is a huge ecosystem of prebuilt libraries and tools in Python specifically designed for data science and machine learning, such as those shown below. It's much easier and therefore much quicker to use a prebuilt code library to perform some complex statistical analysis than it is to try and code it from scratch yourself.

Python data science libraries

Why we created the EasyMorph Python integration

In most businesses today, the data people are split into a few separate teams:

  • The infrastructure team - looking after the databases and business systems where most of the data originates.
  • The data engineering team - extracting the data from the business systems, using SQL, Python, and visual tools like EasyMorph, merging and transforming data to make it useful for reporting and analysis or for data exchange with other systems.
  • The data science team - using data science and machine learning to derive valuable new insights from the data. This team frequently relies on Python for its tasks.
  • The reporting/business intelligence team(s) - building and maintaining the large number of reports and dashboards that all businesses rely on.
  • The business teams - not just consuming the dashboard and reports, but often also creating their own bespoke one-off analyses based on the data.

Whilst EasyMorph’s powerful data preparation and automation workflows play a huge part in this flow and use of data across a business, there is often a disconnection. The data science team, using the power and familiarity of Python, is often completely separate from the other teams using data preparation and automation tools such as EasyMorph.

We want to bring together all of these data teams from across a business, allowing them to keep the tools and ecosystems they are familiar with, whilst enabling them to collaborate and integrate their parts of the chain of data into one coherent process. It's for this reason that we’ve been working on making EasyMorph and Python the best of friends.

Let’s also not forget that data preparation can take up to 90% of data scientists’ time. Having a visual data preparation tool like EasyMorph tightly integrated with Python can easily cut the time required for shaping data in half, if not more.

And if you haven’t yet got a dedicated data science team or ventured down the rabbit hole of machine learning and predictive analytics, EasyMorph working seamlessly with Python will lower the barrier and make it so much more accessible.

Wasn’t it always possible to run Python code from EasyMorph?

Yes, kind of! EasyMorph has actions to allow you to call any command you could type into the Windows terminal. You can therefore run a Python script using the “Run program” or “Iterate program” actions. But that was about it. Fire it off and wait.

Call program action

If you needed to pass some data from EasyMorph to the Python script or back again, you had to export the data to a temporary file that the other side could then read in. This not only required additional steps but could also be tricky to coordinate. Temporary files could get left in a mess. It was hard to know when the Python script had stopped running and EasyMorph could continue with the rest of the workflow. All in, it just wasn’t a very elegant solution.
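For illustration, the Python side of that file-based handoff might have looked something like the sketch below. It is not taken from any EasyMorph documentation: the file names, the column names and the 20% uplift rule are all invented for this example, and it uses only Python's standard csv module.

```python
import csv
import os
import tempfile


def process_file(input_path, output_path):
    """Read the CSV exported by the workflow, add a column, write a new CSV."""
    with open(input_path, newline="", encoding="utf-8") as src:
        rows = list(csv.DictReader(src))

    for row in rows:
        # Hypothetical business rule: gross is net plus 20%.
        row["gross"] = f"{float(row['net']) * 1.2:.2f}"

    fieldnames = list(rows[0].keys()) if rows else []
    with open(output_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


if __name__ == "__main__":
    # Stand-in for the temporary file the workflow would have exported.
    workdir = tempfile.mkdtemp()
    in_path = os.path.join(workdir, "to_python.csv")
    out_path = os.path.join(workdir, "from_python.csv")
    with open(in_path, "w", newline="", encoding="utf-8") as f:
        f.write("order_id,net\n1,100\n2,49.5\n")
    print(process_file(in_path, out_path), "rows written to", out_path)
```

Even in a sketch this small, both sides have to agree on the file paths and the column layout, and something still has to clean up the temporary files afterwards.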

What is possible with the EasyMorph Python integration?

An easier question might be what can’t you do with EasyMorph and Python. Python is a programming language and so you could write a program to do just about anything you want.

Unsurprisingly, the most likely things people will do have already been mentioned multiple times above - data science and machine learning. Together, these two disciplines open up many possibilities, such as:

  • Predicting future trends based on historic data.
  • Classifying customers based on the products they buy, how often they return or what they are most likely to buy next.
  • Helping humans to make better business decisions based on data rather than gut feeling.
  • Prioritizing what will have the greatest impact, not what makes the most noise.
  • Spotting patterns in data, allowing you to prevent problems before they occur.
  • And many, many more…

Another interesting possible use for Python is to connect EasyMorph to systems that otherwise pose a problem. With its vast list of supported systems, databases, APIs and files, it's rare that you find a system that EasyMorph can’t load data from. Even so, over the last 25 years working with data, I’ve come across the occasional old legacy system that uses its own weird file format or some completely non-standard interface. With Python, it is possible to write the necessary code to wrangle the data from such a system, converting it into an EasyMorph dataset and passing it to an EasyMorph workflow to then transform just like any other data.
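To give a flavour of the kind of wrangling meant here, the sketch below parses an invented fixed-width record layout into rows of named fields, using only Python's standard library. The layout, the field names and the sample lines are hypothetical. A real legacy format would need its own rules, and the step of handing the rows over to EasyMorph is deliberately left out.

```python
# Hypothetical layout: (field name, start offset, end offset) for each column.
LAYOUT = [
    ("customer_id", 0, 6),
    ("name", 6, 26),
    ("balance_cents", 26, 35),
]


def parse_record(line):
    """Slice one fixed-width line into a dict of cleaned values."""
    record = {name: line[start:end].strip() for name, start, end in LAYOUT}
    # The invented format stores money as zero-padded integer cents.
    record["balance"] = int(record.pop("balance_cents")) / 100
    return record


def parse_legacy(text):
    """Parse every line that is neither blank nor a '#' comment."""
    return [
        parse_record(line)
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]


if __name__ == "__main__":
    # Build sample lines with explicit padding so the widths match LAYOUT.
    sample = "\n".join([
        "# legacy export",
        "000042" + "Acme Ltd".ljust(20) + "000012550",
        "000043" + "Smith & Sons".ljust(20) + "000000999",
    ])
    for row in parse_legacy(sample):
        print(row)
```

Keeping the layout in one small table means that a newly discovered column in the legacy file becomes a one-line change rather than a rewrite.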

Many companies already have large amounts of data cleansing, business logic and data validation built in Python. Whilst it is likely almost all of it could be reworked into EasyMorph workflows, being able to directly reuse this code in EasyMorph can both save time and keep all the logic in a single place.

I’m looking forward to seeing all of the other uses our customers find for Python and EasyMorph.

How does it work?

This post isn’t intended as a technical deep dive into how you can get started with EasyMorph’s Python integration. I’ll save that for another post. But for those curious, let’s briefly look at what this integration brings.

NOTE: This integration of Python into EasyMorph is brand new and considered an experimental feature. As such, we have more improvements and performance enhancements planned for future releases.

The first thing to note is that a new “Call Python” action was added to EasyMorph in version 5.9.7.

Call Python action

The first and most obvious setting is the Python script you want EasyMorph to run. This can be any Python script, so you could simply use the action to run a script, much as you would with the “Run program” action.

You’ll also notice options to pass the input dataset to the Python script and to expect a dataset returned from it. The “Call Python” action can be placed anywhere in a workflow and, as with all EasyMorph actions, the output dataset of one action passes to the next in line. This means data can be extracted and processed in EasyMorph using other actions, passed to a Python script to do whatever job it does, and then passed back to EasyMorph to continue the journey through the workflow.

Data flow through EasyMorph and Python

To make working with EasyMorph datasets in Python as easy as possible, we’ve added an “easymorph” library which includes methods for receiving this passed data and working with it in Python. Most notably, it enables converting EasyMorph datasets to Pandas DataFrames and vice versa. Pandas is a popular library for handling data in Python, and many other Python data science libraries and tools work with Pandas DataFrames.

The “easymorph” library also allows building and returning a dataset from the Python script back to the EasyMorph workflow which called it, as well as providing the script access to any parameters configured in the workflow. Besides that, it provides extra convenience by enabling status messages and custom warnings, and supporting workflow cancellations right from the Python script.

How to get started with Python and EasyMorph

For standard Python installations, the "Call Python" action works out of the box. As this is an experimental feature, we’re still working to make the integration even more useful as well as to provide more information, documentation and examples.

We’ve begun a help page for the integration where you can find more technical detail as well as example scripts showing how to use the “easymorph” library and the capabilities mentioned above.

There are also some great topics on the EasyMorph community to check out, including some from our developers providing tips and tricks.
