Analytics and Search Marketing Tips
If you’re like me, you’ve been waiting a long time for Tableau to add the Holy Grail of functionality, cross database joins. And with Tableau 10 just around the corner, it is on the threshold of arrival. Get ready for a shorter, more simple path to a holistic picture of your data.
A cross database join allows you to join data from two different types of databases as if they were in the same database. The ability to execute a Tableau cross database join will save you time and money, as it allows users to perform tasks that normally require IT resources to be involved; specifically, creating a project where IT extracts the needed data from various databases and consolidates them into a single database (i.e. a data warehouse).
Instead of being dependent on IT, you can join data to create reports and visualizations to answer your own questions.
The new Tableau cross database join functionality enables:
Suppose you have Order data in a Redshift database which contains Order information and Customer Information. And in Oracle you have Regional Sales information, such as region, zip code, and salesperson.
You want to know how many orders were placed in what Region, and what sales person is responsible for sales. Since the data is in two different databases, you previously couldn’t answer the question without either:
However, in Tableau 10, you can join these two tables, one from Redshift, the other from Oracle, as if they were in the same database. Essentially the resulting SQL would look something like:
Sales Region Oracle
Redshift.Customer Zip = Oracle.Zip Code
This is just a simple example, as you can do much more complex joins using multiple sources and types of databases.
For instance, in Tableau 10.0 beta, using data from a Redshift database and four CSV files which I unioned (using Tableau’s text file Union functionality), I created a cross database join with Redshift data to the unioned CSV files:
CLICK IMAGE BELOW FOR FULL SIZE VIEW
Another example shows data from Google Sheets (yes, in Tableau 10 you will be able to use Google Sheets as a datasource just like Excel or text files) being joined with data from a Redshift database (joining the two “databases” using a common field found in both data sources of Lead ID to Id):
CLICK IMAGE BELOW FOR FULL SIZE VIEW
Previously, if you had a report that required data from multiple databases, you had to do one of the following:
(1) Submit a project request to IT and wait until they could put the required data into a data warehouse before the reports could be created. (2) Use Tableau data blending to combine the data sources as needed to create the reports. Although data blending is quite powerful and useful (I once created a dashboard which required nine different data sources to be blended) it presents a few challenges:
The process of creating a Tableau cross database join becomes a prototype for how the data should be brought into the data warehouse in order to meet reporting needs. How many times have you gone through the process of implementing a project to get data added to the data warehouse only to find out, after the work is done:
With cross database join functionality, you can spend time with the end-user prototyping with actual data, rapidly adding and changing data as new insights occur, and changing requirements as needed.
Once the solution is delivered, the results can be used as the requirements for the data warehouse project,
thus eliminating unsuccessful projects and inefficient re-work.
Here’s a great example of a data model:
The resulting data source can be published to Tableau Server for others to use, or an extract created, and if published it can be scheduled for refreshes. Most things you can do with a data source (with a couple of caveats listed below) you can do with the resulting cross database join data source.
Like most things, as I’ve tested Tableau 10.0 beta, it doesn’t give me everything I hoped for. Tableau’s development team says additional functionality will be added in later releases, but below are some caveats on what can be done:
Some things that I want to see to make this a Holier Grail (is that even possible, holier?) would be the ability to perform cross database joins on extracts and Tableau Server data sources, including extracts on Tableau Server. Why? Let me ask you a question first.
How often have you said, “If I could just get my hands on the data, I could do the analysis and create visualizations?”
If Tableau allows cross database joins with extracts and data sources on Tableau Server, including extracts, you can become the creator of your own data warehouse without having to rely on IT! You can access the databases you need, publish the data to Tableau Server (as a scheduled extract or live) and then use Tableau Server as a data warehouse, doing cross database joins with any of the sources you have published. If you don’t have Tableau Server, you can still do this by creating a set of extract files for all sources and joining what you need.
Of course, with great power comes great responsibility. You would want to ensure you have processes in place to manage the data, keep accurate documentation, and ensure you don’t needlessly replicate data. But if done properly and wisely, the results can be powerful and they can be used for data warehouse requirements when IT resources become available, with the resulting additions to the data warehouse already proven to meet reporting, visualization and analysis needs!
So after this, what’s next on my checklist?
Eh, maybe not… too much responsibility.
Meet Nick and his fellow data visualization enthusiasts during the next Tableau User Group meeting at the Blast HQ in the Sacramento area.
Share this Post
Nick Mannon is the Director of Data Solutions at Blast Analytics & Marketing. Working with data, finding insights, and creating visualizations that clearly communicate stories to answer business questions, that is his passion. He is a Tableau Desktop Qualified Associate and Google Analytics Certified. Nick Mannon has written 2 posts on the Web Analytics Blog.
Digital Analytics Blog
We're here to help with tips and insight on the following topics:
Optimize your website and marketing campaigns