This is the 1st of a series of blog posts on data preparation in Tableau. In this blog post, I will introduce the Data Interpreter, a useful tool that can help you get started in preparing your data for analysis in Tableau.
We will use the Uniform Crime Reporting data from 1996 to 2005 as an example. The UCR is the official crime data in the United States, published by the Federal Bureau of Investigation (FBI). We encountered this data in a DataSquad teaching and research support project for Carleton professor Annette Nierobisz’s Fall 2019 class SOAN314 Contemporary Issues in Critical Criminology.
The original data is provided in the table format of the following Excel sheet. We can easily identify the recorded number of offenses for males and females in different years. However, the data is not yet in an ideal format for Tableau to interpret because not all cells in the Excel sheet are data inputs; the rows are not at the same level of analysis. There are titles, footnotes, empty cells, merged cells, and pre-aggregated data mixed in among the actual data fields. Unfortunately for doing secondary analysis, many data-based reports are available only with this kind of mixed-up formatting. It must be detangled before we can use it.
Therefore, if we import the data, we will see that Tableau cannot distinguish the non-data cells and the actual data fields, and thus titles are included along with lots of empty cells. This is not what we want to be working with.
However, Tableau is actually smart enough to distinguish the actual data fields. On the left-hand side, there is the checkbox for Use Data Interpreter. Check the box in front of Use Data Interpreter.
As we can see, with Data Interpreter turned on, Tableau is now able to distinguish the actual data from the header and the footnotes, and able to pick up column names correctly.
Data Interpreter also allows us to manually inspect its interpretation process to ensure accuracy. If you are having second thoughts about the reliability of Data Interpreter, click the blue underlined text Review the results. Tableau will prompt you to open the data an Excel workbook to let you see how Tableau is now seeing the data with Data Interpreter.
As we can see, the green area indicates what Tableau interprets as the main data and the red area shows what Tableau interprets as column names. The uncolored cells indicate the cells that Tableau has excluded.
Therefore, we recommend turning on the Data Interpreter if your data contain things like titles, footnotes, stacked headers, etc. We would also recommend manually checking the interpretation results Tableau gives in Review the results, to make sure Tableau has not made a mistake.
At this point, Data Interpreter has given us a head start in preparing this data for analysis. We will introduce more Tableau functionalities in preparing and cleaning data as we talk about further modifications that need to be done with this data.