At work this week, I needed to compare two tables to see if the they had similar data. The problem was they were aggregated at different levels with different dimensions and some data in table A was not in table B and some data in table B was not in table A. I needed to find and quantify these differences as well as locate the missing data in each table. I was dealing with ad units, orders, key values, impressions and revenue from the world of online advertising and dealing with 50,000 or so rows of data, but for this post we will look at a more simplified example.
- Excel 2016 Power Query Tutorial
- How To Use Power Query
- Enable Power Query Excel 2016
- Power Query In Excel 2016 Download
In this example we have table A and table B.
Power Query is included with Excel 2016 (Office 365). It has been renamed and is now on the Data tab of the Ribbon in the Get & Transform section. This means there is nothing to install. If you are using Excel 2016, go to the Data tab on the ribbon and press the New Query button to create a query and open the Power Query editor.
- Table A contains data for clothing purchases by customer and product along with the quantity purchased and the total amount paid. This table has two dimensions of customer and product, and two metrics of quantity purchased and amount paid.
- Table B contains data for the same clothing purchases by customer and product with an extra dimension of description and also includes quantity purchased and the total amount paid. This table has three dimensions of customer and product, and two metrics of quantity purchased and amount paid.
- The Power Query queries in this workbook might be incompatible with your current version of Excel. These queries were authors with a newer version of Power Query and might not work in your current version. I was completely baffled by this one as I've always been working in Excel 2016.
- Microsoft Power Query for Excel requires Internet Explorer 9 or greater. Microsoft Power Query for Excel is available for 32-bit (x86) and 64-bit (x64) platforms, your selection must match architecture of the installed version of Office. Install Instructions. Download the version of the Power Query add-in that matches the architecture (x86 or x64) of your Office installation.
- This wikiHow teaches you how to use the Power Query feature in Excel 2016 on a Windows and Mac computer. Power Query is an Excel function that allows you to import data from a variety of sources (Excel tables, CSV files, online database, etc.), and easily reconfigure the data into a pivot table in a spreadsheet.
- Learn how to install Power Query with this step-by-step tutorial for Excel 2010 or 2013. If you have Excel 2016 then Power Query is already built-in and loc.
We want to compare these two sets of data and find out where the differences are and quantify these differences. If we take a look at the data we can see some differences.
- Table A contains data for Alex but table B is missing Alex
- Geoff has 2 sweaters in table A but has 3 sweaters in table B
There are several other differences in the tables, but spotting them manually will be hard and won’t scale when your tables have more data. Get & Transform will allow us aggregate these tables to the same level of granularity and join the aggregated data by their common dimensions to easily find the differences.
Get & Transform was previously called Power Query in Excel 2010 and 2013, and you will need to install is as an add-in. Find out how to install Power Query here. If you’re running Excel 2016 then it’s already installed and can be found in the Data tab of the ribbon.
Excel 2016 Power Query Tutorial
Aggregating The Tables
First we will need to aggregate the data to the customer and product level so that we can compare the two tables.
If you look at table A, you will notice that Bob and Jane have rows of data that will need to be aggregated. We ideally want only 1 row of data for Bob and Sweaters and 1 row of data for Jane and T-Shirt.
TIP: I always find it’s a good idea to use Excel Tables with your data. This way, your queries can reference a table name instead of a range. When you add data you won’t need to update the range in your queries as they will reference the name. The data sets in this example have already been turned into tables named Table_A and Table_B, but you can read about how to make a table here.
First let’s select our table and make a query.
- Select a cell in table A or select the whole table.
- Go to the Data tab in the ribbon.
- Press the From Table / Range button in the Get & Transform section.
This will open up the Query Editor.
Then select a Group By transformation.
- Go to the Transform tab.
- Press the Group By button.
Create your Group By query.
- Select Advanced to create a Group By query which groups by more than one dimension.
- Select Customer first and then Product second. You will need to use the Add grouping button to add a second dimension.
- Add a descriptive column name, select Sum as the Operation for both the Quantity and Amount columns. You will need to use the Add aggregation button to add the second metric.
Now save the query.
- Go to the Home tab in the query editor.
- Press the Close & Load button.
- Select Close & Load To.
From the Import Data menu select Only Create Connection. We could load this to another table by selecting Table if we want to see this intermediary step in our spreadsheet, but it’s not necessary. We can also select where to load the table to if we do select Table. Press the OK button to finish.
You should now see the Queries & Connections window pane docked to the right of your spreadsheet and it will contain our new Table_A query.
We can repeat the same process to create a Group By query for Table_B with the exact same groupings.
You should now see two Connection only queries in the Queries & Connections window pane for Table_A and Table_B.
Join Queries With Merge
Now we will combine our queries.
- Go to the Data tab.
- Press the Get Data button from the Get & Transform Data section.
- Choose Combine Queries then Merge from the menu.
How To Use Power Query
Now we can setup our merge query.
- Select Table_A for the first query.
- Select Table_B for the second query.
- Select Full Outer (all rows from both) for the Join Kind. This will mean all rows in Table_A and all rows in Table_B will be shown in the resulting table.
- Now we can select which columns our merge query will join on. Hold Ctrl then click on the Customer column and then the Product column. You should see a small 1 next to Customer and a small 2 next to Product.
- Hold Ctrl then click on the Customer column and then the Product column. We should again see a small 1 next to Customer and a small 2 next to Product.
- Press the OK button.
Python pandas tutorial pdf. In the editor we will see our Table_A Group By query along with a Table_B column. We will need to expand this column to show the data in our Table_B Group By query.
- Right click on the expand icon in the right side of the Table_B column.
- Select Expand.
- Press the OK button.
Enable Power Query Excel 2016
Go to the Home tab and press the Close & Load button to create a table of the results in a new sheet.
Compare Data
Power Query In Excel 2016 Download
It’s now easy to compare the data in table A and B and see where the differences are.