From the course: Learning Data Analytics Part 2: Extending and Applying Core Knowledge

Unpivoting data from existing reports

- [Instructor] One of the biggest challenges we face as data analysts is the inability to get to all of the raw data that we need. Source systems will sometimes limit you to pre-built reports that are exported to PDF or Excel. These are canned reports, generated internally by the system. Canned reports can be valuable, but we want to extend what we can do with them in other visualizations. We really need them to function more as records than as pivot displays. In our research team, we often refer to this format as the wide format. Again, there are limitations in the visuals if you leave your data in the wide format.

One of the key cleaning commands in Power Query is Unpivot. Power Query gives us the ability to take wide displays and create long displays that act as records. We unpivot them, giving us a valuable record set to work with. Unpivot is available in Power Query for any dataset, so you can unpivot Excel data or even data from PDFs.

Let's go ahead and get started and unpivot our data. I'll go to Get Data and choose Excel. I'll choose my unpivot data start file and open it. I'll select my pivot data and then go straight to Transform Data. I'll just wade through the data real quick. One of the things I see is that it produces a null value for a month where a customer didn't order anything. I really want that to be a zero. So I'll highlight January, scroll over to December, and hold my Shift key to select the whole range. I'll go to Replace Values and change that null to a zero. If I want to create a total, then I can create a total from all of my months. But what I don't want to do is have my total year come in and inflate my totals, so I'll remove that total year column. Okay, scroll over and take a look. All right, I'll choose Close & Apply and look at the fields. What you see in the pivot data fields is that each month has its own column header.
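To make those two cleaning steps concrete outside of Power Query, here is a minimal sketch in plain Python. It is not what Power Query runs internally. The customer names, the amounts, the three-month subset, and the function name `clean_wide_rows` are all made up for illustration; only the "Total Year" column and the month columns come from the walkthrough.

```python
# Hypothetical wide-format rows; customer names and amounts are invented.
# Only three months are used to keep the example short.
MONTHS = ["January", "February", "March"]

wide_rows = [
    {"Customer Name": "Acme", "January": 100, "February": None, "March": 50, "Total Year": 150},
    {"Customer Name": "Globex", "January": None, "February": 75, "March": 25, "Total Year": 100},
]


def clean_wide_rows(rows, month_columns, total_column="Total Year"):
    """Replace missing month values with 0 and drop the precomputed total."""
    cleaned = []
    for row in rows:
        new_row = {}
        for key, value in row.items():
            if key == total_column:
                # Drop the yearly total so it cannot inflate sums later.
                continue
            if key in month_columns and value is None:
                # A month with no order becomes 0 instead of null.
                value = 0
            new_row[key] = value
        cleaned.append(new_row)
    return cleaned


cleaned_rows = clean_wide_rows(wide_rows, MONTHS)
for row in cleaned_rows:
    print(row)
```

The cleaned rows no longer carry nulls or the yearly total, but they still have one column per month.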
Having each month in its own column makes visualizing this data almost impossible, or at least creates some limitations. It would be better to have a single field that represents the month, along with the value that's associated with it. This is where Unpivot comes in.

All right, I'll go to Transform Data. I'll highlight January, hold my Shift key, and highlight December. Then I'll right-click and choose Unpivot Columns. This creates two new column headers for me: Attribute, which holds the original name of the column header, and Value, which holds the value that was underneath that column. I'll change Attribute to read Month Name, and Value is actually the Amount Ordered. All right, perfect. I want to make sure that I don't have any blanks in the customer names. Perfect, I'm good. I'll click OK and choose Close & Apply. And now I see I have my three column headers: customer name, month name, and amount ordered. This will allow me to stack those displays.

All right, so let me bring my month name into my legend, my customer name to my axis, and then my amount ordered. Okay, there we have those. This display would not be possible had I not unpivoted the data. I could have layered and stacked all of the information and played around with it in an inefficient way, but Unpivot is a winner for all data analysts. If you're a new analyst, be glad you're starting now with tools capable of doing this. And if you're an existing analyst, you will immediately appreciate how powerful the Unpivot command can be.
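If it helps to see what an unpivot does to the rows themselves, here is a small sketch in plain Python of the same wide-to-long reshaping. It only illustrates the idea and is not Power Query's implementation. The sample customers and amounts are invented, only three months are shown, and the helper name `unpivot` is hypothetical; the output field names follow the renamed columns from the walkthrough.

```python
# Hypothetical, already-cleaned wide rows (invented customers and amounts).
MONTHS = ["January", "February", "March"]

wide_rows = [
    {"Customer Name": "Acme", "January": 100, "February": 0, "March": 50},
    {"Customer Name": "Globex", "January": 0, "February": 75, "March": 25},
]


def unpivot(rows, value_columns, name_field="Month Name", value_field="Amount Ordered"):
    """Turn each (row, month column) pair into its own long-format record."""
    long_rows = []
    for row in rows:
        # Columns that are not unpivoted are repeated on every record.
        id_part = {k: v for k, v in row.items() if k not in value_columns}
        for column in value_columns:
            record = dict(id_part)
            record[name_field] = column        # the old column header
            record[value_field] = row[column]  # the value that sat under it
            long_rows.append(record)
    return long_rows


long_rows = unpivot(wide_rows, MONTHS)
for record in long_rows:
    print(record)
```

Two wide rows with three month columns become six records, each with the same three fields. That is the shape that lets one field feed the legend, one the axis, and one the values.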
