UPD: Updated the publication in connection with the release of Power BI Desktop from the Preview stage on July 24.Today I want to tell you about building a simple dashboard with a plan-actual analysis of income and expenses in
Power BI Desktop and
Power BI . In the first part, we will work with queries, learn how to combine data from several sources and perform their cleaning.
Baseline data: several Excel files that store information on the company's income and expenditure indicators over time. Since the form of reports and accounting items changed several times, the information sheets have a similar, but not quite the same structure. At the exit, you need to get something that will allow managers to get an idea of ​​what is happening with the company's finances.
Let's take a look at how Power BI Desktop allows you to combine data and visualize it.
')
Figure 1. Power BI Desktop start windowWhen launching, Power BI Desktop asks you to specify a data source or select one of the recent sources. By the way, about the sources - it is possible to connect to a large number of them, including online and local sources. For example, Google Analytics. In our case, all reports are stored in the same folder, so we will select “Folder” as the source. This choice will allow you to add new reports to this folder in the future and load their data for analysis by pressing the “Update” button.
Figure 2. Select data sourceAfter clicking the "Connect" button, a preview of the query results to the data source is displayed. You can load the data “as is” or switch to the request editing mode by clicking on the “Edit” button. In most cases, the query results need to be processed.
Figure 3. Preview query resultsAfter clicking the “Change” button, the “Query Editor” window automatically opens, in which you can and should perform data cleaning and preparation.
Figure 4. Query Editor WindowAs you can see in the screenshot, Power BI Desktop loaded the files from the folder and basic meta-information. Remove all columns except “Content”, “Name” and “Extension”. Since the file name shows the year to which the corresponding report relates, we use it for that.
If you click on any line in the "Content" column, Power BI Desktop will open the contents of the corresponding book. Please note that in the "Query Parameters" section, all completed steps are recorded and can be changed or deleted. If you open the Advanced Editor window, a window will open in which the program code for all the actions performed will be visible. Yes, Power BI Desktop has its own programming language “M” and this is very cool.
Figure 5. Advanced Query EditorSince the report requires the contents of all Excel files in the folder, not just one, I will delete the last two steps and use some of the “M” functions to parse the contents of the Excel workbooks from the folder.
Before further processing of data, it is also necessary to take into account that not only Excel files can be loaded into the folder. Therefore, you need to apply a filter to the column “Extension”, which will eliminate unnecessary file types.
Figure 6. Applying a filterNow you need to "extract" the contents of Excel workbooks. To do this, I will add a new column using the Excel.Workbook function, which allows you to “extract” the contents of your Excel workbooks. The new column contains the values ​​of the type "Table", which allows you to "expand" its contents into several other columns. When "deploying" you can choose which columns will be displayed. In this case, the semantic load is carried by the columns “Data” and “Item”
Figure 7. Adding a custom columnThe “Data” column contains data from Excel sheets, and I use “Name” and “Item” for timestamps.
Since the “Name” column contains data of the form yyyy.xlsx, where yyyy is the year of the report, we will perform a simple operation to separate the data in the column. Separation can be performed both by the number of characters and by separator. In this case, the column must be divided by separator.
Figure 8. Split a column
Figure 9. Column split settings windowAfter splitting the column, you will need to rename it.
Figure 9. Column split settings window
Figure 10. Prepared for "deployment" requestThen I “unwrap” the Data column and see the contents of all the files and Excel sheets, while in a form that is unsuitable for constructing final reports. But I can use the Power BI Desktop features to clean the data.
Figure 11. Query after “deploying” file contents1. Use the top rows as headers and then rename the columns. I will delete the columns “Actual deviation” and “Deviation in%”. Further we will recalculate them.
2. Delete the rows that contain empty values ​​and the value "Indicator" in the column "Indicator", applying filtering. In the same way, I delete the lines that contain total values, for example, “Total Chod”, “Total”, etc.
Figure 12. Data filtering menu3. Using the function "Replace values" I will replace synonyms, for example, "Income" and "Incomes".
Figure 13. Replacing values4. After a little more study of the contents of the “Indicator” column, I find that all my income relates to the indicator “Income” or “Income”. Everything else relates to costs, which greatly facilitates the task. For convenience, further processing and filtering create a column “Category”, which will contain the value “Income” if the word “Income” is present in the “Indicator” column, and in all other cases it will assume the value “Expense”.
Figure 14. Adding a Category column5. Next, I need to specify that the type of values ​​in the “Plan” and “Fact” columns is a decimal number. But before that, I need to remove spaces from the contents of these columns.
6. After performing the operation, I check the columns for errors and negative values. Since in our case the presence of a negative value means an input error, using the value conversion function, I select the absolute value in the “Plan” and “Fact” columns. This completes the basic data cleaning.
7. In order to display indicators with a time reference, you must specify a date for each record. The table contains the month in text form and year. For convenience, we assume that the data is displayed at the end of each month. Here we have a problem - the language “M” does not allow us to convert the names of the months into dates at the moment. Therefore, you will need to take a few intermediate steps.
8. Create a new query that will contain the names of the months and their numbers. To do this, create an empty query, open the advanced editor and paste the following code:
let
Source = {"", "", "", "", "", "", "", "", "", "", "", ""},
#"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
#"Added Index" = Table.AddIndexColumn(#"Converted to Table", "Index", 0, 1),
#"Added to Column" = Table.TransformColumns(#"Added Index", {{"Index", each List.Sum({_, 1})}}),
#"Renamed Columns" = Table.RenameColumns(#"Added to Column",{{"Column1", ""}})
in
#"Renamed Columns"
Figure 15. Adding an empty query9. I turn to the query “Plan-Fact” and combine the queries by selecting the desired type of union.
Figure 16. Merge Queries10. The data from the “Months” query was added as a new column. I open it and now we have all the data for the formation of the date. I create a new column with the name “Date” using the formula: = Date.EndOfMonth (#date ([Year], [Index], 1)) The column has been added and it contains the last day of each month. In order for Power BI Desktop to group dates by months and years, you will need to explicitly set the “Date” type
Figure 17. Adding the Date column11. Remove the Index, Year, and Month columns. We don't need them anymore. Then you need to click on the "Close and Load" button to proceed to the modeling and visualization of data.
At this point, the basic part of data processing is completed and you can proceed to visualization. The possibilities of data visualization in Power BI Desktop will be discussed in the
next section .