So after the Power BI Desktop July update I’ve decided to check out two of the feature updates. In this post I’ll be discussing using R scripts inside the Query Editor.
So I diligently followed the official Microsoft Docs instructions found here: https://docs.microsoft.com/en-us/power-bi/connect-data/desktop-r-in-query-editor
I already had R installed on my machine, but for some reason Power BI had some issues connecting to the installation and libraries. This is an issue I had before when I was playing around with the R and Python visuals. Again, I’ll just chalk that up to me being clumsy and not paying enough attention during the installation process. But once I got it to work, it worked quite smoothly. It almost felt like I was just coding in a notebook, and the things I expected to happen happened with the data.
The processing was quite fast, surprisingly fast actually. This probably has to do with the fact that you don’t need to render visualizations to check if the output is correct. So less computing is needed, in a way. And if R is your go-to programming language, you’ll probably prefer this option for molding your data before reporting over the usual Power Query M.
So will I swap out Power Query M for R? No. And that is mostly because I’m a Python girl. And the option to run Python scripts has been around for at least two years already (check the date on this Microsoft Docs page). Did I trade Power Query M in for Python in the past two years? Also no. So R, in that sense, did not stand a chance. Since for dashboarding and reporting I primarily work with Azure Analysis Services, Power BI and some Excel, Power Query M is just the more optimal language choice.
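For context, a “Run Python script” step in the Query Editor works much the same way as the R one: Power BI hands your current query to the script as a pandas DataFrame named `dataset`, and any DataFrame the script produces becomes a selectable output table. Here’s a minimal sketch of that kind of transform step; the column names and threshold are made up for illustration, and the mock `dataset` stands in for what Power BI would normally inject:

```python
import pandas as pd

# In Power BI, `dataset` is injected automatically as the current query.
# For a standalone run we mock it with hypothetical sales data.
dataset = pd.DataFrame({
    "Region": ["North", "South", "North", "East"],
    "Sales": [1200, 800, 1500, 950],
})

# A typical shaping step: filter rows above a threshold
# and add a derived column, instead of doing it in Power Query M.
filtered = dataset[dataset["Sales"] > 1000].copy()
filtered["SalesK"] = filtered["Sales"] / 1000

print(filtered)
```

Whether you write this in Python or R, the notebook-like feel described above is the same: you shape the data frame, and the shaped result flows on as the next query step.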
Boy… that was a lot of Power Query M mentions in 1 paragraph. Let me just mention it one more time: Power Query M.
More reasons why
But aside from personal preferences about when to use which language for which purpose with which tools… I do have some other concerns about why I’ve decided running R scripts in the Query Editor is a pass for me.
- The data privacy level has to be set to Public, which gives me the shivers. Imagine having to mark data from your organization, which could potentially have some sensitive components, as Public. Not sure what I’m going on about? Read this Microsoft document on privacy levels.
- Although I did say that the processing was surprisingly fast, I’m still worried about performance. The sample data set is quite small, which means the script only has to run on a small number of rows. But the data I work with on a daily basis is significantly bigger. Like so big that I needed to re-evaluate my cube contents and structure. And honestly, Power BI for me is just not the place to go for big data processing, because it’s slow. Maybe if you have a higher tier of the Premium version, this argument wouldn’t apply. But not everyone can afford to fork out that kind of money. (And before you blame my lovely machines, even though they’re both from 2018, they can handle heavy loads perfectly fine.)
- You need a Personal Gateway to be able to refresh the data outside of Power BI Desktop. So if your machine is turned off or offline during the scheduled data refresh, it will not refresh. Also, don’t get me started on credentials and the hassle when you need to reinstall it for whatever reason. Enough reason for me to avoid it like the plague.
So… even though I think it’s a very cool feature, I’ll probably hardly use it. If I use it, it’ll probably be for fun.
But hey! That’s just my two cents. 😉