Microsoft created Purview, a data governance solution. Ever since Microsoft Fabric was in preview, there has been a promise of deeper integration between Fabric and this governance solution. This integration is finally available. This governance solution is a complete world by itself. This is only a small summary of what’s available. First access to Purview Purview … Read more
The May Microsoft Fabric updates bring news about Time Travel in a Data Warehouse. This is good but surprising, because this feature has been available for a while. Let’s discover what’s new in Data Warehouse time travel. Time travel is a feature of Delta Tables which allows us to retrieve the data as it was in … Read more
Sometimes, when a new feature is announced, it in fact hides bigger changes to the entire environment. This is exactly what happens with Workspace Identity and Resource Instance Rules. The announcement: We can create a OneLake shortcut to an Azure Storage Account protected by the Storage Account firewall. This seems to be a very specific … Read more
Over the years Power BI has evolved into a complex and varied ecosystem of tools and solutions, which in turn demands several supporting roles: there are, of course, developers, data engineers and data scientists, but there is a need for one more: a capacity administrator. Of course some of these roles may be covered … Read more
Fabric pipelines use JSON as source code, and they are also saved in repositories as JSON. The first idea we get is to edit the pipeline in JSON format. We can copy the JSON and create new pipelines with small variations, making changes directly in the JSON. However, at first sight we get disappointed, because the … Read more
Over the past years, “traditional” ETL development has morphed into data engineering, which takes a more disciplined software engineering approach. One of the benefits of a more code-based approach to data pipelines is that it has become easier to build metadata-driven pipelines. What does this mean exactly? Say, for example, you need to … Read more
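As a rough illustration of the metadata-driven idea, here is a minimal sketch in plain Python. The table names, the `TableSpec` fields and the `build_step` helper are all hypothetical and are not taken from the article: the point is only that one generic function produces a step for every row of metadata, so adding a table means adding a row rather than writing new pipeline code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableSpec:
    """One row of pipeline metadata (hypothetical fields)."""
    source: str
    target: str
    load_type: str  # "full" or "incremental"


# Hypothetical metadata; in practice this could live in a table or a file.
METADATA = [
    TableSpec("sales.orders", "bronze_orders", "incremental"),
    TableSpec("sales.customers", "bronze_customers", "full"),
]


def build_step(spec: TableSpec) -> dict:
    """Turn one metadata row into a generic copy-step description."""
    return {
        "name": f"copy_{spec.target}",
        "source": spec.source,
        "target": spec.target,
        # Incremental loads append; full loads replace the target.
        "mode": "append" if spec.load_type == "incremental" else "overwrite",
    }


def build_pipeline(metadata: list) -> list:
    """Generate every step from metadata instead of hand-writing each one."""
    return [build_step(spec) for spec in metadata]


steps = build_pipeline(METADATA)
for step in steps:
    print(step)
```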
PySpark is a powerful language for data manipulation and it’s full of tricks. Let’s discover some of them. Control the Type of a NULL column If you are creating a PySpark DataFrame, but one of the columns contains only null values (None), how could you control the type of the column? There is an interesting … Read more
When implementing real-time ingestion, we usually implement an architecture called lambda. Using the lambda architecture, KustoDB in Microsoft Fabric is always recommended for the speed layer. Do you know why? Let’s analyze in detail. 1 – KustoDB uses SSD KustoDB uses an internal SSD storage. Lakehouses use ADLS as their backend. In this way, Kusto … Read more
I have been talking about Data Exploration in Power BI in many of my sessions, especially the sessions about Data Marts. The new data exploration feature is one more feature in this expanding scenario for data exploration. This one brings some interesting details. We start using this feature from a query. The feature will allow … Read more
Finally, mirroring is available for Fabric! You can mirror an Azure SQL database to Fabric. It works for Cosmos DB and Snowflake as well, but in this article, I will focus on Azure SQL. Is it 100%? No, but it is definitely a great feature even now. Before getting into a step-by-step of the … Read more
Let’s consider a simple statement to partition and save a table in a lakehouse: df.write.mode("overwrite").format("delta").partitionBy("Year","Month","Day").save("Tables/" + table_name) Let’s consider we load the data daily, with all the transactions from the day. The table will save the transactions for each day in different partitions. We can expect the table to keep the partitions from the previous day, … Read more
In the blog post Fabric Notebook and Deployment Pipelines I explained a technique to keep notebook configuration values in JSON files on lakehouses, a good solution from many different points of view. What if we need to maintain the JSON configuration file using notebooks? The first problem is the fact the typical statement to … Read more
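A minimal sketch of maintaining a JSON configuration file from code, using only the Python standard library. It assumes the file is reachable through an ordinary file path; here a temporary folder stands in for the lakehouse file area, and the `update_config` helper and the configuration keys are hypothetical. It does not reproduce the specific statement or problem the article goes on to discuss.

```python
import json
import tempfile
from pathlib import Path


def update_config(path: Path, changes: dict) -> dict:
    """Read the JSON config (if it exists), apply changes, and write it back."""
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.update(changes)
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


# A temporary folder stands in for the lakehouse file area (assumption).
with tempfile.TemporaryDirectory() as folder:
    config_path = Path(folder) / "config.json"

    # First call creates the file; second call changes one key and adds another.
    update_config(config_path, {"environment": "dev"})
    result = update_config(config_path, {"environment": "test", "retries": 3})

    # Re-read from disk to confirm the file holds what was returned.
    saved = json.loads(config_path.read_text(encoding="utf-8"))
    print(saved)
```

Reading the whole file, changing the dictionary and writing it back keeps the file valid JSON at every step, which is simpler than trying to patch the text in place.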
Dataflows Gen 2 are the new version of Power BI dataflows. There are so many changes compared to the previous version that they are considered a new feature. The main difference is the possibility to set a target for the result of each query in the dataflow. In this way, it can be used as … Read more
In my article about Source Control with GIT, Power BI and Microsoft Fabric, I illustrate how to use the PBIP file format to include Power BI reports and semantic models in a source control process and establish an SDLC (Software Development Lifecycle) for Power BI. However, the complete explanation is based on saving the development using … Read more
When organizing our SDLC (Software Development Lifecycle) in Power BI/Fabric, we use Deployment Pipelines and create rules to change connection configurations every time we promote an object from one environment (dev for example) to another (test, for example). Kusto connections, on the other hand, are not so simple. You can check more about Deployment Pipelines … Read more
Eventstream has many differences compared to the technologies it proposes to replace: Event Hub, Stream Analytics, Streaming Dataflows and more. We can compare these technologies, but Eventstream in Microsoft Fabric has some specific differences from all of them. One of the differences is how the transformation of the input data is linked to the … Read more
Power BI and Fabric are implementing source control support. It’s a long-awaited feature for Power BI. However, it’s important to highlight some basic principles which should be followed as source control best practices. Some of them apply to any project in source control, some are specific for this environment, and some are specific for this … Read more
Nikola Ilic, best known as Data Mozart, published a great article and video about how to make semantic model data available in Microsoft Fabric. This allows the data to be used in lakehouses or data warehouses. One major question that arises is, “should we use a top-down or bottom-up (or both) approach in Microsoft Fabric?” … Read more
Recently Azure Resource Graph was announced as a new connector in Power BI. Azure Resource Graph provides access to almost all resources inside the Azure environment of a company. Why is this important? Resource Graph by itself is a very important tool to analyze the provisioned resources in an Azure environment without losing control of … Read more
We can say Fabric is the evolution of the Power BI environment. Power BI is a self-service environment, and so is Fabric. This allows the implementation of very interesting architectures, which will be the subject of future videos and articles. However, it’s not something free-and-easy, and it shouldn’t be. Using Fabric Admin Portal (or Power … Read more