From the course: Advanced Microsoft Fabric Implementation and Governance

OneLake and workspaces

- [Presenter] Microsoft Fabric has a single OneLake shared across the entire tenant, and the data within it is stored and accessed in workspaces. This raises a lot of questions about how to organize your data, and we're going to answer those here. If you are wondering how to implement a medallion architecture or organize your data in different workspaces, this section is where we're going to clarify it. So, OneLake is really, underneath the covers, an Azure Data Lake Storage Gen2 data store with many folders representing workspaces. And within Fabric, data content is managed and secured in these workspaces. Data is organized in workspaces and access is granted accordingly. You can create folders to sort elements, but you can't assign security that well to individual folders.

So the medallion architecture in Fabric gathers data in the raw or bronze area, where you might have shortcuts to different data sources. You'll likely have Azure data pipelines here to move the data to the silver or staging workspaces. Data in raw workspaces is not going to be modified. This is where the data lands from its original source. You'll need to develop, test, and release pipelines and lakehouses for data contained here using data pipelines, which we are going to be talking about later in the course. Dataflow Gen2 or ADF pipelines will transform your data into a star schema for Power BI analytical reporting in the gold or enriched layer. In the silver layer, you'll need development, test, and production workspaces. And the gold area, the curated or enriched area, that's where you're going to have the files, lakehouses, and SQL endpoints used for Power BI reporting. Security should allow many people to access the data stored in these different areas, which will likely also be moved through development pipelines from dev to test to prod. Within OneLake, workspaces will contain versions of the data for the different stages of development.
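To make the "folders representing workspaces" idea concrete, here is a minimal Python sketch of how a OneLake path is put together in the ADLS Gen2 style, with the workspace first, then the item and its type, then the path inside the item. The workspace and lakehouse names (`dev_sales_raw`, `SalesLakehouse`) and the helper `onelake_uri` are hypothetical and are not from the course.

```python
# OneLake exposes an ADLS Gen2-compatible endpoint; the workspace takes the
# place of the storage container in the URI.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_uri(workspace: str, item: str, item_type: str, path: str) -> str:
    """Build an abfss:// URI for a file or folder inside a Fabric item.

    Hypothetical helper for illustration only.
    """
    clean_path = path.strip("/")  # avoid doubled or trailing slashes
    return f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}/{clean_path}"


# Hypothetical names: a raw (bronze) dev workspace holding one lakehouse.
raw_orders = onelake_uri("dev_sales_raw", "SalesLakehouse", "Lakehouse", "Files/orders/2024")
print(raw_orders)
```

Because the workspace is the first segment of every path, access granted on a workspace covers everything stored under it, which is why the layers are split into separate workspaces rather than folders.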
The gold or curated areas will have read-only access to several workspaces designed solely for reporting. Let's take a look at how the medallion architecture is implemented in OneLake and Microsoft Fabric. You can see here that we have a number of different workspaces. In this particular environment, the bronze layer is called raw, so you'll notice that we have a dev raw and a dev stage, and stage is what a medallion architecture commonly calls the silver layer. And then our production layer just doesn't have a name. It's the one that's default, but we have one of these for dev. The reason that we have it for dev is because we're playing around with things like, you know, creating our different pipelines, perhaps adding notebooks, creating a lakehouse to investigate. We need to do this for dev, for test, and then for the actual production release. In raw, we're bringing in our raw data, and perhaps it's in a shortcut to an ADLS account, and this is going to land in that raw area. And then the data is processed in the stage areas. As you can see here, we also need to test that. And finally, it's moved to the areas that we are going to be using for Power BI reporting, the gold or the curated areas. And here they're not labeled. The ones at the end would be the test, the sales, and the dev area. These would be the curated areas which are going to be used to connect Power BI. It does end up with a lot of workspaces.
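To see how quickly the count grows, here is a minimal Python sketch that lists one workspace per layer and per release stage. The naming convention (an environment prefix for dev and test, no prefix for production) loosely follows the demo, but the exact names, the `sales` subject area, and the helpers `workspace_name` and `plan_workspaces` are assumptions for illustration.

```python
from itertools import product

# Layers as named in this environment: raw (bronze), stage (silver), curated (gold).
LAYERS = ("raw", "stage", "curated")
ENVIRONMENTS = ("dev", "test", "prod")


def workspace_name(subject: str, layer: str, env: str) -> str:
    """Name a workspace; production carries no environment prefix (assumed convention)."""
    prefix = "" if env == "prod" else f"{env} "
    return f"{prefix}{subject} {layer}"


def plan_workspaces(subject: str) -> list[str]:
    """Return one workspace name for every layer and environment combination."""
    return [workspace_name(subject, layer, env) for layer, env in product(LAYERS, ENVIRONMENTS)]


workspaces = plan_workspaces("sales")
for name in workspaces:
    print(name)
print(len(workspaces), "workspaces for one subject area")
```

Three layers across three stages already gives nine workspaces for a single subject area, so a consistent naming scheme is worth settling before the tenant grows.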
