by Nate Wang, Consultant – Altis Melbourne
Using both Azure SQL Database and Power BI as Business Intelligence tools in your company?
Then this easy, but often neglected, functionality might help you supercharge your reports’ performance with no additional cost. Sounds good? Read on!
This blog post will cover:
- The background
- What is the Azure SQL Database read-only replica?
- Benefit & Limitation
- Methods & Step-by-step guide
- Testing & Verification
Background
One of our clients’ company is using the Microsoft BI services extensively, which involves Azure SQL Database and Power BI.
As we’re using the Premium tier of Azure SQL Database, it comes with an accessible read-only replica. So, naturally, a service like Power BI which only requires read-access is the perfect candidate to be switched to read-only replica. However, I didn’t manage to find any Microsoft official documentation on the method to do this.
After some research, I’ve found a very straight forward solution and it’s almost as easy as a click-of-a-button. And to save you the time and efforts to track down all the information and piece them together, I will include all my findings for you in this blog.
(If you know the relevant concepts already, you can jump straight to the step-by-step guide section.)
What is the Azure SQL Database read-only replica?
As part of the architecture for the ‘Always-on Availability Group’, the read-only replicas (or secondary replicas) of Azure SQL Database are provisioned (free of charge) to enable the feature of Read Scale-out. And one of them has endpoint set up, therefore, available for read-only access. (However, at any point of time, only one of the AlwaysON replicas is accessible by the ReadOnly sessions.)
Currently, of the three Microsoft Azure SQL Database offerings, two of them have read-only replica available for access:
General Purpose/Standard (no read-only replica available)
Hyperscale (read-only replica available, default to Disabled)
Business Critical/Premium (read-only replica available, default to Enabled)
As shown in the graph below, Business Critical instance that has one primary instance and three replicas where you can access read-only data in one instance using the read-only endpoint.I won’t go into details in this blog about the ‘Always-on Availability Group’, but you can find more information here.
Also, Read Scale-out is a ‘General Availability’ functionality for Azure SQL Database since March 2019, so it has been proven to be reliable. Find out more about it here:
Use read-only replicas to load-balance read-only query workloads
General availability: Read scale-out support for Azure SQL Database (29 March 2019)
Benefits & Limitations
Benefits of Connecting to Read-Only Replicas
‘Increase’ compute power
‘double’ the compute power with no added cost (pretty awesome right?) – the main and read-only replicas are essentially separate database instances, so the accessible read-only replica can act as the 2nd database to process the read-only workloads.
Load-balancing for primary database instance
By off-loading some read-only workloads to the read-only replica, it frees up the resources on the main database instance for mission-critical workloads (e.g. Scheduled Data Ingestion/Processing) so that the resource bottleneck is effectively mitigated during ‘peak hour’ of the day.
Limitations of Read-Only Replicas
Data latency
sync with primary instance within seconds, but still shouldn’t be used by reports with real-time data or time-critical data.
Data consistency
data consistency is maintained with transactional logs and at a session level, so while a process is locking the resources in read-only replica, the changes in the main instance might not be reflected in the read-only replica immediately.
Rule of thumb
In short, if your report doesn’t contain any real-time data, it’s a good idea to use the read-only replica as a source.
However, you should keep the ‘balance of things’ too. Having a read-only replica with no additional charge is awesome, but it doesn’t mean you should point all your read-access processes to it. The balance of the workloads between the main and the read-only replica should be controlled and managed.
What are the ways you can connect to read-only replicas?
- API
- Power BI
- SQL Server Management Studio
In this blog, I’ll go through all three methods of connecting to read-only replicas.
Step-by-step Guide: Connecting to Read-Only Replica
API
When you’re connecting to a database with read-only replicas available, the ‘ApplicationIntent’ option in the connection string provided by the client dictates whether the connection is routed to the primary instance or to a read-only replica. Specifically, If the ApplicationIntent value is ReadOnly, the connection is routed to a read-only replica. (the default value is ReadWrite). For example, the following connection string connects the client to a read-only replica (replacing the items in the angle brackets with the correct values for your environment and dropping the angle brackets):
Server=tcp:<server>.database.windows.net;Database=<mydatabase>;ApplicationIntent=ReadOnly;User ID=<myLogin>;Password=<myPassword>;Trusted_Connection=False; Encrypt=True;
SQL Server Management Studio
Step-1: Choose the server name
Step-2: Click “Options” button and then “Additional Connection Parameters” and type ApplicationIntent=ReadOnly
Step-3: Click the “Connect” button
Validating connection to read-only database:
Once you are in the Microsoft SQL Server Management Studio Query window.
Execute:
SELECT DATABASEPROPERTYEX(DB_NAME(), ‘Updateability’)
You should be getting the result as ‘READ_ONLY’. This confirms you are in read-only database.
Power BI
How can Power BI access the read-only replica?
Since Power BI service only requires read-access to the Azure SQL Database, it seems like it’s the ideal type of service to point to the read-only replica.However, since Azure Active Directory is not enabled in this case for Power BI, the conventional method of ODBC/JDBC connection is inapplicable. (as seen in the below snapshot)
OLE DB doesn’t support Microsoft Active Directory.
After some research, I’ve found a solution and the cherry on the top is that, with any exist Power BI dashboard, it’s as easy as selecting a button and re-publishing the dashboard.
Step-1: Choose the server name and database name.
Step-2: Click “Advanced options” button and then “Additional Connection Parameters” and select “Enable SQL Server Failover support”. Then click OK.
This enables MultiSubnetFailover (fail over function in MS availability group) and set ApplicationIntent to ‘read-only’ (to use the read-only replica of SQL DB).Bonus – by enabling MultiSubnetFailover option in the process, this gives you all the added benefits of Always-on Availablity groups too, which ensure the availability of your reports and disastrous backups.
Step-3: Click “Apply changes” button.
Testing & Verification
Verification of this Approach
Simply selecting a button to make this happened seems almost too easy.
So, we want to be sure that, with ‘Enable SQL Server Failover Support’ enabled, Power BI refresh process only connects to the read-only replica. We can prove this by monitoring the system tables on both the main and the read-only replica instances of Azure SQL Database.After repointing an existing Power BI dashboard from the main database instance to the read-only instance, I refreshed the Power BI dataset.
Then, immediately, I’m tracking the Power BI refresh process in the database by monitoring the ‘Mashup Engine’. And as can be seen in the snapshots. This process only runs in the read-only database instance after the change.Please find the SQL query here:
SELECT DATABASEPROPERTYEX(DB_NAME(), ‘Updateability’) AS Access;
SELECT
req.session_id,
req.status,
req.command,
req.cpu_time,
req.total_elapsed_time,
s.host_name,
s.program_name,
s.login_name
FROM sys.dm_exec_requests req
INNER JOIN sys.dm_exec_sessions s ON req.session_id = s.session_id
WHERE s.program_name = ‘Mashup Engine’;
An alternative way of verification can be found here – (SQL server profiler monitoring at 8:45)This is also supported by comments from Microsoft sources – (Miguel is a program manager at Microsoft – January 9, 2017)
Testing of Power BI Refresh Performance & Compute Usage
Overall, refreshing Power BI dataset takes roughly the same time to finish in the Primary instance and the Read-only Replica. Since it won’t clash with other workloads, there will be some potential performance boost as well.For your reference, below are some details on the compute usage and refresh performance.
This is a relatively quick change but has a huge impact on the compute utilisation of the database.
Compute Usage – daily Power BI refresh linked to Primary Instance
100% DB compute usage for the duration of refresh (~15 min)
This may cause slower performance for other processes during this period (e.g. data ingestion, data processing and UI refreshes)
Compute Usage – Power BI dataset refresh linked to Read-only Replica
0% DB compute usage on the primary instance
Power BI Refresh Performanceroughly the same time to finish refresh (faster at times, possibly due to less clashing workloads)
Final Remarks
In this blog, we covered all things related to the Read-only Replicas of Azure SQL Database. If you have any question or issue that requires Power BI expertise, please don’t hesitate to reach out to us. Good luck and have fun with your Power BI data journey.