-
AlwaysON Issues
1. Quarantined nodes in Windows Failover Clusters:
I’ve observed this issue on Windows Server 2016, where one of the nodes were quarantined after many node failover attempts in an hour. For the following few hours, the WSFC prevents the node from rejoining the cluster. This could be due to a network issue that I’ve noticed in the current environment. Quarantined receives a ping response from one node. Intermittently, the ping response from one node to the quarantined issue node receives a Timeout response.
For more details go through below link it helped me a lot :
https://www.mssqltips.com/sqlservertip/6714/windows-failover-cluster-node-quarantined/
2. You will encounter an error, The primary replica is not Active .The command cannot be executed.
This is due to an issue with the endpoint port not listening on the correct port or the tcp endpoint being stopped for some reason.
Go through below link for more :
3. Unexpectedly, an availability group was missed, dropped, or removed.
This was caused by the fact that SQL was not being transmitted to WSFC. The SQL will then delete the Availability group.
Please go through below link for more details:
4. Availability group is in Resolving State :
Issue : For a few minutes, there was a cluster failure that affected the availability group, and the replicas went into a resolving state. The replicas returned to their regular primary and secondary states once the cluster was brought back online, however several databases were still not synchronising. Furthermore, the databases on the primary were unavailable.
Resolution : The only remedy we found was to restart the primary replica’s SQL instance. Even a simple restart of the SQL service through the config manager, however, was stuck on “stopping service.” We had to use the TSQL command “SHUTDOWN WITH NOWAIT” to make SQL to stop. The databases were available again after SQL was brought back up, and AG was in sync and healthy.
For further details go through below link :
https://blog.pythian.com/resolving-an-availability-group-sync-issue-hadr_ar_critical_section_entry/
Issue : The secondary replica does not correctly transition to the primary role if an automated failover event fails. As a result, the availability replica will indicate that this replica is in the process of resolving. Furthermore, the availability databases state that they are not synchronising, and apps are unable to access them.
Reasons to occur :
Case 1 : The value of “Maximum Failures in the Specified Period” has been reached.
The clustered resource will fail three times in a six-hour period if the default behaviour is followed. RESOLVING STATE is applied to the AG replica.Case 2 : Local SQL Server NT AUTHORITY\SYSTEM login account has insufficient NT Authority SYSTEM account rights. The following permissions are provided by default to this local login account:
Alter Any Availability Group
Connect SQL
View server stateCase 3 : If one of the availability databases in the availability group is in the synchronizing or not synchronized state, automatic failover will not be able to successfully transition the secondary replica into the primary role.
To get more details check below link :
5 . Diagnose Unexpected Failover or Availability Group in RESOLVING State
Lease Timeout : A lease timeout can be triggered if SQL Server does not react within the normal 20-second lease timeout period.
Lease Timeout CAUSE – 100% CPU Utilization: A lease timeout might occur if CPU utilization is extremely high for an extended length of time. Using Performance Monitor, keep an eye on your CPU usage.
Sync Issues :
how to troubleshoot Always On synchronization issue?
Multiple reasons for the database status changing to not in sync
- Network Issue
- Huge transactions
- Space Issues
-
SPN Registration Of Windows Service Accounts and Permissions
SPN Registration:
SQL Server can be started and operated using domain user accounts, local user accounts, managed service accounts, virtual accounts, and built-in system accounts. After installation, a startup account must be configured for each SQL Server service in order for it to function.
The accounts that can be used to start SQL Server services, the default values in SQL Server Setup, the concept of per-service SIDs, startup options, and firewall configuration are all covered in this section.Default Service Accounts:
The table below lists the default service accounts that setup uses when installing all components. The default accounts listed are the suggested accounts, unless otherwise stated.
Stand-alone Server or Domain Controller
Component Windows Server 2008 Windows 7 and Windows Server 2008 R2 and higher Database Engine NETWORK SERVICE Virtual Account* SQL Server Agent NETWORK SERVICE Virtual Account* SSAS NETWORK SERVICE Virtual Account* ** SSIS NETWORK SERVICE Virtual Account* SSRS NETWORK SERVICE Virtual Account* SQL Server Distributed Replay Controller NETWORK SERVICE Virtual Account* SQL Server Distributed Replay Client NETWORK SERVICE Virtual Account* FD Launcher (Full-text Search) LOCAL SERVICE Virtual Account SQL Server Browser LOCAL SERVICE LOCAL SERVICE SQL Server VSS Writer LOCAL SYSTEM LOCAL SYSTEM Advanced Analytics Extensions NTSERVICE\
MSSQLLaunchpadNTSERVICE\
MSSQLLaunchpadPolyBase Engine NETWORK SERVICE NETWORK SERVICE PolyBase Data Movement Service NETWORK SERVICE NETWORK SERVICE Service Accounts for SQL Services When resources from outside the SQL Server workstation are needed, Microsoft advises using a Managed Service Account (MSA) with the least amount of privileges. When set up on a Domain Controller, a virtual account cannot serve as the service account.
SQL Server Failover Cluster Instance
Component Windows Server 2008 Windows Server 2008 R2 Database Engine None. Provide a domain user account. Provide a domain user account. SQL Server Agent None. Provide a domain user account. Provide a domain user account. SSAS None. Provide a domain user account. Provide a domain user account. SSIS NETWORK SERVICE Virtual Account SSRS NETWORK SERVICE Virtual Account FD Launcher (Full-text Search) LOCAL SERVICE Virtual Account SQL Server Browser LOCAL SERVICE LOCAL SERVICE SQL Server VSS Writer LOCAL SYSTEM LOCAL SYSTEM SQL Service Accounts for SQL Components Managed Service Accounts, Group Managed Service Accounts, and Virtual Accounts are all types of managed service accounts.
In order to give essential programs like SQL Server their own accounts without requiring an administrator to manually manage the Service Principal Name (SPN) and credentials for these accounts, managed service accounts, group managed service accounts, and virtual accounts were created. The long-term maintenance of service account users, passwords, and SPNs is made simpler as a result.
Managed Service Accounts:
The domain controller can establish and administer a specific kind of domain account called a Managed Service Account (MSA). It’s used to execute a service and is connected to a single member computer. The password is automatically managed by the domain controller. A MSA can be used to start a Windows service by a computer, but it cannot be used to log into a computer. When allowed read and write service PrincipalName rights, an MSA can register a Service Principal Name (SPN) in Active Directory. A $ suffix, such as DOMAINACCOUNTNAME$, identifies an MSA. The password box should be left empty when specifying an MSA. A MSA cannot be used on several nodes because it can only be used on one machine. Use the before setting up SQL Server
Group Managed Service Accounts:
Group Managed Service Accounts are MSAs for multiple servers (GMSA). Windows controls a service account for applications running across a number of hosts. Active Directory automatically changes the group managed service account password without having to restart any services. To set up SQL Server services, you can utilise a group managed service account principle. Starting with SQL Server 2014, group managed service accounts are available for standalone instances, and for failover cluster instances and availability groups, group managed service accounts are available starting with SQL Server 2016.To use a group managed service account for SQL Server 2014 or later, the operating system must be Windows Server 2012 R2 or later. KB 2998082 must be deployed on Windows Server 2012 R2 servers in order for services to log in without interruption following a password change. For further information, see Group Managed Service Accounts. Note: The domain administrator must create the group managed service account in Active Directory before SQL Server setup can utilize it for SQL Server services.
Virtual Accounts:
In Windows Server 2008 R2 and Windows 7, virtual accounts are managed local accounts that simplify service administration by offering the following features. The virtual account has automatic administration and a domain environment connection capability. If the default option for service accounts is selected during SQL Server setup, a virtual account in the format NT SERVICE is created with the instance name as the service name.To start SQL Server, while creating a virtual account, leave the password blank. If the virtual account is unable to do so, manually register the Service Principal Name (SPN).For further details on manually registering an SPN, see Manual SPN Registration.
Note:
Virtual accounts cannot be utilized with SQL Server Failover Cluster Instances because they do not share the same SID across all cluster nodes. The table below gives some names of virtual accounts as examples.
Service Virtual Account Name Default instance of the Database Engine service NT SERVICE\MSSQLSERVER Named instance of a Database Engine service named PAYROLL NT SERVICE\MSSQL$PAYROLL SQL Server Agent service on the default instance of SQL Server NT SERVICE\SQLSERVERAGENT SQL Server Agent service on an instance of SQL Server named PAYROLL NT SERVICE\SQLAGENT$PAYROLL Virtual SQL Service Accounts For further information on Managed Service Accounts and Virtual Accounts, see the Managed Service Accounts Frequently Asked Questions and the Managed Service Accounts ideas section of the Service Accounts Step-by-Step Guide (FAQ).
Security Note:
Always utilise the fewest user rights possible while running SQL Server services. When feasible, utilise an MSA or virtual account. When MSA and virtual accounts are not available, utilise a specific low-privilege user account or domain account rather than a shared account for SQL Server services. Use different accounts for the various SQL Server services. Don’t grant the service groups or the SQL Server service account any new rights. When a service SID is supported, permissions will be given either directly to a service SID or through group membership.
The following page has further details on SPN Registration:
https://help.deepsecurity.trendmicro.com/11_2/on-premise/kerberos-mssql.html
-
SSL certificate for SQL AlwaysOn configured servers
The steps listed below can be used to generate an SSL certificate for SQL AlwaysOn enabled servers:
- Obtain a server certificate: The first step is to either construct a self-signed certificate or obtain a server certificate from a reputable certificate authority (CA), such as DigiCert or Comodo. To create a self-signed certificate, you can use an application like OpenSSL. Self-signed certificates are not validated by a third party and are only accepted if the client adds them as an exception.
- Install the certificate: After obtaining the certificate, you must instal it on the AlwaysOn availability group’s primary and secondary replicas. Installing the certificate on a local machine is possible by using the Microsoft Management Console (MMC) Snap-in.
- After installing the certificate, you must setup the primary replica in order to use it for SSL encryption. You can accomplish this by configuring the “ForceEncryption” option for the SQL Server network protocol to “Yes” using the SQL Server Configuration Manager.
- Add the certificate to the listener for the availability group: The certificate must now be added to the availability group listener. The SQL Server Management Studio (SSMS) can be used for this.
- Check the configuration: Next, make sure the certificate is set up correctly and that SSL encryption is being used. The “sys.dm exec connections” DMV can be used to determine whether SSL is enabled.
- Setup SQL Server: Launch SQL Server Management Studio, connect to the server, and access the server properties to set up SQL Server to use the SSL certificate. Right-click the “TCP/IP” protocol under “Protocols for SQLEXPRESS” on the “Security” tab and choose “Properties.”
- Choose the certification: Choose the certificate you want to use for SSL encryption from the “Certificate” page.
- Restart SQL Server: In order for the modifications to take effect, SQL Server must finally be restarted.
It’s also crucial to remember that before deploying the certificate to your production environment, you should test it to ensure that it is functioning properly. This will guarantee that the certificate is installed properly and that the configuration is done correctly.
How to add the certificate to the availability group listener?
- Open the SQL Server Management Studio (SSMS) and establish a connection to the availability group’s primary replica.
- Expand the “Availability Groups” folder after navigating to the “Always On High Availability” folder.
- Select “Properties” by performing a right-click on the availability group listener.
- Go to the “Certificate” tab in the properties window.
- From the drop-down list, choose the certificate you wish to use for the listener.
- To save the changes, click “OK.”
- Use the T-SQL command below to check the listener’s status:
SELECT * FROM sys.availability_group_listeners
- For the modifications to take effect, restart the availability group listener.
Note: It’s crucial to remember that this phase needs to be completed on the primary replica, and the listener needs to be available and online. All of the replicas in the availability group should also have the certificate installed.
The certificate for the availability group listener can be be created and configured using the New-AzSqlAvailabilityGroupListenerCert PowerShell command.
-
SSL Certificate not visible in SQL Configuration Manager Issue
How can you tell if SQL Server successfully loaded the SSL certificate?
- The SQL Server Error Log will indicate that the certificate “ABCXYZ” has been loaded for encryption once you restart your SQL Server service, indicating that the certificate has been loaded.
- You can use the following query to see if the requests are being encrypted:
select * from sys.dm_exec_connections where encrypt_option = ‘TRUE’
- In order to discover which computer is making encrypted requests, you can identify the client information with the help of the columns “client net address” and “session id.”
Note: Since all requests are encrypted by default due to server-side authentication, you should see that for all requests! To view each connection request, execute the command “select * from sys.dm exec connections.”
What if the SSL certificate is still not shown in SQL Server Configuration Manager?
- In SQL Server Configuration Manager, there is a chance that you won’t initially see the SSL certificate in the drop-down list for Certificate.
- When we viewed the certificate’s details in one of the cases, we discovered that the Subject was “abcxyz-corpau.abc.com,” while the Subject Alternative Name had different names.
- In our instance, the customer chose to use this certificate on numerous servers. SQL Server’s domain name is ABCXYZ-SQLServer.abc.local
- Now, the following conditions were not satisfied, hence we were unable to see the certificate in SQL Server Configuration Manager:
- The common name (CN) and host name or fully qualified domain name (FQDN) of the server computer must match, according to the Subject attribute of the certificate.
- Therefore, in our case, we proposed that the Certificate Authority update the Subject name from ABCXYZ-Corp.ABC.com to ABCXYZ-SQLServer.ABC.Local (FQDN of SQL Server).
- After making this modification, we loaded the certificate in MMC once more, and SQL Server Configuration Manager now displayed the loaded certificate.
-
Automate SQL Database Migration Using PowerShell from Source to Destination
Make a quick fix for automating database migration across SQL Server versions using PowerShell and T-SQL scripting. Before moving on to a set of customised actions, I need to find several databases that can be moved using a standard procedure or at the very least a collection of standard activities. You could want to write general migration scripts and set up a system to automate their execution in these circumstances. A general method like this can be quite helpful if you’re going to iterate through several test sessions and want to migrate a fresh copy before each one.
The following actions can be made to simplify the migrating process:1) The backup database from the Source Instance will be transferred.
2) Restore the database that will be moved to the new instance.
3) Refresh the allocation usage.
4) Data purity can be used to check database consistency.
5) Set the level of page verification.
6) Set the database compatibility level to match the version of the destination instance (e.g. 120).
7) Statistics should be updated.
8) Remove any orphaned schemas (schemas automatically created for each database user to maintain compatibility but with no objects belonging to them).
9) Fix Orphaned Users if any exists.
10) Run Rebuild or re-organize all databases based on Migration Change window and the databases size.I’m automating the first two Point’s, databases today for backup and restore. I still have to finish automating the other points, but I’m working on it. I’m hoping that by sharing this backup and restore PowerShell script today, some of the DBAs may find it useful.
You will use handlers that SMO will immediately call back when you execute backup and restore commands in T-SQL, and you will provide a straightforward progress notification similar to the ones offered by SSMS or sqlcmd. Please remember that the completion handler does nothing more than display the error.A handful of the PowerShell script’s functions include Get-SQLInstance, DatabaseFullBackup, and RestoreDatabaseFromFullBackup.
Click the link below top open the PowerShell script that will implement backup/restore procedures for each database which we want to migrate to complete the migration :
https://sqlserveradmintutorial.blogspot.com/2023/01/powershell-script-to-backuprestore.htmlA few variables in the script above, including SourceInstanceName, Destination Instance Name, Backup Path, DATA File Path, LOG File Path, and Databases Names, will need to be updated.
The script will start taking database backups from the source instance and restoring them to the destination instance once all the parameters have been customised to your environment.Additionally, my destination instance doesn’t have any databases. Using the aforementioned PowerShell script, I’m currently backing up and restoring my “DB1,” “DB2,” and “DB3” databases. When I run the script, it first takes a backup of the first database, restores it to the destination instance once the backup is finished, and then takes a backup of the second database. It starts restoring the second database once the backup is finished.
As soon as the DB1 Database backup is complete, the destination DB1 Database is instantly restored. After the restoration is complete, the backup for the following database (the DB2 database) is initiated, etc.,
Migration of the databases I mentioned in the Databases Field was completed by the script.The migration of all three databases from the source instance to the destination instance has been completed successfully. Now all you have to do to migrate your given databases is to run the script in PowerShell ISE.
-
DISCONNECTED State Issue on Alwayson Secondary replica
We had a strange issue with the Always on 3-Node replica today.
I got onto the server to Implement a change, which was to apply a SQL Patch to SQL Server 2016 SP2.
I have discovered that my AG’s health isn’t looking well though, and the databases in that secondary replica’s state are in RESTORING mode.
If a secondary replica in an Always On availability group is in a DISCONNECTED state, it means that it is no longer able to communicate with the primary replica or other secondary replicas in the availability group. This can be caused by a number of issues, such as network connectivity problems, configuration issues, or resource contention on the secondary replica.
Here are a few steps you can take to troubleshoot the issue:
- Check the network connectivity: Make sure that the secondary replica has a stable and reliable network connection to the primary replica and other secondary replicas. Check for any network configuration issues or problems with the network equipment.
- Check the availability group configuration: Make sure that the availability group configuration is correct and that the secondary replica is correctly configured to participate in the availability group.
- Check the resource utilization: Make sure that the secondary replica has sufficient resources, such as CPU and memory, to keep up with the replication workload.
- Check the error log: Check the SQL Server error log on the secondary replica to see if there are any error messages related to the disconnected state.
- Check the Windows Application Event log: Check the Windows Event log for any events that may indicate the reason for the disconnected state.
- Check the status of the synchronization: Check the status of the data synchronization between the primary and secondary replica. You can use the sys.dm_hadr_database_replica_states dynamic management view to check the synchronization status.
- Check for any known bugs and troubleshoot accordingly: Check if there are any known bugs in the version of SQL Server you are running that could cause this issue.
- Try to re-connect the replica: If all the above steps fail, you can try to manually force a reconnection between the primary and secondary replica using the ALTER AVAILABILITY GROUP command or using the SSMS GUI.
- It’s important to note that some of these steps may require a downtime, so it’s recommended to have a plan in place before attempting any troubleshooting steps.
So, googled for below error:
Error Message 35201:
When attempting to connect to availability replica ‘My replica’ with id [availability group id], a connection timeout occurred. Either a network or firewall problem, or the replica’s endpoint address is not the DB mirroring endpoint of the host server instance.
There is no connection between this secondary replica and the primary replica. DISCONNECTED is the connected state.
I found the following KB article on support.miscrosoft.com.The SQL instances version, SQL 2016 SP2, made it useless for my current organization, though.
As a result, we alerted the Windows Team, who are looking into it. We also discovered that some cluster issues had been recorded in Cluster Events on Windows Server Failover Cluster Manager.
Due to the fact that the data is not being moved to one of the secondary replicas, we are receiving disk space alerts in the interim.
In order to test if the issue was rectified, we requested permission from the Client to reboot the problematic secondary replica, as stated on the Microsoft page above.
The customer provided his okay, we rebooted the replica because the secondary replica isn’t being used in any way, but the issue still exists.
The SQL instances version, SQL 2016 SP2, made it useless for my current organization, though.
As a result, we alerted the Windows Team, who are looking into it. We also discovered that some cluster issues had been recorded in Cluster Events on Windows Server Failover Cluster Manager.
Due to the fact that the data is not being moved to one of the secondary replicas, we are receiving disk space alerts in the interim.
In order to test if the issue was rectified, we requested permission from the Client to reboot the problematic secondary replica, as stated on the Microsoft page above.
The customer provided his okay, we rebooted the replica because the secondary replica isn’t being used in any way, but the issue still exists. -
How to fix Slow running queries on SQL Server
As a DBA, we carry out all the fundamental checks, including examining CPU and memory consumption, any blockings, deadlocks, locks, fragmentation, statistics updating, and so forth, when a user approaches you and complains that his query is taking too long to finish. After completing all of these checks, if the user is still experiencing slowness, we must carry out the procedures listed below to diagnose or resolve SQL Server queries that are running slowly.
Identify the slow-running queries: Find the slow-running queries by using SQL Server Management Studio (SSMS), which allows you to see the queries that are presently running and their current execution times. The SQL Server Profiler can be used to track down queries that are running more slowly than intended. You may use the profiler to filter and sort the traces depending on several factors including duration, CPU usage, reads, and writes.
Analyze the query execution plan: After you’ve found the slow-running query, you can utilize the execution plan to learn more about how it’s being processed. The execution plan displays the steps that SQL Server takes to carry out the query, together with the operations’ projected prices and their respective orders. You can find out if there are any possible indexing, join, or other query-related performance problems by looking at the execution plan.
Check for missing or outdated statistics: The query optimizer needs up-to-date information about the data distribution in the tables utilized in the query in order to produce effective execution plans. The optimizer might make less-than-optimal decisions and the query might execute more slowly than anticipated if the statistics are incomplete or out of date. To update the statistics, use either the UPDATE STATISTICS statement or the sp updatestats stored procedure.
Check for blocking: When one query has a lock on a resource that another query needs to access, blocking occurs. Performance may be slowed as a result of the second query having to wait until the lock is released. To check for any potential blocking, utilize the sys.dm exec requests dynamic management view.
Check for resource contention: If the query is running slowly, use Performance Monitor to look for resource contention, such as high CPU or memory utilization. The resources that are leading to the highest waits can also be identified using the dynamic management view sys.dm os wait stats.
Check for hardware issues: Look for any hardware problems that might be slowing down the query, including disc constraints or high network latency. To evaluate disc performance and locate any bottlenecks, utilize tools like Perfmon, SQLIOSim, and SQLIO.
Optimize the query: Once the source of the poor performance has been found, you can improve the query by altering the query, the indexing, or other database infrastructure components. Adding or changing indexes, rewriting the query to employ more effective join methods, and using stored procedures rather than ad-hoc queries are all examples of common optimization strategies.
Monitor Performance: You may find and fix slow-running queries before they become serious issues by routinely monitoring the performance of your SQL Server instances, including query performance. You may track the performance of your SQL Server instances and fix any issues using the built-in performance monitoring tools in SQL Server, including SQL Server Management Studio (SSMS), SQL Server Profiler, and dynamic management views (DMVs).
-
Troubleshooting steps for SQL Connectivity Issues
You might have observed DBAs attempting to resolve SQL connectivity problems. So, I decided to post some tips here on how to fix SQL Server connectivity issues. Based on my experience, I’ve provided the essential actions to do below. The following are the things I look for while trying to connect to SQL Server but failing:
- SQL Server configuration Issue
- Firewall Issue
- Network Issue
- Logon Issue
- Configuration Issue in Application
SQL Server configuration Issue:
- Verify that the machine hosting SQL Server is up and listening on protocols.
- In SQL Server Configuration Manager, protocols can be verified and enabled.
- Named Pipes and TCP protocols must be enabled for SQL Server to connect remotely.
- We may check whether the server is listening on protocols in the SQL Server Errorlog.
- If the SQL instance is a named instance, make sure SQL Server Browser is operating on the machine.
Firewall Issue:
- Firewalls can prevent client and server SQL connection requests.
- This can be fixed by disabling the firewall, however this is only a temporary solution because it can be permanently turned off at any time.
- Adding SQL Server and SQL Browser to the Firewall Exception list will permanently cure issue.
- Provide the port number in which SQL is listening while adding the exception for the SQL Server DB Engine.
- When adding a SQL Browser exception, make sure to include UDP port 1434.
Network Issue:
- The Ping response from your source server to the target server is the first indicator of SQL connectivity.
- From the client machine to the SQL Server box, you may also use Telnet to test the UDL.
- Check nslookup; a misconfigured DNS entry may be the root of the SQL connectivity problem.
logon Issue to SQL Server:
- This is distinct from the examples mentioned above. Based on the error message we receive from the application team, this has to be troubleshot.
- Due to a problem with login connectivity to SQL, each individual case must be examined.
- If a login attempt is made to a SQL server and is unsuccessful, errors can be seen in the server’s event viewer and SQL error logs.
- We can examine the logs and resolve the problem.
Configuration Issue in Application:
- An incorrect setting in the application can also cause a problem with SQL connectivity.
- In the application configuration file, look for the connection string parameters.
- Additionally, verify the login account application being used to connect to SQL.
- If it is a SQL account, find out which authentication mode is enabled in the SQL instance properties.
- Check the account’s permissions level as well.
Incorrect connection string:
An improper connection string is one of the most frequent reasons of problems with SQL connectivity. An improper server name, database name, or authentication technique could be at blame for this.
Incorrect SQL Server instance name:
In the connection string, the client must include the right SQL Server instance name. Make sure you are using the right instance name.
These are a few typical reasons why SQL connectivity problems occur, but the precise reason can change based on the circumstance. You can pinpoint the issue’s root cause and choose the best course of action by using the troubleshooting procedures I discussed previously.