Troubleshoot SQL Connectivity Issues

DBA/Application teams regularly encounter SQL Connectivity difficulties while connecting from apps during daily operations with SQL Instances. I’ll go over a few checklist procedures for troubleshooting SQL Connectivity difficulties in this article.

When a client connects to SQL Server, it must first authenticate and then authorize itself. If the problem is due to authorization, the client will be able to establish a physical connection with SQL Server, but the problem will occur while authorizing with SQL Server, such as Login errors due to permission issues, and so on. If the problem is caused by a lack of authorization, the SQL Server Error log should be the first place to go for answers.

Isolating which component is causing the problem is a critical step in resolving SQL Connectivity issues: We can categorize connectivity concerns into the following categories:

A problem with the network
A problem with SQL Server’s settings.
A problem with the firewall
An issue with the client’s driver
A problem with the application’s settings.
Issues with authentication and logon.

The following are some of the most common error messages generated by the programme when connecting to SQL Server:

Error 1:

Error 2:

Error 3:

Error 4: Error is Similar to this not exactly the same error

[Microsoft] [SQL Native Client 10.0] TCP Provider: The remote server forcibly closes the existing connection.

CHECKLIST STEPS:

1.Gather information about the servers participating in the connection.

What is the total number of servers involved?
Is this a web-based or thick-client application?
Is the client establishing a connection to the application server, which then establishes a connection to SQL Server?
Is the connectivity problem limited to one client box, or are several clients unable to connect?
Make a list of all the servers’ names and IP addresses.

2. Check to see if SQL Server is accessible. A simple test is to use network ping to see if the server can be reached.

Ping <SQLServername> or Ping <IP address of SQL Server>If Ping fails to discover the SQL Server box, it means the client is unable to locate the SQL Server machine, and we will need the help of the Network team to debug this further.

3.Check to see if SQL Server is accessible. A simple test is to use network ping to see if the server can be reached.

4.Shared memory protocol is used for local networking on SQL Server boxes. The next step is to verify SQL Server’s protocols and make sure that TCP/IP and Named Pipes are enabled for remote connections:

5. To see if the client can connect to the SQL Server port, type: To see if the SQL Server port is accepting inbound connections, we can use the telnet programme.

For example: For instance, if the SQL Server’s host name is SQLprod and the port on which SQL is listening is 1433 telnet sqlprod 1433, and the client box is unable to contact the port on which SQL is listening, we will receive the following error:

6. If telnet also fails, this is a clear indicator that the port is being blocked by a device (firewall). The utility portqueryUI can be used to see if a port is blocked or not.
Download the PortqueryUI utility here: https://www.microsoft.com/enin/download/details.aspx?id=24009

If you see Filtered in the output, that means the port is blocked.
For example, if the SQL hostname is aoindia and SQL Port is 1433:

The next step is to check with the Windows/networking team to see if the port can be opened. At the operating system level, ensure that inbound and outbound rules are created to allow the connection for SQL port.

7. Run the PortqueryUI software again after the port has been allowed, and make sure the status is LISTENING.

The Telnet tool can only be used to see if a TCP port is blocked. However, the PortQueryUI application can be used to check if a port is banned for both TCP and UDP.

8. Gather the application’s connection string information. If the application is web-based, the connection string information is usually found in the web.config file. Collect the following information: a. Data Source b. Initial Catalog c. Integrated Security d. Provider a. Data Source b. Initial Catalog c. Integrated Security d. Provider
For example:
Provider=SQLNCLI10; Server=myServerName\theInstanceName;Database=myDataBase; Trusted_Connection=yes;
Few important details to be studied from connection string are: Type of authentication used (Windows/SQL), provider used (OLE DB or ODBC), Data Source field.

9. 

A simple test to see if the problem is caused by the application is to connect to the SQL instance from the client box using SQL management studio. However, the SSMS tool may not be installed on the client PCs in the majority of cases. We can construct a universal data link tool (UDL) to assess connectivity in this situation.
Create a text document on the client system with the extension.udl and rename it.
For example: test.udl
Select the appropriate provider list in the udl file based on the application’s provider (Microsoft OLE DB for SQL Server) or SQL native Client provider and click next.

Click on test connection after entering the SQL Server name and default database:

If the connection is established via udl, it means that the application’s connection string configuration is incorrect.

Simulate the connectivity issue with the same settings as the connection string and see if you can duplicate the problem.
Check with the application team to see if the connection string can be replaced with the one provided by UDL if the connectivity works from UDL. [To acquire the dynamically produced connection string, open the udl file in notepad.]

10.  If the problem is caused by the client provider, see if a different provider can be used to connect to SQL. Select several SQL providers from the udl file and verify whether or not connectivity works.
For example, If the problem is only with Microsoft OLE DB for SQL Server, see if the SQL native client can help.

11. If the data source name (commonly known as DSN) is used in the application connection string, create a user DSN with the ODBC Data Source Administrator tool (odbcad32) and test connectivity.

To launch ODBC administrator tool in 32-bit mode: C:\Windows\SysWOW64\odbcad32.exe

12. On the client PC, go to the registry location below and look for Last SQL Server connect HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSQLServer\Client

Clear the entries for the SQL Server in question specified for lastConnect and delete them. [Before making any changes, make sure you have a registry backup.]

13. Use the SQL Server Client network utility to see if there are any client-side aliases:

To launch the tool:

Examine the alias tab for any orphaned alias entries. If there are any aliases established for the SQL Server instance we’re troubleshooting, make a backup of them and delete the entries.

To run the SQL Client Configuration Tool in 32-bit mode, follow these steps.: C:\Windows\SysWOW64\cliconfg.exe

14. Check to see if you can test connectivity using protocols like TCP and Named Pipes.

If the connectivity works after forcing the protocol, ask the application team if the connection string can be changed to force the protocol in the data source field.

To see if the SQL connection is using TCP/named pipes, do the following: run the following query:

select session_id, net_transport from sys.dm_exec_connections

Check if a client side alias can be formed if no changes to the connection string can be made and the connectivity works by imposing a protocol.
Example: Create a named pipe alias using SQL Server client network utility if the connectivity only works with Named pipes protocol.

15.  By forcing the SQL port, you can see if you can connect to the instance. For example:

If you can connect while forcing the port number, there’s a problem with SQL Browser. SQL Browser is in charge of matching the instance name to the SQL Server port. Then, to troubleshoot browser difficulties, verify if SQL UDP browser port 1434 is banned or not.

16.  Check for connectivity by giving the SQL Server’s IP address rather than the hostname. For example:

If the instance hosting SQL has an IP address of 10.16.17.18 and SQL listens on port 5223, then:


Create a host file entry with the Server name and the IP address mapping to prevent DNS name resolution and involve the DNS team for name resolution issues if the connectivity works by hardcoding the IP address in the Server name.

17. If the connectivity problem persists after doing the aforementioned methods, the next step is to collect network traces using the Network Monitor programme.

I hope the techniques outlined above may assist you in resolving SQL Connectivity difficulties.

TroubleShoot SQL High CPU Utilization

An way to isolating SQL high CPU concerns is summarised in the screenshot below:

Microsoft recommends the following CHECKLIST POINTS:

1. Check the overall CPU usage via the Windows task manager. Gather information on the number of logical processors on the box.

2. Check the SQL Server process CPU consumption in Task Manager. Is the SQL CPU always at or above 70%?

3. Collect the following information:

  • How do you usually realise your CPU is a bottleneck?
  • What is the problem’s impact? Are there any specific errors that your user application encounters?
  • When did the issue initially appear? Is there anything that changed around this period that you’re aware of? (Has your workload increased? Changes to the table’s dimensions? Is it time to update your app? (Is there a SQL upgrade?)
  • Is it possible to establish new connections to the server during the outage?
  • How long did the issue persist? Have you been able to come up with anything that appears to be helpful in resolving the issue?
  • During the issue periods, what system-level symptoms did you notice? Is the server console, for example, slow or unresponsive during the problematic periods? Is there an increase in overall CPU usage? If that’s the case, what percentage of CPU is used during the problem? What percentage of the CPU is expected to be used?

4. f a process other than SQL Server (sqlservr.exe) is producing the high CPU, contact the team in charge of that process.

5. Open Perfmon and add the counters listed below:

Process (sqlservr):

% Privileged Time

% Processor Time

% User Time

Processor

% Privileged Time

% Processor Time

% User Time

6. Engage the Windows team if Processor Privileged time exceeds 25%.

Processor Time = Privileged Time + User Time.

7.  Verify that SQL is using a lot of CPU on the box by looking at the counters below:

Process (sqlservr): % Privileged Time

% Processor Time

% User Time

The CPU consumption by SQL Process is calculated by dividing the observed value by the number of logical processors.

If (Process (sqlservr) % Privileged time/No of Procs) is greater than 30%, make sure that KB 976700 for Windows 2008 R2 is installed.

This phase determines whether SQL Server is the source of the server’s high privilege time. Engage the Windows team if SQL privilege time is high, as determined by the calculations above.

8. Check the following sp_configure configurations for compliance with best practice recommendations:

For Max DOP recommended settings, see KB 2806535.

9. If you can’t connect to the SQL instance locally using SSMS, try connecting via Dedicated Admin Connection (DAC) with the following commands:

ADMIN: Servername

10. Using the following query, get the top 10 queries that use a lot of CPU:

SELECT s.session_id,

r.status,

r.blocking_session_id 'Blk by',

r.wait_type,

wait_resource,

r.wait_time / (1000 * 60) 'Wait M',

r.cpu_time,

r.logical_reads,

r.reads,

r.writes,

r.total_elapsed_time / (1000 * 60) 'Elaps M',

Substring(st.TEXT,(r.statement_start_offset / 2) + 1,

((CASE r.statement_end_offset

WHEN -1

THEN Datalength(st.TEXT)

ELSE r.statement_end_offset

END - r.statement_start_offset) / 2) + 1) AS statement_text,

Coalesce(Quotename(Db_name(st.dbid)) + N'.' + Quotename(Object_schema_name(st.objectid, st.dbid)) + N'.' +
Quotename(Object_name(st.objectid, st.dbid)), '') AS command_text,
r.command,

s.login_name,

s.host_name,

s.program_name,

s.last_request_end_time,

s.login_time,

r.open_transaction_count

FROM sys.dm_exec_sessions AS s

JOIN sys.dm_exec_requests AS r

ON r.session_id = s.session_id

CROSS APPLY sys.Dm_exec_sql_text(r.sql_handle) AS st

WHERE r.session_id != @@SPID

ORDER BY r.cpu_time desc

11.  Examine the wait type of the queries in the report above. If the CPU is the bottleneck, most sessions will experience the following delays:

SOS_SCHEDULER_YIELD

CXPACKET

THREADPOOL

If most queries are waiting on CXPACKET, examine the “Max degree of parallelism” and “Cost degree of parallelism” settings in sp configure to see if they are set according to best practice recommendations.

12.  To retrieve a list of the top CPU queries, run the SQL Standard report:

Right Click on the instance, go to reports> Standard reports

In the report, look at the top CPU demands. Compare the report to the list of the top CPU-intensive requests acquired in the previous phase.

13.  Once the top CPU queries have been discovered, use the statement_text and command_text column output from step 10 to generate a list of all SQL tables involved.

Examine the following:

Examine the top CPU-driven tables for Index Fragmentation.

when was the last time the statistics were updated?

Rebuild the index if the fragmentation is greater than 30%. Update the statistics on the table if they haven’t been updated.

If there are only a few tables responsible for high CPU in the Top CPU queries, share the tables list with the application team, as well as the statistics and fragmentation reports.

Check if there are any select queries that are consuming a lot of CPU, and ask the application team if these may be temporarily stopped on high-performance OLTP servers.

14.  If SQL is still using a lot of CPU after the database maintenance activity (such Index rebuild and Stats update), run the query specified in Step 10.

Examine the Top CPU query to see whether it has changed. If the query has changed, go to Step 13 and take the appropriate action. Continue to the following step if the query remains the same.

15. Using the following methods, gather the expected execution plans for the top CPU-intensive queries:

Query 1:   From the output of the query mentioned in step 10, get the top CPU using session IDs.

Take note of the Plan handle and SQL handle from the following query:

select sql_handle,plan_handle from sys.dm_exec_requests where session_id=<session_id>

Obtain the query’s text:

–Replace the SQL Handle with the result of the query above.

select * from sys.dm_exec_sql_text (sql_handle)

Get the query’s estimated execution plan:

–Replace the Plan handle with the result of the query above.

select * from sys.dm_exec_query_plan (plan_handle)

Query 2: The query below records the inquiry’s total CPU time as well as the plan handle. The query’s plan handle is required to obtain the query’s expected execution time.

select

highest_cpu_queries.plan_handle,

highest_cpu_queries.total_worker_time,

q.dbid,

q.objectid,

q.number,

q.encrypted,

q.[text]

from

(select top 50

qs.plan_handle,

qs.total_worker_time

from

sys.dm_exec_query_stats qs

order by qs.total_worker_time desc) as highest_cpu_queries

cross apply sys.dm_exec_sql_text(plan_handle) as q

order by highest_cpu_queries.total_worker_time desc

16.  With the application team, provide the estimated execution plan.

Look for a Operator that has a high cost.

Check the operator’s indexes and the predicted amount of rows.

Check the statistics and indexes on the table for the operator with the highest cost to ensure there are no stale statistics.

Check to see if the expected execution plan suggests creating a new index. Share the missing index details with the Application team if the plan provides an index recommendation.

17.  The “Convert Implicit” function in the execution plan can also cause SQL Server to use a lot of CPU.

Examine the execution plans for queries that use a lot of CPU, as well as the Operator with a High Cost, to see if the Convert Implicit function is used..

The CONVERT IMPLICIT function implicitly converts the column “NationalIDNumber” to integer in the preceding snapshot, despite the fact that it is defined as nvarchar in the table definition (15). So, share the report with the application team and double-check that the data type provided and the data type kept in the database are the same.

18.  Run the missing index query on the database that has a high CPU use report to see whether any missing indexes are recommended. Inform the Application team about the Index suggestion report.

19.  Use the Database Engine Tuning Adviser to tune the top CPU-intensive queries to check if the database engine advises index creation or statistics creation.

20.  Compilations/Re-Compilations in SQL Server :

Take the following counters from perfmon:

SQL Server: SQL Statistics: Batch Requests/sec

SQL Server: SQL Statistics: SQL Compilations/sec

SQL Server: SQL Statistics: SQL Recompilations/sec

Batch Requests/sec: Number of SQL batch requests received by server.

SQL Compilations/sec: Number of SQL compilations per second.

SQL Recompilations/sec: Number of SQL re-compiles per second.

If the number of recompilations is high, look for the following:

Any Schema changes

Statistics changes

SET option changes in the batch

Temporary table changes

The RECOMPILE query hint or the OPTION (RECOMPILE) query hint are used to create stored procedures.

Add the following events to SQL profiler and look for stored procedures that are regularly recompiled.

21. Check to see if SQL System threads are using a lot of CPU:

select * from sys.sysprocesses where cmd like 'LAZY WRITER' or cmd like '%Ghost%' or cmd like 'RESOURCE MONITOR'

Ghost cleanup thread >>>> Examine whether the user deleted a huge number of rows.

Lazy Writer thread >>>>>>> Check if the server is experiencing any memory constraints.

Resource Monitor thread >> Examine whether the server is experiencing any memory constraints.

22.  Check if any traces are active on the server if the Top CPU consuming queries have the wait type: SQLTRACE LOCK.

select * from sys.traces

23. During the Top CPU issue, collect the PSSDIAG. See KB 830232 for more information. In the SQL Nexus tool, load and analyse the data..

24. If the SQL CPU utilisation is still high after completing the preceding action plans, increase the server’s CPU.