Patching Issue- Fixing of Missing Windows Installer Cache Files(.MSI & .MSP) which occurs during SQL Server Update

When you are installing a Microsoft SQL Server service pack or a cumulative update, sometimes we will encounter the error messages with Windows Installer Cache problems. The Windows Installer Cache are located at c:\windows\installer folder which stores important files for applications installed using the Windows Installer technology and should not be deleted. The problem occurs when we are performing these actions such as uninstall, repair, or update SQL Server

Note : If the Cache installer is compromised then you will not see any immediate problems

.The issue occurs when the cache files are deleted from the location c:\windows\installer.

The problem occurs when the Windows Installer database file (.msi) and the Windows Installer patch file (.msp) is missing from the Windows Installer cache folder. The Windows Installer cache is located in the following folder:

        C:\Windows\installer

After doing SQL Patching changes continuously for so many servers without any Issue yesterday when Implementing SQL Patching on one of the Critical Server finally the day has arrived and faced Missing of MSI & MSP error issue

When I run SQL Server 2008R2 ServicePack 3 Initial Setup wizard itself not getting opened and throwing below error :

As per the Error I have fixed the Issue of Missing File by Copying the file .MSI which was mentioned in above to INSTALLER Folder  from SQL Media and started running Service Pack 3 again , still having issue and got other error as below :

I have fixed this error as well in the same way which I did as above and Initiated Service Pack 3 again now this time the Initial setup wizard was opened and the setup was started

After clicking the Update button in few minutes an error was popped up saying another MSI file was missing as it is not the best way to fix these MSI and MSP errors  also not sure how many files are Missing.

So, I have Cancelled the Setup and checked for Missing Files using FindSQLInstalls.vbs script
which helped me to find missing of MSI & MSP files before we start the ServicePack 3 Setup.

Script was found in MICROSOFT site  https://support.microsoft.com/en-gb/help/969052/how-to-restore-the-missing-windows-installer-cache-files-and-resolve-p#%2Fen-gb%2Fkb%2F969052

Copied the script to Server ‘ XXX ‘and saved at Location E:\MSI\FindSQLInstalls.vbs ..Ran the script from Command Prompt by redirecting the directory from Drive C to E:\MSI

                                          FindSQLInstalls.vbs E:\MSI\output.txt

Note : This script needs Elevated Administrative Privileges , Open the Command Prompt with Run As Administrator

The output file contains all the Missing Files of MSI & MSP which needs to fix before staring the SP setup

Open the output.txt file and search with key word  ” DOES NOT   “

Below Screenshot for finding out of MSP file

Below Screenshot for finding out MSI File

Copied the Location( COPY “Path”) from the txt file and kept in a side which is Highlighted in above screen shots

Now opened the Media file and copied the MSI file to C:\Windows\Installer Folder and Renamed the MSI File to Cached name which is shown in Error

Again searched with key word  ” DOES NOT ” till the end of the txt file and found total 25 Missing Files

Fixed all Missing files and ran the Script again in Command Prompt  FindSQLInstalls.vbs E:\MSI\output.txt to check that no files are missing before starting the SP installation.

Opened the Output file and searched for key word ” DOES NOT   ”  this time I didn’t find any Missing File which Needs to take Action

So started running SP3 but after Fixing the MSI and MSP files it asked for Server RESTART

I requested windows team to reboot,once server is came UP Initiated SP3 , this time patch was completed successfully.

After SP completion it was asked for RESTART of Server again and once the Server is Up I  started Installing Security Update and this time again some new Issue I faced which is shown below :

The location which is showing in above error was not exist on Drive H  …So, I have clicked browse button and given the correct path of the File sqlncli.msi and clicked OK

But popped up with error not a valid one

So, Clicked ok and Cancel but the setup was Running and Finally the Security Update was Successfully Applied

The Security Update KB article is Reflecting KB4057113 and all looks Good .

NOTE :

1. The SQL Media\Service Pack \ Security Update \ Cumulative Update any patch or ISO file we need to EXTRACT Manually first to a Specific location or folder and then run the installation by clicking on “SETUP.exe” file which is Best Practice for Installation

2 .If we run the Package directly without Manual extraction the Software will extract  Automatically to Some UN-known Location and will start the Setup Once that the server is REBOOTED sometimes that UN-known location will not be exist on the server and next time when we run any New SP\CU\SU it will check for Previously Extracted location but unfortunately that location is not available on Server it will throw an error and the installation will Failed.

Node Quarantine state Issue on WSFC 2016 with Availability groups( Win 2016 + SQL 2016 )

We have encountered an odd issue in our environment, where we had AlwaysON Availability Groups configured on Windows Server 2016 and SQL Server 2016.
When operating system patching is in progress and we’re working on AG failover and failback, I found an intriguing scenario. I was unable to view the AG dashboard healthy for one of the nodes after running some network loss situations. I noticed that SQL is up and running, but the Availability Groups are in the Resolving State. I quickly checked WSFC and noticed that one of the nodes is reporting an error and is not operational. So, I tried to get my cluster node back online as soon as possible using the typical method, but it didn’t work. A cursory examination of the cluster, A brief glance at the cluster event log revealed the following error message:

The status of the node in WSFC is also quarantined… This is a fascinating topic!!

What would happen to my availability group if the node does not automatically join the cluster till 02:03:26 in my case? The quarantined cluster node, as shown in the screenshot below, indicates an availability replica disconnected and a synchronization issue.

We notice a corresponding error number 100060 with the message An error occurred while receiving data: ‘10060(A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.)’. There is no explicit notification concerning quarantine state in the SQL Server error log. I received the following sample message in the SQL Server error log from the secondary replica:

SQL Server is waiting for the cluster node to join the WSFC and start. In summary, the quarantined node is operational, hence the availability health condition will remain unchanged. As a result of the problem, the node will join the cluster automatically, which may be a good thing until the linked issue on the affected cluster node is resolved. Fortunately, according to the Microsoft whitepaper, we will not have to wait for the quarantined period to end.

To release the quarantined State of Node, run the following PowerShell command:

Start-ClusterNode -Clearquarantine

Let’s get started! Check the cluster’s health status now, and the Node has been restored to its previous state.

Upgrade/migrate SQL Server 2014 Availability group(s) running on Windows Server 2012 R2 to SQL Server 2019 running on Windows Server 2016

What are we intending to do today?

With the least amount of downtime, upgrade/migrate SQL Server 2014 Availability group(s) running on Windows Server 2012 R2 to SQL Server 2019 running on Windows 2016.

My current working environment is as follows:

I’m running SQL Server 2014 Always On Availability Group with synchronous commit mode on a two-node Windows Server 2012 R2 Failover cluster. For my applications to connect, I’ve set up a listener. These two replicas are using the most recent releases of SQL 2014.

The following are the replica(node) details:

W12SQL2014A
W12SQL2014B

My two replicas (Nodes) running WindowsServer2012R2+SQL 2014 are shown above.

Our main Agend is
SQL 2014 AG, which is operating on Windows Server 2012R2, it has to be upgraded/migrated to SQL Server 2019, which is running on Windows Server 2016, with little downtime and no configuration changes for the application teams.

Note : I’m not doing an in-place upgrade right now. This necessitates extra downtime. Let’s assume that in-place upgrades are not permitted.

Plan of Action at a High Level:

. Prepare with two new Windows Server 2016 releases (W16SQL2019A & W16SQL2019B)

. SQL Server 2019 should be installed on both Windows 2016 servers.

. W16SQL2019A and W16SQL2019B nodes should be added to the same windows cluster

. In SQL 2019, enable the Alwayson HADR feature

. In SQL Server Alwayson Availability Group, add the two nodes as replicas

. Join the Databases .

. Examine AG’s dashboard to ensure that everything appears to be healthy and green

. Failover SQL Server 2014 to SQL Server 2019 during the final cutover date/time change window, then delete the old replicas (SQL Server 2014) from AG

. EVICT both nodes from the Cluster in Windows Server 2012R2.

After a successful configuration, there are a few things to consider.

Adding Win2016 nodes to a Win2012R2 environment MIXED MODE is being used by the same cluster. Please don’t leave this mixed mode WSFC running for an extended period of time. If you continue in mixed mode for longer than 4 weeks, Microsoft may refuse to help you. This blog is solely dedicated to performing rolling updates in order to ensure that your systems are truly highly available. Make an effort to complete the process in a day or two and ensure that it is completed.

Database synchronization does not always start automatically after adding new replicas. To get it SYNCHRONIZED, you must manually join each database.

Change the failover mode for all replicas to manual only to make sure the cluster doesn’t have any control over my AG failing over automatically.

Before you proceed with your AG failover from W12SQL2014A (which is a Primary Replica) to W16SQL2019A, make sure to alter the Availability Mode to SYNCHRONOUS COMMIT mode at the final cutover.

At this time, W16SQL2019A has assumed main responsibility for all databases in your AG that have been upgraded to SQL 2019, and the other SQL 2019 (W16SQL2019B ) instances will be in sync as well.

However, the two SQL 2014 instances will be in a bad way; in fact, both databases will become unreachable at this time since logs cannot be transferred from a higher(2019) to a lower(2014) version.

This means that databases on SQL 2014 instances will not be SYNCHRONIZED.

Finally, I EVICTED the Nodes W12SQL2014 from WSFC and removed both SQL 2014 Instances from Availability Group as Replicas.