An existing connection was forcibly closed by the remote host

June 4, 2009 11:17 by Marc

Two different projects complaining about the same issue: nice troubleshooting challenge! One is SharePoint-based and the second is a, let’s say “entertaining”, .NET-based application. They both use SQL Server as their back-end data store, and both report “existing connection forcibly closed” in their stack trace when they attempt to connect to SQL.

This happens when a client application tries to reuse an existing TCP connection to a remote host while that host is closing it, making connection reuse impossible. There are actually multiple possible root causes, which do not seem to be mutually exclusive:

Limit set on the number of connections allowed by SQL Server on a given instance

For a given SQL instance, you can set the maximum number of connections that can be used by applications. Depending on the way your application is written, multiple connections might be used for a single transaction… Raise the limit or set it to unlimited as necessary.
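
As a rough sketch of what raising the limit can look like: the instance-wide setting is exposed through sp_configure as the advanced option 'user connections', where 0 means unlimited. The Python helper below only builds the T-SQL batch (the function name is made up for this illustration); run the result with whatever SQL client you normally use, and expect the new value to apply only after the SQL Server service restarts.

```python
def build_user_connections_batch(limit: int = 0) -> str:
    """Return a T-SQL batch that sets the instance-wide 'user connections' option.

    0 means "unlimited" (SQL Server then allows up to 32,767 connections).
    The function name and signature are illustrative, not part of any library.
    """
    if not 0 <= limit <= 32767:
        raise ValueError("limit must be between 0 and 32767")
    statements = [
        # 'user connections' is an advanced option, so advanced options
        # have to be made visible first.
        "EXEC sp_configure 'show advanced options', 1;",
        "RECONFIGURE;",
        f"EXEC sp_configure 'user connections', {limit};",
        "RECONFIGURE;",
    ]
    return "\n".join(statements)


if __name__ == "__main__":
    # Print the batch that sets the limit back to unlimited.
    print(build_user_connections_batch(0))
```

Running sp_configure 'user connections' without a value shows the currently configured limit, which is a quick way to check whether this cause applies at all.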

The (infamous) Scalable Networking Pack

The Scalable Networking Pack (SNP) is a set of improvements to the Windows networking stack. It is available as an add-on for Windows Server 2003, is included in Windows Server 2003 from Service Pack 2 onwards, and is built into Windows Vista/2008.

This update greatly modifies the way Windows handles network connectivity at the TCP level and might therefore provoke the error. In short, the following settings should be modified on the SQL Server (or on any server acting as the server component):

In the registry, under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\

  • EnableTCPChimney (REG_DWORD) set to 0 (disabled)
  • EnableRSS (REG_DWORD) set to 0 (disabled)
  • EnableTCPA (REG_DWORD) set to 0 (disabled)

Applying the change requires a reboot. [UPDATE] Some MS sources report that a reboot is not necessary for some settings, so I am changing my statement to *might* require a reboot.
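
If you have more than one server to fix, a small script can generate a .reg file with the three values above. This is only a sketch: the function and constant names are mine, and it does nothing but build the text of the file. Review it, then import it with regedit on the target server.

```python
# Key and value names come from the list above; everything else is illustrative.
SNP_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
SNP_VALUES = ("EnableTCPChimney", "EnableRSS", "EnableTCPA")


def build_reg_file(key: str = SNP_KEY, names=SNP_VALUES, data: int = 0) -> str:
    """Return the text of a .reg file that sets each name to a REG_DWORD value."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name in names:
        # REG_DWORD values are written as eight hexadecimal digits.
        lines.append(f'"{name}"=dword:{data:08x}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_reg_file())
```

Passing data=1 produces the file that switches the options back on, which is handy if you want to test again later with newer NIC drivers.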

You’ll find a lot of trustworthy online resources recommending that you disable the SNP…

On the other hand, recent NIC drivers may allow your system to work properly with these options enabled… Look at this page for a list of SNP “partners”: http://technet.microsoft.com/en-us/network/cc984184.aspx.

Faulty NIC, NIC driver or driver settings 

Some NICs include a TCP Offload Engine (TOE). If it is incorrectly configured or running an outdated driver, it will generate errors at the TCP level.

In some cases, the TOE simply does not work, so you might also want to test with this function completely disabled. When editing your driver’s parameters, look for “Large Send Offload”, “Checksum offload”…

Important to note: you might also want to check the link speed and duplex at NIC level AND at switch port level, as a mismatch might also cause the problem. Remember: they must be identical on BOTH sides.

Applying the change *might* require a reboot.

Windows TCP/IP Stack Custom Configuration or Hardening

There are plenty of resources describing how to “harden” the Windows TCP/IP stack. Unfortunately, most of them simply show the “how to”, not its consequences, one of them being the performance decrease implied by hardening. You’ll also find those parameters under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\

In our case, the parameter “SynAttackProtect” set to 1 instead of 0 (disabled) forces Windows to be more restrictive with incoming TCP connection requests, as well as more aggressive with the (re)use of existing ones. If the parameter is enabled, the following additional parameters are also taken into account:

  • TcpMaxPortsExhausted: Determines the number of connection requests the system can refuse before enabling protection against SYN attacks
  • TCPMaxHalfOpen: Determines the maximum number of connections that can be left “half-open” (SYN received, handshake not yet completed)
  • TCPMaxHalfOpenRetried: Same as above BUT counting only the half-open connections for which the server has already retransmitted its reply at least once

The parameters above are thresholds used by Windows to determine whether a TCP-based (SYN) attack is in progress. They should only be used if the server is in a high-risk situation (DMZ or internet-facing) and no other security device (firewall…) is in place.

Note that before Windows 2003 SP2, SynAttackProtect is set to 0; with SP2, it is set to 1; then, with the latest versions of Windows, it returns to 0…
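
To quickly see whether a server has been hardened this way, something like the following can help. It is a sketch only: read_tcpip_parameters() uses Python's standard winreg module and therefore runs on Windows only, while review_syn_settings() is a plain function (my own naming) that reports on the four values discussed above and says nothing about what the right values are for your environment.

```python
HARDENING_VALUES = (
    "SynAttackProtect",
    "TcpMaxPortsExhausted",
    "TCPMaxHalfOpen",
    "TCPMaxHalfOpenRetried",
)


def read_tcpip_parameters() -> dict:
    """Read every value under Tcpip\\Parameters (Windows only)."""
    import winreg  # imported here so the rest of the module loads anywhere

    path = r"SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        value_count = winreg.QueryInfoKey(key)[1]
        for index in range(value_count):
            name, data, _type = winreg.EnumValue(key, index)
            values[name] = data
    return values


def review_syn_settings(params: dict) -> list:
    """Return human-readable notes about the SYN protection values in params."""
    # Registry value names are case-insensitive, so compare in lower case.
    lookup = {name.lower(): data for name, data in params.items()}
    protect = lookup.get("synattackprotect")
    if protect is None:
        return ["SynAttackProtect is not set explicitly; the OS default applies."]
    if protect == 0:
        return ["SynAttackProtect is 0 (disabled); the thresholds are ignored."]
    notes = [f"SynAttackProtect is {protect} (enabled); the thresholds apply."]
    for name in HARDENING_VALUES[1:]:
        data = lookup.get(name.lower())
        if data is None:
            notes.append(f"{name} is not set; the default threshold applies.")
        else:
            notes.append(f"{name} = {data}")
    return notes


if __name__ == "__main__":
    # Example with hand-typed values; on Windows, pass read_tcpip_parameters().
    for note in review_syn_settings({"SynAttackProtect": 1, "TCPMaxHalfOpen": 100}):
        print(note)
```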

Automatic adjustment for the TCP window size (From Vista or 2008 only)

On the client side, Windows, starting with Vista, comes with a feature that dynamically adjusts the TCP window size depending on the network (remote host) conditions. See http://support.microsoft.com/kb/929868. Frankly, I doubt it can be the root cause; I just documented it for completeness.

If your application is affected by this problem, I hope you’ll find the culprit among those.

Any network device catching the traffic at TCP-level

If there is any firewall in place, look at its logs: they might reveal that some connections are refused when the client attempts to reuse them.

More Information

Thanks to Tim B (MSFT) and Pascal B (MSFT) for the hints and guidance.

And cut!

The French Connection Poster


He ain’t heavy he’s my (computer) browser

August 1, 2008 11:26 by Marc
Rambo III Poster

Back in 1988, Rambo III was set to be a blockbuster and Bill Medley sang the end title theme “He ain’t heavy…”.

Nowadays, Afghanistan has turned into a new kind of battlefield, and I was recently surprised by a side effect of upgrading Active Directory to 2008. A customer of mine, like John Rambo, shot first, thought afterwards (optionally, if things got messed up), then asked for rescue (not by helicopter, fortunately for me).

The problem was simple: the browse list (also incorrectly called network neighborhood by the end users) was almost empty, while before the upgrade it was still full of dozens of computers. The reason was simple, although a little unexpected (who said undocumented?): the “computer browser” service on the DC holding the PDC Emulator FSMO role was set to disabled and therefore stopped. Restarting it solved the issue (with a little patience due to propagation), but one question remained: why was it stopped and configured that way? I inspected the group policies applied to the DCs in depth but remained clueless. By reproducing the operation in a test environment, I found that the upgrade process intentionally stops the browser service and sets it to disabled, as simple as that but… not documented as far as I knew, until I found this blog post telling the whole story: http://blogs.technet.com/networking/archive/2008/07/25/netbios-browsing-across-subnets-may-fail-after-upgrading-to-windows-server-2008.aspx
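
For the record, the fix can be sketched as three commands: check the service, set it back to automatic start, then start it. The Python wrapper below is just an illustration under my own assumptions (the function names are made up, and I picked automatic start; manual would do as well). It prints the commands by default and only runs them, on the DC and from an elevated prompt, when dry_run is False.

```python
import subprocess

SERVICE = "Browser"  # short name of the Computer Browser service


def browser_fix_commands(service: str = SERVICE) -> list:
    """Return the commands that re-enable and start the service."""
    return [
        ["sc", "query", service],                     # show the current state
        ["sc", "config", service, "start=", "auto"],  # undo the "disabled" start type
        ["net", "start", service],                    # start the service now
    ]


def run(commands: list, dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first failing command, for example
            # when "net start" finds the service already running.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    run(browser_fix_commands())
```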



For more information about the browser service, John Rambo and that song, go here:

http://technet2.microsoft.com/windowsserver/en/library/43dce7f8-0741-4672-a7e3-762671110e9f1033.mspx?mfr=true

http://support.microsoft.com/kb/188001

http://technet2.microsoft.com/windowsserver/en/library/b4bf5ea0-f68d-403a-9194-2612f676d6c91033.mspx?mfr=true

http://www.imdb.com/title/tt0095956/

http://fr.youtube.com/watch?v=QOTLpQWloXE

And cut!

(And remember: God would have mercy, John Rambo won't!)