On IIS6: A process serving application pool ‘MyAppPool’ terminated unexpectedly. The process id was '1234'. The process exit code was '0xffffffff'

January 13, 2010 00:18 by Marc

Some apparently healthy IIS servers recently started reporting this error regularly. Some of them were also running SharePoint or OWA. All of them were configured to use Integrated Windows Authentication (IWA) as their authentication mechanism.

The problem with IIS worker process crashes is that they can have so many explanations, depending on the code executed, that you can easily waste a week before finding a reasonable one. In this case, all servers were affected, regardless of the application they run. So my first idea was “they might be under attack”. But that was not the case: the performance counters related to the worker process gave no sign of it, and the IIS logs confirmed this. Next “usual suspect”: a recently installed patch. Bingo, that was it. Here are the details:

  • The Security Update implementing “Extended Protection” for authentication in IIS (KB973917) was just deployed on all servers
  • All impacted servers are running Windows Server 2003 Service Pack 2
  • One or more applications served by that application pool/worker process have IWA enabled
  • After intensive file version analysis, it appeared that numerous IIS-related files (EXEs, DLLs…) were still at a pre-SP2 version

Due to the inconsistency of IIS files in combination with that extra hot fix, the worker process keeps crashing –> root cause found!

Now how to fix it:

  1. Perform an inventory of the currently installed post-SP2 fixes. I personally do it in a very straightforward way using psinfo, but I am sure you’ll find plenty of methods to do it the way you like
  2. Reinstall Service Pack 2
  3. Redeploy the post-SP2 hot fixes inventoried in step 1
  4. Check the installed IIS file versions
  5. If the file versions are OK, install KB973917
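Step 4 can be automated once you have a list of file names and the minimum versions you expect after SP2. The sketch below is only an illustration: the function names are hypothetical, and the baseline versions must come from your own reference server, since none are given here.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.2.3.4' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def find_stale_files(installed, baseline):
    """Return the names of files that are missing or older than the baseline.

    installed: dict mapping file name -> version string found on the server
    baseline:  dict mapping file name -> minimum version expected after SP2
               (supply these values yourself from a known-good server)
    """
    stale = []
    for name, minimum in baseline.items():
        found = installed.get(name)
        # A file absent from the inventory is reported as well
        if found is None or parse_version(found) < parse_version(minimum):
            stale.append(name)
    return sorted(stale)
```

Comparing tuples of integers rather than raw strings avoids the classic trap where "6.0.10" sorts before "6.0.9".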

Additional information:

Note: Make sure you pay attention to the process exit code which is always 0xffffffff. If you see another code, it might of course have another cause.
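If you have many servers to check, the event text can be filtered automatically. The following is a minimal sketch that assumes the event message follows the wording shown in the title of this post; the function names are hypothetical and nothing here reads the Windows event log itself.

```python
import re

# Pattern built from the event wording quoted in the title of this post
EVENT_PATTERN = re.compile(
    r"application pool '(?P<pool>[^']+)' terminated unexpectedly\. "
    r"The process id was '(?P<pid>\d+)'\. "
    r"The process exit code was '(?P<code>0x[0-9a-fA-F]+)'"
)


def parse_crash_event(message):
    """Extract pool name, process id and exit code, or return None."""
    # Normalize typographic quotes that copy/paste may have introduced
    normalized = message.replace("\u2018", "'").replace("\u2019", "'")
    match = EVENT_PATTERN.search(normalized)
    if match is None:
        return None
    return {
        "pool": match.group("pool"),
        "pid": int(match.group("pid")),
        "exit_code": int(match.group("code"), 16),
    }


def matches_this_issue(message):
    """True only when the exit code is the 0xffffffff discussed in this post."""
    event = parse_crash_event(message)
    return event is not None and event["exit_code"] == 0xFFFFFFFF
```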

Marc


Disabling PAC Validation II: Won't Get Fooled Again

November 4, 2009 15:13 by Marc

I did not expect to receive so much feedback by mail regarding this (not so fascinating) topic. Not to mention referring sites and so on… This motivated me to close the loop by testing in depth on Windows Server 2008 (SP2) as well as on 2008 R2, in order to cover the whole topic.

So in summary, when will PAC signature verification finally occur?

The table below summarizes the possible scenarios:

Server OS / Target Application or Service | Server 2003 pre SP2 | Server 2003 SP2 and above with extra registry configuration | Server 2008, Server 2008 R2
File & Print Sharing | NO Validation | NO Validation | NO Validation
Exchange Server | Validation | NO Validation | NO Validation
SQL Server | Validation | NO Validation | NO Validation
IIS with application pool identity set to Local System or Network Service | Validation | NO Validation | NO Validation
IIS with application pool identity set to a domain account | Validation | Validation | Validation

So in short, the only difference between Server 2003 and 2008/2008 R2 is that from 2008 onwards, you no longer need to modify the registry since the default value is inverted.

Once again, the important point here is: if you configure Kerberos on an IIS farm (SharePoint or “simple” ASP.NET), PAC Validation will ALWAYS occur, regardless of what you do to prevent it, UNLESS the application is granted, and makes use of, the right “Act as part of the operating system”.

If the target application is granted seTCB and makes use of it:

Granting the seTCB privilege is not sufficient because it remains disabled by default until the application effectively requests it. But why would it need it? This privilege might be needed by the server application for various reasons. Two common uses are described in the sections below.

Protocol transition

Protocol transition is the ability for a server application to delegate user credentials to a back-end service using Kerberos even though the client did not initially provide them in that form.

Put plainly, this means that a user may be authenticated by a service using a non-Kerberos protocol such as Basic, NTLM or Digest, and that this service, making use of the feature, will transform the credentials in order to propagate them to another server. Example: a user authenticates against SharePoint using NTLM and wants to use Reporting Services, which runs on a second server; the SharePoint server performs the necessary transition to push (aka “delegate”) the user’s credentials to the SRS server using Kerberos.

IIS MVP Ken Schaefer gives an excellent overview on his blog: IIS and Kerberos Part 5 - Protocol Transition, Constrained Delegation, S4U2S and S4U2P.

Services For User

S4U extensions are tightly linked to protocol transition. Very briefly, they allow an application, under certain conditions, to perform a logon on behalf of a user without knowing his/her password.

This feature is, for example, used in IAM/SSO products such as IBM TAM/WebSEAL or CA SiteMinder.

For both technologies, since the user does not initially authenticate using Kerberos, there is no PAC to validate.

OK but finally, why is disabling PAC validation so important?

Well, I won’t say it is “so important”. It might help improve performance under some circumstances.

In short, the PAC is verified by the server when the client presents its service ticket (loosely called the TGS here), so verification does not occur at every request as long as that ticket remains valid (note: there are some exceptions to this rule). BUT in some cases, this initial verification can take some time because 1) the client’s AD domain is far (in terms of network hops, latency, bandwidth…) from the server’s AD domain or 2) the domain controllers of the client’s domain are too busy. This can give the impression that client-to-server authentication is slow while you were expecting a big boost from switching to Kerberos.

Marc


Ways to Lock-down SharePoint Designer Server-Side

October 1, 2009 17:24 by Marc

Beyond the discussion over the pros and cons of SharePoint Designer, you may sometimes have to prevent its usage in order to comply with the policies or governance in place in your organization. Although you can block it from the client side (using software restriction policies or Office ADM files), the most effective measures (in terms of enforcement) remain on the server side.

Disable “Remote Interfaces” at Web Application Level

From the SharePoint Central Administration Web Site, you can modify the “Permissions for Web Application” and uncheck “Remote Interfaces”.

  • Pros: Simple, no code, config only
  • Cons: Will not only block SharePoint Designer but ALL “rich” protocols: WebDAV, SOAP, FPRPC… No more Explorer View, RSS feeds, working from Office or “Edit in Office” menu options…

Lock Site Definition in ONET.XML

Modifying the configuration of each site definition in ONET.XML (adding the attribute DisableWebDesignFeatures="wdfopensite" to each Project tag) will prevent customization using SharePoint Designer.

  • Pros: Offers granularity per site definition
  • Cons: Manual editing of ONET.XML by default… Up to you to package it properly otherwise.

Manage Permission at Site-Level

Add and Customize Pages: Removing this permission will prevent the editing of pages that are not stored as list items (default.aspx and so on)

Browse Directories: Disabling this option will simply prevent SPD from opening a site… But it will also break other WebDAV clients!

Manage Lists: Disabling this permission will prevent some kinds of modifications from SPD but will not prevent its usage overall. Moreover, it will also break simple browser-based advanced list editing…

  • Pros: Offers granularity per site.
  • Cons: Does not scale for large farms with lots of sites. Might break other functionality or might not totally lock down SPD usage… Better be prepared with a bullet-proof functional test matrix ;)

Using ISAPI Filter and User Agent String: UrlScan 3.1

Since version 3.0, the UrlScan tool comes with rich pattern-based request blocking options. To reject requests originating from SharePoint Designer, add the following elements to the configuration file (UrlScan.ini), with the RuleList entry placed in the [Options] section:

RuleList=DenyUserAgent

[DenyUserAgent]
DenyDataSection=Agent Strings
ScanHeaders=User-Agent

[Agent Strings]
MS FrontPage 12.0

  • Pros: Easy to deploy and to configure. Excellent performance. Contributes to the overall effort to secure the platform. Includes logging (own and IIS W3C files). Relatively granular deployment possible (per web application)
  • Cons: Remains at ISAPI level. No rich authorization options (based on user identity, membership…). Not 100% reliable: techies can find easy ways to tweak their User-Agent string

Using ISAPI Filter and User Agent String: Make your own

“Simplistic” example below. Ready to be compiled. Do not forget to choose the appropriate platform (32 or 64-bit) depending on your destination server, and to export both functions from the DLL (for example through a .def file).

#include <windows.h>
#include <httpfilt.h>
#include <string.h>

// Called once by IIS when the filter is loaded: declares the events we want.
BOOL WINAPI GetFilterVersion(HTTP_FILTER_VERSION *pVer)
{
  // Only the pre-processed headers are needed to inspect the User-Agent
  pVer->dwFlags = SF_NOTIFY_ORDER_DEFAULT | SF_NOTIFY_PREPROC_HEADERS;
  pVer->dwFilterVersion = HTTP_FILTER_REVISION;
  lstrcpynA(pVer->lpszFilterDesc, "SharePoint Designer Blocking ISAPI, Version 1.0", SF_MAX_FILTER_DESC_LEN);
  return TRUE;
}

// Called by IIS for each notification registered above.
DWORD WINAPI HttpFilterProc(HTTP_FILTER_CONTEXT *pfc, DWORD NotificationType, VOID *pvData)
{
    if (NotificationType == SF_NOTIFY_PREPROC_HEADERS) {
        HTTP_FILTER_PREPROC_HEADERS *p = (HTTP_FILTER_PREPROC_HEADERS *)pvData;
        char buffer[256] = "";
        DWORD buffSize = sizeof(buffer);

        // GetHeader fails if the header is missing or does not fit in the buffer
        if (p->GetHeader(pfc, (LPSTR)"User-Agent:", buffer, &buffSize) &&
            strstr(buffer, "MS FrontPage 12.0") != NULL) {
            // ServerSupportFunction belongs to the filter context, not to the headers structure
            pfc->ServerSupportFunction(pfc, SF_REQ_SEND_RESPONSE_HEADER, (PVOID)"403 Forbidden",
                (ULONG_PTR)"Content-Length: 0\r\nContent-Type: text/html\r\n\r\n", 0);
            // The response has been sent: stop processing this request
            return SF_STATUS_REQ_FINISHED;
        }
    }
    return SF_STATUS_REQ_NEXT_NOTIFICATION;
}

  • Pros: Easy to deploy. Excellent performance
  • Cons: Requires code and appropriate compilation. Limited functionality, hard to code compared to HTTP modules/handlers. No logging, totally blind operations. Not 100% reliable: techies can find easy ways to tweak their User-Agent string

Using HTTP Module and User Agent String: Make your own too

HTTP modules are roughly the .NET translation of ISAPI, with tons of extra capabilities. “Simple” example below.

using System;
using System.Web;

namespace BlockSPD
{
    public class BlockSPDHttpModule : IHttpModule
    {
        public void Init(HttpApplication context)
        {
            context.BeginRequest += new EventHandler(BeginRequest);
        }
        private void BeginRequest(object sender, EventArgs e)
        {
            HttpContext currentContext = HttpContext.Current;

            if (currentContext.Request.UserAgent != null && currentContext.Request.UserAgent.Contains("MS FrontPage 12.0"))
            {
                currentContext.Response.StatusCode = 403;
                currentContext.Response.StatusDescription = "Forbidden";
                currentContext.Response.End();
            }
        }
        public void Dispose()
        {    
        }
    }
}

  • Pros: Easy to develop and relatively easy to deploy. High flexibility for extra conditions: user’s identity, group membership, URL accessed… with minimal effort compared to ISAPI
  • Cons: Requires code. Implies more processing overhead than ISAPI. No logging, totally blind operations. Not 100% reliable: techies can find easy ways to tweak their User-Agent string

Using HTTP Module and User Agent String: Cool CodePlex projects

If you don’t want to waste your time playing around with your own modules/handlers, take a look at these from CodePlex:

More options using ISAPI, Modules, handlers or URLScan…

The technologies described above can also be used to reject requests that can be considered “typical” of SPD:

- Rejecting requests whose URL ends with “/_vti_bin/_vti_aut/author.dll”

- Rejecting verbs or requests that are typical of the FrontPage RPC protocol

In both cases, you might also block valid requests coming from Office applications such as Word or Excel though…
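To make the combined rule concrete, here is a small language-neutral sketch of the decision logic that a module, handler or filter would apply. The function name is hypothetical, and the case-insensitive matching is my own choice rather than something UrlScan or IIS prescribes.

```python
SPD_USER_AGENT_MARKER = "MS FrontPage 12.0"
AUTHOR_DLL_SUFFIX = "/_vti_bin/_vti_aut/author.dll"


def looks_like_spd_request(path, user_agent):
    """Return True when a request matches one of the two patterns above.

    path:       the requested URL path (a query string, if any, is ignored)
    user_agent: the User-Agent header value, or None when it is absent
    """
    path_only = path.split("?", 1)[0].lower()
    if path_only.endswith(AUTHOR_DLL_SUFFIX):
        return True
    if user_agent is None:
        return False
    return SPD_USER_AGENT_MARKER.lower() in user_agent.lower()
```

Keep in mind that the URL test alone is exactly what may also catch Word or Excel, so you may prefer to require both conditions in your own implementation.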

Marc


Disabling PAC Validation: More than meets the eye

August 7, 2009 15:45 by Marc

A lot of bloggers have published Kerberos authentication “how-to” guides on the Internet, many of them focusing on IIS, ASP.NET and subsequently SharePoint. Some report a performance boost over NTLM; others describe how to disable “PAC Validation” to increase that performance gain even more.

Recently, I was asked to evaluate that configuration, ensure it works as expected and measure the effective gain from an end-user perspective. This post is a quick summary of the findings not subject to NDA.

What is a PAC and what is PAC Validation (actually PAC Signature Verification)

PAC stands for “Privilege Attribute Certificate”. It is a data structure contained in some Kerberos requests. It actually contains authorization-oriented information such as the group membership of an authenticated user, his/her SID (and SID history), as well as other information such as user flags or the time when a forced logoff should potentially happen (mostly things you usually configure on the “Account” and “Group Membership” tabs of the AD Users and Computers console).

Since the original Kerberos protocol did not cover the “authorization” part, while Windows had to maintain compatibility when switching from NTLM to Kerberos, MS had no other choice than implementing a proprietary extension named PAC. Since the PAC travels over the network when users authenticate to servers (and services, actually), it may be altered by a malicious user able to decrypt it and modify it on the fly. The purpose of PAC Signature Verification is to make sure the PAC is unaltered when it is used by the server/service for granting access.

The verification/validation process takes place between a server and a domain controller of the domain the server belongs to. The server sends a request over RPC (therefore NOT using the standard Kerberos protocol) to that domain controller, specifying that it wants to verify the PAC signature and, of course, including that signature (not the PAC itself). If the PAC whose signature is to be validated does not belong to the server’s domain, the domain controller contacted for that purpose will follow the trust relationships in order to locate a domain controller for the user’s domain and send the verification request to it. PAC verification is therefore a security feature implemented to prevent malicious alteration of a user’s privileges.

Some people indicate that the verification process adds such an overhead to authentication that, from the end user’s point of view, Kerberos ends up performing no better than NTLM or Basic (in terms of performance, not security of course).

How to disable it (when you can)

There are 2 main scenarios in which validation will not occur, as long as specific conditions are met:

  • The server must run at least Windows Server 2003 Service Pack 2, the registry must include the setting “ValidateKdcPacSignature” set to “0” (the default on Server 2008), and the server application must be a Windows system service (its primary security token actually includes the “SERVICE” SID)
  • The server application’s identity is granted the seTcb privilege (aka Act as part of the operating system) and makes effective use of it.
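The two scenarios can be summarized as a simple predicate. This is only a sketch of the rules as described in this post, not of what Windows does internally; the function and parameter names are hypothetical, and the registry condition is passed as a boolean so that you decide how to read the setting.

```python
def pac_validation_skipped(at_least_2003_sp2,
                           registry_disables_validation,
                           token_has_service_sid,
                           setcb_granted,
                           setcb_used):
    """Return True when, per the two scenarios above, no PAC validation occurs."""
    # Scenario 1: a Windows system service on a suitably configured server
    service_path = (at_least_2003_sp2
                    and registry_disables_validation
                    and token_has_service_sid)
    # Scenario 2: the privilege must be both granted AND effectively used
    setcb_path = setcb_granted and setcb_used
    return service_path or setcb_path
```

For example, an IIS application pool running under a domain account on a correctly configured server gives `pac_validation_skipped(True, True, False, False, False)`, which is `False`: validation still occurs.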

Witnessing the PAC Signature Verification Process

From a technical (read “measurable”) perspective, there are multiple ways to witness the PAC verification process:

1. Use a network traffic analysis tool such as MS Netmon or Wireshark and look for RPC communications between the server and the DC of its own domain right after the user attempted to authenticate against that server. Captured using Netmon 3.3, it would look like this:

From the server to the DC: NetLogonr:NetrLogonSamLogonEx Request, NLRNetrLogonSamLogonEx, Encrypted Data

Reply from the DC: NetLogonr:NetrLogonSamLogonEx Response

2. Enable netlogon logging and look for entries such as the following:

07/31 15:56:23 [LOGON] SamLogon: Generic logon of MYDOMAIN.LOCAL\(null) from (null) Package:Kerberos Returns 0x0

3. Enable Kerberos tracing/lsass logging and look for entries such as:

396.500> Kerb-Warn: KerbVerifyPacSignature contacting domain MYDOMAIN.LOCAL for user MYUSER
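When the netlogon log from method 2 grows large, a short script can pull out the relevant entries. The sketch below assumes every entry has exactly the layout of the sample line shown above; the pattern and function name are hypothetical and may need adjusting for your own logs.

```python
import re

# Layout inferred from the single sample entry shown under method 2
GENERIC_LOGON = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[LOGON\] "
    r"SamLogon: Generic logon of (?P<domain>[^\\\s]+)\\.*?"
    r"Package:Kerberos Returns (?P<status>0x[0-9A-Fa-f]+)"
)


def find_kerberos_generic_logons(lines):
    """Return (timestamp, domain, status) for each matching log line."""
    hits = []
    for line in lines:
        match = GENERIC_LOGON.match(line.strip())
        if match:
            hits.append((match.group("stamp"),
                         match.group("domain"),
                         int(match.group("status"), 16)))
    return hits
```

Counting the hits per minute before and after a configuration change is a cheap way to check whether verification requests really stopped.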

Some noticeable behaviors…

IIS application pools do not run as “service”; they usually log on as “BATCH”, BUT their process token will include the “SERVICE” SID if the identity used to run them is “LocalSystem” or “NetworkService”. Since you cannot use these identities when multiple IIS servers are used in a cluster or a “farm” (Kerberos requires a domain account in that case), there is very little chance (read: no chance) for PAC validation to be disabled.

Incidentally, this gave me the opportunity to witness that when you schedule a batch job under the context of “NetworkService”, it will actually get the “SERVICE” SID too!

Although online documentation reports that verification does not occur if the server’s process token holds the seTcb right, this did not seem to be sufficient. The application must explicitly make use of it. Simply granting the right will not prevent PAC validation. This makes sense if you look at the reasons why seTcb is often granted: it allows the use of the S4U Kerberos extensions, which enable, for example, a user logon without having to provide any password (!). These extensions are often used in single sign-on solutions or packages and allow easy transitioning of identities. Since in that case a new token is “regenerated”, there is no use in verifying a PAC signature.

Generally speaking, granting the SeTCB right is something to be cautiously evaluated before implementation because it would allow its holder to perform OS-level operations such as, for example, logging on as a given domain user without knowing his/her password.

(Challengeable) Conclusions

  • Effectively disabling PAC Signature Validation is a no-brainer when the server application is running as a Windows system service (MS Exchange Store, File & Print Sharing, MS SQL Server…). IIS Application Pools are NOT Windows system services; they require special care, and some scenarios will not allow PAC Signature Verification to be prevented
  • Granting the seTcb privilege (aka Act as part of the operating system) is not sufficient per se to prevent PAC Validation. The server application must effectively make use of the privilege by, for example, using MS proprietary Kerberos extensions like Service for User (S4U)
  • Disabling PAC validation on a SharePoint farm (IIS application pool running with a domain identity instead of Network Service) in order to prevent round trips to domain controllers when a user is authenticating against a WFE will not work, contrary to what many bloggers state!
  • Finally, unless the authentication rate is extremely high and/or the server must authenticate users from a “remote” domain over high-latency links, disabling PAC Verification does not significantly improve performance.

More Information (to be decoded and tested properly!)

I would like to thank the Windows security specialist Emmanuel Dreux (www.Ilinfo.fr) for his help and input while deciphering Windows’ internal security behavior, as well as my colleague Olivier B for his nearly encyclopedic knowledge of Kerberos on Windows.

I am aware my conclusions may seem to conflict with 1) some official MS documentation and 2) the technical recommendations of highly influential bloggers. Test it carefully and feel free to report your findings here!

Marc


An existing connection was forcibly closed by the remote host

June 4, 2009 11:17 by Marc

Two different projects complaining about the same issue: nice troubleshooting challenge! One is SharePoint-based and the second is a, let’s say, “entertaining” .NET-based application. They both make use of SQL Server as back-end data store and both complain about having “existing connection forcibly closed” reported in their stack trace when they attempt to connect to SQL.

This happens when a client application tries to reuse an existing TCP connection to a remote host that has closed it in the meantime, making connection reuse impossible. There are actually multiple possible root causes, which do not seem to be mutually exclusive:

Limit set on the number of connections allowed by SQL Server on a given instance

For a given SQL instance, you can set the maximum number of connections that can be used by applications. Depending on the way your application is written, multiple connections might be used for a single transaction… Raise the limit or set it to unlimited as necessary.

The (infamous) Scalable Networking Pack

The Scalable Networking Pack is a set of improvements brought to the Windows networking stack. It is available as an add-on for Windows Server 2003 but is included from Service Pack 2 onwards, as well as in Windows Vista/2008.

This update greatly modifies the way Windows handles network connectivity at TCP-level and might therefore provoke the error. In short, the following settings should be modified on the SQL Server (or on any server acting as the server component):

In the registry, under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\

  • EnableTCPChimney (REG_DWORD) set to 0 (disabled)
  • EnableRSS (REG_DWORD) set to 0 (disabled)
  • EnableTCPA (REG_DWORD) set to 0 (disabled)

Applying the change requires a reboot. [UPDATE] Some MS sources report that a reboot is not necessary for some settings, so I am changing my statement to: *might* require a reboot.
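If you need to push these three values to several servers, one option is to generate a .reg file and review it before importing it. The sketch below only builds the text of such a file; the function name is hypothetical, and you should test the result on a non-production server first.

```python
TCPIP_PARAMETERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)

# The three Scalable Networking Pack settings listed above, all disabled
SNP_SETTINGS = {"EnableTCPChimney": 0, "EnableRSS": 0, "EnableTCPA": 0}


def build_reg_file(key, dword_values):
    """Return the text of a .reg file setting REG_DWORD values under one key."""
    lines = ["Windows Registry Editor Version 5.00", "", "[" + key + "]"]
    for name, value in dword_values.items():
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("not a valid DWORD: {!r}".format(value))
        # .reg files express DWORDs as 8 hexadecimal digits
        lines.append('"{}"=dword:{:08x}'.format(name, value))
    return "\r\n".join(lines) + "\r\n"
```

Calling `build_reg_file(TCPIP_PARAMETERS_KEY, SNP_SETTINGS)` returns the file content as a string, which you can then save and inspect before deciding to import it.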

You’ll find a lot of trustworthy online resources recommending to disable the SNP…

On the other hand, recent NIC drivers may allow your system to work properly with these options enabled… Look at this page to get a list of SNP “partners”: http://technet.microsoft.com/en-us/network/cc984184.aspx.

Faulty NIC, NIC driver or driver settings

Some NICs include a TCP Offload Engine (TOE). When incorrectly configured or running an outdated driver, they will generate errors at TCP level.

In some cases, the TOE simply does not work, so you also might want to test with this function completely disabled. When editing your driver’s parameters, look for “Large Send Offload”, “Checksum offload”…

Important to note: you might also want to check the link speed and duplex at NIC level AND at switch port level, as they might also cause the problem. Remember: they must be identical on BOTH sides.

Applying the change *might* require a reboot.

Windows TCP/IP Stack Custom Configuration or Hardening

There are plenty of resources describing how to “harden” the Windows TCP/IP stack. Unfortunately most of them simply show the “how to”, not its consequences, one of them being the performance decrease implied by hardening. You’ll also find those parameters under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\

In our case, the parameter “SynAttackProtect” set to 1 instead of 0 (disabled) will force Windows to be more restrictive regarding incoming TCP connection requests as well as more aggressive with the (re)use of existing ones. If the parameter is enabled, the following additional parameters will also be taken into account:

  • TcpMaxPortsExhausted: Determines the maximum number of connections that can be opened before enabling protection against SYN attacks
  • TCPMaxHalfOpen: Determines the maximum number of connections that can be left “half-open” (waiting for re-use)
  • TCPMaxHalfOpenRetried: Same as above BUT applicable to connections that were effectively re-used by the original client

The parameters above are thresholds used by Windows to determine whether a TCP-based (SYN) attack is in progress. They should only be used if the server is in a high-risk situation (DMZ or internet-facing) with no other security device in place (firewall…).

Note that before Windows 2003 SP2, SynAttackProtect is set to 0; with SP2, it is set to 1; then with the latest versions of Windows, it returns to 0…

Automatic adjustment for the TCP window size (From Vista or 2008 only)

On the client side, Windows, starting from Vista, comes with a feature that dynamically sets the TCP window size depending on the network (remote host) conditions. See http://support.microsoft.com/kb/929868. But I frankly doubt it can be the root cause; I just documented it for completeness.

If your application is affected by this problem, I hope you’ll find the culprit among those listed here.

Any network device catching the traffic at TCP-level

If there is any firewall in place, look at its logs: they might reveal that some connections are refused when the client attempts to reuse them.

More Information

Thanks to Tim B (MSFT) and Pascal B (MSFT) for the hints and guidance.

And cut!