Troubleshoot problems with Kerberos in SharePoint - Part 1

In this section we will create a test environment to show which error messages result from which configuration problems.
Jesper M. Christensen

Introduction

If you haven't read the earlier article on Kerberos in a SharePoint environment, which covers Kerberos configuration and the logon process, please read it first to get a better understanding of what happens when a website is accessed and of the basic configuration issues.

It is often very difficult to pin down the meaning of the error messages that appear, and you can spend a lot of time searching the internet for help. So in this section we will create a test environment to make those tasks easier for you.

This is not a guide to every Kerberos-related error. Instead, we build a test environment and deliberately create different problems to show which errors they produce. The error messages in the server event logs often seem fairly obvious, but sometimes a deeper investigation is needed.

Setup

The demo lab has the following computers:
DC1 Domain Controller (KDC)
SQL1 SQL Server 2008
WSS1 Windows SharePoint Services 3.0 SP1 (+ Infrastructure Update)
PC1 Windows Vista

Figure 1

Service Principal Names (SPNs) and delegation are configured as shown in Figure 2.

Figure 2

Where is the toolbox?

When troubleshooting errors, we need a set of tools. In this series we will use only some of them, but for your convenience here is a list of recommended troubleshooting tools.

  1. Windows event logs on servers and clients
  2. IIS log files on front-end servers, SQL servers, and domain controllers
  3. SharePoint log files
  4. Command-line tools
    - setspn (from the add-on toolkit for Windows Server; included by default in Windows Server 2008)
    - ldifde
    - KList (from the add-on toolkit for Windows Server; included by default in Windows Server 2008)
  5. GUI tools
    - KerbTray (from Windows 2000 Server; works with all versions of Windows)
    - ADSIEdit
    - Network Monitor
    - Wireshark network analyzer

Some useful commands for clearing caches while testing:

  1. DNS cache: ipconfig /flushdns
  2. NetBIOS name cache: nbtstat -R
  3. Kerberos tickets: klist purge
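
The three cache-clearing commands can be bundled so that every test starts from a clean state. The following is a minimal Python sketch rather than part of the original lab: the names `CACHE_COMMANDS`, `clear_caches`, and the `dry_run` flag are illustrative assumptions, and the commands themselves are only available on Windows.

```python
import subprocess

# The three commands from the list above:
# DNS cache, NetBIOS name cache, Kerberos tickets.
CACHE_COMMANDS = [
    ["ipconfig", "/flushdns"],
    ["nbtstat", "-R"],
    ["klist", "purge"],
]


def clear_caches(dry_run=True):
    """Return the cache-clearing commands; run them only when dry_run is False."""
    issued = []
    for command in CACHE_COMMANDS:
        issued.append(" ".join(command))
        if not dry_run:
            # check=False: a non-zero exit code from one tool
            # does not stop the remaining commands.
            subprocess.run(command, check=False)
    return issued
```

Calling `clear_caches()` only lists the commands; `clear_caches(dry_run=False)` would actually run them on a Windows test client.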

When analyzing the Kerberos logon procedure, follow the actions listed in the table in Figure 3.

Figure 3

Issues to examine

A number of problems are common on servers. This article covers the following:

  1. Time and date
  2. Application pool account
  3. SPN configuration

For each problem we create, we will show what appears in the Windows event logs and in the network analyzer.

Time and date

Date and time are a very important part of the Kerberos authentication mechanism, because the tickets issued by the Key Distribution Center (KDC) are valid only for a limited period of time. If the client and server clocks are not synchronized, ticket validation fails; this is part of the security design. It is therefore important to check that all servers and clients have the correct time, time zone, and regional settings. In this example, we introduce date and time problems.
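
The clock comparison behind this check can be illustrated with a short Python sketch. It is not part of the original lab: the function name `within_kerberos_skew` and the sample timestamps are made up for illustration, and the 5-minute tolerance is an assumption based on the default Active Directory Kerberos policy, which your domain may override.

```python
from datetime import datetime, timedelta

# Assumption: the default Active Directory Kerberos policy tolerates
# a clock difference of up to 5 minutes between two machines.
MAX_SKEW = timedelta(minutes=5)


def within_kerberos_skew(client_time, server_time, max_skew=MAX_SKEW):
    """Return True if the two clocks differ by no more than max_skew."""
    return abs(client_time - server_time) <= max_skew


# Arbitrary reference time standing in for the KDC clock.
kdc_clock = datetime(2009, 1, 1, 12, 0, 0)

# A machine 3 minutes ahead stays within the tolerance.
print(within_kerberos_skew(kdc_clock + timedelta(minutes=3), kdc_clock))

# A machine 24 hours ahead does not.
print(within_kerberos_skew(kdc_clock + timedelta(hours=24), kdc_clock))
```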

Time differences on SharePoint servers

We set the clock on the SharePoint server WSS1 24 hours ahead, and the following warning appears in its Windows System event log:

Warning, W32Time, Event ID: 52, Category: None
The time service has set the time with offset -86391 seconds.

Servers normally synchronize their time automatically, so these errors are rarely encountered. That is what happened in our experiment, so no administrator intervention was needed.

However, domain controllers can sometimes have synchronization problems. We tested this by changing the time on the domain controller, and Kerberos logged LSASRV event ID 40960 in the System event log.

Warning, LSASRV, Event ID: 40960, Category: SPNEGO (Negotiator)
The Security System detected an authentication error for the server MSSQLSvc/sql1.domain.local:1433. The failure code from authentication protocol Kerberos was "The time at the Primary Domain Controller is different than the time at the Backup Domain Controller or member server by too large an amount. (0xc0000133)".

Time errors are fairly easy to fix: adjust the time, or open the required firewall ports if the time synchronization packets are being blocked. In virtual environments, time synchronization can cause more serious problems, because a virtual machine's virtual hardware clock may differ from those of other virtual servers.

Application pool accounts

SharePoint configures the IIS websites for its web applications automatically, and when creating a web application you select or add an application pool. The web application runs in this pool under the pool's configured identity (user).

Changing the application pool account manually

Websites run in IIS application pools, which are not meant to be configured by hand. If an administrator changes the identity of the application pool to a wrong account, the website can become unavailable, and the change must then be reverted.

We try changing the application pool account to domain\spwrongacct for http://intranet.domain.local.

Figure 4

This will cause errors in the Windows System event log on the SharePoint server.

Warning, W3SVC, Event ID: 1012, Category: None
The identity of application pool, 'SharePoint - intranet.domain.local - 80' is invalid. If it remains invalid when the first request for the application pool is processed, the application pool will be disabled. The data field contains the error number.

Warning, W3SVC, Event ID: 1057, Category: None
The identity of application pool 'SharePoint - intranet.domain.local - 80' is invalid, so the World Wide Web Publishing Service cannot create a worker process to serve the application pool. Therefore, the application pool has been disabled.

Error, W3SVC, Event ID: 1059
A failure was encountered while launching the process serving application pool 'SharePoint - intranet.hendriksen.dk80'. The application pool has been disabled.

The client computer accessing the website sees the error: Service Unavailable

To fix this error, change the identity back to the account that is configured in the SharePoint configuration and start the application pool again from the IIS management console. If you need to change the user or password in the SharePoint configuration, follow the steps described in the Microsoft article.

Service Principal Name (SPN) configuration

Correct SPN configuration is also essential for Kerberos authentication to work. First, we summarize how SPNs are used between the server and the client.

  1. The user types a URL in Internet Explorer (e.g. http://intranet.domain.local)
  2. The client browser builds an SPN from the service type and the host name (SPN: HTTP/intranet.domain.local - Service type: HTTP, Name: intranet.domain.local)
  3. The client sends a request to the KDC to get a ticket for this SPN
  4. The KDC encrypts the ticket with the key of the account the SPN is registered to (domain\spcontentpoolacct) and sends the ticket to the client
  5. The client authenticates with the SharePoint server (front end) by sending the ticket
  6. The SharePoint server decrypts the ticket with the key of the application pool account (its identity) and checks the contents
  7. The user is authenticated, or an error message is sent to the event log or to the client browser
  8. If Kerberos authentication fails, NTLM authentication is attempted
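
Step 2 of the list above can be sketched in a few lines of Python. This is an illustration only, not the browser's actual implementation: the function name `spn_from_url` is hypothetical, and the sketch assumes that the service class is HTTP for both http and https URLs and that the port is left out of the SPN.

```python
from urllib.parse import urlsplit


def spn_from_url(url):
    """Build the SPN a browser would typically request for a web URL.

    Assumption: the service class is HTTP for both http and https URLs,
    and the port is not included in the SPN.
    """
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host name in URL: {url!r}")
    return f"HTTP/{host}"


print(spn_from_url("http://intranet.domain.local"))
```

The value returned for the lab URL is the same SPN that appears in the setspn commands later in this article.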

Missing SPN for the web application

We will see what happens when the client cannot get a ticket from the KDC, by removing the SPN mapping from the account.

Delete the SPN from the wrong account: SETSPN -D HTTP/intranet.domain.local domain\spwrongpoolacct
Then we access the website from PC1 at http://intranet.domain.local and reach the site's default page. But how was the user authenticated?

If we check the event log on the client, we find no entries. In the Windows Security event log on the SharePoint server, we see the following entry:

Audit Success, Event ID: 4624, Category: Logon
.
Logon process: NtLmSsp
Authentication Package: NTLM
.

So the Kerberos logon failed, and authentication fell back to NTLM. We need to investigate why this happened: we can enable additional Kerberos logging on the client and server, or use a network analyzer. Most of the time we use the network analyzer Wireshark, installing and running it on the client. Capturing the process above gives the following output:

Figure 5

When the SPN is missing, Active Directory returns KDC_ERR_S_PRINCIPAL_UNKNOWN. This message says that Active Directory cannot find a matching SPN for this website.
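
The NTLM fallback seen in the security event log excerpt earlier in this section can also be spotted programmatically. The following Python sketch assumes the event text has been exported as plain "Field: value" lines like those shown in the excerpt; the function names `authentication_package` and `fell_back_to_ntlm` are hypothetical helpers, not part of any Windows tool.

```python
def authentication_package(event_text):
    """Return the value of the 'Authentication Package' field, or None."""
    for line in event_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "authentication package":
            return value.strip()
    return None


def fell_back_to_ntlm(event_text):
    """Return True when a logon event reports NTLM instead of Kerberos."""
    package = authentication_package(event_text)
    return package is not None and package.upper() == "NTLM"


# The two fields quoted from the logon event in this article.
sample = """Logon process: NtLmSsp
Authentication Package: NTLM"""
print(fell_back_to_ntlm(sample))
```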

SPN registered to the wrong account in Active Directory

If the decryption key does not match in step 6, the ticket was encrypted with the key of a different account, and the configuration has an error somewhere. Let's register the SPN to the wrong account and see the result.

Delete the SPN from the correct account: SETSPN -D HTTP/intranet.domain.local domain\spcontentpoolacct

Add the SPN to the wrong account: SETSPN -A HTTP/intranet.domain.local domain\spwrongpoolacct

If we capture network packets on the SharePoint server, we can see the communication that takes place when we run iisreset /noforce and then access the web application.

Figure 6

The SharePoint server receives Kerberos information from the KDC and uses it to decrypt the ticket. If the key does not match, the server generates an error message that is sent to the client.

In the Windows System event log of the client, we will get the following error:

Error, Event ID: 4, Category: None
The kerberos client received a KRB_AP_ERR_MODIFIED error from the server wss1$. The target name used was HTTP/intranet.domain.local. This indicates that the target server failed to decrypt the ticket provided by the client. This can occur when the target server principal name (SPN) is registered on an account other than the account the target service is using. Please ensure that the target SPN is registered on, and only registered on, the account used by the server. This error can also happen when the target service is using a different password for the target service account than what the Kerberos Key Distribution Center (KDC) has for the target service account. Please ensure that the service on the server and the KDC are both updated to use the current password. If the server name is not fully qualified, and the target domain (DOMAIN.LOCAL) is different from the client domain (DOMAIN.LOCAL), check if there are identically named server accounts in these two domains, or use the fully-qualified name to identify the server.

When the front-end server tries to decrypt the service ticket, decryption fails because the ticket was encrypted with the key of the account that holds the SPN (domain\spwrongpoolacct) but is decrypted with the key of the application pool account (domain\spcontentpoolacct). The KRB_AP_ERR_MODIFIED error is sent to the client and appears in its Windows System event log.

We then restore the correct configuration by registering the SPN to the domain\spcontentpoolacct account:

Delete the SPN from the wrong account: SETSPN -D HTTP/intranet.domain.local domain\spwrongpoolacct
Add the SPN to the correct account: SETSPN -A HTTP/intranet.domain.local domain\spcontentpoolacct
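
Moving an SPN always follows the same delete-then-add pattern, so the two commands can be generated together to avoid typos. The Python sketch below only builds the command lines and does not run them; the function name `move_spn_commands` is a hypothetical helper, and the account names are written in DOMAIN\account form as an assumption about the lab.

```python
def move_spn_commands(spn, from_account, to_account):
    """Return the setspn commands that move an SPN between two accounts."""
    return [
        f"setspn -D {spn} {from_account}",  # remove the old registration
        f"setspn -A {spn} {to_account}",    # register the SPN on the new account
    ]


for command in move_spn_commands(
    "HTTP/intranet.domain.local",
    r"domain\spwrongpoolacct",
    r"domain\spcontentpoolacct",
):
    print(command)
```

Review the printed commands before running them on a domain-joined machine with sufficient rights.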

Note: The KRB_AP_ERR_MODIFIED error can also be caused by other configuration errors.
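
The indicators produced in this lab can be collected into a small lookup table for a first pass over exported log text. This Python sketch is an illustration, not a complete diagnostic: `KNOWN_INDICATORS` and `likely_causes` are hypothetical names, the table covers only the errors provoked in this article, and a match is a hint to investigate rather than a confirmed cause.

```python
# Indicators and causes as observed in this article's test lab only.
KNOWN_INDICATORS = {
    "0xc0000133": "Clock difference too large: check time synchronization",
    "KDC_ERR_S_PRINCIPAL_UNKNOWN": "No account holds the requested SPN",
    "KRB_AP_ERR_MODIFIED": (
        "SPN registered on a different account than the one running "
        "the service, or a service account password mismatch"
    ),
}


def likely_causes(log_text):
    """Return the causes whose indicator appears in the log text."""
    lowered = log_text.lower()
    return [
        cause
        for indicator, cause in KNOWN_INDICATORS.items()
        if indicator.lower() in lowered
    ]


for cause in likely_causes("received a KRB_AP_ERR_MODIFIED error from wss1$"):
    print(cause)
```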

Conclusion

We have set up a test environment, identified several tools to use, and provoked error messages to help find answers to problems with date and time, application pool accounts, and SPN configuration.

In the next parts of this series, we will look at some other typical problems, such as:

  1. Duplicate Service Principal Names
  2. DNS configuration errors
  3. Delegation: when it is used and how to check it
  4. Shared Service Provider (SSP)
  5. Further investigation with the network analyzer