70-240 in 15 minutes a week: Kerberos and Active Directory Replication


June 29, 2001

by Dan DiNicolo
http://www.2000trainers.com

Welcome to article number 18 in my 70-240 in 15 minutes a week series. This week's article covers Kerberos, as well as Active Directory replication. This includes a look at how Kerberos functions in single- and multiple-domain environments, and how replication occurs both within and between sites. Next week's article will be the final article in the Active Directory portion of the series, with a deeper look at maintaining the AD database, managing operations masters, and Remote Installation Services. The week after that will begin the first of approximately eight articles covering the Networking Services material.

The material to be covered in this article includes:

Kerberos v.5
Active Directory Replication

A quick note for those who don't yet know - the Win2000Trainer.com website no longer exists. It has been replaced by my new partnership, 2000Trainers.com. Be sure to check out the expanded content, which now includes SQL and Exchange 2000 content as well. Thanks to everyone who supported Win2000Trainer.com.


Kerberos v.5

Windows 2000 Active Directory relies on a different authentication protocol than Windows NT 4. Where NT 4 used the NT LAN Manager (NTLM) protocol for authentication, Windows 2000 utilizes Kerberos. The Kerberos protocol was developed at MIT, and is named after Cerberus, the three-headed dog that guards the gate to Hades. Why do I bother telling you this? Because it makes it easier to remember that Kerberos is a 3-pronged authentication scheme. The three parts of Kerberos are:

1. Client - the system/user making the request
2. Server - the system that offers a service to systems whose identity can be confirmed
3. Key Distribution Center (KDC) - the third-party intermediary between the client and the server, which vouches for the identity of a client. In a Windows 2000 environment, the KDC is a domain controller running Active Directory (it could also be a UNIX-based KDC)

The way that Kerberos works can seem a little intimidating if you get into all of the tiny details, so I'll skip those in favor of an overview of how things work. It is more important that you understand the overall process to begin with. If you want every behind-the-scenes detail, I've provided a link at the end of the section.

In a Kerberos environment a user provides a username, a password, and the name of the domain (often referred to as a realm in Kerberos lingo) that they wish to log on to. This information is sent to a KDC, which authenticates the user. If the user is valid, they are presented with something called a ticket-granting ticket, or TGT. I like to think of the TGT as a hand-stamp at a country fair - it proves that you have paid admission. The TGT is helpful in that it saves you from having to re-authenticate every time you need to access a server.

However, if you do want to access a server, you still require a ticket for that server or you will not be able to create a session with that machine. Think of a ticket as being like the ticket you need to purchase to get on rides at the country fair - even though you've paid admission (proved by the hand-stamp or TGT), you still need a ticket to get into the haunted house. When you wish to access a server, you first need to go to the KDC, present your TGT as proof of identity, and then request a session ticket for the server you wish to contact. This ticket simply acts as authentication between the client and the server you wish to contact. Once you are authenticated, whether or not you will be able to actually access anything on the server depends on your permissions. The TGT and session tickets that you are presented with expire after a period of time that is configurable via group policy. The default maximum lifetime for a TGT (also referred to as a user ticket) is 10 hours, and it can be renewed for up to 7 days, while the default maximum lifetime for a session ticket (sometimes called a service ticket) is 600 minutes, which is also 10 hours.
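To make the hand-stamp and ride-ticket analogy concrete, here is a minimal sketch of the two-stage flow in Python. It is a toy model only: `ToyKDC`, `Ticket`, and the lifetime parameters are names invented for illustration, nothing is encrypted, and real Kerberos never sends the password to the KDC in this form.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    user: str
    target: str      # "krbtgt" marks a TGT; otherwise the server's name
    expires: float   # expiry time in seconds (toy clock)


class ToyKDC:
    """Hypothetical stand-in for a KDC; lifetimes are supplied by the caller."""

    def __init__(self, accounts, tgt_lifetime, session_lifetime):
        self.accounts = accounts                  # username -> password
        self.tgt_lifetime = tgt_lifetime
        self.session_lifetime = session_lifetime

    def log_on(self, user, password, now):
        # Stage 1: prove identity once, receive the "hand-stamp" (TGT).
        if self.accounts.get(user) != password:
            raise PermissionError("logon failed")
        return Ticket(user, "krbtgt", now + self.tgt_lifetime)

    def request_session_ticket(self, tgt, server, now):
        # Stage 2: present the TGT, receive a ticket for one specific server.
        if tgt.target != "krbtgt" or now >= tgt.expires:
            raise PermissionError("TGT missing or expired")
        return Ticket(tgt.user, server, now + self.session_lifetime)
```

The point of the design is that the password is only needed in `log_on`; every later request is made with the TGT until it expires.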

In a single-domain environment, Kerberos authentication is pretty straightforward. However, in a multiple-domain environment Kerberos has more steps involved. The reason for this is that when you are attempting to obtain a session ticket for a server, it must be obtained from a KDC in the domain where the server exists. Also, you must obtain session tickets in order to traverse the trust path to the KDC you need to contact. The example below outlines the steps necessary for a client in east.win2000trainer.com to access a server in west.win2000trainer.com.

1. The client logs on to the network as a user in east.win2000trainer.com, and is presented with a TGT.
2. The client wants to communicate with a server in west.win2000trainer.com. It contacts the KDC in east.win2000trainer.com, asking for a session ticket for a KDC in the win2000trainer.com domain.
3. After it receives this ticket, it contacts the KDC in win2000trainer.com, requesting a session ticket for the KDC in west.win2000trainer.com.
4. After it receives this ticket, it contacts the KDC in west.win2000trainer.com, and requests a session ticket for the server in west.win2000trainer.com that it originally wanted to contact.
5. Once granted the session ticket for the server, the client contacts that server directly and can access resources according to the permissions in place.
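The order in which KDCs are contacted in the steps above follows the domain tree: up from the client's domain to a common ancestor, then down to the server's domain. A small sketch of that walk, assuming a single tree where each child domain's name ends with its parent's name. The function names are made up for this example, and shortcut trusts are not modelled.

```python
def ancestors(domain, root):
    """Return [domain, parent, ..., root] by stripping leftmost DNS labels."""
    chain = [domain]
    while domain != root:
        if not domain.endswith("." + root):
            raise ValueError(f"{domain} is not in the tree rooted at {root}")
        domain = domain.split(".", 1)[1]
        chain.append(domain)
    return chain


def kdc_path(client_domain, server_domain, root):
    """Domains whose KDCs are contacted, in order, along the trust path."""
    up = ancestors(client_domain, root)
    down = ancestors(server_domain, root)
    common = next(d for d in up if d in down)   # first shared ancestor
    return up[: up.index(common) + 1] + down[: down.index(common)][::-1]


path = kdc_path("east.win2000trainer.com",
                "west.win2000trainer.com",
                "win2000trainer.com")
# path lists east, then the root domain, then west
```

Each extra level in the tree adds one more KDC to the list, which is why deep trees make cross-domain access slower.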

If this seems like a great many steps, it is. This is one of the reasons that you might consider implementing shortcut trusts, as outlined in my last article. If shortcut trusts exist, the shorter available path would be used. Kerberos is a wonderful protocol in that it makes the network much more secure, due to the necessity of authentication between clients and servers before a session can be established. It is actually much faster than you might think. For a good hands-on experiment, you might consider setting up multiple domains and then running Network Monitor while accessing resources between domains. Though the packet contents are encrypted, it will still give you a great idea of what is happening behind the scenes. Three utilities that you should be aware of for troubleshooting Kerberos problems are Netdom (discussed in a previous article), as well as the resource kit utilities KerbTray.exe and Klist.exe.

For a great piece of reading on Kerberos, click here 
For the Kerberos V5 RFC, click here 

Active Directory Replication

Windows 2000 implements replication much differently than Windows NT 4. In Windows NT domain environments, replication was single-master, meaning that only one domain controller actually accepted updates - the PDC. In Windows 2000, the model is multi-master, meaning that any domain controller can update Active Directory. This presents some challenges in terms of tracking changes on the network and resolving conflicts that might occur, as I'll discuss in a moment. However, along with the challenges that Windows 2000 Active Directory replication presents, it also presents an opportunity in that replication can finally be easily controlled, through the use of sites, site links, and schedules. 

I could easily have devoted this entire article to replication alone, since there is so much that takes place behind the scenes. Thankfully, what is most important to understand is a handful of concepts (albeit a rather large handful), most of which are rather straightforward. Let's begin by taking a look at how replication works.

In an Active Directory environment, domain controllers do not all contact one central domain controller for changes. Instead, they create relationships with one another that track which domain controllers are sources of replication changes for them. These relationships are called connection objects, and I'll discuss how they work and how they are created shortly. Since every domain controller can accept updates, a distinction is made as to which domain controller made the original update. An update made directly on a DC is called the originating update. Any update that is received on a DC as a result of replication is referred to as a replicated update.

The actual process of getting updates from one domain controller to another is different depending on whether we are talking about replication within a site or replication between sites. As discussed earlier in the series, a site is a collection of high-speed IP subnets, and the intermediary element between sites is most often a WAN link. You need to define sites and subnets in Active Directory, or AD will assume that all domain controllers are part of the single default site, literally named Default-First-Site-Name. Replication within a site happens on a 5-minute change notification interval. In this type of setup, after an originating update occurs on a domain controller, it waits 5 minutes before sending a change notification message to its first replication partner, and then notifies each remaining partner in turn at 30-second intervals. This gives the domain controller time to batch many changes, instead of initiating replication for every change. After being notified, replication partners pull the changes. Note that replication is always pulled, not pushed. Replication between sites is a bigger topic, and will be discussed in a bit.
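The intra-site timing described above reduces to simple arithmetic. The sketch below assumes the figures from the text (a 300-second wait, then 30 seconds between partners); the function name and partner names are hypothetical.

```python
def notification_times(update_time, partners, initial_delay=300, stagger=30):
    """Map each replication partner to the second at which it is notified.

    update_time is the time of the originating update, in seconds.
    The first partner hears after initial_delay; each later partner
    hears stagger seconds after the previous one.
    """
    return {
        partner: update_time + initial_delay + index * stagger
        for index, partner in enumerate(partners)
    }


schedule = notification_times(0, ["DC2", "DC3", "DC4"])
# DC2 is notified at 300 s, DC3 at 330 s, DC4 at 360 s
```

Remember that the notification only tells a partner that changes exist; the partner then pulls them.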

A process that runs on all domain controllers called the Knowledge Consistency Checker (KCC) creates the connection objects between domain controllers automatically. The KCC runs every 15 minutes, and makes changes to the topology of connection objects if necessary (for example if a domain controller cannot be contacted). It is also possible to manually create connection objects between domain controllers, though this is not necessary. Connection objects are listed (and can be created) in Active Directory Sites and Services under the NTDS settings icon for a server. Note that the connection objects listed are those from whom a given domain controller will pull replicated changes. By default, the KCC creates a topology that ensures that a domain controller is never more than 3 hops away from another domain controller. In this case, hops refer to the number of domain controllers that need to be traversed to get a change to another domain controller.
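To get a feel for the 3-hop limit, the sketch below measures the worst-case hop count in a topology using a breadth-first search. Two assumptions are mine rather than the article's: the domain controllers are arranged in a simple two-way ring, and each connection crossed counts as one hop. The helper names are hypothetical.

```python
from collections import deque


def max_hops(neighbors):
    """Largest number of connections on any shortest path between two DCs."""
    worst = 0
    for start in neighbors:
        dist = {start: 0}
        queue = deque([start])
        while queue:                       # breadth-first search from start
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


def ring(n):
    """Two-way ring of n domain controllers, numbered 0..n-1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


# A ring of 7 stays within 3 hops; a ring of 8 would need extra connections.
print(max_hops(ring(7)), max_hops(ring(8)))
```

Under these assumptions, a plain ring stops satisfying the limit once it grows past seven members, so additional connection objects would be needed to shorten the longest paths.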

Because of the nature of Active Directory replication, it is possible that conflicts could occur if one domain controller accepted an originating update while another accepted an originating update on the same object (for example a user). First of all, Active Directory helps to reduce the possibility of this by replicating at the attribute level. As such, one domain controller could update a user's password and another his postal code and there would be no conflict, even though changes were made to the same object. In the event that there is a conflict, it is resolved by comparing three values (together referred to as a globally unique stamp), in the order listed below:

1. Version numbers - all attributes start with their version number set to 1. Every time an update occurs, this number increases by one, and a higher version number always wins. However, it is possible that two domain controllers could update the number to the same value, which means a conflict still exists.
2. Timestamps - In the event that version numbers are the same, timestamps are used, with the time of the update on the domain controllers being compared. The most recent update (say 2:10pm over 2:07pm) always wins.
3. Server GUID - In the highly unlikely event that a conflict still exists, the globally unique identifiers of the servers making the originating update are compared, and the one with the higher value wins.
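The three-step tie-break above is a lexicographic comparison, which Python tuples give us for free. This is a sketch only: `Stamp`, `Update`, and `resolve` are illustrative names, and the sample values are invented.

```python
import uuid
from typing import NamedTuple


class Stamp(NamedTuple):
    version: int            # compared first
    timestamp: float        # compared only if versions tie
    server_guid: uuid.UUID  # compared only if timestamps also tie


class Update(NamedTuple):
    value: str
    stamp: Stamp


def resolve(a, b):
    """Return the update that wins the conflict (higher stamp wins)."""
    return a if a.stamp > b.stamp else b


dc1 = uuid.UUID(int=1)   # made-up server GUIDs
dc2 = uuid.UUID(int=2)

older = Update("M5V 1A1", Stamp(version=2, timestamp=1407.0, server_guid=dc1))
newer = Update("M5V 2B2", Stamp(version=2, timestamp=1410.0, server_guid=dc2))
winner = resolve(older, newer)   # versions tie, so the later timestamp wins
```

Because the comparison stops at the first field that differs, the GUID only matters when both the version and the timestamp are identical.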

There are a couple of cases that may still cause problems. One example is orphaned objects. Let's say that you were to create a user in an OU called Sales, and then a minute later the Sales OU was deleted on another domain controller before replication had occurred. In this case, the parent object of the user would no longer exist, and the user would be moved to the LostAndFound container, as shown below:

In situations where two objects with the same distinguished name are created on different domain controllers, you'll always be able to tell. The object with the higher stamp (from the list above) retains the name, while the other is given a name made up of the object's relative name + the characters "CNF:" + the GUID of the object.

Another potential problem in a multi-master replication model is the possibility of replication loops occurring. Quite simply, A might let B and C know that changes exist. After receiving the changes, B might also try to send notification to C, which has already received them. To prevent this, Active Directory uses a technique called propagation dampening. This technique has every domain controller hold a table in memory called the up-to-dateness table, which stores update sequence numbers (USNs) for every domain controller. When a domain controller handles an originating update, it increments its USN, and this information is held on all other domain controllers. For example, if an originating change was made on DC1, and the USN on DC1 increased to 2667, all domain controllers would record this in their up-to-dateness table once they had received the change, and would not need it to be replicated again if other partners offered the same update.
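Here is a stripped-down sketch of propagation dampening. It is a simplification under stated assumptions: the class and method names are invented, each DC keeps a single counter for its own originating updates, and the up-to-dateness table is a plain dictionary keyed by the originating DC's name.

```python
class DomainController:
    def __init__(self, name):
        self.name = name
        self.usn = 0
        self.up_to_date = {}   # originating DC -> highest USN already applied
        self.changes = []      # (originating DC, USN, payload)

    def originate(self, payload):
        """Accept an originating update and bump this DC's USN."""
        self.usn += 1
        self.changes.append((self.name, self.usn, payload))
        self.up_to_date[self.name] = self.usn

    def pull_from(self, source):
        """Pull changes from a partner, skipping anything already seen."""
        applied = 0
        for origin, usn, payload in source.changes:
            if usn > self.up_to_date.get(origin, 0):
                self.changes.append((origin, usn, payload))
                self.up_to_date[origin] = usn
                applied += 1
        return applied


a, b, c = DomainController("A"), DomainController("B"), DomainController("C")
a.originate("new user")
b.pull_from(a)                 # B applies the change
c.pull_from(a)                 # C applies the change
duplicates = c.pull_from(b)    # 0: C already holds A's USN 1, so nothing is re-sent
```

The key detail is that the table is keyed by the DC that made the originating update, not by the partner offering it, which is what lets C recognise the change when B offers it.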

Replication between sites works differently than replication within sites in an Active Directory environment. As mentioned earlier, a site is a collection of high-speed subnets. In order to define sites and subnets, the proper objects must be created and associated. Usually you would start by creating site objects, and then associating subnet objects with sites. As the screen below shows, I have created 3 new sites, and associated 3 subnets with the site called Toronto.

In order to control replication between sites, we need to link the sites together using site links. A site link connects two or more sites for the purpose of creating a pathway for replication. Once a site link has been created, properties can be set on that link, including the cost, schedule, and interval. The cost is a number between 1 and 32767 that helps determine the links that will be crossed in the event that multiple paths exist. The lower the cost number, the higher the priority of the path. Usually you map costs to the speed of links - maybe 50 for a T1 link, 500 for a 56K link, and so forth. The schedule defines when replication is allowed to happen. By default this is always, but it could be configured to only allow replication at night, for example. The interval controls how often replication can occur between sites. By default this is set to every 180 minutes, but it can be set to lower or higher values if you choose. Note that inter-site replication does not use change notification. Instead it uses the schedule and interval values to figure out when replication occurs. This is very different than in NT 4, where change notification was used throughout the environment.

If you are thinking that site link schedules might be problematic, you might be right. For example, imagine that you create a new user account and the originating update takes place on a domain controller in Toronto at 9am. If the schedule on the site link between Toronto and Vancouver only allowed replication between 6pm and 8am, the user account would not appear on domain controllers in Vancouver until after 6pm that evening. Note that this problem is easily circumvented - when creating the account in AD Users and Computers, simply connect to a different domain controller (say one in Vancouver) and create the account. This will make the originating update take place in Vancouver, and the user (presumably in Vancouver) would be able to log on immediately.
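The Toronto and Vancouver example can be checked with a few lines of date arithmetic. This sketch only models the schedule window, on whole hours; it ignores the replication interval, and the function names are invented for illustration.

```python
from datetime import datetime, timedelta


def in_window(t, open_hour, close_hour):
    """True if time t falls inside a daily window that may wrap past midnight."""
    if open_hour == close_hour:
        return True                        # treat as an always-open schedule
    if open_hour < close_hour:
        return open_hour <= t.hour < close_hour
    return t.hour >= open_hour or t.hour < close_hour


def earliest_replication(update_time, open_hour, close_hour):
    """Earliest moment the schedule permits the update to cross the site link."""
    if in_window(update_time, open_hour, close_hour):
        return update_time
    t = update_time.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    while not in_window(t, open_hour, close_hour):
        t += timedelta(hours=1)
    return t


created = datetime(2001, 6, 29, 9, 0)                   # account created at 9am
crosses = earliest_replication(created, 18, 8)          # link open 6pm to 8am
# crosses is 6pm the same day, nine hours after the account was created
```

With the default 180-minute interval on top of this, the actual arrival could be later still, which is the argument for creating the account on a domain controller in the user's own site.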

Connection objects between domain controllers differ within and between sites. Within a site, domain controllers will have many connection objects with other domain controllers. However, replication between sites happens via connection objects between domain controllers in each site that are designated as bridgeheads. Bridgehead servers are chosen automatically, but you can set a list of preferred bridgehead servers, as shown below. The process that chooses bridgehead servers is the Intersite Topology Generator (ISTG), which runs automatically and will designate a new bridgehead should the current one not be available.

Another important consideration when setting up site links is the protocol that the site link will use. Active Directory supports site links via RPC (referred to as IP in the interface) as well as SMTP. Within a site, domain controllers use RPC. You should note that you would most often use RPC, since SMTP does not support replicating the domain partition between domain controllers in the same domain (this is mainly because the Sysvol folder is replicated using FRS, which uses RPC only). SMTP does however support replication of the Schema, Configuration, and Global Catalog partitions. SMTP is useful for distributed environments with unreliable WAN links. 

By default, all site links that you create are bridged (transitive). What that means is that in calculating the best path for replication, all site links are considered. 

For example, in the diagram above, replication between sites A and D would occur over the least cost path, which would be over the bridge automatically created - ABD, which has a cost of 20. Note that the alternative AD has a cost of 200, and bridge ACD has a cost of 110. In looking at all available site links, AB and BD were bridged to form the lowest cost path available. Site links are created in AD Sites and Services. As a best practice, you might consider naming site links after the sites that they connect.
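Finding the least-cost route over bridged site links is a shortest-path problem. The sketch below uses Dijkstra's algorithm as a stand-in and is not a description of how Active Directory computes it internally. The individual link costs are my assumptions, chosen so the totals match the figures quoted above (ABD = 20, ACD = 110, AD = 200).

```python
import heapq

# Assumed per-link costs; only the path totals come from the article.
SITE_LINKS = {
    ("A", "B"): 10,
    ("B", "D"): 10,
    ("A", "C"): 10,
    ("C", "D"): 100,
    ("A", "D"): 200,
}


def cheapest_path(links, start, goal):
    """Return (total_cost, [sites]) for the lowest-cost route, or None."""
    graph = {}
    for (a, b), cost in links.items():      # site links work in both directions
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, site, path = heapq.heappop(queue)
        if site == goal:
            return cost, path
        if site in visited:
            continue
        visited.add(site)
        for neighbor, link_cost in graph.get(site, []):
            if neighbor not in visited:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None


best = cheapest_path(SITE_LINKS, "A", "D")   # cost 20 via A, B, D
```

Raising the cost of the AB link to, say, 150 would make ACD the preferred route, which is how cost values let you steer replication away from slow or expensive links.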

In some situations, such as when your network is not fully routed, you may need to manually create site link bridges in order for replication to have a path to follow. If this were the case, you could turn off the automatic bridging of all site links and define the bridges that you wish to exist in AD Sites and Services. Note that site link bridges do not need to be created in a fully routed network, since all site links are bridged by default, allowing the lowest cost replication path to be calculated automatically. You should also note that Active Directory does not control the Layer 3 routing of data - site links allow replication between sites, and control which sites are connected for the purpose of replication. The actual path that the data will follow over the physical network depends on how routing has been configured in your environment.

You should also be aware of the replication troubleshooting tools that exist. The two main tools are Replication Monitor (Replmon.exe) and Repadmin.exe. Replication Monitor is installed along with other advanced tools from the Support\Tools directory on the Advanced Server CD, and provides a great deal of information about the replication environment including the ability to view USNs, view replication partners, view replication status on a server, trigger replication between partners, and so forth. Repadmin is a useful command-line tool, but provides information about only a single domain controller at a time. 

That brings us to the end of this article. Next week we'll finish off the Active Directory portion of the series with a look at managing the Active Directory database and operations masters, as well as a look at Remote Installation Services. As always, feel free to contact me with your questions and comments, but please be sure to post all technical messages to my message board. Until next week, best of luck with your studies.

Dan