by Dana Daugherty
One corrupt discovery data record (DDR) can
completely lock up a primary site server. In this article I’ll
attempt to help you troubleshoot typical primary site server
DDR processing problems. I’m also going to document a
not-so-typical problem that I had in my SMS
implementation.
Discovery Data Overview
Discovery data records are small files
(about 2K) containing basic client system information that is
subject to change. They have .ddr extensions. A DDR should
contain the network address, username, GUID, machine name and a
few other elements. The file is processed on the client,
based on the site’s discovery configuration, and is located in
ms\sms\core\data\smsdisc.ddr. Methods of client discovery
include NT Logon Discovery, Heartbeat Discovery, and Network
Discovery. DDRs are used by SMS to “keep tabs” on the clients.
They allow a few data elements to be refreshed on a regular
basis in order for remote tools and advertised programs to
function in a “reliable” manner. To view discovery data
information, open any collection and double-click on any client
from that collection. What you’ll see next is the discovery data
for the client that has been entered into the SMS
database.
The DDR is most likely sent by the client to
the logon point\smslogon\ddr.box, then forwarded on to
CAP\ddr.box, and then forwarded on to site
server\sms\inboxes\ddm.box. Then it is finally placed into
the SMS database. As you have probably already experienced,
many problems can happen along the way. This article is going
to focus on the issue of DDRs backing up in the site server’s
ddm.box. For more details about the information I’ve already
covered, please check out the SMS Admin Guide, SMS Resource Kit
(in BORK 4.5) and/or SMS Admin’s Companion.
Don’t confuse the discovery data process
with hardware inventory. The two are completely different
mechanisms used for different purposes.
Troubleshooting Typical DDR Processing Issues
Discovery Data Manager is the server-side
thread of smsexec.exe responsible for processing DDRs. If
discovery data is not being updated according to the
configuration that you have chosen within the site’s hierarchy
properties, the following checks should help you find out
why.
- From a collection, open a client’s
  properties and check the date on one of the Agent Time
  entries. It should match your discovery method
  frequency. Just a tip: You should have some discovery
  method configured to occur at least once a day.
- You can browse to the primary site
  server’s ddm.box inbox and view the backed-up files for
  yourself. Turn details on in Windows Explorer and compare the
  dates. If there are no backed-up DDRs, look at flow charts of
  the discovery data process to find other possible
  sources of the problem.
- Look for processing errors appearing in
  the Site Status tree under the Discovery Data Manager
  thread.
- View the SMS site server logs. Look in
  ddm.log for lines that indicate Discovery Data Manager is
  attempting to process the same DDR over and over.
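The last two checks can be partly automated. What follows is only a rough sketch in Python, not an SMS tool: it assumes that ddm.log mentions DDR file names somewhere in its lines (the exact log wording is not reproduced in this article, so the script matches nothing but names ending in .ddr), and the function names `repeated_ddrs` and `oldest_ddrs` are made up for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: log lines contain DDR file names such as HQCL4GPB.ddr.
# Only the file name is matched; the rest of the log line is ignored.
DDR_NAME = re.compile(r"[\w\-]+\.ddr\b", re.IGNORECASE)


def repeated_ddrs(log_text, threshold=5):
    """Return {ddr_name: count} for names seen at least `threshold` times."""
    counts = Counter(m.group(0).lower() for m in DDR_NAME.finditer(log_text))
    return {name: n for name, n in counts.items() if n >= threshold}


def oldest_ddrs(inbox, limit=10):
    """Return up to `limit` (name, mtime) pairs for .ddr files, oldest first."""
    files = [
        p for p in Path(inbox).iterdir()
        if p.is_file() and p.suffix.lower() == ".ddr"
    ]
    files.sort(key=lambda p: p.stat().st_mtime)
    return [(p.name, p.stat().st_mtime) for p in files[:limit]]
```

A DDR name that keeps reappearing in the log, together with a pile of files in ddm.box that are days old, points at the record that is blocking the queue. The threshold of 5 is an arbitrary starting value.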
The solution is really pretty simple. Stop
the Discovery Data Manager thread within the SMS Service
Manager (or just stop the SMS component in NT Service Manager)
and delete the file that is not being processed. For
troubleshooting purposes, it might be a good idea to copy the
file first for further research. Keep an eye on ddm.log and
ddm.box. Within an hour or so it should look like you opened
the dam and let all those backed up DDRs out.
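If you would rather script the copy-then-delete step, a minimal sketch might look like the one below. It is an illustration only: `quarantine_ddr` and the holding folder are hypothetical names, and the script does not stop the Discovery Data Manager thread for you, so do that first as described above.

```python
import shutil
from pathlib import Path


def quarantine_ddr(ddr_path, holding_dir):
    """Copy a suspect DDR to holding_dir, then delete the original."""
    src = Path(ddr_path)
    dest_dir = Path(holding_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name

    # Keep a copy (with timestamps) for later research.
    shutil.copy2(src, dest)

    # Only delete the original once the copy is the same size.
    if dest.stat().st_size != src.stat().st_size:
        raise OSError(f"copy of {src.name} is incomplete; original kept")

    src.unlink()  # remove the file that was blocking the inbox
    return dest
```

Checking the copy before deleting means a failed copy (for example, on a full drive) leaves the original in place.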
Symptoms of My Atypical Problem
When a corrupt DDR entered ddm.box, the
following symptoms appeared:
- I browsed to the site server’s
  ddm.box inbox and found several days’ worth of backed-up
  DDRs.
- Ddm.log indicated that Discovery Data
  Manager was attempting to process the same DDR over and
  over.
Those first two symptoms are very common with corrupt
DDRs; the next two are not.
- Processing errors appeared multiple times
  in the Site Status tree under the Discovery Data Manager
  thread as DDRs backed up. They began the first time
  Discovery Data Manager attempted to process the corrupt
  record:
  Message ID 669 Component raised an
  exception but failed to handle it.
- Here is another atypical symptom that I
  saw: I had about 250 SMS crash dumps. These were located in
  site server\SMS\Logs\CrashDumps. Each crash.log file
  had the following message:
Time = 06/15/2001 13:42:24.925
Service name = SMS_EXECUTIVE
Thread name = SMS_DISCOVERY_DATA_MANAGER
Executable = D:\SMS\bin\i386\smsexec.exe
Process ID = 394 (0x18a)
Thread ID = 655 (0x28f)
Instruction address = 5f4040fd
Exception = c0000005 (EXCEPTION_ACCESS_VIOLATION)
Description = “The thread tried to read from the virtual address C35EC67F for which it does not have the appropriate access.”
Raised inside CService mutex = No
CService mutex description = “”
The only TechNet article that remotely
relates to this issue is Q223755
SMS Executive Crashes when enumerating a non-Microsoft
Server.
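With 250 crash dumps, reading every crash.log by hand is tedious. Since the entries use a plain "name = value" layout, a short script can pull the fields out and tally crashes by thread. This is a sketch under assumptions: it expects one field per line (wrapped lines are appended to the previous field), and `parse_crash_log` and `crashes_by_thread` are names invented for this example.

```python
from collections import Counter


def parse_crash_log(text):
    """Parse 'name = value' lines into a dict.

    A line without '=' is treated as a continuation of the previous value.
    """
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip():
            last_key = name.strip()
            fields[last_key] = value.strip()
        elif last_key is not None:
            fields[last_key] = (fields[last_key] + " " + line).strip()
    return fields


def crashes_by_thread(log_texts):
    """Count crash logs per 'Thread name' field."""
    return Counter(
        parse_crash_log(t).get("Thread name", "unknown") for t in log_texts
    )
```

If nearly every dump names the same thread, as happened here with SMS_DISCOVERY_DATA_MANAGER, that tells you which inbox to inspect first.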
Related Problems
Because of the problems covered above, I had
these additional issues that served to complicate the
troubleshooting process:
- The drive containing the SMS install was
  completely full. Normally it has 3.5 GB free. This was due
  to the tremendous space that SMS crash dumps take up. Each
  crash dump makes a copy of all SMS logs, whether logging is
  turned on or not. It’s a pretty nice feature, as long as you
  don’t have 250 of them.
- After the drive was full, the following appeared in Site Status
  under Discovery Data Manager:
  Message ID 2636 SMS Discovery Data
  Manager Failed to update the following discovery data record
  server\sms\inboxes\ddm.box\HQCL4GPB.ddr because it cannot
  update the data source.
  I believe this occurred due to the lack of
  space on the SMS server install drive.
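A periodic check on the crash dump folder would have caught this before the drive filled up. The sketch below is one possible approach, not an SMS feature: it assumes each crash dump lives in its own subfolder under the CrashDumps folder, and the 1 GB warning level and the name `crash_dump_report` are arbitrary choices for illustration.

```python
import os
import shutil


def directory_size(root):
    """Total size in bytes of all files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def crash_dump_report(dump_root, min_free_bytes=1024 ** 3):
    """Summarize crash dump usage and free space on the same drive."""
    # Assumption: one subfolder per crash dump.
    dumps = [e for e in os.scandir(dump_root) if e.is_dir()]
    free = shutil.disk_usage(dump_root).free
    return {
        "dump_count": len(dumps),
        "dump_bytes": directory_size(dump_root),
        "free_bytes": free,
        "low_space": free < min_free_bytes,
    }
```

Run on a schedule, a report like this flags both a growing dump count and shrinking free space, either of which is worth a look at the site server logs.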
Despite the non-standard symptoms, this was a
standard DDR corruption problem. The solution, again, was
quite simple. I found and deleted the corrupt DDR. I also
deleted the 3.5 GB worth of crash dumps. After that, there
were no more problems.
Normally this type of problem wouldn’t go
unnoticed in my SMS implementation, but the timing worked
against me. The corrupt DDR was
received on a Friday. I set my site status messages to clear
on Sunday evenings. I didn’t pay attention to the logs on
Monday. On Tuesday, as I prepared for an important software
distribution to 1500 machines, I noticed this
problem.
This article was originally published on Jun 25, 2001