Discovery Data Manager Problems: Corrupt DDRs





by Dana Daugherty

One corrupt discovery data record (DDR) can
completely lock up a primary site server. In this article I'll
help you troubleshoot typical DDR processing problems on a
primary site server. I'll also document a not-so-typical
problem that I had in my own SMS implementation.

Discovery Data Overview

Discovery data records are small files
(about 2K) containing basic client system information that is
subject to change. They have .ddr extensions. A DDR should
contain the network address, username, GUID, machine name and a
few other elements. The file is created on the client
according to the site's discovery configuration and is located in
\ms\sms\core\data\smsdisc.ddr. Methods of client discovery
include NT Logon Discovery, Heartbeat Discovery, and Network
Discovery. SMS uses DDRs to "keep tabs" on the clients.
They allow a few data elements to be refreshed on a regular
basis so that remote tools and advertised programs
function in a "reliable" manner. To view discovery data,
open any collection and double-click any client
in that collection. What you'll see next is the discovery data
for that client that has been entered into the SMS
database.

The DDR is most likely sent by the client to
the logon point\smslogon\ddr.box, then forwarded on to
CAP\ddr.box, and then forwarded on to
site server\sms\inboxes\ddm.box. Finally it is placed into
the SMS database. As you have probably already experienced,
many problems can happen along the way. This article is going
to focus on the issue of DDRs backing up in the site server's
ddm.box. For more details about the information I've already
covered, please check out the SMS Admin Guide, the SMS Resource Kit
(in BORK 4.5) and/or the SMS Admin's Companion.

Don’t confuse the discovery data process
with hardware inventory. The two are completely different
mechanisms used for different purposes.

Troubleshooting Typical DDR Processing
Issues

Discovery Data Manager is the server-side
thread of smsexec.exe responsible for processing DDRs. If
discovery data is not being updated according to the
configuration that you have chosen within the site's hierarchy
properties, the following checks should help you find out
why.

  • From a collection, open a client’s
    properties and check the date on one of the Agent Time
    entries. It should match your discovery method
    frequency.  Just a tip: You should have some discovery
    method configured to occur at least once a day.

  • You can browse to the primary site
    server's ddm.box inbox and view the backed-up files for
    yourself. Turn details on in Windows Explorer and compare the
    dates. If there are no backed-up DDRs, look at flow charts of
    the discovery data process to find other possible
    sources.

  • Look for processing errors appearing in
    the Site Status tree under the Discovery Data Manager
    thread.

  • View the SMS site server logs. Look in
    ddm.log for lines that indicate Discovery Data Manager is
    attempting to process the same DDR over and over.
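
The two file-based checks above (comparing dates in ddm.box and looking for a DDR that shows up again and again in ddm.log) can be scripted. What follows is only a minimal sketch, not an SMS tool. The function names are hypothetical, and because the exact layout of ddm.log lines is not assumed here, the log check simply counts how often each name ending in .ddr appears.

```python
import re
import time
from collections import Counter
from pathlib import Path


def list_backlog(inbox, older_than_hours=24.0, now=None):
    """Return (path, age_in_hours) for .ddr files older than the threshold, oldest first."""
    now = time.time() if now is None else now
    stale = []
    for path in Path(inbox).iterdir():
        if path.is_file() and path.suffix.lower() == ".ddr":
            age_hours = (now - path.stat().st_mtime) / 3600
            if age_hours >= older_than_hours:
                stale.append((path, age_hours))
    return sorted(stale, key=lambda item: item[1], reverse=True)


# Assumption: a DDR is identified in the log by its file name, e.g. HQCL4GPB.ddr.
DDR_NAME = re.compile(r"[\w-]+\.ddr\b", re.IGNORECASE)


def repeated_ddrs(log_path, min_count=3):
    """Count how often each .ddr file name appears in a log; return the frequent ones."""
    counts = Counter()
    with open(log_path, errors="replace") as log:
        for line in log:
            counts.update(name.lower() for name in DDR_NAME.findall(line))
    return [(name, n) for name, n in counts.most_common() if n >= min_count]
```

Point list_backlog at the site server's ddm.box and repeated_ddrs at ddm.log. A file name that tops both lists is a good candidate for the corrupt record.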

The solution is really pretty simple. Stop
the Discovery Data Manager thread within the SMS Service
Manager (or just stop the SMS component in NT Service Manager)
and delete the file that is not being processed. For
troubleshooting purposes, it is a good idea to copy the
file first for further research. Then keep an eye on ddm.log and
ddm.box. Within an hour or so it should look as if you opened
the dam and let all those backed-up DDRs out.
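
The copy-then-delete step can be sketched as a small helper. This is an illustration only: quarantine_ddr is a hypothetical name, and the script does not stop the Discovery Data Manager thread for you, so do that first as described above.

```python
import shutil
from pathlib import Path


def quarantine_ddr(ddr_path, quarantine_dir):
    """Copy a DDR to a holding folder, then delete the original from the inbox."""
    source = Path(ddr_path)
    target_dir = Path(quarantine_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)  # copy2 keeps the timestamps for later research
    if target.stat().st_size != source.stat().st_size:
        raise OSError(f"copy of {source.name} is incomplete; original left in place")
    source.unlink()  # remove the stuck record only after the copy is verified
    return target
```

The original is deleted only after the copy has been verified, so a failed copy never costs you the evidence.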

Symptoms of My Atypical Problem

When a corrupt DDR entered ddm.box, the
following symptoms appeared:

  • I browsed to the site server's
    ddm.box inbox and saw several days' worth of backed-up
    DDRs.

  • Ddm.log indicated that Discovery Data
    Manager was attempting to process the same DDR over and
    over.

The first two symptoms are very common for corrupt
DDRs. The next two are not.

  • Processing errors appeared multiple times
    in the Site Status tree under the Discovery Data Manager
    thread as DDRs backed up. They began the first time
    Discovery Data Manager attempted to process the corrupt
    record.

    Message ID 669 Component raised an
    exception but failed to handle it.

  • Here is another atypical symptom that I
    saw: I had about 250 SMS crash dumps. These were located in
    site server\SMS\Logs\CrashDumps. Each crash.log file
    had the following message:

Time = 06/15/2001 13:42:24.925
Service name = SMS_EXECUTIVE
Thread name = SMS_DISCOVERY_DATA_MANAGER
Executable = D:\SMS\bin\i386\smsexec.exe
Process ID = 394 (0x18a)
Thread ID = 655 (0x28f)
Instruction address = 5f4040fd
Exception = c0000005 (EXCEPTION_ACCESS_VIOLATION)
Description = “The thread tried to read from the virtual address C35EC67F for which it does not have the appropriate access.”
Raised inside CService mutex = No
CService mutex description = “”
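
With a couple of hundred crash dumps, it helps to confirm that they all point at the same thread and exception. Here is a rough sketch that assumes each crash.log holds one "name = value" pair per line, as in the entry above. The function names are hypothetical, and a wrapped value line that happens to contain "=" would be misread as a new field.

```python
from collections import Counter


def parse_crash_log(text):
    """Parse 'name = value' lines into a dict; lines without '=' extend the previous value."""
    fields = {}
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            key, _, value = line.partition("=")
            last_key = key.strip()
            fields[last_key] = value.strip()
        elif last_key is not None:
            # Continuation of a wrapped value such as a long Description.
            fields[last_key] = (fields[last_key] + " " + line).strip()
    return fields


def summarize(crash_logs):
    """Count crashes per (thread name, exception) pair across parsed logs."""
    return Counter(
        (log.get("Thread name", "?"), log.get("Exception", "?")) for log in crash_logs
    )
```

If summarize returns a single dominant pair, as it would have in my case, the dumps are almost certainly the result of one repeating failure.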

The only TechNet article that remotely
relates to this issue is Q223755, "SMS Executive Crashes
when enumerating a non-Microsoft Server."

Related Problems

Because of the problems covered above, I had
these additional issues that served to complicate the
troubleshooting process:

  • The drive containing the SMS install was
    completely full. Normally it has 3.5 GB free. This was due
    to the tremendous space that SMS crash dumps take up. Each
    crash dump makes a copy of all SMS logs, whether logging is
    turned on or not. It's a pretty nice feature, as long as you
    don't have 250 of them.

  • After the drive was full, the following appeared in Site Status
    under Discovery Data Manager:

Message ID 2636 SMS Discovery Data
Manager Failed to update the following discovery data record
server\sms\inboxes\ddm.box\HQCL4GPB.ddr because it cannot
update the data source.

I believe this occurred due to the lack of
space on the SMS server install drive.

Despite the non-standard symptoms, this was a
standard DDR corruption problem. The solution, again, was
quite simple. I found and deleted the corrupt DDR. I also
deleted the 3.5 GB worth of crash dumps. After that, no more
problems.
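
A quick way to catch this kind of disk pressure earlier is to total the size of the crash dump folder and compare it with the free space on the drive. This is a minimal sketch using only the Python standard library; the function names are hypothetical and nothing here is part of SMS.

```python
import os
import shutil


def tree_size_bytes(root):
    """Sum the sizes of all files under root, including subfolders."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def free_bytes(path):
    """Free space, in bytes, on the drive that holds path."""
    return shutil.disk_usage(path).free
```

Running tree_size_bytes against the CrashDumps folder and free_bytes against the SMS install drive on a schedule would have flagged the problem well before the drive filled up.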

Normally this type of problem wouldn't go
unnoticed in my SMS implementation, but the corrupt DDR was
received on a Friday. I set my site status messages to clear
on Sunday evenings, and I didn't pay attention to the logs on
Monday. On Tuesday, as I prepared for an important software
distribution to 1500 machines, I noticed the
problem.


This article was originally published on Jun 25, 2001
