Create an MMC Snap-In for Searching PDF Files
I recently called Microsoft Customer Service and Support (CSS) to help resolve
what I thought was an undocumented error. As it turns out, the error was documented—I
just couldn't find a reference to it in the 80 PDF manuals that came with this
particular Microsoft product. Luckily, the support engineer I talked with was
familiar with the error and knew the exact manual that I had to reference.
After that support incident, I recalled that I had used the Adobe PDF IFilter
plug-in for the Microsoft Indexing Service several years ago to search through
PDF files. Back then, I had only a dozen Adobe PDF files in a directory of hundreds
of .doc, .txt, .html and .mht files. However, I had to search every file for
specific text strings, and IFilter served this purpose well.
With the propeller hat spinning full tilt, I decided to again use IFilter
with the Indexing Service for the purpose of searching Adobe PDF files. But
this time, I created a customized Microsoft Management Console (MMC) snap-in
for the UI. Although you can use Adobe Acrobat Reader to search through PDF
files in a specified directory, it takes an extremely long time if that directory
is large (e.g., 65MB). With the MMC snap-in, the search is almost instantaneous.
Here's how you can create this snap-in on your local computer:
Go to http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611
and download IFilter 6.0. This version supports most 32-bit Windows
desktop and server versions from Windows 2000 through Windows Server 2003.
(See the IFilter 6.0 download page for details.) If you already have IFilter
5.0 installed, uninstall it first. I found that version 6.0 automatically
corrects a registry entry and a DLL registration that had to be manually corrected
in version 5.0.
Following the instructions provided on the Adobe Web site, install IFilter
6.0. I chose to install it to C:\Program Files\Adobe\PDF IFilter. After you
install IFilter, restart your machine.
Select Run under the Start menu. Type mmc and click OK.
From the File menu, select Add/Remove Snap-in and click Add.
In the Add Standalone Snap-in dialog box, highlight the Indexing Service
snap-in and click Add.
In the Connect to Computer dialog box, select Local computer and
click Finish.
Click Close in the Add Standalone Snap-in dialog box, then click OK in
the Add/Remove Snap-in dialog box.
In the Console Root window, right-click Indexing Service on Local Machine,
select the New option, and click Catalog. In the Add Catalog dialog box, provide
a name and location for the catalog you're creating. If you want to put the
catalog in a new directory, be sure to create this directory beforehand in
Windows Explorer. For this example, let's create the My Documents\Index Catalog
Files\My PDFs directory for the catalog, which we'll name My PDFs. Click OK
in the Add Catalog dialog box. When the message "Catalog will remain off-line
until Indexing Service is restarted" appears, click OK again to create
the catalog. In this case, the Indexing Service creates the My Documents\Index
Catalog Files\My PDFs\catalog.wci directory.
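If you'd rather script the folder creation than do it in Windows Explorer, a minimal Python sketch might look like the following. It assumes that My Documents maps to the Documents folder under your user profile, which may not be true on every machine; the function name ensure_catalog_dir is hypothetical and used only for illustration.

```python
from pathlib import Path


def ensure_catalog_dir(base: Path, *parts: str) -> Path:
    """Create base/parts... if it is missing and return the resulting path."""
    target = base.joinpath(*parts)
    # parents=True creates intermediate folders; exist_ok=True makes reruns harmless.
    target.mkdir(parents=True, exist_ok=True)
    return target


if __name__ == "__main__":
    # Assumption: "My Documents" lives at <user profile>/Documents on this machine.
    catalog_dir = ensure_catalog_dir(
        Path.home() / "Documents", "Index Catalog Files", "My PDFs"
    )
    print(f"Catalog location ready: {catalog_dir}")
```

The script only prepares the folder. You still create the catalog itself in the Add Catalog dialog box as described above.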
You need to stop the Indexing Service before you can restart it, so in
the Console Root window, right-click Indexing Service on Local Machine and
select Stop. Then, right-click Indexing Service on Local Machine and select
Start. The unpopulated statistics for your new catalog will appear in the
right pane. Don't worry if only zeros appear. This step simply builds the
indexing framework for the catalog. In step 11 (adding a directory to the
catalog), you'll provide the path to the directory containing the PDF files
that will populate the catalog.
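The stop-then-start sequence can also be run from a script. The sketch below is one possible approach, not part of the original procedure: it assumes the Indexing Service is registered under the short service name cisvc and that the script runs from an administrator prompt. The function names are hypothetical. By default it only prints the commands; pass --run to execute them.

```python
import subprocess
import sys

# Assumption: the Indexing Service is registered under the short name "cisvc".
SERVICE_NAME = "cisvc"


def restart_commands(service: str = SERVICE_NAME) -> list[list[str]]:
    """Return the stop and start commands in the order they must run."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service: str = SERVICE_NAME, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in restart_commands(service):
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if the command fails,
            # for example when the service is already stopped.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Pass --run on Windows to actually restart the service.
    restart_service(dry_run="--run" not in sys.argv)
```

Keeping the dry run as the default makes it easy to confirm the commands before touching a running service.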
When you use IFilter with the Indexing Service, the Indexing Service indexes
not only PDF files but also all the files it natively supports, such as .doc,
.txt, and .html files. Thus, I recommend that you use Windows Explorer to
remove any nonessential subdirectories and files from the directory that contains
the PDF files you want to be able to search. In my first test of the catalog,
the directory of PDF files I wanted to search had a subdirectory that contained
50MB of streaming video files. Those streaming video files were indexed, which
added an unnecessary 65MB to the index catalog.
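Before you add a directory to the catalog, it can help to see which file types take up the most space in it. The following Python sketch totals file sizes by extension; the function names and the PDF Manuals path in the example are illustrative assumptions, and the script only reports sizes without deleting or moving anything.

```python
from collections import defaultdict
from pathlib import Path


def size_by_extension(root: Path) -> dict[str, int]:
    """Total bytes per lowercase file extension under root, recursively."""
    totals: dict[str, int] = defaultdict(int)
    for path in root.rglob("*"):
        if path.is_file():
            # Files without an extension are grouped under "(none)".
            totals[path.suffix.lower() or "(none)"] += path.stat().st_size
    return dict(totals)


def report(root: Path) -> None:
    """Print extensions from largest to smallest total size."""
    totals = size_by_extension(root)
    for ext, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{ext:>8}  {size / 1_000_000:8.1f} MB")


if __name__ == "__main__":
    # Hypothetical path; point this at the folder you plan to add to the catalog.
    report(Path("PDF Manuals"))
```

Large totals for extensions you never search, such as video files, are good candidates to move out of the directory before you index it.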
In the Console Root window, expand the directory that contains the My PDFs
catalog. Right-click Directories, select New, then choose Directory. To fill
in the Path field, browse to the directory that contains the PDF files you
want to be able to search. For this example, let's say these files are in
a directory named PDF Manuals. You can also enter the directory's Universal
Naming Convention (UNC) name in the Alias (UNC) field if you want. Click OK.
You can add as many directories as you want in the catalog by simply repeating
this step.
Right-click the path under the Directory header, then select All Tasks
followed by Rescan (Full). At this point, if you click Indexing Service on
Local Machine, you'll see the My PDFs entry starting to populate. This task
will take about five minutes. Note that the more you move your mouse around,
the longer it'll take to populate the catalog. Mouse movement causes the Indexing
Service to pause because it perceives that movement as user activity on the
PC.
If you want to add a desktop icon for your new catalog, go to the Console
Root window and expand the My PDFs catalog. Right-click Query the Catalog,
then select the New Window option. The Query the Catalog dialog box should
appear. Close the Console Root window behind the Query the Catalog dialog
box because you don't need that window in your finished product. On the toolbar,
click the Show/Hide console tree button so that all you see is the
Indexing Service Query Form. Maximize the Query the Catalog dialog box. On
the File menu, select Save As. Name the file My PDFs.msc and save it in the
folder that contains the index framework directory (My Documents\Index Catalog
Files). I don't recommend that you save it directly in the index framework
directory (My Documents\Index Catalog Files\My PDFs) because if you perform
an Empty Catalog operation, that operation deletes everything in that directory,
including the Management Saved Console (.msc) file you just created. Close
all the MMC dialog boxes. When you're asked whether you want to save the console
settings, click No. You just saved the .msc file, and you don't want to overwrite
that file.
Use Windows Explorer to create a shortcut to the My PDFs.msc file.
Test your new MMC by clicking the shortcut. A window that's titled "My PDFs
- Query the Catalog" should appear that contains the Indexing Service Query
Form.
Great article. No problems indexing anything on the local drive. For some reason, I'm unable to index any network drives, whether by drive letter or UNC path. I cannot add any directories other than those on C. I get an "Invalid directory name" error for anything that doesn't begin with C. I tried adding the directory through the registry, and it then shows up but isn't indexed. I have full rights to the network folders, so it's not a permissions issue.
jsramrod March 14, 2007
Hi jsramrod,
I don't know about the network paths, but as far as local drives go, make sure that you have not disabled the "Allow Indexing Service to index this disk for fast file searching" check box.