Thursday, 20 January 2005
Yesterday I mentioned a commercial announcement by a company that had released an IFilter for CHM files. A reader, Sean, pointed out that there was another IFilter available for that file format - for free. It had been a month or more since I last Googled for the term, so I updated the post (thanks, Sean) and started looking at the most recent information on the 'net in the IFilter area. If you know of more/other resources, please comment below and let me know.

Full-text indexing of files on a computer system allows the user to search for information on a computer, and for that search to extend into the files themselves. Because many computer files that contain text are actually proprietary in format, it can be difficult to "read" the content of those files. File formats need to be optimized for applications, but we need a way to get the text content out, so we can search across multiple file types to find information without having to root through files one by one.

Enter the IFilter. On Windows, IFilters are special little programs that contain the information needed to pull the text content from these proprietary files. Once you get the data out, you can work with it in a number of ways.

Now, remember that I'm not a developer, I'm a business-process guy, so keep that in mind when reading this explanation of IFilters and how they are used.

Q: What's an IFilter?

IFilters are special DLLs used by Windows applications to index the content of specific types of files. From the Microsoft Platform SDK Indexing Service document:

"The IFilter interface scans documents for text and properties (also called attributes). It extracts chunks of text from these documents, filtering out embedded formatting and retaining information about the position of the text ... IFilter provides the foundation for building higher-level applications such as document indexers and application-independent viewers."
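To make the "chunks of text" idea concrete, here is a rough sketch in Python. It is not the real IFilter interface (that is a COM interface implemented in a DLL); it is a toy stand-in that assumes a made-up document format with simple angle-bracket tags, and the names Chunk and filter_chunks are invented for illustration. It only shows the general shape: formatting is dropped, and each piece of text keeps a number recording where it sat in the document.

```python
import re
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Chunk:
    chunk_id: int  # position of this chunk within the document (1-based)
    text: str      # plain text with the formatting tags removed


# Matches toy formatting tags such as <b> or </i> (an assumed, made-up format).
TAG = re.compile(r"<[^>]+>")


def filter_chunks(document: str) -> Iterator[Chunk]:
    """Yield one plain-text chunk per non-empty paragraph of a toy tagged document."""
    chunk_id = 0
    for paragraph in document.split("\n\n"):
        # Strip the tags, then collapse any leftover runs of whitespace.
        text = " ".join(TAG.sub("", paragraph).split())
        if text:
            chunk_id += 1
            yield Chunk(chunk_id, text)


sample = "<b>Quarterly</b> report\n\n<i>Sales</i> rose in <u>March</u>."
chunks = list(filter_chunks(sample))
for chunk in chunks:
    print(chunk.chunk_id, chunk.text)
```

A real filter does the same job for a binary or proprietary format, which is why each file type needs its own IFilter.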

Q: Where and when are IFilters used?

There's been some new activity in this area, likely as the result of the release of the MSN Desktop Search tool (which uses IFilters to index files) and the fact that SQL Server 2005 will be coming soon. IFilters can be leveraged by any application that calls them, and they are typically used to generate an index that users can then search through to find information. That's how the Indexing Service on your Windows desktop machine works, for example, and these other Microsoft applications use IFilters to generate their search indexes:

  • MSN Desktop Search
  • SharePoint Portal Server
  • Windows SharePoint Services
  • SQL Server 2000 and 2005
  • Indexing Service

Those are not the only apps that use IFilters - but they are good representative examples.

Applications typically call the IFilter DLLs and then use them to examine the content of files stored on a computer. The information that comes from the IFilter is used to build a searchable text index that correlates the discovered content back to its source. From there an application can allow the user to query the index.
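The index-building step described above might look something like the following minimal Python sketch. It is an illustration under stated assumptions, not how any of the listed Microsoft products actually work: extract_text is a hypothetical placeholder for the call into an IFilter (here the "files" are already plain text), and build_index and search are names made up for this example. The point is only that each word maps back to the files it came from.

```python
import re
from collections import defaultdict


def extract_text(path, raw):
    """Hypothetical stand-in for an IFilter call; the sample data is already plain text."""
    return raw


def build_index(files):
    """Map each lowercase word to the set of file paths that contain it."""
    index = defaultdict(set)
    for path, raw in files.items():
        for word in re.findall(r"[a-z0-9]+", extract_text(path, raw).lower()):
            index[word].add(path)
    return index


def search(index, query):
    """Return the paths that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up file names and contents, for illustration only.
files = {
    "q1.doc": "Budget report for the first quarter",
    "notes.txt": "Meeting notes about the budget",
    "help.chm": "How to print a report",
}
index = build_index(files)
print(sorted(search(index, "budget report")))
```

Because the index stores file paths alongside the words, a query never has to open the original files; it only has to look up the words.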

Q: What IFilters are available?

Nothing beats a good Google search for finding the latest and greatest, but the Channel 9 wiki has a useful page listing a variety of IFilters and how to find them.

Q: How can I tell what IFilters are installed on my system?

A newer (and free) application from Citeknet called IFilter Explorer will let you see what all is installed on your computer, with more information than the average person will likely need. Developers who need to work with IFilters will find the information very useful in its detail.
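For the developers in the audience, here is a hedged Python sketch of one way to look this up by hand. It assumes the commonly documented registry layout, as I understand it: the file extension's PersistentHandler key points to a handler GUID, that handler's PersistentAddinsRegistered key has a subkey named after the IFilter interface ID pointing to the filter's class ID, and that class's InprocServer32 key holds the DLL path. The function names are invented for this example, and some file types register their filters by other routes, so a result of None does not prove that no filter is installed.

```python
import sys

# Interface ID of IFilter, used as a subkey name under a persistent handler.
IFILTER_IID = "{89BCB740-6119-101A-BCB7-00DD010655AF}"


def find_filter_dll(extension, read_default):
    """Return the DLL path registered as the IFilter for an extension, or None.

    read_default(subkey) must return the default value stored at
    HKEY_CLASSES_ROOT\\subkey, or None if the key or value is missing.
    """
    handler = read_default(extension + "\\PersistentHandler")
    if handler is None:
        return None
    filter_clsid = read_default(
        "CLSID\\" + handler + "\\PersistentAddinsRegistered\\" + IFILTER_IID
    )
    if filter_clsid is None:
        return None
    return read_default("CLSID\\" + filter_clsid + "\\InprocServer32")


def registry_reader(subkey):
    """Read a default value from HKEY_CLASSES_ROOT (works on Windows only)."""
    import winreg  # imported here so the sketch still loads on other systems

    try:
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, subkey) as key:
            value, _value_type = winreg.QueryValueEx(key, "")
            return value
    except OSError:
        return None


if sys.platform == "win32":
    # The extensions below are arbitrary examples.
    for ext in (".txt", ".doc", ".pdf", ".chm"):
        print(ext, find_filter_dll(ext, registry_reader))
```

Passing the registry reader in as a parameter keeps the lookup logic separate from Windows itself, so the same steps can be tried against a plain dictionary of sample keys.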

If you know of other IFilter resources or facts, please comment here or post them on the Channel 9 wiki to share with others.



Thursday, 20 January 2005 18:21:55 (Pacific Standard Time, UTC-08:00)
That's a nice way to describe IFilters! Thanks for the links.
Tuesday, 21 June 2005 03:56:42 (Pacific Daylight Time, UTC-07:00)
You should also check out http://addins.msn.com - there are free ifilters for download and install from ifiltershop.com, Adobe (.pdf) & citeknet available from there.