You can also look into IFilters, the Windows plug-ins that extract text from specific file formats. Searching for "asp.net ifilter" turns up a number of resources:
- http://www.codeproject.com/KB/cs/IFilter.aspx
- http://en.wikipedia.org/wiki/IFilters
- http://www.ifilter.org/
- https://stackoverflow.com/questions/1535992/ifilter-or-sdk-for-many-file-types
Of course, there is added hassle if you are distributing this to client systems. Either you include the IFilters in your distribution and install them alongside your app on each machine, or your app will be unable to extract text from any file type that machine has no IFilter for.