Interview with dtSearch Engine – Instant search for large data volumes
by admin
A number of years ago, PC Magazine reportedly described dtSearch as “industrial-strength text retrieval,” and that is still the angle the company takes.
dtSearch has been selling the dtSearch product line since 1991, offering nearly unparalleled experience in both the text search and retrieval space and the file parsing and conversion space. dtSearch has a very large installed base, ensuring active use of the dtSearch product line around the world.
Indexed search time is typically less than a second, even across terabytes of data.
At the core of the dtSearch product line are the dtSearch Engine for Win & .NET and dtSearch Engine for Linux products. Other products include dtSearch Web with Spider, dtSearch Publish, dtSearch Desktop with Spider and dtSearch Network with Spider.
Today, I’ll be interviewing the PR representative for dtSearch.
What is the difference between regular and forensic search?
A lot of forensic industry searching is regular searching of files like MS Office files, PDF files, HTML files, XML files, etc. For this, the general dtSearch search types and functionality still apply.
Sometimes, however, the data that requires searching is not normal file data. Rather, it is recovered data appearing outside the context of normal file types. Examples of this type of data would be data recovered through an “undelete” process; data pulled from unallocated computer space; or partially recovered file fragment data.
For this type of data, we offer an additional filtering option that attempts to sift through the binary data to retrieve the text. Intuitively, most people think that “filtering” will retrieve less information, since we must be “filtering out” something. However, the opposite is true for forensically-recovered data: the filter pulls out text that would otherwise stay buried in the binary data.
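To illustrate the general idea of retrieving text from raw binary data, here is a minimal Python sketch. It is not dtSearch's filter or API; the function name `extract_text_runs` and the `min_len` threshold are assumptions made for this example. It simply scans a byte string for runs of printable characters, stored either as plain ASCII or as UTF-16LE, and returns them as text.

```python
import re


def extract_text_runs(blob: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable text found in raw bytes.

    Hypothetical helper for illustration only; a real forensic filter
    handles far more encodings and edge cases.
    """
    # Runs of printable ASCII (space through tilde), at least min_len long.
    ascii_pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # The same characters stored as UTF-16LE: each is followed by a zero byte.
    utf16_pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)

    runs = [m.group().decode("ascii") for m in ascii_pattern.finditer(blob)]
    runs += [m.group().decode("utf-16-le") for m in utf16_pattern.finditer(blob)]
    return runs


# A made-up fragment: binary noise around one ASCII string and one UTF-16LE string.
fragment = (
    b"\x00\x01\xffinvoice 2021\x00\x02\x90ab\x03"
    + "secret".encode("utf-16-le")
    + b"\xfe\xff"
)
recovered = extract_text_runs(fragment)
```

The short run `ab` is dropped because it falls below `min_len`, which is the usual trade-off in this kind of scan: a lower threshold recovers more text but also more noise.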
In addition to searching of forensically-recovered data, the forensic community tends to have other advanced search needs as well, such as the need to search email files without continually having to go through MAPI, international language search issues, etc.
What features should companies look for when evaluating network search solutions?
The three major questions to ask are:
- What type of data is it?
- How much data is it?
- Do you want to make the data available in a classic network search environment or in a web-browser-access (typically Intranet) environment?
Companies need a search engine that will handle all of the various types of data that they need searched, preferably in a unified manner. This is discussed in depth in our online descriptions of distributed searching (see the answer on distributed searching below).
Companies also need a search engine that will handle the quantity of data that they have, and not falter on a very large data set. (See above for a discussion of the dtSearch terabyte indexer.)
Finally, companies need to decide if they want to make searching available in a classic Windows network environment or on an Intranet server, such as Win / IIS.
For the classic network environment, we offer dtSearch Network with Spider. For shared browser-based search access, we offer dtSearch Web with Spider. Both are easy to deploy and require no programming.
Why would someone implement your web search solution instead of using a PHP script to search their online database?
A lot of our customers are using us as an “upgrade” to databases that have their own built-in search engines, such as Oracle and MS SQL. Note that the dtSearch Engine can index and search not only SQL metadata, but also the full text of BLOB data.
What are the advantages of distributed searching?
In dtSearch, distributed searching lets you take many different types of data stores from multiple locations and provide integrated “one stop” searching of all of them. For example, a single dtSearch query can cover local MS Office files, email archives, public web sites, and secure Intranet sites like SharePoint databases, even if they are remote. And you can integrate all of these data sources into one set of search results sorted by relevancy, by date, etc.
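The merging step described above can be sketched in a few lines of Python. This is an illustration of the concept only, not the dtSearch API: the `Hit` record, its fields, and the `merge_hits` function are hypothetical names. The sketch also assumes that relevance scores from different sources are already on a comparable scale, which a real system would have to ensure.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Hit:
    """One search result; all field names are hypothetical."""
    source: str     # e.g. "local files", "email archive", "intranet"
    title: str
    score: float    # relevance, assumed comparable across sources
    modified: date


def merge_hits(result_sets: list[list[Hit]], sort_by: str = "relevance") -> list[Hit]:
    """Combine per-source result lists into one list, best or newest first."""
    merged = [hit for hits in result_sets for hit in hits]
    if sort_by == "date":
        return sorted(merged, key=lambda h: h.modified, reverse=True)
    return sorted(merged, key=lambda h: h.score, reverse=True)


# Made-up results from three separate sources.
local = [Hit("local files", "budget.xlsx", 0.72, date(2020, 3, 1))]
email = [Hit("email archive", "Re: budget", 0.91, date(2019, 11, 5))]
intranet = [Hit("intranet", "Budget policy", 0.55, date(2021, 6, 30))]

by_relevance = merge_hits([local, email, intranet])
by_date = merge_hits([local, email, intranet], sort_by="date")
```

Here `by_relevance` lists the email hit first, while `by_date` lists the intranet page first, so the same combined result set can be presented either way.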
Anything else you’d like to add?
In addition to its search technology, as noted above, dtSearch Corp. has its own file parsers / converters. These have become increasingly valuable in their own right, particularly since Oracle has acquired the main file parsers / converters that the world was previously using (i.e. the Stellent / Inso file parsers / converters). So, in addition to being a text search and retrieval company, dtSearch Corp. also offers file format support licenses through our developer API.
For more information on dtSearch Engine, please visit http://dtsearch.com