December 23, 2014

In eDiscovery, Forensics Is NOT a 4-Letter Word

It is not clear when it happened, but the marketing cycle surrounding the word forensics has clearly come full circle. Ten years ago, forensics was really a bad word in the eDiscovery market. It implied a level of black magic that few people were willing to play with and was generally relegated to the realm of cops and government officials. That all seemed to change with the amendments to the FRCP. With news of sanctions being levied for failure to properly preserve data, people began looking to forensics technologies as the gold standard in the eDiscovery market, largely due to the extensive case law backing these types of products. The standard forensics products (FTK and EnCase are by far the most widely used) could boast thousands of cases in which they were used and large numbers in which they were cited directly. However, as is always the case in any market, the success of these tools brought a massive attempt by other companies to co-opt the term forensics for their own benefit, while attempting to downplay the value of the forensics products relied on by law enforcement and government agencies. Over a period of several years, the word forensics became so overplayed and was used in so many odd ways that it largely lost its meaning. We are now clearly seeing a backlash against this type of marketing, which is unfortunate and a mistake, because forensics is both relevant and extremely important in the world of eDiscovery.

When used correctly, the term “forensic”, in the context of digital evidence, means that the process and methodologies employed by a forensic-class technology to collect and produce evidence have withstood challenges to their validity in court. So what makes one technology “courtroom battle-tested” and another not? This is actually a contentious issue. Most vendors in the market will argue that because they have customers who have used their products in civil cases, the product is forensically sound. That, however, is an extremely dangerous leap in logic. A Crayola sketch can get admitted into court so long as no one objects, but that doesn’t make it a court-accepted technique for producing documents. Until the technology and process have been challenged and have held up under judicial scrutiny, no one can determine their defensibility. Few technologies have withstood an actual courtroom battle, but there is a substantial body of case law scrutinizing and ultimately backing several eDiscovery technologies. For example, see Gutman v. Klein and United States v. Mann for court validation of document preservation and document searching technologies, respectively.

Case law is not the only important distinguishing characteristic of forensically sound technologies. There is a technological reason to favor eDiscovery solutions based on forensic technologies – even if a company’s risk profile is low. That reason is diligence, and the well-hidden fact is that forensics technologies are more thorough in terms of their ability to locate, collect, process, and expose data than other solutions. This stems from the historical way in which these technologies approach data handling. Most eDiscovery technologies handle data much like a search engine would; they go after the most obvious and active data, index it, and then spit back the results. This approach is incredibly appealing to newcomer eDiscovery software vendors, because all it typically requires is building an interface around one or two popular text extraction technologies. However, reliance on the results from such tools does not make for a defensible and repeatable process.

The problem is those technologies were really not designed for the multifarious states in which data exists on custodian machines. The result is that these products apply incomplete search procedures on only common file types and don’t handle well (or at all) exceptions such as:

  • Non-Windows-based files
  • Corrupt files
  • Deleted files
  • Large files
  • Embedded files
  • Open files and emails
  • New, exotic & unknown formats
  • Files inside of files several levels deep

So what do products relying on these types of technologies and approaches typically do when they come across exceptions? The answer is all too frustrating for unsuspecting customers: they typically drop the file and move on. As a result, the user never knows that they have failed to produce relevant data. The reality is that search and collection tools need to be augmented with purpose-built code that properly supports file systems and file types in order to handle even a typical custodian’s electronically stored information (ESI). This takes many years of experience and development. Many vendors simply lack the time, resources, and experience necessary to fully address the breadth of contemporary ESI. This deficiency helps explain why many of them avoid discussing forensically sound tools or, worse, paint a negative picture of them.
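To make the "drop the file and move on" problem concrete, here is a minimal Python sketch of the alternative: a processing loop that records every file it could not handle, so the user can see what never made it into the index. Everything here is hypothetical and for illustration only. `process_files`, `ProcessingReport`, and `plain_text_extractor` are invented names, not part of any real eDiscovery product, and the extractor is a deliberately naive stand-in that only understands UTF-8 text.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class ProcessingReport:
    """What was indexed, plus every file that could not be processed."""
    indexed: Dict[str, str] = field(default_factory=dict)            # path -> extracted text
    exceptions: List[Tuple[str, str]] = field(default_factory=list)  # (path, reason)


def process_files(paths: Iterable[Path],
                  extract_text: Callable[[Path], str]) -> ProcessingReport:
    """Run the extractor over every file and log failures instead of hiding them."""
    report = ProcessingReport()
    for path in paths:
        path = Path(path)
        try:
            report.indexed[str(path)] = extract_text(path)
        except Exception as exc:
            # The file is NOT silently dropped: it goes on an exception list
            # that a reviewer can inspect and follow up on.
            report.exceptions.append((str(path), f"{type(exc).__name__}: {exc}"))
    return report


def plain_text_extractor(path: Path) -> str:
    """Hypothetical stand-in extractor: fails on anything that is not UTF-8 text."""
    return path.read_bytes().decode("utf-8")
```

A reviewer would then check `report.exceptions` after each run. An empty list is a claim that can be defended, whereas a tool that quietly skips files offers no such assurance.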

Products built on a forensic engine don’t suffer this shortfall. Because these technologies were initially designed many years ago for criminal cases, where ALL evidence on the drive is in scope, not just the active files, they have been developed and honed to look not only at every part of every file, but also at every part of the drive in question. These products continue to evolve in this high-scrutiny environment and are held to much higher performance standards, due to the nature of their use. The result is a more thorough set of files and indices, and more accurate search results. Companies looking to purchase eDiscovery solutions should run a test against their own sample data to see this dynamic in action. In the typical experience, a forensics tool will produce as much as 20% more hits than non-forensics technologies, even on active files. That 20% could be the difference between a case won and sanctions for underproduction.
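For anyone running that side-by-side test, a small Python sketch like the one below can tally the difference. It assumes each tool can export the list of document identifiers that matched the same search over the same sample data. The function name `compare_hits` and the sample identifiers are made up for illustration and do not come from any vendor's product.

```python
def compare_hits(tool_a_hits, tool_b_hits):
    """Compare the document IDs returned by two tools for the same search."""
    a, b = set(tool_a_hits), set(tool_b_hits)
    return {
        "found_by_both": sorted(a & b),
        "only_tool_a": sorted(a - b),   # documents tool B missed
        "only_tool_b": sorted(b - a),   # documents tool A missed
        # Percentage by which tool A's hit count exceeds tool B's
        # (None when tool B returned nothing, to avoid dividing by zero).
        "pct_more_in_a": 100 * (len(a) - len(b)) / len(b) if b else None,
    }


# Hypothetical sample: tool A returns six documents, tool B returns five.
summary = compare_hits(
    ["doc1", "doc2", "doc3", "doc4", "doc5", "doc6"],
    ["doc1", "doc2", "doc3", "doc4", "doc5"],
)
print(summary["only_tool_a"], summary["pct_more_in_a"])
```

The percentage matters less than the `only_tool_a` and `only_tool_b` lists: those are the specific documents worth opening to learn why one tool found them and the other did not.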

In the end, there are many compelling reasons to rely on forensics technologies in the context of eDiscovery, among them defensibility. A company does itself a great disservice if it falls victim to the overmarketing or downplaying of the term forensics and chooses to rule out these products as inapplicable or overkill. Forensics is not a 4-letter word in the eDiscovery market. If your product has case law backing it, then that is a good thing, not a bad one, and it should give your organization added confidence. After all, you don’t want to be the person who brings the proverbial knife to the courtroom gunfight.

Tim Leehealey

Tim Leehealey is Chairman and CEO of AccessData. Prior to joining AccessData he was VP of Corporate Development at Guidance Software. Prior to that he was an investment banking analyst covering the security market at Wedbush Morgan.

Comments

  1. Perry Segal says:

    I’m writing this to inform you that there already is an existing e-discovery blog called “e-Discovery Insights” at the url, http://www.ediscoveryinsights.com. I’ve owned this blog for over two years. I respectfully request that you rename this blog to something else.

  2. Josh Restivo says:

    In general, I agree with Tim’s message here. I’ve interacted with several larger law firms who utilize their all-in-one case-management system to load and index individual documents. Sometimes the resulting output is sufficient. Often, though, it could have been far more valuable had basic forensic (as Tim uses the word) processes been followed.

    It can be too easy to write this off as a cost-benefit discussion (proper forensic tools are quite pricey and provide little to no production capability) but so many attorneys and their firms remain completely unaware of the potential benefits to be realized by a more thorough forensic approach. If their case management software happens to be aware of document metadata, I can assure you that it is completely ignorant of filesystem metadata. This latter form of metadata has proven far more useful across a wide cross-section of cases. Plus it can be used to help validate document-level metadata or, more rarely, invalidate filesystem metadata.

    So again, good idea to go the forensics route wherever and whenever possible. However, both AccessData (FTK) and Guidance Software (EnCase) have some demons to own up to. Pertinent to Tim’s post and his postulation that forensics software will catch all manner of stuff that would be missed under common Discovery environments…I feel compelled to temper the argument.

    There are a few angles to take here but the one issue that has reared its ugly head time and again is that while the leading forensic tools pride themselves on identifying and extracting data from disparate sources, event correlation remains a joke. This is one of the primary reasons behind the perception that forensics is a black art. The software extracts volumes of raw data and then provides the most rudimentary searching and sorting features.

    Rather than, say, extracting a Windows event log and vetting timestamps from items contained therein with corresponding filesystem entries, one is left to do that oneself if one wishes to validate partial filesystem integrity. Or, instead of identifying web browsing sessions based on clustered cache times and grafting that information over metadata from filesharing repositories with contraband files, we’re left to juggle that on our own.

    The fact that neither of the computer forensic software leaders seems to have realized that there are different (legal) case types with their own common data extraction needs tends to indicate that their developers have spent too much time in their Aerons and not enough time working with non-government attorneys and private practice forensic firms.
