This week we are posting an excerpt from our recently published EDRM Buyer’s Guide, a phase-by-phase walkthrough and checklist created by a former Litigation Project Manager using guidelines developed during implementation of a top oil and gas company’s e-discovery program. The Guide is designed to help stakeholders evaluate software and workflow solutions at each phase of the e-discovery process in a neutral way, and it contains advisory sections on everything from information security to considerations when TIFFing documents. We will be posting excerpts from both the long-form and checklist portions of the Guide every few weeks, moving from left to right along the Electronic Discovery Reference Model [EDRM.NET].
This week we are focusing on the EDRM phase “Collection”. The EDRM Collection phase covers the acquisition of potentially relevant electronically stored information (ESI). Collection should include both the document or file and any associated metadata.
Collection is a crucial part of the e-discovery process, which is reflected in the wide spectrum of offerings and definitions in this area. Many providers offer some level of collection, but few have years of experience and a solid track record of delivering defensible results. Collection and Processing capabilities should be heavily scrutinized to separate inflated marketing spin from the real thing. Organizations should take particular care to test that data (open files and email, system files, large files, etc.) is not being dropped or missed during collection.
While certainly not a court requirement, forensic data collection inherently achieves a degree of defensibility not available in a non-forensic collection. Beyond heightened defensibility, forensic collection has other advantages, including the ability to audit the collection and the ability to collect deleted files. No longer solely the domain of law enforcement, forensic collection is increasingly understood and sought by opposing counsel and the courts. The organization that chooses a tool with forensic collection capability not only chooses the most defensible form of collection, but also puts itself at the front of a developing trend.
Whether or not the chosen solution offers a forensic collection capability, it must provide a core set of functions to be minimally acceptable.
Organizations should review the solution and confirm the following:
- Ability to collect open files or files currently in use. Tools that fail to meet this critical criterion fall short of being legally defensible and leave organizations open to charges of incomplete preservation.
- Ability to capture a full disk image without requiring one. This becomes especially useful in criminal investigations when a collection window is short. Case or custodian specifics (employee malfeasance or termination, for example) can dictate the necessity for full-disk imaging.
- The collection tool must have a full spectrum of criteria for both inclusion and exclusion, including the following: keywords (with a full set of search operators), file extensions, file type (internal file identification), file dates (accessed, modified, created, etc.), file path, file size, MD5 hash, and archive search (the ability to search and collect from compressed file types).
- Incremental collection capability, meaning the solution should offer the ability to collect all modified or newly created data since the date of last acquisition. This is particularly critical for lengthy matters or matters that have a large number of mobile custodians.
- Ability to collect from employee laptops that are not logged into the corporate network. This is important because most organizations have many potential custodians located offsite and outside the corporate network.
- A robust, proven throttling capability to limit network and resource impact when acquiring data during peak utilization periods or where available network bandwidth comes at a premium.
- Reporting. The system should generate reports on non-responsive files (a list of all files that did not meet the search criteria and were therefore not collected), reports that describe the collection criteria with associated “hit” counts, the data sources, the legal matter and the custodians involved, and a report listing all collected files by name, type, source and size.
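To make the criteria and reporting items above more concrete when testing a vendor's claims, here is a minimal Python sketch of how inclusion and exclusion criteria, an incremental cutoff date, and a collected/not-collected report might fit together. It is an illustration only and does not represent any particular product: the names CollectionCriteria, md5_of and select_files are hypothetical, and only a few of the listed criteria (extension, size, modified date, MD5 hash) are covered.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import List, Optional, Set, Tuple


@dataclass
class CollectionCriteria:
    """Hypothetical criteria for one collection pass (not a vendor API)."""
    include_extensions: Set[str] = field(default_factory=set)  # empty = any extension
    max_size_bytes: Optional[int] = None                       # None = no size limit
    modified_since: Optional[datetime] = None                  # incremental cutoff
    exclude_md5: Set[str] = field(default_factory=set)         # known files to skip


def md5_of(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def select_files(
    root: Path, criteria: CollectionCriteria
) -> Tuple[List[Path], List[Tuple[Path, str]]]:
    """Split files under root into (collected, skipped-with-reason)."""
    collected: List[Path] = []
    skipped: List[Tuple[Path, str]] = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        info = path.stat()
        modified = datetime.fromtimestamp(info.st_mtime)
        reason = None
        if criteria.include_extensions and path.suffix.lower() not in criteria.include_extensions:
            reason = "extension"
        elif criteria.max_size_bytes is not None and info.st_size > criteria.max_size_bytes:
            reason = "size"
        elif criteria.modified_since is not None and modified <= criteria.modified_since:
            reason = "not modified since last acquisition"
        elif criteria.exclude_md5 and md5_of(path) in criteria.exclude_md5:
            reason = "excluded hash"
        if reason is None:
            collected.append(path)
        else:
            # Every skipped file keeps its reason for the non-responsive report.
            skipped.append((path, reason))
    return collected, skipped
```

The second list is the point of the exercise: because every file that was not collected is recorded along with the criterion that excluded it, the non-responsive report and the per-criterion counts can be produced from the same pass that selected the files.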
Finally, a solid collection tool should not interrupt custodians’ day-to-day business activities. The process should be transparent to the end user and automated to the extent that a collection is completely repeatable from one custodian to the next.
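As a rough illustration of what throttling means in practice, the following Python sketch copies a file while holding the average transfer rate under a configured cap, and returns an MD5 hash of the copied data so the result can be checked against the source. This is a simplified sketch under stated assumptions, not how any specific collection product works: the function name throttled_copy and its parameters are hypothetical, and real tools typically throttle across many files and adapt to network conditions.

```python
import hashlib
import time
from pathlib import Path


def throttled_copy(src: Path, dst: Path, max_bytes_per_sec: int,
                   chunk_size: int = 64 * 1024, sleep=time.sleep) -> str:
    """Copy src to dst at an average rate no higher than max_bytes_per_sec.

    Returns the MD5 hex digest of the bytes that were copied. The sleep
    argument is injectable so the pacing can be observed without waiting.
    """
    digest = hashlib.md5()
    copied = 0
    started = time.monotonic()
    with src.open("rb") as reader, dst.open("wb") as writer:
        while chunk := reader.read(chunk_size):
            writer.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
            # Minimum elapsed time at which this many bytes stays within the cap.
            min_elapsed = copied / max_bytes_per_sec
            delay = min_elapsed - (time.monotonic() - started)
            if delay > 0:
                sleep(delay)
    return digest.hexdigest()
```

For example, calling throttled_copy(source_path, target_path, max_bytes_per_sec=512 * 1024) would hold the copy to roughly 512 KiB per second on average. When evaluating a product, the equivalent setting should be adjustable per custodian or per time window so that collections can run during business hours without being noticed.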