The Culling Fields: Some Software is Better than Others at Efficient Document Culling
Thursday, July 16, 2015

A word of caution before I get into the weeds of how two-filter document culling works. Although this method is software agnostic, your document review software must have certain basic capabilities for you to fully emulate it. An important one, found in most software, is an effective, easy-to-use bulk coding feature. It is primarily used in the first filter as part of multimodal broad-based culling.

Some of the multiple methods in the first pass of filtering do not require special software features. Instead, they require good attorney judgment. For example, it takes legal judgment to know which custodians to exclude, or what date range to use. This is something I often advise on in my role as the firm’s National e-Discovery Counsel. The judgment comes from years of experience and a solid collaborative process with the trial lawyer in charge of the case. No matter how good your software, effective culling will not work without skilled attorneys to use the software.

Nevertheless, other culling methods rely heavily on software features. These search features require a trifecta of legal knowledge, technical knowledge, and good software. Good examples of this are email domain searches and similarity searches. If your software does not have the first-filter features discussed in this series, then you should probably switch right away. For most, however, that will not be necessary. Most software today has the features needed for the methods described for the first filter. The multimodal culling methods used in the first filter are, for the most part, pretty basic.
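To make the simpler first-filter cuts concrete, here is a minimal sketch in Python of culling by custodian, date range, and email domain. It is an illustration only, not a description of how any particular review platform works. The `Doc` record, the `first_filter` function, and every custodian name, date, and domain below are hypothetical, and in a real project those choices come from attorney judgment, not from code.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    """Hypothetical minimal document record; real platforms track far more."""
    doc_id: str
    custodian: str
    sent: date
    sender: str


def sender_domain(address: str) -> str:
    """Return the lower-cased domain portion of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def first_filter(docs, excluded_custodians, start, end, excluded_domains):
    """Split docs into (kept, culled) using three simple first-filter cuts."""
    kept, culled = [], []
    for d in docs:
        out_of_range = not (start <= d.sent <= end)
        if (d.custodian in excluded_custodians
                or out_of_range
                or sender_domain(d.sender) in excluded_domains):
            culled.append(d)  # bulk-coded as culled, not sent on for review
        else:
            kept.append(d)
    return kept, culled


# Illustrative data only; all names, dates, and domains are made up.
docs = [
    Doc("D1", "smith", date(2014, 3, 1), "smith@client.example"),
    Doc("D2", "jones", date(2014, 3, 2), "jones@client.example"),
    Doc("D3", "smith", date(2009, 1, 5), "smith@client.example"),
    Doc("D4", "smith", date(2014, 4, 9), "news@Vendor-Mail.example"),
]
kept, culled = first_filter(
    docs,
    excluded_custodians={"jones"},
    start=date(2013, 1, 1),
    end=date(2014, 12, 31),
    excluded_domains={"vendor-mail.example"},
)
kept_ids = [d.doc_id for d in kept]
culled_ids = [d.doc_id for d in culled]
```

Keeping the culled set, rather than discarding it, mirrors the bulk coding idea: the documents are tagged and set aside, so the decision can be revisited if the case changes.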

The second filter is, however, a different story. The software features needed to implement the second filter are more advanced, namely the predictive coding and probability ranking features. Here you review and cull the various strata of the ranked documents. The second filter can still work without predictive coding, but not as well. You can use keyword search ranking, for instance, but it is not nearly as reliable in my experience.
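As a rough illustration of what "strata of the ranked documents" can mean, the sketch below groups documents into bands by a relevance probability score. It assumes your software can export a score between 0 and 1 for each document. The `stratify` function, the band boundaries, and the sample scores are all hypothetical, and where to review and where to cull in each band remains a judgment call that depends on the project.

```python
from bisect import bisect_right


def stratify(scores, bounds=(0.25, 0.50, 0.75)):
    """Group document ids into probability strata.

    scores: mapping of doc_id -> predicted probability of relevance (0..1).
    bounds: ascending cut points; these defaults are illustrative
            assumptions, not recommended thresholds.
    Returns a dict of stratum index -> list of doc ids, where stratum 0
    is the lowest-ranked band and len(bounds) is the highest.
    """
    strata = {i: [] for i in range(len(bounds) + 1)}
    for doc_id, p in scores.items():
        # bisect_right counts how many cut points are <= p
        strata[bisect_right(bounds, p)].append(doc_id)
    return strata


# Made-up scores for four documents.
scores = {"A": 0.95, "B": 0.60, "C": 0.30, "D": 0.02}
strata = stratify(scores)
```

With strata in hand, a reviewer might read everything in the top band, sample the middle bands, and spot-check the bottom band before culling it. That workflow is one possibility under these assumptions, not a rule.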

Keyword and other search methods are still used when you do predictive coding. It is not an either/or decision, but rather both/and. You use predictive coding (active machine learning) together with other search methods. The software you use should be designed for this.

Even when your culling has progressed to the second filter, and advanced predictive coding culling, you will still want to use a smattering of other multimodal search methods. In the final stages of review you do so not just to cull out probably irrelevant documents, but to help find relevant and highly relevant documents to improve training. I do not rely on probability searches alone, although sometimes in the second filter I rely almost entirely on predictive coding based searches to continue the training. It all depends on the particular review project.

If you are using software without AI-enhanced active learning features, i.e. without bona fide predictive coding, then you are forced to use only other multimodal methods in the second filter, such as keywords. Be warned: true active learning features are missing from most review software, or are weak. That is true even of software that claims to have predictive coding features, but really just has dressed-up passive learning, i.e. concept searches with latent semantic indexing.

You handicap yourself, and your litigation capacities, by continuing to use such less expensive programs. Good software, like everything else, does not come cheap, but should pay for itself many times over if used correctly. I like to think that the same comment goes for lawyers too!

This is part three of the continuing series on two-filter document culling. Please read part one and part two first.

 
