Page 11 - EDiscovery
NYLJ.COM | E-Discovery | MONDAY, MARCH 16, 2015 | S11
“contextually diverse” documents to make sure there are no topics or concepts left in the collection that go unexplored by reviewers. As reviewers complete their small batches of documents, the system continuously re-ranks the entire population in the background, incorporating those new coding calls to “get smarter” and improve its predictions.

Each time reviewers click a button for more documents, the system creates a new batch based on the most recently completed re-ranking. This means that the ranking is constantly improving and never stops learning. But from the reviewing attorneys’ point of view, all they have to do is ask for more documents and then review batches that have a much higher proportion of relevant documents than they otherwise would have seen.

We proceeded along this track until we started seeing batches with few, if any, relevant documents. This is one of the indications that there are few relevant documents that remain unreviewed, and that the point of diminishing returns has been passed. Catalyst then helped us test our results by sampling the documents we had not reviewed. The statistical analysis of the sample review showed that we had achieved a very high “recall,” the review metric that describes how close we came to finding everything. Achieving such a high recall means that we found the vast majority of the relevant documents.

By the end of the TAR process, we had reviewed another 6,800 documents, beyond the 18,200 we had reviewed before beginning TAR. That meant that there remained another 15,800 documents that we never had to review. Put another way, once we fired up the TAR system, we only had to review 30 percent of the remaining documents before we were done. It saved 70 percent of the remaining expense and time the review would have otherwise required.

Again, by the standards of some large cases, that raw number may not sound like a huge savings. But a 70 percent savings on even a portion of a larger review quickly gets into some seriously large numbers. And even when you consider the savings to our client, we achieved a significant result.

Let’s assume that a reviewer can typically get through about 50 records per hour. (Remember that this is a patent litigation, and technical documents typically take longer to review.) That would mean it would take 316 hours to review 15,800 records. Let’s further assume that the average blended rate for the reviewers in this case was $250 an hour. At $250 an hour, that is a cost of $79,000. Here, the fees to the vendor, Catalyst, were about $10,000. Thus, the entire net savings was about $69,000.

The Law on Switching Horses

Of course, switching review methods in the middle of a case is not always something that can be done unilaterally. A party’s ability to do that may be limited by the case management order or some other circumstances.

A leading case on this point is Bridgestone Americas v. International Business Machines, No. 3:13-1196 (M.D. Tenn. July 22, 2014). There, after initially screening a collection of over two million documents using key words, Bridgestone sought leave of court to use TAR for the remainder of its responsiveness review. IBM objected that this would be an unwarranted change in the original case management order and that it would be unfair to use TAR after doing the initial screening with key words.

After considering the parties’ arguments, U.S. Magistrate Judge Joe B. Brown issued an order permitting Bridgestone to use TAR. Acknowledging that he was “allowing Plaintiff to switch horses in midstream,” he reasoned that “openness and transparency in what Plaintiff is doing will be of critical importance.” He noted that Bridgestone had agreed to provide its seed documents and that IBM is a “sophisticated user of advanced methods for integrating and reviewing large amounts of data.”

“In the final analysis, the use of predictive coding is a judgment call, hopefully keeping in mind the exhortation of Rule 26 that discovery be tailored by the court to be as efficient and cost-effective as possible,” Brown wrote. “In this case, we are talking about millions of documents to be reviewed with costs likewise in the millions. There is no single, simple, correct solution possible under these circumstances.”

In our case, opposing counsel did not object to our initial manual, linear review followed by the switch to TAR. We had entered into a case-management protocol that required cooperation and collaboration in identifying initial search terms but specified nothing about what would happen after running the search terms against the agreed-upon custodians. Opposing counsel was well informed about TAR and used TAR for review of their client’s documents.

The Specifics

Earlier, we outlined the general process we used in our review. For those of you who might be considering TAR, allow us to provide further detail on the workflow.

As noted above, the parties had agreed at the outset on certain search terms and custodians. Thus, we began by running the agreed-upon search terms against the files for the agreed-upon custodians and non-custodial data sources. After deNISTing and deduplication, that process yielded about 40,800 records for review.

Our reviewers then reviewed the records linearly, starting with the custodians and non-custodial data sources that we believed most likely to have a higher percentage of responsive records. Our reviewers made coding determinations (“responsive,” “non-responsive” or “privileged”) for approximately 18,200 records during the linear review.

It was at that point that we decided to employ TAR and, in particular, the Insight Predict TAR tool and document-relationship engine from Catalyst.

To train its search engine, Catalyst used our coding determinations for the approximately 18,200 records we had considered during the linear review. Catalyst’s Predict was then set to automatically create batches of 50 records each that the TAR engine predicted were most likely responsive to the opposing party’s production requests.

After a reviewer made his or her coding determinations for all of the records in a 50-record batch, the TAR engine utilized those coding determinations to update the algorithm and to continuously re-rank the entire population in the background. (This is the process called Continuous Active Learning.) Predict would then use the most recent re-ranking to create new batches on demand for the reviewers. This review-and-updating process continued until our reviewers repeatedly encountered batches containing few, if any, responsive records.

We then conducted targeted searches through the unreviewed records in an effort to locate other potentially responsive records. Our reviewers made coding determinations on the results of those targeted searches. After that, we had about 15,800 unreviewed records.

We next sampled (at a confidence level of 98 percent and a confidence interval of ± 3 percent) the entire population of about 40,800 records to determine the percentage of responsive records. That sampling indicated an overall richness rate of 15.3 percent (including privileged materials).

Based on that sampling, the entire population should include approximately 6,240 responsive records (including privileged materials). At that point, however, we had already coded about 6,400 responsive records (including privileged materials). Thus, we had an estimated recall of 102 percent and a confidence interval of ± 3 percent.

Based on that estimated recall, we decided to discontinue the review because the burden of or expense for continuing the review and finding any additional responsive records outweighed the likely benefit of any such records.

Conclusion

The law and practice surrounding the use of TAR in e-discovery continue to evolve. This case was somewhat unusual in that our legal team did not start out with TAR. It was only after manually reviewing nearly half the documents that we decided to switch to using TAR. Even so, by using TAR, the team was able to eliminate the need to manually review nearly 40 percent of all the documents. That resulted in substantial cost savings to our client and time savings to our litigation team.
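
For readers who want to check the arithmetic behind the figures reported in this article, the short Python sketch below recomputes them. Every input is a rounded number quoted in the text. The variable and function names are illustrative assumptions only and are not part of Insight Predict or any other TAR tool.

```python
# Recomputes the figures quoted in the article from its own rounded inputs.
# All names are illustrative; nothing here belongs to any TAR product.

TOTAL_RECORDS = 40_800      # records after deNISTing and deduplication
LINEAR_REVIEWED = 18_200    # coded during the initial linear review
TAR_REVIEWED = 6_800        # coded after switching to TAR
NEVER_REVIEWED = 15_800     # left unreviewed when the review stopped
RECORDS_PER_HOUR = 50       # reviewer speed assumed in the article
BLENDED_RATE_USD = 250      # hourly rate assumed in the article
VENDOR_FEES_USD = 10_000    # approximate vendor fees
RICHNESS = 0.153            # sampled share of responsive records
RESPONSIVE_CODED = 6_400    # responsive records already coded


def review_savings(records, per_hour, rate, vendor_fees):
    """Return (hours avoided, review cost avoided, net savings after fees)."""
    hours = records / per_hour
    avoided_cost = hours * rate
    return hours, avoided_cost, avoided_cost - vendor_fees


def estimated_recall(population, richness, responsive_found):
    """Return (expected responsive records, responsive found / expected)."""
    expected = population * richness
    return expected, responsive_found / expected


hours, avoided, net = review_savings(
    NEVER_REVIEWED, RECORDS_PER_HOUR, BLENDED_RATE_USD, VENDOR_FEES_USD
)
expected_responsive, recall = estimated_recall(
    TOTAL_RECORDS, RICHNESS, RESPONSIVE_CODED
)
# Share of the post-switch population that still had to be reviewed.
tar_share = TAR_REVIEWED / (TAR_REVIEWED + NEVER_REVIEWED)
# Share of the whole collection that was never manually reviewed.
skipped_share = NEVER_REVIEWED / TOTAL_RECORDS

if __name__ == "__main__":
    print(f"Hours avoided: {hours:.0f}")
    print(f"Review cost avoided: ${avoided:,.0f}")
    print(f"Net savings after vendor fees: ${net:,.0f}")
    print(f"Expected responsive records: {expected_responsive:,.0f}")
    print(f"Estimated recall: {recall:.1%}")
    print(f"Reviewed after switching to TAR: {tar_share:.0%}")
    print(f"Never manually reviewed: {skipped_share:.0%}")
```

The recall estimate comes out slightly above 100 percent because its denominator, the expected number of responsive records, is itself a sample-based estimate with a margin of error.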