© 2012 Winston & Strawn LLP
eDiscovery and Technology Assisted Review: What You Need to Know Now Brought to you by Winston & Strawn’s eDiscovery & Information Management Practice Group
Today’s eLunch Presenters
John Rosenthal
Chair, eDiscovery & Information Management, Washington, DC
[email protected]

Chris Costello
eDiscovery & Information Management, New York
[email protected]
Welcome!
Overview
Technology Assisted Review (“TAR”)
• Definitions
• Types of technology assisted review
• Predictive coding

Proposed Changes to the Federal Rules
• Why the need for further changes in the rules
• Overview of the rules process
• Two-track approach of the Advisory Committee: Rule 37; Duke Conference (Rules 16 and 26)
Let’s Look at the Numbers
FTI survey of 31 in-house general counsel:
• 94% found the cost of e-discovery “frustrating”
• 87% used early case assessment to try to resolve matters earlier
• 81% brought software in-house, which helps cut law firm or service provider fees
• 52% brought staff in-house to help reduce fees spent on law firms or service providers
• 32% used clustering or visualization tools to speed review along (down from 34% in 2010)
• 71% used contract attorneys for legal review (down from 77% in 2010)
• 61% were able to quantify how much money they spent on e-discovery; many companies are still unaware of their spending habits
• 42% have a tool to collect and preserve data from the cloud or from social media
E-Discovery Spend
Fulbright Annual Litigation Trends Survey
Electronic Document Review
Excessive and unpredictable costs:
• 58% to 70% of total litigation costs
• Document review costs are rising due to the increasing amount of electronic information

Traditional document review is not accurate:
• Evidence suggests that there are high error rates in linear manual review
• Error rates lead to likelihood of inadvertent production of privileged or sensitive information

Inability to defend the review process:
• Judges are increasingly focusing on the need for validation of review processes
Traditional Electronic Document Review = Linear Review
• Over-collection
• Little or no culling
• Ad hoc use of Boolean searches
• Linear review of the data set
• Use of traditional associate work force to perform review

Traditional approach: manually acquire broad amounts of data → process data → first level review → second level review → produced documents
Goals of ESI Review
• Recall: identification and prioritization of relevant material
• Precision: elimination of irrelevant/non-responsive material
• Identification of privileged material
[Venn diagram: retrieved documents vs. relevant documents. Regions: relevant and retrieved (the overlap); relevant and not retrieved; non-relevant and retrieved.]
Accuracy of Human Review
Recall = (number of responsive documents retrieved) ÷ (total number of responsive documents in the collection)

Precision = (number of responsive documents identified) ÷ (total number of documents retrieved)
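The two ratios above can be computed directly from a coded sample; a minimal sketch, with hypothetical document IDs:

```python
def recall_precision(retrieved, responsive):
    """Compute recall and precision given two sets of document IDs.

    retrieved  -- documents the review process flagged as responsive
    responsive -- documents that are actually responsive (ground truth)
    """
    true_positives = len(retrieved & responsive)
    recall = true_positives / len(responsive)    # found / all responsive
    precision = true_positives / len(retrieved)  # found / all retrieved
    return recall, precision

# Hypothetical example: 10 responsive docs exist in the collection;
# the review retrieved 8 docs, 6 of which are actually responsive.
responsive = set(range(1, 11))
retrieved = {1, 2, 3, 4, 5, 6, 101, 102}
r, p = recall_precision(retrieved, responsive)
print(r, p)  # 0.6 recall, 0.75 precision
```

The two measures trade off: retrieving everything gives perfect recall but terrible precision, which is why both are reported.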
Accuracy of Human Review vs. Perfection
[Scatter plot: precision (0% to 100%) vs. recall (0% to 100%) for human-review studies, including Blair & Maron (1985), Voorhees, Roitblat, and several TREC results, all falling well short of perfect recall and precision.]
The Sedona Conference Commentary on the Use of Search and Information Retrieval
“[T]here appears to be a myth that manual review by humans of large amounts of information is as accurate and complete as possible – perhaps even perfect – and constitutes the gold standard by which all searches should be measured. Even assuming that the profession had the time and resources to continue to conduct manual review of massive sets of electronic data sets (which it does not), the relative efficacy of that approach versus utilizing newly developed automated methods of review remains very much open to debate.”
2011 RAND Study re E-Discovery
“Taken together, this body of research shows that groups of human reviewers exhibit significant inconsistency when examining the same set of documents for responsiveness under conditions similar to those in large-scale reviews. . . . Human error in applying the criteria for inclusion appears to be the primary culprit [regarding the lack of accuracy], not a lack of clarity in the document’s meaning or ambiguity in how the scope of the production demand should be interpreted. In other words, people make mistakes, and based on the evidence, they make them regularly when it comes to judging relevancy or responsiveness.”
Ralph Losey Revised EDRM Model
Document Review Models

Outsourced Manual Review
• Most prominent model used today
• Limited culling and analysis
• Heavy reliance on attorney review
• Use of sampling to ensure quality control

Technology Assisted Reviews
• Process approach to review to increase efficiency, recall, and precision, using legally accepted tool sets: threading, near-duping, advanced search, clustering
Predictive Coding • Great deal of confusion regarding what it means • Uses attorneys to develop a seed set of data that can be fed into a black box to find similar documents • Emphasizes sampling of inclusion set and exclusion set • Only a handful of courts have addressed its use
Technology Assisted Review
Meta-Data Context
Boolean Queries
Wildcard expansions
Proximity Specification
Misspellings/Fuzzy Search
Synonyms
Dupe and Near Dupe
Threading
Concept/clustering engines
LSI, LSA, PLSA
Predictive coding
Technology Assisted Reviews: Analytical Stages

Collection
• Working with the client and data to develop a set of defensible “relevance criteria” to select data subject to review

Processing, Filtering and Culling
• Use of search and retrieval at the front end can dramatically reduce the volume and cost
• Risk consideration
• Employ more sophisticated processing tools to further reduce the volume set
• Unilaterally vs. negotiated

Non-Linear Review
• Clustering/concepting
• Threading
• Near-duping
• Predictive coding
E-Mail Threads
• 70% of production is e-mail, and of that nearly 65% or more are part of e-mail threads

The Problem:
• No clear method to identify e-mail threads
• E-mails are reviewed multiple times and inconsistently
• Extremely difficult to identify where missing e-mails exist

Step 1: Group into e-mail sets; build tree structure; identify missing links
Step 2: Suppress duplicates; focus on inclusives

Result: less time, fewer errors, less cost
Source: Equivio
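Equivio's engine is proprietary, but the grouping step can be illustrated with the standard e-mail reply headers; a toy sketch, assuming each message carries an `id` (its Message-ID) and optionally `in_reply_to`:

```python
from collections import defaultdict

def group_threads(messages):
    """Group e-mails into thread sets by walking In-Reply-To links.

    messages -- list of dicts with an 'id' key and optional 'in_reply_to'
    Returns {thread_root_id: [message ids]}. Replies to an id we never
    collected cluster under that unknown id -- which is also how missing
    links in a thread are surfaced.
    """
    parent = {m["id"]: m.get("in_reply_to") for m in messages}

    def root(mid):
        while parent.get(mid) is not None:
            nxt = parent[mid]
            if nxt not in parent:  # missing e-mail: group under its id anyway
                return nxt
            mid = nxt
        return mid

    threads = defaultdict(list)
    for m in messages:
        threads[root(m["id"])].append(m["id"])
    return dict(threads)
```

Once grouped, a reviewer need only read each thread's "inclusive" messages (those whose text is not contained in a later reply), which is where the time savings come from.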
Duplication and Near Duplication
• 15% to 40% of the document population are duplicates or near duplicates

The Problem:
• No clear method to organize and allocate documents across reviewers
• Documents are reviewed multiple times by different reviewers
• High risk of different coding among similar documents

Step 1: Group the near-duplicates; identify the differences among the near duplicates
Step 2: Assign near-dupe sets to reviewers for coherent review; reviewers prioritize and review only the differences; apply coding to entire near-dupe sets where appropriate

Result: less time, fewer errors, less cost
Source: Equivio
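Again the commercial engines are proprietary; a common textbook approach to the grouping step is Jaccard similarity over word shingles. A minimal sketch (the 0.6 threshold is an illustrative assumption, not any product's setting):

```python
def shingles(text, k=3):
    """Set of overlapping k-word shingles from a document's text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two shingle sets (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def near_dupe_groups(docs, threshold=0.6):
    """Greedy single-pass grouping: each document joins the first group
    whose representative it resembles, else starts a new group."""
    groups = []  # list of (representative_shingles, [doc indices])
    for i, text in enumerate(docs):
        s = shingles(text)
        for rep, members in groups:
            if jaccard(s, rep) >= threshold:
                members.append(i)
                break
        else:
            groups.append((s, [i]))
    return [members for _, members in groups]
```

Production tools additionally report *which* passages differ within a group, which is what lets reviewers review only the differences.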
Clustering or Concepting

Concept search places a document (or part of a document) in a semantic space; results are returned in order of relevance, with a higher score indicating a closer document:
• Document 1: 98
• Document 3: 92
• Document 4: 91

Source: K-Cura Corp.
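The vendor's scoring is proprietary; the "higher score = closer document" idea can be illustrated with plain cosine similarity over bag-of-words vectors (the scores this toy produces will not match the slide's):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank(query, docs):
    """Return (doc_index, score) pairs, best match first."""
    q = Counter(query.lower().split())
    scored = [(i, cosine(q, Counter(d.lower().split())))
              for i, d in enumerate(docs)]
    return sorted(scored, key=lambda x: -x[1])
```

Real concept engines (LSI, PLSA, and kin) first project documents into a reduced concept space so that synonyms score as close even with no shared words; the ranking step is the same idea.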
“Predictive Coding”
What is Predictive Coding?
[Diagram: document set for review. Source: Servient Inc., http://www.servient.com/]
Use Cases for Predictive Coding
• Early case assessment
• Relevance inclusion
• Relevance exclusion
• Pre-review tagging
• Pre-review batching
• Privileged review
• Review of incoming productions
• Internal investigations
Limitations on Predictive Coding
As with any statistical model, caution should be exercised (“Torture numbers, and they’ll confess to anything”). Garbage in = garbage out.

Limitations:
• Not right for all types of cases
• Size matters
• Unable to address: images, graphics, Excel files, video, voice
• Confidentiality
Do you Need to Understand the Technology?
“Muddy water is best cleared by leaving it alone.” Alan Wilson Watts
• Very few individuals in the industry will ever understand the technology
• Even fewer people would know how to attack the technology
• Does the technology matter?
• Not all TAR software is created equal: the same seed set put into different TAR software will yield vastly different results
Defending the Technology?
What is the basic underlying technology?
• Support Vector Machines (SVM): patterns are determined and categorized from positive examples (relevant documents) and negative examples (irrelevant documents), and new examples are classified in one category or the other based on whether these patterns appear in them
• Probabilistic Latent Semantic Analysis (PLSA): documents are categorized by detecting concepts through a statistical analysis of word contexts; documents are grouped based on probabilities of the number of times words occur together
• Other potential algorithms that generate correlations and categorizations

What has the vendor done to explain the technology? To defend it?
How can the technology be abused or misused? What are its limitations?

© 2012 Technology Concepts & Design, Inc.
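The common thread in these algorithms is learning a decision function from positive and negative seed examples. A real SVM needs a numerical library, but the flavor can be sketched with a perceptron, a simpler relative that also learns a linear separator (a stand-in for illustration, not any vendor's engine; the documents and labels are hypothetical):

```python
from collections import Counter

def featurize(text):
    """Bag-of-words feature vector for a document."""
    return Counter(text.lower().split())

def train_linear(seed_set, epochs=20):
    """Perceptron: learn one weight per word from coded seed documents.

    seed_set -- list of (text, label), label +1 (relevant) / -1 (irrelevant)
    An SVM learns a similar linear function but picks the maximum-margin
    separator rather than any separator that fits the seeds.
    """
    w = Counter()
    for _ in range(epochs):
        for text, label in seed_set:
            f = featurize(text)
            score = sum(w[t] * c for t, c in f.items())
            if score * label <= 0:           # misclassified: nudge weights
                for t, c in f.items():
                    w[t] += label * c
    return w

def classify(w, text):
    """+1 if the learned function scores the document as relevant."""
    f = featurize(text)
    return 1 if sum(w[t] * c for t, c in f.items()) > 0 else -1
```

The "black box" the slides mention is, at bottom, a trained weight function like `w`; the litigation questions are about how the seeds were chosen and how the function was validated.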
Stages to a Predictive Coding Process?
• Team selection
• Culling
• Selection of control set
• Iterative training of the system
• Selection of sensitivity
• Quality control of corpus
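The middle stages form a loop: train on a batch of attorney-coded documents, measure against the held-out control set, and stop when the measure stabilizes. A schematic sketch, where the stopping window and 1% tolerance are illustrative assumptions, not any vendor's actual stabilization criteria:

```python
def iterative_training(train_batches, control_set, train, evaluate,
                       stability_window=2, tolerance=0.01):
    """Feed coded batches to a model until control-set quality stabilizes.

    train_batches -- iterable of attorney-coded document batches
    control_set   -- held-out coded documents, never used for training
    train(batch)      -- callback: update the model with one coded batch
    evaluate(control) -- callback: quality score (e.g. F1) on the control set
    Stops once the score has moved less than `tolerance` over
    `stability_window` consecutive rounds.
    """
    history = []
    for rounds, batch in enumerate(train_batches, start=1):
        train(batch)
        history.append(evaluate(control_set))
        recent = history[-(stability_window + 1):]
        if len(recent) > stability_window and \
           max(recent) - min(recent) < tolerance:
            return rounds, history
    return len(history), history
```

Keeping the control set out of training is the point of the "Selection of Control Set" stage: quality measured on documents the model trained on would be meaninglessly optimistic.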
The Process
• Who designed, implemented, and supervised the process?
• What should your team look like? Senior partner? Contract attorney?
• How many people should be on the team?
The Process
Selection of the control set?
• Size?
• Random or targeted?
• Entire corpus or issue driven?
• Entire documents or selected portions?
• Richness of the data?

Training the system?
• Iterations?
• How are conflicts resolved?
• Is it more important to focus on inclusive or exclusive documents?
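On the "Size?" question, random control sets are typically sized with the standard sample-size formula for estimating a proportion; a sketch where the 95% confidence, ±2% margin, and worst-case 50% richness are illustrative inputs, not a protocol requirement:

```python
import math

def sample_size(confidence_z=1.96, margin=0.02, richness=0.5):
    """Required random sample size, n = z^2 * p(1-p) / e^2
    (infinite-population approximation).

    confidence_z -- z-score for the confidence level (1.96 ~ 95%)
    margin       -- acceptable margin of error (as a proportion)
    richness     -- expected proportion of relevant documents
                    (0.5 is the conservative worst case)
    """
    return math.ceil(confidence_z ** 2 * richness * (1 - richness)
                     / margin ** 2)

print(sample_size())             # 2401 docs at 95% confidence, ±2%
print(sample_size(margin=0.05))  # 385 docs at 95% confidence, ±5%
```

Note the formula does not depend on corpus size (for large corpora), which is why even very large matters can be sampled with a few thousand documents.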
Stabilization criteria
Source: Equivio
The Process
Sensitivity
Source: Equivio
The Process
Quality control of remaining corpus
Written sampling protocol? How much do you look at? When do you need to retrain?
Predictive Coding Decisions
• Da Silva Moore v. Publicis Groupe, No. 11 Civ. 1279 (S.D.N.Y. Feb. 24, 2012) (Peck): “[t]he Court determined that the use of predictive coding was appropriate considering … the superiority of computer-assisted review to the available alternatives (i.e., linear manual review or keyword searches).”
• Global Aerospace v. Landow Aviation, No. CL 61040 (Va. Cir. Ct. Loudoun Co. Apr. 23, 2012): Virginia state judge approved use of predictive coding where defendant stated that it would achieve recall of 76.7%.
• Kleen Products v. Packaging Corp. (N.D. Ill.) (Nolan): refused plaintiffs' request to force defendants to use predictive coding over search terms.
• In re Actos (Pioglitazone) Products Liability Litigation (W.D. La. 2012) (Doherty): order setting forth a detailed protocol for the use of predictive coding.
Moore v. Publicis & MSL
Judge Peck – “This judicial opinion now recognizes that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases.” “The technology exists and should be used where appropriate, but it is not a case of machine replacing humans: it is the process used and the interaction of man and machine that the court needs to examine.”
Judge Peck’s Key Takeaways
Process Transparency Proportionality Cooperation Competence
Kleen Products, LLC v. Packaging Corp. of America, et al. (J. Nolan), N.D. Ill., Case No. 1:10-cv-05711 (Sept. 9, 2010)

• Class action antitrust suit filed in 2010
• Plaintiffs requested that Defendants use predictive coding (Feb. 2012)
• Defendants (7 paper companies) had already produced over 1 million documents using traditional keyword-based search terms on key custodians; thousands of hours of review (99% complete); no glaring omissions
• 3 days of hearings
Global Aerospace v. Landow Aviation (J. Chamblin), Va. Cir. Ct. (Loudoun), No. CL-61040 (Apr. 23, 2012)

• Protective order: defendants may use predictive coding to process and produce documents.
• Explains that predictive coding meets the duty under Virginia law to use reasonable inquiry and care in discovery.
• Contrasts predictive coding with linear human review and keyword searches.
• Takeaway: such opinions will dominate.
In re Actos (Pioglitazone) Products Liability Litigation, MDL No. 2299 (W.D. La. 2012) (Doherty)

• Court issued ESI protocol utilizing predictive coding (Equivio Relevance)
• Select 4 custodians for creation of a sample collection population; parties to select three “experts” to work collaboratively to train the system
• Parties to meet and confer on the relevance scores generated using the sample collection and to decide on a “cutoff” score
• Iterative training phase until the system reaches stability
• Post-predictive-coding “meet and confer” to finalize the method for searching for documents
• Results still a long way off
Defense of Process
• Legal standard: does not exist
• Documentation vs. transparency
• Transparency: Is it required? How much is too much?
Lessons Learned
• The Sedona Conference Cooperation Proclamation is gaining traction among the judiciary, especially as it applies to TAR/predictive coding.
• Discussions concerning use of predictive coding should occur early and often (e.g., disclosure of seed sets and the process involved; acceptable rates of recall and precision; number of iterations).
• Counsel needs to be cognizant of the strengths and weaknesses of the various TAR/predictive coding software and prepared to discuss how best to implement it.
• Clients should inquire as to the use of predictive coding, its appropriateness in the case at hand, and its cost-saving potential.
• Although predictive coding is not appropriate in all circumstances, courts are beginning to accept its use as a means to handle high-volume, complex litigation, where it can serve to reduce overall costs and increase recall and precision.
Moving Forward
• Expect to see more instances where predictive coding gets a judicial stamp of approval.
• Use of predictive coding continues moving into investigations and review of documents produced by opposing parties, to speed reviews.
• Expect to see more instances where clients push for the cost savings and benefits of using predictive coding.
• More in-depth discussions of predictive coding methodologies, proportionality, and sharing of data between counsel prior to Rule 26(f) conferences, and longer, multi-day conferences as parties try to agree on protocols implementing TAR/predictive coding strategies.
• Focus on the process and the transparency of the software/predictive coding protocol.
• Increased importance of developing highly trained and experienced “experts” to develop sample/seed sets.
• Loss in revenue from linear review, and a shifting law firm approach to embrace new technologies and roles for lawyers and paralegals.
Update on Federal Rule Process
Overview of Rules Process
• Discovery Subcommittee (preservation: triggers, scope, sanctions)
• Duke Subcommittee (proportionality, cooperation, early case management)

Scope of potential amendments
• Rule 1
• Rule 26
• Rule 16
• Rule 37
• Rule 45 (already in progress)
Federal Rules Process
2010
• American College of Trial Lawyers study
• Sedona Conference on the Future of Civil Litigation
• Duke Conference on the Future of Civil Litigation

2011
• Call for a “mini-conference” in September
• Mini-conference occurs in Dallas (Sept. 9)
• Submissions by law firms, corporations, and academia

2012
• FJC Early Stages of Litigation Report
• RAND Report
• Sedona Conference
• Proposed Draft Rules
Microsoft’s Comments
• Preserved: 48,431,250 pages
• Collected and processed: 12,915,000 pages
• Reviewed: 645,750 pages
• Produced: 141,450 pages
• Used: 142 pages
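The funnel shrinks by orders of magnitude at each stage, and the ratios are easy to check from the figures above:

```python
# Page counts from Microsoft's comments to the Rules Committee.
preserved, collected, reviewed, produced, used = (
    48_431_250, 12_915_000, 645_750, 141_450, 142)

# Each stage as a share of what was originally preserved.
for label, n in [("collected", collected), ("reviewed", reviewed),
                 ("produced", produced), ("used", used)]:
    print(f"{label}: {n / preserved:.4%} of preserved pages")

# Exactly 5% of what was collected was ever reviewed.
print(f"review rate: {reviewed / collected:.0%}")
```

In other words, pages actually used in the litigation amount to roughly three ten-thousandths of one percent of the pages preserved, which is the point of the slide.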
Rule 37(e) Proposal

(e) FAILURE TO PRESERVE DISCOVERABLE INFORMATION. If a party fails to preserve discoverable information that reasonably should be preserved in the anticipation or conduct of litigation,
(1) The court may permit additional discovery, order the party to undertake curative measures, or require the party to pay the reasonable expenses, including attorney’s fees, caused by the failure.
(2) The court may impose any of the sanctions listed in Rule 37(b)(2)(A) or give an adverse-inference jury instruction only if the court finds:
(A) that the failure was willful or in bad faith and caused substantial prejudice in the litigation; or
(B) that the failure irreparably deprived a party of any meaningful opportunity to present a claim or defense.
Rule 37(e) Proposal (cont’d)

(3) In determining whether a party failed to preserve discoverable information that reasonably should have been preserved, and whether the failure was willful or in bad faith, the court should consider all relevant factors, including:
(A) the extent to which the party was on notice that litigation was likely and that the information would be discoverable;
(B) the reasonableness of the party’s efforts to preserve the information, including the use of a litigation hold and the scope of the preservation efforts;
(C) whether the party received a request that information be preserved, the clarity and reasonableness of the request, and whether the person who made the request and the party engaged in good-faith consultation regarding the scope of preservation;
(D) the party’s resources and sophistication in litigation;
(E) the proportionality of the preservation efforts to any anticipated or ongoing litigation; and
(F) whether the party sought timely guidance from the court regarding any unresolved disputes concerning the preservation of discoverable information.
What Will Happen?
Questions?
Thank You.