Litigation Support Tip of the Night

DISCO has integrated a proprietary AI system called Cecilia into its eDiscovery platform. I received a demonstration of it this past week, and Cecilia AI shows how artificial intelligence is transforming electronic discovery and how attorneys work with document productions.


Cecilia AI can answer questions based on an analysis of the data set loaded into DISCO, and it can restrict the data it considers in its answers to a smaller subset of that data. In this example, you can see that it identifies the position of an executive whose name appears in the Enron email data set.



Similarly, Cecilia AI can provide definitions for unfamiliar terms with a simple right-click function:



Cecilia AI does not, however, allow a user to provide feedback instructing it to correct a mistake or hallucination, so that the error will not be repeated for other users who may not recognize it. So if an attorney knew that Kenneth Lay had in fact become CEO of Enron in 1984 rather than in 1985, he or she could not tell Cecilia AI to give that answer going forward.


There is an option to reset Cecilia so that it will not base its results on previous inputs made by users of the database.



It does list the documents that it uses as the basis for its answers.


You can ask Cecilia questions about individual documents, such as whether or not a contract has a clause addressing potential damages. DISCO provides Cecilia as a free feature on all databases, and up to 50 questions based on a single document or document summaries can be generated each day without opting for Cecilia to be fully enabled.


Cecilia will not answer questions about documents that are shorter than 300 characters or longer than 250,000 characters. The upper limit is surprising: a short novel like The Adventures of Huckleberry Finn is about 455,000 characters.
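As a rough illustration of those limits, here is a minimal Python sketch that checks whether a document's text falls inside the range Cecilia will accept. The function name and constants are hypothetical, and two details are assumptions: that documents of exactly 300 or exactly 250,000 characters are accepted, and that DISCO counts characters the same way Python's len() does.

```python
# Hypothetical constants taken from the limits described in the post.
MIN_CHARS = 300
MAX_CHARS = 250_000


def within_cecilia_limits(text: str) -> bool:
    """Return True if the text length falls inside the described range.

    Assumption: both boundaries are inclusive. The post only says that
    documents under 300 or over 250,000 characters are not answered.
    """
    return MIN_CHARS <= len(text) <= MAX_CHARS


# A 455,000-character text (about the length cited for a short novel)
# would fall outside the range.
novel_sized_text = "a" * 455_000
novel_is_eligible = within_cecilia_limits(novel_sized_text)
```

A check like this could be run on extracted text before loading, to estimate how many documents in a set would be too short or too long to query individually.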


DISCO's Auto Review uses Cecilia to identify relevant documents for production based on tags in which a subject matter expert simply explains in ordinary language the kind of documents he or she wants to be identified:


Auto Review provides percentages for precision, recall, and prevalence, and breaks down what fraction of document results are associated with a given tag.
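For readers who want to see how these three percentages relate to one another, the sketch below computes them from a confusion matrix using the standard textbook definitions. The counts are hypothetical, and the post does not say how DISCO samples or calculates its figures, so this should not be read as Auto Review's actual method.

```python
def review_metrics(true_pos: int, false_pos: int, false_neg: int, true_neg: int) -> dict:
    """Compute precision, recall and prevalence from review counts.

    precision  = relevant documents found / all documents tagged
    recall     = relevant documents found / all relevant documents
    prevalence = all relevant documents / all documents
    """
    total = true_pos + false_pos + false_neg + true_neg
    return {
        "precision": true_pos / (true_pos + false_pos),
        "recall": true_pos / (true_pos + false_neg),
        "prevalence": (true_pos + false_neg) / total,
    }


# Hypothetical counts for a 1,000-document set.
metrics = review_metrics(true_pos=80, false_pos=20, false_neg=20, true_neg=880)
```

With these made-up counts, 80 of the 100 tagged documents are relevant, 80 of the 100 relevant documents were found, and 100 of the 1,000 documents are relevant overall.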




Cecilia can tell a user the custodian for a specific document, and it can run searches for document-based queries made in ordinary English such as, "Show all documents between July 4, 2021 and December 25, 2023", or, "How many email messages are in this database?"


 
 

Excel includes an optional add-in named 'Solver' (which can be enabled from File . . . Options . . . Add-ins) that you can use to automatically alter the values in a range that a formula depends on, so that the formula returns the result you want.


When activated, it will appear on the Data ribbon in a new group called 'Analyze'.


We begin by selecting a cell which has the result of a formula in it, then opening Solver. In this example, cell I16 holds the sum of the home runs listed on rows 2 to 15 in column I, which is 223.


We indicate that we want the result to be 300 in the box next to the 'Value Of' radio button.



We want to specify that the target should be reached by changing only the totals for players who hit 15 or more home runs, and that none of these players should be given more than 50 home runs.



After clicking 'Solve', the following update to the number of home runs is generated.



However, we obviously can't use this result because it's not possible to hit 0.2 home runs in baseball.


We can fix the results by specifying in Solver that each entry in column I should be an integer by selecting the option for 'int' in the drop-down menu between the cell references and the constraint box.







Now when we click 'Solve' only whole numbers are listed in the array that gives us a sum of 300 home runs:
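Outside Excel, the same kind of constrained problem can be sketched in a few lines of Python. This is not Solver's algorithm. It is a simple round-robin fill that adds one home run at a time to each adjustable player until the target sum is reached, which guarantees whole numbers. The home run totals below are hypothetical (chosen only so that they add up to 223), and the function and parameter names are made up for illustration.

```python
def raise_sum_to_target(values, target, min_to_change=15, cap=50):
    """Raise integer values until they sum to target.

    Only entries that start at min_to_change or more are adjusted,
    and no adjusted entry is allowed to exceed cap.
    """
    result = list(values)
    adjustable = [i for i, v in enumerate(result) if v >= min_to_change]
    remaining = target - sum(result)
    capacity = sum(max(0, cap - result[i]) for i in adjustable)
    if remaining < 0 or remaining > capacity:
        raise ValueError("target cannot be reached under these constraints")
    while remaining > 0:
        # Add one to each adjustable entry in turn, skipping capped ones.
        for i in adjustable:
            if remaining == 0:
                break
            if result[i] < cap:
                result[i] += 1
                remaining -= 1
    return result


# Hypothetical totals for 14 players; they sum to 223.
home_runs = [44, 32, 27, 24, 21, 18, 15, 12, 9, 8, 6, 4, 2, 1]
adjusted = raise_sum_to_target(home_runs, target=300)
```

Solver will usually return a different set of whole numbers, since many combinations satisfy the same constraints. The point of the sketch is only that the three constraints (adjustable cells, an upper bound, and integers) are enough to define a valid answer.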



 
 

Excel's FILTER function, entered once in a single cell, can return results in multiple cells. It searches for a value in one column and returns the data from the complete range wherever there's a hit for that value.


So in this example, the FILTER function entered in cell G2 returns from the range A2:D8 those entries where 'Tom' is listed as the sales rep in column B.
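The logic of that basic FILTER call can be mirrored in Python as a way of checking what it should return. The sales rows below are hypothetical stand-ins for the range A2:D8, and the function name is made up. The comparison ignores case, on the assumption that the worksheet formula compares column B to "Tom" with Excel's equals operator, which is not case-sensitive.

```python
# Hypothetical data standing in for A2:D8 (date, sales rep, region, amount).
sales = [
    ("2024-01-05", "Tom", "East", 1200),
    ("2024-01-06", "Ann", "West", 950),
    ("2024-01-09", "Tom", "West", 400),
    ("2024-01-11", "Raj", "East", 780),
    ("2024-01-12", "Ann", "East", 1310),
    ("2024-01-15", "tom", "North", 615),
    ("2024-01-18", "Raj", "West", 220),
]


def filter_rows(rows, column_index, value):
    """Return every complete row whose chosen column matches value.

    The match ignores case, like an Excel comparison with "=".
    """
    wanted = value.casefold()
    return [row for row in rows if str(row[column_index]).casefold() == wanted]


tom_rows = filter_rows(sales, column_index=1, value="Tom")
```

As with FILTER, the whole row is returned for each hit, not just the matching cell.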





If you need to search for where a word or phrase appears in multiple cells in a row, you can use SEARCH nested in an ISNUMBER formula, and return the full contents of any cell which contains the searched-for string. [See the explanation for this formula in the Tip of the Night for September 20, 2021.] So in this example, we want to see which cells in the range from columns B to D contain references to the painter named in column J.



The formula is composed by searching for the value in cell J1, on the second row between columns B and D, and then filtering down the results from that range.


=IFERROR(FILTER($B2:$D2,ISNUMBER(SEARCH(J$1,$B2:$D2))),"")



The IFERROR function has the effect of suppressing the '#CALC!' error which would result if the FILTER function did not find a result. Entered this way, the FILTER function will return multiple hits from the cited range in multiple cells to the right of the column in which it is entered. When the complete formula is copied to the right to search for the strings entered in cells K1 and L1, however, a #SPILL! error can result, because the cells that the first formula needs to spill into are now occupied. (We use absolute references for columns B and D by entering dollar signs, so that each copy of the formula still points to the correct array.)


To avoid this problem, we nest the formula in a TEXTJOIN function like this, so that every hit in the range from column B to column D gets entered in a single cell.


=IFERROR(TEXTJOIN("; ",TRUE,FILTER($B2:$D2,ISNUMBER(SEARCH(J$1,$B2:$D2))),""),"")


We can collect the results returned for all 10 rows by then using a simple TEXTJOIN function, as described in the post for June 29, 2024.
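The combined SEARCH, FILTER and TEXTJOIN logic can be approximated in Python to show what each cell should end up containing. The sample rows are hypothetical, and the function names are made up. Like SEARCH, the match ignores case, but this sketch does not handle the wildcard characters that SEARCH accepts.

```python
def matching_cells(row, term):
    """Return the full contents of each cell that contains term.

    Case is ignored, and empty cells are skipped.
    """
    needle = term.casefold()
    return [cell for cell in row if cell and needle in cell.casefold()]


def join_hits(row, term, delimiter="; "):
    """Join all hits in one row into a single string.

    Returns an empty string when nothing matches, like the IFERROR fallback.
    """
    return delimiter.join(matching_cells(row, term))


# Hypothetical contents for columns B to D on two rows.
rows = [
    ["Monet exhibit", "Notes on Picasso", "monet and Renoir"],
    ["Degas sketches", "", "Letter about Monet"],
]

per_row = [join_hits(row, "Monet") for row in rows]
# Collect the non-empty results from every row into one string.
all_hits = "; ".join(hit for hit in per_row if hit)
```

Joining the hits into a single string is what removes the need for spill space, which is why the TEXTJOIN version can be copied across columns K and L without overwriting anything.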



 
 

Sean O'Shea has more than 20 years of experience in the litigation support field with major law firms in New York and San Francisco.   He is an ACEDS Certified eDiscovery Specialist and a Relativity Certified Administrator.

The views expressed in this blog are those of the owner and do not reflect the views or opinions of the owner’s employer.

If you have a question or comment about this blog, please make a submission using the form to the right. 


© 2015 by Sean O'Shea . Proudly created with Wix.com
