In Matter of Weber (October 2024), a recent New York trust case, the court addressed an expert's use of artificial intelligence in a valuation determination. The decision states:
Use of Artificial Intelligence
[A] portion of his testimony bears
further and separate discussion as it relates to an emerging issue that trial
courts are beginning to grapple with and for which it does not appear that a
bright-line rule exists.
Specifically, the testimony revealed that Mr. Ranson relied
on Microsoft Copilot, a large language model generative artificial intelligence
chatbot, in cross-checking his calculations. Despite his reliance on artificial
intelligence, Mr. Ranson could not recall what input or prompt he used to
assist him with the Supplemental Damages Report. He also could not state what
sources Copilot relied upon and could not explain any details about how Copilot
works or how it arrives at a given output. There was no testimony on whether
these Copilot calculations considered any fund fees or tax implications.
The Court has no objective understanding as to how Copilot
works, and none was elicited as part of the testimony. To illustrate the
concern with this, the Court entered the following prompt into Microsoft
Copilot on its Unified Court System (UCS) issued computer: "Can you
calculate the value of $250,000 invested in the Vanguard Balanced Index Fund
from December 31, 2004 through January 31, 2021?" and it returned a value
of $949,070.97 — a number different than Mr. Ranson's. Upon running this same
query on two (2) additional UCS computers, it returned values of $948,209.63
and a little more than $951,000.00, respectively. While these resulting
variations are not large, the fact that there are variations at all calls into
question the reliability and accuracy of Copilot to generate evidence to be
relied upon in a court proceeding.
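The court's experiment is easy to appreciate once one notes that this kind of valuation is deterministic: given a fixed return series, the same formula yields the same number every time, so any run-to-run variation signals that something other than arithmetic is happening. The short Python sketch below makes the point. The growth rate in it is not from the record; it is back-solved from the court's first Copilot figure purely for illustration, and it ignores fund fees and taxes, as the decision itself notes.

```python
# Deterministic check of the court's compound-growth example.
# NOTE: the growth rate here is NOT in the record; it is back-solved from
# the court's first Copilot figure ($949,070.97) purely for illustration,
# and it ignores fund fees and taxes, as the decision observes.

principal = 250_000.00
copilot_value = 949_070.97          # output on the first UCS computer
years = 16 + 1 / 12                 # Dec 31, 2004 through Jan 31, 2021

# Implied compound annual growth rate (CAGR) for that ending value.
implied_cagr = (copilot_value / principal) ** (1 / years) - 1
print(f"Implied CAGR: {implied_cagr:.2%}")   # about 8.65% per year

# A fixed formula over fixed inputs reproduces the same value on every
# run -- the property the three divergent Copilot answers lacked.
for run in range(1, 4):
    value = principal * (1 + implied_cagr) ** years
    print(f"Run {run}: ${value:,.2f}")       # identical each time
```

The point is not the rate itself but the audit trail: a reproducible calculation discloses its inputs and formula and can be re-run to the penny, which is exactly what the three divergent Copilot answers could not offer.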
Interestingly, when asked the following question: "are
you accurate", Copilot generated the following answer: "I aim to be
accurate within the data I've been trained on and the information I can find
for you. That said, my accuracy is only as good as my sources, so for
critical matters, it's always wise to verify."
When asked "are you reliable", Copilot responded
with: "[y]ou bet. When it comes to providing information and engaging in
conversation, I do my best to be as reliable as possible. However, I'm also
programmed to advise checking with experts for critical issues. Always good to
have a second opinion!" When the follow-up question of "are your
calculations reliable enough for use in court" was asked, Copilot
responded with "[w]hen it comes to legal matters, any calculations or data
need to meet strict standards. I can provide accurate info, but it should
always be verified by experts and accompanied by professional evaluations
before being used in court . . . "
It would seem that even Copilot itself self-checks and relies
on human oversight and analysis. It is clear from these responses that the
developers of the Copilot program recognize the need for its supervision by a
trained human operator to verify the accuracy of the submitted information as
well as the output.
Mr. Ranson was adamant in his testimony that the use of
Copilot or other artificial intelligence tools for drafting expert reports is
generally accepted in the field of fiduciary services and represents the future
of analysis of fiduciary decisions; however, he could not name any publications
regarding its use or any other sources to confirm that it is a generally
accepted methodology.
It has long been the law that New York State follows
the Frye standard for scientific evidence and expert
testimony, in that the same is required to be generally accepted in its
relevant field (see Frye v. United States, 293 F. 1013 [D.C.
Cir. 1923]).
The use of artificial intelligence is a rapidly growing
reality across many industries. The mere fact that artificial intelligence has
played a role, which continues to expand in our everyday lives, does not make
the results generated by artificial intelligence admissible in Court. Recent
decisions show that Courts have recognized that due process issues can arise
when decisions are made by a software program, rather than by, or at the
direction of, the analyst, especially in the use of cutting-edge technology (People
v Wakefield, 175 AD3d 158 [3d Dept 2019]). The Court of Appeals has
found that certain industry specific artificial intelligence technology is
generally accepted (People v. Wakefield, 38 NY3d 367 [2022] [allowing
artificial intelligence assisted software analysis of DNA in a criminal case]).
However, Wakefield involved a full Frye hearing that
included expert testimony that explained the mathematical formulas, the
processes involved, and the peer-reviewed published articles in scientific
journals. In the instant case, the record is devoid of any evidence as to the
reliability of Microsoft Copilot in general, let alone as it relates to how it
was applied here. Without more, the Court cannot blindly accept as accurate
calculations which are performed by artificial intelligence. As such, the Court
makes the following findings with regard to the use of artificial intelligence
in evidence sought to be admitted.
In reviewing cases and court practice rules from across the
country, the Court finds that "Artificial Intelligence"
("A.I.") is properly defined as being any technology that uses
machine learning, natural language processing, or any other computational
mechanism to simulate human intelligence, including document generation,
evidence creation or analysis, and legal research, and/or the capability of
computer systems or algorithms to imitate intelligent human behavior. The Court
further finds that A.I. can be either generative or assistive in nature. The
Court defines "Generative Artificial Intelligence" or
"Generative A.I." as artificial intelligence that is capable of
generating new content (such as images or text) in response to a submitted
prompt (such as a query) by learning from a large reference database of
examples. A.I. assistive materials are any document or evidence prepared with
the assistance of A.I. technologies, but not solely generated thereby.
In what may be an issue of first impression, at least in
Surrogate's Court practice, this Court holds that, given the rapid evolution
of artificial intelligence and its inherent reliability issues, counsel has an
affirmative duty to disclose the use of artificial intelligence before evidence
generated by an artificial intelligence product or system is introduced, and
the evidence sought to be admitted should properly be subject to
a Frye hearing prior to its admission, the scope of which should be
determined by the Court, either in a pre-trial hearing or at the time the
evidence is offered.