Wednesday, January 22, 2025

Use of AI, Including Large Language Models (LLMs), in Tax Court Brief Writing (And Really Other Legal Analysis) (1/22/25)

AI (artificial intelligence) is ubiquitous now; or at least the discussion of AI is ubiquitous. See generally Artificial intelligence. (2025, January 22), Wikipedia, here. I asked ChatGPT about use of AI by lawyers and received the response linked here. I write today about some instances, recently called to my attention, of misuse of AI in Tax Court briefing; I understand that similar misuse has been identified in briefing in other courts.

Use of AI in legal briefing has received considerable attention, from general discussion of the strengths and weaknesses to specific instances where lawyers have been called out when they used AI that failed. E.g., Is AI a Good Tool for Legal Brief Writing? (Spellbook 10/22/24), here (general discussion, but noting, as relevant to today’s blog, that “AI tools can sometimes "hallucinate" information and generate fake citations that human lawyers must carefully check.”); What Are the Best AI Tools for Writing Legal Briefs? (Bloomberg Law 6/10/24), here (noting that AI in large language models (“LLM”) can produce “false information” via what are called “hallucinations;” and that, as a result, “21 federal trial judges have issued standing orders regarding AI, and attorneys are often required to disclose all uses of AI.”). Suffice it to say that, as I understand it, AI-generated content must be carefully checked and appropriately revised before it is included in a brief submitted to the court. (This is confirmed by my limited use of AI as discussed at the end of this blog.)

The Tax Court has no formal rule addressing the use of AI. However, a reader recently advised me of two Tax Court Orders by Judge Buch addressing the issue. Thomas v. Commissioner (T.C. Dkt. No. 10795-22 at #36 Order dated 10/23/24), here; and Westlake Housing, L.P. v. Commissioner (T.C. Dkt. No. 478-24L at #32 Order dated 1/13/25), here. (I have posted both orders to my Google Docs to permit a permalink that readers can directly access without having to go through the DAWSON docket sheet, which does not offer a permalink for direct access to the orders.)

Thomas is a short order (5 pages); Westlake is even shorter (2 pages). I discuss Thomas in some detail. The Court (Judge Buch) sets the issue up in its opening paragraph:

          This case was tried on September 17, 2024, in Atlanta, Georgia. In preparing for trial, the Court noticed that some of the authorities cited in petitioner’s Pretrial Memorandum did not exist, evidencing possible AI hallucinations. To inquire into these authorities, the Court held a hearing to provide petitioner’s counsel an opportunity to clarify the Pretrial Memorandum. During that hearing, petitioner’s counsel explained that someone else had prepared the Pretrial Memorandum, and she did not review the work that was provided to her. Rule 33 instructs that, in signing a pleading, counsel is certifying that he or she has read the pleading, that it is well grounded in fact; and that it is warranted by existing law. Because the Pretrial Memorandum violates this standard, we will deem it to be stricken. We will also take this occasion to address the use of AI as a tool to assist petitioners and practitioners. As discussed below, however, striking the Pretrial Memorandum will not affect the ultimate outcome in this case.

After then summarizing nicely the role of the Pretrial Memorandum (pp. 1 & 2), the Court noted:

After citing the applicable Internal Revenue Code sections and Treasury regulations, the petitioner’s Pretrial Memorandum went on to discuss our caselaw, stating in part:

In the case of Schluter v. Commissioner, T.C. Memo 1998-269, the Tax Court held that an employee who was required to submit business expenses for reimbursement, but who was not reimbursed by the employer, was entitled to deduct those expenses. Similarly, in Meneguzzo v. Commissioner, T.C. Memo 1969-15, the Court allowed deductions where the employer had an obligation to reimburse the employee, but reimbursement was not made.

          And after accurately stating the rules regarding burden of proof, the Pretrial Memorandum continued, stating:

However, as demonstrated in Gagliardi v. Commissioner, T.C. Memo 2011-194, if the taxpayer provides credible evidence that the expenses were incurred and not reimbursed, the burden may shift back to the IRS to prove that the disallowance of the deduction was correct.

In preparation for trial of this case, the Court observed that none of the cases referenced in the passages quoted above exist as cited. Moreover, neither the named cases nor their accompanying citations stand for the propositions for which they were cited. Petitioner’s counsel cited Schluter v. Commissioner, T.C. Memo. 1998-269. But Schluter v. Commissioner is actually T.C. Memo. 1970-67, a dependency exemption case. And T.C. Memo. 1998-269 is actually Schmitt v. Commissioner, a method of accounting case. Petitioner’s counsel cited Meneguzzo v. Commissioner, T.C. Memo 1969-15. But Meneguzzo v. Commissioner is actually 43 T.C. 824 (1965), a tip reporting case. And T.C. Memo. 1969-15 is actually B-E-C-K McLaughlin & Assoc. v. Renegotiation Board, an excess profits case. Petitioner’s counsel cited Gagliardi v. Commissioner, T.C. Memo. 2011-194. But Gagliardi v. Commissioner is actually T.C. Memo. 2008-10, a gambling loss case. And T.C. Memo. 2011-194 is actually Layton v. Commissioner, a collection case.

          After trial of this case, the Court informed petitioner’s counsel that it had been unable to locate three of the four cases cited in the Pretrial Memorandum, provided those citations to petitioner’s counsel, and set a hearing for the following day to address those citations. During that hearing, petitioner’s counsel stated that she had recently joined a new law firm and had relied on a new paralegal to draft the Pretrial Memorandum. Petitioner’s counsel stated that she did not review what the paralegal had prepared. The Court specifically inquired whether petitioner’s counsel had used a so-called artificial intelligence platform or large language model to prepare a portion of the Pretrial Memorandum. Petitioner’s counsel stated that she had not; but it was also apparent to the Court that petitioner’s counsel was unaware whether her paralegal might have used such a tool to assist in drafting the Pretrial Memorandum. 

Then in the Discussion (beginning on p. 3), the Court covers the requirements of Tax Court Rule 33, here, requiring the signing counsel to certify that (i) he or she has read the pleading, and (ii) on “information, and belief formed after reasonable inquiry, it is well grounded in fact and is warranted by existing law or by a nonfrivolous argument for extending, modifying, or reversing existing law or for establishing new law.”

The Court then, under the heading “The Use of AI” (pp. 3-4), offers the following: 

The Use of AI

          We take this occasion to address the use of AI. On this record, it is unclear whether, or to what extent, some form of AI may have been used to assist in preparing petitioner’s Pretrial Memorandum. But the Pretrial Memorandum has the hallmarks of a document prepared with the assistance of a large language model.

          Large language models have the ability to give the appearance of understanding text and generating what appear to be thoughtful responses. In reality, a language model “generates its response by selecting the most probable sequence of tokens that follow the prompt’s tokens; therefore, it essentially functions as a probability distribution over these tokens.” Matthew Dahl et al., Large Legal [*4] Fictions: Profiling Legal Hallucinations in Large Language Models, 16 Journal of Legal Analysis 64, 66 (2024). This description of a language model applies equally to a large language model, except that it has a larger set of reference parameters and a larger set of training data. At its core, however, large language models function much the same as text prediction when typing a message on a smartphone; they make their best guess as to what the next word or string of words should be.

          Large language models are prone to what have been termed “hallucinations.” An AI hallucination occurs when a large language model perceives a word pattern and generates output that is inaccurate or even nonsensical. The problem arises because “LLMs are liable to generate language that is inconsistent with current legal doctrine and case law, and, in the legal field, where adherence to authorities is paramount, unfaithful or imprecise interpretations of the law can lead to nonsensical—or worse, harmful and inaccurate—legal advice or decisions.” Id. at 64.

          The Court is not in the business of dictating to attorneys the extent to which they can or should rely on advancing technology to assist them in representing their clients. For example, in Dynamo Holdings L.P. v. Commissioner, 143 T.C. 183 (2014), we were asked to opine on a party’s use of predictive coding to assist it in responding to discovery. We noted that “it is a proper role of the Court to supervise the discovery process and intervene when it is abused by the parties, [but that] the Court is not normally in the business of dictating to parties the process that they should use when responding to discovery.” Id., 143 T.C. at 188. In that case, the parties asked the Court to consider whether document review could be done with the assistance of computers. Although we approved the use of predictive coding, in doing so, we directed the parties to the normative rules for discovery when evaluating the end product of the discovery as distinguished from the means to reach that end. Id., 143 T.C. at 194.

          We take a similar view with respect to the use of AI and large language models. As with any tool that assists lawyers or litigants in preparing their cases, “AI [has the] potential to increase access to justice, particularly for litigants with limited resources. … For those who cannot afford a lawyer, AI can help. … These tools have the welcome potential to smooth out any mismatch between available resources and urgent needs in our court system.” John G. Roberts, Jr., 2023 Year-End Report on the Federal Judiciary at 5 (2023). But the Chief Justice went on to provide a warning.

          But any use of AI requires caution and humility. One of AI’s prominent applications made headlines this year for a shortcoming known as ‘hallucination,’ which caused the lawyers using the application to submit briefs with citations to non-existent cases.

This is what appears to have happened here, as well.
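The Court's comparison of large language models to smartphone text prediction can be illustrated with a minimal Python sketch. It is a toy only: the three-sentence corpus and the function names are hypothetical, and counting word pairs is far simpler than what a real large language model does. What it shares with the Court's description is that it picks the most probable next word from patterns in its training text, with no step that checks whether the resulting statement is true.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus (hypothetical; real models train on vastly more text).
CORPUS = (
    "the court held that the taxpayer was entitled to deduct the expenses . "
    "the court held that the taxpayer was not entitled to the credit . "
    "the court allowed the deduction ."
)

def build_bigram_counts(text):
    """Count, for each word, which words follow it and how often."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def next_word_distribution(counts, word):
    """Estimate the probability of each word observed right after `word`."""
    followers = counts.get(word, Counter())
    total = sum(followers.values())
    return {w: n / total for w, n in followers.items()}

def most_likely_next(counts, word):
    """Return the single most probable next word, or None if `word` is unseen."""
    dist = next_word_distribution(counts, word)
    if not dist:
        return None
    return max(dist, key=dist.get)

counts = build_bigram_counts(CORPUS)
print(next_word_distribution(counts, "court"))
print(most_likely_next(counts, "court"))
```

In this toy corpus, "court" is followed by "held" twice and "allowed" once, so the sketch guesses "held". A guess of that kind can read fluently and still be wrong, which is the hallucination problem the order describes.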

The Court imposed the following remedy (pp. 4-5):

           The circumstances of this case warrant a minimal sanction. We begin by noting that, substantively, the Pretrial Memorandum accurately stated the law, even if its case citations were erroneous. Substantively, it provided the Court with information that was useful in preparing for trial, although Court resources were diverted in attempts to track down the erroneous citations.

          We are also mindful that, in representing Mr. Thomas, petitioner’s counsel is serving the type of petitioner who is often left unserved by the legal community. This case has less than $10,000 at issue (not including interest), and Mr. Thomas reported a little over $50,000 of income per year. This amount was likely sufficient to make him ineligible for assistance by a low income taxpayer clinic. See, https://www.taxpayeradvocate.irs.gov/about-us/low-income-taxpayer-clinics-litc/ (last visited Sept. 25, 2024) (stating that the income ceiling for qualification for clinic assistance for a family of two is $51,100). And the amount at issue makes it economically difficult for a petitioner such as Mr. Thomas to justify paying for an attorney to represent him. Small firms bill at rates in excess of $300 per hour. Themis Solutions Inc., 2024 Legal Trends for Solo and Small Law Firms (Clio, 2024). And large firm rates average just under $1000 per hour with partner rates at the largest law firms approaching $1500 per hour. Michael Dineen & Sarah Scales, Hourly Rates in Am Law 100 Firms: Increases and Key Drivers (Brightflag Inc., 2023).

          Given this unique situation, we will take the symbolic action of deeming the Pretrial Memorandum to be stricken. This action is “symbolic” because it will not impose an economic burden on either petitioner or his counsel. The applicable law in this case is clear and the facts presented at trial were likewise clear. And the outcome of this case is unaffected by striking the Pretrial Memorandum. The Court is reluctant, in this situation, to impose a pecuniary cost on petitioner’s counsel, who is serving a petitioner who is in an economic “no man’s land” and who might otherwise have gone unserved. Thus, consistent with the foregoing, it is

          ORDERED that petitioner’s Pretrial Memorandum (doc. no. 28) is deemed stricken.

This was a minimal sanction because it imposes no monetary penalty on petitioner's counsel and does not prejudice the petitioner. But it is a warning that all practitioners should keep in mind, both in using AI for brief writing and in meeting their obligations to the court and to the parties they represent.
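The mismatch the Court found in Thomas (a real case name attached to a citation that belongs to a different case) is the kind of error a simple cross-check can surface. The following is a minimal Python sketch under stated assumptions: the lookup table contains only the pairings reported in Judge Buch's order, and the function name and return strings are hypothetical. An actual check would consult a legal research database, which this sketch does not do.

```python
# Citation -> case name actually found at that citation.
# Entries are taken from the Thomas order quoted above; this table is an
# illustrative stand-in for a real legal research database.
KNOWN_CITATIONS = {
    "T.C. Memo. 1998-269": "Schmitt v. Commissioner",
    "T.C. Memo. 1970-67": "Schluter v. Commissioner",
    "T.C. Memo. 1969-15": "B-E-C-K McLaughlin & Assoc. v. Renegotiation Board",
    "43 T.C. 824": "Meneguzzo v. Commissioner",
    "T.C. Memo. 2011-194": "Layton v. Commissioner",
    "T.C. Memo. 2008-10": "Gagliardi v. Commissioner",
}

def check_citation(case_name, citation):
    """Compare the cited case name with the case found at that citation."""
    found = KNOWN_CITATIONS.get(citation)
    if found is None:
        # Not in the table: a human must look it up.
        return "not found: verify manually"
    if found != case_name:
        return f"mismatch: {citation} is {found}"
    return "ok"

# The three citations from petitioner's Pretrial Memorandum.
for name, cite in [
    ("Schluter v. Commissioner", "T.C. Memo. 1998-269"),
    ("Meneguzzo v. Commissioner", "T.C. Memo. 1969-15"),
    ("Gagliardi v. Commissioner", "T.C. Memo. 2011-194"),
]:
    print(name, "|", cite, "|", check_citation(name, cite))
```

Even an "ok" result shows only that the name and citation match. It says nothing about whether the case stands for the proposition cited, so the attorney still has to read the case.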

I forgo a similar analysis for Westlake which, in only two pages, reports similar miscitations and requires that petitioner, really petitioner’s counsel, supplement its Motion to Remand.

JAT Comment:

My limited use of AI tools, solely for summarizing documents (including cases and legal articles) and not for brief writing, suggests to me that at least the tools I used (including ChatGPT, Co-Pilot, and NotebookLM) require significant review and revision to capture important nuance in the documents they summarize. Perhaps the lawyer’s most important tool of the trade is the ability to spot and present nuance. The tools I used have not achieved that ability yet and, in any event, for important documents such as court briefs, attorneys are still required to read and understand the output of AI tools and make revisions as appropriate to the particular document.

Added 1/22/25 2pm:
 
1. I just received a link to Charity A. Anastasio, 10 Tips to Ease into Generative AI Use for Lawyers in 2025 (ABA Law Practice 1/13/25), here (accessing the content at this link may require ABA membership). The suggestions are basic and directed to those just getting into AI.
 
2. I have two examples of my use of AI to summarize an article that I had written. SSRN Paper: Loper Bright Is the Law But Poor Statutory Interpretation (Federal Tax Procedure Blog 10/28/24; 1/20/25), here. At the end of the body of that post, I present the MS Co-Pilot summary, which required significant revision for nuance. I then link to the NotebookLM summary, which was much more detailed and nuanced, and close enough that I made no revisions.
