Few-shot learning helps LLMs deliver better outputs with just a few examples -- no fine-tuning necessary
The new era of generative AI is increasingly proving that general trumps specific. What exactly do we mean by that?
For years, the thinking among machine learning researchers was that by specializing a model – meaning training it for niche use cases – the model would get better and better at serving that specific niche. Many legal technologies were developed on exactly this premise, requiring firms to provide thousands or even hundreds of thousands of data points to build a specific model for, say, extracting entities from contracts. Even the leader of the new AI revolution – OpenAI itself – went down this path and built a specialized model called “Codex” for the niche use case of generating code.
But when OpenAI announced its new general purpose GPT 4 series of models, it also decided to shut down Codex. Why? Because it turns out that the general purpose GPT 4 was better at writing code than the specialized Codex. Similarly, researchers have demonstrated that GPT models significantly outperform specialized models that have been trained on massive amounts of niche data for use cases like entity extraction.
That’s not to say that “fine-tuning” a model never has a place but you can often get shockingly good results with a general purpose model like GPT 4 … and a few good examples.
That’s where “few shot” learning comes in. Instead of fine-tuning a model on thousands of specialized data points (which, by the way, gets really expensive), you can simply provide a few examples in your prompt and that can significantly improve the accuracy of your outputs.
To understand this, let’s try writing a prompt that summarizes a statute:
System Message:
1. Carefully review the statute provided by the user.
2. Summarize the statute and include citations in the summary.
For example, if the statue provide by the user was as follows:
STATUTE:
“18 U.S. Code § 981 - Civil forfeiture
(a)
(1)The following property is subject to forfeiture to the United States:
(A)Any property, real or personal, involved in a transaction or attempted transaction in violation of section 1956, 1957 or 1960 of this title, or any property traceable to such property.
(B)Any property, real or personal, within the jurisdiction of the United States, constituting, derived from, or traceable to, any proceeds obtained directly or indirectly from an offense against a foreign nation, or any property used to facilitate such an offense, if the offense—
(i)involves trafficking in nuclear, chemical, biological, or radiological weapons technology or material, or the manufacture, importation, sale, or distribution of a controlled substance (as that term is defined for purposes of the Controlled Substances Act), or any other conduct described in section 1956(c)(7)(B);
(ii)would be punishable within the jurisdiction of the foreign nation by death or imprisonment for a term exceeding 1 year; and
(iii)would be punishable under the laws of the United States by imprisonment for a term exceeding 1 year, if the act or activity constituting the offense had occurred within the jurisdiction of the United States.
(C)Any property, real or personal, which constitutes or is derived from proceeds traceable to a violation of section 215, 471, 472, 473, 474, 476, 477, 478, 479, 480, 481, 485, 486, 487, 488, 501, 502, 510, 542, 545, 656, 657, 670, 842, 844, 1005, 1006, 1007, 1014, 1028, 1029, 1030, 1032, or 1344 of this title or any offense constituting “specified unlawful activity” (as defined in section 1956(c)(7) of this title), or a conspiracy to commit such offense”
A good summary of the statute would look like this:
SUMMARY:
Statute: 18 U.S. Code § 981 - Civil forfeiture
Summary: This statue describes the three types of property that are subject to forfeiture in the United States. The first type is property, involved in a transaction or attempted transaction in violation of certain sections of the title, or any property traceable to such property. See 18 U.S. Code § 981(a)(1)(A). The second type is property within the jurisdiction of the United States, constituting, derived from, or traceable to, any proceeds obtained directly or indirectly from an offense against a foreign nation, or any property used to facilitate such an offense. See 18 U.S. Code § 981(a)(1)(B).
User Message:
STATUTE:
“18 U.S. Code § 111 - Assaulting, resisting, or impeding certain officers or employees
(a)In General.—Whoever—
(1)forcibly assaults, resists, opposes, impedes, intimidates, or interferes with any person designated in section 1114 of this title while engaged in or on account of the performance of official duties; or
(2)forcibly assaults or intimidates any person who formerly served as a person designated in section 1114 on account of the performance of official duties during such person’s term of service,
shall, where the acts in violation of this section constitute only simple assault, be fined under this title or imprisoned not more than one year, or both, and where such acts involve physical contact with the victim of that assault or the intent to commit another felony, be fined under this title or imprisoned not more than 8 years, or both.
(b)Enhanced Penalty.—
Whoever, in the commission of any acts described in subsection (a), uses a deadly or dangerous weapon (including a weapon intended to cause death or danger but that fails to do so by reason of a defective component) or inflicts bodily injury, shall be fined under this title or imprisoned not more than 20 years, or both.
(c)Extraterritorial Jurisdiction.—
There is extraterritorial jurisdiction over the conduct prohibited by this section.”
SUMMARY:
Completion:
Statute: 18 U.S. Code § 111 - Assaulting, resisting, or impeding certain officers or employees
Summary: This statute outlines the penalties for forcibly assaulting, resisting, opposing, impeding, intimidating, or interfering with any person designated in section 1114 of this title while they are performing their official duties or for any person who formerly served as a person designated in section 1114 on account of the performance of official duties during their term of service. The penalties vary depending on the severity of the act, with a maximum of one year imprisonment for simple assault, and up to eight years for acts involving physical contact or the intent to commit another felony. See 18 U.S. Code § 111(a). The statute also provides for an enhanced penalty of up to 20 years imprisonment if a deadly or dangerous weapon is used or bodily injury is inflicted during the commission of the acts. See 18 U.S. Code § 111(b). The statute also establishes extraterritorial jurisdiction over the conduct prohibited by this section. See 18 U.S. Code § 111(c).
Now watch what happens to the completion when we remove the example from the system message:
Completion:
The statute, 18 U.S. Code § 111, pertains to the assault, resistance, or impediment of certain officers or employees. It states that anyone who forcibly assaults, resists, opposes, impedes, intimidates, or interferes with any person designated in section 1114 of this title while they are performing their official duties, or any person who formerly served as a person designated in section 1114 due to the performance of official duties during their term of service, will be penalized.
The penalties vary depending on the severity of the act. If the violation constitutes only simple assault, the offender may be fined or imprisoned for up to one year, or both. If the acts involve physical contact with the victim or the intent to commit another felony, the offender may be fined or imprisoned for up to eight years, or both.
The statute also provides for an enhanced penalty if a deadly or dangerous weapon is used in the commission of the acts, or if bodily injury is inflicted. In such cases, the offender may be fined or imprisoned for up to 20 years, or both.
The statute also establishes extraterritorial jurisdiction over the conduct prohibited by this section, meaning it applies to acts committed outside the United States.
As you can see, the power of using examples is that they can implicitly convey instructions that you would otherwise have to spell out. The second completion is missing the labels we used, is four paragraphs instead of one, and crucially lacks the citations that the prior summary so helpfully included. You could write out all those stylistic guidelines… or you could just provide one or two good examples.
This tactic is likely to grow in importance as the context windows of models continue to trend upward. With older models like GPT-2, you could barely fit in all of the instructions you wanted in each prompt. With newer models like Claude 3.5 and GPT 4o, you can provide instructions that are dozens or even hundreds of pages long, which leaves lots of room for examples.
For law firms with vast corpuses of precedent documents, few shot learning offers a particularly compelling way to perform common legal tasks like document drafting. Instead of painstakingly describing all the rules for drafting a particular document, you can simply provide examples of how your firm has previously taken a given set of facts and drafted a document using those facts. If you’d like to see precedent-based document automation in action, check out some of the the document drafting Legal Apps in the ndMAX Studio.