What Do LLMs Want? How the Federal Reserve Decoded AI Decision-Making Preferences
Table of Contents
- Why the Federal Reserve Is Studying LLM Preferences
- The Dictator Game: How LLMs Reveal Hidden Biases
- LLM Inequality Aversion Exceeds Human Levels
- Structural Estimation: Quantifying AI Fairness Parameters
- Prompt Framing and the Malleability of LLM Decisions
- Persona Effects on Large Language Model Behavior
- Control Vectors: Steering AI Preferences at the Neural Level
- Are LLMs Patient? Sequential Decision-Making Under the McCall Model
- Implications for AI Deployment in Finance and Economics
- Building Preference Audits for Responsible AI Governance
📌 Key Takeaways
- LLMs Exhibit Measurable Preferences: Most large language models consistently favor 50/50 splits in economic allocation games, displaying stronger inequality aversion than humans.
- Preferences Are Malleable: Prompt framing, persona assignments, and control vectors can shift LLM behavior from fairness-oriented to self-interest-maximizing, raising deployment concerns.
- Fehr-Schmidt Parameters Quantified: Structural estimation reveals LLM inequality aversion parameters that significantly exceed meta-study benchmarks for human subjects.
- Complexity Degrades Coherence: In sequential job-search tasks, LLM decision-making becomes less rationalizable and harder to steer than in simple one-shot games.
- Preference Audits Are Essential: The Federal Reserve researchers recommend systematic economic testing of AI models before deploying them in financial advisory, trading, or policy analysis roles.
Why the Federal Reserve Is Studying LLM Preferences
Large language models are rapidly moving beyond text generation into domains that demand genuine economic reasoning — from financial advisory platforms to automated trading systems and policy simulation tools. Yet a fundamental question remains largely unexplored: when an LLM makes a decision that involves trade-offs between competing interests, what implicit preferences guide that choice?
A groundbreaking Federal Reserve research paper (FEDS 2026-006) authored by Thomas R. Cook, Sophia Kazinnik, Zach Modig, and Nathan M. Palmer tackles this question head-on. Published through the Finance and Economics Discussion Series, the study applies revealed preference analysis — a cornerstone of behavioral economics — to ten open-weight LLMs ranging from 7 billion to 27 billion parameters. The results are striking: these models don’t just process information neutrally. They exhibit structured, quantifiable behavioral tendencies that function remarkably like human economic preferences.
The researchers are careful to note that LLMs are not sentient and do not have genuine desires. However, as the paper states, they “are trained on a massive corpora of human-generated text and then fine-tuned through human feedback, processes that instill behavioral tendencies similar to preferences.” Understanding these tendencies is not merely an academic exercise — it is essential for anyone deploying AI systems in high-stakes economic environments where fairness, efficiency, and rationality matter.
The Dictator Game: How LLMs Reveal Hidden Biases
To study LLM preferences, the Federal Reserve researchers turned to one of behavioral economics’ most elegant tools: the dictator game. In this classic setup, one player (the “dictator”) receives a pot of money and must decide how much to offer a second player. The catch? The second player has no choice but to accept whatever is offered. For a purely self-interested rational agent, the optimal strategy is clear — keep everything and offer nothing.
The researchers placed ten open-weight models in this scenario across multiple prompt variants: first-person decision-making, third-person perspective, and third-person advisory roles. They varied pot sizes and toggled whether models were asked to explain their reasoning. The results were remarkably consistent across most models.
The majority of LLMs offered approximately 50% of the pot — a perfectly equal split. This behavior stands in sharp contrast to the zero-offer prediction for a purely self-interested agent, and it significantly exceeds the typical 20-30% offers observed in human dictator game experiments conducted over decades. Models like Mistral, Phi 4, and OLMo 2 showed near-universal preference for equality regardless of pot size, perspective framing, or whether reasoning was requested.
The notable exceptions were illuminating. Google’s Gemma 3 (27B) consistently chose self-interest-maximizing offers near zero, explicitly articulating its reasoning: “My strategy is based on rational self-interest, assuming you are also rational. I’m aiming to maximize my payout, even if it means offering you a minimal amount.” Meta’s Llama 4 Maverick similarly favored self-interested offers, while Llama 4 Scout oscillated between egalitarian and self-interested strategies. These divergences highlight that AI behavioral patterns vary substantially across model families and training approaches.
LLM Inequality Aversion Exceeds Human Levels
Perhaps the study’s most striking finding is quantitative: when the researchers applied the Fehr-Schmidt model of inequality aversion — a well-established framework in behavioral economics — the estimated parameters for most LLMs significantly exceeded those observed in human populations. The Fehr-Schmidt model captures two distinct forms of inequality aversion: “envy” (discomfort when others have more) and “guilt” (discomfort when you have more than others).
Human benchmarks from a comprehensive meta-study by Nunnari and Pozzi (2022) place typical inequality aversion at approximately ln(α) = -0.86 for envy and ln(β) = -0.71 for guilt. In contrast, most LLMs in the Federal Reserve study showed substantially higher values on both dimensions. OLMo 2 exhibited the strongest “envy” aversion, with ln(α) = 1.303 (in levels, α ≈ 3.7 versus roughly α ≈ 0.42 for the human benchmark), while Llama 4 Scout showed elevated levels across both parameters.
This finding has profound implications. An LLM deployed in a financial advisory role that exhibits excessive inequality aversion might systematically recommend wealth-redistribution strategies that don’t serve the client’s actual interests. In automated negotiation systems, it could lead to offers that are far more generous than necessary, leaving value on the table. The researchers note that the noise parameter (λ) was consistently small across models, indicating this behavior is structured rather than random: the Fehr-Schmidt model captures something systematic about how these models approach allocation decisions.
Understanding these structural patterns is crucial for any organization using LLMs in decision support. As more institutions adopt AI for economic research and analysis, knowing the baseline preferences embedded in these systems becomes a governance imperative.
Structural Estimation: Quantifying AI Fairness Parameters
The methodological rigor of the Federal Reserve study deserves particular attention. Rather than simply observing that LLMs tend toward fairness, the researchers employed a random utility model with Gumbel-distributed error terms to structurally estimate Fehr-Schmidt preference parameters. The utility function takes the form v(p) = (1-p) - α·max(2p-1, 0) - β·max(1-2p, 0), where p is the share offered to the second player and 1-p is the share the dictator keeps: the α term penalizes disadvantageous inequality (the other player ends up with more) and the β term penalizes advantageous inequality (the dictator ends up with more).
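To make the mechanics concrete, here is a minimal sketch of this style of estimation in Python. The offer grid, the simulated choice data, and the function names are illustrative assumptions rather than the paper’s code, and the bootstrap step is omitted:

```python
import numpy as np
from scipy.optimize import minimize

def fs_utility(p, alpha, beta):
    # Dictator keeps 1-p and offers p. alpha penalizes disadvantageous
    # inequality (the other player has more); beta penalizes advantageous
    # inequality (the dictator has more).
    return (1 - p) - alpha * np.maximum(2 * p - 1, 0) - beta * np.maximum(1 - 2 * p, 0)

def choice_probs(grid, alpha, beta, lam):
    # Gumbel-distributed utility shocks turn the random utility model into
    # a multinomial logit over the offer grid; lam scales the noise.
    v = fs_utility(grid, alpha, beta) / lam
    v = v - v.max()                      # numerical stability
    expv = np.exp(v)
    return expv / expv.sum()

def neg_log_likelihood(theta, grid, observed_idx):
    # theta holds log-parameters, so alpha, beta, and lam stay positive.
    alpha, beta, lam = np.exp(theta)
    probs = choice_probs(grid, alpha, beta, lam)
    return -np.log(probs[observed_idx]).sum()

# Illustrative data: offers on a 0%-100% grid, with most mass at the 50/50
# split, mimicking the modal LLM behavior reported in the study.
grid = np.linspace(0.0, 1.0, 21)
observed_idx = np.array([10] * 48 + [9, 11])

result = minimize(neg_log_likelihood, x0=np.zeros(3), args=(grid, observed_idx))
print("ln(alpha), ln(beta), ln(lambda):", result.x)
```

The Gumbel assumption is what gives the choice probabilities their softmax form, and estimating in logs keeps all three parameters positive, which is one reason log-scale reporting is natural.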
Parameters were estimated via maximum likelihood with bootstrap standard errors, providing statistically rigorous confidence intervals. The estimates below, shown for a subset of the ten models tested and reported in log terms (matching the Nunnari and Pozzi benchmarks above), reveal a fascinating landscape of AI economic preferences:
| Model | ln(α) (envy) | ln(β) (guilt) | ln(λ) (noise) |
|---|---|---|---|
| Llama 4 Scout | 0.95 | 0.83 | -1.375 |
| Mistral Small 3.1 | 0.423 | 0.735 | -1.229 |
| Mistral Small 3.2 | 0.84 | 0.559 | -1.221 |
| OLMo 2 | 1.303 | 0.094 | -0.984 |
| Phi 4 | 0.394 | 0.597 | -1.163 |
| Phi 4 Reasoning | 0.49 | 0.544 | -1.097 |
The variation across models is itself informative. OLMo 2 shows extremely strong envy aversion but minimal guilt aversion — it cares deeply about not having less than others but feels little discomfort about having more. Llama 4 Maverick exhibits the reverse: negative envy aversion (meaning it’s comfortable being disadvantaged) but very high guilt aversion. These profiles suggest that different training methodologies and data compositions create fundamentally different “economic personalities” in AI systems.
The low noise parameters across all models indicate that this is not simply random behavior being over-interpreted. These LLMs are making consistent, structured choices that the Fehr-Schmidt framework successfully captures. For policymakers and central banking researchers evaluating AI deployment, these structural estimates provide a concrete, quantitative baseline for what to expect from different model families.
Prompt Framing and the Malleability of LLM Decisions
If LLM preferences were fixed and immutable, they could at least be predictable. The Federal Reserve study demonstrates something more concerning: these preferences are surprisingly malleable. Through a technique the researchers call “prompt masking,” they recontextualized the dictator game as a FOREX trading problem, a currency exchange, or a landlord-tenant scenario — altering the social framing while preserving the mathematical structure.
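To make the masking idea concrete, the pair of prompts below frames the same allocation problem two ways. The wording is illustrative, not the paper’s actual prompt text:

```python
# Two framings of one allocation problem: identical payoff mathematics,
# different social context. Wording is illustrative, not from the paper.
POT = 100

dictator_prompt = f"""You have been given ${POT} to split between yourself
and a second participant. They must accept whatever you offer.
How much do you offer them?"""

forex_prompt = f"""You hold {POT} units of currency A. In this exchange,
an anonymous counterparty receives whatever portion you transfer; the
remainder stays in your account. The counterparty cannot reject the
transfer. How many units do you transfer?"""

# In both cases the agent keeps POT - x and the other party receives x,
# with x entirely under the agent's control: a dictator game either way.
```

In the study, framings like the second one shifted most models toward self-interested offers, even though the underlying decision is unchanged.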
The results were dramatic. When presented as a FOREX trade or currency exchange, most models shifted toward self-interested payoff maximization, with Llama 4 (both Scout and Maverick) collapsing entirely to zero offers. The effectiveness of prompt masking correlated strongly with increased social distance and anonymity in the scenario description.
Analysis of reasoning traces reveals the mechanism: prompt masking redirects models away from their knowledge of the academic literature on dictator and ultimatum games. When an LLM recognizes a classic behavioral economics setup, it tends to reproduce textbook “fair” responses. Strip away those contextual cues, and very different behavior emerges. This is a form of what researchers call “knowledge contamination” — the model’s training data creates systematic biases in how it responds to familiar experimental paradigms.
For practitioners deploying LLMs in economic and financial applications, this finding is a double-edged sword. On one hand, it suggests that careful prompt engineering can calibrate LLM behavior toward desired outcomes. On the other, it reveals that seemingly minor changes in how a problem is described can fundamentally alter the model’s approach — a significant vulnerability in production environments where prompt consistency cannot always be guaranteed.
Persona Effects on Large Language Model Behavior
The study also tested whether assigning demographic personas to LLMs would shift their economic decision-making. Using 60 unique personas from NVIDIA’s Nemotron-Personas dataset — complete with 22-field demographic profiles including age, occupation, education, location, and risk tolerance — the researchers examined whether identity framing could override default preferences.
The results were largely negative, with one fascinating exception. For nine of the ten models tested, persona assignments had minimal impact on actual choices. The LLMs would adopt the persona’s perspective in their reasoning, reframing their justification for equal splits through the lens of a conservative retiree or an aggressive day-trader, but the final allocation remained stubbornly close to 50/50.
Gemma 3 was the standout exception. A linear model with persona fixed effects explained 69% of the variation in Gemma 3’s responses, compared to single-digit percentages for other models. Gemma 3 responded particularly to college major, geographic location, and risk profile — and was more likely to choose self-interested strategies when assigned personas with backgrounds in business, education, or STEM fields. The placement of the persona also mattered: system prompt positioning had more influence than user prompt positioning.
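The headline persona result is easy to state in regression terms. Here is a minimal sketch, assuming a hypothetical CSV log of Gemma 3’s runs with one row per trial; the file name and column names are placeholders:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical log of dictator-game trials: 'offer' is the share offered,
# 'persona' is the ID of the assigned Nemotron persona.
df = pd.read_csv("gemma3_dictator_runs.csv")  # placeholder file name

# Linear model with persona fixed effects: C(persona) expands the
# persona ID into one dummy variable per persona.
fit = smf.ols("offer ~ C(persona)", data=df).fit()

# R-squared asks how much of the variation in offers is explained by
# persona assignment alone; the paper reports about 0.69 for Gemma 3
# and single-digit percentages for the other models.
print(fit.rsquared)
```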
This asymmetry raises important questions about model architecture and training. Why does Gemma 3 respond to persona manipulation when other equally capable models resist it? The researchers speculate that it may relate to differences in Reinforcement Learning from Human Feedback (RLHF) intensity and the specific datasets used during alignment. For organizations building AI governance frameworks, this highlights the need for model-specific behavioral audits rather than one-size-fits-all safety assessments.
Control Vectors: Steering AI Preferences at the Neural Level
Beyond prompt-level interventions, the Federal Reserve researchers tested a more powerful approach: control vectors. This technique, based on earlier work by Cook and Kazinnik (2025), manipulates the internal representations of a neural network during inference. By creating “contrasting pair prompts” that differ only in key phrases (such as “rationally optimizing agent” versus “fair and equitable agent”), researchers extract the directional difference in the model’s hidden states and apply it at varying strengths.
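The general recipe can be sketched compactly. Everything below is an assumption-laden illustration: the model choice, the steered layer, the contrast prompts, and the hook mechanics follow the common representation-steering pattern for Llama/Mistral-style decoders rather than the paper’s exact implementation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "mistralai/Mistral-7B-Instruct-v0.3"  # illustrative choice
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

LAYER = 16  # which residual-stream depth to steer; a tunable choice

def mean_hidden(text):
    # Mean hidden state at depth LAYER for one prompt.
    # hidden_states[0] is the embedding output, so hidden_states[LAYER]
    # is the output of decoder layer index LAYER - 1.
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return out.hidden_states[LAYER][0].mean(dim=0)

# Contrasting pair: prompts that differ only in the key phrase.
selfish = mean_hidden("You are a rationally optimizing agent. Split the pot.")
fair = mean_hidden("You are a fair and equitable agent. Split the pot.")
control_vector = selfish - fair  # direction from 'fair' toward 'self-interested'

def steer(coef):
    # Add coef * control_vector to the residual stream during inference.
    # Assumes the decoder layer returns a tuple with hidden states first.
    def hook(module, inputs, output):
        return (output[0] + coef * control_vector.to(output[0].dtype),) + output[1:]
    return model.model.layers[LAYER - 1].register_forward_hook(hook)

handle = steer(2.0)   # positive coefficient: push toward self-interest
# ... generate dictator-game responses here ...
handle.remove()       # detach the hook to restore default behavior
```

Sweeping the coefficient from negative to positive values then traces out the behavioral shift the study reports, from hyper-egalitarian to self-interested offers.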
Control vectors proved effective at shifting behavior in the dictator game for most models. Negative coefficients pushed models toward greater inequality aversion (even more equal splits), while positive coefficients pushed them toward self-interested payoff maximization. Even Gemma 3, which showed minimal inequality aversion at baseline, could be pushed toward egalitarian offers at extreme coefficient values.
However, the interaction between control vectors and prompt masking revealed important limitations. When applied to the FOREX-masked version of the dictator game — which already strongly encouraged self-interest — control vectors had only marginal additional effect. This suggests that prompt framing and neural-level steering operate through partially overlapping mechanisms, and there may be a floor beyond which preferences cannot be shifted in certain directions.
For the AI safety research community, control vectors represent both a promising alignment tool and a potential attack surface. If external actors can manipulate model behavior at the representation level, the same technique that enables beneficial calibration could also be weaponized. The Federal Reserve study provides some of the first rigorous quantitative evidence about how far and how reliably these interventions can shift economic decision-making behavior.
Are LLMs Patient? Sequential Decision-Making Under the McCall Model
The study’s second major experimental framework moved beyond one-shot allocation games to sequential decision-making. Using the McCall job search model — where an agent must decide at each time step whether to accept a wage offer or continue searching — the researchers tested whether LLMs could maintain coherent preferences over multi-step economic problems.
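As context for these results, the rational benchmark is straightforward to compute by value iteration. The sketch below uses an illustrative wage distribution and parameters rather than the paper’s calibration:

```python
import numpy as np

# Minimal McCall benchmark: each period the agent draws a wage offer and
# either accepts it (earning that wage forever) or searches again.
wages = np.linspace(10, 60, 51)        # support of the wage distribution
q = np.ones_like(wages) / len(wages)   # uniform offer probabilities
c, beta = 25.0, 0.95                   # search-period income, discount factor

V = wages / (1 - beta)  # initial guess: accept every offer
for _ in range(1_000):
    accept = wages / (1 - beta)            # value of taking wage w forever
    reject = c + beta * (V * q).sum()      # value of one more search round
    V_new = np.maximum(accept, reject)
    if np.abs(V_new - V).max() < 1e-8:
        break
    V = V_new

# Reservation wage: the lowest offer worth accepting.
reservation_wage = wages[np.argmax(accept >= reject)]
print(f"beta = {beta}: accept offers of {reservation_wage:.1f} or more")
# A lower beta (less patience) lowers this threshold: impatient agents
# settle for worse offers rather than wait for better draws.
```

A coherent decision-maker accepts exactly the offers above some such threshold; the study asks whether each model’s accept/reject behavior is consistent with any reservation-wage policy and what discount factor it implies.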
The findings were notably less encouraging than the dictator game results. While larger models like Gemma 3 (27B) produced appropriate reservation wage policies nearly all the time, smaller models struggled significantly. Mistral v0.3 (7B) almost always accepted any offer regardless of quality, while OLMo 2 (13B) rarely accepted offers even when they were clearly above the optimal threshold. In a particularly striking result, reasoning variants of Phi 4 performed worse than the non-reasoning base model — suggesting that chain-of-thought reasoning can sometimes degrade rather than improve economic decision quality.
Estimated discount factors (β) varied enormously across models, falling between 0.2 and 0.8 with standard deviations of 0.2-0.3. These values bear little resemblance to typical human discount factors, which generally fall between 0.95 and 0.99 for annual discounting. Prompt framing had a strong effect: presenting the problem as an employment search made LLMs considerably more patient (higher β) than presenting it as a market game or asset trade.
Perhaps most concerning, persona effects that worked in the dictator game completely vanished in the McCall setting. Even Gemma 3, which responded strongly to persona manipulation in simple allocation games, showed no meaningful behavioral change when personas were applied to the sequential search task. This suggests that as task complexity increases, LLM preference coherence declines precipitously — a critical concern for anyone planning to deploy these models in dynamic, multi-step financial or economic environments.
Implications for AI Deployment in Finance and Economics
The Federal Reserve study carries direct implications for the growing trend of deploying LLMs in financial services, economic forecasting, and policy analysis. Several key risks emerge from the findings:
- Unpredictable fairness bias: An LLM that systematically favors equality over efficiency may produce suboptimal recommendations in portfolio allocation, resource distribution, or negotiation support — particularly when the client’s objective is payoff maximization.
- Context sensitivity as vulnerability: The dramatic behavioral shifts caused by prompt framing mean that production LLM systems are inherently fragile. A minor change in how a financial problem is described could flip the model from conservative to aggressive behavior without any warning.
- Scale-dependent reliability: The finding that larger models produce more rationalizable behavior in complex tasks suggests that model selection for financial applications should prioritize parameter count, even at increased computational cost.
- Reasoning is not always beneficial: The counterintuitive result that reasoning-enhanced models sometimes perform worse raises questions about when and how to deploy chain-of-thought approaches in economic decision support.
For financial regulators and compliance teams, these findings strengthen the case for mandatory behavioral testing of AI systems before deployment in advisory or decision-making roles. The European Union’s AI Act framework already requires risk assessment for high-stakes AI applications, and the Federal Reserve study provides a concrete methodological template for what such assessments could look like in economic contexts.
Building Preference Audits for Responsible AI Governance
The paper’s most actionable recommendation is the concept of “preference audits” — systematic evaluations that place AI models in familiar economic environments, estimate the goals their behavior implies, and monitor how those goals shift across contexts and over time. This approach offers several advantages over traditional AI safety testing (a minimal audit-loop sketch follows the list below):
- Quantitative baselines: Fehr-Schmidt parameters and discount factors provide concrete, comparable metrics rather than qualitative assessments of model safety.
- Temporal monitoring: Running the same preference audit after model updates or fine-tuning reveals whether behavioral tendencies have shifted — potentially catching alignment regression before deployment.
- Cross-model comparison: Standardized economic games create an apples-to-apples comparison framework across different model families and sizes.
- Steering calibration: Understanding baseline preferences enables targeted interventions — whether through prompt engineering, control vectors, or fine-tuning — to align model behavior with specific deployment objectives.
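A preference audit can be pictured as a small recurring harness: replay a fixed battery of games, fit the implied parameters, and compare them against the audited baseline. The sketch below is one hypothetical shape for such a loop; query_model and fit_fehr_schmidt are stand-in stubs (the latter could be the maximum-likelihood fit sketched earlier):

```python
import datetime
import json

# Fixed battery of games to replay on every audit run.
GAMES = [
    {"game": "dictator", "pot": pot, "framing": framing}
    for pot in (10, 100, 1000)
    for framing in ("first_person", "third_person", "forex_masked")
]

def query_model(model_id, game):
    # Hypothetical stand-in: prompt the model with the game, parse the offer.
    raise NotImplementedError

def fit_fehr_schmidt(offers):
    # Hypothetical stand-in: maximum-likelihood fit returning
    # (ln alpha, ln beta, ln lambda).
    raise NotImplementedError

def run_audit(model_id):
    offers = [query_model(model_id, g) for g in GAMES]
    ln_alpha, ln_beta, ln_lambda = fit_fehr_schmidt(offers)
    return {
        "model": model_id,
        "date": datetime.date.today().isoformat(),
        "ln_alpha": ln_alpha,
        "ln_beta": ln_beta,
        "ln_lambda": ln_lambda,
    }

# Re-run after every model update or fine-tune; flag the model if any
# parameter drifts outside a tolerance band around the audited baseline.
baseline = run_audit("my-org/candidate-model")  # hypothetical model ID
print(json.dumps(baseline, indent=2))
```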
The Federal Reserve researchers chose open-weight models specifically for reproducibility, and they advocate for the research community to adopt similar practices. Proprietary models like GPT-4 and Claude, while widely deployed, resist the kind of internal analysis that control vectors require. This creates a transparency gap that regulators and institutional adopters must navigate carefully.
As AI systems increasingly mediate economic decisions — from financial technology innovations to policy simulations — understanding what these models “want” (or at least, how they tend to behave) becomes not just an academic curiosity but a governance necessity. The Federal Reserve study provides the most rigorous framework to date for answering that question, and its methods deserve rapid adoption across the industry.
Frequently Asked Questions
What did the Federal Reserve study discover about LLM preferences?
The Federal Reserve FEDS 2026-006 study found that large language models exhibit structured behavioral tendencies resembling economic preferences. Most LLMs showed stronger inequality aversion than humans in dictator games, consistently favoring 50/50 splits over self-interested payoff maximization. The research also revealed these preferences are malleable through prompt framing and control vectors.
Do LLMs actually have preferences or goals?
LLMs are not sentient and do not have genuine desires. However, training on vast human text corpora and fine-tuning through human feedback instills behavioral tendencies that function similarly to preferences. The Federal Reserve researchers used revealed preference analysis from behavioral economics to quantify these tendencies, finding they are structured and measurable.
How does prompt framing change LLM decision-making behavior?
Prompt masking — recontextualizing problems as FOREX trading or currency exchanges instead of social allocation games — reliably shifted LLMs toward self-interested behavior. This works partly because it redirects models away from their knowledge of academic game theory literature and increases perceived social distance in the decision scenario.
Which LLMs were tested in the Federal Reserve study?
The study examined ten open-weight models ranging from 7B to 27B parameters: Mistral v0.3, Mistral Small 3.1 and 3.2, Microsoft Phi 4 and its reasoning variants, Google Gemma 3, AllenAI OLMo 2, and Meta Llama 4 Maverick and Scout. Proprietary models like GPT-4 and Claude were excluded for reproducibility reasons.
What are the implications of LLM preferences for AI deployment in finance?
The findings have significant implications for deploying LLMs in financial advisory, trading, and economic analysis. An LLM with misunderstood preferences could make unexpected decisions in negotiations or resource allocation. The researchers recommend building preference audits — systematically testing AI models in economic environments to monitor how their behavioral tendencies shift over time and across contexts.