Seed Bank, Co‑op, Stoop Swap

Metaphors for Governing Language Model Data for Creative Writing

Alicia Guo, University of Washington
Carly Schnitzler, Johns Hopkins University
Katy Gero, University of Sydney

Many popular language models today are created by large, for-profit technology companies and serve the goals of these corporations. Can we envision a co-operative language model that challenges existing notions of how technology is created and who it can serve?

Toward Community Models

Many popular language models today are built by large technology companies, trained on massive datasets of scraped web text and books, often without the authors’ consent or knowledge, then marketed back to the public for generating the kinds of writing authors produce professionally. Not all creators are against the technology itself, however; many are instead troubled by the ways in which it is currently being created, governed, and used. We make this distinction because critiques of nonconsensual data collection, the colonial or extractive nature of data use, and the negative impact on artist industries at large are critiques of a particular way of developing generative AI models, but not (necessarily) of generative AI models as a technology.

Yet it is immensely difficult to separate the abstract idea of a technology from the way it is created and used. This is especially difficult for generative AI models, where few examples have been created in line with creator values and respect for data contributors. Writers are often the sources of data and then the potential users of language models, but rarely are they the decision-makers, leading some writers to refuse or boycott these tools entirely. Others advocate for critical engagement, policy changes, or stronger norms around consent and provenance.

Amongst the larger effort of writers responding to the current AI landscape, we intentionally take a positive futuring stance and imagine how creative writers can claim this technology as their own and explore it on their own terms.

What would a language model look like if it were created by writers and built to serve their creative community?

Instead of advocating for better practices from existing model creators, we propose that communities can and should build this technology for themselves. In particular, we are interested in community governance, or how to build language models by and for creative writers. We start with a loose definition of community models as models built not only for a community, but with meaningful participation from the community they are intended to serve—in their design and creation, not just as stakeholders or data contributors.

This raises many questions about how community models may function, such as feasibility of model and community scale, how to collect and manage training data, whether such models can truly operate independently of commercial ones, and if they can ever feasibly be truly “large.” In particular, the first part of this project aims to explore the questions:

How do writers envision positive or idealized community language model governance?
What values do writers prioritize in community language model governance?

How We Got Here

Algorithmic processes have long been a part of various writing practices, from tarot decks as storytelling devices to dice rolls deciding narrative elements to computational rule-based systems to generate text. What changes with large language models is the reliance on training data at massive scale rather than on randomness, and the concentration of ownership. Despite these concerns, some writers have chosen to adopt these models into their practice. Many writers are not fully against the technology, but are concerned about the distribution of power and control, and the devaluing of creative labor. Lawsuits and licensing debates are important, but do not answer the whole question. Even if a model is legally trained, writers may still want more say over the design and purpose of these models. Community models are one of many possible responses to this problem.

We are not arguing that every writer should use AI, or that community models solve the harms of current AI systems. We share many of the concerns that have led writers to refuse, resist, or organize against generative AI. This project starts from a complementary question: alongside refusal and regulation, what would it take for writers to have real agency over the models built for creative writing?

The Workshops

To this end, we invited creative writers to join us in imagining what governing a community language model might look like. Across six workshops, including two pilot sessions, more than one hundred writers brainstormed metaphors to think through how a language model might be governed if it were created by and for a writing community.

Writers were asked to:

Articulate how they would want interacting with a language model to make them feel.
Brainstorm metaphors for alternative language model governance. What if a language model were like a seed bank? Or a community garden? Or a church?
Analyze two metaphors in depth in small groups, asking questions like: Who are the contributors? Why are they there? Who are the users, who are the stewards, and what would the relationships be like?

Why Metaphors?

Metaphors are powerful tools: we use metaphors to strategically reframe the problem of how to govern language models by considering other groups and processes that make collective decisions, and how writers might adopt those practices in this new context. As Kenneth Burke writes in the essay “Four Master Tropes,” good metaphors allow us to see “something in terms of something else.” Braunstein and Warren expand on this Burkean impulse in their work on the stack metaphor in computing, writing: “As a rhetorical figure, metaphor shapes what can be thought. When it functions properly, we do not even notice the epistemic shifts that occur when one domain or scale substitutes for another.” Metaphors are both a rhetorical device and an epistemic instrument—they are doing simultaneous social and cognitive work. In this project, we use metaphor generation to ask: what is epistemically prioritized when we imagine community data governance through frames like community gardens or anthills or garage bands?

For this project, we were especially interested in Schön’s theorization of the “generative metaphor,” which moves from explanatory to imaginative, arguing that in addition to framing a way of looking at an idea or concept, metaphors can also be “a process by which new perspectives on the world come into existence.” No metaphor was perfect, and most broke when stretched far enough. A single metaphor could also be expanded in multiple ways, depending on what participants found important to highlight. In this sense, metaphors were a way for participants to reason through governance concepts that could otherwise become abstract or technical.

Metaphors for Governance

The workshops generated over 259 metaphors for model governance. Looking at the metaphors across all workshops, some patterns emerged, specifically in the domains of life that participants drew from. These domains begin to reveal the kinds of systems participants found most resonant or legible as models for governance: many metaphors drew from domains characterized by care, knowledge sharing, or creative production. Most metaphors eschewed traditional corporate or state structures, but some metaphors entailed commercial activities (coffee shop), commercial software (AutoCAD, Photoshop), or government-funded projects (public library), suggesting corporate and state structures can have elements of communal control. Notably, many metaphors were small in scale (band jam, film club, potluck) and prosocial, with many of them being playful and imaginative (beehive, seance, drunk friend).

[Figure: Six illustrated metaphor domains (living systems, performance, early internet, instruments, knowledge systems, and civic life). Caption: Metaphor domains that participants returned to when imagining how community language models could be governed.]

When analyzing metaphors in depth, we found that participants came back to four main themes on how to manage community models:

The importance of consent as an ongoing, dynamic, and temporal process.
How to define implicit boundaries that could maintain control without explicit gatekeeping.
Ways to give recognition to contributions to community models.
Trade-offs in scale that pushed participants towards desiring smaller models that could be run locally and preserve privacy.

Consent as ongoing, dynamic, and temporal

Participants consistently emphasized the need for ethically sourced contributions and stated that consent was a precondition for community governance. How participants saw and defined consent differed, with proposed approaches including opt-ins, renewal options, and revocation; what was consistent was the non-negotiability of intentional consent. Many metaphors used donations as a way to gather data, such as in municipal composting where contributors brought in their collected compost material. Other metaphors, such as the library, located consent further upstream, in the act of publishing. For some, intentional consent meant contributors knowing intimately what their data was being used for and what actions their contributions were enabling. This pointed to the purpose of the models and raised the question of whether downstream decisions and qualities affect consent: “Do we need a clear sense of what is a good or bad purpose in order to make an ethical model?”

Consent was rarely framed as a one-time decision. Instead, participants described it as something ongoing that data contributors would actively take part in, raising questions about what happens when a community becomes less active, or when contributors no longer agree with their past decisions. Participants grappled with whether models should persist or fade alongside community activity. As one participant put it:

“When a Discord server or group chat stops being as participatory, do you want the LLM to live on as a record of that, as an archive? Or do you want it to only exist when people are actively participating? So we talked about the need for a kill switch...or other ways of naturally allowing the context to decline.”

Multiple groups landed on consent structures that changed over time and allowed for a natural built-in expiration to be part of the consent mechanism, where contributors would have to actively renew their consent either through explicit renewal or by the contribution of new data. Metaphors from the “living system” domain, such as a forest, fungi network and community gardens, prompted participants to imagine models that could change both with the intake of new information and the clearing of old information. The forest metaphor group envisioned: “A forest ecosystem can cycle...the size and the limitation of the data should be set so that the amount of information that’s cycled through is constant...it learns to forget.”
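As one illustration only, the expiring-consent idea described above could be sketched in code. This is a minimal sketch under stated assumptions, not something workshop participants built or specified: the names `Contribution` and `ConsentLedger`, the one-year renewal window, and the rule that a new contribution renews an author's earlier work are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical renewal window; the workshops did not specify a duration.
CONSENT_WINDOW = timedelta(days=365)


@dataclass
class Contribution:
    author: str
    text: str
    consented_at: datetime  # last time consent was given or renewed
    revoked: bool = False

    def is_active(self, now: datetime) -> bool:
        """Consent counts only if not revoked and renewed within the window."""
        return not self.revoked and now - self.consented_at <= CONSENT_WINDOW


class ConsentLedger:
    """Tracks contributions and which of them may currently be used."""

    def __init__(self):
        self.items = []

    def renew(self, author: str, now: datetime) -> None:
        """Explicit renewal: refresh consent on the author's unrevoked work."""
        for item in self.items:
            if item.author == author and not item.revoked:
                item.consented_at = now

    def contribute(self, author: str, text: str, now: datetime) -> None:
        """Contributing new data also renews consent for earlier work."""
        self.renew(author, now)
        self.items.append(Contribution(author, text, now))

    def revoke(self, author: str) -> None:
        """Revocation removes all of the author's existing work from use."""
        for item in self.items:
            if item.author == author:
                item.revoked = True

    def training_set(self, now: datetime) -> list:
        """Only texts with active consent are eligible; the rest lapse."""
        return [item.text for item in self.items if item.is_active(now)]


start = datetime(2026, 1, 1)
ledger = ConsentLedger()
ledger.contribute("ana", "first poem", start)
ledger.contribute("ben", "short story", start)
# Ana contributes again 300 days later, which renews her earlier poem.
ledger.contribute("ana", "second poem", start + timedelta(days=300))

# 400 days in, Ben's consent has lapsed because he never renewed it.
print(ledger.training_set(start + timedelta(days=400)))
# prints ['first poem', 'second poem']
```

In this sketch, forgetting is the default: data drops out of the eligible set unless someone acts to keep it there, which mirrors the built-in expiration that groups described. It says nothing about how to remove lapsed data from a model that has already been trained on it, which is a separate and harder problem.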

Boundaries and other mechanisms for maintaining control

Many groups were against formal contracts and explicit gatekeeping of a community model, exploring the tension between accessibility and the shared values and trust that make governance work. A group focused on shared physical places surfaced how physical locations can be gated, and rejected the idea of having a language model be “gated” with a key. Other groups also resisted restricted access, with one expressing the desire for models to be “absolutely free for as many people as possible,” pointing to resources like Reddit and Wikipedia and questioning the “extent to which you can really do community work” with commercial models.

However, participants worried about who gets to contribute, recognizing that contributors shape model values and character, which suggests some curation might be necessary. This would require access controls, at least on the model creation side. One group likened discarding data that wasn’t serving the community space to keeping invasive species out of a pollinator garden, a task that would be carried out by community gardeners; similarly, a bar bathroom graffiti wall could be regulated by the owners or staff. At the same time, some groups ran into the limits of their metaphors because it is difficult to know exactly how training data contributions map to language model characteristics or behavior. Such gatekeeping also didn’t align with their values, because attempting to formalize requirements was “just a harder thing to enforce...once you’re putting limits or giving form to what kinds of contributions count, you start putting walls up; instead of fencing in your garden to help curate it, you’re then leaving people out.”

Other metaphors, such as churches, dance halls, and campfires, described spaces with softer membership boundaries that relied on shared values, norms, and guidance from existing community members to shape access, rather than explicit “gates” or rules. One group expanded on churches and viewed them as welcoming but with strong norms: “It’s open to people...but they also share the value system...At the same time, there is a structure, there are priests.” In such metaphors, anyone can join but only those with similar values are likely to stay, and there are barriers to taking on formal roles.

Recognition of presence in models

Most participants explicitly minimized the importance of monetary compensation as a motivator and instead focused on credit, recognition, or access to the model as what they might gain from contributing to the model or to its governance. Some metaphors led participants to consider a notion of payment, particularly for use, similar to paying to see an orchestra performance or having an engineering middleman following the business model of microgrids serving a community. Many groups framed compensation through social connection and other experiential benefits rather than monetary payment, describing what contributors would get as “creative spark, shared experiences, and socializing,” “something missing in their life...whimsy” or “joy of playing.” Other groups, while raising the concern that volunteer-led projects often fail to sustain themselves, prioritized thinking through how to assign credit or recognize contributors.

Several groups noted that writers influence each other in entangled ways that resist clear attribution, similar to sampling in music. One participant compared this to writer workshops: “You’re inevitably influencing each other in the same way that an LLM output is made up of these latent influences...it’s difficult to ascribe credit.” Rather than discrete contributions, influences blend together, raising questions about whether attribution is even desirable. As one participant asked, “being able to be traced back to a creator—is that a good thing or a bad thing? Would we want to be anonymous?”

Instead of fine-grained credit, participants anchored on wanting recognition of “time and effort put into community creation,” similar to communities like Wikipedia and Reddit. Participants discussed credit not only outside the model (attribution for contributions) but also recognition within it—having a distinctive, recognizable presence in the model’s outputs. Reflecting on the graffiti-in-a-bar-bathroom metaphor, one participant noted: “The mark on the wall signifies something that you don’t normally see in a language model. You don’t have that sense of presence, but I really love that idea of leaving a mark.” Rather than explicit attribution, participants imagined recognition through felt influence.

Scale and the desire for smaller language models

Considerations of scale, including the size of the community, the size of the model, and the longevity of the model, became key factors in how the metaphors played out and also a point of tension where they broke. Although participants were primed to imagine what governing a large language model might look like, the metaphors they chose and the ways in which they expanded them repeatedly pointed to a desire for smaller, personalized, and intentionally limited systems. Reflecting on an archive’s ability to preserve provenance and uncertainty, the archive group explicitly asked: “thinking about smallness or in relationship to data...can we imagine a large language model that doesn’t seem godlike?”

The desires that groups mentioned, such as ongoing consent, membership that stems from shared values and trust, and meaningful recognition, were often discussed through metaphors that assumed a limited scale of community, i.e., local, tight-knit communities. Questions such as “what happens when an insular dataset meant for two people grows into a massive LLM?” surfaced, complicating how far the chosen metaphors could be extended before behaviors changed. For example, the bar bathroom graffiti wall group asked what would happen if the wall became a famous piece of art, raising the stakes and discouraging further free use of it.

Participants repeatedly voiced a desire for models that were local and personalized, two qualities that often went hand in hand. They recognized there would be a utility tradeoff with a smaller model, but many expressed that they didn’t need a big model, and many groups mentioned wanting these models to exist locally on their computer, not in the cloud. Writing can be incredibly personal, and participants wanted a way to contribute to and use models without feeling like they were giving up their privacy or control over their writing: “one of the big desires is to have models that are local first...especially if folks are going to put a corpus of their work or an individual work of theirs into a model.” Some wanted to use these models for their personal writing and journals, which has ruled out many existing tools for them. In talking about how these models would be used (that is, why the participants might want to be contributors to a model), participants consistently emphasized creativity over efficiency and even welcomed the idea of friction if it supported their creative experimentation, citing how tools can be molded by the people using them: “the more you sit in a chair, the more it becomes shaped to you.”

Running locally automatically puts some limits on the size of a model, but in return such a model can address more specific needs. As one group stated, what was “important about that metaphor [microgrids] is that it’s hyperlocal and can really address the needs of smaller communities.”

What Comes Next

“This group of people’s shared code of ethics or rules—that struck us as interesting, I think, in part because there’s often a sense that language models are supposed to be these neutral artifacts, especially the big ones that are being produced by the big companies right now. And the idea of having them be reflective of a relatively tight-knit group living according to a certain code that might be deliberately separating itself from the world in some way, that was interesting to us.”

The metaphors are a starting point for asking the question of what a community language model could be. We are continuing this work towards building small language models to test in practice what small groups of writers building together would look like. In the next phase we are looking for writers, writing groups, or organizations who are interested in piloting early versions of these ideas with us.

We hope this work invites many parallel experiments towards community models. The full paper details future work possibilities for writers, researchers, and community organizations. If you are interested in this work, we are hosting two town halls for writers, researchers, organizers, and builders who want to think with us about what community language models could become.

Cite This Work

Guo, Alicia, Carly Schnitzler, and Katy Gero. 2026. Seed Bank, Co-op, Stoop Swap: Metaphors for Governing Language Model Data for Creative Writing. To appear in Creativity and Cognition (C&C '26), July 13–16, 2026, London, United Kingdom. DOI: 10.1145/3803784.3807550.

This work received an honorable mention at C&C '26.

@inproceedings{guo2026seedbank,
  title = {Seed Bank, Co-op, Stoop Swap: Metaphors for Governing Language Model Data for Creative Writing},
  author = {Guo, Alicia and Schnitzler, Carly and Gero, Katy},
  year = {2026},
  booktitle = {Creativity and Cognition (C\&C '26), July 13--16, 2026, London, United Kingdom},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  doi = {10.1145/3803784.3807550},
  isbn = {979-8-4007-2583-8/2026/07},
  url = {https://doi.org/10.1145/3803784.3807550}
}