We’ve filed a law­suit chal­leng­ing Sta­ble Dif­fu­sion, a 21st-cen­tury col­lage tool that vio­lates the rights of artists.
Because AI needs to be fair & eth­i­cal for every­one.

Jan­u­ary 13, 2023

Hello. This is Matthew Butterick. I’m a writer, designer, pro­gram­mer, and law­yer. In Novem­ber 2022, I teamed up with the amaz­ingly excel­lent class-action lit­i­ga­tors Joseph Saveri, Cadio Zir­poli, and Travis Man­fredi at the Joseph Saveri Law Firm to file a law­suit against GitHub Copi­lot for its “unprece­dented open-source soft­ware piracy”. (That law­suit is still in progress.)

Since then, we’ve heard from peo­ple all over the world—espe­cially writ­ers, artists, pro­gram­mers, and other cre­ators—who are con­cerned about AI sys­tems being trained on vast amounts of copy­righted work with no con­sent, no credit, and no com­pen­sa­tion.

Today, we’re taking another step toward making AI fair & ethical for everyone. On behalf of three wonderful artist plaintiffs (Sarah Andersen, Kelly McKernan, and Karla Ortiz), we’ve filed a class-action lawsuit against Stability AI, DeviantArt, and Midjourney for their use of Stable Diffusion, a 21st-century collage tool that remixes the copyrighted works of millions of artists whose work was used as training data.

Join­ing as co-coun­sel are the ter­rific lit­i­ga­tors Brian Clark and Laura Mat­son of Lock­ridge Grindal Nauen P.L.L.P.

Today’s fil­ings:

As a law­yer who is also a long­time mem­ber of the visual-arts com­mu­nity, it’s an honor to stand up on behalf of fel­low artists and con­tinue this vital con­ver­sa­tion about how AI will coex­ist with human cul­ture and cre­ativ­ity.

The image-gen­er­a­tor com­pa­nies have made their views clear. 
Now they can hear from artists.

A 21st-cen­tury col­lage tool

Sta­ble Dif­fu­sion is an arti­fi­cial intel­li­gence (AI) soft­ware prod­uct, released in August 2022 by a com­pany called Sta­bil­ity AI.

Sta­ble Dif­fu­sion con­tains unau­tho­rized copies of mil­lions—and pos­si­bly bil­lions—of copy­righted images. These copies were made with­out the knowl­edge or con­sent of the artists.

Even assum­ing nom­i­nal dam­ages of $1 per image, the value of this mis­ap­pro­pri­a­tion would be roughly $5 bil­lion. (For com­par­i­son, the largest art heist ever was the 1990 theft of 13 art­works from the Isabella Stew­art Gard­ner Museum, with a cur­rent esti­mated value of $500 mil­lion.)

Sta­ble Dif­fu­sion belongs to a cat­e­gory of AI sys­tems called gen­er­a­tive AI. These sys­tems are trained on a cer­tain kind of cre­ative work—for instance text, soft­ware code, or images—and then remix these works to derive (or “gen­er­ate”) more works of the same kind.

Hav­ing copied the five bil­lion images—with­out the con­sent of the orig­i­nal artists—Sta­ble Dif­fu­sion relies on a math­e­mat­i­cal process called dif­fu­sion to store com­pressed copies of these train­ing images, which in turn are recom­bined to derive other images. It is, in short, a 21st-cen­tury col­lage tool.

These result­ing images may or may not out­wardly resem­ble the train­ing images. Nev­er­the­less, they are derived from copies of the train­ing images, and com­pete with them in the mar­ket­place. At min­i­mum, Sta­ble Dif­fu­sion’s abil­ity to flood the mar­ket with an essen­tially unlim­ited num­ber of infring­ing images will inflict per­ma­nent dam­age on the mar­ket for art and artists.

Even Sta­bil­ity AI CEO Emad Mostaque has fore­cast that “[f]uture [AI] mod­els will be fully licensed”. But Sta­ble Dif­fu­sion is not. It is a par­a­site that, if allowed to pro­lif­er­ate, will cause irrepara­ble harm to artists, now and in the future.

The prob­lem with dif­fu­sion

The dif­fu­sion tech­nique was invented in 2015 by AI researchers at Stan­ford Uni­ver­sity. The dia­gram below, taken from the Stan­ford team’s research, illus­trates the two phases of the dif­fu­sion process using train­ing data in the shape of a spi­ral.

The first phase in dif­fu­sion is to take an image (or other data) and pro­gres­sively add more visual noise to it in a series of steps. (This process is depicted in the top row of the dia­gram.) At each step, the AI records how the addi­tion of noise changes the image. By the last step, the image has been “dif­fused” into essen­tially ran­dom noise.
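The noising phase can be sketched in a few lines of Python. This is a minimal illustration, assuming a hand-made two-turn spiral as the training data and a fixed noise amount `beta` per step; the function names (`make_spiral`, `add_noise`, `diffuse`) and all parameter values are hypothetical choices for this sketch, not taken from the Stanford research or from Stable Diffusion.

```python
import math
import random

def make_spiral(n_points=500):
    """Return n_points (x, y) pairs lying on a two-turn spiral."""
    points = []
    for i in range(n_points):
        theta = 4.0 * math.pi * i / n_points
        radius = i / n_points
        points.append((radius * math.cos(theta), radius * math.sin(theta)))
    return points

def add_noise(points, beta, rng):
    """One forward step: shrink each coordinate slightly and add Gaussian noise."""
    keep = math.sqrt(1.0 - beta)
    spread = math.sqrt(beta)
    return [
        (keep * x + spread * rng.gauss(0.0, 1.0),
         keep * y + spread * rng.gauss(0.0, 1.0))
        for x, y in points
    ]

def diffuse(points, n_steps=200, beta=0.05, seed=0):
    """Return every snapshot, from the clean data to the fully noised data."""
    rng = random.Random(seed)
    snapshots = [points]
    for _ in range(n_steps):
        snapshots.append(add_noise(snapshots[-1], beta, rng))
    return snapshots

def variance(points):
    """Variance of all coordinates pooled together."""
    values = [c for p in points for c in p]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

snapshots = diffuse(make_spiral())
print(f"variance at start: {variance(snapshots[0]):.3f}")
print(f"variance at end:   {variance(snapshots[-1]):.3f}")
```

With this schedule, almost none of the original spiral remains after the last step, and the pooled variance of the final snapshot sits close to 1, the variance of plain Gaussian noise.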

The sec­ond phase is like the first, but in reverse. (This process is depicted in the bot­tom row of the dia­gram, which reads right to left.) Hav­ing recorded the steps that turn a cer­tain image into noise, the AI can run those steps back­wards. Start­ing with some ran­dom noise, the AI applies the steps in reverse. By remov­ing noise (or “denois­ing”) the data, the AI will pro­duce a copy of the orig­i­nal image.
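The idea of running the steps backwards can be shown with a toy example. It assumes, purely for illustration, that the exact noise from each step is kept in a list (`recorded`, a hypothetical name); how a production model retains this information is outside the scope of the sketch. The data is a short list of numbers rather than an image.

```python
import math
import random

def forward(values, n_steps, beta, rng):
    """Noise `values` step by step, recording the noise added at every step."""
    keep, spread = math.sqrt(1.0 - beta), math.sqrt(beta)
    recorded = []
    for _ in range(n_steps):
        noise = [rng.gauss(0.0, 1.0) for _ in values]
        values = [keep * v + spread * e for v, e in zip(values, noise)]
        recorded.append(noise)
    return values, recorded

def reverse(noised, recorded, beta):
    """Undo the forward steps in reverse order using the recorded noise."""
    keep, spread = math.sqrt(1.0 - beta), math.sqrt(beta)
    values = noised
    for noise in reversed(recorded):
        values = [(v - spread * e) / keep for v, e in zip(values, noise)]
    return values

original = [0.0, 0.25, 0.5, 0.75, 1.0]
noised, recorded = forward(original, n_steps=50, beta=0.05, rng=random.Random(0))
restored = reverse(noised, recorded, beta=0.05)

# Only floating-point rounding separates the restored values from the originals.
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(f"largest reconstruction error: {max_error:.2e}")
```

Because this toy stores the noise exactly, its reconstruction is essentially perfect; it illustrates only the reversal itself.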

In the dia­gram, the recon­structed spi­ral (in red) has some fuzzy parts in the lower half that the orig­i­nal spi­ral (in blue) does not. Though the red spi­ral is plainly a copy of the blue spi­ral, in com­puter terms it would be called a lossy copy, mean­ing some details are lost in trans­la­tion. This is true of numer­ous dig­i­tal data for­mats, includ­ing MP3 and JPEG, that also make highly com­pressed copies of dig­i­tal data by omit­ting small details.
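What "lossy" means can be seen in a small sketch. It uses plain uniform quantization, which is an assumption made for brevity: it is not the algorithm used by MP3, JPEG, or any diffusion model, and the names `compress` and `decompress` are hypothetical.

```python
def compress(samples, levels=8):
    """Quantize each sample in [0, 1] to one of `levels` integer codes."""
    return [min(levels - 1, int(s * levels)) for s in samples]

def decompress(codes, levels=8):
    """Map each code back to the midpoint of its bucket."""
    return [(c + 0.5) / levels for c in codes]

original = [0.00, 0.13, 0.27, 0.41, 0.58, 0.72, 0.86, 1.00]
codes = compress(original)          # eight small integers instead of eight floats
lossy_copy = decompress(codes)      # close to the original, but not identical

worst = max(abs(a - b) for a, b in zip(original, lossy_copy))
print(codes)
print(f"largest difference from the original: {worst}")
```

The reconstructed values follow the originals closely, but the fine detail inside each bucket is gone and cannot be recovered from the codes.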

In short, dif­fu­sion is a way for an AI pro­gram to fig­ure out how to recon­struct a copy of the train­ing data through denois­ing. Because this is so, in copy­right terms it’s no dif­fer­ent than an MP3 or JPEG—a way of stor­ing a com­pressed copy of cer­tain dig­i­tal data.

Inter­po­lat­ing with latent images

In 2020, the dif­fu­sion tech­nique was improved by researchers at UC Berke­ley in two ways:

  1. They showed how a dif­fu­sion model could store its train­ing images in a more com­pressed for­mat with­out impact­ing its abil­ity to recon­struct high-fidelity copies. These com­pressed copies of train­ing images are known as latent images.

  2. They found that these latent images could be inter­po­lated—mean­ing, blended math­e­mat­i­cally—to pro­duce new deriv­a­tive images.

The dia­gram below, taken from the Berke­ley team’s research, shows how this process works.

The image in the red frame has been inter­po­lated from the two “Source” images pixel by pixel. It looks like two translu­cent face images stacked on top of each other, not a sin­gle con­vinc­ing face.

The image in the green frame has been gen­er­ated dif­fer­ently. In that case, the two source images have been com­pressed into latent images. Once these latent images have been inter­po­lated, this newly inter­po­lated latent image has been recon­structed into pix­els using the denois­ing process. Com­pared to the pixel-by-pixel inter­po­la­tion, the advan­tage is appar­ent: the inter­po­la­tion based on latent images looks like a sin­gle con­vinc­ing human face, not an over­lay of two faces.
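The contrast between the two frames can be sketched with a toy one-dimensional "image". Here `encode` and `render` are hypothetical stand-ins for a learned compressor and reconstructor, and the latent is a single number (the position of the bright pixel). That representation is an assumption chosen to keep the example small; it is not the one used by the Berkeley team or by Stable Diffusion.

```python
def render(center, width=11):
    """Decode: draw a 1-D 'image' with a single bright pixel at `center`."""
    return [1.0 if i == round(center) else 0.0 for i in range(width)]

def encode(image):
    """Encode: compress the image to one number, the index of its brightest pixel."""
    return float(max(range(len(image)), key=lambda i: image[i]))

def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return (1.0 - t) * a + t * b

source_a = render(2)
source_b = render(8)

# Pixel-by-pixel interpolation: two faint features, one from each source.
pixel_blend = [lerp(pa, pb, 0.5) for pa, pb in zip(source_a, source_b)]

# Latent interpolation: blend the compressed forms, then reconstruct.
latent_blend = render(lerp(encode(source_a), encode(source_b), 0.5))

print(pixel_blend)
print(latent_blend)
```

The pixel blend keeps both source features at half strength, like the overlay in the red frame. The latent blend produces one full-strength feature halfway between them, like the single face in the green frame.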

Despite the dif­fer­ence in results, in copy­right terms, these two modes of inter­po­la­tion are equiv­a­lent: they both gen­er­ate deriv­a­tive works by inter­po­lat­ing two source images.

Con­di­tion­ing with text prompts

In 2022, the dif­fu­sion tech­nique was fur­ther improved by researchers in Munich. These researchers fig­ured out how to shape the denois­ing process with extra infor­ma­tion. This process is called con­di­tion­ing. (One of these researchers, Robin Rom­bach, is now employed by Sta­bil­ity AI as a devel­oper of Sta­ble Dif­fu­sion.)

The most common tool for conditioning is short text descriptions, also known as text prompts, that describe elements of the image, e.g., “a dog wearing a baseball cap while eating ice cream”. (Result shown at right.) This gave rise to the dominant interface of Stable Diffusion and other AI image generators: converting a text prompt into an image.

The text-prompt inter­face serves another pur­pose, how­ever. It cre­ates a layer of mag­i­cal mis­di­rec­tion that makes it harder for users to coax out obvi­ous copies of the train­ing images (though not impos­si­ble). Nev­er­the­less, because all the visual infor­ma­tion in the sys­tem is derived from the copy­righted train­ing images, the images pro­duced—regard­less of out­ward appear­ance—are nec­es­sar­ily works derived from those train­ing images.

How Sta­ble Dif­fu­sion com­bines these pieces

Within Stable Diffusion, the pieces described above are implemented as three separate AI models that cooperate. For details of how these three models work together, see “Stable Diffusion using Hugging Face—Looking under the hood” and “The Illustrated Stable Diffusion”.

The defen­dants

Sta­bil­ity AI

Sta­bil­ity AI, founded by Emad Mostaque, is based in Lon­don.

Sta­bil­ity AI funded LAION, a Ger­man orga­ni­za­tion that is cre­at­ing ever-larger image datasets—with­out con­sent, credit, or com­pen­sa­tion to the orig­i­nal artists—for use by AI com­pa­nies.

Sta­bil­ity AI is the devel­oper of Sta­ble Dif­fu­sion. Sta­bil­ity AI trained Sta­ble Dif­fu­sion using the LAION dataset.

Sta­bil­ity AI also released Dream­Stu­dio, a paid app that pack­ages Sta­ble Dif­fu­sion in a web inter­face.


DeviantArt

DeviantArt was founded in 2000 and has long been one of the largest artist communities on the web.

As shown by Simon Willi­son and Andy Baio, thou­sands—and prob­a­bly closer to mil­lions—of images in LAION were copied from DeviantArt and used to train Sta­ble Dif­fu­sion.

Rather than stand up for its com­mu­nity of artists by pro­tect­ing them against AI train­ing, DeviantArt instead chose to release DreamUp, a paid app built around Sta­ble Dif­fu­sion. In turn, a flood of AI-gen­er­ated art has inun­dated DeviantArt, crowd­ing out human artists.

When con­fronted about the ethics and legal­ity of these maneu­vers dur­ing a live Q&A ses­sion in Novem­ber 2022, mem­bers of the DeviantArt man­age­ment team, includ­ing CEO Moti Levy, could not explain why they betrayed their artist com­mu­nity by embrac­ing Sta­ble Dif­fu­sion, while inten­tion­ally vio­lat­ing their own terms of ser­vice and pri­vacy pol­icy.


Midjourney

Midjourney was founded in 2021 by David Holz in San Francisco. Midjourney offers a text-to-image generator through Discord and a web app.

Though hold­ing itself out as a “research lab”, Mid­jour­ney has cul­ti­vated a large audi­ence of pay­ing cus­tomers who use Mid­jour­ney’s image gen­er­a­tor pro­fes­sion­ally. Holz has said he wants Mid­jour­ney to be “focused toward mak­ing every­thing beau­ti­ful and artis­tic look­ing.”

To that end, Holz has admitted that Midjourney is trained on “a big scrape of the internet”. Yet when asked about the ethics of massive copying of training images, he said:

“There are no laws specifically about that.”

And when Holz was further asked about allowing artists to opt out of training, he said:

“We’re looking at that. The challenge now is finding out what the rules are.”

We look for­ward to help­ing Mr. Holz find out about the many state and fed­eral laws that pro­tect artists and their work.

The plain­tiffs

Our plain­tiffs are won­der­ful, accom­plished artists who have stepped for­ward to rep­re­sent a class of thou­sands—pos­si­bly mil­lions—of fel­low artists affected by gen­er­a­tive AI.

Sarah Ander­sen

Sarah Ander­sen is a car­toon­ist and illus­tra­tor. She grad­u­ated from the Mary­land Insti­tute Col­lege of Art in 2014. She cur­rently lives in Port­land, Ore­gon. Her semi-auto­bi­o­graph­i­cal comic strip, Sarah’s Scrib­bles, finds the humor in liv­ing as an intro­vert. Her graphic novel FANGS was nom­i­nated for an Eis­ner Award.

Sarah also wrote “The Alt-Right Manipulated My Comic. Then A.I. Claimed It” for the New York Times.

Kelly McK­er­nan

Kelly McK­er­nan is an inde­pen­dent artist based in Nashville. They grad­u­ated from Ken­ne­saw State Uni­ver­sity in 2009 and have been a full-time artist since 2012. Kelly cre­ates orig­i­nal water­color and acryla gouache paint­ings for gal­leries, pri­vate com­mis­sions, and their online store. In addi­tion to main­tain­ing a large social-media fol­low­ing, Kelly shares tuto­ri­als and teaches work­shops, trav­els across the US for events and comic-cons, and also cre­ates illus­tra­tions for books, comics, games, and more.

Karla Ortiz

Karla Ortiz is a Puerto Rican, inter­na­tion­ally rec­og­nized, award-win­ning artist. With her excep­tional design sense, real­is­tic ren­ders, and char­ac­ter-dri­ven nar­ra­tives, Karla has con­tributed to many big-bud­get projects in the film, tele­vi­sion and video-game indus­tries. Karla is also a reg­u­lar illus­tra­tor for major pub­lish­ing and role-play­ing game com­pa­nies.

Karla’s fig­u­ra­tive and mys­te­ri­ous art has been show­cased in notable gal­leries such as Spoke Art and Hashimoto Con­tem­po­rary in San Fran­cisco; Nucleus Gallery, Think­space, and Maxwell Alexan­der Gallery in Los Ange­les; and Galerie Arludik in Paris. She cur­rently lives in San Fran­cisco with her cat Bady.

Con­tact­ing us

If you’re a mem­ber of the press or the pub­lic with other ques­tions about this case or related top­ics, con­tact stablediffusion_inquiries@saverilawfirm.com. (Though please don’t send con­fi­den­tial or priv­i­leged infor­ma­tion.)

If you’d like to receive occa­sional email updates on the progress of the case, click here to sign up.

This web page is infor­ma­tional. Gen­eral prin­ci­ples of law are dis­cussed. But nei­ther Matthew Butterick nor any­one at the Joseph Saveri Law Firm or Lock­ridge Grindal Nauen is your law­yer, and noth­ing here is offered as legal advice. Ref­er­ences to copy­right per­tain to US law. This page will be updated as new infor­ma­tion becomes avail­able.