tanraak wrote:
I'm guessing you clicked here because you're aware of some of the discussion in art communities around generative ML models. In some corners of the internet, it's been a fierce one.
Are diffusion models a threat to artists? Stable Diffusion is all the rage. You might be wondering if I'm scared of it as an artist. By the end of this post, you'll see why I have the (perhaps surprising) opinion that Twitter is much scarier than diffusion models or any AI, and why diffusion models are the new psychedelics for artists. You might be in for a bit of a read, so flex your attention muscles for a bit with me. I'm currently sick, and don't want to get off the couch, so I have a lot of time to write this.
I have a bias here. Ever since the
pix2pix model came out 5 years ago, I've had an interest in using ML models with art. Seeing this silly example of a neural network transforming sketches into janky, scrimblo impressions of a cat may have changed my life. Pix2pix was one of the things that stoked my interest in studying ML, and I've thought about how it could potentially help me ever since. Now, as image generation technology moves from GANs to diffusion models, it seems like the promise of being able to collaborate with ML models is starting to be fulfilled.
Let me start by saying that I think diffusion models have the potential to help artists. I think this will be a matter of taste, where some artists get a lot of value from them and others find them unhelpful. In my mind, the most promising use for ML models is exploration and discovery. I see ML models as able to spitball ideas and point in potential directions that the human artist or writer or creator can then decide to explore further or not.
Take a break from reading this journal and go read this blog post right now. I think it's the most on-the-money take about ML that's out there. ML models don't reason like humans - in fact, they don't reason at all; they associate. A trained generative model is almost like a subconscious, with a vast internal space of patterns it recognizes that form constellations, and those constellations point to other constellations, until eventually the stars align into an image or sentence that, to the neural network, constitutes a plausible combination of patterns. This is what gives generative models a free-form, pseudo-psychedelic creativity. The model doesn't reason about objects; it doesn't know what a hand is, or that a hand has five fingers, and it doesn't even have "five" as an abstract concept. When it tries to generate a hand, it's because some constellations of patterns in its inputs tip it toward constellations of patterns related to fingers in images it's been trained on, and it will fearlessly generate characters with as many fingers as satisfy the model's urges, because "hand-ness" is just a collection of associations.
Let me tell you, the thing that most causes people to have wrong takes about neural networks is imagining them in an anthropomorphized way. I don't think science fiction has prepared us for the future we're living in, or headed toward. On the whole, it feels to me like sci-fi writers are relentlessly obsessed with the question of when and how AI will become like us. We imagine AI as Data from Star Trek, R2-D2 from Star Wars, etc. It doesn't help that the machine learning field loves to borrow buzzwords like "AI" and "neural networks," which lead people to imagine that ML models are somehow simulating a brain in a box, and to breathlessly ask questions about sentience and consciousness that, in my opinion, are not the right questions to be asking at the moment. Our human brains are so wired to project feeling and emotion and experience onto things that we develop sympathy and protective feelings for little four-wheeled food delivery robots, if they make beeping sounds that are cute enough and maybe have little animated eyes on them, even though ultimately they are just inanimate machines rolling around. A lot of sci-fi calls on us to be sympathetic to "AI" and imagine a future where we have to give robots rights because they're conscious and somehow fundamentally like us. Maybe that's a question for the far future, but the more that current-day ML progresses, the more I feel we need to be asking questions that go in a different direction.

It looks to me like what we have is AI that is far more intelligent than what we imagined, but in ways that are completely different from how we, as humans, are intelligent. With ML, we don't have a box that contains an artificially thinking, experiencing, and reasoning brain, but rather a box that contains some really clever matrix multiplications. And the way in which the matrix multiplications are clever produces an intelligence that is new, largely undiscovered, and largely unrelated to ours. This is the intelligence of being able to distill patterns and associations from billions of images and trillions of sentences, more than any human could see or read in a lifetime. It's underappreciated that time does not work for ML models the way it works for us. We humans are limited by how much time and mental bandwidth we have to take in and process stimuli, but ML models can consume billions of lifetimes' worth of video and images and distill the essential things about that data into their weights. We are intelligent because we can think; ML models are highly tuned equations that are intelligent because the volume of data they have consumed is so enormous.
The real power of ML models is their out-of-the-box way of combining ideas. We, as humans, learn rules about things. We learn what concepts belong together, what's sensible and what's not; for example, why it would be inappropriate for Jerome K. Jerome to be writing an essay about Twitter in 1897. But a lot of great art and storytelling is about breaking the pattern, about putting one thing out of place and asking "What if?" What if
women spontaneously turned into dragons in the 1950s? What if magic was real? Take another example:
The Persistence of Memory. At some point in our childhood, we learn the rule that clocks are solid things. One of the reasons this painting is evocative is how it breaks those rules of reality.
Artists have been using drugs like weed, LSD, etc. to be more creative since, like, forever. Part of creativity is allowing your mind to open up to the point where you can connect ideas in a new way by unlearning rules that the world has drilled into you. We might use the terms of psychology for this, like
divergent thinking vs
convergent thinking. Being a creative person calls for a lot more divergent thinking than one might normally engage in, because to make new ideas, you often have to break rules.
Here's an interesting dive into this as it relates to weed.
It just so happens that generative ML models, especially diffusion models, can be really good at divergent thinking. Why? Because they associate rather than reason. Because, unlike sci-fi's depictions of coldly logical, hyper-rational AI that is programmed to think by rules (think I, Robot), generative ML models don't think by rules. A diffusion model's world is made up of constellations of patterns in data, and it has no mind to reason about the fact that a hand should have five fingers or a clock should be a solid object. In ML models, artists have a potential collaborator that can take in a work-in-progress of some kind of creative project and spit out completely unexpected ideas. The ideas it produces may or may not make any sense in the slightest. That's fine; the human collaborator can do all of the convergent thinking, and the AI model can help with the divergent part of the process, guided by the human through threads in its latent space of possibility.
So, for my pic of Trisk, I generated almost
400 tries from the Stable Diffusion model. Early in the process, I allowed the model much more freedom to riff on the pic, and over time, my collaboration with the model narrowed toward a certain kind of aesthetic. You can see from the AI-generated images that things don't make sense. Her head melts into weird abstract patterns; the structure of her body melts into a psychedelic mess. But that's okay; I'm the artist. I can keep what I need and ignore what I don't, and steadily guide the model in a direction with lots of compositing, masking, paintovers, etc. Even if I had kept nothing from what the model generated in the final image, playing around with the model opened my mind to new possibilities for the background that I hadn't thought of.
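If you're curious what that back-and-forth looks like mechanically, here's a minimal sketch of the kind of image2image loop I'm describing, written against Hugging Face's diffusers library. To be clear, this is illustrative, not my literal workflow: the checkpoint name, prompt, and file names are made-up stand-ins. The key knob is the strength parameter - that's the "freedom" I'm talking about. It sets how much noise gets added before the model re-denoises, i.e., how far the model is allowed to drift from your input.

```python
# A rough sketch of the riff-and-narrow loop, using the diffusers library.
# Checkpoint name, prompt, and file names are illustrative stand-ins.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # any Stable Diffusion checkpoint
    torch_dtype=torch.float16,
).to("cuda")

wip = Image.open("trisk_wip.png").convert("RGB")  # the work-in-progress painting

# strength controls how much noise is added before denoising:
# high strength = wild riffs far from the input; low strength = gentle
# variations close to it. Early on you run high, later you run low.
for strength in (0.8, 0.6, 0.4):
    batch = pipe(
        prompt="portrait of a character, dramatic lighting",  # illustrative
        image=wip,
        strength=strength,
        guidance_scale=7.5,
        num_images_per_prompt=4,  # a small batch of riffs per setting
    )
    for i, img in enumerate(batch.images):
        img.save(f"riff_s{strength}_{i}.png")
```

In practice, of course, it's messier than a loop: between batches there's compositing, masking, and paintovers, and most of the outputs get thrown away.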
Twitter, Reddit, and the internet in general tend toward dunk culture. One of the trends on art twitter is dunking on AI-generated images. For example: Look how janky this is! Look at this silly output from the diffusion model, lol. Put an artist's work and a generated image side by side and look at how much better the artist's work is.
I think this misses the point. It's another case of over-anthropomorphizing AI. We want to imagine AI as like us, and therefore we think of the diffusion model as another artist. If we see it making bad art, we go, "Look, it's a bad artist." But to get ML models right, it's important to break out of the way of thinking that sets AI alongside humans. The AI is like a subconscious on a drug trip having hyperreal dreams; it's lived millions of lifetimes, yet it doesn't know what reality is. Things will come out weird and incomplete, just like when the diffusion model riffed on my work-in-progress image of Trisk by dreaming her head melting into a bunch of goo. That doesn't make it a bad artist, because it's not an artist, and its intelligence is not like human intelligence (nor do I think a couple more years of ML research will fix those inconsistencies; they're a result of pretty fundamental things about the technology).
So do we need to worry about AI replacing artists? I don't think so. For that, ML research would have to come up with AI that has true semantic understanding and real cognitive processes. Maybe that will happen in our lifetimes, maybe it won't. Either way, I don't think humans are going to be replaced in the arts, because humans like humans. We have culture and memes. We understand the social context of the things we make and the way a human might feel when someone sees our art or hears our music. We can feel things with each other. You can't feel things with an AI; while AI can make you feel things, it doesn't feel them along with you. Because of that, the AI has a lot of blind spots. Consider the artists who draw raunchy feral TF furry porn. Why would an AI have reason to combine those ideas? An AI has no feelings or goals, so it wouldn't be motivated to discover that part of the latent space of possible images - the latent space of people-turning-into-dogs-fucking, or foxtaur-with-four-tits-and-dicknipples. But human artists go there, because human artists participate in the human experience - of sexual urges, passions, fears, hopes, dreams, and so on. If an AI were to generate those things, it would be because they exist in its dataset - because humans were interested in them and made examples for it to look at. At the very least, they would have to be easily extrapolated from its dataset.

I think that, for example, the "AI is going to replace concept artists" strain of thought is ridiculous. It's a line of thinking that imagines artists as pretty-image generators. But the reason you want a concept artist is not just that they can render pretty colors and realistic lighting or make you a neat-looking image. Artists are important because they understand the meaning of what they're making, because they can collaborate with you and have conversations and understand you, and because they have some kind of taste. I think a lot of non-artists, including some people who call themselves "AI artists," misunderstand the nature of what artists learn. Growing as an artist is about learning to see, not learning to be a factory for prettily rendered images. Learning to be observant, to use your eye, and to reason about what you see is an incredibly valuable skill that concept artists have, and I think anyone who believes such people can be replaced by an AI that can make lots of images but has no semantic reasoning is being shortsighted.
So am I worried that AI models are going to create so much art that no one pays attention to me? That real artists will get buried in a flood of generated art? Oh my god, no. If I'm worried about competition, it's not from AI. It's from other humans. I literally do not have the time or mental bandwidth to take in all the amazing stuff I see every day when I log in. I have over 50k Furaffinity submissions in my inbox, because at some point I fell behind on checking and deleting them. Pretty much every day on Twitter, I discover amazing new artists I wasn't aware of, with tens of thousands of followers and brilliant art. And the furry fandom is so full of brilliant 18-21 year olds in the new generation of artists, who have boundless energy. As a rule, the humans are still way better at it than the AI. That's what I'd worry about, if I were worrying about competition.

In fact, as an American artist, one of the challenging things in the furry community is how many other artists set very low pricing expectations for commissioners because they live in very low cost-of-living areas. For example, some of the most brilliant and amazing artists I know are from Eastern European countries where they can get by on, like, $300 a month, and don't mind doing an $80 commission. My bare living expenses are astronomical compared to theirs. Besides that, there are a lot of furry artists who are hobbyists, or live in cheap areas, who could charge 10x what they do and literally just don't, because they aren't aware, or don't care, or have a ton of social anxiety about breaking off from the pack. I have been at this long enough to have a thick skin, so I charge what the market will give me. It's not my job to make my art affordable for every rung of the income ladder; I only have the capacity to work for 30-40 people in a year, max, anyway. But I do get people who react with scoffing and envy because I charge more than they expect, or more than they think it's fair for an artist to make - comparing my prices with those of a bunch of crazy, young, Eastern European artists with infinite energy who'll turn out masterpieces and be content with income that would be poverty for me. Do I care about AI as my competition? No. Talented humans are way better, and there are a lot of them.
Speaking of dunk culture, there's a kind of toxicity from certain people on the "AI side," too. The geek redditors and tech bros respond to artists' concerns about the ethics of AI, or their hostility toward diffusion models, with the mentality of "Get out of the way, the train's coming through" and "You can't uninvent the technology." This is the essence of the kind of tech-bro culture that's out there: being interested in what's possible and completely uninterested in whether their use of the tech is respectful or tasteful, or how it might affect people. The attitude is: The train's coming through, and if it happens to change things around for you, it's your job to adapt to it, and not our job to worry about that.
Let's be clear that you can absolutely steal artists' work with ML models. In fact, that is literally what people are doing. For example,
this kind of thing can happen. There are "AI artists" out there on Twitter whose generations are not only heavily based on the styles of various popular artists, but are literally created from their work; part of the process is conditioning the model on the artist's work, compositing in parts of it, and so on. The end result is AI art that visibly lifts the styles of popular artists, and therefore benefits from their work, but is put out into the world with no credit to them. To understand why it might be upsetting to artists to have their names used in prompts, or their art fed into an image2image sampling process, we have to understand the culture of attribution in art communities.
Art communities tend to function on a culture of attribution and credit. I think it's the lifeblood of FA, actually. Organically, a culture developed on art communities like DeviantArt and Furaffinity of doing art trades, commissioning art, and posting things with credit, either to the commissioner or the artist. This is the glue that holds art communities together. Artists are recognized for their work; commissioners are recognized for their OCs; and as an artist, if you know your work is going to be credited to you, you don't mind doing commissions of other people's OCs that they're going to post and share. If this weren't the case, not only would it be a lot harder for artists to make a living, I think whole swaths of artists would basically lose their motivation to make art, because art is pretty damn hard to make and requires a lot of toil and tears, and it doesn't feel as worth it if what you post is going to be reposted without credit. The culture of crediting and attribution - having something that looks like a community, where commissioners can refer to the artists' profiles and artists can refer to the commissioners' profiles - is a large part of what makes it worthwhile on both sides.
What about styles? One of the arguments for AI being used to copy an artist's style is "you can't copyright a style!" This parallels other things you hear from the AI art community in that it's entirely concerned with what's possible, what's legal, what's allowed. "We can do this, and you can't stop us" is an argument that art tracers could use, too, and no art community anywhere would accept it. Sure, you can take an artist's art and trace the pose. Maybe you can't copyright a pose, or stop an underlying pattern from being traced. But people aren't appealing to copyright law when they ask you not to trace their art. People are upset because it's disrespectful: you're using someone else's work and benefiting from it without credit.
Because of this, the tribes that are currently hostile toward each other - the pro-AI-art geeks who scoff at whiny art communities trying to resist progress, and the hand-wringing, frustrated artists who don't want their styles stolen and dunk on comically bad AI generations - are having conversations on completely different wavelengths. From the perspective of artists, it's about what's respectful and acceptable to do in the communities they are a part of, and whether you can do something is secondary to whether you should. You can trace my art, and I can't stop you. You can use my art on anonymous furry RP accounts and steal people's characters, and unless I can invoke DMCA power, I generally can't stop you. But our community is not going to accept that as okay. If I am upset because you took my character and are using it as your RP body on some anonymous account - on the grounds that my character has a lot of meaning to me, is deeply personal to me, and belongs to me and not you - I am making a human argument about what's fair and right. Similarly, the conversation going on in art communities is on the wavelength of figuring out the boundaries and norms and feelings around AI art. When is it okay to use? When is it taking advantage of someone's work? If you're using my name in your prompt, or even worse, my work to condition your image2image generation, and intentionally generating stuff that's very similar to mine, how different is that from copying my work in other ways, like tracing it? Why even develop a style as an artist if it's going to be used without your consent in an AI model? On the other hand, in AI art communities, the conversation generally has a techno-fatalism and a techno-hype to it. The discussions are often not people-oriented.
Take privacy advocacy. There is a group of people out there who are very concerned about how much everyone is tracked, how much is collected and known about individuals, and the power imbalances and chilling effects this can create on people's behavior. Privacy advocates argue that we don't want to live in a surveillance state, or under corporate surveillance that feeds our data into black-box models that make decisions about us that we can't appeal. This is not an insignificant debate. I can imagine a lot of the same lines of thinking that come from techno-fatalism about AI being applied here: Well, we have technology that tracks us everywhere, the cat's already out of the bag, everything's built around this technology, your cell phone company knows where you are 24/7, so just accept the new reality and let us make a little money off your data. But privacy advocates make their case on the grounds of the kind of society we want to have. The point isn't to naively pretend technology is going to be uninvented, but to have a conversation about what is okay to do with technology and what we're willing to live with. Please go read
this cautionary tale about the dual use of generative networks.
Let's tie this back to how important the culture of attribution is to art communities. People generally don't want to live in a world where what they create is used by someone else, without consent and without attribution. People don't want their OCs used without consent, either. What's at the bottom of this? Basic respect for an individual. The argument is: I made something. Please respect me by asking if you want to use it. How much AI has to transform that thing before it stops being considered stealing is going to be up for debate, and ironed out community by community. Is it copying a style? Is it conditioning the model on my work? Compositing pieces of my work and transforming them with a certain denoise strength? How different is different enough? We'll figure that out over time.
There are people who, for sure, just don't care if I tell them that what they're doing is disrespectful. This is the person who takes my OC and uses it on their account without asking, pretends to be me, etc. If I tell them it upsets me, they might even feel schadenfreude about that, if they're a troll. It would be naive to think we can make that aspect of the internet go away entirely, but we don't have to accept it as okay. Will trolls exist? Yes, but that's what we moderate our communities and curate our feeds for. There is certainly a strain of internet culture out there built on the idea that you can do what you want, that the internet can be a nihilistic hellscape where nothing matters, and that if you expect some respect as an individual, you are only revealing your naivete. It's a crabs-in-a-barrel mentality, and I think it winds up with people making each other miserable, nonstop. Maybe there are a lot of miserable people out there, and there's a big can of worms about how trolling, hate, anon culture, and ironic distance all work, but it's too big to open up here. The point is that arguments for using technology in an ethical and respectful way are for people who are, ultimately, interested in things like respect and community.
Let me make a surprising final point. I'm way more scared of Twitter than of Stable Diffusion. Actually, I'm more scared of the platforms in general. I don't think we grapple enough with how much power the platforms have. Remember, above, I made the claim that the culture of attribution is important to art communities. Furry artists make a living on platforms, by being able to use social media to build a business. But platforms can decide they're not that interested in giving creators space to build businesses. Artists on furry twitter can no longer @ their commissioners when they post, because they've discovered the algorithm punishes them for doing this on a media post. The significance of this is hugely underappreciated. Maybe I'm doing a commission for @tanraak, but when I post it, I have to un-link their name like this: @/tanraak. But what if Twitter starts to think that having extra symbols like @/ together is also a ding on the algorithm's score? What if Twitter just decides we're not the kind of people they want around and slowly grinds down communities by punishing their algorithm rank? Artists are afraid to even use the word Patreon now that they've discovered it punishes their posts.
The relationship artists have with these platforms is like this: You post content to the platform, but it's not free. It has to earn attention, or the platform will not show it to many people. Why? Because all the platform cares about is that its users are spending their attention on the platform. Now, if the platform has enough options for interesting things to show its users, you start to lose bargaining power with the platform. Maybe I want to post about my Patr*on or my C*mmission opening on Twitter. Posts about Patreon and commissions don't get as much engagement, so the algorithm learns to show them to fewer people. If my content is not earning its keep - meaning, it's not earning enough attention - the platform will replace me with someone else's content. I am a tiny bubble in a sea, and my community of people is just froth around the top of the algorithm. On old school social media, I have more freedom to do things that cost the platform attention. If I post lots of stuff on Furaffinity about my patreon and small business, that might cost Furaffinity a little bit of attention. I am costing the platform a little bit in exchange for the growth of my small business. But on many of the more modern platforms, the creators seem to have less and less power to do this; every post has to earn the platform the maximum amount of attention.
Frankly, I think the centralization and power of the platforms is way scarier than AI. Can I adapt to a world where I might consider using diffusion models in my workflow as an artist? Hell yeah. Could I adapt to a world where the platforms I rely on decided, suddenly, that they didn't need me anymore and were going to become unfriendly to my community or my content? That would be devastating. Most of all, I don't want Furaffinity or furry sites to go away, because they have the things that made art communities work in the first place: the culture of attribution. Linking to profiles, having galleries of art, having a platform that allows you to promote yourself and your small business to your followers - that's the lifeblood of it.
I remember when I was in middle school and had a group of friends who needed a way to communicate. We didn't have phones, and the iPhone didn't exist yet anyway. One of us decided to use his computer to host a PHP forum for our little group of buddies, so we could all talk and keep up. Since it was on his home computer, it was up when his computer was on, and if he was away on a family vacation, our little forum would be down. There was no domain name, so we'd find it by entering his house's IP address; of course, he had to set the network up for that, but we were all a bunch of nerds, so that was no problem. Since the IP address wasn't static, if it changed, we'd just tell each other the new one. Our own little corner of the "dark web" for kiddies. We had a lot of conversations on that PHP forum. Then, at some point, Facebook came along and started to look appealing. I remember the moment we had the conversation about shutting our message board down. Why bother hosting our own message board? Facebook does everything for us. We don't need to waste power keeping this home server up when we can catch up on Facebook now.
If I could go back in time, I would be the Charles Dickens-style Ghost of Christmas Past, there to give a grave warning about the choice we were about to make. Don't do it! God, that moment is, to me, something that symbolizes the entire, enormous, unstoppable process that has unfolded over the past couple decades of the internet centralizing onto a few big platforms. Smaller communities having to move onto the platforms to stay relevant. It becoming increasingly tough for communities to host their own stuff as the technical requirements, and the attention-addictiveness required to compete with the big platforms, rose. Some people complain that Furaffinity looks like it's from 2005, but that's one of the things that makes me comfortable with it. It's still one of the old guard of social media sites: things don't seem hidden in black boxes, you don't have to guess at the rules for getting your content seen, and it probably will never have the sophistication to turn into a pure ML-based feed like Twitter. (I wouldn't mind if it did add some kind of recommender system, but I don't know that they have the technical capability for that.) I just hope this process of consolidation doesn't come for the platforms I care about, like Furaffinity and VRChat. I know some people who make VRChat content professionally, which to me is an even scarier place to be, because who knows what the platform is going to decide in the future? Let's all say a little prayer for VRChat's future, and that it stays as weird and furry and anime and user-content-driven as it currently is.
Anyway, that's my hot take: If the furry fandom ever moved entirely to Twitter, that would be scarier for me than diffusion models. Living and dying by the random things platforms decide, and playing the algorithm game, is way worse than having a cool little ML model that can generate neat images and fake art.
Long post! Very long post. Hopefully, for the three people who will actually read the whole thing, it was edifying.