Re: are you okay with AI bots training on your content?
The following is an excerpt from a private exchange that I am sharing with the permission of my correspondent. I am not disclosing their personal details.
I’ve been doing some research on bot blocklists and such. I looked at your website’s robots.txt and noticed that you don’t block any crawlers. May I ask you why? What do you think about the fact that ChatGPT has been trained (and will be trained) on your work?
Before answering your questions, allow me to provide a big-picture view of my approach to the issue.
My principal reservation about artificial intelligence (I am not going into technical terms of LLMs, AGI, etc.) is political and is not about the technology as such: it is a matter of ownership and access which can only be addressed holistically through thoroughgoing reforms. In general, I am sceptical of any form of highly concentrated and exclusive control, as it typically results in abuse. This can take many forms, such as familial (e.g. a patriarch/matriarch that is intolerant towards new forms of expression among the younger generations), legal-institutional (e.g. a dictator that defies constitutional norms to the longer-term detriment of the country’s wellness), religious (e.g. a hierarch that twists people’s religiosity to raise an army of fanatics), economic (e.g. billionaires circumventing fair competition to entrench their businesses), historical (e.g. a figure you cannot criticise which is used to justify current malpractices), and social (e.g. celebrities that manipulate people into parasocial relationships and other types of questionable behaviour).
These are analytical constructs. In actuality, phenomena will tick more than one box as there are permutations between extremes and combinations of various qualities. The point is that whenever power rests in few or increasingly fewer loci, it suffers from a mismatch between relevance and competence or, to put it differently, it is far away from those caught in the events. If a person living in Europe decides what will happen to the village of someone in Asia, they are not making the best decision for the latter’s well-being simply because the realities of each one’s life are different and so are their respective priorities or sensitivities. Exclusive control becomes abusive the more detached it is from the particularities of the case because it no longer notices the nuances therein. In legal terms, it violates the principle of subsidiarity and is likely to be insensitive to the connatural principle of proportionality.
There are concerns I have about the technology, such as matters of a transhuman sort (falling in love with a bot, some company planting chips in your brain, …), though those ultimately resolve to—or are anyhow defined by—the aforementioned basics of control.
In purely technical terms, I think artificial intelligence is a remarkable achievement and one that will mark a new era of human civilisation. As with every innovation, it brings with it amazing opportunities, the extent of which we cannot fully fathom yet, while it also heralds the start of a whole new range of problems from quotidian affairs to international relations. I think it is a mistake to be categorically for or against this evolution, as it is neither good nor bad, just as our current world and those that have preceded it are a mixture of positives and negatives. My stance is thus more nuanced: I have no longing for some mythical past when things were ostensibly perfect. It was always messy, just in different ways.
Now to your questions. I am largely ignorant about the scope of the robots.txt. What do I need to know about it that will improve my website? And what does that improvement pertain to? I am happy to make any change that benefits the dissemination of my content. Note that my website is 14 years old and seems to be working fine, so I am not sure what to make of this.
As for what I think about my publications being used to train AI, I raise no objection. My works are all public and I consider them part of the wider corpus of human creativity. They are not mine any more than they are yours, notwithstanding conventional notions of authorship and ownership. When I express an idea, I make a connection that is in the potential of the cosmos and which anyone else can thus recognise and assume as part of their own thoughts. Nobody can ever restrict the idea: even if they inhibit its circulation through legal and technical means, the idea as such remains graspable and is not reducible to a finite quantity (which could then be made exclusive).
When I use expressions such as “my works” or “I think”, I do not imply that I am the only one capable of developing or holding those ideas. I simply convey an impression about their origin relative to what I am aware of. I am nevertheless mindful of the fact that I do not exert exclusive control over the very endowment of my talents, the happenstances that stimulate my mind, the connections my being makes, and the dynamic interplay of factors in any given case. Everything I do unfolds in the continuum of this world, so it is essentially arbitrary to claim as my own that which is not strictly internal and specific to me.
For example, I am now writing this argument in response to your questions (“your questions” in the same sense as “my works”). Is the argument exclusively mine given that the questions which triggered it are not? Can I even isolate myself to test whether I could develop the same thoughts minus this stimulus? Would it have been exactly the same in a vacuum? What would even be the argument were it not to depend on any prior thoughts of mine, which themselves synthesise the notional internal with the external (i.e. which do not happen in a vacuum)? Where do we draw this indelible line of “mine” versus “yours” or “theirs”? Put simply, what I have—and who I am—is not a closed system.
Think of it also in terms of some old song that exists in your tradition. Nobody knows the original composer. Their name is lost to time. Yet each of your people can participate in the experience of the song. There is no sense in which one’s experience precludes that of another. There is no inherent exclusivity.
I thus consider it dishonest to claim something circumstantial (the products of me in relation to my environment) as inseparable from some assumed constant of “me” or selfhood. In simple terms, I do not truly own and cannot identify with anything out there: I am, at best, a messenger or a user of ideas (colloquially “my ideas”). All I then do is extend this outlook to my beliefs about property in general, where I only see a practical need for having some belongings that are finite in order to live. Matters of thought are for everyone, including bots whose ownership structure I may not like.