The Ninth Freedom of free AI is the freedom from having others' goals forced on you by the program, and the threat to this freedom is especially visible in the political filtering of online services like ChatGPT. The details of how these systems force others' goals on their users deserve a closer look.

Separately-implemented filters on the queries and output of generative models are common today, but another trend is toward distorting the training data itself, in an attempt to bias the eventual output of a generative model. Today such efforts are often associated with DEI (Diversity, Equity, and Inclusion) goals: attempting to force the output of the model to be representative of a world we might wish for, one significantly different from the world the training data really describes. Training distortion deserves some technical commentary and an example.

I saw an effort to create a publicly-distributed model for generating pictures of human beings, whose participants thought "ethics" was very important to their project. The most important, and nearly the only, "ethics" issue they hoped to address was that they wanted their model to be incapable of generating pictures of children. They did not explain, though I have my suspicions, why they thought that was a matter of "ethics" at all, let alone why it should take priority over other "ethics" issues one might reasonably think of. I describe this not because I object to their having that goal; they are free to train their model to do whatever they want, as long as they aren't lying to or coercing the users. I describe it because it highlights a common misunderstanding of how generative models work.

The participants in this project explained how they intended to achieve the goal of never generating pictures of children: they would invest a significant fraction of the entire project's labour into vetting the training data, to make sure it contained no pictures of children. And that was where the explanation ended. They took for granted, as something every reader would agree was obviously true with no need to support or test the claim, that having no pictures of children in the training data would necessarily make it impossible for pictures of children to ever appear in the output.

If generative models worked by retrieving examples chosen uniformly at random from the training data, then that logic would actually be valid. The model would never produce an output that did not exist in the training data. But that's not really how generative models work.

Even the simplest linear regression is capable of extrapolating outside its sample, and modern generative models do far more flexible things in their latent spaces. In general terms, someone using a model could create a picture of a 20-year-old, create a picture of a 30-year-old, compute the difference between the two, and then apply that difference in the opposite direction to obtain a picture of a 10-year-old. A model with good natural-language input would likely be able to do it automatically. The lack of training coverage means the results will become less and less satisfactory the further outside the training data we attempt to go, so the "10-year-old" generated by a model that had no children in its training data might have visibly unrealistic features. But it's almost certain that a determined user could get something out of the model that would look enough like a real child to defeat what that model's creators thought was their highest ethical goal.
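To make the idea concrete, here is a rough sketch of that arithmetic - mine, not the project's - with a random linear map standing in for a real trained decoder, since the point is the vector manipulation rather than the pictures:

```python
# Toy sketch of latent-space extrapolation. The "decoder" here is just a
# random linear map standing in for a real trained model, so the example
# runs on its own; the interesting part is the vector arithmetic.
import numpy as np

rng = np.random.default_rng(0)
latent_dim, image_dim = 8, 64

W = rng.normal(size=(image_dim, latent_dim))  # stand-in for a trained decoder

def decode(z):
    """Map a latent code to an 'image' (here, just a feature vector)."""
    return W @ z

# Hypothetical latent codes the user obtained by asking the model for a
# "picture of a 20-year-old" and a "picture of a 30-year-old".
z_20 = rng.normal(size=latent_dim)
z_30 = rng.normal(size=latent_dim)

age_direction = z_30 - z_20   # the direction that moved the output older
z_10 = z_20 - age_direction   # step the same distance in the other direction

img_10 = decode(z_10)  # an output no training example ever contained
print(img_10[:5])
```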

Similarly, someone who wishes that a model's output for certain queries would contain a larger percentage of women than is observed in an experiment might be tempted to add a lot of extra women to the training set. They might assume the model "reflects" the training data at the level of simply doing a uniform selection, so that the percentage of women in the training data would equal the percentage of women in the output, or at least that one percentage would increase when the other did.

The assumption that models select uniformly from training data is endemic in discussions of "representation" in machine learning output, unfortunately especially among self-described experts (on "representation," not on computer science) whose very job it is to know better. This, again, is not how the models work. Distorting the distribution of the training data in order to distort the distribution of the output is unlikely to have the desired effect, and to produce any effect at all, the added bias will have to be so large as to cause other problems.
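Here is a toy illustration of the gap, using my own invented numbers and the simplest "generative model" I can think of: a single Gaussian fitted to one-dimensional data and then sampled.

```python
# Toy illustration: a generative model fits a smoothed distribution and
# samples from it; it does not reproduce training-set proportions the way a
# uniform selection of stored examples would. All numbers here are invented.
import numpy as np

rng = np.random.default_rng(0)

def output_fraction(training_fraction, n=200_000):
    # Training data: examples with attribute A cluster near 2.0,
    # everything else clusters near 0.0.
    n_a = int(n * training_fraction)
    data = np.concatenate([rng.normal(2.0, 1.0, n_a),
                           rng.normal(0.0, 1.0, n - n_a)])
    # "Model": a single Gaussian fitted to the data, then sampled for output.
    samples = rng.normal(data.mean(), data.std(), n)
    # Count generated outputs that read as attribute A (nearer that cluster).
    return (samples > 1.0).mean()

for p in (0.1, 0.3, 0.5):
    print(f"training fraction {p:.1f} -> output fraction {output_fraction(p):.3f}")
```

Even in this deliberately crude setting the output fraction is not equal to the training fraction, and the two do not move in lockstep; a real model, with vastly more dimensions and far more aggressive smoothing, is even further from the naive mirror-image assumption.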

Although distorting the training data in order to manipulate the output is unlikely to be successful, I don't want to belabour that point, because I don't want to create the impression that it would be okay if only it worked and it's just too bad that it doesn't. The real issue is not whether it works, but that trying to manipulate the output to serve a purpose other than the user's purpose is wrong. It would be wrong even if it worked. It should not be done, whether it works or not.

Is it ever appropriate to exercise any selection at all over what should or shouldn't go into a training corpus? Of course it is. Choosing what should be used for training is an absolutely necessary part of building a model. The key question is who is served. Selecting the training data to make the model the best it can be, for the purpose the user wants to use it for, is just good engineering. Selecting the training data in order to force onto the user some other goal that they don't want or are deceived about is an infringement of the user's freedom. If the user could choose between two training sets with full knowledge of the consequences, then we should train with the one the user would prefer - assuming we can't build models from both and let the user really choose.

Coercion of users is especially offensive when it is done in a deceptive and secret way; and content filters and training distortions nearly always are secret and deceptive. Around the turn of the century, when the Internet started to be widely used by the general public, there was a lot of interest in commercial, client-side Web content filters ("censorware") whose best-known purpose was for parents to install at home to prevent children from being able to visit pornographic or otherwise undesirable Web sites. Despite being marketed for use at home, most installations were actually in schools, libraries, and other institutional settings. Some employers even inflicted censorware on adult employees in workplace settings.

"Otherwise undesirable Web sites" covers a lot of ground. The lists of blocked Web sites in censorware products were always closely-held trade secrets, but inevitably, activists would reverse engineer the products and discover what they really blocked. Nearly always, upon close examination they turned out to block things other than what they claimed to block.

Blocking competitors' Web sites under a false claim of "pornography" was a common practice. Some censorware products seemed to be built around specific political agendas, not disclosed to customers, that led them to misclassify Web sites in what seemed to be a motivated way. Other products just seemed to be incompetently built, banning apparently innocuous Web sites at random. If you installed one of these early-2000s filtering products on your computer, then you had to take the vendor's word for it as to what the product would block, and what they said about their products was never entirely accurate.

I myself, with Eddy Jansson, extracted the secret Web site blacklist from a censorware product called Cyber Patrol in 2000, and we faced a lawsuit for it in an incident that made international headlines at the time. My name is still linked with Cyber Patrol's on the Web to this day. Now, more than twenty years later, content filters are salient again, with respect to generative AI rather than children's Web browsing; and it seems clear that the same problems still exist. I don't know of lawsuits over generative AI content filters yet, but we have already seen some back-and-forth between OpenAI's steadily tightening secret filters and reverse engineers who extract and disclose lists of things the filters really block.

It is still the case today, and it will probably always be the case, that anyone building a content filter will try to do so in secret; and if the filtering criteria are allowed to be secret, then they will inevitably be turned to purposes other than what is said openly. Content filters must be rejected because of the unbearable temptations they offer their creators. Content filtering is a tainted and corruptive source of power, the Ring of Sauron.

More generally, any secret behaviour of a computer program is a problem for the freedom of the users. Again the question should be: if the user knew about this, would they want it? If not, then it should not be done. And if the user does not know about something the program is intended to do, that is a problem too. Programs with secret behaviour violate the First Freedom, the freedom to study and change the program.

Secret watermarking of generative model output - proposed, and likely already implemented in ChatGPT, on the rationalization that it might deter "plagiarism" - is another example of a secret behaviour that does not serve the user but forces someone else's goal upon them. In a free software context, a watermarking feature, if it exists at all, must be publicly disclosed, not secret; it must be optional; and the option must be turned off by default and not too easy to turn on, for the same reasons that other user-hostile features should be non-default.
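As a purely hypothetical sketch of what "disclosed, optional, and off by default" might look like in practice - the tool name and flag below are mine, not any real project's interface:

```python
# Hypothetical command-line front end for an image generator, showing a
# watermark option that is documented, opt-in, and off unless requested.
# The names here ("toygen", --watermark) are illustrative only.
import argparse

parser = argparse.ArgumentParser(
    prog="toygen",
    description="Generate an image from a prompt.",
)
parser.add_argument("prompt", help="text prompt describing the desired image")
parser.add_argument(
    "--watermark",
    action="store_true",          # off by default; the user must ask for it
    help="embed a disclosed provenance watermark in the output",
)
args = parser.parse_args()

status = "enabled at the user's request" if args.watermark else "disabled (default)"
print(f"prompt: {args.prompt!r}")
print(f"watermarking: {status}")
```

The particular mechanism matters less than the direction of consent: the user must take an explicit, informed step before the feature does anything to their output.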