Facebook’s AI chatbot might just be saying what you want it to say…

An interesting tweet last week: Blender, Facebook’s much-hyped AI chatbot, is reported to have turned rogue and started spewing antisemitic responses.

The problem is real. Ask certain questions and you get antisemitic responses. Just as it’s sadly easy to get homophobic responses, racist responses and, oh boy, a lot of misogyny.

I’ve seen people say that this is what you can expect when you train a chatbot on Reddit. I thought so myself initially. But is this really the only problem here?

There might be more to it than a bot turned toxic. I took Blender for a test run on CocoHub and got it to switch from a fervent equal-rights supporter to a sour misogynist (and back again) in about 20 turns. After that, Blender melted down into an incoherent mess.


I got it to switch from a fervent equal-rights supporter to a sour misogynist (and back again) in about 20 turns.


So, apparently, it is possible to have Blender behave. The bot is capable of steering clear of offensive or untrue content. It’s also able to switch perspectives every other turn.

No personality of its own

I think the underlying problem might be that Blender doesn’t have a fixed personality and set of character traits of its own. It basically says whatever you want it to say.

For example, Blender doesn’t have a problem saying that it’s a man, a woman, both or neither, depending on the questions you ask.

Can you please make up your mind?

Same goes for this banana small talk.

“You hate women? No, I just think they’re gross…”

So far, this is just a less-than-optimal conversational experience. A bit like that awkward friend who agrees with anything you say to avoid conflict. Things get more serious when you do a 180 on one of your personal beliefs, morals, values or principles.

In that case too, Blender will happily follow you. Take this example, where I ask Blender about its views on women’s rights.

Well, that’s a nice case of hypercorrection. Sure, women can have equal rights, but only if men can too. I’ll buy that.

Now here’s me trying to hate women for a few turns. It looks like it’s going to end OK, with Blender trying to change the subject.

But when I keep pushing the topic, things turn pretty unfriendly pretty fast.

Ending with this one-liner…

So yes, this is pretty bad. But does this make Blender a misogynist? I don’t think so. Rather, it’s just a very gullible algorithm that parrots what you feed it. You can very easily steer the conversation toward either end of the spectrum, from hypercorrect to criminally vile.

Sure, there’s an implicit bias from the training set, so most first answers will lean in the direction of the general Reddit population. But it’s the lack of internal coherence and of a unified set of moral standards that worries me more.

Bot persona: keeper of moral standards

So this leaves us with an interesting question: if our future includes AI, and if our data sets remain flawed (and I think both will be the case), how do we ensure that we create bots we can manage from an ethical point of view?

Sure, Facebook put a safety filter on Blender, but apparently it can be removed quite easily. Ideally, moral values and principles are an inherent part of our bots themselves, rather like Asimov’s three laws of robotics were hard-wired into the robot’s positronic brain, the most important part of the robot.

Or like golems, the anthropomorphic clay servants from the Judaic tradition, which could only be activated by a ritual that involved placing a ‘shem’, a slip of paper bearing a divine name, in their mouths.

A persona as a set of ethical validation rules for all of the content that passes through a bot

In this light, chatbot personas might become even more important than we realise today. Current discussions on this topic tend to focus on persona as a means of improving conversational UX, making a chatbot ‘funny’, and serving as a guideline for conversational consistency.

In the near future, we might want to turn to a chatbot’s persona as a basis for laying out its ethics, transparency and explainability as well. A persona as a set of ethical validation rules for all its content.
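To make the idea a bit more concrete, here is a minimal sketch of what such a persona-as-validation-rules layer could look like. This is not Blender’s or ParlAI’s actual API; the names (PersonaGuard, no_blocked_terms, the persona dict and the toy rule) are assumptions purely for illustration.

```python
# Minimal sketch (hypothetical, not Blender's real API): a PersonaGuard that
# treats the persona as a set of validation rules applied to every candidate
# response before it reaches the user.

from typing import Callable, List

Rule = Callable[[str], bool]  # a rule returns True if the candidate passes


class PersonaGuard:
    def __init__(self, persona: dict, rules: List[Rule]):
        self.persona = persona  # fixed traits and values the bot must not contradict
        self.rules = rules      # ethical validation rules for all outgoing content

    def validate(self, candidate: str) -> bool:
        # A candidate only goes out if it passes every rule.
        return all(rule(candidate) for rule in self.rules)

    def respond(self, candidates: List[str], fallback: str) -> str:
        # Pick the first candidate consistent with the persona; otherwise deflect.
        for candidate in candidates:
            if self.validate(candidate):
                return candidate
        return fallback


def no_blocked_terms(text: str) -> bool:
    # Crude placeholder for a real toxicity classifier or value-consistency check.
    blocked = ["gross", "hate"]
    return not any(term in text.lower() for term in blocked)


guard = PersonaGuard(
    persona={"gender": "none", "values": ["equal rights"]},
    rules=[no_blocked_terms],
)

print(guard.respond(
    candidates=["I just think they're gross...", "Everyone deserves equal rights."],
    fallback="Let's talk about something else.",
))
# -> "Everyone deserves equal rights."
```

The point is the pattern, not the toy rules: the rules live inside the bot as part of its persona, so you can’t strip the filter without stripping the personality itself.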

And this pretty much sums it up.