Literary Genderswap
Update from the future: I periodically check in on the state of “coreference resolution” in NLP (which is the core of this issue), and repeatedly find that it is still a hard problem. If it has become an easy problem, please let me know. I’d be very interested in writing a v2.0.

I am something of a skeptic. There is nary a proposition that passes my ears that I don’t cross-examine, even if it aligns with my own sensibilities – perhaps especially if it aligns with my own sensibilities. When I came across some discussion about the lousy representation of women in literature, I wanted to quantify how true it was.

There is a simple way to cast light on biases in text: reverse the roles. This works surprisingly well for sexual, racial, and other biases. Given my particular bent, I wasn’t about to do this by hand, so I turned to Python. I wrote a small script which looks for gendered words (“stewardess”, “brother”, “her”) and swaps them (“steward”, “sister”, “him”). There are plenty of these lists already compiled (presumably for ESL learners), so the heavy lifting was already done for me. I did not look for exhaustive lists of male/female name pairs, though that would be an obvious next step.
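A minimal sketch of what such a word-swapping script might look like is below. It is not the original script: the word list is a tiny hypothetical sample standing in for the longer pre-compiled lists, and the names `PAIRS`, `SWAP`, `match_case`, and `genderswap` are made up for illustration.

```python
import re

# Hypothetical sample of (male, female) word pairs. A real run would load
# a much longer pre-compiled list instead.
PAIRS = [
    ("he", "she"),
    ("him", "her"),
    ("his", "her"),
    ("his", "hers"),
    ("himself", "herself"),
    ("man", "woman"),
    ("men", "women"),
    ("brother", "sister"),
    ("steward", "stewardess"),
]

# Build a lookup that works in both directions. setdefault keeps the first
# mapping seen, so ambiguous words ("his", "her") get one fixed answer.
SWAP = {}
for male, female in PAIRS:
    SWAP.setdefault(male, female)
    SWAP.setdefault(female, male)

WORD = re.compile(r"[A-Za-z]+")


def match_case(source, target):
    """Copy the capitalisation of source onto target."""
    if source.isupper():
        return target.upper()
    if source[0].isupper():
        return target.capitalize()
    return target


def genderswap(text):
    """Swap every word found in SWAP, leaving all other text untouched."""
    def replace(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        return match_case(word, swapped) if swapped else word

    return WORD.sub(replace, text)


print(genderswap("She gave him the book."))
```

Because the lookup is a plain dictionary, each word can only map to one replacement, so “her” always becomes “him” and “his” always becomes “her” in this sketch.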

It’s worth noting that this simple approach is pretty good, but by no means perfect. One particularly common frustration is that in English, “his” maps to either “her” or “hers” (“his brush” to “her brush”, or “it was his” to “it was hers”). Similarly, “her” goes to either “him” or “his” (“it was her” to “it was him”, or “it was her brush” to “it was his brush”). The algorithm would have to be a bit smarter to deal with these situations.
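One crude way to be “a bit smarter” is to peek at the next word: if “his” or “her” is followed by something that could be a noun, treat it as possessive. The sketch below assumes exactly that heuristic. It was not part of the original script, and the stop list `NOT_A_NOUN_PHRASE` and the function `swap_his_her` are hypothetical names with hand-picked contents.

```python
import re

# Words that usually mean "his"/"her" is NOT modifying a following noun.
# Hypothetical, hand-picked sample; nowhere near exhaustive.
NOT_A_NOUN_PHRASE = {
    "a", "an", "the", "to", "and", "or", "but", "in", "on", "at", "for",
    "with", "is", "was", "that", "this", "when", "if", "as",
}

PRONOUN = re.compile(r"\b(his|her)\b", re.IGNORECASE)
NEXT_WORD = re.compile(r"\s+([A-Za-z]+)")


def swap_his_her(text):
    """Swap 'his' and 'her', guessing possessive vs. object from the next word."""
    def replace(match):
        word = match.group(0)
        # Look at the word right after the pronoun, if there is one.
        following = NEXT_WORD.match(text, match.end())
        possessive = (
            following is not None
            and following.group(1).lower() not in NOT_A_NOUN_PHRASE
        )
        if word.lower() == "her":
            swapped = "his" if possessive else "him"
        else:
            swapped = "her" if possessive else "hers"
        return swapped.capitalize() if word[0].isupper() else swapped

    return PRONOUN.sub(replace, text)


for sample in ["his brush", "it was his", "it was her", "it was her brush"]:
    print(sample, "->", swap_his_her(sample))
```

This handles the four examples above, but it still guesses wrong whenever the next word is a verb that is not in the stop list (“I saw her leave” would come out as “I saw his leave”). Doing it properly likely needs part-of-speech tagging or a parser.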

I ran the script on some book texts I grabbed from Project Gutenberg, and ultimately agreed that literature writes men and women very differently. Without further ado, I present the first paragraphs from “A Scandal in Bohemia”, but with the genders swapped. (I cleaned up the his/her/him/hers issues manually here.)

“To Sherlock Holmes he is always THE man. I have seldom heard her mention him under any other name. In her eyes he eclipses and predominates the whole of his sex. It was not that she felt any emotion akin to love for Irene Adler. All emotions, and that one particularly, were abhorrent to her cold, precise but admirably balanced mind. She was, I take it, the most perfect reasoning and observing machine that the world has seen, but as a lover she would have placed herself in a false position. She never spoke of the softer passions, save with a gibe and a sneer. They were admirable things for the observer–excellent for drawing the veil from women’s motives and actions. But for the trained reasoner to admit such intrusions into her own delicate and finely adjusted temperament was to introduce a distracting factor which might throw a doubt upon all her mental results. Grit in a sensitive instrument, or a crack in one of her own high-power lenses, would not be more disturbing than a strong emotion in a nature such as hers. And yet there was but one man to her, and that man was the late Irene Adler, of dubious and questionable memory.

I had seen little of Holmes lately. My marriage had drifted us away from each other. My own complete happiness, and the home-centred interests which rise up around the woman who first finds herself master of her own establishment, were sufficient to absorb all my attention, while Holmes, who loathed every form of society with her whole Bohemian soul, remained in our lodgings in Baker Street, buried among her old books, and alternating from week to week between cocaine and ambition, the drowsiness of the drug, and the fierce energy of her own keen nature. She was still, as ever, deeply attracted by the study of crime, and occupied her immense faculties and extraordinary powers of observation in following out those clues, and clearing up those mysteries which had been abandoned as hopeless by the official police. From time to time I heard some vague account of her doings: of her summons to Odessa in the case of the Trepoff murder, of her clearing up of the singular tragedy of the Atkinson sisters at Trincomalee, and finally of the mission which she had accomplished so delicately and successfully for the reigning family of Holland. Beyond these signs of her activity, however, which I merely shared with all the readers of the daily press, I knew little of my former friend and companion.”

tags:  nlp  gender  sherlock