The words I am typing and you are reading are the product of centuries of human migration, subjugation, and study. English is the language of global capitalism — the “language of airports,” as it’s often described — but it is also a maddening collection of rules. How do you explain the difference between “having had,” “had had,” “have had,” and just plain “had” to someone unfamiliar with English grammar, let alone the vocabulary? Why is the preposition through pronounced like the verb threw and not like the adjective rough? As a native speaker, I tripped over these questions while teaching English as a second language in Moscow, Russia, where I worked as a journalist for several years. At the same time, I was struggling daily with how to communicate in Russian, which has a different alphabet, no fixed word order (“Sentence this odd it reads, no?”), and, like any language, various dialects. The experience left me fascinated with how we communicate and how the rules of our respective languages evolved. At times it seems miraculous that language even exists.

Justin E.H. Smith, a professor of philosophy at the University of Paris and a prolific author of fiction and nonfiction, can go on at length about this subject. Born in Nevada, he attended Columbia University and the University of California, Davis. He is fluent in French, German, and Russian; has studied and translated Latin, Sanskrit, Turkish, and Sakha (the language of the indigenous Yakut population in Siberia); and is familiar with such invented languages as Volapük, Esperanto, and even Vulcan, from the Star Trek universe. “At least in the circles I run in in continental Europe,” he says, “you’re pretty much expected to get by in four or five languages. But I have the kind of interest that, if I learn about a text written in, say, Phrygian, and I really want to read it, I’ll learn to do so.”

Smith’s most recent book, The Internet Is Not What You Think It Is: A History, a Philosophy, a Warning, is not about linguistics, but it does poke at these questions of communication. His wide-ranging Substack newsletter contains detailed essays about the history of vampires next to original short stories in the vein of Edgar Allan Poe. He has written for Harper’s, The New Statesman, and The Point about surveillance in the pandemic age, the metaverse, and the resurgence of vindictive nationalism. He’s even taken a stab at a fictional translation of the infamous Voynich Manuscript, a fifteenth- or sixteenth-century document from Italy that is written in a code no one has been able to break.

Seeking a conversation about linguistics, I contacted Smith simply because his appetite for the topic seems boundless. Over several lengthy Zoom calls from his apartment in Paris, he made it clear that he is not a linguist or an authority on international syntax, but we did discuss aspects of human communication — tactile, musical, and physical — that seem even more mysterious than our ability to speak to each other across oceans of grammar.

 

Cohen: Out of all the languages you’re familiar with, which ones can you actually think in?

Smith: The language I teach in, and the language I use in daily life, is French, which certainly infects my dreams and my dialogue with myself when I’m alone in the shower. When I try to think in other languages, I get a lot of crossed signals. For example, I often find that, when I try to formulate full sentences in my head in Russian, they get infected by French words. I was looking in my brain for the Russian word priobresti — “to acquire” — which uses an unusual or relatively rare suffix in Russian. But a glitch in my brain made up the verb obtenut’. The French word for “to acquire” is obtenir, and the Russian suffix -ut is roughly comparable to -ir in French, as far as rarity goes. Because ten’ is “shadow” in Russian, when I tried to say, “to acquire,” what went through my mind was an image of something getting covered in shade. I constructed this word on the fly from its coincidental similarity to a French verb, but then the apparent content of the Russian word clouded my mind.

Cohen: When I was learning Russian and realized that there’s no present-tense form of the verb to be, it was unsettling for me as a native English speaker.

Smith: Many professional linguists are wary of the old Sapir-Whorf hypothesis [the idea that the structure of our language influences how we see the world — Ed.]. For example, supposedly the Hopi language has nothing comparable to our past tense. Proponents of the hypothesis have suggested that the peculiarities of Hopi verbs might mean they don’t have a sense of the past. Others have rejected that as an exaggeration. Maybe the Hopi verbal tense system is complicated, but they must find some way to talk about the past.

I’ve noticed that, whenever I get exasperated because I can’t say x or y in some language I’m trying to master, it generally turns out that the problem is a result of my weakness. Ten years ago I was convinced there were things French people just couldn’t express. And what I’ve gradually learned is that there is always an equivalent between French and English. I’ve been studying Sakha, or Yakut [an indigenous language of the Yakutia region of the Russian Federation — Ed.], for about four years. Sometimes I feel my comprehension is quite good, but the other day I just happened to meet a French woman who lived in Yakutsk for seven years and is fluent in Sakha, and she expected me to start speaking it with her. It’s not every day you run into a fluent Sakha speaker in Paris! I realized at that moment how much further I have to go in that language. What we would take care of with one verb in English, they’ll express with three or four verbs in a row. Instead of saying, “I sat down,” in Sakha you would say something like “I going falling sat.” Right now I feel there’s just no way I could express the same ideas in Sakha that I’m used to expressing in English or French, but that’s just a reflection of my own intermediate status as a speaker.

Cohen: Estonian, Japanese, and Turkish share the trait of being “agglutinative,” meaning you build upon a word to give it several different grammatical functions. That seems like it could put a different filter on how you see the world, and therefore yourself in it.

Smith: Seventeenth-century colonial bilingual phrase books, like English/Narragansett or French/Wyandot, are really interesting. Here I’m talking way beyond my competence, but from what I’ve read, it does seem true that with many Native American languages, it’s difficult for us to identify parts of speech: a subject noun, say, followed by a verb and maybe some other stuff in the predicate. This is why non–Native Americans have this vague idea that Indigenous languages give people proper names that are a description of something the person does, like Sitting Bull. That’s just one example of the way that different languages express what we would ordinarily think of as the relationship between nouns and adjectives or even between nouns and verbs.

Sakha is quite different, of course, but there are some similarities. They have verbs with meanings like “to appear with a large lower lip.” It’s one word. An English speaker might ask: How is that a verb? That’s not an activity! [Laughs.] But there are these very specific verbs that mean “to have ruddy cheeks,” or “to have thick eyebrows.” They are describing physical traits, but it’s hard for an outsider to really understand what is being talked about, let alone whether it’s aesthetically desirable or not.

Cohen: I’m often amazed that cultures have learned enough about each other’s languages that we can teach and study them. What do we know about how these exchanges started?

Smith: I am constantly in awe of nineteenth-century-style linguistics. The founder of Yakutology in Europe is Otto von Böhtlingk, who was based in Saint Petersburg, Russia. He was an Indologist who worked mostly on Sanskrit grammar. Someone happened to come back from Yakutia with a sample of written text. So von Böhtlingk took a three-month break from Sanskrit to write a grammar of Sakha based on maybe twenty pages of text that had been produced by an ethnic Russian speaker of Sakha — a clerk in Yakutsk. Typically a classicist in the nineteenth century was required to learn Sanskrit, but if you look at the manual they used to teach it, it just shows some tables of Sanskrit verb structure. From that, the German linguists specializing in Greek were supposed to understand how the rest of Sanskrit worked. This is an ability that I doubt anyone in the twenty-first century has.

Another thing that impresses me is the early work of intercultural contact. My longtime hero [German polymath] Gottfried Wilhelm Leibniz developed methods in the early eighteenth century that were then applied by travelers for a hundred years: He devised a list of words in a foreign language that needed to be written down. Once they had these basic vocabulary words, philologists could compare them across languages to determine what family the language was in. So they placed Sakha next to Chuvash, but not next to the Tungusic languages, and so on. It was very systematic, but mostly they just made lists of nouns. I’ve got hundreds of Sakha nouns cluttering my brain, but that doesn’t enable me to form articulate sentences from them.

One way to learn a language in the eighteenth century was to be taken prisoner. For example, some educated military officers from Poland and Sweden were taken prisoner by the Russians and sent to Siberia, where some decided they might as well pass the time doing something interesting, like learning the local languages. I would have loved to study Ket, the last remaining of the so-called Yeniseian languages, which possibly connect some North American languages to Siberian languages; it has only about two dozen speakers. But here in France there are no resources, whereas with Sakha I can talk to people; I can listen to news reports on YouTube; I can watch children’s cartoons.

Cohen: You said these linguists had an ability that you don’t think anyone in the twenty-first century has. Why is that?

Smith: Linguists learn to think about language very rigorously. They have to be able to look at a sentence and parse it. When I was a kid, we still did sentence diagrams in elementary school, where you visualize the different parts of the sentences. Today it’s pretty much only professional linguists who do that. But up until the early twentieth century, when Latin was still part of a formal education, people had pretty standard knowledge of the architecture of language. In classical studies they had to look at paradigms of Sanskrit verbs and understand them. These days if you study Sanskrit, you’ll typically be led through Duolingo-style exercises — learning helpful phrases, learning a song — without looking at the core structure.

The classical method made us impose some arbitrary rules on the use of proper English: the “no split infinitives” rule is one that makes sense only if you are thinking about Latin as a paradigm for good English composition. We liberated ourselves from that over the course of the twentieth century. It’s the same with Sakha in the post-Soviet period: it had to be understood apart from Russian. It’s bad practice to tether one language to another: English to Latin, or Sakha to Russian. But then, when you cease to do that, the language just goes into free-float, and speakers are no longer fully able to explain what they’re doing in their own language.

Cohen: Do all languages ultimately borrow from others?

Smith: English, in particular, is a mongrel language. There are some dissident theorists who say that it’s wrong to classify it as Germanic at this point because the indigenous Celtic substratum significantly weakened the Germanic signal at the time of the arrival of the Angles and Jutes from Scandinavia, and then the borrowing of Norman French vocabulary further messed with the DNA of the language. That’s a minority theory, though. English is a Germanic language, but it’s a weird Germanic language, because of the Celtic and the Norman influences. I sometimes teach American students here in Paris. One thing I like to point out to them is the way in English we often have two words for something: say, freedom and liberty. It feels to me that liberty is something positive, a high ideal, whereas freedom is something jingoistic — like “freedom fries.” [Laughs.] If you’re speaking French, you only have liberté, and if you’re speaking German, you only have Freiheit. It’s obviously a very Germanic word, freedom. That duality in English is a testament to the peculiar historical development of our language, which I’ve just described a bit. It’s something a French speaker doesn’t have today. When I’ve tried over the past few years to speak German, people have told me I’m speaking like Friedrich the Great, the enlightened Prussian emperor, because I’m pressing French or Latin roots into service in German, and it sounds very precious and high-falutin in most cases.

It’s silly, I think, when governmental bodies try to maintain the purity of a language, like in Iceland, where there’s no single word for computer. The Icelandic phrase for computer is “number prophetess.” That’s cute and clever, but I don’t think it’s really doing what these official bodies and academies want it to do.

Some theories suggest that the Indo-Europeanization of all of Europe — with some exceptions in small pockets like the Basque Country and Hungary and Finland — was not so much a replacement of population or conquest, but rather the imitation of an elite class. The early inhabitants of Europe aspired to speak in the way of elite foreigners to the point that this actually transformed the languages. There are efforts to reconstruct what the underlying languages might have been. Some theorists think that there are hints in German of a Finno-Ugric substrate. [Living Finno-Ugric languages include Finnish, Estonian, and Hungarian. — Ed.] So how did the Germans get Indo-Europeanized? By imitation, by aspiration, rather than by displacement of populations. This complicates our picture of what it is for one ethnolinguistic group to replace another.

Cohen: What time period are you talking about?

Smith: This would be prehistoric. That’s why it’s so hard to know — because it was happening long before written texts. All efforts to reconstruct the history of European languages prior to the arrival of the classical civilizations of antiquity are pure speculation. We don’t even have a single tombstone with a proper name on it to help us. All we have is perhaps the names of rivers. We can speculate about earlier linguistic strata based on the names of such geographical features, the theory being that no conquering people would bother to give their own name to a river that already had one.

Cohen: What do we know about how a collision between languages creates a word and how that word is accepted across an entire population?

Smith: This speculative historical example of Indo-Europeanization can pretty easily be confirmed by the way we see language actually working — in particular what I take to be the atrocious use of English in continental European advertising. It’s worse in Germany than anywhere else I’ve been. You see these horrid constructions: das fitness-workout or something like that, where they add the German definite article. It’s an attempt to borrow cachet from English as the current language of global capitalism. It won’t be forever, but for now it is.

The way this is happening today is a pretty good indication of how it may have happened five thousand years ago. I haven’t been in Moscow in a long time, but I remember having the option of saying either obsluzhivaniye or servis, where the first one suggested an arm of the Soviet-style government and the other suggested a private company that would actually provide you with a service in exchange for money. There are plenty of examples of this, where you have the native term, which sounds uncool, and the flashy term from the language of the conqueror, so to speak.

Cohen: But before global capitalism you didn’t have McDonald’s everywhere. Travel was much more difficult; it was a big deal to move from place to place. How were these types of exchanges done and accepted and spread?

Smith: I’m not a prehistorian, so I hesitate to speculate too much. Yes, capitalism does introduce some new dynamics into the equation, but the difference between, let’s say, a patrician Roman officer and a Germanic villager somewhere near Cologne circa 200 CE was crystal clear: one had shiny equipment and the latest stuff from the metropole, and the other was a dirty yokel. [Laughs.] It’s not surprising that the direction of linguistic influence was more from Latin into the Germanic languages than vice versa.

Cohen: Because of the power of subjugation?

Smith: Because language flows in the same direction as other elements of culture. We know that there were all sorts of languages spoken in the Iberian Peninsula before the Romans came. Some of them were Celtic. Some were probably related to Basque. Aquitanian is an example of a broader Basque-like language. We know about these because occasionally some curious Roman authors would write down lists of place names, or just proper names. Sometimes they would note things like “Before the soldiers take a drink there, they like to chant . . . ,” and then they would quote three words. But that’s rare. For the most part, language flows in the other direction, from the conquerors to the conquered.

Cohen: The Inuit of North America famously have many different words for qualities of snow. Their language was obviously shaped by geography. How has geography shaped the way that other languages have evolved?

Smith: It certainly shapes the spread of languages. It’s fascinating to think about the spread of Spanish in Latin America. Why is the area not completely Hispanicized? Why are there significant regions where you still have Quechua or some other language that just refuses to disappear? The reason seems to be geographical: mountainous regions and rain forests are difficult to penetrate. There’s a similar effect with biodiversity — difficult geographical features tend to preserve both species and languages. In the case of Siberia the linguistic communities that are the most threatened are the ones in regions that Russians found most geographically attractive and where they settled in greater numbers. Yakut is probably healthier today because the Russification was limited by the harsh climate.

Cohen: What about artificial languages, like Volapük or Esperanto? How does something like that get created and why?

Smith: The idea for creating artificial languages has its origins in the seventeenth century. Again Leibniz is one of the key players. The incentive for this comes when an older theory of language is collapsing. In the seventeenth century this older theory was called the Adamic language — the language supposedly spoken before the fall of Adam and Eve. It was presumed that when Adam named the animals of the field, he wasn’t just making a sound and saying, “From now on, we’re going to use that sound to refer to this creature.” The traditional interpretation of this part of Genesis is rather that he was naming what that animal actually is: the name was the essence of the creature itself.

For a long time there was this impossible dream of achieving such a condition, where we would overcome the arbitrary sounds that we make and start speaking the names of things as they are, and that would be the perfect language. Leibniz and others started to doubt that that was even possible. What would it be like not to use an arbitrary sound to refer to a monkey, but to actually say what the monkey is? What would that even be? It didn’t make sense. Language is a collection of arbitrary sounds that we use to index the things of the world. And so, once the prevailing theory of language started to break down, people like Leibniz, having realized that we were never going to be able to retrieve some perfect language that had never existed anyway, decided we might as well start from scratch. That’s when the dream of a perfect artificial language begins. It’s connected with the birth of computer science as well, when people started to think about creating a natural language that would eliminate ambiguity. They started to think about encoding language in a way that was so unambiguous and precise, even machines could process the information.

By the end of the nineteenth century we have actual stabs at complete artificial languages. You mentioned Esperanto and Volapük. The British philosopher Bertrand Russell was a fan and promoter of Ido, an offshoot of Esperanto. I’ve always hated Esperanto, because it’s pretty clearly the creation of some Polish guy who had studied Romance languages. I like Volapük because it at least looks weird. If we’re going to start from scratch, let’s not anchor this to any familiar phonology or vocabulary.

Here we see, especially with Esperanto, two goals that I think are ultimately in tension with one another: one being the goal of creating a perfect language, which means you really have to start from scratch, and the other being the goal of simplifying language to create a means of universal communication. A universal tongue would have to be easy to learn and familiar to at least a certain number of people. And because it was mostly Europeans who were doing this, it was natural to start out from a background in Romance philology. But I seriously doubt either of these goals will ever be more than a pastime for amateurs. The dream of artificial languages was at its peak a hundred years ago, and it didn’t get very far. We seem largely content with the idea that the language most of us use will be the language of the current global superpower, whoever that happens to be. [Laughs.]

Cohen: Science fiction and fantasy franchises like Star Trek, The Lord of the Rings, and Game of Thrones have produced real languages that people can study. What type of work goes into creating them?

Smith: There are Internet guides to creating your own language. You can plug in a partial vocabulary — say, all the known Vulcan words from Star Trek — and use AI tools to help you flesh it out. It’s very impressive.

I’m inclined to think that these fandoms today are pretty close to what the Esperanto clubs were a hundred years ago. We have to consider, however, that Esperanto and Volapük have something in common with other linguistic projects such as, say, modern Hebrew. There was a preexisting source material to draw from, but the Hebrew spoken in Israel today is largely a product of the same kind of ambitious modern project that gave us Esperanto. Of course, if we place modern Hebrew in that category, then we should probably place modern Norwegian and Lithuanian there, too, because in each of these cases the language had to be actively constructed from rather incomplete sources. In the case of Hebrew, those sources were scriptural and liturgical; in other projects of national language construction, the new language took its cue from the way people spoke in a given region, in farming communities and so on. It wasn’t until the seventeenth century that people began coming up with dictionaries, rules of grammar, and eventually some kind of regulation by academies. This process is artificial: it’s freezing into place and naming and institutionalizing forms of communication that were previously just taken for granted.

Cohen: What advantages or disadvantages does an artificial language have over one that evolved organically over time?

Smith: Success stories like modern Hebrew show that, in many cases, a heap of political will is required to make it work. I have a friend in Israel who often comments on how strange it is that his Polish father was born in a world where it would have been unthinkable to go about your life speaking Hebrew. But then with the creation of Israel — my friend’s father got out of Poland just in time, in the late 1930s, and landed in what would become Israel — it became a common project that everyone worked on together, and then their children just took it for granted. If there were a similar circumstance of necessity and political will, I could imagine there being a second generation of Esperanto speakers who would take it for granted that it’s just how one speaks.

Cohen: In Turkey, Mustafa Kemal Atatürk instituted the modern Turkish language and alphabet. He personally traveled around the country trying to teach people a new alphabet. And this wasn’t that long ago — the late 1920s.

Smith: Modern Turkish is a language-construction project comparable to Hebrew in its audacity, its transformative nature, and its connection to a political project. You have this peculiar situation in Turkey, where within a generation knowledge of the past became extremely difficult to access because, by the middle of the twentieth century, few people could read earlier documents written in Ottoman Turkish. Atatürk not only brought in a new alphabet, but also purged Persian and Arabic vocabulary. Turkey made a huge effort — somewhat like the model of the Icelandic “number prophetess” — to construct words for things for which they had no available Turkic word, because for centuries they had borrowed from Persian and Arabic. In some cases they went deep into Central Asia to find other Turkic-speaking people to inspire their vocabulary. They even went to Yakutia! There are words in modern Turkish that are borrowed from Siberia, because they needed to find another Turkic language that could guide them. That’s audacious and impressive. But there was a trade-off: On the one hand, you have a relatively pure and also much-easier-to-learn modern written language. On the other hand, it cuts you off from the past and from the incredible hybrid, transregional, cosmopolitan character of Ottoman Turkish.

At the university level in the United States, a language is taught as a separate, hermetically sealed thing. I think this impoverishes our idea of what a language is. To my mind every language should be taught as a package deal of languages within the same family. When you learn a Portuguese word, why not learn the Spanish word, too? The two are almost certainly similar. And that gives you an opportunity to think about the way languages transform from one region to another.

In this era of modern nation-states, each nation-state, in order to be respectable, has to pretend that its language is entirely its own, rather than something that bleeds across borders. Modern Turkish is the perfect example of that. The real history of Turkey is one that collides with the Persian and Arabic worlds.

Cohen: When I lived in Russia, I noticed the utter disdain some Russians had for Ukraine in general, but specifically for the language, which does share some similar features with Russian. The equivalent in America might be someone from New York commenting on how people with Southern accents don’t pronounce their g’s.

Smith: There’s a Yiddish saying that I think I first heard from [linguist and social critic] Noam Chomsky: “A dialect is a language without an army.” Again and again, in the history of modern nationalism, what gets made official and considered proper is a question of where the power lies. Sometimes there are quirky circumstances, such as the likely apocryphal story that some king of Spain had a lisp, and it became fashionable to lisp the way he did, and this is why Spanish in Spain has all these th sounds, whereas Latin American Spanish does not. Whether it’s true or not, it fits with the way certain language comes to be considered “correct.” Then there are less-amusing cases when a subjugated group speaks a different dialect of the same language or another language, or something that’s somewhere in between. Ukrainian is very clearly its own language. But arguing over what counts as a dialect and what counts as a proper language is a political, not a linguistic, question.

Cohen: You’ve spent some time looking at the Voynich Manuscript, a document ostensibly written in northern Italy in the fifteenth or sixteenth century using a code or cipher that no one really knows how to crack. How common is it for linguists or archaeologists to find something like this?

Smith: We’ve learned how old that manuscript is only recently, from carbon-dating techniques. Throughout most of the twentieth century it remained an open question whether this was a fraud passed off on the world by Wilfrid Voynich [the rare-book dealer who discovered the item in 1912 — Ed.]. We now know that it is the real deal and that the raw materials used in the illustrations can be traced with high probability to somewhere in northern Italy. But we don’t know if it’s an enciphered natural language or an artificial language. A lot of people who have gone down the rabbit hole on it think they have the answer. It’s an incredible case: Around the time of World War I William Friedman, the pioneer of American military cryptography, was working in a private institute where they used the Voynich Manuscript as the ultimate challenge for cryptographers-in-training. For decades the best minds in U.S. military intelligence worked on this text and could not get anywhere. So if they couldn’t do it, there’s no way some amateur hobbyist is going to crack the code.

There are other language mysteries, too. One is the stone carvings made by the ancient Picts of northern Scotland. For a long time scholars couldn’t decide whether they were merely decorative or contained information. Using digital analysis, researchers were able to establish that there are recurrent patterns that suggest encoded information. Another example — one I wish I could drop everything and devote my career to — is the Inca quipus, which are strings tied in different sequences of knots. They were kept and maintained and lengthened over generations in Andean cultures. When anthropologists started asking, “What is this?” people in the Andes would say, “Go ask the oldest person in our village,” and the oldest person in the village would say, “I think I remember my grandparents adding knots to these, but I don’t quite remember what they’re for.” Some people think they constitute a form of writing. Others think they are more like tabulations, record keeping. But then you have to ask, “What’s the difference?” I guess if it’s a proper written language, then potentially you could express novel ideas. You could tell a story about birds, or write a physics textbook, or do anything at all. Whereas if it’s just tabulation, then there’s only so much you can do.

On the other hand, we know that if you look at the earliest writing on Sumerian clay tablets, most of it is record keeping: tracking who gave how many cattle to whom, or how many casks of wine. A friend of mine, a writer I really admire, puts it this way: if you publish a poem in a magazine, and then you submit an invoice for the poem, the second activity gets a lot closer to the original nature of writing than the first one. [Laughs.] If you trace it back to early Mesopotamia, writing isn’t all about poetry. It’s all about the invoice.

Cohen: The strings make me think of braille: a physical, tactile form of writing.

Smith: There might be language capacities we are underexploiting. I have very sharp tactile memories of a remote control that I probably last held in my hand in 1983. I know exactly where the “volume up” button is, as opposed to the “channel up” button. This was just seared into my fingertips in a profound way — whereas my memory of most things I saw in 1983 remains vague. It seems we’re well disposed to process information haptically, and braille capitalizes on that. Arguably it emerges out of practices that we’ve always used, like getting a feeling for the health of a plant by running your fingers over its leaves, touching the environment around us.

Cohen: Technology has also provided us with interesting new lexicons. Online you can use emoji or even GIFs to explain something or react to people. The American linguist John McWhorter has refuted the idea that these and other new forms of communication, such as texting or “Internet speak,” are making us dumber or taking away from language.

Smith: He’s absolutely right. This is an extremely creative and spontaneous moment for language. There are whole sociolects [dialects belonging to a certain social class — Ed.] that you and I don’t even know about, because we’re too old or we don’t belong to the communities of people who have come up with them. Emoji are fascinating because they’re a return to the ideographic sources of a lot of writing, including the Phoenician alphabet, which included a representation of a bull’s head. The Greeks then turned that upside down, and it became an alpha. And then it became our letter A. We’re used to a form of writing that’s completely disconnected from ideographs, but now they’re coming back. And we’re having to question our presumption of the superiority of alphabetic writing.

I think of the satirical language in Gulliver’s Travels where you just pull out of a sack the thing you want to talk about, and you line it up with the other things you’re talking about in a way that forms a sentence. With emoji we’re pretty close to doing that. There have been experiments in translating entire novels into emoji; you need to have a key to translate it, but it’s pretty complete. There are probably a billion people in the world now who are illiterate in the classical sense but are communicating by text all the time using pictures and animations. It’s an unexpected twist in the history of literacy.

Cohen: We also have a lot of people who can speak the language of machines: Java and Python and various forms of code. How do you think that affects the species as a whole?

Smith: It will probably be a pretty short period of human history in which we have a good number of people who speak machine. I think we’re now moving into a period when we will leave it to the machines to speak to each other. A lot of the tedious work of coding came during an early phase of computing. We’re developing artificial intelligence to do that for us. When we have only machines speaking machine, however, it’s going to be a big problem, because their language is going to proliferate beyond our ability to fully grasp even how it’s proliferating.

Cohen: I’d also like to talk about slang and how cultures create and develop it. When I lived in Russia, I had a roommate who had a several-volume dictionary of vulgar Russian slang. I was surprised to find this second language within the language.

Smith: There is Russian prison slang that is basically classifiable as an argot or a cant — a language unique to a particular group within a society. There was some crime ring in, I think, Ohio a few years ago whose members spoke a Gaelic-English cant in order not to be understood by outsiders. Carnies — people who work the carnival circuit — and traveling tinkerers would traditionally have an insider language that is just slang ramped up to the point where it’s impenetrable to outsiders. Here in France, when I’m listening to adolescents, I don’t know a lot of the slang words they’re using, but I can still basically follow along. The full-on argots and cants have mostly disappeared in the United States. In England there’s the Cockney slang based on rhymes. In France there’s the verlan slang, where you change the order of the syllables: a cigarette becomes a garetteci, and a femme becomes a meufe. Even the name verlan comes from the inversion of the word l’envers — i.e., “the reverse.” Such slangs are quite common around the world.

Then there’s the slang that’s not an insider sociolect but consists of the vulgar words we all know. Any competent native English speaker knows the word fuck, right? Even the most pious, convent-dwelling nun, if she is a native English speaker, will know it — even if she’s never uttered the word in her life. These kinds of “bad” words are unlike sociolects, in that you have to know them. In any language, it seems, there have to be some words that have that force, whether you as a speaker of the language actually deploy them or not. I lived for a while in Québec, where, strangely enough, the worst French words you can say are câlice and tabarnak — “chalice” and “tabernacle,” things you find inside a church. Meanwhile merde — “shit” — is a rather mild thing to say in French.

Over the past few decades we’ve declared a new winner for the worst word in the English language. It’s the only word in the English language that I will not say, but it’s definitely one that I know: the N-word. And it comes from the peculiar racial history of the United States. Which words occupy the role of worst vulgarities reflects what’s most salient in a given society, whether it’s racial dynamics, or religion, or sexuality, or excrement.

Cohen: Why were those particular words in Québec so bad?

Smith: If you say tabarnak, the implication is something like “I’m going to desecrate the tabernacle.” [Laughs.] What exactly does it mean when we say fuck? Is there a suppressed proposition there? I think yob tvoyu mat [loosely, “Fucked your mother” in Russian] is really interesting because, as I recall, yob could be a claim about what you have already done to this person’s mother, or it could be a command to that person to do it to their mother — two very different things! [Laughs.]

Cohen: It’s the same thing with poshol ty na khui [literally, “You went to the dick,” but loosely, “Go fuck yourself” in Russian]. It’s past tense, but it’s also directed at someone. It doesn’t make much sense in translation, but someone once broke a bottle over a trash can and came at me for saying it to them in Kyiv.

Smith: Khui has the same root as khren, or “horseradish,” which is also a euphemism for male genitalia. And it comes from a more primitive root, khuyet’, which means “to conceal.” So it’s like “that which is concealed.” It’s a weird insult.

Then you have euphemisms like gosh darn, heck, and shoot. It’s as if we know intuitively how to shift a little to the side of words that we’re not supposed to say, so we can both say them and not say them.

Cohen: You said every culture needs to have these words. Is that because we need to understand the limits of what’s acceptable?

Smith: These are highly charged words used to express emotions that we’re all going to have. The idea is that the vocabulary is needed to match these emotions that are irrepressible. I’ve always found the available French vulgarities totally alien. I would never get angry and say putain — “whore” — which is the most common exclamation to use here when you stub your toe or whatever. The sound of it, the actual phonology of it, feels off to me. What it actually refers to is always present in my mind: I don’t want to invoke a whore if I’m really angry; that’s just too far from my moral universe and from my conceptual touchstones. On the other hand, when I learned German, all of the vulgarities felt totally natural to me, in part because they often share a common history with English vulgarity. So, excuse my German, but fick dich, Arschloch [“Fuck you, asshole”] is just the most natural thing in the world for me.

Cohen: You have experience translating a few different languages. What do you think gets lost in translation? Are there certain emotional experiences unique to a language?

Smith: I find that I really enjoy translating German poetry into English. The poets I like most are those who employ very basic vocabulary that enables me, in translating, to celebrate and draw out the Germanic character of the English language. That’s where I’ve had the most success translating poetry. I’ve done quite a bit of translation of Latin philosophical prose, which is a very different process, because if you don’t understand something, you just look for the conventional way of translating it. I’m still doing some translation from Russian, but the texts I translate are not particularly interesting as examples of the Russian language.

The main translation I’m doing now is from Sakha into English. And unlike with all these other languages, there’s just no precedent for how you’re supposed to render something. There are ongoing disputes about how many verb tenses there are in Sakha, how many verbal moods there are, what exactly counts as a noun case. It’s been extensively studied by linguists, but it hasn’t been standardized, so there’s no official determination of whether you’re getting a given sentence or a given line of poetry right.

That’s something I think we don’t appreciate enough: German, English, French, and Latin are different languages, but they are intercalibrated enough that there is always an outside authority who can tell you whether you’re getting it right for any given sentence. Most languages are not like that: they are free-floating. There have been some efforts to calibrate Russian and Sakha; Russian becomes the object language while Sakha is the target language, in much the way that, say, a nineteenth-century textbook on Sanskrit would be written in Latin. So when you’re trying to learn Sakha, you start to think you’re discerning some kind of order in it, but that order is just through the lens of Russian, even though the two languages have less in common than English and Russian. Sakha is a Turkic language, and Russian is an Indo-European language. Yet my understanding of the ordering of the Sakha language is via Russian. And that really affects my perception of it and the way I translate it.

Cohen: We haven’t talked about any of the thousands of languages and dialects on the continent of Africa. Nigeria alone has around five hundred. Most African languages, to my understanding, are marginalized in Western education. You can probably make a strong argument that this is a remnant of colonialism, where the West has no incentive to learn those languages.

Smith: The Niger-Congo language family is supposed to have the largest number of languages in it, but having the largest number of languages is not so much a linguistic feat as it is a political situation: these individual languages exist side by side, village to village, because there is no central authority trying to standardize them, which would also make them disappear. There were more languages spoken in France five hundred years ago than there are now; they were somewhat like French but not the French we speak today. Most of them have largely disappeared or, like Gascon or Poitevin, have been reconceived as local dialects — i.e., variants of a standard and authoritative version of the language rather than locally authoritative ways of speaking in their own right. Some of them were never written down. And we can suppose that the process of centralization of political authority is what reduced their number. Is it good for speakers of Niger-Congo languages that they currently hold the record for having the largest language family? Or is that rather a reflection of the lack of centralized political power? I’ll say this: it’s a shame every time one way of speaking disappears from the world, especially when it doesn’t leave a trace.

I don’t know much about African languages. I have translated Anton Wilhelm Amo, an African philosopher of the early eighteenth century, who wrote in Latin and gives clear signs of speaking German as his primary language. He was brought to Europe from West Africa, from what’s now Ghana, when he was about four years old. People are always looking for traces of his thinking in an Akan language, probably Twi, which was likely his native language. But I don’t think he remembered his first language when he was writing as an adult. Ghanaian philosophers have tried to reflect on the way one would be likely to conceptualize things like God, substance, body, or soul if one were thinking philosophically in an Akan language. This kind of work is interesting, pushing the boundary of textuality. It takes figures like a West African philosopher trained in early-modern Germany to catalyze this kind of reflection.

I’m doing some work right now on a possible case of a seventeenth-century Ethiopian philosopher, but Ethiopian is an extremely textual tradition. And its roots are in the early-ancient Christian Levant, rather than in Africa. An Ethiopian cleric of a thousand years ago would have been basically contemptuous of the nontextual pagan peoples surrounding him — at least as contemptuous as any European has ever been. So there are multiple layers to Africa, and incredible work by comparative linguists on African languages, but it’s mostly understudied.

One of my professors in grad school, Boris Gasparov, was a specialist of Old Church Slavonic. He had a pet theory about the archaic gender system in the Slavic languages, and he drew, for some of his reflections, on languages from the Bantu family, where you have something that’s like gender but is actually more a system of broad semantic categories. Rather than just having masculine, feminine, and neuter, you’ve got horizontal, and celestial, and as big as a tree — things like that. What we call “gender” in language is actually just an impoverished trace of a much bigger system of semantic categorization that you can still find in some other languages.

Cohen: There are also examples in Ghana of percussion being a form of communication between villages. A lot of this system is not written down; it’s passed on verbally between generations. Are there examples in other cultures where music is used as a language?

Smith: There’s the Silbo Gomero whistling language in the Canary Islands, which has survived in some limited way. The Guanches were the Indigenous people of the Canary Islands, probably a Berber people who were massacred by the Spanish by the end of the fifteenth century. But some of the mountain dwellers continued to use a system of communicating across vast canyons by whistling, and it turns out that, rather than being a limited code, this has the full repertoire of a real language, such that it could replace spoken language when necessary. That’s a key difference linguists are always looking for. Sign language is a real language and not just something people do when they have no other choice. It has full generative recursive syntax — everything that we expect from a language, as opposed to a code.

In the late 1960s a poet named Jerome Rothenberg started working on what he calls ethnopoetics. It’s a very sixties idea — he would probably face resistance if he tried to do it today. But by the 1970s he was doing translations on paper of a lot of things that we don’t ordinarily consider part of a language, like bodily gestures. For example, he translated Navajo dances. Now, the dances involved chanting, which had words. But his point was that if you only translated the words of the chant and not the whole dance, it left so much out. I think about this whenever I read someone comparing a rapper like Eminem to a poet like John Milton. I always think, Come on, this isn’t working. Rap is an oral art form, not a textual one. It’s like an opera libretto: you can read it if you want to, but you’re missing a big part of what makes it great. Rothenberg’s point about translating from Indigenous art forms is that we are limited in our imagination when we see them either as poetry or as song or as dance. What the Navajo are doing is a kind of total artwork for which we don’t have a category. So he set about trying to note, in a graphic way, absolutely everything that happens in the Navajo ceremony, and he saw that as a form of poetry.

It’s hard to imagine that the drumming you mention would constitute a full language with recursive syntax, such that you could translate a book into drumming. It’s a form of communication, but maybe not something we would call a language. There are so many practices like this. Theorists like Roland Barthes are right about the ubiquity of semiotic systems: It’s not just that people get together and say, “Let’s drum and have this mean that.” It’s also your body language when you walk down the street, and absolutely everything in human life. And in some places the unspoken part gets more codified and made explicit. Barthes’s point was that you can make anything explicit: you can analyze fashion, for example, and show what it’s communicating, whether the people who are wearing it know it or not.

Here’s a last thought: Let’s say you’re watching Jeopardy!, with high-level champions answering the hardest questions possible. So it’s fair to ask them anything — about quantum mechanics, or about the isotopes of different elements on the periodic table, right? But if you were to ask them something about, say, noun declensions in Finnish, people would probably say, “Wait, that’s not fair. They don’t know Finnish.” It’s interesting to speculate about why we cordon language off in that way — why facts about Finnish are not considered to be facts about the world or things that one could be expected to know, even if one knows everything. Unless one is Finnish, of course.