This Blog Post is Heavy
The title above illustrates one of the problems with decoding language: ambiguity of word sense. Or, for the more technically inclined: polysemy.
Here the intended meaning is not so challenging – something to do with density perhaps? You’re in for a difficult read? Or maybe something heavy on the soul? A quick glance at any definition of heavy gives myriad possibilities, perhaps more than you were expecting.
But What Does Heavy Look Like?
Creativity includes a variety of cognitive capacities. One of them is sketching, like the proverbial back-of-a-napkin inventor’s sketch below. This visualization of the innovator’s formula (idea + work + collaboration + luck) suggests a more realistic interpretation of the so-called innovator’s dilemma.
Sketching like this is hard. Most folks can’t do it and end up defaulting to bullet points or drawing everything as a box. A determined creative might try to learn by following a book, like The Back of the Napkin. She might even try an AI-assisted drawing tool like Google AutoDraw.
But none of these approaches solves the real problem, which is not how to draw (although most of us need help with that too) but what to draw.
The problem presents itself during an average whiteboard session, where participants watch the scribe struggle to draw what she glimpses, perhaps in fragments, inside her head. Much to Tufte’s lament, the poor scribe resorts to bullet-pointing and somehow visualizing every business problem as a set of boxes and arrows.
She typically draws what’s on the left of this dotted line, not the right:
That glyph at the top right gives us a clue as to what heavy might look like in this case, as in “heavy on the soul” or burdensome (“demotivated”). Of course, this is just one of many possible visual metaphors:
AI-Assisted Creativity & Curation
With Cogni-Draw, we set out to solve this “visual articulation” problem by building the world’s most expressive and comprehensive visual communications library.
Don’t be fooled. This is not just a bunch of clip-art, or even “Clip-art 2.0.” Our “library” is actually an AI-powered search engine, exposed via an API, that takes terms and phrases and converts them into appropriate visual glyphs. It also learns new phrases, including modern idioms like “Growth Hacking.”
Per our augmented creativity vision, we rely upon the so-called “Human in the Loop” method of “training” our AI.
Although, not quite.
We don’t use low-skilled “human mechanical turks” of the kind used to tag endless pictures (predictably of cars) for, say, CrowdFlower. The humans in our AI loop are highly creative illustrators and cartoonists who have a refined knack for turning words into visual meaning, even for metaphorical subjects. But we use AI to massively augment their creative process, which makes it more like “AI in the human loop”.
This is roughly the flow:
- Predict which terms/phrases users are going to search, at the moment related to business: AI does this!
- Prioritize which terms/phrases to illustrate first: AI does this!
- Suggest possible synonyms and related phrases as “cognitive cues” for the illustrator: AI does this!
- Suggest possible visual motifs that might already express the term: AI does this!
- Draw the glyph: humans do this 🙂
- Tag the glyphs using a sense-based tagging scheme: humans (editors) do this, but with AI-assisted prompts!
- Find – and rank – related words that might also be expressed by the glyphs: AI does this!
- Optimize the library based on how users deploy the glyphs: AI does this!
- Expand the search terms using synonyms and similar terms: Guess what? AI does this!
Most of how we solve the visual articulation problem comes down to a highly tuned combination of AI and human creativity. AI does much of the legwork, freeing our curators and illustrators to be as creative and productive as possible.
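To make the first two steps of the flow a little more concrete, here is a minimal sketch of how a backlog of terms might be prioritized for illustration. It is only an illustration under stated assumptions: the function name `prioritize_terms`, the sample queries, and the use of a simple frequency count as a stand-in for a real predictive model are all hypothetical, not a description of the actual Cogni-Draw pipeline.

```python
from collections import Counter


def prioritize_terms(predicted_queries, illustrated_terms, top_n=3):
    """Rank not-yet-illustrated terms by expected search demand.

    predicted_queries: iterable of search phrases we expect users to type
                       (hypothetical output of some prediction step).
    illustrated_terms: terms that already have a glyph in the library.
    """
    # Normalize case and whitespace so "Pivot" and "pivot " count as one term.
    demand = Counter(q.lower().strip() for q in predicted_queries)
    illustrated = {t.lower().strip() for t in illustrated_terms}

    # Keep only terms that still need a glyph.
    backlog = [(term, n) for term, n in demand.items() if term not in illustrated]

    # Highest demand first; ties broken alphabetically for stable output.
    backlog.sort(key=lambda item: (-item[1], item[0]))
    return backlog[:top_n]


if __name__ == "__main__":
    queries = [
        "growth hacking", "burn rate", "growth hacking", "Pivot",
        "burn rate", "growth hacking", "logistics",
    ]
    print(prioritize_terms(queries, illustrated_terms={"logistics"}))
```

The point of the sketch is the division of labor: the machine produces an ordered to-do list, and the illustrator decides how each term should actually look.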
The core technical components are all related to natural language processing (NLP). We use a wide range of techniques that give the best results in combination rather than individually. Even the “in combination” piece has been partly solved using AI: we used it to accelerate how our engineers and architects think about and solve the problem. Most of the coding on this project has gone into tools and experiments that let us “think in code” about the problem space.
In essence, we are using NLP to understand business-related documents so that we can build a kind of language model that is highly amenable to producing visual motifs.
As noted at the start of this post, a key challenge is the polysemous nature of language and the use of synonyms. Our process consists of a number of iterative steps:
- Build a language corpus (non-visual dictionary)
- Express the corpus visually
- Re-interpret the visuals (glyphs) into new words/meanings
- Tune the corpus
- Extend the corpus using synonym and “similar meaning” words
- Repeat (and optimize)
We then put this entire process into the end-user loop, where the way people use the glyphs gives us additional feedback about how language is used and interpreted.
A Picture is Worth a Thousand (a Lot of) Words
The above process is confounded by the even greater “polysemy” of images. With the kinds of topics we are dealing with, many of the glyphs could easily be used to articulate a wide set of meanings. In fact, we have been fascinated by how often a glyph “reveals” a new meaning just through visual inspection (and introspection). This is a process that can be greatly amplified by the use of “word clouds” (although we seldom visualize them as such).
The polysemous, or rather elastic, nature of the glyphs makes it difficult to select appropriate synonyms. For example, the image below could have the meaning: lift, parcel, transport, delivery, logistics, warehouse, and even Amazon.
If we naively consider the synonyms for parcel, we could end up with allotment, as in “parcel of land”. Now if a user types in “allotment” (unlikely for our business lexicon, but not impossible) they might be frustrated to see a forklift truck pop up in their whiteboard application.
To solve this problem we use a more sophisticated, multi-layered approach that takes into account word sense and contextual meaning. Of course, the meaning of words is fuzzy (and not the same for every person), so we also tie our synonym selection back to a relevancy mechanism in the search. That way we can still offer more oblique interpretations, but further down the result set.
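The parcel/allotment example can be sketched in a few lines of code. What follows is a toy illustration, not our production approach: the sense inventory `SENSES`, the glyph tags in `GLYPH_TAGS`, the `land_plot` glyph, and the weights (1.0, 0.8, 0.1) are all made-up assumptions chosen to show the idea of sense-aware synonym expansion tied to a relevancy score.

```python
# Hypothetical sense inventory: word -> {sense label: synonyms for that sense}.
SENSES = {
    "parcel": {
        "package": ["package", "shipment", "delivery"],
        "land": ["allotment", "plot", "tract"],
    },
}

# Hypothetical editor tags: each glyph is tagged with (word, intended sense).
GLYPH_TAGS = {
    "forklift": [("parcel", "package")],
    "land_plot": [("parcel", "land")],
}


def expand_glyph_terms(glyph_id, same_sense_weight=0.8, off_sense_weight=0.1):
    """Return {term: relevance} for a glyph.

    The tagged word scores 1.0, synonyms sharing the tagged sense score
    same_sense_weight, and synonyms from other senses are kept but
    heavily down-weighted rather than discarded.
    """
    scores = {}
    for word, tagged_sense in GLYPH_TAGS[glyph_id]:
        scores[word] = 1.0
        for sense, synonyms in SENSES.get(word, {}).items():
            weight = same_sense_weight if sense == tagged_sense else off_sense_weight
            for synonym in synonyms:
                scores[synonym] = max(scores.get(synonym, 0.0), weight)
    return scores


def search(query):
    """Rank glyphs for a single query term, most relevant first."""
    term = query.lower().strip()
    hits = []
    for glyph_id in GLYPH_TAGS:
        score = expand_glyph_terms(glyph_id).get(term, 0.0)
        if score > 0.0:
            hits.append((glyph_id, score))
    return sorted(hits, key=lambda hit: -hit[1])


if __name__ == "__main__":
    print(search("allotment"))  # land_plot ranks above forklift
    print(search("delivery"))   # forklift ranks above land_plot
```

In this sketch a search for “allotment” still returns the forklift, but only as an oblique match further down the result set, behind the glyph whose tagged sense actually fits.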
What is Natural Anyway?
We use the word natural (in NLP) as if its meaning is obvious. In one sense it is, in that we don’t want to have to modulate what we say when trying to be understood by a machine.
However, there is a more subtle interpretation, which might be something more like “colloquial”, “idiomatic”, “informal” or even “idiosyncratic” use of language.
A user choosing decor, for example, is often more likely to think in contextual terms like “bright colors” or “non-fussy” or “muted tones”. Okay, that last one is closer to the lingo used by an interior designer or decorator, but you get the idea.
A large part of the underlying motivation at Telos.ai is an attempt to understand more “human-friendly” (natural?) vocabularies and how to map these vocabularies to the target items of interest, be they glyphs, artworks, colors or even foods.
Cogni-Draw is only step one in our journey to create more human-friendly datasets that incorporate these kinds of aesthetic and natural interpretations.
If you’d like to know more – please contact us.