The browser can be accessed here.
The digital landscape is a result of several abstractions that attempt to map
semantics to space:
- We created a "word embedding model"
(Wikipedia) that represents the relationships between Brandes' words in
a much lower-dimensional space -- in this case, 200 dimensions -- than the
original text, which contains nearly 11,000 individual words.
- Although this in itself is a massive reduction in complexity, we need to further
map these 200 dimensions down to the two dimensions of a computer screen in
order to display them. We accomplish this using t-SNE (Wikipedia), a dimensionality
reduction algorithm that attempts to preserve local relationships and, as far as
possible, the global shape.
- Finally, we create an artificial digital landscape of the resulting semantic
space using WebGL, a browser graphics API similar to those used for advanced
computer games.
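To make the first abstraction more concrete, here is a minimal sketch of how word vectors encode relationships between words. It is not the code behind the landscape: the four-dimensional vectors are invented for illustration (the real model uses 200 dimensions), the word "filosofi" is a hypothetical addition, and the function names are our own.

```python
import math

# Hypothetical toy vectors; values are made up for illustration only.
TOY_VECTORS = {
    "hegel": [0.9, 0.1, 0.0, 0.2],
    "filosofi": [0.8, 0.2, 0.1, 0.3],
    "frankrig": [0.1, 0.9, 0.3, 0.0],
    "jakobinerne": [0.2, 0.8, 0.4, 0.1],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, vectors, n=1):
    """Return the n words whose vectors are most similar to `word`."""
    target = vectors[word]
    scored = [
        (cosine_similarity(target, vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:n]]


print(nearest("hegel", TOY_VECTORS))
print(nearest("frankrig", TOY_VECTORS))
```

Words that point in similar directions count as related. The later steps (t-SNE and the WebGL landscape) only change how these relationships are displayed.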
In the resulting visualization, you can search for particular terms to locate them on
the digital map. Try words such as:
- hegel
- frankrig
- jakobinerne
- reaktionær
Then press the "Søg" (Search) button to jump to the word's location on the
digital map.
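Conceptually, the search is a lookup from a word to its position on the map. The sketch below illustrates that idea under stated assumptions: the coordinates are invented, and `locate` is a hypothetical helper, not the function used by the visualization.

```python
import unicodedata

# Hypothetical 2D positions standing in for the t-SNE output.
POSITIONS = {
    "hegel": (12.5, -3.0),
    "frankrig": (-7.25, 4.5),
    "jakobinerne": (-6.0, 5.75),
    "reaktionær": (1.0, 9.5),
}


def locate(query, positions):
    """Return the (x, y) position of a word, or None if it is not on the map."""
    # Normalize so that "  Hegel " and "hegel" find the same entry, and so that
    # characters such as "æ" compare equal regardless of how they were typed.
    key = unicodedata.normalize("NFC", query).strip().lower()
    return positions.get(key)


print(locate("  Hegel ", POSITIONS))
print(locate("kant", POSITIONS))
```

A word that does not occur in the model simply has no position, which is why some searches will not move the map.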
Although the map appears three-dimensional, the height currently carries no
meaning; it serves purely to create a sense of space. In the future, these mountains
and valleys could represent a word's transformations over time, its rarity, or
other aspects of its use.
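As one illustration of the rarity idea, height could be derived from how seldom a word occurs. This is a speculative sketch, not something the current map does: the log-based formula, the function name and the word counts are all assumptions made for the example.

```python
import math


def rarity_heights(counts):
    """Map word counts to heights in [0, 1]; rarer words get higher values.

    Assumes every count is a positive integer.
    """
    total = sum(counts.values())
    # Rare words have a small share of the total, so log(total / count) is large.
    raw = {word: math.log(total / count) for word, count in counts.items()}
    lowest, highest = min(raw.values()), max(raw.values())
    span = highest - lowest
    if span == 0:
        # All words are equally frequent: a flat landscape.
        return {word: 0.0 for word in raw}
    return {word: (value - lowest) / span for word, value in raw.items()}


# Hypothetical counts, for illustration only.
heights = rarity_heights({"og": 1000, "hegel": 10, "jakobinerne": 1})
print(heights)
```

With a mapping like this, very common words would form the valleys and rare words the peaks.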
Software:
- WordVectors R wrapper for word2vec by Ben Schmidt (Assistant professor of
history at Northeastern University and core faculty in the NuLab for Texts,
Maps, and Networks): https://github.com/bmschmidt/wordVectors
- Word-to-Viz visualization library by Doug Duhaime (Digital Humanities Developer,
Yale DHLab). Code forthcoming: https://douglasduhaime.com/
Technical parameters:
- Vectors: 200
- Window: 30
- Iterations: 30
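For readers who work in Python rather than R, the parameters above can be expressed roughly as follows. Note that this swaps in a different library: the model described here was trained with the wordVectors R package, whereas the sketch uses gensim. The mapping of "Vectors", "Window" and "Iterations" onto gensim's `vector_size`, `window` and `epochs` is our assumption, and `train_embeddings` is a hypothetical helper.

```python
# Values from the parameter list above; keyword names follow gensim 4.x and are
# an assumed equivalent of the wordVectors settings, not the original call.
WORD2VEC_PARAMS = {
    "vector_size": 200,  # "Vectors": dimensions of each word vector
    "window": 30,        # "Window": context words considered on each side
    "epochs": 30,        # "Iterations": passes over the corpus
}


def train_embeddings(sentences):
    """Train a word2vec model on an iterable of token lists.

    Requires the third-party gensim package, imported here so that the
    parameter dictionary can be inspected without it.
    """
    from gensim.models import Word2Vec

    return Word2Vec(sentences=sentences, **WORD2VEC_PARAMS)
```

All other settings are left at gensim's defaults, which differ from those of wordVectors, so results would not match the model shown here exactly.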
Further reading on Word Vectors / Word Embedding: