Published on Oct 19, 2011

Data Book Covers

This joint adventure with graphic design studio FBA. has been sitting in the closet for more than a year, and has recently started bearing fruit. It resulted in a tool that helps designers build visual artifacts to use in book cover design. The tool parses a text and displays its most frequent words, letting the designer choose which words to visualize and which of more than a dozen visualization models to apply. Despite the recurring use of the word “visualization”, the intent here is not to visualize anything, but to provide abstract visual artifacts that are innately related to the form of the text or the content of the book (the semantic capabilities of the tool are meager, but this doesn’t diminish what I believe is a proper strategy: if I want to design something about a text, then I can start by analytically giving form to that text).
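
The word-selection step can be approximated in a few lines of Python. This is only a sketch of the idea, not the tool's actual code: the function name `most_frequent_words` and the `min_length` filter (a crude way to skip short function words) are assumptions made for illustration.

```python
import re
from collections import Counter

def most_frequent_words(text, top_n=10, min_length=4):
    """Return the top_n most frequent words that have at least min_length letters."""
    # Runs of letters only (any alphabet); digits and underscores are excluded.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(w for w in words if len(w) >= min_length)
    return counts.most_common(top_n)

sample = "Pierre met Natasha. Pierre smiled; Natasha and Pierre left."
print(most_frequent_words(sample, top_n=2))
```

A designer would then pick a handful of words from such a list, typically character names, and assign each one a color.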

The tool can map occurrences of selected words across the whole text. For instance, taking Tolstoy’s War and Peace, we can map Pierre Bezukhov, Natasha Rostov, Andrei Bolkonsky (in shades of blue) and Napoleon Bonaparte, Tsar Alexander I (in shades of red). As a result we have several visual artifacts from different visualization models. Each model has its own parameters, which can produce drastically different results. I will spare myself the boredom of describing every visualization model, input and parameter here, but they range from dotted maps, temperature maps, circular and bar graphs and texture-based density maps (à la 1960s statistical maps) to particle trajectories that are attracted to the occurrences of certain words. Andrew Vande Moere suggested that describing the visualization models could be of use to the visualization community, and he is right, since most visualization showcase articles do not properly describe the models in use. When I started reading about interesting visualization projects some years ago, the absence of cut-to-the-chase descriptions was a harsh obstacle in the learning process. Therefore, some of the visualization models are briefly described at the end of this article.

Taking these artifacts to the purpose of the tool (book cover design), I made some experiments based on Ana Boavida/FBA.’s designs for the Minotauro collection of Latin American writers. For instance, I tend to select the characters of the book, coloring only the protagonists.

Finally, realizing that this could work, we waited for projects in which this strategy would make perfect sense. The first real results are a growing collection about management, for which the book covers were designed by Rita Marquito/FBA. using this tool. The whole theme of management is abstract enough to scream for this kind of approach.

What is interesting about this tool is that it is flexible enough to allow the designer to develop different forms of expression. And how Ana Boavida/FBA. did this… For the next collection of law books, she parameterized a visualization model with combinations that I hadn’t predicted, creating regular patterns that, despite making great book covers, expunge the data nature of the artifacts. But that is another story.

Photographs by Daniel Santos/FBA.


Brief description of visualization models

The visualization models range from dotted maps, temperature maps, circular and bar graphs and texture-based density maps (à la 1960s statistical maps) to particle trajectories that are attracted to the occurrences of certain words.

A text in a book is typically organized in semantic aggregations (e.g. chapters, sections). Since the tool does not try to extract any semantic information by itself, I start by considering the text as one long line of words. The trick here is to carry this one-dimensional nature over to a 2D canvas. Histograms do it, but I prefer starting with maps. Therefore most of the models start by laying out the text along several horizontal lines, as if one page could represent the whole book. This way one can easily map the portions of the text in which a certain character appears.
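
As a rough illustration of this layout, the sketch below maps the index of a word in the text to a point on the canvas by wrapping the text into a fixed number of rows. The function name and parameters (`layout_position`, `rows`, `width`, `height`) are hypothetical, and the tool may well wrap the text differently.

```python
def layout_position(index, total_words, width, height, rows):
    """Map a word index to canvas coordinates, wrapping the text into `rows` lines.

    Assumes total_words > 0 and 0 <= index < total_words.
    """
    words_per_row = -(-total_words // rows)  # ceiling division
    row, col = divmod(index, words_per_row)
    # Place the point at the center of the cell occupied by the word.
    x = (col + 0.5) * width / words_per_row
    y = (row + 0.5) * height / rows
    return x, y

# A 100-word text on a 200x100 canvas with 10 rows:
print(layout_position(0, 100, 200, 100, 10))   # first word, top-left cell
print(layout_position(99, 100, 200, 100, 10))  # last word, bottom-right cell
```

Every model that relies on this layout can then reuse the same mapping for each occurrence of each selected word.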

With this layout the position of each occurrence is determined. As seen on the left, I can represent each occurrence as a square. The color of the square is the color attributed to the corresponding word, while its size is proportional to the total number of occurrences of that word in the text. Less frequent words are drawn on top of more frequent ones in order to avoid occlusions.
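
The sizing and draw-order rules can be sketched as follows. This is a minimal illustration under assumed names (`squares_to_draw`, `max_side`); it only builds the list of squares in back-to-front order and leaves the actual drawing to whatever graphics library is at hand.

```python
def squares_to_draw(occurrences, colors, max_side=20.0):
    """Build a back-to-front list of (word, x, y, side, color) squares.

    occurrences: dict mapping each word to a list of (x, y) positions.
    colors: dict mapping each word to its assigned color.
    """
    largest = max((len(p) for p in occurrences.values()), default=0)
    if largest == 0:
        return []
    squares = []
    # Most frequent words come first, so rarer words are painted last (on top).
    by_frequency = sorted(occurrences, key=lambda w: len(occurrences[w]), reverse=True)
    for word in by_frequency:
        # Side length is proportional to the word's total number of occurrences.
        side = max_side * len(occurrences[word]) / largest
        for x, y in occurrences[word]:
            squares.append((word, x, y, side, colors[word]))
    return squares
```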

Instead of drawing squares over the occurrences, the colors of the pixels can be set directly by weighting their proximity to word occurrences. The strategy adopted in the visualization at left only accounts for occurrences that are in the past: color changes abruptly when a new occurrence is encountered, then washes out as the strength of each occurrence decreases while reading advances.
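
One way to get this behavior is to keep a per-word strength that jumps to full when the word occurs and fades as reading advances. The sketch below assumes an exponential fade (the `decay` factor) and a white background; neither is stated above, and the names `past_weights` and `blend` are made up for the example.

```python
def past_weights(length, occurrences, decay=0.9):
    """For each reading position, return a dict word -> strength of its past occurrences.

    occurrences: dict mapping each word to the list of word indices where it occurs.
    """
    hits = {}
    for word, indices in occurrences.items():
        for i in indices:
            hits.setdefault(i, []).append(word)
    strength = {word: 0.0 for word in occurrences}
    result = []
    for position in range(length):
        for word in strength:
            strength[word] *= decay      # older occurrences wash out
        for word in hits.get(position, []):
            strength[word] = 1.0         # a new occurrence causes an abrupt change
        result.append(dict(strength))
    return result

def blend(weights, colors, background=(255, 255, 255)):
    """Mix the word colors by weight, then fade the mix toward the background."""
    total = sum(weights.values())
    if total == 0:
        return background
    mixed = [sum(weights[w] * colors[w][c] for w in weights) / total for c in range(3)]
    fade = min(1.0, max(weights.values()))
    return tuple(round(fade * m + (1 - fade) * b) for m, b in zip(mixed, background))
```

Because only past occurrences contribute, the color at a given position never anticipates a word that has not yet appeared.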

The text can also be divided into N portions of equal size. In each of these portions, if there are occurrences of a certain word, a corresponding pattern is drawn. The patterns for each selected word are dynamically generated through the rotation of a basic line-based pattern, ensuring that each word has a unique pattern. In this way each portion of the text is represented through an artifact that can be used to decipher the density of important occurrences and the similarity with other portions.
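
A small sketch of the bookkeeping behind this model: which words appear in which portion, and which rotation each word's line pattern gets. Spreading the angles evenly over 180 degrees is an assumption (a line pattern rotated by 180 degrees looks the same), and both function names are hypothetical.

```python
def pattern_angles(words):
    """Give each word a distinct rotation angle, in degrees, for the basic line pattern."""
    step = 180.0 / len(words)
    return {word: i * step for i, word in enumerate(words)}

def patterns_per_portion(total_words, occurrences, n_portions):
    """Return, for each of n_portions equal slices of the text, the set of words found in it.

    occurrences: dict mapping each word to the list of word indices where it occurs.
    """
    portions = [set() for _ in range(n_portions)]
    for word, indices in occurrences.items():
        for i in indices:
            portions[i * n_portions // total_words].add(word)
    return portions

angles = pattern_angles(["pierre", "natasha", "napoleon"])
portions = patterns_per_portion(100, {"pierre": [0, 30, 99], "napoleon": [30]}, 4)
```

Drawing a portion then amounts to overlaying the rotated line pattern of every word in its set, so portions with similar word sets end up with similar textures.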

Perhaps the most adventurous visualization in the 2D realm is depicted at left. There is one trajectory for each selected word, passing through its occurrences. The initial position is determined by the layout strategy previously described. The trajectory is then extended as if the text were being read, curving as the reading position approaches a new occurrence. The look of the fur balls conveys an agglomeration of occurrences, so if an occurrence is isolated, the trajectory quickly converges to a point without much spinning. When the next occurrence is far ahead in the text, the trajectory is rectilinear. Because I haven’t found the mixture of straight lines and curves to be of aesthetic interest, the rectilinear trajectories can be removed. When a trajectory crosses the canvas bounds it re-emerges at the opposite edge, as in a toroidal space. The intensity of the curvature and the look-ahead distance that aggregates the next occurrences are customizable, which makes it possible to generate a great variety of artifacts from the same visualization model.
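
The exact dynamics of the trajectories are not given here, so the following is only a generic steering sketch that reproduces the described ingredients: a look-ahead window, a bounded curvature, straight motion when no occurrence is near, and toroidal wrapping. All names and the turning rule itself are assumptions.

```python
import math

def next_target(reading_index, indices, positions, look_ahead):
    """Return the position of the first occurrence within look_ahead words ahead, else None.

    indices must be sorted; positions[k] is the canvas position of indices[k].
    """
    for i, pos in zip(indices, positions):
        if reading_index <= i <= reading_index + look_ahead:
            return pos
    return None

def step(x, y, heading, target, width, height, speed=1.0, curvature=0.2):
    """Advance the particle one step, turning toward target by at most `curvature` radians."""
    if target is not None:
        # Shortest displacement to the target on a torus.
        dx = (target[0] - x + width / 2) % width - width / 2
        dy = (target[1] - y + height / 2) % height - height / 2
        desired = math.atan2(dy, dx)
        # Signed angular difference wrapped to [-pi, pi).
        diff = (desired - heading + math.pi) % (2 * math.pi) - math.pi
        heading += max(-curvature, min(curvature, diff))
    # Leaving one edge of the canvas re-enters at the opposite edge.
    x = (x + speed * math.cos(heading)) % width
    y = (y + speed * math.sin(heading)) % height
    return x, y, heading
```

With no target the heading never changes, which gives the rectilinear segments; with a target and a small `curvature` the particle circles around the occurrence instead of hitting it directly.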

The next models aren’t based on the previously described positioning strategy. Instead they revolve around the one-dimensional nature of the text. At left, an atomic histogram of occurrences, where the height is the total number of occurrences of the corresponding word class.

The same histogram but with a circular base.

Finally, this last visualization model fills a shape whose vertices are based on the circular histogram. Therefore, the disruption of each circular shape represents the absence of the corresponding word in portions of the text.
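
To make the idea concrete, here is one possible reading of it in Python: the occurrences of a word are binned by position in the text, each bin becomes an angle around a circle, and the radius of the vertex grows with the count in that bin. The binning by text portion, the function name and the parameters `bins`, `base_radius` and `scale` are all assumptions for the sake of the example.

```python
import math

def circular_vertices(indices, total_words, bins=36, base_radius=10.0,
                      scale=2.0, center=(0.0, 0.0)):
    """Return one (x, y) vertex per angular bin for a single word.

    indices: word indices where the word occurs, each in [0, total_words).
    """
    counts = [0] * bins
    for i in indices:
        counts[i * bins // total_words] += 1
    vertices = []
    for b, count in enumerate(counts):
        angle = 2 * math.pi * b / bins
        # Bins without occurrences fall back to the base radius,
        # which is what dents the otherwise bulging outline.
        radius = base_radius + scale * count
        vertices.append((center[0] + radius * math.cos(angle),
                         center[1] + radius * math.sin(angle)))
    return vertices
```

Filling the polygon through these vertices gives a closed shape whose dents mark the portions of the text where the word is absent.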