Style transfer is a machine learning technique to generate new images that use the aesthetic style from one image and the content from another. Style transfer creates enchanting images, but I feel that the most beautiful aspect of style transfer is the algorithm that accomplishes the transfer. Behind the stylized image lies a wonderful analogy for the design process and how designers craft signals out of noise.

The style and content sources are the input to style transfer, and the stylized content is the output.

Style transfer uses three images: one that represents the content, one that represents the aesthetic style, and a third internal image that eventually becomes the stylized output. In the image above, Divan Japonais by Henri de Toulouse-Lautrec is used as the style source, and a profile image is used as the content. These images define constraints that guide the algorithm while the third image is iteratively transformed from white noise into the final product.

White noise is the initial state of style transfer.

To transform the internal image, a mathematical function called the cost function is used. The cost function is defined to capture the global content of one image and the local spatial information that implies the style of another image [1]. If the cost function has a high value, the internal image is doing a poor job of capturing the content and style of the two images. Through an iterative process, the internal image is changed to lower the cost value [2]. The white noise initially has a very high cost because it does not contain either the content or style of the provided images. The final product has a much lower cost value because it does a better job of capturing the style and content of the two source images.
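The cost function described above can be sketched in a few lines. This is a minimal illustration in the spirit of the original paper [1], not its exact formulation: the feature maps would normally come from a pretrained convolutional network, whereas here random arrays stand in for them, and the names `content_weight` and `style_weight` and the normalization constants are assumptions made for the example.

```python
import numpy as np

def gram_matrix(features):
    """Correlations between channels of a (channels, height, width) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)

def content_cost(generated, content):
    """Mean squared difference between feature maps, compared position by position."""
    return float(np.mean((generated - content) ** 2))

def style_cost(generated, style):
    """Mean squared difference between Gram matrices of the two feature maps."""
    return float(np.mean((gram_matrix(generated) - gram_matrix(style)) ** 2))

def total_cost(generated, content, style, content_weight=1.0, style_weight=1.0):
    """Weighted sum of both terms; the weights are hypothetical tuning knobs."""
    return (content_weight * content_cost(generated, content)
            + style_weight * style_cost(generated, style))

# Random arrays stand in for real network features (an assumption of this sketch).
rng = np.random.default_rng(0)
content_features = rng.normal(size=(8, 16, 16))
style_features = rng.normal(size=(8, 16, 16))
noise_features = rng.normal(size=(8, 16, 16))

print(total_cost(noise_features, content_features, style_features))
```

The Gram matrix sums over positions, so in this sketch the style term ignores where a feature occurs, while the content term compares the feature maps position by position.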

The early stages of style transfer show the final image emerging from noise as the cost value decreases.

The precursor to style transfer was texture synthesis [3]. This technique extracts the style from an image and uses that style to generate a texture. It is similar to style transfer but lacks the content image. Since there is no content image in texture synthesis, you can think of it as style transfer except the initial white noise is also the content source. The image below elaborates on this explanation.

The style can be extracted by using the initial white noise as the content source.

The initial white noise gives structure to the texture. When texture synthesis is run using different white noise, it generates similar textures with different structures. Compare the following two textures generated from Divan Japonais and focus on the top right corner of both images:

Two similar but different textures generated from Divan Japonais.

Both images share similar qualities because they are constrained by the same cost function to capture the style of Divan Japonais, but the initial white noise changes the structure of the pattern. The difference in structure is easiest to see by focusing on one area and noting where the white and black shapes are placed. The two textures above look like two sections of an infinite tapestry. There is a practically endless number of ways to arrange the structure of the texture, and the two images above are only two possibilities.

Consider the initial white noise: each pixel takes one of two values, white or black. For a very tiny image of 5x5 pixels, there are 33,554,432 possible combinations of white and black pixels [4]. As the size of the image increases (for reference, the two images above are each 1920x1280 pixels), the number of possible starting images grows exponentially and is, for practical purposes, infinite. We can consider these possibilities as the starting space; this space contains every starting image that could exist. We can consider all the possible final images as the corresponding solution space; this space contains every generated texture that minimizes the cost function. Style transfer is a function that moves from a point in the starting space (noise) to a point in the solution space (stylized image). The starting and solution spaces are integral components of design, and the design process provides a useful framework for managing infinite possibilities using constraints, iteration, and evolution.
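The size of the starting space can be checked with a short calculation. This sketch assumes, as the text does, that each pixel is either white or black; the function names are made up for the example. For the large image it reports only the number of decimal digits in the count, since the count itself is too long to print comfortably.

```python
import math

def starting_images(width, height, values_per_pixel=2):
    """Exact number of distinct noise images of the given size."""
    return values_per_pixel ** (width * height)

def decimal_digits(width, height, values_per_pixel=2):
    """Number of decimal digits in that count, computed with logarithms."""
    return math.floor(width * height * math.log10(values_per_pixel)) + 1

print(starting_images(5, 5))        # 33554432, matching the figure in the text
print(decimal_digits(1920, 1280))   # digits in the count for a 1920x1280 image
```

Even with only two values per pixel, the count for a 1920x1280 image has hundreds of thousands of digits.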

The role of the designer is to isolate ideal approaches and solutions to a problem. One of the first steps for designers is to define constraints to guide the process. Questions such as “What is the problem to solve?” and “What materials can we use?” are constraints that shape the starting and solution spaces. In a similar way, the cost functions discussed above are constraints to guide the algorithm. Texture synthesis uses the constraint that the output must maintain the style of the input image. Style transfer has an additional constraint on top of texture synthesis: the output must also maintain the content of another image. Constraints may sound restrictive, but designers rejoice when the infinite set of possible solutions is reduced by additional constraints.

While constraints define the starting and solution spaces, iteration serves the purpose of moving from starting space to solution space. Designers pick multiple starting points and produce prototypes by iterating on ideas until a product forms. At each step of the iteration, designers ask themselves whether the prototype satisfies the constraints of the project. There is a cost function associated with the design process, but it is not always mathematically defined [5]. Aesthetic and functional values set by the designer are used to compare and judge prototypes. For style transfer, iteration is the algorithm working to lower the cost and turn the white noise into the stylized image. At each step, the machine learning algorithm compares the current state to the desired constraints defined in the cost function and attempts to reduce the disparity. The video above of the image emerging from noise shows iteration in progress. Each run moves from a single point in the starting space to a single point in the solution space.
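The iteration itself can be illustrated with a toy example. This is only a sketch: the real cost is computed on neural network features and its gradient comes from backpropagation, whereas here the cost is a plain squared distance to a target image, chosen because its gradient can be written by hand. The names `lower_cost` and `step_size` are assumptions for the example.

```python
import numpy as np

def cost(image, target):
    """Toy stand-in for the real cost: half the summed squared distance to a target."""
    return 0.5 * float(np.sum((image - target) ** 2))

def lower_cost(image, target, steps=50, step_size=0.1):
    """Repeatedly nudge every pixel against the gradient of the toy cost."""
    history = [cost(image, target)]
    for _ in range(steps):
        gradient = image - target            # derivative of the toy cost per pixel
        image = image - step_size * gradient
        history.append(cost(image, target))
    return image, history

rng = np.random.default_rng(0)
white_noise = rng.integers(0, 2, size=(32, 32)).astype(float)  # black or white pixels
target = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))            # a simple gradient image

result, history = lower_cost(white_noise, target)
print(history[0], history[-1])
```

The recorded cost falls at every step, which is the same behavior the text describes: the image starts as noise with a high cost and ends close to what the constraints ask for.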

The final component of the design process is evolution. As prototypes are created, judged, and compared, they inform the designer of new ideas to evolve the constraints. The process starts again, and new prototypes are created through iteration focused on the fresh constraints, which leads to further evolution of the constraints. In this way, designers are able to navigate a solution space by evolving constraints to reduce the possibilities to a manageable size. Let’s turn once more to style transfer and see how we can add in evolution to navigate the solution space.

A small Game of Life simulation with cyclical behavior.

A generated texture is a single solution, so we want to be able to navigate through a subspace of all the possible textures. Rather than randomly generate new textures, we want the previous constraints and solutions to inform future constraints and solutions. White noise controls the structure of the output. When we change the white noise, we change the layout of the generated texture. By evolving the white noise in a structured way, we can begin to navigate the solution space. Cellular automata are programs that simulate changes in a basic digital environment. By coding simple rules into the simulator, complex behavior emerges. The video above is a sample of the best-known cellular automaton: Conway’s Game of Life. Pixels represent cellular organisms, and each pixel lives and dies by the following rules:

  • Any live cell with fewer than two live neighbours dies, as if caused by underpopulation.
  • Any live cell with two or three live neighbours lives on to the next generation.
  • Any live cell with more than three live neighbours dies, as if by overpopulation.
  • Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.
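The four rules above translate into a short update function. This is a minimal sketch rather than the implementation used for the videos; it assumes the grid wraps around at its edges, which the rules themselves do not specify.

```python
import numpy as np

def life_step(grid):
    """Advance a 2-D array of 0s (dead) and 1s (live) by one generation."""
    # Count the eight neighbours of every cell by shifting the grid in each direction.
    # np.roll wraps around, so the grid behaves like a torus (an assumption of this sketch).
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    survives = (grid == 1) & ((neighbours == 2) | (neighbours == 3))
    born = (grid == 0) & (neighbours == 3)
    return (survives | born).astype(grid.dtype)

# A "blinker": three live cells in a row that flip between horizontal and vertical.
blinker = np.zeros((5, 5), dtype=int)
blinker[2, 1:4] = 1
print(life_step(blinker))
```

Underpopulation and overpopulation need no separate branches: any live cell that does not have two or three neighbours simply fails the `survives` test.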

A large Game of Life simulation.

We want to navigate the solution space through an evolution of constraints. White noise acts as the content source during texture synthesis, and we can use a Game of Life environment to drive changes in the content constraint. Evolution in the simulator drives evolution in the constraints. The simulator’s state defines a content constraint, iteration produces a texture, and evolution presents a new content constraint for the next step. In this way, we are able to navigate through the solution space and explore the infinite tapestry.
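The loop described above can be sketched as a generator that alternates between synthesis and simulation. The `synthesize` argument is a hypothetical callable standing in for texture synthesis, and `placeholder_synthesize` only rescales the grid so the example runs on its own; neither reflects the API of any real stylization library. The wrap-around grid is likewise an assumption.

```python
import numpy as np

def life_step(grid):
    """One Game of Life generation on a wrap-around grid of 0s and 1s."""
    neighbours = sum(
        np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    alive = (neighbours == 3) | ((grid == 1) & (neighbours == 2))
    return alive.astype(grid.dtype)

def living_tapestry(seed_grid, synthesize, frames):
    """Yield one texture per generation of the simulation."""
    grid = seed_grid
    for _ in range(frames):
        content = grid.astype(np.float32)  # simulator state becomes the content constraint
        yield synthesize(content)          # iteration turns the constraint into a texture
        grid = life_step(grid)             # evolution supplies the next constraint

def placeholder_synthesize(content):
    """Hypothetical stand-in for texture synthesis: a grayscale image of the state."""
    return (content * 255).astype(np.uint8)

rng = np.random.default_rng(0)
seed = rng.integers(0, 2, size=(32, 48))
textures = list(living_tapestry(seed, placeholder_synthesize, frames=3))
print(len(textures), textures[0].shape)
```

Because each frame starts from the previous simulator state rather than fresh noise, consecutive textures are related to one another, which is what lets the sequence read as movement.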


Living Tapestry made from Divan Japonais by Henri de Toulouse-Lautrec.

Living Tapestry made from The Scream by Edvard Munch.

Living Tapestry made from Bicentennial Print by Roy Lichtenstein.

The Game of Life gives movement to these living tapestries in an organic yet alien manner. There are a number of ways to expand on this work. A different cellular automaton with different rules would result in a different type of movement. It is also possible to mix and interpolate styles. I used the pre-trained styles provided by Google Brain Magenta, and different styles produce different effects when used as a living tapestry.

Finally, I’d like to close on the role of noise in design. In science and engineering, noise impedes understanding by polluting data collection. It is the role of scientists and engineers to extract signal from noise. In design, noise is a material to be sculpted into definition. Something constitutes noise when it does not cognitively register: it sounds or looks random. But given enough samples of a specific pattern of noise, we begin to recognize it; it moves from the space of random noises to the space of recognized signals. Science is about finding signals in nature, while design is about forging signals for individuals and society, the realm of culture.

Moving from noise to signal is a learned behavior. There is a huge focus on learning in many disciplines at the moment. Neuroscience and cognitive science illuminate factors about how humans learn and use signals that arise from nature or culture. Machine learning research is focused on teaching machines to use the signals that we understand but cannot symbolically define. The forefront of both science and design is dominated by machine learning. Science will have a deeper understanding of nature through machines that can classify patterns that still appear to us as noise. Machines will also help us generate culture in new and exciting ways that will redefine society and the built environment [6].

I believe that noise is the ultimate material. Every physical, digital, and biological material that designers use is simply a proxy for noise. In ceramics or any other discipline, you start with a lump of clay or other material. This undefined blob is noise; it holds no signals except its bare material properties. Through constraints, iteration, and evolution, designers shape the material into an object with functional, aesthetic, and emotional properties. And through these objects, designers communicate with society, registering new signals and generating culture. Shaping the undefined into something understood is the most important role of design. But until now, designers have worked through proxies of noise to shape the undefined. Style transfer begins with noise; this is the initial material that is manipulated and given a shape. Machine learning holds tremendous potential as an emerging design tool that allows designers to work directly with noise as a material.


Acknowledgements

I forked Google Brain Magenta’s image stylization and also used their style checkpoints. I used Thearn’s Game of Life implementation.

  1. If you are familiar with machine learning, you can find a more thorough definition of the cost function in the original paper, A Neural Algorithm of Artistic Style.

  2. This is how the original style transfer works. Fast style transfer produces its output in a single forward pass, and the iterative aspect occurs during training.

  3. Texture Synthesis Using Convolutional Neural Networks

  4. There are two possible values for each pixel and there are 25 pixels. 2^25 = 33,554,432.

  5. When the designer defines a cost function mathematically and runs it computationally, the practice is called Computational Design.

  6. For a more somber analysis of machine generated culture, I recommend an earlier article on Machine Learning and Misinformation.