Thursday, September 26, 2019

GANs and other deep learning models for cooking recipes

I retired this spring after working on artificial intelligence projects since the 1980s. Not having to work on large projects for other people and companies is liberating, and it frees up time for thinking about new ideas. Currently I am most interested in deep learning models for generating and evaluating recipes. For now I am using a GAN model, which I am calling RecipeGAN.

When I managed a deep learning team at Capital One, I used GANs to synthesize data. During a Saturday morning quiet-time hacking sprint in my first month at that job, I had the idea to take an example program, SimpleGAN, that generated MNIST digits and have it generate numeric spreadsheet data instead (using the Wisconsin Cancer Data Set, which I had previously used in my books as example machine learning data). I was really surprised by how well this worked: I could generate fake Wisconsin cancer data, train a classification model on the fake data, and get classification accuracy on real data samples that was almost as good as that of a model trained on real data. This took only about 40 lines of code changes to the short SimpleGAN TensorFlow example/demo program. My team took this simple idea and built a robust production system around it that is well described in Austin Walter’s Medium article.
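The evaluation described above (train on fake data, test on real data, compare against a model trained on real data) can be sketched in a few lines. This is only an illustration under loud assumptions: the two-feature toy data, the `make_toy_rows` helper, and the nearest-centroid classifier are hypothetical stand-ins, not the Wisconsin data or the model I actually used, and the "synthetic" rows here come from the same toy sampler rather than from a trained GAN generator.

```python
import random

def nearest_centroid_fit(rows, labels):
    """Return {label: centroid}, where a centroid is the mean feature vector."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def nearest_centroid_predict(centroids, row):
    """Predict the label whose centroid is closest to the row."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, rows, labels):
    hits = sum(nearest_centroid_predict(centroids, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def make_toy_rows(rng, n, center, label):
    """Hypothetical stand-in data: n noisy points around `center`."""
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)], [label] * n

def make_split(rng, n):
    rows_a, y_a = make_toy_rows(rng, n, [0.0, 0.0], "benign")
    rows_b, y_b = make_toy_rows(rng, n, [6.0, 6.0], "malignant")
    return rows_a + rows_b, y_a + y_b

rng = random.Random(0)
real_train_X, real_train_y = make_split(rng, 100)
real_test_X, real_test_y = make_split(rng, 50)
# Assumption: in a real experiment these rows would be generator output.
synthetic_X, synthetic_y = make_split(rng, 100)

# Baseline: train on real data, test on held-out real data.
baseline_accuracy = accuracy(
    nearest_centroid_fit(real_train_X, real_train_y), real_test_X, real_test_y)
# Train on synthetic data, test on the same held-out real data.
tstr_accuracy = accuracy(
    nearest_centroid_fit(synthetic_X, synthetic_y), real_test_X, real_test_y)

print("trained on real:     ", baseline_accuracy)
print("trained on synthetic:", tstr_accuracy)
```

The point of the comparison is the gap between the two numbers: if a model trained only on generated rows scores close to the baseline on real held-out rows, the generator has captured the structure that matters for the classification task.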

Several years ago, a fan of my CookingSpace.com web app gave me 100K public domain recipes in digital format, so I should have ample training data for RecipeGAN. I will put the code and data on GitHub when I am done with this experiment.

If you are not familiar with Generative Adversarial Networks (GANs), in the cooking/recipe context the idea is simple enough: a generator model takes as input a random vector (referred to as the Z vector, or latent input) and generates random recipes (for now represented as sparse vectors indicating which ingredients are used). A discriminator model learns to tell the difference between fake ingredient lists produced by the generator and real ingredient list samples. Both models are trained jointly, so the generator learns to better fool the discriminator while the discriminator learns not to be fooled. When this process is done, the discriminator model is no longer needed. New random latent Z vectors fed as input to the generator model hopefully generate realistic ingredient lists.
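To make the generator/discriminator loop concrete, here is a deliberately tiny sketch in plain NumPy with hand-written gradients. It is not the RecipeGAN code: the eight-item `INGREDIENTS` vocabulary, the six toy "real" recipes, the single-layer models, and the hyperparameters are all assumptions chosen so the example fits on one screen. The generator update uses the common non-saturating loss, -log D(G(z)).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary; a real run would use thousands of ingredients.
INGREDIENTS = ["flour", "sugar", "butter", "egg", "garlic", "onion", "tomato", "basil"]
N, Z_DIM, LR = len(INGREDIENTS), 4, 0.1

# Toy "real" recipes as multi-hot vectors (1 = ingredient is used).
real = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 1, 1],
], dtype=float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Generator: one linear layer from latent Z to per-ingredient probabilities.
Wg = rng.normal(0.0, 0.1, (N, Z_DIM))
bg = np.zeros(N)
# Discriminator: logistic regression giving P(input is a real recipe).
wd = rng.normal(0.0, 0.1, N)
bd = 0.0

def generate(z):
    return sigmoid(Wg @ z + bg)

def discriminate(x):
    return sigmoid(wd @ x + bd)

for step in range(2000):
    x_real = real[rng.integers(len(real))]
    z = rng.normal(size=Z_DIM)
    x_fake = generate(z)

    # Discriminator step: minimize -log D(real) - log(1 - D(fake)).
    p_real, p_fake = discriminate(x_real), discriminate(x_fake)
    wd -= LR * ((p_real - 1.0) * x_real + p_fake * x_fake)
    bd -= LR * ((p_real - 1.0) + p_fake)

    # Generator step: minimize -log D(G(z)), backpropagating through D.
    p_fake = discriminate(x_fake)
    grad_x = (p_fake - 1.0) * wd                  # dLoss/d(generator output)
    grad_pre = grad_x * x_fake * (1.0 - x_fake)   # through the sigmoid
    Wg -= LR * np.outer(grad_pre, z)
    bg -= LR * grad_pre

def sample_recipe(threshold=0.5):
    """Feed a fresh latent Z to the generator; the discriminator is not used."""
    probs = generate(rng.normal(size=Z_DIM))
    return [name for name, p in zip(INGREDIENTS, probs) if p > threshold]

print(sample_recipe())
```

A model this small on six recipes will not produce anything reliable, and GAN training is often unstable even at full scale, so treat the output as a demonstration of the data flow only. Note that `sample_recipe` touches only the generator, which matches the point that the discriminator can be discarded once training is done.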

I am also interested in language generation, and an end goal for my current research is to generate English directions for making the fake recipes (the ingredient lists created by the RecipeGAN generator model). This is a fun project and I also hope that the code and data will be useful to other people, even if I don’t get good results. Indeed, I am writing this blog post now to encourage myself to share results no matter how well the system works. Ideas are meant to be shared.

BTW, please don’t take my proclamations of being retired too seriously. I am still helping people, as a consultant, get started on deep learning projects.