I’d like to tell you the story of the Ultimate Chocolate Chip Cookie Recipe.
This isn’t the Neiman Marcus $65,000 cookie recipe. Nor is it the classic Toll House Chocolate Chip Cookie recipe that we all grew up with (and, though the instructions are all the same, my Mom made the best). This is a recipe learned from thousands of bakers around the world, via love and math.
There are currently 2,480,000 Google search results for “chocolate chip cookie recipe.” Imagine a young baker, in the kitchen late at night, with a laptop at hand. Of course she would write a program to gobble them all up!
Recipes are highly structured text, following easy-to-read patterns. First come the ingredients, each with an amount and a description. Next come the instructions, which refer back to the ingredients. There are even recipes already marked up for computers to understand, using a (micro)format called hRecipe. Thanks to these structured documents, it’s possible to write a bit of code that gathers some statistics and builds a parser: a program that can break any text recipe down into a form that a computer can understand.
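To make the idea of a parser concrete, here is a minimal sketch in Python of how one ingredient line might be split into an amount, a unit, and a description. It is an illustration, not the program described in this story: the function name parse_ingredient, the tiny UNITS table, and the regular expression are all assumptions, and real recipes are far messier than this handles.

```python
import re
from fractions import Fraction

# Hypothetical unit vocabulary; a real corpus would need a much longer list.
UNITS = {
    "cup": "cup", "cups": "cup",
    "tsp": "tsp", "teaspoon": "tsp", "teaspoons": "tsp",
    "tbsp": "tbsp", "tablespoon": "tbsp", "tablespoons": "tbsp",
}

# Leading amount: mixed number ("2 1/4"), fraction ("1/2"),
# decimal ("0.5"), or whole number ("2"), tried in that order.
LINE = re.compile(
    r"^\s*(?P<amount>\d+\s+\d+/\d+|\d+/\d+|\d*\.\d+|\d+)\s*(?P<rest>.*)$"
)

def parse_ingredient(line):
    """Split one ingredient line into amount, unit, and item.

    Returns None when the line does not start with a number.
    """
    match = LINE.match(line)
    if match is None:
        return None
    # "2 1/4" becomes Fraction(2) + Fraction(1, 4).
    amount = sum(Fraction(part) for part in match.group("amount").split())
    words = match.group("rest").split()
    unit = None
    if words:
        key = words[0].lower().rstrip(".")
        if key in UNITS:
            unit = UNITS[key]
            words = words[1:]
    return {"amount": amount, "unit": unit, "item": " ".join(words)}

print(parse_ingredient("2 1/4 cups all-purpose flour"))
print(parse_ingredient("2 eggs"))
```

Exact fractions are used instead of floating-point numbers so that amounts like 1/3 cup survive being added and compared across recipes.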
Then, math. We can build a statistical model of chocolate chip cookies, encompassing many different kinds of information. We can learn that high-altitude cookies tend to require an additional 0.5 cups of flour and up to 2 tsp of water or an additional egg. We can also learn that crispy cookies are baked at 375 degrees Fahrenheit for an average of three minutes longer than chewy cookies.
The median recipe represents the optimal combinations of ingredients and instructions; it is the combined deliciousness of thousands of bakers. The statistical model gives us the bounds of their wisdom, the ability to modulate different qualities of the cookies easily and without years of effort in the kitchen.
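One simple reading of the "median recipe" is to take the median amount of each ingredient across all parsed recipes, with the quartiles standing in for the "bounds." The Python sketch below does that under stated assumptions: the sample amounts are made up for illustration, the function name summarize is hypothetical, every amount is assumed to be already converted to cups, and a per-ingredient median is only one possible interpretation of the model described here.

```python
from statistics import median, quantiles

# Made-up amounts (in cups), purely for illustration; not data from this story.
recipes = [
    {"flour": 2.25, "sugar": 0.75, "butter": 1.0},
    {"flour": 2.5, "sugar": 1.0, "butter": 1.0},
    {"flour": 2.0, "sugar": 0.75, "butter": 0.75},
    {"flour": 2.25, "sugar": 1.0, "butter": 1.0},
    {"flour": 3.0, "sugar": 0.5, "butter": 1.25},
]

def summarize(recipes):
    """Return the median and quartile range of each ingredient amount."""
    names = sorted({name for recipe in recipes for name in recipe})
    summary = {}
    for name in names:
        amounts = [recipe[name] for recipe in recipes if name in recipe]
        if len(amounts) >= 2:
            low, _, high = quantiles(amounts, n=4)  # first and third quartiles
        else:
            low = high = amounts[0]  # too few recipes to estimate a spread
        summary[name] = {"median": median(amounts), "low": low, "high": high}
    return summary

for name, stats in summarize(recipes).items():
    print(name, stats)
```

The median is used rather than the mean so that one eccentric recipe, such as the three-cup flour outlier above, does not drag the result away from what most bakers do.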
We’ve created this data infrastructure for things like targeting advertisements on the web, but we can use it to help us create and consume wholesome, delicious food. We need to use it to understand the failures and vulnerabilities of our current food system, and to build a more robust one alongside it. It’s all possible right now. Let’s do it!
And the Ultimate Chocolate Chip Cookie? It is a damn fine cookie.
Of course, my Mom’s are still better.
Hilary Mason is the Chief Scientist at bit.ly, where she finds sense in vast data sets. Her work involves both pure research and development of product-focused features. She’s also a co-founder of HackNY (hackny.org), a non-profit organization that connects talented student hackers from around the world with startups in NYC. Hilary recently started the data science blog Dataists (dataists.com) and is a member of hacker collective NYC Resistor.
She has discovered two new species, loves to bake cookies, and asks way too many questions.