Data Skeptic is the official podcast of dataskeptic.com, bringing you stories, interviews, and mini-episodes on topics in data science, machine learning, statistics, and artificial intelligence.

Deb Ray is the CTO of End Cue, where his team of data scientists and engineers is building the company's AI platform for analyzing and generating original content. Previously he was the chief data officer at VideoAmp, where he continues to serve as an advisor. Prior to that he was the chief scientist at Pasadena Labs, a company he founded during his doctoral studies that built machine-learning-based applications for clients like Microsoft and IAC. Deb holds a PhD in machine learning from Caltech and completed his undergrad at the University of Toronto, where he began doing research in deep learning and computer vision in Geoff Hinton's group. He has been involved in the field for over ten years. He is currently a fund adviser and investor at the VC firm SignalFire and an angel investor in AI and blockchain startups.
Deb, welcome to Data Skeptic.

Thank you very much for having me, Kyle.

So, I guess to kick things off and promote this a little bit: you'll be speaking at the upcoming conference that we're having. Can you give me a few details about what you're going to talk about?

Yeah, I'm really looking forward to the SoCal Data Science Conference. I'll mainly be talking about some of the efforts at End Cue in using generative models, in particular generative deep learning models, to create original content.

That's October 27th in Pasadena, California. I'll have some links in the show notes for anyone who wants to check that out; a good lineup of other speakers as well. I hope you can come hear Deb, and many of the others too. So, there are many areas you and I could get into today. I thought it'd be most interesting to open up by talking a little bit about End Cue. Can you tell me some of what you guys work on there?
End Cue is a reimagining of the film production company. The founding team is filmmakers, as well as VCs who have been in Silicon Valley for over 20, 30 years. I recently joined End Cue because I wanted to lead an initiative on bringing technology, especially AI, machine learning, and some advances in VR, into the film production process. The advantage of doing this with End Cue is that there is a lot of production expertise in-house. Film production is an art, and we have a head start in the production space by working within End Cue, leveraging that in-house expertise as well as the industry contacts.

Yeah, I'm starting to see a lot of interest in this space. Historically, Hollywood was thought of as an artistic thing, and of course it is,
but many people have the opinion that if something's artistic, there isn't much a mathematician can do to help with it. How has that been changing in recent years?

It's an interesting point. After my PhD in machine learning I got into the space of advertising technology, because there's a lot of behavioral data there and it's a great place for a machine learning person to be. But the core insight there was that what really drives advertising is the underlying content. For the longest time, content was supposed to be something generated artistically; it's a creative process, and so on. We wanted to take a closer look at that, especially because I'd seen some of these methods in deep learning where you could generate model hypotheses, generate text, generate images. So I started thinking about how these could be applied to filmmaking. Furthermore, in the film space there's the whole production aspect. You start with a script, which is pretty much the blueprint of a project, and then it takes on a project management scope from there: you're trying to optimize people's schedules and minimize the budget, and there are tons of variables, like where people are going to be on a particular day, plus the financing involved. So there are a lot of variables in the production space as well, and that's another place where I saw the opportunity to apply AI: to optimize film production, to decrease budgets and timelines. So really, what we're trying to do now at End Cue is bring tools that help creatives and producers create much better versions of the products they would have created without us. An analogy I'd draw is AutoCAD. Not too long ago, the field of architecture was supposed to be a purely creatively driven endeavor, but tools for visualization, machine learning, and all these things came into the field, so that you can take the ideas architects have and supercharge them, helping them create much more complex versions of what they had in their mind while also taking away a lot of the tedium.
So that's what we're focused on.

In terms of those tools, what does that end up being? As the end user, do I log into some system? How do I engage with End Cue if I'm a movie producer?

Of the external-facing products we have now, one is StoryWriter. You could be an amateur story writer or a professional; you can use our StoryWriter platform to upload a script at any stage, even an unfinished one. First we visualize it: we extract all the elements of the story, construct character interactions and story arcs, and help you visualize your story. With the aid of that visualization you can easily identify places that need more development, whether there are any particular holes in the script, and so on. The step beyond that: if you go in and change a certain variable, let's say a character interaction, or you introduce a new character, how does that affect the rest of the story? And then, furthermore, we plug in our generative models, so we can actually generate pieces of a script, or generate dialogue. It gives you the ability to take a story that you've just thought of and turn it into a much better fleshed-out, complete script.
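To make the "extract the elements and construct character interactions" idea concrete, here is a toy Python sketch of one way a script could be turned into a character-interaction graph. The screenplay conventions it relies on (ALL-CAPS dialogue cues, INT./EXT. sluglines) and the scene co-occurrence heuristic are assumptions for illustration, not a description of End Cue's actual pipeline.

```python
import itertools
import re
from collections import Counter

import networkx as nx

def interaction_graph(script_text):
    """Toy character-interaction extraction from a screenplay.

    Assumes the common convention that character names appear as
    ALL-CAPS cue lines above their dialogue, and that scenes start
    with INT./EXT. sluglines; two characters are linked whenever
    they speak within the same scene. A real system would use proper
    screenplay parsing and NLP rather than these heuristics.
    """
    edge_counts = Counter()
    scene_chars = set()
    for line in script_text.splitlines():
        stripped = line.strip()
        if re.match(r"^(INT\.|EXT\.)", stripped):  # new scene: flush pairs
            for a, b in itertools.combinations(sorted(scene_chars), 2):
                edge_counts[(a, b)] += 1
            scene_chars = set()
        elif re.fullmatch(r"[A-Z][A-Z .'-]{1,30}", stripped):  # cue line
            scene_chars.add(stripped)
    for a, b in itertools.combinations(sorted(scene_chars), 2):
        edge_counts[(a, b)] += 1  # flush the final scene
    graph = nx.Graph()
    for (a, b), w in edge_counts.items():
        graph.add_edge(a, b, weight=w)
    return graph
```

Graph measures on the result (degree, centrality, how edge weights change act by act) are one plausible route to the kind of character and story-arc visualizations described above.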
So that's the StoryWriter part. On the production side, studios have a lot of these tools, but independent producers don't have access to them: tools where you can take your script, extract the core elements from it, figure out the budget aspects, figure out the scheduling, and generate call sheets after optimizing all the variables that go into your production. That's the production optimization tool. And finally, and this is something we use internally, though we've thought about exposing it or offering it as a SaaS-like service: we can go from a script and pretty accurately project outcomes in terms of budgets and production schedules, and tie in data about performance, like predicted box office or predicted distribution. We can go end to end, from a script all the way to potential revenue and costs across different distribution channels. So now we have an end-to-end model where we can look at lots and lots of scripts out there and see the likely outcomes, and if we like the outcomes, we can exercise options to buy those scripts. That's the way we get into the funding side of this. It's a tool we've used internally, and it's one of the reasons a lot of the productions we've done at End Cue have turned out to be very profitable.
So, a lot of stuff for us to get into. I think maybe it'd be most interesting to unpack the last one first: trying to take a look at a script and figure out which ones you might want to fund. Did you say you were potentially predicting box-office revenue from the script?

The idea of predicting box-office revenue from a script is not entirely new. Even in the current state of the art, people look at scripts, they go through breakdowns, they put in variables like who the talent is going to be, where it's going to be shot, the location, the genre, and so on, and that goes into a macro model that makes predictions on box-office performance, or on whether it's likely to get distribution on Netflix or Amazon or various other channels. And we all know that obviously filmmaking is not an exact science, and some of these macro indicators don't really work that well, so it's a constant effort to introduce newer and newer variables. Where we can come in is actually looking at the scripts, evaluating the quality of the writing and the character interactions, but also looking at the production process itself. The way it typically works at studios now is that they greenlight a project based on the script and these macro variables; then it goes into a production stage with a script breakdown, budget estimates, and so on, which adds another two to three weeks to the process. We automate that process, bringing it much further upstream.
So at the development stage itself you have a much more accurate model of what the likely budget and outcome are going to be, and you're making much better decisions early on.

In terms of making a prediction like that, determining what to fund or what a film might end up generating at the box office: it's interesting. I think if it were day one and we were starting that project from scratch and I were to tackle it, I would come up with some heavily engineered features, like how big a star the various people are, who the attached director is, some of these really high-level things that certainly have predictive power. But I don't think I'd get very far with my little logistic regression; I'd probably get a very small r-squared. There must be more sophisticated things you're looking at. You mentioned quality of writing, but I would also say that I don't know that every box-office hit is necessarily known for its great literary accomplishment. Sometimes they're just, I don't know, fun, or like you said, it's unpredictable why something succeeds. How do you go about modeling that? What more sophisticated things can you look at, beyond who the actors and actresses are and those sorts of details?

It's a very good question. We do consider those macro variables, like the genre and the performers and so on, because they contribute to the final outcome quite a lot; the lead actor, for example, has a huge effect on the outcome of the movie. But where we can add a lot more data from the script itself is by using deep learning models. These are the models we had to build on our side for our generative work, when we wanted to create original content, and we use them in a regression sense as well, to come up with scores for a script itself. Those scores come in as variables, along with a lot more variables we can extract with those models, and then we add in the production elements. So now we're dealing with a lot more variables than what you might fit into a simple regression model.
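As a rough illustration of that feature-stacking idea, here is a toy Python sketch on synthetic data: a handful of macro variables sit alongside scores a learned model assigns to the script and production-derived estimates, all feeding one regressor. The feature layout and the choice of gradient boosting are assumptions for illustration, not End Cue's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Hypothetical features, one row per historical film: a few macro
# variables (genre indicators, lead-actor draw), scores a deep model
# assigns to the script (writing quality, character-interaction
# strength), and production-derived estimates (budget, schedule).
n_films, n_features = 500, 12
X = rng.normal(size=(n_films, n_features))
# Synthetic "box office" target, stand-in for real historical data.
y = X @ rng.normal(size=n_features) + rng.normal(scale=0.5, size=n_films)

# A nonlinear ensemble can absorb many heterogeneous features where
# a small hand-tuned linear/logistic regression would plateau.
model = GradientBoostingRegressor().fit(X, y)
print(model.score(X, y))  # R^2 on training data; cross-validate in practice
```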
You guys are involved in producing, it's not a biopic, but a movie that stars a fictionalized version of Michael Jackson. Of course, anyone who reads the script can understand that. I, as a person, am culturally aware: I know who Michael Jackson is, and I know that it's going to be more successful than the same movie made with a generic character, because you're bringing his personality to life and he has a fan base and all that sort of thing. I don't know that that's something a deep learning network would figure out just looking at a script. So I'm curious: how much of a movie is predictable in the data you have available?

I would say about 50% of the variance comes from the script and the resulting production. The script itself ends up shaping the character interactions; when movies are released, you know people love to point out holes in the script. The script also affects the ultimate production and therefore the budget, so you can choose to take out certain elements of a script to reduce the budget, but how does that end up affecting the overall score of the movie? Those things, I would say, contribute roughly 50% of the variance. There's also additional data we use, like distribution data (what is the likelihood of getting distribution for a certain film?) as well as search data, so search trends, and Twitter data. So we're putting a lot of data into these models.
I would imagine, though, that you have a relatively small data set. I know a lot of movies come out, more so than even I'm probably aware of, but you have fewer records than you've had on the ad-tech projects you've worked on. I imagine that even to do cross-validation you probably have to be choosy about how much of your set to use for holdout. Do you bump into issues with low sample size, just because there are a finite number of movies made?

Not only low sample size; you also have a sample selection bias if you're just looking at movies that are in the public domain or that have been made.

Oh, good point, if we're just training...

Yeah, if you're just training your deep learning model using movies that have been made, and then running that against box-office data, it would be heavily skewed and have a huge sample selection bias: by far the majority of scripts never get made, right? So we do control for that, because we've inserted ourselves into various points of the ecosystem to be able to collect this data. We look at scripts that are under development by working with different partners, and we also look at scripts that have never been picked up. So we have a much bigger sample of the scripts that are out there, and then a small fraction of them are the ones actually being considered, and an even smaller fraction actually get made. That's one way we're looking at a much bigger data set, to control for some of these biases.

Gotcha. But then what's your objective function on the movies that were written but never made?

That's a very good question, because ultimately there's not only box-office success; you also have the aesthetics, the creative decisions, too. For that, we're limited to the network we've created, where we have annotated data and we've collected data internally.
People have gone through scripts and annotated the parts they've liked, so we've collected a lot of human-labeled training data.

I see. Yeah, that's the key right there, I think.

Correct, that's right. Obviously there are limitations in collecting that data, but that's really our advantage. We realized early on that in order to do this well, in order to solve this really hard problem, we had to create datasets, or tap into datasets, that other people don't have access to. That's why we have to annotate a lot of this data ourselves, internally, and it's an ongoing project where we collect data from multiple sources and annotate it with amateur and, sometimes, when we can access them, professional writers.

Oh, very interesting. Yeah, that was going to be my next question. The annotations are very interesting to me, because if you get someone to be an annotator who is exceptionally good at spotting hits, well then the machine learning is pretty easy; but I would be surprised if any one person existed who does that. So maybe you're tapping into the wisdom of the crowds, or maybe your system is even learning under what conditions certain annotators provide useful feedback. Do you have any sort of black-box sense of how your algorithms take advantage of those annotations?

Yes. So a general Mechanical Turk approach doesn't really work here, because you can't just distribute it on Amazon and come back with a bunch of useful annotations; there has to be some expert data involved. But to a large extent the wisdom of the crowds does work. That's one of the reasons why, when you look at films and film predictions, critic ratings do not necessarily correspond to how successful the film is. When we looked at our annotators and how useful their annotations have been, we found we don't really need a large number of samples from expert annotators or top critics out there. The wisdom of the crowd does help a lot, because ultimately part of the measure we have is how well-liked the script would be by a crowd.
And how has Hollywood responded to the idea that you could predict, or maybe rank-order, whatever your deliverable is, and give them direction on what they should fund or work on? It seems to me that there are a lot of people who, maybe in the interest of protecting their jobs, or due to their high confidence in their own abilities, might say, you know, no machine can help me here. Has there been wide acceptance, or any resistance, to the services you guys offer?

In some of the early attempts at bringing AI into the creative space, a lot of the earlier inventors made a lot of big claims, and as a result of that, naturally, people tend to believe either that the claims are exaggerated or that AI is a threat to creative endeavors. None of that is really true. Ultimately, art is an aesthetic endeavor: how good a piece of art or writing is depends on the observer, so it's a subjective measure. It's not the typical setting where we can define an objective function. That's why, if we're going to develop these methods, we ultimately, definitely need a human in the loop who is guiding the process and helping the machine learn what's good aesthetics and what's not. We've realized this, and the way we position ourselves is that this is a creative endeavor; our tools and our methods come in to take away a lot of the tedium, and also to take a creation and make it a lot more complex. An example of this is interactive video games, where based on the choices you make, the stories can change. Or take the example of TV shows like Game of Thrones, where there have been a lot of character interactions over many, many episodes. Both involve really complex storylines, and that's one place where people can rely on these tools to make their writing a lot better.
So we've talked a lot about script analysis and some of the funding and things; let's get into going the other direction: generative models. Everybody, I want to take a quick break from the show and tell you about one of our sponsors, Periscope Data. Periscope Data is perfect for data teams. Anyone that knows SQL, just basic SQL, SELECT-FROM-WHERE stuff, will be able to master this tool instantly. Are you doing visualizations in Excel? Geez, you've got to get past that. Even if you're doing something a little fancier, like some matplotlib stuff, check out Periscope Data. You can quickly and easily build these interactive charts, and they have really intelligent defaults. If you get asked a lot of analytical questions, or you want to do a lot of quick analysis, I can't imagine doing that without Periscope Data. And then, once you build stuff, it's in the dashboard, and it automatically resyncs to new data, so you can follow all these things you've looked up one time as trends in the future, and organize them into different dashboards that you share with different collaborators. If your collaborators can use a web browser, they can definitely use Periscope Data. Very easy to use, very slick. See if it's right for you and your team at periscopedata.com/skeptics.
So, we've talked a lot about script analysis and some of the funding and things; let's get into going the other direction, generative models. In particular, I guess we should start by talking about Sunspring. I think probably every listener is already going to be familiar with it, but just in case there's someone out there who isn't, can you share the story of how Sunspring came to be and what the final production was?

So, Sunspring was definitely an experimental effort by End Cue. It was taking a, what I'd say at this point of our development, very rudimentary model, based on a long short-term memory network with a maximum likelihood objective function, that could generate text in the form of a script. What captured a lot of people's fascination was taking a script that was produced by a deep learning algorithm and converting that into a film. Now, there was a lot of effort, and I think some excellent direction, that went into taking that script and turning it into something that was watchable, and people actually liked it. So the process at that point was a text generator, a long short-term memory text generator, and it was produced in a very time-constrained manner: within a space of 48 hours, because it was produced during a competition.

So the film itself was, but was the script written in advance?

The script was also generated during the competition.

Gotcha. I think that part can be forgotten.
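For readers curious about the mechanics: below is a minimal PyTorch sketch of the kind of character-level LSTM language model described here, trained by maximum likelihood (cross-entropy on the next character) and then sampled from. It is a generic illustration of the technique, not Sunspring's actual code, and the hyperparameters are arbitrary.

```python
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    """Character-level LSTM language model. Training (not shown) would
    minimize cross-entropy on the next character, i.e. maximum likelihood."""
    def __init__(self, vocab_size, embed_dim=64, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, state=None):
        h, state = self.lstm(self.embed(x), state)
        return self.head(h), state

def sample(model, start_idx, length, temperature=1.0):
    """Generate text by repeatedly sampling the next character."""
    idx = torch.tensor([[start_idx]])
    state, out = None, [start_idx]
    for _ in range(length):
        logits, state = model(idx, state)
        probs = torch.softmax(logits[0, -1] / temperature, dim=-1)
        idx = torch.multinomial(probs, 1).view(1, 1)
        out.append(idx.item())
    return out

model = CharLSTM(vocab_size=96)
print(sample(model, start_idx=0, length=40))  # gibberish until trained
```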
Taking all of this together, it was definitely quite an effort to produce the ultimate film within the time constraints, but ultimately people really liked it. Now, the storyline and the script weren't completely coherent. When I talk to people in film, for them it's quite fascinating, because the script was generated by AI and was ultimately produced. But computer scientists tend to be harsh critics of themselves and of each other's methods, myself included, so when I talk to other computer scientists, we obviously dive into how we can make this better, how we can make future scripts more coherent. And it was ultimately really good, because it sparked a dialogue and helped us push the technology forward. That was really the starting point, and we've made a lot of improvements since then.

How interesting. Can you go into some of those advancements?

Maybe I can give a little bit of background, especially on generative models. Generative models model the joint distribution of the input and the output, so what you can do is sample data, because you've modeled the joint distribution. Some of the early work you might have seen with generative deep learning models includes the images that were generated with TensorFlow: they have this crazy hallucinatory look to them.

Inceptionism, right?

Right, right. So that was in the case of images.
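A quick note on the terminology, standard textbook material rather than anything specific to End Cue: a discriminative model learns only the conditional $p(y \mid x)$, while a generative model learns the joint, which factorizes so that you can sample from it step by step (ancestral sampling):

```latex
p(x, y) = p(y)\, p(x \mid y)
\qquad\Longrightarrow\qquad
y \sim p(y), \quad x \sim p(x \mid y)
```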
Beyond that came generative adversarial networks, which moved from generating those alien-looking images to something much sharper and closer to the training set. So that's images, and there has also been some work on generating musical pieces, as well as sounds. A lot of the advancements have happened in domains where you have a continuous input space. Language is different, and language is a lot harder, because language works in discrete spaces. What I mean by that is: if you take an image with a color, like a red, and you move a slight gradient step, you get some other version of red; or if you have a circle and you move a slight gradient step, it still looks like a circle. That makes the objective function differentiable, and you can use a lot of gradient-based methods. Language doesn't quite work like that. If I have a sentence like "my cat is cute," and you take the word "cat" and do one gradient move, turning the C into a B, "my bat is cute" does not make a lot of sense. So a lot of the optimization methods that worked for images and music and these continuous-space models don't work when it comes to language. In language you're evaluating the entire output. That's one of the things we had to do: move away from a lot of those objective functions toward different search techniques, like beam search, and toward evaluating entire sentences, which requires a whole different paradigm. That's why I'd say language is also harder. And at the same time, for language you need a decent-sized training corpus to really understand whether what you're generating is good or not. Those are the two places we had to start.
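Since beam search comes up here, a minimal Python sketch of the plain version: instead of greedily committing to the single most likely next token, keep the top-k partial sentences at every step and score complete outputs. The step_fn interface is a hypothetical stand-in for a trained decoder, not End Cue's actual system.

```python
import heapq
import math

def beam_search(step_fn, start_token, end_token, beam_width=4, max_len=20):
    """step_fn(prefix) returns {token: probability > 0} for the next token."""
    beams = [(0.0, [start_token])]          # (cumulative log-prob, tokens)
    completed = []
    for _ in range(max_len):
        candidates = []
        for logp, seq in beams:
            if seq[-1] == end_token:        # finished hypothesis: set aside
                completed.append((logp, seq))
                continue
            for tok, p in step_fn(seq).items():
                candidates.append((logp + math.log(p), seq + [tok]))
        if not candidates:                  # every hypothesis has finished
            break
        # Keep the k best partial sentences instead of greedily
        # committing to the single most likely next token.
        beams = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    else:
        completed.extend(beams)             # ran out of length budget
    return max(completed, key=lambda c: c[0])

# Toy usage with a fake next-token distribution.
def step_fn(prefix):
    return {"cats": 0.5, "rule": 0.3, "</s>": 0.2}

print(beam_search(step_fn, "<s>", "</s>", beam_width=3, max_len=6))
```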
Furthermore, generative adversarial networks work really well with images because, when GANs started, their purpose was to produce sharper-looking images: the GAN method would go and explore the space and try to hone in closer and closer to the maximum. That's not the case when we're dealing with language, because I could say two different sentences that have similar meanings. That's why we started looking more at autoencoder models, or variational autoencoder models, which have been more successful for our purposes.

Yeah, can you tell me a little bit more about those? I think that's one of the keys to the work you guys have done, right?

Variational autoencoders have been used quite a lot in neural machine translation; another technique used there is sequence-to-sequence models. An autoencoder takes an input, which could be very high-dimensional, and converts it into a low-dimensional space, its own representation. To give you an example, let's take a tone. The sound could be a bird chirping: at the waveform level, which could be the input, there's a lot of variation, but underneath, the representation that's learned is just "a bird is chirping." If you break it down into that lower-dimensional representation, it's a much more compact representation. In a lot of senses that's what our brain does as well: it takes an auditory input, and our auditory cortex encodes it into a much more compact representation. And then there's a decoder element to that: once you want to produce new examples, you sample a point in that low-dimensional representation and then apply the inverse process, so you go from what a bird song might sound like to the actual waveform. So it goes through a process of first compressing and then decompressing. The variational part basically means that the exact objective function is really hard to compute, because the distribution is multimodal and high-dimensional and so on. One way of handling that is to use sampling methods, like Markov chain Monte Carlo methods, but that's computationally very expensive, so instead you approximate it with much more tractable distributions, such as Gaussians. That is the variational part.
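For reference, this is the standard variational autoencoder objective (textbook material, not specific to End Cue's models): rather than computing the intractable posterior exactly or via expensive MCMC, you introduce a tractable approximate posterior $q_\phi(z \mid x)$, typically a Gaussian whose parameters the encoder outputs, and maximize the evidence lower bound:

```latex
\log p_\theta(x) \;\ge\;
\underbrace{\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]}_{\text{reconstruction (decoder)}}
\;-\;
\underbrace{\mathrm{KL}\big(q_\phi(z \mid x)\,\big\|\,p(z)\big)}_{\text{keep encoder near the prior}}
```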
Got it, that makes sense. And why do you think it's that formalism that works so well for your applications?

The reason is that with language, as I said, there's a thought and then there's an expression, so there are multiple ways I could express a particular concept. My core concept could be "I want to talk about cats" (because cats are so popular on the Internet, every example has to involve them). Now there are different sentences I can construct about that: the cat walking down the road, the cat meowing, and so on. So you go from a concept to all these forms that are representations of it. Similarly, when you look at a whole bunch of sentences, you're trying to extract the more low-dimensional representation; that's the encoding part. And when it comes to decoding, you want to go from concept to expression. So that works pretty well for language.

I was really glad with the way you framed the whole Sunspring project. I thought it was really neat, but you also take it in context: as you point out, coherency wasn't its strong suit. I don't know that it would have gotten the same attention if a human author had written it, although maybe David Lynch could have written it and that would have been typical, or something like that. But I think it's a great thing to celebrate in the sense that it's a breakthrough; it was a first, a pioneering effort, a script that was generated entirely generatively. And of course, a movie is the contribution of many things: the actors, the directors, all those people who came together to make it. But it's the first of many scripts, I think, that we'll see either written in combination with a human or, perhaps eventually, entirely automated.
I think there are probably some innovations needed beyond what we have today, in terms of the tools we use to do deep learning, for what it would take to do that. When I think of a story: could a neural network complete my sentences for me? Sure. Could it come up with a bunch of characters, develop their relationships, create a story arc? That's a lot of sequential, connected things that have to be put together. Do you think it's a matter of just throwing more neurons at it, or do we need to invent new ways of doing deep learning to achieve that ultimate goal?

We do need new architectures to do this. It has to be a hierarchical representation. One concept could be embedded in a sentence, but if you want to be coherent across a whole paragraph, you need another hierarchical level, and if you want to be coherent about a character across paragraphs, that requires yet another level. Hierarchical structures are very important here. Also, for deep neural architectures, having an external memory makes the process much more efficient. If the neural network can remember what was said, say, 52 pages ago, and have that be part of a search function, then you can continue in that same vein. An example is murder mysteries, right: in the early part of the movie you could have a knife placed under a desk, and that's used later on, at the end of the movie, and that's a key element. That's what the external memory helps with: being able to query a much earlier part of the script. So that's what we're exploring now.
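As a rough sketch of the mechanism, memory-augmented models typically read from their store with soft attention: embed the earlier sentences or scenes, then take a similarity-weighted average of them under the current decoding state. The few PyTorch lines below illustrate that read step only, under my own simplified assumptions; they are not End Cue's architecture.

```python
import torch
import torch.nn.functional as F

def attend(query, memory):
    """Soft attention read over an external memory.

    query:  (d,) vector summarizing the current decoding state.
    memory: (n, d) matrix of stored representations, e.g. embeddings of
            earlier scenes in the script. The weighted read lets detail
            from far earlier (the knife under the desk) inform what is
            generated now.
    """
    scores = memory @ query / memory.shape[-1] ** 0.5  # scaled similarity
    weights = F.softmax(scores, dim=0)                 # attention weights
    return weights @ memory                            # weighted read

memory = torch.randn(200, 64)    # e.g. one row per earlier scene
query = torch.randn(64)          # summary of the current state
context = attend(query, memory)  # what the decoder conditions on
```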
We have new productions since Sunspring. I can't go into too much detail there, because they're going to be submitted to some film festivals, but the part we wanted to emphasize was not so much "look at this script that was generated by AI, there's a novelty to it," but rather: where does it actually make sense to use these tools? One of the productions we have is set in a future where humans have to interact with AI. I mean, every day we're doing that: we're talking to our Echo or Google Home at home. As these devices get better and smarter, what might those conversations look like? So that's one take on it. The other production we're working on goes back to films that have been very successful and have used an AI character, like a lot of Star Wars, or Mystery Science Theater, where the AI dialogue was actually written by humans, so it's not very authentic. What happens if we actually use our models to have those AI characters' dialogue generated by AI? These are the things we're working on right now, and they should be released in the near future.

Where's the best place for people to follow End Cue and some of these things as they start rolling out?

You can follow End Cue's Twitter account; I'm also on social media, Deb Ray on Twitter. As we come up with new projects, both AI-generated ones as well as ones that are an output of the internal models we've used, we post about them regularly. It seems that some of our productions get picked up by the media because it's very topical, so we'd likely be in the news as well.
Well, Deb, I really look forward to watching that coverage and seeing what you guys do in the future. I'm also looking forward to hearing your talk at the upcoming conference. Once again, for anybody who wants to check that out, it's going to be in Pasadena on October 22nd: the Southern California Data Science Conference. A link to that, and to many of the other topics, will be in the show notes. Thanks again for joining me today; this has been great.

Thank you very much, and thanks for having me.

Data Skeptic is a listener-supported program. To support the show, visit dataskeptic.com and click on the membership tab.