Good morning everybody, thank you for attending. This event is being recorded, and we're going to go through a couple of big topics here. First of all, you've got a lot of marketing systems and your data is everywhere. We'll get to the agenda in a minute, and then Ganesh and I are going to dig into how companies in the real world are using artificial intelligence and machine learning for data science in the marketing analytics domain. So let's go to the next slide, and you'll see our pictures. I'm Grover, the one on the left there. I'm a mathematician (that's usually a warning, but in crowds like this it's not too bad), a math and topology expert; I did grad school in math and quantum electrodynamics. If you find that even slightly intimidating, please don't.
I'm a plumber: I deal with modern data science tools, and we have a lot of tools at our disposal. I deal with a lot of business data; my specialty is business, retail, marketing, and sales data. I have done some work in biogenetics, epidemiology, and pharma, but I primarily come back and work on business data systems. Over to you, Ganesh. Thanks Grover, and thanks for having me on your webinar today. I'm really excited to share some thinking around how you can get your data ready to do some of the elegant modeling and predictive analytics that Zepl is really great at. My background is in the SaaS world, in software-as-a-service startups, working with small companies and big companies. Prior to that I was in consulting, and Grover, if math is a warning then consulting probably is as well, so no harm taken; I've since converted to being a person in industry. I lead product marketing for a number of our go-to-market activities at Snowflake, and I'm really excited today
to share some of the learnings on the Snowflake side around how different customers are using Snowflake and Zepl in marketing analytics use cases. A quick bit of background on Snowflake: we're the cloud data platform. Essentially, what our platform does is take disparate data sources, like all of your customer data, and pipe them into a single source of truth, a single copy of data, which then becomes active for a number of different use cases. In terms of marketing analytics, the workloads that we support, everything from data engineering to data warehousing to data science, all support marketers in a number of different ways; you can see them here on the screen. Ultimately, what Snowflake does is make the data available for concurrent use, and we can scale up your compute resources on a per-second basis so that all of your use cases are fully supported. When it comes to data science, we know that power is key: getting to personalization in real time can make the difference in a customer engagement between being very personalized and being actually quite poor. That's one of the key use cases we see on the marketing analytics side, which brings me to Zepl. Grover, I'm super excited about the Zepl partnership overall; would you mind giving us a minute on how the partnership works and a bit of the integration? And I think you're on mute again.
Zepl is a data science notebook in the cloud, and we're scaling this all the way up from what used to be a kick-the-tires approach with notebooks, all the way to end-to-end production. One of the things I've been doing for years (not just this year; for maybe five or six years) is telling people we have a bad habit in the history of data science of using a bunch of flat files: losing our work, leaving it unsecured on laptops, not being able to share it. So what I tell people (and I'm sure this doesn't sound real modern to you guys, because you've cloudified everything) is: use a proper database. The characteristics of a proper database, of course, are that it's available, it's fast, it's secure, and you can reach it anywhere in the world. So what we do is provide data science notebooks: you can put modeling in there, you can train the models, but then, unlike most notebooks, you can actually go into deployment. And that's where you'll be really glad you're hooked to the live data in Snowflake, because it's available anywhere in the world, it's fast, and you're not going to run out of resources. We'll dig into more of that, but that's how we work together. So, the agenda today is really in two parts. Ganesh is going to
deal with something where Snowflake has become a true domain expert, which is marketing data: how you deal with the data strategy. In my world we always start with the words: don't tell me about the project, let's start with the data architecture. What kind of data do we have? How are we going to access it? Where is it stored? How does it relate? And so we can actually dig into some of the experience and technology that Snowflake has developed. Then I'm going to take some of our joint customer stories and talk about how we deal with both descriptive and predictive kinds of analytics, modeling, and deployments. Over to you, Ganesh. Perfect, thanks Grover. So yes: setting the foundation with the data strategy. To set the context here, as Grover said, in order to really get the value from your data science investments
you really need to bring data to bear in a seamless way, where your high-investment resources, your talented data scientists, aren't spending time munging data (cleaning it, bringing it together); instead, they have the data available. But oftentimes that understanding isn't there in the organization, and this is a bit of a humorous take on it: in many organizations it comes down from the top, from the CEO or CMO perhaps, saying, you know, this is the year, everyone: we're going to be data-driven. They look at some reporting from Google or Econsultancy or McKinsey, the Bains and BCGs of the world, and they see that being data-driven leads to business growth, and that's great, and they feel great: we nailed it, we're going to be data-driven now, we're going to really use our full wealth of data to drive our business. How do you think that makes data scientists and marketers feel? I know, in speaking to our customers, when they hear this they feel like they've just been dumped on by a pile of bricks. No one is sitting around at home trying not to be data-driven; in fact, most marketers have been trying to be data-driven for the past 20 years, and certainly we've been investing in new and more effective ways to leverage the data. The challenge ultimately isn't the need to be data-driven; it's the fact that we have fragmented data sources. Just here on the right-hand side, Salesforce reports that over the past two years they saw a 50% jump in the number of data sources the average team is using, and fewer than half of teams feel they have a unified view of all their customer data. So when it comes to actually leveraging the data, having that unified view is almost the first step to making that
possible. A bit about why this is such a challenge; I'm sure many folks have seen this before. When it comes to capturing customer data, the martech explosion has created a vast macro world: all your customer data, every touch point, and ever more complex, nuanced journeys are being stored in different parts of the customer journey across these SaaS applications, let alone your internal data, and that data is only growing exponentially. So ultimately, if this data is fragmented, it creates a big challenge for organizations to get value from it. When it comes to doing really targeted engagements, it's hard to do that if you can't enrich your customer profiles and create those deep segments, whether you're doing one-to-one on a data science basis or something more preliminary. When it comes to optimizing those engagements throughout the customer journey to maximize lifetime value, that becomes a gap. And when it comes to attribution and measurement (Grover will speak in a little bit about some different models to really measure ROI), it really matters what data you're bringing into the model, so that you're capturing all the different touch points across channels. A lot of this just isn't possible unless the data is in one place, and if you're not breaking down these data silos, a lot of investments don't realize their full potential. So I'll quickly cover three steps to bringing your data together. It's pretty basic: integrate your data, store it in one place, and then make it available for sharing across all your
use cases. I'll fly through some of this, because I know a lot of the richness is going to come from bringing it to life in the customer stories in a minute. So let's first talk about integrating data. This is a view of some of the data that might exist in your ecosystem, that you already have but maybe aren't leveraging to its fullest potential. The first is first-party data: data that exists in your internal systems, your virtual four walls, so to say. The second is data that you're bringing in from, and sharing out with, your second-party partners: strategic partners, folks you work with that have data unique to your joint relationships with customers. A common example here might be brands working with their retail distribution partners. And finally we talk about third-party data: data you can procure from the ecosystem to enrich your customer profiles and do a better job of creating customer segments and driving greater engagement. At each level there are different data silos that need to be broken down, and it can be very onerous and challenging to try to tackle it all at once, but we recommend thinking about these three levels, and about what you need to win, in an incremental way, in order to ultimately drive the value through your modeling. Briefly, how do you go about doing that? One, when it comes to integrating data, we work with a ton of ETL and integration partners, and you can see some of them here. Things to consider before working with a partner: think about coverage, making sure that you have all of the relevant data sources captured, so that you're not going to have to build things on a one-off basis, or go to engineering, or roll up your sleeves and do it yourself. Two, simplicity: being able to manage the integrations on your own, so that in the future, when you bring in new data, you can use the tools you've procured to make that possible. And three, if you already have pre-approved vendors, that's going to dramatically reduce the time to value, so you can get up and running pretty fast.
One area I wanted to quickly highlight, which is the opposite of moving data to integrate it, is actually sharing data, and the Snowflake Data Marketplace is a great option for third-party data. Essentially, thanks to all of the data being in the cloud, once data is on Snowflake you can access it without ever having to set up a data pipeline. This dramatically changes the game when it comes to sharing data without moving data, which means reduced risk for all parties involved: the data lives where it needs to live, and there are no copies of data. It is also much easier to maintain that data: if an API or an ETL pipeline breaks, in this world you don't need to manage that; in fact, once the data is refreshed on the partner's side, it becomes instantly available in your ecosystem. So I think this is a great opportunity for third-party data. Now let's talk about storing it in a central place. I won't get into all the detail, but I think it's important for folks to walk away understanding that not all data platforms are built the same. It's incredibly important to have one single platform where all of your workloads can run concurrently, and where you can scale to make sure that a data science workload isn't going to conflict with an analytics use case, so that at any single moment in time all of your uses of that data are possible. There are a lot of buzzwords in the ecosystem right now, and a lot of people are using the same marketing lingo; ultimately, my advice is to take a screenshot of this and test it for yourself, and make sure that you're able to run all of the use cases you need in a cost-effective way, in a way that's easy to manage. And then the third piece, just to quickly touch on, is sharing data.
We talked a little bit about this earlier: in order to bring the data into your analytics models, whether it's ad hoc analysis or more programmatic data science, you're going to need to make that data available. Again, the antiquated way of thinking about this would be actually copying files, maybe flat-file uploads into a system, and we know the pain of that: it's costly to maintain and it's error-prone. Working with pre-built Snowflake partners like Zepl dramatically changes the game here: the data is just instantly available, a few clicks through the Partner Connect ecosystem, and that data is always fresh, it's live, and you'll be up and running in minutes. The last piece on data sharing: whether it's one-to-one use cases, one-to-many, or many-to-many, I think it's worth it for this audience to investigate the use cases you need to activate your data, especially if you're thinking about marketing use cases, and then again test and make sure that the data platform you're investing in can support all of this in an effective way. I think there's a lot of confusion in the marketplace between what data platforms can do and what they market that they can do, so again, if I leave you with one thing, it's: think about your overall needs, and please go ahead and get under the hood and test it. I'll leave you with just these three thoughts. As we discussed, data in silos is never going to be useful for your organization; you're never going to be able to run some of these fancy models that Grover is about to jump into. You really need to take these three steps: integrate the data, store it in a single place, and enable access for all of your different workloads at the same time. Grover, with that I'm going to hand it off to you, and I'm excited to see some real-life stories from our joint customers. Okay, and I'll share my screen here; hopefully it's going to let me do that.
There we go, and share... right here. Look at me, I can use PowerPoint. Okay, you guys can see my screen? Okay, and let me make sure I'm in the right place here. We're good. Okay, so let's dig into some of the real-life stories. On the first day of a statistics class (and I'm a mathematician, not a statistician, but statistics does some really useful stuff), they told you two things. They told you statistics is about counting, and that counting is hard; that's one thing. The other thing they told you is that statistics does two things: it analyzes what happened in the past, and then, using completely different methods, it has some predictive ability about the future. And that's what everybody wants in marketing. So what we want is to deal with descriptive first: get my marketing data together, tell me what happened, tell me what worked in general. Then we'll put a modern data science spin on it: let's train some models to know what to look for. Then let's move to whether we can look at live data and, as Ganesh said, do something as real-time as changing customer recommendations based on something the customer did five minutes ago or two minutes ago, making sure it's customer-
personalized. Can we direct our ad money to the right ad campaign? So let's dig into this. There are two big environments, and they're completely, totally different. There's the B2B sales and marketing world, which is driven by customer relationship management systems (Salesforce, Oracle CRM, SugarCRM; you've got some kind of big CRM out there) and marketing automation systems, of which there are a huge number. Without naming names, there's only one out there that deals well with databases much greater than about a million records; most of them don't. Why? Because they're built for B2B scale. You've got outbound calls, emails, trade shows (we wish we could have trade shows), and a sales cycle, and this starts to affect things greatly. Let me slow down for just a second and tell you one huge difference between the B2B world and the B2C world for marketing. In the B2B world, something I did two years ago could be driving a sale that's about to happen right now. Now, technically, in the B2C world, advertising from two years ago can have influence, but truly, in the B2C world most actual sales cycles are measured in hours or minutes, so the scale of the time domain is very different between sales and marketing in B2B and sales and marketing in B2C. In the B2B world we have a lot of difficulty with attribution, but we still try. We also have a complexity here: although you have some of this with parent-child products in B2C, in B2B we're primarily selling to organizations, so we need to track what's happening with all the people in a company, not just the one person we're talking to right now. In B2C, as Ganesh's picture showed very well, we have distribution channels, we have to deal with things that are not in our control, we have products displayed in multiple storefronts, both physical and virtual, and we can be tracking hundreds of millions of people. We have a lot of complexities with search engines, repeat customers, paid search, direct email. The sales cycle is often a single visit, which means that single visit is use it or lose it: when someone visits your online store, you do not want to depend on them coming back; you want to sell now. We do have difficulty with attribution, and we're usually trying to resolve a person, not an organization. This requires different data collection and different models; the machine learning and data science are only vaguely related between B2B and B2C. I've never seen the same exact
model work in both. Descriptive, this is ROMI territory: it's a rear-view mirror, basically looking at what did work in the past. Predictive is what people want to get to. They want to say: I would like to use machine learning to change my advertising, or at least tell me what to change in my advertising; I want to move to where the puck is going to be, not where it has been. So we want to do both of these pretty quickly, and we want to make sure we do descriptive. You do have to do descriptive first; you can't just say, give me some of that machine learning stuff and hook it up to my ad campaign manager. You have to take a look at what did work in your world, and even two companies with almost identical products could have completely different descriptive histories depending on what they did. We're going to talk about ROMI. This is
return on marketing investment. It's about 18 years old, so it's really a millennial thing; its first real use was around 2002, in Guy Powell's book. It's a pretty simple formula. What it's not is ROI in the traditional sense. ROI, return on investment, is a capex model: it compares investment (capex) to yield, and you can always retain some capex assets, unless the house burns down. ROMI is an opex model, operating expense, and the investment is 100% at risk: you could spend a billion dollars on marketing and get nothing; it's gone. So ROMI is really measuring something different than capex ROI, but it's still pretty darn useful. The formula is really simple: take my change in revenue, times my margin, subtract out what I spent on the program, and divide by the program spend, and it'll tell me what my ROMI is. So for example, let's suppose I got a million dollars in revenue and I spent a hundred thousand dollars on marketing. Well, I'll take a million, subtract a hundred thousand, that's nine hundred thousand; divide it by a hundred thousand, and that gives me a ROMI of nine. Companies are looking for ROMIs in the small double digits; almost nobody has ROMIs in the large double digits. There are two kinds of ROMI, and you'll
hear a lot of people touting both kinds. Fuzzy (and no offense meant by this; it's not a negative term), long-term ROMI is what's called brand capital. Brand capital is when someone does a brand-awareness or brand-positivity study and tries to measure the quote-unquote benefit of your entire marketing. The projects I am asked to work on are sharp ROMI: tell me where we spent money and whether it's working, track the inputs, tell me if we got sales from it, and show me that in a business-cycle sense. Don't tell me about five years; tell me about the money I spent in 2019 and the revenue I'm getting in 2020. That's considered sharp ROMI. You do have to separate out new business from renewal. A lot of mature businesses are only really interested in renewal business, because they don't get that many new customers, so they dig down into: show me how marketing helped drive renewals and upsells. Most tech companies are interested in new business; they want to know: show me what activities bring me new customers. So depending on the maturity of your industry, you may care more about upsells, renewals, and customer retention, or you may care about new sales to new customers. Online stores, by the way, will tell you they make most of their money off repeat, returning customers, and that's where most of the love goes; you've probably noticed that online stores don't work as hard to get your business. They do love new customers, but what they really, really want is to make customers satisfied.
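As an aside, the ROMI arithmetic Grover walked through a moment ago is simple enough to sketch in code. This is a minimal illustration of the formula as he stated it (change in revenue times margin, minus program spend, divided by program spend); the function name and figures are just for demonstration.

```python
def romi(revenue_change, margin, program_spend):
    """Return on marketing investment:
    (change in revenue * margin - program spend) / program spend."""
    return (revenue_change * margin - program_spend) / program_spend

# Grover's worked example treats margin as 1.0:
# $1,000,000 in revenue against $100,000 of marketing spend.
print(romi(1_000_000, 1.0, 100_000))  # → 9.0

# With a 50% margin, the same spend looks much less impressive.
print(romi(1_000_000, 0.5, 100_000))  # → 4.0
```

Note how margin changes the picture: the same revenue and spend give a ROMI of nine or four depending on what you keep of each dollar sold.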
High-tech companies want new customers. Let's dig into one of the things we use data science for. This is, by the way, typical of the kind of customers that Snowflake and Zepl have in common: if they're doing huge deals, they want to go in and look at exactly what happened in a particular deal. So, possibly for the first time in your life (let's slow down for just a minute), you're looking at what we call single stream, and you'll hear that term in data science a lot. Some of the pictures that Ganesh showed are how you get to single stream: you collect all this amorphous data, you get it into a database that can scale, like Snowflake (exactly why we're talking with you about this), and then you isolate it down to: show me all the events, across all time, for one customer. In this
particular case, since this is a business-to-business sale, it is a relatively large opportunity that moved from historical down to a closed-won condition down there at the bottom. So old time is at the top, new time is at the bottom, and here you can see something you can't tell anywhere else. You know how sales and marketing often disagree about whether a deal came from sales or marketing? I'm sure nobody on this call has ever heard that argument, but just pretend there is such an argument. This clearly shows this particular deal was first initiated by marketing, and over on the left-hand side, where we've isolated the marketing events, you can see something you never see on the right-hand side. This is one of the purposes of data visualization. Anybody here who hasn't taken Edward Tufte's data visualization course, please go take it. The punch line, after two days of watching Tufte do magic, is that he'll give you the truth: the data have to tell you how they should be presented. You're looking at something you may have never seen; it's called a champagne chart, and its purpose is to show that marketing events often create pack behavior. You'll see five different people did the same thing on the same day to start this. Scan down to right where the deal was created, down there where it shifts from red to purple: again, 30 people from the same company showed up at the same webinar, right after the deal was turned from a prospect into an opportunity. Then you can see marketing goes quiet, like it should, while sales is doing its business. Then, without marketing doing anything, end users come in and participate in the content, and it races down to the close, where the sales activities flurry and the deal closes. This is what a multi-million-dollar deal looks like. Now, why don't I explain all of that? Because mostly we're not looking at a single customer, we're looking across customers, but we do want the ability to visualize a single customer; that is a unique way to do forensics on good and bad deals. Here's the opposite kind of deal, where you can see the first activity was a sales call, and
then very, very early it went from red to purple: the salesperson said almost immediately, we have an opportunity here, so we shifted from prospecting to selling. You can see most of the behavior here, since it's a sales-driven deal, is over on the right-hand side, but you can see those stripes again, the ones you never see on the right-hand side but do see on the left, where 50 people do the same thing on the same day because they were stimulated by something that happened on the sales side to all go download the same white paper, all go see the same webinar, all go pull the same document, all go visit the same website. So this kind of visibility tells you you've validated it. And by the way, if you can't see these data in your system, then you don't even know whether you have the event data. But you do need to single-stream it, sales and marketing together. That's the T in ETL: you need to transform the data and tag it, and say this was sales data, this was marketing data, this event happened here. The other thing you want to have, by the way (and this is a religion with me), is timestamps. I've got several customers that aggregate their data: they'll take the events and aggregate them by day. Don't do that; leave the timestamps on there. If I don't have a timestamp, I can't tell whether a phone call caused a web visit or a web visit caused a phone call. I've got to have the timestamp down to the second. I don't need the milliseconds, though frankly, in e-commerce there are times when we can only debug problems with milliseconds. So do not truncate datetimes; it's worth the database storage to hold it. You need it. Okay, what else do we do? We use pattern fitting, a type of
machine learning and artificial intelligence that goes in and finds patterns. You can see the curve at the bottom is a second-order polynomial curve fit; this particular one comes from a particular type of curve fit, a Holt analysis. That brings up another topic: most of the algorithms we're looking at came from the 40s, 50s, and 60s. Very little in data science, other than topology, is new; the math is just now applied with massive databases, fast networks, cloud containers, smart notebooks, and good programming languages. What this picture is showing is that the deal was won (these were all won deals). Down in the lower corner, that's day zero, and everything before that is negative time, so 360 days back people were doing evaluations. And you can see that this is a pattern of how people consume product evaluations for a SaaS product before they buy it: a fair number of people are out at more than a year, so some people evaluate a product more than a year before they buy it. This is part of the B2B conundrum: which marketing interactions actually led to the sale? It's a lot, by the way. In this one, little deals are at the bottom and really big deals are at the top, and you can see that even for some really big deals, people are up there evaluating a product just 60 days before they buy it, so there are quick sales even in this world. White papers, in general (don't take this personally), are awful at being associated with sales. They're throwaway consumables; they very seldom drive a sale, and very seldom become the attribution king for a sales cycle. And by the way, I could have shown
you 50 models, and that's one thing we do. When somebody asks, what's the best algorithm? it's: give me your data, let me cycle through 50 of them, and I'll tell you which is the best fit and the best predictor. Can I guess in advance? Sure, for certain kinds of data we can usually say this model or that model tends to work best. There are about 200 popular artificial intelligence pattern-fitting models; there are lots of different kinds, and we'll talk about that in a minute.
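The cycle-through-the-candidates idea can be sketched with scikit-learn. This is a toy version under stated assumptions: the data is synthetic, and the three candidate models stand in for the fifty Grover mentions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in data; in practice these rows would come out of the warehouse.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Cross-validate every candidate and keep the best fit.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

Which candidate wins depends entirely on the data, which is exactly the point: fit them all, then pick.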
But first of all, you have to get the data, and Ganesh already talked about a lot of that, so I'm not going to dig in too much. You have to get access to all the systems. You see that bullet there, getting access to all the systems? That's weeks; if you get that overnight, you're the CEO. It takes a lot of time to get all the permissions, find out where all the systems are, and go from there.
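Once you do have access, the single-stream shape Grover described earlier (sales and marketing events tagged by source and sorted by full timestamp, never truncated to the day) might look like this in pandas. All of the columns, contacts, and events here are illustrative, not from any real schema.

```python
import pandas as pd

# Hypothetical extracts from a CRM and a marketing automation system.
sales = pd.DataFrame({
    "contact": ["alice"],
    "event": ["phone_call"],
    "ts": pd.to_datetime(["2020-03-02 09:14:50"]),
})
marketing = pd.DataFrame({
    "contact": ["alice", "bob"],
    "event": ["web_visit", "whitepaper_download"],
    "ts": pd.to_datetime(["2020-03-02 09:15:04", "2020-03-02 09:15:11"]),
})

# The T in ETL: tag each row with its source system, then union and sort
# by the full timestamp so you can still tell what caused what.
sales["source"] = "sales"
marketing["source"] = "marketing"
single_stream = (pd.concat([sales, marketing])
                   .sort_values("ts")
                   .reset_index(drop=True))
print(single_stream)
```

Had the timestamps been truncated to dates, the phone call and the web visit fourteen seconds later would be indistinguishable, which is exactly the point about not aggregating by day.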
Okay. After you've gotten the data, at some point you want to get to what I showed you; I showed you those champagne charts to convince you that you want time-based data that can get you all the way down to a person, an organization, and a sale. On the right: use a proper database. Do not store your stuff in flat files; don't put it on a laptop. Plan for it to scale, so even if you're just kicking the tires, start with a database, and please start with a good one, a cloud one. We really like Snowflake, could be because it's fast and it's cloud: if we want to transfer a project from Europe to the United States for analytics, the data is already reachable, and it's not if it's in your data center (that needs a cloud expert). Your data-center database may be 50 to 100 times slower if you try to get to it across an ocean. Then you subset and model the data. You want to do those models I showed you, and I realize you probably want a tutorial on how to do that; ask, and we'll do one, actually going through live data. It's pretty geeky, but you want to identify what success looks like, and you want to identify the population regardless of success. By the way, don't call them losers. You want winners at the top, and then you want everybody; don't try to isolate the people who didn't buy, just compare who won with everybody, including the won ones. It's a better model. You benchmark it: you want to show the differences in customer demographics between buyers and non-buyers, inbound attribution, campaigns, in-cycle behaviors, all touch points by time (please keep the timestamp, not just the date). You want to find the patterns and train
the models you want to go from broad discovery algorithms to deep learning and then you want to find the best fit and then you’re all done and you prevent your metrics for phase one which is those charts i showed you are phase one romney charts heres what we found heres whats working we have trained models now now models and probability matching um one of the things you do like when.
you train a model and i i pick pictures because people can relate with it well give an ai system 100 million images of cat its very important to give it images of not cat make sure you put some stuffed toy cats in the not cat otherwise your model will have been trained to train cats how does this relate to business data point your data at things that are these deals one and then point your data at heres.
the general population not necessarily losers so make sure your data knows the difference between these deals one and these deals were just visitors generic visitors once the model is trained depending on your data and depending on how good the model is itll take a picture of a cat and itll give you a probability so what you want when you’re all done is a model that can tell you heres my prediction heres the probability that its right at the bottom too by the way you can see.
the same two pictures identical except one of them has lost some information therefore the probability is cut in one third that my model can guarantee its a cat if i cant see the head i cant tell its a cat how does that relate to marketing data a lot of times in business to business if you completely switch the buyer from the people you talk to in the beginning to the people who are dealing with the deal completely mid in the middle like the like the key buyer leaves you’re just.
like a headless cat you cant tell what this deal is going to do because the buyer changed i have no history on what this persons going to do or has consumed makes a big difference and again as i put a note on the bottom think about this in terms of stuff that came from the period between 1935 and 1965 when most of these algorithms were invented a lot of the pattern matching comes from uh from information theory and claude dr claude shannons uh pepper and information theory in 1948.
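That "deals won versus general population" setup can be sketched in a few lines. This is a hedged illustration, not anyone's production pipeline: the two features (touchpoint count, days in cycle) and every number are invented stand-ins.

```python
# Hypothetical sketch: label closed-won deals as 1 and the general
# population of visitors as 0, then train a model whose answer is a
# probability, not just a yes/no. Features and values are made up.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Fake feature matrix: [touchpoints, days_in_cycle], illustration only.
winners = rng.normal(loc=[12, 90], scale=[3, 20], size=(200, 2))      # deals won
population = rng.normal(loc=[4, 40], scale=[3, 20], size=(2000, 2))   # everyone else

X = np.vstack([winners, population])
y = np.array([1] * len(winners) + [0] * len(population))

model = LogisticRegression().fit(X, y)

# The model answers with a probability: "here's my prediction and
# here's the probability that it's right."
p = model.predict_proba([[11, 85]])[0, 1]
print(f"probability this lead looks like a winner: {p:.2f}")
```

The key design point is training on winners versus *everybody*, not winners versus losers, so the probability reflects how a lead compares with the whole population.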
Most of the models that curve-fit, by the way, came from people like John von Neumann, who basically turned those transforms into something usable. If you have enough data and discrete information, your AI will look at that top set of data and say, "I recognize that" — that's called root mean square; anybody who's ever had an oscilloscope recognizes that curve. In the real world, though, we're missing things. How do we end up missing things? Your customers are accessing your website on their phones, and your Marketo didn't cookie their phones, so you can't see all the instances. So what you'd like to have looks like the top curve, but what you really have looks like the middle curve. Why does that matter? Because the polynomial that fits the messier data is much more complex and computationally expensive. Then you get down to that bottom data. The bottom data is actually just one of those curves coming down from the descending root mean square — it's about to pop back up — but if you look at it in isolation you'll say, "this is going to zero, nothing's happening here," when in fact what it's about to do is rebound. What you're looking for is a probability cone, like that little green cone in the middle. Most machine learning models create one of those in their forecasting — if I screenshotted 100 Zepl notebooks, you'd see that little cone 100 times, because that's what curve fitting and then prediction look like.
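The "probability cone" idea can be shown in miniature: fit a curve to noisy data, then widen the forecast band by the residual spread. This is only a sketch of the concept — real forecasting tools (ARIMA-style models and the like) compute proper intervals for you, and extrapolating a polynomial past its data is exactly the kind of thing to be careful with.

```python
# Minimal sketch of a forecast "cone": curve-fit noisy data, then
# bracket the extrapolation with +/- 2 standard deviations of the
# fit residuals. Illustrative only -- not a production forecaster.
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 100)
y = np.sin(t) + rng.normal(scale=0.1, size=t.size)   # noisy signal

coeffs = np.polyfit(t, y, deg=5)          # the curve fit
fit = np.polyval(coeffs, t)
sigma = np.std(y - fit)                   # residual spread

t_future = np.linspace(10, 12, 20)        # forecast horizon
forecast = np.polyval(coeffs, t_future)
lower = forecast - 2 * sigma              # bottom edge of the cone
upper = forecast + 2 * sigma              # top edge of the cone
```

Plot `lower` and `upper` around `forecast` and you get the little green cone: the prediction plus a statement of how sure the model is.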
So get your data into a proper database. Sure, some of it will always be messy — some of it, like the text-stream or audio data you'd use for natural language processing, will just stay unstructured — but where you can, put your data in a proper database. Remember that most data science algorithms are working on what I call square data anyway: they're looking at basically a table, and they're mostly looking at it in memory. You're going to need to move to a proper database anyway, because you want to do this with live data. You want to hook your systems up so it's not a one-time extraction — you've got live data streaming into a database, and then you can hook it into the system. Here's one of the challenges Ganesh brought up: don't store your stuff on roving laptops. Your most expensive, valuable data can end up everywhere — it can be stolen, it can be walked out the door by an employee — so make sure you keep it secured.
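The "square data" point is easy to demonstrate. In this hedged sketch, SQLite stands in for a cloud warehouse like Snowflake, and the table and column names are invented — the idea is just that one tidy table in a real database is what most algorithms want.

```python
# Sketch: "square data" living in a real database rather than a one-off
# CSV extract. SQLite stands in for a cloud warehouse; the table and
# columns are hypothetical.
import sqlite3
import pandas as pd

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE touchpoints (
        lead_id  TEXT,
        channel  TEXT,
        event_ts TEXT   -- keep the timestamp, not just the date
    );
    INSERT INTO touchpoints VALUES
        ('L1', 'email', '2024-03-01T09:15:00'),
        ('L1', 'web',   '2024-03-02T14:03:00'),
        ('L2', 'email', '2024-03-01T11:47:00');
""")

# Most data science algorithms want exactly this: one table in memory.
df = pd.read_sql("SELECT * FROM touchpoints", conn)
print(df.shape)
```

Because the query runs against a live database, rerunning the notebook tomorrow picks up tomorrow's rows — no re-extraction step.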
That, by the way, is one of the reasons you marry notebooks like Zepl with Snowflake: you can have trusted people who hold what we call the secrets — the access to the data files and the credentials — and they can share the output with people who can't even see the database. That's very useful. We use the notebooks a little bit as the presentation layer for the results of data science, for end users who may be business executives: there can be thousands of people who see the notebook, but only two people know the secrets to log in. That's why you want this in a proper place, with live data.
You need a picture like this. On the left-hand side, all your sources: my CRM, my advertising systems, my storefront, my ERP and inventory systems, marketing automation, point of sale. I want all of that going into an input engine — you usually need something like a Zepl transform engine, using notebooks to transform the data. By the way, Snowflake has a lot of really cool connectors that do a lot of that ETL work for us, but sometimes we need a little nudge, a little bit of logic, to turn it into the kind of data that's in that blue database in the middle. There it's a single stream. Then you want a modeling engine that's constantly retraining, taking yesterday's inputs and turning them into today's predictions: forecasting, product recommendations. You can feed this back into your marketing campaigns — spend more on these ads, spend less on those other ads. It can push to your online store — here are the most popular products — which goes into the product recommendation engine. It goes to finance — here's the forecast based on what's going on. You want a universal engine, and you do want it in a live database, just like I'm showing, with live engines that can do it. And one of the nice things about those live engines is that you pay for them by the drink.
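That "little bit of nudge" — notebook logic that reshapes two source extracts into the one schema stored in the central database — might look like this. Every column name here is hypothetical; the point is the rename-and-union pattern, not any particular system's fields.

```python
# Hedged sketch of transform-notebook logic: map a CRM extract and a
# storefront extract onto one shared schema before loading the central
# database. All source and target column names are invented.
import pandas as pd

crm = pd.DataFrame({
    "AccountId": ["A1"], "Amount": [5000.0], "ClosedAt": ["2024-04-01T10:00:00"],
})
store = pd.DataFrame({
    "customer": ["A2"], "order_total": [129.0], "ordered_at": ["2024-04-02T16:30:00"],
})

# One target schema, one single stream.
unified = pd.concat([
    crm.rename(columns={"AccountId": "account_id", "Amount": "revenue",
                        "ClosedAt": "event_ts"}),
    store.rename(columns={"customer": "account_id", "order_total": "revenue",
                          "ordered_at": "event_ts"}),
], ignore_index=True)

print(unified.columns.tolist())
```

Connectors handle the heavy lifting of extraction; this kind of small mapping step is the "logic" that makes disparate sources land in one blue database with one schema.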
They only run when they need to, so you're not paying for containers and resources you're not using — you're only paying for what you use at that moment. That's true, by the way, with both Snowflake and Zepl: you're paying on consumption, as opposed to paying for static infrastructure. As you scale up, you want lots of pieces. Do not build one huge application — you probably want 50 notebooks. It's nice to have a single database, and it's nice to have mini modules that you only wake up when they're needed and that only run when they're needed. That's one nice thing about the Zepl notebook model: you only pay when they're running, and you don't pay for idle resources. And if you need to modify a module, you modify it in isolation from the rest of the system, so you don't have to break other stuff. You want it to scale — you want the ability to scale to huge size — but you only want to pay for what you're using, and you get that by breaking things up into modules, each module sized right for the job it does.
And if you can, stay in the notebook world. Some people like to leave the notebook world and say, "I built the code in the notebook — pull it out of there, put it in the data center." What do you have then? Models that never get retrained. You took your models and deployed them, the data changed, and your models aren't changing with it. Stay in a data science notebook where you can retrain the models in production — retrain them daily, retrain them weekly — so you're always deploying fresh information instead of dealing with static models. It's a great combination.
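The "retrain daily, deploy fresh" idea reduces to a small scheduled job that re-fits the same model on whatever data is current. This is a sketch under stated assumptions: `load_training_data` is a placeholder for your live warehouse query, and the model and file path are arbitrary.

```python
# Sketch of a retraining job: same model family, fresh data every run,
# so the deployed model never drifts away from reality.
# load_training_data() is a placeholder for a live database query.
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression


def load_training_data():
    # Placeholder: in real life this queries the live database.
    rng = np.random.default_rng(2)
    X = rng.normal(size=(500, 3))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    return X, y


def retrain(path="model.pkl"):
    X, y = load_training_data()
    model = LogisticRegression().fit(X, y)   # re-fit on today's data
    with open(path, "wb") as f:
        pickle.dump(model, f)                # persist the fresh model
    return model


model = retrain()
```

Schedule `retrain()` daily or weekly from the notebook platform and the production model is always trained on current data — the opposite of a one-time code export to the data center.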
The cloud database plus the notebook model is actually working — people are doing this today, and it works really well. So, Ganesh, how are you doing? "I'm just zoned in, Grover — thanks for sharing some of those use cases." Okay, Elise, do we have any questions for me or Ganesh?
"We do have a couple. Somebody said: you mentioned data science notebooks — we use both Jupyter and Zeppelin notebooks, and we confess that yes, many of them live on individual laptops rather than in a central location. How would I use Zepl or Snowflake to standardize, and get a handle on security around these notebooks?"
Well, I'll tackle part of that. Part of the security answer is Snowflake itself. We actually recommend that you take the secrets out of those Jupyter notebooks — if you've got a Jupyter notebook, probably somewhere on your local machine there are the secrets for accessing the database. We recommend that you put that into a connector, and that you put the secrets into a trusted database, so they're under administrative control and not sitting around on someone's laptop. The other thing is that we can just import Jupyter or Zeppelin notebooks — it works fine; the history of notebooks is a common history. And beyond that, there are a lot of features Zepl does have that productionize this.
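Even before a full connector setup, the first step of "get the secrets out of the notebook" can be as simple as reading credentials from the environment (or a secrets manager) instead of hardcoding them next to the analysis. The variable names below are made up.

```python
# Sketch: credentials come from the environment (provisioned by an
# admin or a secrets manager), never hardcoded in the notebook.
# The WAREHOUSE_* names are hypothetical.
import os


def get_connection_params():
    # Fails loudly if the secret isn't provisioned, rather than
    # silently falling back to a credential pasted into the notebook.
    return {
        "account":  os.environ["WAREHOUSE_ACCOUNT"],
        "user":     os.environ["WAREHOUSE_USER"],
        "password": os.environ["WAREHOUSE_PASSWORD"],
    }
```

The notebook that thousands of people can view then contains no secret at all — only the two trusted administrators control the environment where it runs.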
I guess "productionize" is a verb I made up, but the point is: you want to move away from stuff that's on laptops and toward stuff that's in the cloud. I'll give you an example from my world. Three years ago I was paying three thousand dollars every time I upgraded my laptop. Now I'm able to work for two weeks in a row on a $300 laptop. Why? Because the tools in the cloud are faster, better, more secure, and more reliable — it's just better. So get away from the $3,000 laptops and the big disk at home or in the apartment, and use cloud notebooks — Zepl notebooks. It works better. Hope that helps.
"Cool, thank you. How do you access data from Marketo and Salesforce?"
They're very different. Ganesh, you guys actually have connectors for a lot of these systems, don't you? "Yeah — speaking about both of those specifically, there are a lot of different ways you can bring Marketo data into Snowflake. It's a very common structured source, and sometimes you get unstructured JSON as well from your general marketing channels, even beyond Marketo; there are ETL solutions that are pretty effective at bringing that data into Snowflake. Speaking about Salesforce particularly, Snowflake and Salesforce have had a great partnership, especially with the last round of funding, and there are connectors available in beta today — and going forward there's going to be some great data sharing across those two partners. In general, I'd highly recommend looking back at the slide on how to choose the right ETL vendor: make sure you've got the right connectors, that they're easy to use and self-serve, point-and-click rather than custom-code-based, so you have flexibility going forward."
And it does save a lot of time: people like me used to spend 80% of our time doing that, and people like Snowflake have built technologies so that the time I spend on it is now down to a couple percent instead of 80. The other thing is motivation. In the case of moving Salesforce data into Snowflake, it's just access — you get better access. You move Marketo data into Snowflake because Marketo doesn't keep events past 90 days anymore, and that's key: in B2B sales it can take you two years to close a deal, and Marketo is keeping the events for 90 days. That's one of the motivations for keeping key events and putting them into a database.
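Archiving those expiring events is just an append-on-every-pull pattern. In this hedged sketch, SQLite stands in for the warehouse and `fetch_recent_events` is a placeholder for whatever pulls the last 90 days of activity from the marketing system.

```python
# Sketch: append each periodic pull of recent events into your own
# table so the full multi-year history survives the source's 90-day
# retention window. SQLite stands in for the warehouse here, and
# fetch_recent_events() is a hypothetical placeholder.
import sqlite3


def fetch_recent_events():
    # Placeholder for the real API pull of the last 90 days of events.
    return [("L1", "email_open", "2024-05-01T08:00:00")]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS events (lead_id TEXT, type TEXT, event_ts TEXT)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?)", fetch_recent_events())
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Run that on a schedule well inside the retention window and a two-year deal cycle keeps its complete touchpoint history, timestamps and all.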
"Great, thank you. Grover, this question's for you: if we don't have much data science or engineering experience, what can we do? Where do we start? What kind of team do we need to build out?"
This is a tough question. Companies usually can't start by hiring data scientists. You need to go back to the profession Ganesh talked about: bring in some consultants who know what they're doing and can prove the benefit to you, because for a company to go from zero to data science with internal people is usually really hard. What you need to do is say, "I've got a project." Don't set huge goals — pick a project and make it a proof of concept that you can get some benefit. That project will have to do a fair amount of the work we just talked about, but by constraining it — literally by leaving some of the systems out — you shrink it down to something that can succeed. Then, once you've learned where the value is, you know where to bring in data scientists. And I will caution you about data scientists: most data scientists with any experience are domain experts — they know a particular kind of data really well. I'm okay with epidemiological data, but business data — all those systems Ganesh showed — I know every one of them, and you don't get that from someone outside your domain. The skills are common and the models are actually very common, but the domain experience isn't. So I would say: start with a project, find a small data science team who can do it, and go from there. There is a model for bringing this all in-house — after the notebooks come in and you know what's going to work, then you want data scientists. And by the way, good business analysts who've been using tools like SAS for years are really data scientists in the making: the jump from business analyst or BI person to data scientist is maybe 100 hours of training — it's not that bad. If you email me, I can point you to how to get started, depending on your domain.
"And Grover, is that also something Zepl can help with? If someone identifies a project, where do they start — who are the consultants they should bring in?" We actually have consulting partners, so yes, we can help — we can point you to them. For certain customers, usually slightly larger ones, Snowflake and Zepl are right now working jointly on consulting projects. But in a literal sense, getting connections in the data science world is not that hard, so contact us and we'll try to connect you with people who can help.
"Great — I thought that was a great question and a great response, Grover. I'd add to that. One thing we've seen on the Snowflake side is a bit of a maturity model for getting started in this space — I know the person asked specifically about data science and data engineering resources. The first step we often see is just getting a 360 view and doing some of the descriptive analytics Grover was speaking to: understanding where your quick wins are, where you want to invest, taking a step back and understanding the landscape of the potential. Then there are roughly three options from there — these are the most common; there may be others. One is getting a better understanding of ROI attribution: Grover spoke about ROMI and a couple of other models, and you can start with descriptive analytics and then go deeper with data science on attribution pretty quickly. Many people are already doing attribution analytics, so it's not like you're going from zero to one — you're bringing more data from your 360 view to make that ROI analysis more productive. Then you can get into campaign optimization — better sub-segments of your customers, again starting with descriptive analytics. And finally you can get to deep personalization, doing real-time engagement across all of your channels. Grover said this pretty well up front: starting with data science defined as multi-channel real-time personalization might be a recipe for disaster, because you're basically out of the starting gate at 100 miles an hour without even a seat belt on. So take a step back, understand what data you have — your customer 360 view — and then start figuring out the right investments to make."
"Great. One last one, Grover: what kinds of machine learning models work best for campaign analysis?" Let me give you the answer I usually give when we talk about mass-testing A/B email subject lines.
I've deployed probably a quarter million email subject-line tests in the last five years, and people say, "You do this a lot — which subject do you think is going to win?" I don't know; that's why we test them. And really, the same answer is the truth in data science: get the data into a place where you can iterate across it, and then study it with as many models as you can stand. On big projects I usually plan on trying 50 different kinds of models until we find something with what we call an R value that says this model best fits these data — and it varies. Now, I'll give you a little bit of hope: once a model is proven to fit your data, it usually just needs to be retrained, not replaced.
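That "try as many models as you can stand, keep the best fit" loop can be sketched directly. A hedged illustration: only three candidate model families here (a real project might try dozens), synthetic data, and cross-validated accuracy standing in for whatever fit score you prefer.

```python
# Sketch of model iteration: score several candidate model families on
# the same data and keep the best. Candidate list and data are toy
# stand-ins for a real project's 50-model sweep.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 4))
y = (X[:, 0] * X[:, 1] > 0).astype(int)   # deliberately nonlinear target

candidates = {
    "logistic": LogisticRegression(),
    "tree": DecisionTreeClassifier(random_state=0),
    "forest": RandomForestClassifier(random_state=0),
}

# Cross-validated score per family; keep whichever fits these data best.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in candidates.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

On this nonlinear target the linear model scores poorly and the tree-based models win — which is the whole point: you don't know which family fits until you test, and once the winner is found you retrain it rather than re-run the whole sweep.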
How to iterate through models is a subject for two days' worth of lectures, and that's one thing you'll notice: there isn't one model — there are hundreds and hundreds of data science pattern-fitting predictive models.
"Awesome — thank you, gentlemen, super great content. I don't think we have any other questions. Thank you, everybody, for your attendance, your questions, and your time, and Ganesh and Grover, thank you so much for sharing your expertise with us today. Have a great day, everybody."
Appreciate it, everybody. Thanks, everyone.