Using Data in Decision Making - CEO of Metabase
Note: I have no idea why I (Jevin) sound like a robot in this episode. Sorry y’all!
Jevin: [00:00:00] Welcome, everyone, to Building Remote Teams. I’m here with Sameer from Metabase. Hello, Sameer.
[00:00:07] Sameer: [00:00:07] Hey, Jevin, how’s it going?
[00:00:09] Jevin: [00:00:09] I’m doing well. Yes, we are still living in deep, deep COVID times, so I’m always happy to do some adulting with some other experienced lifers. So today I want to talk about data and how we can use it in decision-making. This is, of course, a super huge topic, especially since the early 2000s with the whole business intelligence
[00:00:33] buzzwords. And now we’re using data science to come up with the answers. But I’ve noticed in my own life that sometimes we can use data in really unhealthy ways in a team. So I’m hoping we can jam on that a little bit today. But first, Sameer, I want to hear a little bit about Metabase. Tell us about the company and your big vision there.
[00:00:55] Sameer: [00:00:55] Yeah. So Metabase, at its heart and soul, is an open source project. We do offer commercial editions. We do offer managed services. But fundamentally, what it is is a simple, easy way to connect our application to any database you have lying around. And once that connection’s made, anyone in your company can ask their own questions.
[00:01:13] They can build their own dashboards. They can set themselves up with alerts or nightly emails. And in general, the thing that we really provide is the ability for anyone, regardless of whether they’re in the data priesthood or not, to just pull their own information and control their own destiny in that way.
[00:01:30] Some of the things that make us interesting are just this focus on iterative querying. So it’s not so much that you have a specific question that you’ve formulated and you write the SQL. It’s more that I’m curious about signups last week, and we let you essentially rummage around the data sets that you have access to.
[00:01:46] And so a lot of our magic comes in when everyone in your company gets to rummage around and find the things that help them be a little smarter at their job.
[00:01:55] Jevin: [00:01:55] Nice. Now, did you say you don’t have to be part of the data priesthood? So did I hear that right?
[00:02:01] Sameer: [00:02:01] Yes you did.
[00:02:02] Jevin: [00:02:02] What the heck is the data priesthood? I need this. Can I get a certification? Is this offered by Google?
[00:02:08] Sameer: [00:02:08] You probably can. So it’s kind of my running joke that in most companies there’s this official, canonical, blessed-from-on-high data. Someone has looked at it, someone’s checked it, someone’s built the dashboard for you. There are these people over there that understand data, and for most people, data is a four-letter word. It’s a label you give to things you don’t understand. If you understood what something was, you’d have a proper noun for it. So you have sales-qualified leads. You have website visitors, you have charts, you have uploaded photos, you have user-generated stories. And so in general, people that are working in an actual real-life job have words for the things they interact with.
[00:02:54] And when they use “data,” it’s usually in the sense of this unknown, uncertain basket of things they don’t fully understand. And so whenever I hear a large company talk about monetizing their data, in general it means they have no idea what they’re doing.
[00:03:10] Jevin: [00:03:10] Okay.
[00:03:11] Sameer: [00:03:11] Because the companies that monetize their data don’t use that word. They use things like, we are going to sell our user location data to creepy startups in ad tech, or we’re going to deliver industry reports based on the information we’re seeing, or any one of a number of things that signify that
[00:03:32] you know what you’re working with. You have a framework for why it’s valuable. You’re not just trying to collect a bunch of data and somehow magically transform it into something else. Back to your original question around the data priesthood: it’s the idea that in most companies there’s a handful of people in the org chart,
[00:03:51] roughly speaking, that understand what the data is. They understand where it came from, and they have the blessings of the company to use it and to disseminate it to everyone else in the org.
[00:04:04] Jevin: [00:04:04] Very cool. Okay. So people have installed Metabase or some other tool inside their company. Today I want to talk about using the data within the company for the company’s own purposes. So you’re sitting in marketing and you’re trying to make better decisions as a team. I want to run a story by you, Sameer, and I want you to critique how Jevin did with how he tried to use data as a voice in decision-making. Okay. So I was working as a consultant for a team.
[00:04:35] I came in to help on the growth engineering side: overhaul the analytics and help to focus some of our initiatives. The company’s main core user is a middle-aged woman. And one person on the team happens to know that demographic really well.
[00:04:53] And she had said, everyone is using Pinterest, this is where this demographic’s hanging out, which I think makes a lot of sense. And then I started digging into the analytics and realized there’s nobody coming in through Pinterest. And yet this individual has put, I would easily say, 80 hours into just putting content onto Pinterest to try to drive leads. But looking at the data, it looked like that was not the case.
[00:05:15] So I said, like, hey look, here’s what’s happening. It looks like we’re actually getting effectively no leads from all of your efforts. I said it probably in a nicer way than that, but that really set something off. And that individual had really
[00:05:31] kind of shut me out. So I was a little bit taken aback, because it was like, it’s just data, it’s just an objective thing. Of course our listeners are gonna be like, Jevin, you idiot, obviously she has an emotional attachment to this. But Sameer, I don’t know if you’ve seen this in other teams or maybe it’s super clear to you, but what did I do wrong in this situation?
[00:05:54] Sameer: [00:05:54] I can’t speak for you or her, so I don’t know what’s in either of your heads or hearts.
[00:05:59] Jevin: [00:05:59] That’s fair.
Decision Making Styles in Teams - Narrative and/or Data Driven
[00:05:59] Sameer: [00:05:59] There typically is a decision-making culture, and most teams, just to overly simplify the world, are in one of exactly two modes. There are more, but I think there are two to talk about.
[00:06:15] One of them is what I’d call narrative-based decision making, where you come in and you have a view of the world. You have the view of the world as it could be, or as it should be, or as you believe that it is. And in this case, it was your, I guess, coworker’s or client’s belief that the demographic that you’re chasing lives on Pinterest and that you should be getting leads from Pinterest. There’s a lot in that statement, but I’ll just jump ahead. And then there’s another decision-making culture, which is more data-driven or data-based, where it’s like, hey, here are the current leads we have. Our fundamental operating principle is we want to look to things that are working and double down on them. Those two are just fundamentally different ways of making decisions, and I think they both have their place. And so maybe one framing that would be useful to approach them with, if you were to play it back again, is: it looks like our demographic is primarily women ranging from X to Y ages.
[00:07:26] They definitely live more on Pinterest than they do on Twitter or other places. It seems like we’re not getting any leads from them. And so this is one of those things where, if we have a strategy for generating leads from Pinterest because we believe it’s a good source, then we should refine it and spend more time on it.
[00:07:45] And so it’s not that the 80 hours that you spent are useless. It’s more about that core problem: we believe that Pinterest should be a source, for whatever reasons we have going in. Do you want to operate under that world and make that work? Or, hey, we noticed that AdWords is working really well.
[00:08:04] We have limited budget. Let’s just work that channel till it’s saturated. Now, I have my own opinions about what you should do, but I think some of this is client management and making sure that you don’t get in the way of their bad decisions.
[00:08:18] Jevin: [00:08:18] Sure, that makes sense. I was bringing this up as an example of maybe how not to use data to try to, I guess, force your way through with your opinion. Looking at the numbers, strictly speaking, this was not a channel that was working. Not to say things wouldn’t change.
[00:08:37] So explain to me the difference between those two ways of approaching things, because it seems like you were using data as just a little bit of a dashboard light instead of this hard thing saying, yes or no, we shouldn’t be doing this.
[00:08:50] Sameer: [00:08:50] Yeah. So I’ll give you a little bit of anchoring in how we use data internally at Metabase. And
[00:08:55] Jevin: [00:08:55] Sure. And we can speak more generally. We don’t have to keep checking my terrible collaboration in that particular instance.
[00:09:04] Sameer: [00:09:04] But I think there’s something interesting about that story as well, because it touches on a lot of threads that you can pull on. One of the most interesting threads to pull on is the degree to which the current state of your channels is what it should be. And I think that there are two approaches to building up companies, building up products, building up marketing efforts.
[00:09:27] One of them is, we have a high conviction bet that X is what’ll work out in the end. Say we have a high conviction bet that we want to get customers that spend a hundred thousand dollars with us, because if we’re successful in our domain, most of our revenue will come from there. Therefore we need to chase them.
[00:09:48] And if you come to me and say, most of the customers coming in want to pay a hundred dollars, we should launch a product that charges a hundred dollars a year, the answer to that is, it doesn’t matter what’s coming in, because that’s not going to work. So if that’s what’s coming in, we have to change
[00:10:06] what’s coming in. Now, some people do this in their heads and never articulate the statement that our current set of channels, while they may be having bounded success, have the kind of success that we don’t think is long-term sustainable or useful. As kind of a meta game, we want to change what’s on the board. We don’t want to just say, cool, we’re doing really well in AdWords, when we’re getting our behinds handed to us on keyword bidding. Fundamentally, it’s a losing game. If we’re going to be successful, we have to find an organic social strategy. And even if it’s not working, we need to make it work. I’m not sure this is what was in your client’s head. I don’t know if this was the way she was contemplating it internally. But I do think that it’s important to call out this idea that there tends to be a very simplified view of data and data-driven decision-making: just “the data says X.”
Interpreting Data is Not Easy
[00:11:04] And the honest truth of it is that inferring the right decision based on data is very challenging. It takes a high degree of skill in understanding what questions to ask, then interpreting the answers, and often going back and understanding if the question that you asked is actually something that gives you a signal to make decisions. And so take the example that I kind of threw out, which is: our inbound lead funnel is brimming with folks that want to pay a thousand dollars a year, and we have this incredible signal, let’s lean into that. But if your underlying cost of service delivery is like $500, then you just shouldn’t be playing that game. And so the right answer is to just drop the chess board, pick up another game and start over, or just change the game somehow. And so I
[00:12:00] Jevin: [00:12:00] That makes sense. Yep.
[00:12:02] Sameer: [00:12:02] often, when you’re trying to cross the boundary between people who have very strong, intuitive senses of what the right decision is and a more data-driven decision making process, there’s a lot more going on in the intuitive decision-making world than you necessarily get at first glance, and there’s often less going on in the data-driven world. And so maybe another example of using data with your team is the usual A/B testing games. If you talk to anyone that’s done A/B testing at large, politicized companies, there’s often this thing that starts to happen where there’s pressure to stop the A/B test once the desired answer pops up.
[00:12:49] Jevin: [00:12:49] Right. The CEO really wanted this orange button and it’s currently blue, and it’s starting to lead in the polls of the B of the A/B test. So let’s just do it. Let’s just lock it in. Orange it is.
[00:13:01] Sameer: [00:13:01] And this dynamic happens all over the place, and you start to see it more and more. Sometimes it’s conscious: if you give the CEO what they want, you’ll get a promotion. And sometimes it’s just that you have your own preferences, and once you see enough data to confirm your biases, you’re ready to move on and not waste more time. And so I think when you start getting into how to make decisions backed up by data, it’s important to remember that you only ever see a subset of the things you need. You almost always have to work with imperfect data. Rather than chess, maybe the best analogy is poker, where you never know what all the cards are, but you still have to make decisions based on imperfect information.
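The pressure to stop an A/B test as soon as the desired answer shows up can be made concrete with a small simulation. The Python sketch below is illustrative only: the two-proportion z-test, the 10% conversion rate, the 20 evenly spaced looks, and all function names are assumptions, not anything described in the conversation. Both variants are given the same true conversion rate, so every declared "winner" is a false positive.

```python
import math
import random


def z_score(conv_a, n_a, conv_b, n_b):
    """Two-proportion z statistic for B versus A (0.0 if undefined)."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return 0.0 if se == 0 else (conv_b / n_b - conv_a / n_a) / se


def run_experiment(rng, rate=0.10, batches=20, batch_size=100, z_crit=1.96):
    """Simulate one test where A and B share the same true rate.

    Returns (flagged_at_any_peek, flagged_at_planned_end).
    """
    conv_a = conv_b = n = 0
    flagged_at_any_peek = False
    z = 0.0
    for _ in range(batches):
        conv_a += sum(rng.random() < rate for _ in range(batch_size))
        conv_b += sum(rng.random() < rate for _ in range(batch_size))
        n += batch_size
        z = z_score(conv_a, n, conv_b, n)
        if abs(z) > z_crit:
            # Someone peeking at this point would stop and declare a winner.
            flagged_at_any_peek = True
    return flagged_at_any_peek, abs(z) > z_crit


def false_positive_rates(trials=1000, seed=0):
    """Share of no-difference tests flagged with peeking vs. one final check."""
    rng = random.Random(seed)
    results = [run_experiment(rng) for _ in range(trials)]
    peeking = sum(r[0] for r in results) / trials
    fixed = sum(r[1] for r in results) / trials
    return peeking, fixed


if __name__ == "__main__":
    peeking, fixed = false_positive_rates()
    print(f"stop at first significant peek: {peeking:.1%} false winners")
    print(f"single check at planned end:   {fixed:.1%} false winners")
```

Under these assumptions, checking after every batch and stopping at the first significant result flags a winner far more often than a single check at the planned end, even though there is nothing to find.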
[00:13:46] Jevin: [00:13:46] Right, right. Yeah, that makes sense. I think people want to have this idea of perfect data. I have people reach out periodically for some analytics consulting, and people are just so hung up on finding every single data point, when I encourage people instead to just get three data points to start, and then you can get more and more granular. But people get so hung up on wanting everything. Well, some people, I guess, that are like, I want to control everything.
[00:14:14] I want to know exactly what’s happening. And they’ve spent so much effort on just setting up their analytics, and they just get lost in their data.
How to make your life painful with data (Pack rats or only showing key metrics)
[00:14:22] Sameer: [00:14:22] Yeah. So if I can kind of like describe another way to like, make data painful for everyone involved.
[00:14:28] Jevin: [00:14:28] Please. Yeah. Tell us how to make it painful.
[00:14:31] Sameer: [00:14:31] So there are two opposite ends of the spectrum, both of which will cause you a lot of pain. One of them is just the pack rat setup, where people want everything. So every possible aspect of every possible event needs to be itemized and cataloged. Now, there is the danger that you spend so much time collecting and not enough time actually formulating hypotheses that you then test with data. The other is this narrow fixation on the idea that there are one, two, or three metrics that matter, and that everything else is superfluous. You end up having this idealized conversation about what those metrics should be.
[00:15:12] And then you realize that getting those metrics is impossible. And then you’re stuck in this weird world where you have these things that you want, that you can’t really have, but you’re still focused on, like, these are the three things that drive growth. So for example, we must measure cross-channel virality. And the answer is, maybe that’s just not in the cards for you.
[00:15:34] Jevin: [00:15:34] yeah,
[00:15:35] Sameer: [00:15:35] Like,
[00:15:35] Jevin: [00:15:35] Because of how they’re tracking their data? Or, like, why wouldn’t it work for them?
[00:15:40] Sameer: [00:15:40] Maybe you’re not tracking data properly. Maybe the channels you’re trying to correlate across fundamentally don’t line up. I’ve seen people demand offline-to-online attribution, and it’s like, well, not all of that is going to happen for you. And even if it’s a truly important, ground-zero KPI that determines how well you’re doing, you might just not be able to get it.
[00:16:03] There’s this mushy world in the middle where we live, which is: your company naturally collects a fair amount of data. A lot of that has value in decision-making. You’re probably not going to end up with this idealized world where if this number is over 2.75, do X, and if it’s below 0.725, do Y. But instead you have to formulate your own
[00:16:29] framework or model for your business. You have to essentially tease apart the various hypotheses you have. And then, based on what you see in front of you in the charts or the dashboards or the data pulls or the spreadsheet, go back and be like, okay, based on this mental framework I have for growth.
[00:16:50] So for example, one common framework is retention über alles, where if you drop churn, everything else magically works. Then you start saying, cool, so the fundamental framework I have is churn is the root cause. I need to get to the bottom of why people are churning, whether that’s using qualitative data, quantitative information from tracking, or just getting consensus.
[00:17:12] I want to use the data I have to validate or invalidate those. I want to work my way through all the reasons people are churning and fix them. And then hopefully that leads to growth. But along the way, you’re making decisions like, which of these are important? Which potential reasons for churn are actually hitting our users?
[00:17:32] Which ones are really just rounding errors? Now that I know which ones are important, which ones seem like they’ll have the most technical lift? And so if, for example, there’s fundamental product dissatisfaction and you’re not going to fix it this quarter, but you have to do something, maybe you clean up
[00:17:50] the other areas of churn. And all along the way, I would contend, you should be infusing those decisions with the data in that context, as opposed to having, again, either the pack rat mentality where you need everything at all times, or conversely, where this is the only thing that matters, and these two supporting proxies are the only things that matter,
[00:18:14] and so we’re going to make all our decisions based on these three numbers.
How to Start using Data in a Better Way and Finding Early Wins
[00:18:18] Jevin: [00:18:18] Sure. That makes sense. So it sounds like those are opposite ends of the spectrum. So someone’s listening to this and they’re like, okay. Maybe they’re not really using data at all, apart from maybe the most superficial kind of data, like uniques a month from Google Analytics, or leads per month through Salesforce or something.
[00:18:41] But they want to become more, I guess, healthy in how they’re using this data to explore. How can they get started incorporating data in healthy ways, not at either one of those ends of the spectrum, but in the middle, using it to ask questions and inform their decision-making process, while understanding that their data is probably not perfect and that there’s actually more behind it?
[00:19:10] Sameer: [00:19:10] Yeah. I think there’s almost a dark matter of data that I would call out, which is putting lists of things in front of people that need them. As people talk about all their aspirational hopes for making decisions with data, some of the nuts and bolts get lost. So concretely, something as simple as: do we have just a list of everyone that signed up yesterday, and can people look at it?
[00:19:40] Can we combine the signup information with maybe a sketch of their first day’s activity? Maybe can we just have some rough numbers, like the average number of comments, percent completion of a signup funnel, number of connections added? And can you just get that list in front of people? And can you iterate on that list? I use the word list instead of database table or spreadsheet or dataset because I think most people internalize it as, I just want the people that signed up yesterday.
[00:20:13] Jevin: [00:20:13] sure.
[00:20:14] Sameer: [00:20:14] And you slap Metabase on top of that,
[00:20:17] i.e., there’s a table where you put all this somehow. Either it’s an ad hoc join, or you’re storing it that way, or there’s some ETL pipeline that produces it. And then you just put that in front of as many people as you can, and you let it infuse their conversations. I think that there’s often this jump to the final step, which is, what is the piece of information we need to make this decision, that neglects the substrate, which is people being used to looking up information as they think about decision-making.
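The "list of everyone that signed up yesterday, plus a sketch of their first day" can be produced with a single ad hoc join. Here is a minimal Python sketch using an in-memory SQLite database; the `users` and `events` tables, their columns, and the sample rows are hypothetical stand-ins for illustration, not Metabase’s schema or anyone’s real data.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory database with made-up signups and activity."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, signed_up_at TEXT);
        CREATE TABLE events (user_id INTEGER, kind TEXT, created_at TEXT);
        INSERT INTO users VALUES
            (1, 'ana@example.com', '2021-03-01 09:15:00'),
            (2, 'bo@example.com',  '2021-03-01 17:40:00'),
            (3, 'cy@example.com',  '2021-02-27 11:00:00');
        INSERT INTO events VALUES
            (1, 'comment',    '2021-03-01 09:30:00'),
            (1, 'comment',    '2021-03-01 10:05:00'),
            (1, 'connection', '2021-03-01 10:10:00'),
            (3, 'comment',    '2021-02-27 12:00:00');
    """)
    return conn


def yesterdays_signups(conn, day):
    """One row per user who signed up on `day` (YYYY-MM-DD), with counts of
    comments and connections made on that same day."""
    query = """
        SELECT u.email,
               u.signed_up_at,
               COALESCE(SUM(e.kind = 'comment'), 0)    AS comments,
               COALESCE(SUM(e.kind = 'connection'), 0) AS connections
        FROM users u
        LEFT JOIN events e
               ON e.user_id = u.id
              AND date(e.created_at) = date(u.signed_up_at)
        WHERE date(u.signed_up_at) = ?
        GROUP BY u.id
        ORDER BY u.signed_up_at
    """
    return conn.execute(query, (day,)).fetchall()


if __name__ == "__main__":
    for row in yesterdays_signups(demo_connection(), "2021-03-01"):
        print(row)
```

The LEFT JOIN keeps people who signed up but did nothing on day one, which is often the most interesting part of the list. Saved as a table or view, the same query is the kind of thing a tool like Metabase could sit on top of.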
[00:20:49] Jevin: [00:20:49] Right. That’s what you’re looking for.
[00:20:51] Sameer: [00:20:51] Yeah. I mean, I would say the thing you’re trying to get people to do is to start sharing links in Slack. “I saw this.” “Hey, interesting, this subset looks a little weird.” “Hey, I found this, it’s kind of not what I thought it would be.” People pass it around and show it to each other.
[00:21:10] And then, when people start having arguments or debates, they start seeding those with, again, links to charts or links to lists or links to summarizations.
[00:21:21] Jevin: [00:21:21] Yes.
[00:21:22] Sameer: [00:21:22] And again, you want to build this up, and it’s probably more valuable to build that up than to, again, start at the very end and say, how do I decide which ad units to buy? It’s like, well, you have to get there. And in many companies, there are these core business decisions that you have to get to. But I think that you concurrently, or even beforehand, need to have data accessibility and data-powered conversations be a norm, as opposed to a world where people come in with a narrative and a few carefully chosen data points to support it, and that’s all you’ve got.
[00:22:06] Jevin: [00:22:06] Yep. Now, say you install Metabase, it’s got single sign-on, and everyone in the company can sign into it. Maybe you have one champion in the company that starts building these mini dashboards for marketing. Like, hey, here are the people that signed up yesterday.
[00:22:21] You can look up their product profiles, Google their email addresses.
[00:22:24] In your experience putting Metabase or some other kind of tool in front of people, does the whole company just start using it? Or what are the key catalysts that you’ve found that have made that work?
[00:22:36] Sameer: [00:22:36] Yeah, so I mean, the answer is it varies. It varies primarily on: does the person who sees that table understand what they’re looking at or not? Does it map to something they care about? If people have jobs that they’re trying to do and you give them data that helps them do their job, they use it all the time. And so one of our big success stories, I think there’s a case study on our website, is Go-Jek, which is essentially the Uber of Southeast Asia.
[00:23:07] Jevin: [00:23:07] Oh, cool.
[00:23:08] Sameer: [00:23:08] You know, we have, I think, 10,000 people using Metabase on a weekly basis there, or something silly.
[00:23:14] Jevin: [00:23:14] Toledo.
[00:23:15] Sameer: [00:23:15] And so that’s a place where they made a conscious effort to get data pulled together, to get everyone from the C-suite down to customer service reps access to the data sets and questions they need.
[00:23:26] And it’s heavy, heavy, heavy use.
[00:23:29] Jevin: [00:23:29] Wow.
[00:23:30] Sameer: [00:23:30] There are lots of companies where they actually don’t think too much about the self-service world and they just say, we need to make some dashboards. Here are some dashboards. Peace. And in those places, if the dashboard is not something that helps you day to day, you’re not going to care.
[00:23:45] Jevin: [00:23:45] Yeah.
[00:23:45] Sameer: [00:23:45] If it’s top-line KPIs, like, who cares? Someone will read it out in a meeting anyway. You don’t have to sit there and watch the counter.
[00:23:53] Jevin: [00:23:53] Yeah.
[00:23:54] Sameer: [00:23:54] Nobody really cares. But if it’s like, I’m talking to someone and they’re angry, and I want to look up what happened in the last six hours, and the data that supports that question is in front of me,
[00:24:08] I’m going to do that a lot. Yeah.
[00:24:09] Jevin: [00:24:09] Yeah. Yeah, that’s great. We’ve used it where we built a Rails app and were really conscious about tables. We were building with the vision of, like, we’re just gonna use Metabase for the data team to interact and watch how the data is coming in and do their analysis.
[00:24:27] We don’t have to now go and build a whole custom dashboard on top of Rails to do that. We can just have them do it with Metabase, but we have to be smarter about how we’re building our tables out. And it worked out really well.
[00:24:40] Sameer: [00:24:40] Yeah. I’d also throw out that this is one of those places where, if I may kind of take a tangent,
[00:24:46] Jevin: [00:24:46] Sure.
[00:24:46] Sameer: [00:24:46] most normal people have a certain mental model of how their business works.
[00:24:52] A customer has a subscription. Subscriptions have line items. Or sorry, subscriptions might have line items, but they really have a set of check boxes.
[00:25:01] They have a certain net payment term. They have a certain discount or coupon code. And for the most part, people have this mental model that often breaks down to being roughly spreadsheet-shaped. They don’t really think about multiple entities. They don’t think about normalized data.
[00:25:22] They certainly don’t think about address as being the thing that you normalize out. And often there is the quote-unquote naive data model that DBAs will make fun of you for putting in that’s actually really close to a normal human being’s vision of their business. And if you hand over a data model that’s, again, the equivalent of your first Rails app,
[00:25:46] Jevin: [00:25:46] yeah.
[00:25:47] Sameer: [00:25:47] that’s probably closer to how a real person thinks about their day-to-day job than a hyper-normalized or snowflake schema or any of the things that DBAs really like. And so the process of putting something intelligible in front of end users is actually really important. I think one of the mistakes that most people make is they get way too sophisticated and smart
[00:26:14] when they should really just accept an inefficient storage format for a schema that maps more closely to how end users conceptualize those entities and relationships. And that, again, lets you get out of this world where there’s only a handful of dashboards that someone special made.
[00:26:36] If you have to do three or four joins to get something that an end user can interpret, then you’re essentially saying that only people that know how to do three or four joins can use that data.
[00:26:48] Jevin: [00:26:48] Yeah, that’s right.
[00:26:49] Sameer: [00:26:49] So, circling back to your question, do people use it: if you give me this crazy shit show of an 18-table join that I have to put together each time I look up who signed up yesterday, I’m not going to do it.
[00:27:02] If you hand over a nice clean table where all the columns have proper nouns, I know what they all mean, the numbers all make sense, I can basically treat it as a glorified spreadsheet, and it actually is something that I care about day to day, then I’m probably gonna use it day to day.
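One way to hand people that glorified spreadsheet without touching the application’s normalized schema is a flat view that does the joins once and gives every column a plain name. A minimal sketch with SQLite from Python follows; every table, column, and value is made up for illustration, and the same idea would apply to a view or nightly table on another database.

```python
import sqlite3


def build_demo_db():
    """Create normalized tables plus one flat, spreadsheet-shaped view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Normalized tables, roughly as an application might store them.
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE coupons (id INTEGER PRIMARY KEY, code TEXT, percent_off INTEGER);
        CREATE TABLE subscriptions (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            coupon_id INTEGER REFERENCES coupons(id),
            plan TEXT,
            net_payment_days INTEGER
        );
        INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
        INSERT INTO coupons VALUES (1, 'LAUNCH20', 20);
        INSERT INTO subscriptions VALUES
            (1, 1, 1, 'pro', 30),
            (2, 2, NULL, 'starter', 15);

        -- The flat view: joins are done here once, not by each end user.
        CREATE VIEW customer_subscriptions AS
        SELECT c.name                     AS customer,
               s.plan                     AS plan,
               s.net_payment_days         AS net_payment_days,
               k.code                     AS coupon_code,
               COALESCE(k.percent_off, 0) AS discount_percent
        FROM subscriptions s
        JOIN customers c ON c.id = s.customer_id
        LEFT JOIN coupons k ON k.id = s.coupon_id;
    """)
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    # An end user's question now needs no joins at all.
    for row in conn.execute(
        "SELECT customer, plan, discount_percent "
        "FROM customer_subscriptions ORDER BY customer"
    ):
        print(row)
```

The view trades storage and query efficiency for column names a non-specialist can read, which is the trade-off argued for above.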
How do you start data exploration if your data isn’t clean
[00:27:19] Jevin: [00:27:19] Yeah. Yeah. But what do people do when their data isn’t like that? Maybe someone’s listening to this and is like, wow, that sounds great, let’s expose this. And then they ask their CTO to do it, and there’s grumble, grumble, and finally they throw up the Docker instance, and the tables are just all over the place and it’s just not happening. It’s 12 joins to be able to find out the user signups from yesterday. Where do they go next? They’re trying to figure out how to get some of this data house in order so they can start asking questions.
[00:27:53] Sameer: [00:27:53] Yeah. I mean, I think starting from the CEO is probably going to be a recipe for pain.
[00:28:01] Jevin: [00:28:01] Or whoever, I guess. Maybe it’s the head of marketing or something. It’s like, I’m desperate for this.
[00:28:06] Sameer: [00:28:06] Yeah. I would actually start from the other end. If I were to wave a magic wand and plant a seed in the people listening, it’s this: if you’re a C-level or head of blah, don’t think about what you need, because right now you probably have power. You probably have the ability to ask someone to pull up a question for you.
[00:28:27] Jevin: [00:28:27] Yeah, that’s true.
[00:28:28] Sameer: [00:28:28] Can you look at X, Y, or Z? And a lot of what the next leg of analytics is for companies worldwide is expanding that out to everyone else in the org chart. So it’s not about what you are curious about. It’s about what your customer success managers, who are hopping on phone calls and video calls and emails day to day, could look up.
[00:28:54] Jevin: [00:28:54] Yeah, sure.
Building Quick Wins with the Company Without Rewriting your Schema
[00:28:55] Sameer: [00:28:55] it’s.
[00:28:56] And so it’s more about finding specific data sets that are valuable, because the generic problem of “all my data is crazy complicated” is one you kind of need to dissolve away as opposed to tackle head-on.
[00:29:11] Jevin: [00:29:11] W what does that mean? Like you just, you’re just trying to get some utility from it and then just kind of clean things up as you go.
[00:29:18] Sameer: [00:29:18] Yeah. And like find, use cases where, Hey, if I create this ETL table or this view in our database, then the customer success team has a quick lookup call that a win, productionize it, get in front of people and then do the next one. And then do the next one, your next one. And that keeps the problem from becoming politically attractable.
[00:29:42] Because at some point there’s politics here, which is your CTO has probably made a bunch of decisions on what the database schema looks like for a set of constraints. I’m going to assume they’re very smart. I’m going to assume they know what they’re doing. I’m going to assume that, like most people in that position, there’s an app they’re building that has to get shipped. Maybe they have this idea that they also need to provide dashboards. Maybe they also have this idea that they need to provide internal tooling, either via something like Retool or adjacent applications that are really simple, or via, like, the mainline application itself. But in general, they’ll have made a set of trade-offs under the idea that shipping the app is the paramount concern, which honestly it probably is.
[00:30:29] Jevin: [00:30:29] It probably is. Yeah.
[00:30:31]Sameer: [00:30:31] There’s the politics of, they made these decisions that optimize performance and latency of the application and scalability that then destroy the self-service analytics case.
[00:30:43] Jevin: [00:30:43] right?
[00:30:44] Sameer: [00:30:44] So at some point there’s this fundamental tension between, Hey, we want the customer success managers to be able to pull this stuff up and your CTO is like, that’s great.
[00:30:53] Do you want to have latency bump to 1.5 seconds from 150 milliseconds?
[00:30:59] Jevin: [00:30:59] Great.
[00:31:00] Sameer: [00:31:00] And so at that point, you’re kind of stuck in the, okay, cool, I now have to spend hundreds of thousands of dollars of engineering time to create an ETL pipeline, to create a data lake, to create a data warehouse, to manipulate all this.
[00:31:13] And at that point you’re in just these death march projects that probably never get done. Whereas if it’s like, hey, we’ve got these three data sources, can we just set up a quick SQL view on this little Postgres box we have lying around and flush information there nightly? The answer to questions like that is like, oh yeah, that would be doable.
[00:31:30] We’ll do that. Whereas if it’s like, completely rework your schema in a production app that has live traffic, the answer to that is like I
[00:31:37] Jevin: [00:31:37] Your CTO will probably not like that answer. Yeah, probably not super excited about it. But starting by doing some simple scripts to do some overnight stuff into your own, maybe your own analytics database or, like, a bit of a cleaner data set or data lake, because you’re saying that you can just start doing queries on it, separate from
[00:31:57] the CTO’s precious schema, to try to get some early wins, might be a good first place to start.
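To make the “quick view on a little box, refreshed nightly” idea concrete, here is a minimal sketch in Python using only the standard-library sqlite3 module. Everything named in it is an assumption for illustration: the tables (plans, users, signup_events), the flattened table signups_flat, the view daily_signups, and the sample rows are hypothetical, and SQLite stands in for the Postgres box mentioned in the conversation. The point is only the shape of the job: do the joins once, overnight, and leave behind something that needs no joins to query.

```python
import sqlite3


def make_sample_app_db() -> sqlite3.Connection:
    """Build a tiny in-memory stand-in for a normalized production schema.

    All table names, columns, and rows here are hypothetical.
    """
    app = sqlite3.connect(":memory:")
    app.executescript(
        """
        CREATE TABLE plans (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, plan_id INTEGER);
        CREATE TABLE signup_events (user_id INTEGER, created_at TEXT);

        INSERT INTO plans VALUES (1, 'free'), (2, 'pro');
        INSERT INTO users VALUES
            (1, 'a@example.com', 1),
            (2, 'b@example.com', 2),
            (3, 'c@example.com', 1);
        INSERT INTO signup_events VALUES
            (1, '2021-03-01 09:15:00'),
            (2, '2021-03-01 17:40:00'),
            (3, '2021-03-02 08:05:00');
        """
    )
    return app


def refresh_analytics(app: sqlite3.Connection, analytics: sqlite3.Connection) -> None:
    """Nightly job: do the joins once and copy the result to a separate database."""
    rows = app.execute(
        """
        SELECT u.id, u.email, p.name, DATE(e.created_at)
        FROM users u
        JOIN plans p ON p.id = u.plan_id
        JOIN signup_events e ON e.user_id = u.id
        """
    ).fetchall()

    # Rebuild from scratch each night so the job is safe to re-run.
    analytics.executescript(
        """
        DROP VIEW IF EXISTS daily_signups;
        DROP TABLE IF EXISTS signups_flat;
        CREATE TABLE signups_flat (
            user_id INTEGER, email TEXT, plan TEXT, signup_date TEXT
        );
        CREATE VIEW daily_signups AS
            SELECT signup_date, plan, COUNT(*) AS signups
            FROM signups_flat
            GROUP BY signup_date, plan;
        """
    )
    analytics.executemany("INSERT INTO signups_flat VALUES (?, ?, ?, ?)", rows)
    analytics.commit()


if __name__ == "__main__":
    app_db = make_sample_app_db()
    analytics_db = sqlite3.connect(":memory:")
    refresh_analytics(app_db, analytics_db)
    # The question "how many signups per day, per plan?" now needs no joins.
    query = "SELECT * FROM daily_signups ORDER BY signup_date, plan"
    for row in analytics_db.execute(query):
        print(row)
```

The production schema is never touched: the job only reads from it, so the trade-offs the CTO made for the app stay intact while the analytics copy can be as messy or as denormalized as the use case needs.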
[00:32:04] Sameer: [00:32:04] Yeah, I’d say first, second, through 10th or 20th. Yeah, because I think, like, once you’ve cobbled together 10-ish of these use cases and data sets, you’ve probably built up a fair amount that feeds into your eventual data pipeline. You’ve almost certainly done this in a messy way that you have to refactor, rebuild, and productionize, but you’ve also established, hopefully, that these are use cases that have value.
[00:32:26] You can start to show VP and C-level people that, like, this is actually improving performance in some measurable way. You can build a case for some of the bigger projects. And honestly, even if you never do that, you’ve made 10 little groups of people much more efficient at their day-to-day jobs and hopefully made the company perform better.
[00:32:46] Jevin: [00:32:46] Yep. I really like that. I think that’s a great middle ground where you’re just, as you said, trying to get these quick wins and start to socialize the idea of using data, getting them to start sharing the dashboards and the answers to their questions in Slack, and people having it as part of the conversation, without having to spend, yeah,
[00:33:07] weeks and weeks and potentially months trying to reconfigure their database schema from scratch just for analytics, which seems like it’s probably not the main reason why you’d want to do a refactor of your day-to-day schema.
[00:33:23]I think that’s a really, really healthy approach.
[00:33:26] Sameer: [00:33:26] Yeah. So that’s kind of the general, like, gospel that we try to spread.
[00:33:31] Be bottoms up. Look for use cases that work and just lean into success, as opposed to trying to fix the whole monstrous problem, which you’re probably not gonna be able to.
Introducing data in your conversations and decisions
[00:33:42] Jevin: [00:33:42] Yeah. Yeah. I love that. Maybe the last thing that I’d like to touch on would be what should people look out for when they’re trying to inject data into a conversation where they’re trying to decide on a particular direction to go? I know you’ve mentioned, just be cognizant that maybe your data isn’t telling the full story because it could be inherently flawed.
[00:34:01] What are some other things that you would suggest to people when they’re trying to just start using data in their daily conversations? Maybe touching on some of the emotional elements that might happen, that are associated with that.
[00:34:16] Sameer: [00:34:16] So let’s say that you’re a manager of a team or that you’re, yeah, the tech lead of a team, or you may or may not have the ability to tell people what to do, but you’re trying to exert influence on them. The thing to do is to just start peppering your own conversations with data. Again, we’re talking about remote work, so you’re probably on some form of chat. Your conversation just starts to include charts and links to charts. You start to have this, like, socialization of not necessarily backing up your points with data, but at least pointing to data and talking about data. And I think that eventually people will start to, at the very least, talk to you around those data points that you bring up. And so things like, hey, it looks like conversion was down by 15% last week.
[00:35:03] Can we dig into this? And it was down in these sub-segments, and can you take a look at the segment for me and make sure that this code change we pushed last week didn’t break this? And just again, starting from these kernels of, I’m having this conversation that has some pointer to an objective reality. It’s not just, hey, I think we broke conversion. I think one of the things, as a manager or as a lead, that’s sometimes hard to really internalize is that, in some ways, your job is feeding context to the people that report to you. And one of the best ways to provide this shared context is to point to, like, the raw data, as opposed to the narrative version of it that supports your perspective.
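As a rough illustration of what “it was down in these sub-segments” can look like in practice, here is a small Python sketch that compares conversion rates week over week, segment by segment. The segment names and the visitor and conversion counts are invented for the example, and the function names (conversion_rate, relative_change, segment_report) are hypothetical, not part of Metabase or any library.

```python
from typing import Dict, List, Optional, Tuple

Counts = Tuple[int, int]  # (visitors, conversions)
ReportRow = Tuple[str, float, float, Optional[float]]


def conversion_rate(visitors: int, conversions: int) -> float:
    """Share of visitors who converted; 0.0 when there were no visitors."""
    return conversions / visitors if visitors else 0.0


def relative_change(prev_rate: float, curr_rate: float) -> Optional[float]:
    """Change relative to last week's rate, or None if there is no baseline."""
    if prev_rate == 0:
        return None
    return (curr_rate - prev_rate) / prev_rate


def segment_report(prev: Dict[str, Counts], curr: Dict[str, Counts]) -> List[ReportRow]:
    """Return (segment, last week's rate, this week's rate, relative change).

    Rows are ordered with the biggest drops first; segments without a
    baseline from last week go at the end.
    """
    report: List[ReportRow] = []
    for segment in sorted(set(prev) | set(curr)):
        prev_rate = conversion_rate(*prev.get(segment, (0, 0)))
        curr_rate = conversion_rate(*curr.get(segment, (0, 0)))
        report.append((segment, prev_rate, curr_rate, relative_change(prev_rate, curr_rate)))
    report.sort(key=lambda row: (row[3] is None, row[3] if row[3] is not None else 0.0))
    return report


if __name__ == "__main__":
    # Invented numbers: (visitors, conversions) per segment.
    last_week = {"desktop": (2000, 120), "mobile": (1000, 50)}
    this_week = {"desktop": (2000, 120), "mobile": (1000, 30)}
    for segment, prev_rate, curr_rate, change in segment_report(last_week, this_week):
        change_text = "n/a" if change is None else f"{change:+.0%}"
        print(f"{segment}: {prev_rate:.1%} -> {curr_rate:.1%} ({change_text})")
```

Sharing a per-segment breakdown like this, rather than a single headline number, is one way to put the raw picture on the table and let the team look for the cause together.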
[00:35:55] Jevin: [00:35:55] right. Yeah.
[00:35:56] Sameer: [00:35:56] So like, if you’re just saying, hey, we should do X because of this chart, that’s probably not going to have as transformative an effect as, hey, here’s the thing, here’s what I noticed. And always anchoring it in kind of the raw list, as opposed to, here’s the number that I pulled out
[00:36:16] that justifies why I’m right.
[00:36:19] Jevin: [00:36:19] right.
[00:36:20] Sameer: [00:36:20] People who are being effectively, like, convinced based on someone’s, quote, data-driven decisions, what they’re experiencing is: this person has an agenda. They’re going to flash a few things at me to confirm their bias.
[00:36:37] And there’s a very rightful suspicion of, like, is this thing even representative? And am I just being snowed? And so I think that’s actually been one of the big anchors that seems to work: don’t start at the decision you’re trying to get people to reach. Just inject context, and inject context that hopefully is objective.
[00:37:02] I mean, again, like, reality is messy. We don’t collect all the data we need, all the usual caveats. But I think that there is a qualitative difference between having a conversation where what’s, quote unquote, on the table is just a list of all the information you have, and engaging over that information, so to speak, versus showing up with a perspective and a position that you then debate, powered by data. So I think that, like, there’s a difference between data-driven decision making and data-backed debating.
[00:37:32] Jevin: [00:37:32] Yeah.
[00:37:33] Sameer: [00:37:33] A lot of people are doing the latter when they think they’re doing the former.
[00:37:35] Jevin: [00:37:35] Yeah. Yeah, that makes a lot of sense. There’s the fighting over these numbers, to further dig in and justify their own kind of decision or perspective, versus having these different pieces of data that people kind of put in the middle, with everyone trying to make sense of it to try to pick the direction they should go.
[00:37:56] So just far, far more collaborative in the latter, if I understand what you’re saying. Yeah, I love that. This is great. While it may not be super obvious, I guess, how data in decision making works into building remote teams, this is all about just working together
[00:38:14] better. So I think having data as part of the conversation, as we were just talking about in that last segment, I’m hoping when people hear this, they’ll be able to self-identify and figure out ways they can just have healthier conversations around the data that they’re bringing to their decisions.
[00:38:33] Any final thoughts here?
[00:38:36] Sameer: [00:38:36] Just the most blatant self-serving one: everyone should download Metabase, use the data they have lying on hand, and just see what’s in there. And, like, give it to folks in your company, let them rummage around. Most people tend to be very pleasantly surprised with what they already have on hand.
[00:38:51] Treat it less as “I have these hard requirements” and more as just, like, what’s in there, and what can I do with it? I think, again, that perspective leads to people discovering, like, oh wow, I didn’t realize X, Y, or Z, as opposed to, oh, we can’t get cross-channel attribution. Okay.
[00:39:07] Jevin: [00:39:07] Right, right. Yeah. Check out Metabase, everyone. It’s got almost 20,000 commits on GitHub, so it is super active, with 237 contributors. So it does not look like it’s going away for a while. You can get the Docker file for that and incorporate it super easily. Sameer, thanks a lot for coming on the show. This was lots of fun, to talk to someone else who thinks deeply about how data can be used in teams.
[00:39:33] Sameer: [00:39:33] My pleasure.
[00:39:35] Jevin: [00:39:35] Alrighty. Well, for now, this is Jevin and Sameer, hoping that you can do a better job building remote teams.
[00:39:42] Sameer: [00:39:42] All right. Thank you so much.