Lorre Atlan - Wisdom in Data
Welcome to "AigoraCast", conversations with industry experts on how new technologies are transforming sensory and consumer science!
Dr. Lorre Atlan currently works as a Data Scientist at GSK. Lorre was hired as a Future Leaders Program associate, which provides leadership expertise through 3 rotations throughout GSK. She has tackled various machine learning problems in Pharma R&D and Manufacturing. In her current role in Consumer Health R&D, she builds machine learning models to help sensory scientists improve the consumer sensory experience. Lorre revels in strategically employing machine learning and Python programming for tangible returns to patients and colleagues. Before GSK, Lorre earned her doctorate in Bioengineering from the University of Pennsylvania and her bachelor’s in Biomedical Engineering from Johns Hopkins.
Transcript (Semi-automated, forgive typos!)
John: So, Lorre, welcome to the show.
Lorre: Hey, John, thanks for inviting me and having me.
John: Thank you so much for being on the show. So, it's a pleasure. So Lorre, I think just to get started, it would be interesting. It always amazes me how in sensory there are so many different paths into the field. So I think a great place to start is to hear your story of how you ended up in sensory, at least for the moment, and, you know, the lessons you learned along the way, the things that you feel are helping you in your current role as a sensory scientist.
Lorre: Yeah. I love to talk about my story. So, yeah, as you covered, my background's in Bioengineering at Penn, and during that time, you know, I had a lot of time for learning, and, you know, not just learning about pediatric brain trauma, which was my dissertation thesis, but also learning about Python, learning how to code in R, just expanding from my background in MATLAB during undergrad. So I think that really got me interested in data science and machine learning. And, you know, at the time there were lots of, you know, freely available books that I could take and I had the bandwidth at the time to take, which, you know, I don't so much anymore. But it really was a time of great learning and just getting comfortable and getting that foundation, right? And I already knew how to code, but really being able to dig in to the machine learning piece, and then also having the opportunity to apply what I had been learning to my dissertation work, was really great. So I really think that from my doctorate, I've been able to relay that, I don't know, that love of learning and having that solid foundation in sort of science and experimentation, as well as machine learning and data science, and being able to relay that into my current role in consumer health now. So I guess to round out the story, so, right, you know, I was doing my doctorate and started hearing about data science and machine learning and I was like, what is this? And, you know, I love algorithms. I love the idea of taking data or numbers and being able to extract, you know, insight and wisdom from them. Like, I love that idea of just, you know, cranking the numbers through, you know, an algorithm and getting something meaningful. So, you know, this sort of sounded up my street. Right? And, you know, I did what any grad student at the time would do, which is learn more about it and allocate way too many hours to learning about machine learning and data.
Everything that I could get my hands on that I thought might be relevant to my dissertation, I started learning about. And so that was really, really fun. And my advisor was thankfully understanding. And yeah, I think from then on I sort of, you know, knew that this is what I'd like to do when I graduate. And so, having decided that academia wasn't quite for me, I wanted to continue this kind of work in an industry setting and wanted to leverage my background in biomedical engineering as well. So pharma seemed like the best fit for that. So I was like, you know, I'm going to apply to work for some pharmas in a data science context. And thankfully, you know, GSK took me on after an intense interviewing sort of period. So I applied for the FLP position, so that's the Future Leaders Program. So it's just a three-year rotational program that allows me to rotate throughout GSK. And, yeah, after a couple of weeks of that interviewing, you know, I got an offer, so I took them up on it. And, you know, it's been really good, you know, being in the program, working at GSK, learning so much about pharma, how we manufacture drugs and how, you know, how much work it takes to do that. And, you know, there are lots of opportunities for data scientists throughout the business. So, yeah, that's pretty much how I ended up, you know, here at GSK. And it's been a really, really good ride. And I can't wait to learn even more.
John: Right. So that really is fascinating, and it's amazing they took two weeks to interview. After talking to you for ten minutes I would hire you, so I'm surprised they took so long. But I guess they have to be thorough.
Lorre: Yeah. A lot of behavioral interviews and tests. It's quite intense, but it is worth it. It's good, you know.
Lorre: Yeah. Just about. So it's actually three years in total. My first rotation was in pharma R&D and that spanned a year and a half, technically. And then I went to the UK and worked in manufacturing for a year, and then I came back and now I'm with consumer health. So all of these, you know, movements throughout my rotation have been driven by me, but also by my home line manager, who's been very supportive. And it's been really great to see different facets of the business through this program, essentially. So I've been doing data science in, you know, all of these different facets. And yeah, it's been really, really, really a great ride. Just working with different people, you know, learning their ways of working and, you know, being able to bring it back to my home group.
John: Right, well, this is really interesting to me because I was curious what you think about sensory and consumer science compared to, say, the manufacturing or pharma side. What are the things that you see that are unusual about sensory science in the way that we approach our problems or the sorts of problems we're trying to solve? What's similar to what you've seen in your other areas, and what would you say is kind of more unique to our field?
Lorre: I, yeah, I guess I would say, you know, sensory is very similar to R&D, sorry, pharma R&D, in the sense that, you know, it can be rather exploratory in nature and, you know, people are not afraid of innovating and taking chances and just, you know, playing with a tool even and, you know, trying to see how far they can get with using the tool, incorporating it with legacy infrastructure. So, yeah, I think pharma R&D and sensory and consumer have very similar cultures, I would say, in terms of data and machine learning. I would say very open, right? In contrast, at least based on my experience with manufacturing, I think you're talking about engineers who are used to, you know, following a prescribed protocol to get the same thing, right? And so, you know, instead of thinking about wide data sets, now you're thinking about, you know, very narrow data sets, but of the same things.
Lorre: So you're producing the same product. Maybe you're tweaking. When I say narrow data sets, I mean, like, narrow parameters, you know, controlling all of your parameters to produce consistent products. Right? So I think that translates a bit into the culture, too. So perhaps a little less innovative, perhaps, I wouldn't say less innovative, but less keen maybe to try different tools or algorithms. You know, the bar is higher to prove that you're going to deliver on savings and resources, essentially.
John: Right. Well, the risk profile is different, right? Because if something goes wrong on the manufacturing side, it would be extremely bad.
Lorre: You're talking about patients, yeah, people's lives are at risk, potentially. So.
John: Right. Whereas, you know, on the consumer side, as long as the product is safe, whether it's, you know, this flavor or that is not going to, yeah, it’s a different risk profile. So I think that encourages different behavior. That's very interesting. Yeah. One of the things I enjoy about sensory is how flexible we are as a field and how, you know, I think, open we are to new ideas. So, well, in that vein, something that I think people are very open to and asking about a lot is data science within sensory. And so I think it would be good to hear, you know, first your thoughts on what are the tools from data science that you think are most important for sensory scientists to start to learn, either ways of thinking or, you know, specific, you know, ideas? And then I'd like to talk to you a little bit about the, I wouldn't say R versus Python, but the R/Python debate, because that's something else people ask me. "Do I really need to learn Python?" is a question that I get.
John: Start with ideas from data science that you think sensory scientists should at least be aware of. What would be some of the big-picture ways of working?
Lorre: Yeah. I think the first thing that comes to my mind is that, you know, with the hype around AI and machine learning, just, I guess, being a bit more savvy about what AI or machine learning actually is and what it can do, the power of it, but also, like, the limitations, because I feel like the limitations, what you can't do with it, you know, that’s rarely covered. I think even the ethics around machine learning are, you know, gaining more hype, I would say, these days, still coming to light, which is great. But I think if, as a sensory scientist, you're coming in and maybe you're not as familiar and you're not as comfortable with some of the ideas underlying AI or machine learning, you don't know what it is, I think it would be great to just, you know, find very light, high-level, you know, courses or blogs, blog articles, whatever material, just to come up to speed. Because I think that, you know, you want to demystify it. Like, at the end of the day, you know, it’s not that complex. It's not so complex that you can't understand it. There are, you know, practical deep learning courses that require no more than high school math, basically, to learn. Right? I mean, once you know how to program. So I just want to, I guess, reiterate that it's accessible to everyone. And, you know, there's a lot of hype around it. There's lots of power, of course, to these algorithms and these techniques. But there are limitations and they can't do everything. And it takes a long time to develop these machine learning models such that they are robust and reliable. You know, we both know, like, it’s not a magic bullet. Right?
Lorre: Not a quick one anyway. Yeah, I think that's probably one of the things that I would say. I think being aware, but then, I guess, on the flip side of that, so being aware of the limitations of machine learning, but also knowing, you know, where to pull on this tool. Right? Like, this is what we need right now. You know, when we're talking about automating, you know, cleaning data sets, you know, you have the standardized data set. You want to capitalize on the power or the capability of, you know, this is more programming rather than, you know, AI or machine learning. But, you know, those things could come into play as well, so that you can free up the time for sensory scientists. Right? Like, so they don't have to spend time, you know, combing through Excel sheets. You know, wasting valuable time. You know, these are experts in the field. Right? We want them to spend their time, you know, creatively thinking about, you know, their problems in the domain, not doing trivial tasks. So, I think the way to approach or think about, you know, data science is really keeping that in mind, that, you know, it's not to replace anybody's job, but it's to help, and it's artificial intelligence, so, you know, just keeping that in mind. You know, it's not here to take anything away from anybody, because there are so many things that, you know, AI can't do that humans can. And so I think not to feel threatened, I would say, don't feel threatened. You know, it's going to help you be more effective.
John: Right. Now, I totally agree with that. In fact, I would say for me, one of the red flags with AI startups is when an AI startup promises to replace a department or replace a function. Yeah, it's not going to be like that.
Lorre: Yeah, that's just like overhyped of AI in a model and you know, it really I don't know, sometimes it's so disappointing because people, you know, really take that to heart and then, you know, you design a program or script to help folks do their job better and they're really resistant, you know, and a bit threatened by it and so it's like, no, I don't want to replace you. There's no replacing you. There's just helping you get through your job faster. And yeah, that's I don't know. I'd love that somebody wants to help me get through my job faster to be more efficient. So yeah, that's what I would say about that.
John: Alright, well, now let's get into the R versus Python, because I think it is a good discussion and you're one of the first people on the show I can actually have the discussion with. So, what would you say, okay, so you've got a sensory scientist and they've heard, first off, how would you characterize the difference, you know, as far as R and Python? Because they're both very popular tools and languages for data science. Based on what you've seen for sensory science, what would be the pros and cons of the various languages for people trying to pick what to learn first? What would you say to them?
Lorre: Well, I would say that it really depends on the context, and for sensory, since, you know, sensory was sort of born out of statistics and, you know, using R, a lot of the packages exist there. Right? That is sort of the language of the domain. Right? So, I think that counts for something. And if you have a team, you know, in sensory and all of your packages, all the innovations are happening in R and they're not happening primarily in Python, well, I think that's a big reason to utilize R over Python. Right? And so my experience, so I started, yeah, I started learning MATLAB and then I started learning R because there was a statistics class that exposed me to R and I was like, oh, this is really nifty. You know, it's really easy to learn. Right? And so I think that transition, since I had prior programming experience, from MATLAB or even Java before that to R was very easy. I found it incredibly easy to pick up. And so the learning, right, it's a very easy curve. Right? Learning curve. Whereas with Python I remember struggling with it at first. I remember being very confused about all the data types at first because, initially, I didn't take a more comprehensive course. I was just sort of, like, hacking at it and doing one-off things. And so I remember getting some errors and not understanding what was going on and feeling like the documentation was lacking. So I think it depends on what your background is and where you're coming from. R is definitely easier to pick up. And honestly, if you can start picking up those programming sort of foundations earlier, no matter what language they're in, I think that's great. That's a step up. I would say that Python is the primary language for data science. And, you know, that's where the innovation is on the data science, machine learning, AI front. All of it's basically happening in Python.
I think, well, a lot of it is. I think that you cannot be a data scientist now and not know Python, essentially, or it's harder outside of certain domains.
John: If that's your job, you're a data scientist proper.
Lorre: Yeah. If you're a data scientist proper and this is all you do, there's no way around it, you have to know Python. Like, I just can't imagine it. But yeah, I think Python, you know, it's a more traditional, you know, programming language, and so all the, you know, things you want to do with object-oriented programming, for instance. So, yeah, I love Python. It's really, really great. But one of the things that Python is not good at, you know, or, well, I'll say R is better at, is visualizations. Right? Like, if you're going to visualize plots, you're going to create PowerPoints, whatever it is that you're going to do, it probably looks better in R. That was the whole reason that I decided to learn R, because I was sick and tired of ugly MATLAB. So yeah, I was like, oh, I can't do this anymore. And then I, you know, I was, like, messing around and then I was like, how do I get, you know, publication-ready figures, you know, just very quickly? And I was spending so much time creating facets, essentially, in MATLAB, and it's like, this is just the wrong language. I think MATLAB has since improved. But, you know, some five years ago it was awful for visualizations. And so R is really nice for that and it still is. It's super easy to use.
John: Okay, kind of like a lighter weight. I mean, you're right, R is not a true programming language, it's a programming environment. I mean, I think it never really claimed to be a programming language. So I think that for us sensory scientists who want to get up and running, they want to start to bring in ideas from data science.
Lorre: Exactly. Yeah.
John: You know, it seems to me to be a good place to start. Now, at some point, people may decide their true calling is to become a data scientist. And at that point, you know, or when they want to really start to get closer to the data science core. You know, like, for example, someone like you, you know, you code mainly in Python at this point, would you say?
Lorre: I do. Yeah, that's great.
John: Yeah. So I think that's a fair assessment. Yeah, it's a big question, you know, how much do people really need to learn to code?
Lorre: I was just going to say, how much to learn to code. I think these days it's getting easier and easier and, I don't know, there are so many blogs and Stack Overflow, all these resources that people can access now and just copy and paste their code and be up and going. And that's, I don't know, that's great. I found R really easy to learn, in a way that I didn't quite find Python. I had to invest a lot more resource, I think, and time to really get up to speed.
John: Right. Yeah. Well, that's great. We can talk about that all day, I'm sure. Yeah. I mean, I've been increasingly programming in Python and I have to say, for my brain it actually works better because that's kind of the way I think about things, you know. But, like, I think I have a kind of unusual background in that I'm coming from math, you know. So I think, the thing is, I think that there's more abstraction in Python. And so it's clearer. Yeah, it's not as literal. I think R is a little more literal. Like, the way you're thinking about things is the way you write them. Whereas I think in Python it's a little bit more pre-organized so that you'll get, yeah, I mean, there's a whole culture of computer science. I mean, a lot of it is based around performance and readability and, yeah, you want to make sure things work in production. You know, there are a lot of kind of practical considerations that computer scientists have worked out over time and that are, you know, kind of naturally built into Python, and R is not really built that way. R is built so that the way you're thinking about things is the way you write it, and then it runs. Yeah. So it's interesting. Okay, let's talk about two topics that you would want to make sure we get to, because something I'm really interested in is the way that technology is helping us to bridge the gap between quantitative and qualitative research. And I know that you're doing some really interesting, kind of, before the show, we were talking about some of the applications, and speaking of Python, this is a good use of Python: applying NLP, natural language processing tools, to transcribed, open-ended, or focus group data. So I think it'd be interesting to hear your thoughts on what are some of the things that you can do when you start to use natural language processing and apply it to text data in sensory?
What are some of the things that become possible when you start to use these tools?
Lorre: So much, I mean, becomes possible, I think. I think primarily just being able to ask more insightful questions, I would say. I mean, you're basically starting out with your text and maybe you're removing stop words and such, and once you've done that, you can get to maybe doing some topic modeling and trying to discover various ideas throughout your text, right? So I think you could do LDA, more traditional topic modeling. But then there's also CorEx topic modeling, which I found works quite a bit better. It gives more intuitive topics, I think.
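To make the preprocessing step Lorre describes a little more concrete, here is a minimal Python sketch of stop-word removal followed by a simple term count. It uses only the standard library. The stop-word list, the `tokenize` and `remove_stop_words` helpers, and the two comments are all invented for illustration; they are assumptions, not anything described in the interview, and a real project would use a fuller stop-word list and then pass the cleaned text to a topic modeling library.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "was", "but", "to", "of", "i"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def remove_stop_words(tokens):
    """Drop very common words that carry little meaning on their own."""
    return [t for t in tokens if t not in STOP_WORDS]

# Hypothetical open-ended comments, invented for this sketch.
comments = [
    "The mint flavor is fresh and the packaging is easy to open.",
    "I liked the flavor, but the label was hard to read.",
]

# One cleaned token list per comment, then a corpus-wide term count.
cleaned = [remove_stop_words(tokenize(c)) for c in comments]
term_counts = Counter(word for doc in cleaned for word in doc)

print(cleaned[0])
print(term_counts["flavor"])
```

The cleaned token lists are the kind of input a topic model would typically start from; the term count is just a quick first look at what the respondents talk about most.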
John: And for our listeners, just so they know, topic modeling is figuring out the topic that's being discussed, is that the idea?
Lorre: Yeah. Exactly. So you're discovering themes throughout your text, right? So if I have a corpus of news articles, you know, and I put it through a topic modeling algorithm, I'm able to pick out, you know, maybe the sports. Maybe there's some stuff on biology. Maybe some stuff about the elections. Right? You're going to be able to group all of those news articles, or snippets of those news articles for that matter, together, you know, automatically, without explicitly programming that into your code. So it's really nice to be able to do that if you have lots and lots of data. Right? So instead of just manually reading through each snippet of text, you know, you're having the program sort of allocate different classes to the text. So that's pretty nice. I think that can be helpful in the sensory context.
John: So if you have focus group data, you might have an hour's worth of transcribed data where people are talking about some product and then you run your algorithms and you're able to figure out, okay, at the beginning they were talking about the packaging and..
John: And talking about, the way it tasted and then whatever, you know. So you can get an understanding automatically.
Lorre: Yeah. That's exactly right. So, you know, you're able to pinpoint what's being spoken about by the theme or the topic and when that happened, and, you know, then you could take these topics. So let's say, you know, they're talking about packaging or labels, and you can analyze that group of text, right, related to packaging and labels, for, you know, whatever it is you might analyze. So one thing you might analyze for is sentiment. So that's another thing that's been pretty helpful in the sensory space. Right? So, say you want to know how people feel about your packaging and labels. Right? And you do your survey, you want to know if people are excited by it, you want to know if they're disappointed by it, perhaps. You know, there are various sentiments or emotions that you can measure in the text using NLP. And so that's, I don't know, I think that’s helpful to know. And then I think the next obvious question then is, well, why do they feel the way that they do? Right? And sort of getting into that answer as well. So being able to dig, maybe this is where you're reaching the limit of the data that you're analyzing. But I think it's basically taking an iterative approach to drill down and reveal this insight, this wisdom, right, from the data, and just taking a first pass, you know, maybe you do your topic modeling just to see what's there in your data and then, yeah, then again going into sentiment, or what else could you do? You know, sort of semantic relationships between words and how they're related to each other. Are they co-occurring together? And if they are, that must be meaningful. So why is it meaningful? Are there examples that we can pull out to illustrate why, you know, these words are co-occurring together consistently, right, throughout the interview? So, yeah, I like to think of it like a several-pass kind of look at the data to reveal something, you know, hopefully amazing.
Some insight there. Something meaningful.
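The co-occurrence pass Lorre mentions can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names `cooccurrence_counts` and `examples_with` are hypothetical, the tokenized comments are invented, and "co-occurring" is taken here to mean "appearing in the same comment", which is one of several reasonable definitions.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents):
    """Count, for each unordered word pair, how many documents contain both words."""
    counts = Counter()
    for tokens in documents:
        # Sort the unique words so each pair has one canonical (alphabetical) order.
        unique_words = sorted(set(tokens))
        counts.update(combinations(unique_words, 2))
    return counts

def examples_with(documents, word_a, word_b):
    """Return the documents that contain both words, to inspect why they co-occur."""
    return [tokens for tokens in documents if word_a in tokens and word_b in tokens]

# Hypothetical comments, already tokenized with stop words removed.
docs = [
    ["label", "hard", "read"],
    ["label", "small", "hard", "read"],
    ["flavor", "fresh", "mint"],
]

pairs = cooccurrence_counts(docs)
print(pairs[("hard", "label")])             # number of comments mentioning both words
print(examples_with(docs, "label", "read"))  # the comments behind that count
```

Pulling the matching comments back out, as `examples_with` does, is one way to take the "several pass" approach: count first, then read the examples behind a frequent pair to understand why the words travel together.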
John: Right. And that tells me you use the word wisdom a few times. I think that's really good. The idea that there might be wisdom in the data and you can find it.
John: Yeah. I think it's interesting, you know, someone pointed out to me recently that when someone's reading, when you're reading an open end, reading all these open ends, you have your own biases that you're bringing, whereas potentially the algorithm is going to extract things that you wouldn't see because of your own interpretations that you're bringing. But there's a chance to get kind of an outside view on the data.
Lorre: Yeah, that’s a really good point. And that's one of the reasons that, you know, you would want to use it. Right? I think taking a more objective approach is really good. I think at the same time, you know, the limitations of some of these techniques come into play, because it's like, you know, you're limited to the data that you have, and some words, intuitively, you know, when I read that somebody says, oh, you know, "How are you doing?" You know, like, "I'm chill" or something like that. You know, it doesn't have anything to do with, you know, being cold, and it's not a negative. Right? But, you know, your technique may not pick that up and may, you know, attribute it to be something negative and have a negative sentiment.
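The "I'm chill" limitation can be shown with a deliberately naive, lexicon-based sentiment scorer. The sketch below is an assumption-laden toy, not a method described in the interview: the `LEXICON` dictionary and its scores are invented, and `lexicon_sentiment` simply adds up word scores with no awareness of context or slang.

```python
import re

# Toy sentiment lexicon, invented for illustration (not a published resource).
# "chill" is scored as negative because of its literal association with cold.
LEXICON = {"love": 1, "great": 1, "fresh": 1, "awful": -1, "cold": -1, "chill": -1}

def lexicon_sentiment(text):
    """Sum the lexicon scores of the words in the text.

    A result above zero reads as positive, below zero as negative.
    Words missing from the lexicon contribute nothing.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)

print(lexicon_sentiment("I love the fresh taste"))  # → 2
print(lexicon_sentiment("I'm chill"))               # → -1
```

The second result is the failure Lorre describes: a relaxed, neutral-to-positive remark is scored as negative because the scorer only sees the word, not how it is being used. Context-aware models reduce this kind of error, but checking a sample of scored comments by hand remains a sensible habit.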
John: Right. Well, that's right. And I think that the techniques keep getting better at handling data like that.
Lorre: For sure.
John: Yeah, it's good. Well, amazingly, Lorre, we have reached half an hour. I can't believe it. I could talk to you very easily for a long time. We didn't even get a chance to talk about graph data science, so we'll have to have you back on the show.
Lorre: I know. I would love to come back. Love to be back and talk about graph data science. I can talk about that all day.
John: Awesome. Alright, great. So now if someone wanted to reach out to you after the show, what's the best way for them to get in touch with you?
Lorre: LinkedIn. LinkedIn would be great. I'd love to connect with anyone on LinkedIn.
John: Okay, sounds great. And we'll put the link in the show notes. So they can find it there.
John: Any last bits of advice for our listeners today?
Lorre: Oh, I don't know, just be curious. Be curious and learn anything that you feel like excites you. Just go with that momentum and learn even if it takes, you know, an hour, five minutes, whatever, man. Just be curious. Go for it.
John: Totally agree. Okay, excellent. All right, thanks a lot, Lorre. This has been great.
Lorre: Thanks, John.
John: Okay, that's it. Hope you enjoyed this conversation. If you did, please help us grow our audience by telling your friends about AigoraCast and leaving us a positive review on iTunes. Thanks.