Benjamin Cabe - On the Edge
Welcome to "AigoraCast", conversations with industry experts on how new technologies are transforming sensory and consumer science!
Benjamin is a technology enthusiast with a passion for empowering developers to build innovative solutions. A long-time open source advocate, he co-founded the Eclipse IoT Working Group in 2011 and grew, from scratch, a vibrant open-source community of hundreds of developers and dozens of deeply engaged companies. He is currently working at Microsoft as a Principal Program Manager for Azure IoT, where he is leading developer engagement initiatives with some of the top communities and companies in the embedded, AI, and open hardware space.
Transcript (Semi-automated, forgive typos!)
John: Benjamin, welcome to the show.
Benjamin: Well, thanks, John. Thanks for having me.
John: It's really a pleasure. And I'm excited, actually, to have you talk to our audience because I think you're bringing a different perspective from a lot of our guests. So it's really nice that you could join us today. Okay, so, Benjamin, I think it would be good if people knew a little bit more about your background, because a lot of the time people on the show are working as sensory scientists or academics. But you're at Microsoft and you're working in the Internet of Things space. So could you take us through your journey into sensory science and how it is that you've ended up in this line of work?
Benjamin: Yeah. So, I'm a software engineer and I've been in the IoT space for as long as I can remember, and in IoT and open source as well. So, my journey has always been to try and empower people to use IoT to connect their things, right? And there are many things that you need to do when you want to connect things. You need to work on your communication protocols. You need to do embedded development. There are some cloud development skills involved. And so, yeah, in general, my role even prior to joining Microsoft has always been to try and create content like tutorials, demos, articles, you name it, that can showcase some of the interesting applications of IoT, right? IoT, the Internet of Things, is not only about, I don't know, the connected smart collar for dogs or those sorts of consumer gadgets. There are tons of applications, right? And I guess that led me to explore a bit the intersection of AI, IoT and sensors back in May last year, right? And that's how we got in touch, right?
John: Yeah. Well, I think, okay, sensors, that concept is fascinating to me. And in fact, the idea of a sensor is a very general concept. I was actually invited recently to contribute a chapter to a book on sensory where they were interested in voice activated technologies, thinking about smart speakers as a type of sensor that's out in the world where you can collect data. So can you explain a little bit more to our audience, please, about IoT, the Internet of Things, in case anyone's not entirely familiar?
John: Maybe let's just take a step back and you can explain what the idea of the Internet of Things is. What are the key components? What should our audience know about the Internet of Things?
Benjamin: Yeah, that's actually a really good question. So, as a general term, the Internet of Things is this idea of connecting the physical world to the digital world, right? And so, if you're in manufacturing, you've been operating your factory for decades and you have existing processes, et cetera. But what if you could complement those existing processes with sensors? Sensors deployed in the field are able to give you some signals on what's happening, right? There is a vibration that's being picked up on a particular machine, or there is a temperature that is going maybe too high. So, the Internet of Things is a set of techniques that enable the connection of physical sensors to your backend environment, like your good old IT systems. So, in order to make IoT possible, like I briefly mentioned in the intro, it turns out you need tons of building blocks, right? You need the Internet communication protocols that actually allow you to turn those signals, the temperature that you're measuring, the humidity, the pressure and whatnot, into actual Internet packets that you can route securely to your backend environment, probably in the cloud. You also need everything for securing the solution, because if you're operating a factory, you don't want your trade secrets to be something that people can access, right? If your sensor data is something that anyone can eavesdrop on, then you might be in trouble, right? So yeah, IoT in general is this idea of collecting signals from the physical world, and it actually goes both ways. Something that might be out of scope for today's conversation is this notion of actuators. Once you've collected information from your temperature sensors and so on, you collect this data in your information system. You run some workflows, you make some decisions, essentially. And those decisions might be turned into physical actions that you want to have in the field, right?
You want to shut down a machine. You want to open a water valve based on something that you've just computed and that you've enabled thanks to your sensors in your IoT infrastructure, right? Does that make sense?
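The sensor-to-decision-to-actuator loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Reading` type, the `decide_action` helper, the machine name and the 80 °C threshold are all hypothetical and do not come from any particular IoT platform.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold, chosen only for illustration.
MAX_TEMPERATURE_C = 80.0


@dataclass
class Reading:
    """One temperature measurement reported by a sensor on a machine."""
    machine_id: str
    temperature_c: float


def decide_action(reading: Reading) -> Optional[str]:
    """Turn a sensor reading into an actuator command, or None if all is well."""
    if reading.temperature_c > MAX_TEMPERATURE_C:
        # The decision flows back to the physical world as a command.
        return f"shutdown:{reading.machine_id}"
    return None


print(decide_action(Reading("press-7", 92.5)))
print(decide_action(Reading("press-7", 41.0)))
```

In a real deployment the reading would arrive over a secured communication protocol and the command would be routed back to the device; both are left out here to keep the sketch self-contained.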
John: Yeah. And it actually has, I think, tremendous overlap with sensory science, right? It isn't just a coincidence that sensory is about sensation, that we're talking about perception, measurements that humans...
Benjamin: Are reacting to that.
John: Right, exactly. And we're actually acting in response to the things we're perceiving. I think that takes us now to maybe the reason we're on this call, which is, you know, when you look at the history of technological advances, it isn't always the people who are in academia who make advances. Oftentimes what happens is people are experimenting and inventing. You look at the industrial revolution, just how many advances were basically made by people who were tinkering with things. They got something to work and then later the engineers figured out why it worked, right? It's very common that the research will follow the invention, not the other way around. And I think you have done something pretty inspiring for somebody who is versed in computer science and not as familiar with sensory, to make progress in the area of artificial intelligence. So, can you talk to our listeners about your area, please?
Benjamin: Yeah. Early during the lockdown situation, back in May of 2020, I was stuck at home trying to perfect my bread recipe, sourdough bread. And kind of at the back of my head was this idea that I wanted to build some kind of IoT solution, and I wanted to play with gas sensors to figure out whether there could be a way to correlate, how would I call that, the olfactory fingerprint of my sourdough starter with the quality of the bread that I would potentially get out of a particular sourdough starter, essentially trying to figure out when the sourdough starter would be perfectly right. Like you mentioned, I had no idea about sensory science or about gas sensors in the first place. But my intuition was that somehow there must be a correlation between the amount of alcohol, the amount of volatile organic compounds, the amount of carbon monoxide that you can sense out of a sourdough starter and its quality, right? If a sourdough starter doesn't smell alcoholic at all, then it's probably not ready, right? Because it means that the fermentation is not at its optimal state just yet, or it might actually be the other way around. I'm not a bread expert either, but I could sense that there was a correlation, right? And so, I wanted to build a device that would allow me to do that. And back to the Internet of Things situation, I guess there would have been two possible routes for me to try and experiment. One would be (a) let's build a device that takes gas sensor data, streams the data and feeds it into the cloud using Wi-Fi or whatever, and then we'll figure it out. We'll do some AI, some neural networks, and we'll run that in the cloud. Or (b), what I actually wanted to experiment with was: is it possible to have the brain of the solution effectively running on the device itself?
That's what some folks would call tiny ML, tiny machine learning: using a microcontroller, like a really dumb piece of silicon, feeding it with the sensor data, feeding the data through a neural network, and then trying to make some guesses and establish some kind of correlation between the sensory input and, essentially, the predicted quality of the bread that a particular sourdough starter would potentially give me.
John: Yeah, I mean, that's fascinating. So, I mean, I could see the idea of edge computing.
Benjamin: Yes, exactly.
John: That you've got these calculations happening. Okay, so can you take us through your journey then? So you got started, you started making these measurements. Now you had some labeled data. Is that correct?
Benjamin: Yes. Correct. Yes. I think one thing that I should mention is that I've been talking about neural networks and AI a couple of times already. As of last year, I knew literally nothing about AI and neural networks. And if you remember, I'm a software engineer, I should know better, except that I don't. The concept of AI quickly turns into math, right? It turns into manipulating matrices. And that was just too much for me for some reason. And I was like, hey, maybe I'm not the only one. If I can get my head around neural networks then there might be a lesson there, and I'd better get to the bottom of it and figure out whether I can teach myself some AI with a concrete use case, with actual sensors. When you look at AI tutorials or examples out there, it's always more or less around image recognition, trying to classify handwritten digits and things like that. It should be simple, it is very simple, and yet it was still too abstract for me. Manipulating pixels was just too abstract. And so, my journey started essentially, like you said, with acquiring some sensory data, using off-the-shelf, super cheap (and we can get back to that later) gas sensors that would allow me to get VOC, alcohol, nitrogen dioxide and carbon monoxide measurements. So, I started to acquire data and to smell a bunch of things, not a sourdough starter. I actually realized that, early in the pandemic, there was hardly any flour on the shelves. So, my idea of training a model against dozens and dozens of sourdough starters, I realized that maybe it wasn't that good of an idea. But I had some smelly things around. So, I actually started smelling things like whiskey, coffee, vodka. Those were my initial samples.
So, I started to collect some data and then I used some tools that are freely available online. One is called Edge Impulse, and essentially it's a complete environment that allowed me not only to collect all the data and label the data, like you mentioned, but it also guided me to build my first neural network, because I had no idea what a neural network was. I didn't want to use Python because I'm not super fluent in Python, things like that. It was, I guess, too much for me, so I ended up being guided by the online tool, Edge Impulse. And it allowed me to build my first fully connected neural network, which I figured out later was actually way simpler than what I thought. It was essentially just a way to solve an equation where the input would be the gas sensor data and the output would be the handful of labels, the handful of smells that I wanted to classify. And yeah, you feed the data through the tool and do some initial training of the model, and within a couple of hours I already realized that I could get really good accuracy. So, yeah, that was interesting to see. I was like, really, it was that easy?
John: Interesting. So, I suppose in this case, the samples are fairly static. So, there isn't really a temporal component. Right? And so, you took different snapshots in time and labeled them and how many actual rows of data were involved?
Benjamin: That is a really, really, really good question. So back to my sort of intuition. I was like, okay, what characterizes a smell? I'm no chemist. I'm no sensory scientist. But what characterizes a smell, if I have access to a sensor that can smell NO2 and volatile organic compounds, whatever, I missed one? Surely the concentration of each compound is some kind of information. But I might need more. Like, if I am to smell whiskey for a couple of seconds, the amount of alcohol will slightly vary, right? And this variation might be slightly different for whiskey than it will be for vodka. So, to your question about the way I'm sampling the data, I figured that just looking at a two-second time window and acquiring data at ten hertz, to be super specific, so acquiring 20 samples for each gas during literally just a couple of seconds, was good enough for me not only to capture the average, but also to extract over this particular time window the minimum concentration of gas, the maximum, the root mean square, a bunch of other statistical characteristics which would essentially capture, at least that's what my intuition told me, the intensity of the smell, right? If you look at how much the amount of alcohol varies for vodka versus whiskey, it probably varies differently because vodka is stronger, kind of. And so that was my intuition. Long story short, it turns out that this approach was probably the right one, because the accuracy was really good and it also allowed me to not have false positives, right? I would really be able to capture the difference, even though two whiskeys would be fairly similar. The more peated one would register slightly differently on one of those characteristics, at least just enough so that the neural network could learn that key characteristic of such a smell.
John: So that means you did a little bit of feature engineering then? So you fed these computed statistics into the model? Because that's kind of my next question, the inputs. So, you took the data, you computed some statistics and then those statistics were the input to the model. So how many features actually went into the model?
Benjamin: Well, we can do the math together. So, it's effectively two seconds, well, I think actually one point five to be super accurate. So, one point five seconds of smell sampled at 10 hertz, and it's four gases. So that's 15 samples times 4 gases, so that's 60 raw characteristics. But for each gas, we don't really care about those. We actually just extract five characteristics. It's the average, the minimum, the maximum, the standard deviation and the root mean square, so effectively five statistical characteristics times 4 gases. So that's 20. Twenty features that we feed into the model, and then the output of the model is however many smells you want to look at. So, if you look at whiskey, coffee and ambient air, you have three outputs, right? And so what you want to correlate is those twenty features with the three smells that they may correspond to.
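The arithmetic above (15 samples per gas, 4 gases, 5 statistics each, 20 features in total) can be sketched in Python. This is a minimal illustration, not the project's actual code: the gas key names, the `window_features` and `extract_features` helpers, and the choice of the population standard deviation are assumptions made for the example.

```python
import math
import statistics

# Assumed key names for the four gases mentioned in the conversation.
GASES = ("no2", "co", "alcohol", "voc")
SAMPLE_RATE_HZ = 10
WINDOW_SECONDS = 1.5
SAMPLES_PER_WINDOW = int(SAMPLE_RATE_HZ * WINDOW_SECONDS)  # 15 samples per gas


def window_features(samples):
    """Reduce one gas's samples to five statistics: avg, min, max, std, RMS."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return [
        statistics.fmean(samples),
        min(samples),
        max(samples),
        statistics.pstdev(samples),  # population standard deviation (assumption)
        rms,
    ]


def extract_features(window):
    """Flatten a {gas: samples} window into 4 gases x 5 statistics = 20 features."""
    features = []
    for gas in GASES:
        features.extend(window_features(window[gas]))
    return features


# Made-up readings, only to show the shapes involved.
window = {gas: [1.0 + 0.1 * i for i in range(SAMPLES_PER_WINDOW)] for gas in GASES}
print(len(extract_features(window)))
```

The 60 raw values never reach the model directly; only the 20 summary statistics do, which keeps the input small enough for a microcontroller.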
John: Right and so the output is the probability of...
Benjamin: Correct. Yes.
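To make the shape of such a model concrete, here is a hedged sketch of a small fully connected network that maps 20 features to one probability per smell. The hidden layer size, the ReLU activation and the random, untrained weights are all assumptions for illustration; the conversation does not describe the actual architecture, and a real model would learn its weights from labeled data.

```python
import math
import random

LABELS = ("whiskey", "coffee", "ambient air")
N_FEATURES = 20
N_HIDDEN = 10  # assumed hidden layer size, not taken from the project


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias per unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(n_out, n_in, rng):
    """Untrained placeholder weights; training would replace these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(features, hidden_layer, output_layer):
    hidden = relu(dense(features, *hidden_layer))
    probabilities = softmax(dense(hidden, *output_layer))
    return dict(zip(LABELS, probabilities))


rng = random.Random(0)
hidden_layer = random_layer(N_HIDDEN, N_FEATURES, rng)
output_layer = random_layer(len(LABELS), N_HIDDEN, rng)
print(predict([0.1] * N_FEATURES, hidden_layer, output_layer))
```

The label with the highest probability is the predicted smell; with untrained weights the numbers are meaningless, but the input and output shapes match the 20-in, 3-out setup described above.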
John: And so then how many labeled samples were there in total that you trained the model on?
Benjamin: It doesn't require much, and this is where we can take the conversation in that direction. One thing that you may remember from earlier is that I used off-the-shelf, low-cost sensors, which means that they don't necessarily have a super great resolution. They aren't necessarily super well calibrated, which means that they will be really good at telling coffee apart from whiskey. But then you will tell me they are just so different anyway that maybe you don't even need a neural network for that. So, teaching a model to tell whiskey apart from coffee, you don't need much. You just need to smell maybe 10 seconds of each and you'll be good to go. Now if you want to tell a particular whiskey apart from another, then you need to capture probably something like ten minutes of data sampled at 10 hertz for each, so that you have different variations: the sensor being maybe straight on top of the neck of the whiskey bottle, maybe another variation where the sensor is slightly higher and picking up a bit more ambient air. And then, I mean, yeah, it might require more training time if your smells are more similar to one another, or it might require better sensors, which is always a possibility, right? And I know some of you listening today might be in that domain, things that are food related. If I were to train a model for picking up burnt food, ammonia is the key characteristic there. And I don't have an ammonia sensor in the mix. Maybe the NO2 sensor would pick up a bit of ammonia indirectly, but it's probably better to add an ammonia sensor in the first place if what I care about is really good accuracy for telling burnt meat apart from burnt popcorn, that kind of thing. So there's a bit of that too.
John: Yeah, well, it's fascinating because, I mean, the fact is, if you see the pictures of early automobiles, what they looked like, they were basically like you had two bicycles and an engine right in between. And you would never think that within 50 years they would be racing those cars. You know, it was a tremendous advance. And I think what you've done is really exciting because you put it together. Okay, it's fairly simple and maybe the categories are easy to differentiate. But I think that the idea of training models based on very little data and based on relatively easy to gather measurements, in a way that can run on the edge where the data don't have to go back to the cloud, is fascinating. Because there are tons of applications and it's just going to get better and better. You can see telling your oven what it is you're cooking, and the oven will know when the thing is burning based on what it is. What is burned fish like? Yeah, and it also could be that you have a glass, it's a smart glass, and you're going to pour something in the glass, and maybe there's an ideal temperature that different beverages should be served at. And your glass is smart enough, it has a little heater inside it, and it warms or doesn't warm the glass because different spirits are going to perform differently at different temperatures. So, whatever it is that's poured into the glass, it's going to be the right temperature when you go to drink it. So, I think that you're opening a lot of doors with this. I think it's very exciting. So, let's talk a little bit more, because I do want to get into some of your ideas about the Internet of Things and smart packaging. You know, you have some comments on supply chain.
So, what are some of the applications that you see? I mean, I know this isn't your day to day work, but when you look at the consumer packaged goods industry, where do you see the Internet of Things providing value, first right now, like what's ready to go, and then what do you see as things that are on the horizon in the next few years that people can start to look forward to?
Benjamin: Yeah, I mean, if you look at a device like, quote unquote, an artificial nose, yes, it can smell things and it can tell you what it smells. But if the device is just something that works on its own, with no connection to anything whatsoever, it might not be super helpful. But if you think about this sort of smart nose and you think about the applications when you connect it to the Internet, it would be things like, and I actually spoke to some people who are in the business of coffee cupping and sourcing coffee beans, right? At some point during their sourcing of coffee beans or chocolate beans, there are actual experts whose job is to make sure that there's no defect, no mold and things like that in the coffee, right? But if you think about a connected nose, what happens is that anywhere in the supply chain, you can monitor the olfactory fingerprint of the beans, and not only can you send that information to the cloud in real time, but more importantly, that signal is something that you can feed into your existing processes. If you are in the business of sourcing coffee beans, surely you know what to do when a particular container arrives with beans in it. You already have processes in place that you can trigger in your ERP environment, SAP, whatever it is you're using. And so, you can automatically trigger those workflows and essentially enable, beyond IoT, and it might seem like a buzzword, what's referred to as digital twins. You have existing information systems and then you have the physical world, and the physical world is the containers in which the coffee beans are being moved around the world. It's the machines for roasting the beans, et cetera. Those machines can also join the digital environment. And you may want to reflect whatever is happening in the real world into your information system.
The nose is picking up mold. Where is the nose? Oh, it's in container X, Y, Z in the middle of the Atlantic Ocean. And by the way, where are those beans coming from? They come from Venezuela, from this particular supplier. Let's send them an email straight away to set up a meeting to discuss with them the fact that they just shipped something wrong to us. And so that's the kind of scenario. All things in the food industry, all things security as well, like monitoring. I'm actually thinking of COVID. In an office building, you can have a smart nose that not only picks up whenever the restrooms might require cleaning because the air doesn't look and smell as clean as it could, but supposedly, and there have been some papers lately, it might be possible to pick up the smell of COVID-19 in the air. And so whenever you pick up that particular smell in a particular building, you can trigger all sorts of workflows. Again, you can email the facility manager, you can alert people who went to the restroom that particular day, et cetera, right?
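The container scenario above can be sketched as a tiny "digital twin" record that mirrors a physical container and triggers a workflow when the nose reports mold. Everything here is hypothetical: the `ContainerTwin` fields, the `apply_nose_event` helper, the supplier name and the 0.8 confidence threshold are invented for illustration, and a real system would call into an ERP or a messaging service instead of building a string.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ContainerTwin:
    """Digital counterpart of one physical shipping container."""
    container_id: str
    supplier: str
    location: str
    last_smell: str = "unknown"
    alerts: List[str] = field(default_factory=list)


def apply_nose_event(twin, smell, confidence, threshold=0.8):
    """Mirror the latest smell into the twin and queue an alert on confident mold."""
    twin.last_smell = smell
    if smell == "mold" and confidence >= threshold:
        twin.alerts.append(
            f"notify {twin.supplier}: mold detected in {twin.container_id} "
            f"({twin.location})"
        )
    return twin.alerts


twin = ContainerTwin("XYZ", "example-supplier", "mid-Atlantic")
apply_nose_event(twin, "coffee", 0.95)
print(apply_nose_event(twin, "mold", 0.91))
```

The point of the twin is that the alert already carries the business context (which container, which supplier, where), so an existing workflow can be triggered without anyone looking up the device.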
John: Yeah, that's fascinating. Okay, Benjamin, I could talk to you for quite a bit longer, but I do want to ask for our listeners who are interested in getting started with IoT, maybe someone's inspired, what are some resources for starting to learn more about this area?
Benjamin: Yes, that's a great question. One thing that we didn't mention until now, I guess, is that for the artificial nose specifically, everything is on GitHub, everything is open source. So I actually do want to encourage anyone to check out the source code, to check out the model, to extend the model. I actually would love it if at some point, just like there are resources out there, like public datasets for image recognition, for real time video analytics, we had the same for smells. I don't think we have quite the same, like a public open database of smells. But it could be really useful for data scientists and for AI researchers if we could get there. So that might be the next step for the open-source project. That's a starting point to learn IoT through the angle of building a connected smart nose. Otherwise, there are tons of great tutorials. We have the Microsoft Learn platform where you can get started even if you know virtually nothing about IoT. We have some great courses. We can probably add some links in the description of the podcast, but there are great resources out there. And I guess I would recommend folks, especially your audience, to think about approaching IoT by really complementing it with this notion of tiny ML, edge computing, AI, right? Because those are complementary. You start feeding sensory data into some kind of machine, but then you want to do as much processing as possible on the tiny machine itself. And then eventually, once you've figured out what the actual piece of information is that's really the relevant signal, then you may want to send that signal to your IoT platform.
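One way to picture "process on the device, send only the relevant signal" is a filter that forwards a classification only when the confidently detected smell changes. This is a sketch under stated assumptions: the `events_to_send` helper and the 0.7 confidence cutoff are hypothetical, and the actual upload to an IoT platform is deliberately left out.

```python
def events_to_send(predictions, min_confidence=0.7):
    """Yield (label, confidence) only when the confidently detected smell changes.

    `predictions` is an iterable of (label, confidence) pairs produced on the
    device, for example one pair per sampling window.
    """
    last_label = None
    for label, confidence in predictions:
        if confidence < min_confidence:
            continue  # too uncertain to be worth transmitting
        if label != last_label:
            last_label = label
            yield label, confidence


# Made-up on-device predictions, one per window.
stream = [
    ("ambient air", 0.90),
    ("ambient air", 0.95),
    ("coffee", 0.50),
    ("coffee", 0.90),
    ("coffee", 0.92),
    ("ambient air", 0.80),
]
for event in events_to_send(stream):
    print(event)
```

Out of six windows, only the transitions would be transmitted, which is the kind of bandwidth and power saving that motivates running the model on the microcontroller.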
John: Right. No, it's fascinating. I can start to see many applications. I mean, for consumer research, you can really supplement your consumer research with measurements that are made in the home around the time that products are evaluated. There's really a lot here that's exciting. Well, we always like to conclude with advice for young researchers. What advice do you have for, normally we say sensory scientists, but what would you say to young researchers?
Benjamin: That's an interesting question. I guess, with my journey, I was super surprised to see how popular my tiny, silly pet project became, and one reason might be that it is actually that simple, right? There are probably many other ways to build an artificial nose, with super complex neural networks and way better model architectures and so on. But if you remember what I said, originally, with AI, I just couldn't get it. And maybe one reason I couldn't get it in the first place was that every time I was looking for content and for articles about AI, it felt like really complex technology to me. And sometimes it doesn't need to be. Sometimes even when you're solving what seems to be a complex problem, it might make sense to try and make sure that it actually resonates with people, and really focus on what problem it solves. And sometimes it might not be rocket science, but that's fine, right? I don't know. That would be my advice. I'm not sure I formulated it well. But I think that's the idea.
John: Yeah. Definitely work on problems that matter, and you can sometimes make big progress with relatively simple tools. Well, okay, tools that have been made relatively simple to use. I mean, I think under the hood some of those tools are fairly complex. But yeah, there's a lot out there for us to play with and I think we just need to get started and try to work on problems.
Benjamin: Yeah, that's for sure. Take whatever road feels the most tangible to you. My journey into AI was a bit unconventional, but at the end of the day I learned a lot about AI and hopefully I inspired a few folks along the way.
John: Well, you've inspired me and I'm sure you've inspired our audience, too. So, thank you very much, Benjamin. It has been great.
Benjamin: Thank you. Thanks for having me.
John: And one last thing before you go, we'll put the link to your LinkedIn page in the show notes, so if people want to reach out to you, they can find you.
Benjamin: That's great. Cool.
John: Awesome! Alright, thank you.
Benjamin: Thank you.
John: Okay, that's it. Hope you enjoyed this conversation. If you did, please help us grow our audience by telling your friends about AigoraCast and leaving us a positive review on iTunes. Thanks.