{"id":13717,"date":"2026-05-23T19:46:27","date_gmt":"2026-05-23T19:46:27","guid":{"rendered":"https:\/\/wildgreenquest.com\/?p=13717"},"modified":"2026-05-23T19:46:27","modified_gmt":"2026-05-23T19:46:27","slug":"putting-the-senses-in-ai","status":"publish","type":"post","link":"https:\/\/wildgreenquest.com\/?p=13717","title":{"rendered":"Putting The Senses In AI"},"content":{"rendered":"<p><br \/>\n<\/p>\n<div>\n<figure class=\"embed-base image-embed embed-1\" role=\"presentation\">\n<div style=\"padding-top:66.53%;position:relative\" class=\"image-embed__placeholder\"><picture><source media=\"(min-width: 960px)\" sizes=\"50vw\" srcset=\"https:\/\/imageio.forbes.com\/specials-images\/imageserve\/6a11f1533e7747feb1843045\/Digital-eye-\/0x0.jpg?width=960&amp;dpr=1 1x, https:\/\/imageio.forbes.com\/specials-images\/imageserve\/6a11f1533e7747feb1843045\/Digital-eye-\/0x0.jpg?width=960&amp;dpr=1.5 1.5x, https:\/\/imageio.forbes.com\/specials-images\/imageserve\/6a11f1533e7747feb1843045\/Digital-eye-\/0x0.jpg?width=960&amp;dpr=2 2x\"\/><\/picture><\/div>\n<div>\n<div class=\"bMqrj\">\n<p><span style=\"-webkit-line-clamp:2\" class=\"Ccg9Ib-7 _8XF2kHYM\">Digital generated image of multicolored particles forming eye shape against black background.<\/span><\/p>\n<p><small class=\"pGGCM2aD\">getty<\/small><\/div>\n<\/div>\n<\/figure>\n<p><strong> <\/strong>It\u2019s no secret that smart wearables are becoming a big industry, and in the context of that, the \u201cawareness\u201d of hardware through sensory apparatus is a big factor. The machines that see (in their own ways) and experience the world around them utilize sensory items like cameras and other analytical tools, to feed data into the LLM or brain of the system.<\/p>\n<p>I wrote last week about a doubling in the smart glasses sector last year, making that a bigger part of tech retail. 
Then there are all of these robotic applications in business, with automation making greater inroads into production, service jobs, and even things like janitorial work.<\/p>\n<p>&#8220;In a benign scenario, probably none of us will have a job,&#8221; said Elon Musk, the richest man in the world,<a rel=\"nofollow\" href=\"https:\/\/www.foxbusiness.com\/economy\/musk-predicts-ai-create-universal-high-income-make-saving-money-unnecessary\" target=\"_blank\" rel=\"nofollow noopener noreferrer\" data-ga-track=\"ExternalLink:https:\/\/www.foxbusiness.com\/economy\/musk-predicts-ai-create-universal-high-income-make-saving-money-unnecessary\" aria-label=\"according to reporting by Eric Revell at Fox Business.\"> <u data-ga-track=\"ExternalLink:https:\/\/www.foxbusiness.com\/economy\/musk-predicts-ai-create-universal-high-income-make-saving-money-unnecessary\">according to reporting by Eric Revell at Fox Business.<\/u><\/a> &#8220;There will be universal high income \u2013 and not universal basic income \u2013 universal high income. There&#8217;ll be no shortage of goods or services.&#8221;<\/p>\n<p>That\u2019s a pretty rosy projection, but the idea is not lost on many with front-row seats to this wave of advancement: that in the end, AGI will become so capable that it can do almost any rote task with a great degree of success.<\/p>\n<h2 class=\"subhead-embed\">Analyzing Progress<\/h2>\n<p>There\u2019s also the question of how we get there. I saw a panel at this year\u2019s April Imagination in Action event at MIT, where a group of accomplished people discussed how sensory AI is flourishing, and what models businesses use to push the envelope. 
(Disclaimer: April\u2019s IIA event is an annual conference that I help to facilitate.)<\/p>\n<p>In moderating the panel, our own Paul Liang of MIT\u2019s Media Lab asked the group where they think the common approach to AI has most gone off the rails.<\/p>\n<p>\u201cThe thing I&#8217;m actually most worried about is that as AI integrates with all the sensors that we have in our lives, from our watches to our rings to pens to glasses, it will know everything about us,\u201d said Alvin Graylin of Stanford, \u201cand if that data is not controlled by the user, we will at some point become controlled by whoever controls the platforms that owns that data, and I think our loss of agency is one of the biggest risks that we have as humans, as AI becomes more prevalent and as data becomes more available.\u201d<\/p>\n<p>Cinnamon Sipper, CEO of Godela, had this to say about the path to advancing AI:<\/p>\n<p>\u201cI don&#8217;t believe that the type of output that looks like, you know, general intelligence and physics reasoning will come about by scaling any one model the same way,\u201d Sipper said. \u201cI think, instead, being able to tackle complex physics problem-solving, bringing true physical reasoning into different AI models or different systems, will require a little bit more of a orchestration of different models, as opposed to any just one master general model.\u201d<\/p>\n<p>James Le talked about how things work at his company, TwelveLabs, where he is Head of Developer Experience. He pointed out how so many firms use a method involving big data and supervised learning that is more mechanical, less agile, and less based on teaching the model to understand.<\/p>\n<p>\u201cOur focus as a company is to take the other direction,\u201d he said, \u201ctraining the video natively on a lot of video content, building these communities that can understand temporal dimension, how spaces relate to each other through time. 
To that point about orchestration, I think it&#8217;s also super-important to view kind of a corpus level, video orchestration that can think about concept objects, activities inside the video frame, how they relate to each other, and then, when you ask questions about any specific entities or activities, you can actually derive the context graph, the knowledge graph.\u201d<\/p>\n<h2 class=\"subhead-embed\">Domain Expertise<\/h2>\n<p>In going over some of these more sophisticated tacks on AI progress, the panel kept returning to the question of whether to lean more toward explainable AI, or something different.<\/p>\n<p>Sipper mentioned the drawbacks of \u201cblack box\u201d systems, suggesting that \u201cpouring a bunch of data into a model, and hoping that it solves all sorts of problems, is a little bit of an intractable trade-off in value and investment right now.\u201d<\/p>\n<p>Le described combining data labeling, which is a big business, with domain-specific modalities, and Graylin expanded on that, noting the constraints of using video to teach robots:<\/p>\n<p>\u201cWhen you look at just using video, it&#8217;s not enough fidelity of information to train robots to do activities,\u201d Graylin said, \u201cbecause they don&#8217;t have pressure data, they don&#8217;t have directional data, they don\u2019t have details.\u201d<\/p>\n<p>He continued:<\/p>\n<p>\u201cThere\u2019s a lot of occlusion that happens when things are being done, when things are getting complex, and also very fine-grained positional data of objects and body parts and so forth, so if you&#8217;re looking at just training systems with a lot of video, it still won&#8217;t solve those kinds of problems. 
Having a combination of well-labeled data with alternative multimodal sensing, I think that allows you to then create the more sophisticated learning that you&#8217;re talking about.\u201d<\/p>\n<p>Le elaborated:<\/p>\n<p>\u201cIf you train with language first, you acquire the bias of the text modality,\u201d he said, \u201cand in our domain, for example, the temporal motion part gets extremely important, and adding on video as an afterthought is not effective.\u201d<\/p>\n<h2 class=\"subhead-embed\">The Big Brain<\/h2>\n<p>Some of the discussion also moved toward comparing smart AI to humans.<\/p>\n<p>\u201cIf we learn from biology, humans learn about the physical world before we learn language,\u201d Graylin said, \u201cso it would actually make sense to do a multimodal model of learning, because if we&#8217;re modeling the brain, then it would make a lot of sense to learn from all modes at the same time. In fact, if you look at children who learn multiple languages, they may be a little bit slower in the beginning, but they\u2019ll automatically be able to translate between all these languages eventually.\u201d<\/p>\n<p>\u201cThese arguments are great,\u201d Liang noted, \u201cbut empirically, we don&#8217;t see the evidence that large scale natively multimodal training outperforms first training language models, and only then stapling other modalities on as an afterthought. 
So, do you think something needs to change in maybe the architectures of these models, the way that they are trained, the way that data is collected and presented for these models?\u201d<\/p>\n<p>In response, Graylin pointed to self-driving technologies, where the earlier efforts started out with a lot of labeled data and better LLMs later brought higher-level inference and processing, and said that looks like progress.<\/p>\n<p>Sipper talked about how her company trains on scalar field outputs from simulation data and on the meshes of objects.<\/p>\n<h2 class=\"subhead-embed\">Privacy and Agency<\/h2>\n<p>As panelists discussed the necessity for privacy and user agency, Graylin argued that systems should share data only with the user\u2019s permission.<\/p>\n<p>\u201cThis has to be the default,\u201d he said, \u201cthat a system does not share beyond the device that data is collected in, and it\u2019s only serving the user. If the user would like to share that with different platforms, then it makes sense, but if it&#8217;s automatically being captured by platforms, or the device manufacturers, or an advertising vendor, then there&#8217;s going to be significant privacy backlash.\u201d<\/p>\n<p>Le, again, presented this through the lens of how his company works:<\/p>\n<p>\u201cWe think about government, national security, defense use cases, and in that industry, privacy and security are even more prominent.\u201d<\/p>\n<p>\u201cThere is such a strong demand for on-prem solutions,\u201d Sipper said, \u201cthat a lot of people haven&#8217;t really figured out how that is compatible with an increasingly cloud-based infrastructure, and wanting to own different parts of the stack, and so I think there are very interesting business models evolving. 
I&#8217;m sure there are more philosophical, grand big questions that will come about.\u201d<\/p>\n<p>\u201cHow do we keep people from allowing machines to direct everything?\u201d Graylin asked, \u201cBecause when we start to have everything being sensed, then the machine will just give you the answers, and it will just be automatic, and more and more we will rely on machines to tell us what to do, where to go. We&#8217;re already doing that today when we drive, but we&#8217;re going to do that to all aspects of our lives.\u201d<\/p>\n<h2 class=\"subhead-embed\">More Senses<\/h2>\n<p>\u201cI am really excited about the sense of touch, and the sense of smell,\u201d Liang said in conclusion. \u201cI think some of you already alluded to this, that we need AI that understands the physical world, and for it to understand the physical world, it must feel and interact with objects like people can. So, how do you build really good sensors to capture the sense of touch? How do you build sensors that capture smells of different objects, and use that as a way of recognizing whether something is good or bad, or whether something is dangerous, right? These are all very interesting questions that aim to extend our human senses and implant them into AI machines. We&#8217;ve built systems that can transmit smells over digital mediums, and have somebody else wearing something, and recreate that smell. There&#8217;s lots of senses beyond language, obviously video, audio, that are part of the human experience, and are worth investigating.\u201d<\/p>\n<p>This was such an interesting foray into what people are doing with AI now. Touch? Taste? Smell?<\/p>\n<p>What do you think? 
Drop me a comment and let me know.<\/p>\n<\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/www.forbes.com\/sites\/johnwerner\/2026\/05\/23\/putting-the-senses-in-ai\/\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Digital generated image of multicolored particles forming eye shape against black background. getty It\u2019s no secret that smart wearables are becoming a big industry, and in the context of that, the \u201cawareness\u201d of hardware through sensory apparatus is a big factor. The machines that see (in their own ways) and experience the world around them<\/p>\n","protected":false},"author":1,"featured_media":13718,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37],"tags":[],"class_list":["post-13717","post","type-post","status-publish","format-standard","has-post-thumbnail","category-brand-spotlights"],"_links":{"self":[{"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/posts\/13717","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13717"}],"version-history":[{"count":0,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/posts\/13717\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/media\/13718"}],"wp:attachment":[{"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13717"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13717"
},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13717"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}