{"id":9373,"date":"2026-03-25T03:34:44","date_gmt":"2026-03-25T03:34:44","guid":{"rendered":"https:\/\/wildgreenquest.com\/?p=9373"},"modified":"2026-03-25T03:34:44","modified_gmt":"2026-03-25T03:34:44","slug":"this-microsoft-security-team-stress-tests-ai-for-its-worst-case-scenarios","status":"publish","type":"post","link":"https:\/\/wildgreenquest.com\/?p=9373","title":{"rendered":"This Microsoft security team stress-tests AI for its worst-case scenarios"},"content":{"rendered":"<div data-testid=\"content-chunk\">\n<p>As soon as new AI products are released, security researchers and pranksters begin probing them for weaknesses, trying to push systems to <a rel=\"nofollow\" href=\"https:\/\/www.ibm.com\/think\/topics\/prompt-injection\" target=\"_blank\" rel=\"noreferrer noopener\">violate their own safety precautions<\/a> and coax them into producing anything from offensive content to instructions for building weapons.<\/p>\n<\/div>\n<div data-testid=\"content-chunk\">\n<p>After all, AI risks are not just theoretical. In recent months, various AI companies have faced criticism for their software allegedly contributing to mental illness and suicide, generating nonconsensual fake nude images\u00a0of real people, and\u00a0<a rel=\"nofollow\" href=\"https:\/\/www.securityweek.com\/hackers-weaponize-claude-code-in-mexican-government-cyberattack\/\" target=\"_blank\" rel=\"noreferrer noopener\">aiding hackers in cybercrime<\/a>. 
At the same time, techniques for bypassing safeguards also continue to evolve, with recent methods including everything from\u00a0<a rel=\"nofollow\" href=\"https:\/\/www.theguardian.com\/technology\/2025\/nov\/30\/ai-poetry-safety-features-jailbreak\" target=\"_blank\" rel=\"noreferrer noopener\">malicious prompts disguised with poetry<\/a>\u00a0to\u00a0<a rel=\"nofollow\" href=\"https:\/\/www.microsoft.com\/en-us\/security\/blog\/2026\/02\/10\/ai-recommendation-poisoning\/\" target=\"_blank\" rel=\"noreferrer noopener\">surreptitiously planting ideas in AI assistant memories<\/a>\u00a0via innocuous-looking online tools.\u00a0<\/p>\n<p>But long before new models reach the public, internal security teams are already stress-testing them. At Microsoft, that responsibility largely falls to the company\u2019s <a rel=\"nofollow\" href=\"https:\/\/learn.microsoft.com\/en-us\/security\/ai-red-team\/\">AI Red Team<\/a>, a group that since 2018 has worked with product teams and the broader AI community to pressure-test models and applications before bad actors can.<\/p>\n<p>In cybersecurity parlance, a red team focuses on simulating attacks against a system, while a blue team focuses on defending it. Microsoft\u2019s AI Red Team is no exception, exploring a wide range of safety and security concerns\u2014from loss-of-control situations where AI evades human oversight to issues around chemical, biological, and nuclear threats\u2014across an assortment of AI software.\u00a0<\/p>\n<\/div>\n<div data-testid=\"content-chunk\">\n<p>\u201cWe see a really, really diverse set of tech,\u201d says Tori Westerhoff, principal AI security researcher on the Microsoft AI Red Team. 
\u201cPart of the kind of magic of the team is that we can see anything from a product feature to a system to a copilot to a frontier model, and we get to see how tech is integrated across all of those, and how AI is growing and evolving.\u201d\u00a0<\/p>\n<p>In one case, says Pete Bryan, principal AI security research lead on the Red Team, members worked with other Microsoft researchers to test whether AI could be manipulated into assisting with cyberattacks, including generating or refining malware. They experimented with framing questions in benign ways, such as describing a student project or security research scenario, then pushing systems to produce increasingly detailed outputs.<\/p>\n<p>The effort went beyond simple prompt testing. Researchers evaluated whether the AI could generate code that actually compiled and ran, and whether certain programming languages increased the likelihood of harmful outputs. In the worst case, Bryan says, the systems produced code comparable to what a low- to mid-level hacker might already create, but the team still refined detection systems to better flag such behavior.<\/p>\n<\/div>\n<p><a href=\"https:\/\/www.fastcompany.com\/91513979\/microsoft-red-team-stress-testing-ai\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>As soon as new AI products are released, security researchers and pranksters begin probing them for weaknesses, trying to push systems to violate their own safety precautions and coax them into producing anything from offensive content to instructions for building weapons. After all, AI risks are not just theoretical. 
In recent months, various AI companies<\/p>\n","protected":false},"author":1,"featured_media":9374,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[37],"tags":[],"class_list":["post-9373","post","type-post","status-publish","format-standard","has-post-thumbnail","category-brand-spotlights"],"_links":{"self":[{"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/posts\/9373","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=9373"}],"version-history":[{"count":0,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/posts\/9373\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=\/wp\/v2\/media\/9374"}],"wp:attachment":[{"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=9373"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=9373"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wildgreenquest.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=9373"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}