Latest news with #largeLanguageModels


Fox News
8 hours ago
- Science
- Fox News
MORNING GLORY: Why the angst about AI?
Should we be alarmed by the acceleration of "artificial intelligence" ("AI") and the "large language models" (LLMs) AI's developers employ? Thanks to AI I can provide a short explanation of the LLM term: "Imagine AI as a large umbrella, with generative AI being a smaller umbrella underneath. LLMs are like a specific type of tool within the generative AI umbrella, designed for working with text." Clear? Of course not. The intricacies of AI and the tools it uses are the stuff of start-ups, engineers, computer scientists and the consumers feeding them data, knowingly or unknowingly.

In the first Senate version of the "One Big Beautiful Bill," Senator Ted Cruz sponsored, and the drafting committees accepted, a 10-year ban on state legislatures laying down rules of the road for AI. Senator Cruz advocated for a federal moratorium on states enforcing their own AI laws, arguing that a patchwork of state regulations could create confusion and hinder AI development and adoption. After much discussion and debate, the proposal was stricken from the Senate bill, which then went on to pass the Senate and House and was signed into law on July 4, creating in six months an enormous set of legislative accomplishments for President Trump, as every one of the priorities he campaigned on was delivered via the OBBB.

What about the concerns about AI? Very, very few essays or columns or even books leave lasting marks. One that did so for me was penned by Dr. Charles Krauthammer in 2011 and included in the magnificent collection of his very best work, "Things That Matter." In that collection is the brief column titled "Are We Alone In The Universe?" Krauthammer quickly recounts the reasons why we ought not to be alone as an intelligent species in the universe, as well as the explanation of why we haven't "heard from" any other civilizations in even our own galaxy. The answer, Krauthammer states, "is to be found, tragically, in…the high probability that advanced civilizations destroy themselves."

Krauthammer credits Carl Sagan and others with this gloomy proposition, but it is Krauthammer who sums it up nicely: "[T]his silent universe is conveying not a flattering lesson about our uniqueness but a tragic story about our destiny," Krauthammer continued. "It is telling us that intelligence may be the most cursed faculty in the entire universe — an endowment not just ultimately fatal but, on the scale of cosmic time, nearly instantly so." But no gloom and doom for Krauthammer, only clarity: "Intelligence is a capacity so godlike, so protean, that it must be contained and disciplined." "This is the work of politics," Krauthammer concludes, "understood as the ordering of society and the regulation of power to permit human flourishing while simultaneously restraining the most Hobbesian human instincts."

Krauthammer is right, and Senator Cruz was correct to tee up the debate, which isn't over, only begun. That's the "politics" part, which is never-ending until the civilization ends. AI is indeed "godlike" in the promises its boosters make, but profoundly disruptive of all of human history that went before it. Does it mean we are stepping off the edge of a cliff that destroyed all the other civilizations that went before us on distant planets, from whom we will never hear a peep because they have run out their own string? Impossible to say, but kudos to Senator Cruz for kicking off the debate. The conversation deserves much more attention than it has thus far received.
It's too easy to simply go full "disaster is inevitable" mode, but some speed bumps (a Cruz 2.0 in the next reconciliation?) would be welcome.

Hugh Hewitt is host of "The Hugh Hewitt Show," heard weekday mornings 6am to 9am ET on the Salem Radio Network and simulcast on Salem News Channel. Hugh wakes up America on over 400 affiliates nationwide and on all the streaming platforms where SNC can be seen. He is a frequent guest on the Fox News Channel's news roundtable hosted by Bret Baier weekdays at 6pm ET. A son of Ohio and a graduate of Harvard College and the University of Michigan Law School, Hewitt has been a Professor of Law at Chapman University's Fowler School of Law since 1996, where he teaches Constitutional Law. Hewitt launched his eponymous radio show from Los Angeles in 1990. He has frequently appeared on every major national news television network, hosted television shows for PBS and MSNBC, written for every major American paper, authored a dozen books and moderated a score of Republican candidate debates, most recently the November 2023 Republican presidential debate in Miami and four Republican presidential debates in the 2015-16 cycle. Hewitt focuses his radio show and his column on the Constitution, national security, American politics and the Cleveland Browns and Guardians. Hewitt has interviewed tens of thousands of guests, from Democrats Hillary Clinton and John Kerry to Republican Presidents George W. Bush and Donald Trump, over his 40 years in broadcast, and this column previews the lead story that will drive his radio/TV show today.


South China Morning Post
a day ago
- Business
- South China Morning Post
Huawei defends AI models as home-grown after whistle-blowers raise red flags
Huawei Technologies' lab in charge of large language models (LLMs) has defended its latest open-source Pangu Pro MoE model as indigenous, denying allegations that it was developed through incremental training of third-party models.

The Shenzhen-based telecoms equipment giant, considered the poster child for China's resilience against US tech sanctions, is fighting to maintain its relevance in the LLM field as open-source models developed by the likes of DeepSeek and Alibaba Group Holding gain ground. Alibaba owns the South China Morning Post.

Huawei recently open-sourced an artificial intelligence (AI) model called Pangu Pro MoE 72B, which had been trained on its home-developed Ascend AI chips. However, an account on the open-source community GitHub, HonestAGI, on Friday alleged that the Huawei model had 'extraordinary correlation' with Alibaba's Qwen-2.5 14B model, raising eyebrows among developers.

Huawei's Noah's Ark Lab, the unit in charge of Pangu model development, said in a statement on Saturday that the Pangu Pro MoE open-source model was 'developed and trained on Huawei's Ascend hardware platform and [was] not a result of incremental training on any models'.

(Photo: The Huawei Ascend AI booth at the World Artificial Intelligence Conference in Shanghai, July 4, 2024. AP)

The lab noted that development of its model involved 'certain open-source codes' from other models, but said that it strictly followed the requirements for open-source licences and clearly labelled the codes. The original repository uploaded by HonestAGI is gone, but a brief explanation remains.


CNET
31-05-2025
- Business
- CNET
LLMs and AI Aren't the Same. Everything You Should Know About What's Behind Chatbots
Chances are, you've heard the term "large language models," or LLMs, when people are talking about generative AI. But they aren't quite synonymous with the brand-name chatbots like ChatGPT, Google Gemini, Microsoft Copilot, Meta AI and Anthropic's Claude. These AI chatbots can produce impressive results, but they don't actually understand the meaning of words the way we do. Instead, they're the interface we use to interact with large language models. These underlying technologies are trained to recognize how words are used and which words frequently appear together, so they can predict future words, sentences or paragraphs. Understanding how LLMs work is key to understanding how AI works. And as AI becomes increasingly common in our daily online experiences, that's something you ought to know. This is everything you need to know about LLMs and what they have to do with AI.

What is a language model?

You can think of a language model as a soothsayer for words. "A language model is something that tries to predict what language looks like that humans produce," said Mark Riedl, professor in the Georgia Tech School of Interactive Computing and associate director of the Georgia Tech Machine Learning Center. "What makes something a language model is whether it can predict future words given previous words." This is the basis of autocomplete functionality when you're texting, as well as of AI chatbots.

What is a large language model?

A large language model contains vast amounts of words from a wide array of sources. These models are measured in what are known as "parameters." So, what's a parameter? Well, LLMs use neural networks, which are machine learning models that take an input and perform mathematical calculations to produce an output. The variables in these computations are the parameters. A large language model can have 1 billion parameters or more. "We know that they're large when they produce a full paragraph of coherent fluid text," Riedl said.

How do large language models learn?

LLMs learn via a core AI process called deep learning. "It's a lot like when you teach a child -- you show a lot of examples," said Jason Alan Snyder, global CTO of ad agency Momentum Worldwide. In other words, you feed the LLM a library of content (what's known as training data) such as books, articles, code and social media posts to help it understand how words are used in different contexts, and even the more subtle nuances of language. The data collection and training practices of AI companies are the subject of some controversy and several lawsuits. Publishers like The New York Times, artists and other content catalog owners allege that tech companies have used their copyrighted material without the necessary permissions. (Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed on Ziff Davis copyrights in training and operating its AI systems.)

AI models digest far more than a person could ever read in a lifetime -- something on the order of trillions of tokens. Tokens help AI models break down and process text. You can think of an AI model as a reader who needs help. The model breaks down a sentence into smaller pieces, or tokens -- each equivalent to about four characters of English, or roughly three-quarters of a word -- so it can understand each piece and then the overall meaning. From there, the LLM can analyze how words connect and determine which words often appear together.
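To make the next-word-prediction idea concrete, here is a minimal, toy sketch in Python. It builds the kind of "which words tend to follow which" table described above from a tiny invented corpus; a real LLM replaces this lookup table with a neural network whose billions of parameters are tuned against trillions of tokens, so treat this purely as an illustration.

```python
from collections import Counter, defaultdict

# Tiny invented corpus; real LLMs train on trillions of tokens.
CORPUS = "i went sailing on the deep blue sea . the deep blue sea was calm ."

def build_follow_table(text):
    """Count, for each word, which words come next and how often."""
    words = text.split()
    follows = defaultdict(Counter)
    for prev_word, next_word in zip(words, words[1:]):
        follows[prev_word][next_word] += 1
    return follows

def predict_next(follows, word):
    """Guess the most frequent follower of `word`, or None if unseen."""
    if word not in follows:
        return None
    return follows[word].most_common(1)[0][0]

table = build_follow_table(CORPUS)
print(predict_next(table, "blue"))   # -> 'sea'
print(predict_next(table, "deep"))   # -> 'blue'
```

The table plays the role of the "giant map of word relationships" the article describes next; the crucial difference is scale, and the fact that an LLM learns soft numerical weights rather than raw counts.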
"It's like building this giant map of word relationships," Snyder said. "And then it starts to be able to do this really fun, cool thing, and it predicts what the next word is … and it compares the prediction to the actual word in the data and adjusts the internal map based on its accuracy." This prediction and adjustment happens billions of times, so the LLM is constantly refining its understanding of language and getting better at identifying patterns and predicting future words. It can even learn concepts and facts from the data to answer questions, generate creative text formats and translate languages. But they don't understand the meaning of words like we do -- all they know are the statistical relationships. LLMs also learn to improve their responses through reinforcement learning from human feedback. "You get a judgment or a preference from humans on which response was better given the input that it was given," said Maarten Sap, assistant professor at the Language Technologies Institute at Carnegie Mellon University. "And then you can teach the model to improve its responses." LLMs are good at handling some tasks but not others. Alexander Sikov/iStock/Getty Images Plus What do large language models do? Given a series of input words, an LLM will predict the next word in a sequence. For example, consider the phrase, "I went sailing on the deep blue..." Most people would probably guess "sea" because sailing, deep and blue are all words we associate with the sea. In other words, each word sets up context for what should come next. "These large language models, because they have a lot of parameters, can store a lot of patterns," Riedl said. "They are very good at being able to pick out these clues and make really, really good guesses at what comes next." What are the different kinds of language models? There are a couple kinds of sub-categories you might have heard, like small, reasoning and open-source/open-weights. Some of these models are multimodal, which means they are trained not just on text but also on images, video and audio. They are all language models and perform the same functions, but there are some key differences you should know. Is there such a thing as a small language model? Yes. Tech companies like Microsoft have introduced smaller models that are designed to operate "on device" and not require the same computing resources that an LLM does, but nevertheless help users tap into the power of generative AI. What are AI reasoning models? Reasoning models are a kind of LLM. These models give you a peek behind the curtain at a chatbot's train of thought while answering your questions. You might have seen this process if you've used DeepSeek, a Chinese AI chatbot. But what about open-source and open-weights models? Still, LLMs! These models are designed to be a bit more transparent about how they work. Open-source models let anyone see how the model was built, and they're typically available for anyone to customize and build one. Open-weights models give us some insight into how the model weighs specific characteristics when making decisions. Meta AI vs. ChatGPT: AI Chatbots Compared Meta AI vs. ChatGPT: AI Chatbots Compared Click to unmute Video Player is loading. Play Video Pause Skip Backward Skip Forward Next playlist item Unmute Current Time 0:04 / Duration 0:06 Loaded : 0.00% 0:04 Stream Type LIVE Seek to live, currently behind live LIVE Remaining Time - 0:02 Share Fullscreen This is a modal window. 
(Video: Meta AI vs. ChatGPT: AI Chatbots Compared)

What do large language models do really well?

LLMs are very good at figuring out the connection between words and producing text that sounds natural. "They take an input, which can often be a set of instructions, like 'Do this for me,' or 'Tell me about this,' or 'Summarize this,' and are able to extract those patterns out of the input and produce a long string of fluid response," Riedl said. But they have several weaknesses.

Where do large language models struggle?

First, they're not good at telling the truth. In fact, they sometimes just make stuff up that sounds true, like when ChatGPT cited six fake court cases in a legal brief or when Google's Bard (the predecessor to Gemini) mistakenly credited the James Webb Space Telescope with taking the first pictures of a planet outside of our solar system. Those are known as hallucinations. "They are extremely unreliable in the sense that they confabulate and make up things a lot," Sap said. "They're not trained or designed by any means to spit out anything truthful."

They also struggle with queries that are fundamentally different from anything they've encountered before. That's because they're focused on finding and responding to patterns. A good example is a math problem with a unique set of numbers. "It may not be able to do that calculation correctly because it's not really solving math," Riedl said. "It is trying to relate your math question to previous examples of math questions that it has seen before."

While they excel at predicting words, they're not good at predicting the future, which includes planning and decision-making. "The idea of doing planning in the way that humans do it with … thinking about the different contingencies and alternatives and making choices, this seems to be a really hard roadblock for our current large language models right now," Riedl said.

Finally, they struggle with current events, because their training data typically only goes up to a certain point in time and anything that happens after that isn't part of their knowledge base. Because they don't have the capacity to distinguish between what is factually true and what is likely, they can confidently provide incorrect information about current events.
They also don't interact with the world the way we do. "This makes it difficult for them to grasp the nuances and complexities of current events that often require an understanding of context, social dynamics and real-world consequences," Snyder said.

How are LLMs integrated with search engines?

We're seeing retrieval capabilities evolve beyond what the models have been trained on, including connecting with search engines like Google so the models can conduct web searches and then feed those results into the LLM. This means they could better understand queries and provide responses that are more timely. "This helps our language models stay current and up-to-date, because they can actually look at new information on the internet and bring that in," Riedl said.

That was the goal, for instance, a while back with AI-powered Bing. Instead of tapping into search engines to enhance its responses, Microsoft looked to AI to improve its own search engine, in part by better understanding the true meaning behind consumer queries and better ranking the results for said queries. Last November, OpenAI introduced ChatGPT Search, with access to information from some news publishers.

But there are catches. Web search could make hallucinations worse without adequate fact-checking mechanisms in place. And LLMs would need to learn how to assess the reliability of web sources before citing them. Google learned that the hard way with the error-prone debut of its AI Overviews search results. The search company subsequently refined its AI Overviews results to reduce misleading or potentially dangerous summaries. But even recent reports have found that AI Overviews can't consistently tell you what year it is.

For more, check out our experts' list of AI essentials and the best chatbots for 2025.
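The search-engine integration described in this article is usually implemented as a retrieve-then-read loop: run a web search, paste the results into the prompt, and ask the model to answer only from them. Here is a minimal sketch of that flow; web_search and ask_llm are hypothetical placeholders stubbed with canned values, not real APIs, so treat this as an outline rather than working product code.

```python
def web_search(query: str) -> list[str]:
    """Hypothetical stand-in for a search API; returns canned snippets here."""
    return [f"Snippet one about {query}", f"Snippet two about {query}"]

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to a hosted language model."""
    return "(model answer would appear here)\n--- prompt sent ---\n" + prompt

def answer_with_search(question: str) -> str:
    """Retrieve fresh snippets, then ask the model to answer only from them,
    citing sources -- which is also where fact-checking has to happen."""
    snippets = web_search(question)
    sources = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(snippets))
    prompt = (
        "Answer the question using only the numbered sources below, "
        "and cite the ones you rely on.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )
    return ask_llm(prompt)

print(answer_with_search("What year is it?"))
```

Whether this helps or hurts depends on the retrieved sources, which is why the article notes that reliability checks on those sources matter as much as the retrieval itself.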


Forbes
27-05-2025
- Business
- Forbes
The AI Arms Race: Why China May Be Playing For Second Place
(Image: Tencent's Hunyuan model and OpenAI's ChatGPT)

In the high-stakes arena of artificial intelligence, where tech giants vie for dominance, a fascinating new narrative is emerging. Observers at Google's recent I/O Developer Conference couldn't help but notice the striking presence of Chinese-developed AI models prominently featured alongside American tech stalwarts. As LLMs (large language models) become critical yardsticks of technological prowess, China's rapid ascent is reshaping global AI dynamics.

At Google's annual showcase, the Chatbot Arena leaderboard—an influential crowdsourced benchmark hosted by LMSYS on Hugging Face—highlighted remarkable advances by Chinese AI models. Names such as DeepSeek, Tencent's Hunyuan TurboS, Alibaba's Qwen, and Zhipu's GLM-4 weren't just entries—they were top contenders, especially in critical tasks like coding and complex dialogues. This shift suggests that while U.S. companies like OpenAI and Google maintain overall leadership, China's AI ambitions are gaining undeniable momentum.

(Photo: Google CEO Sundar Pichai addresses the crowd during Google's annual I/O developers conference in Mountain View, California, on May 20, 2025. Camille Cohen/AFP via Getty Images)

Yet, intriguingly, China might not be racing to win outright. Angela Zhang, a USC law professor and author of "High Wire: How China Regulates Big Tech and Governs Its Economy," makes a contrarian argument in a recent essay in the Financial Times. According to Zhang, Beijing may have strategically decided that being a close second in AI serves its broader economic and geopolitical interests better than direct supremacy.

This counterintuitive stance arises partly from recent aggressive U.S. measures restricting advanced semiconductor exports to China. By blocking sales of critical chips like Nvidia's H20—optimized for AI inference tasks—Washington aims to maintain a technological edge. However, these policies inadvertently push China towards accelerating its domestic semiconductor capabilities. Chinese firms like Huawei and Cambricon have swiftly moved into the vacuum, with Huawei's Ascend 910C chip already delivering about 60% of Nvidia's H100 inference performance. Moreover, U.S. chip export controls have broader global implications, extending restrictions to critical markets like India, Malaysia, and Singapore. Faced with these challenges, emerging economies may increasingly turn to China, indirectly spurring demand for Chinese technology.

In a significant policy shift, the Trump administration recently rescinded the Biden-era AI Diffusion Rule, which categorized countries into tiers for AI chip exports. Instead, the administration has issued new guidance stating that the use of Huawei's Ascend AI chips—specifically models 910B, 910C, and 910D—anywhere in the world violates U.S. export controls. This move effectively imposes a global ban on these chips, citing concerns that they incorporate U.S. technology and thus fall under U.S. regulatory jurisdiction. The Department of Commerce's Bureau of Industry and Security emphasized that companies worldwide must avoid using these chips or risk facing penalties, including potential legal action. This unprecedented extraterritorial enforcement has drawn sharp criticism from China, which warns of legal consequences for entities complying with the U.S. directive, arguing that it infringes upon international trade norms and China's development interests.
In response, China's AI leaders have redoubled efforts in semiconductor self-sufficiency. Huawei, for instance, spearheads a coalition aiming for China to achieve 70% semiconductor autonomy by 2028. The recent unveiling of Huawei's CloudMatrix 384 AI supernode—a system reportedly surpassing Nvidia's market-leading NVL72—signifies a crucial breakthrough, addressing a critical bottleneck in China's AI computing infrastructure.

Tencent's strategy further illustrates this strategic shift. During its May AI summit, Tencent introduced advanced models such as TurboS for high-quality dialogue and coding, T1-Vision for image reasoning, and Hunyuan Voice for sophisticated speech interactions. Additionally, Tencent has embraced open-source approaches, making its Hunyuan 3D model widely available; it has been downloaded over 1.6 million times, underscoring China's commitment to fostering global developer communities. Google's former CEO Eric Schmidt recently singled out, in addition to DeepSeek, Alibaba's Qwen and Tencent's Hunyuan as China's most noteworthy models, noting that their level has been quite close to OpenAI's o1, a remarkable achievement.

Angela Zhang suggests this positioning is intentional. Rather than risking further escalations in U.S.-China tensions, Beijing appears content to cultivate robust domestic and international ecosystems around its technology. This stance aligns well with China's traditional emphasis on strategic autonomy and incremental innovation. Open-source dynamics reinforce this calculated approach. With lower technical barriers in AI inference—a rapidly expanding market segment expected to account for 70% of AI compute demand by 2026, according to Barclays—China's AI industry could benefit significantly from widespread adoption of its domestically developed solutions. Open-source releases from Chinese firms like DeepSeek and Baichuan also bolster global developer engagement, potentially offsetting U.S. containment efforts by creating diverse, globalized ecosystems reliant on Chinese technology.

Still, it's crucial to note the challenges ahead. While Chinese models excel technically, global adoption remains limited, mostly confined to domestic markets. Issues like interface design, user familiarity, and developer support still give U.S.-based models a distinct advantage internationally. Moreover, despite impressive hardware strides, China continues to trail the U.S. in software sophistication and ecosystem integration.

Yet the trajectory is clear. China's foundational models are rapidly closing technical gaps. With strategic governmental support and substantial investment in semiconductor self-sufficiency, China appears poised not just to endure U.S. sanctions but to thrive within their constraints. Zhang's insight reframes the AI race less as a zero-sum game and more as a multipolar competition, where nations seek strategic rather than absolute dominance. For China, being second might be more beneficial, reducing geopolitical friction while securing substantial economic benefits through technology self-reliance and international partnerships.

Ultimately, the AI landscape is shifting rapidly. Leadership in this field will increasingly hinge on adaptability, global collaboration, and strategic foresight rather than merely raw computing power. For now, China's measured pursuit of second place might be exactly the kind of innovative thinking the tech world needs—less about outright dominance and more about sustainable and strategic competitiveness.


Forbes
27-05-2025
- Health
- Forbes
Latest Research Assesses The Use Of Specially Tuned Generative AI For Performing Mental Health Therapy
In today's column, I explore and analyze the results of a recent research study that examined the efficacy of using a specially tuned generative AI to perform a limited range of mental health therapy over an eight-week period. Subjects were monitored in a purpose-built experimental setting. The upshot is that the treatment-group participants appeared to benefit from the use of the tuned generative AI, spurring improvements in dealing with various mental health conditions such as depression, weight-related concerns, and anxiety. This is an encouraging sign that generative AI and large language models (LLMs) provide a potential facility for adequately performing mental health therapy. Still, important caveats are worth noting and require further study and consideration. Let's talk about it.

This analysis of AI breakthroughs is part of my ongoing Forbes column coverage on the latest in AI, including identifying and explaining various impactful AI complexities (see the link here).

I've been extensively covering and analyzing a myriad of facets of contemporary AI that generates mental health advice and undertakes interactive AI-driven therapy. This rapidly increasing use of AI has principally been spurred by the widespread adoption of generative AI and large language models (LLMs). There are tremendous upsides to be had, but at the same time, hidden risks and outright gotchas come into these endeavors too. I frequently speak up about these pressing matters, including in an episode of CBS 60 Minutes, see the link here. For a quick summary of some of my posted columns on AI for mental health therapy, see the link here, which recaps forty of the over one hundred column postings that I've published on the evolving topic.

Active and extensive research on the use of AI for mental health purposes has been going on for many years. One of the earliest and most highly visible instances involved the impacts of a rudimentary form of AI known as Eliza during the 1960s, see my discussion at the link here. In that now famous, classic case, a simple program coined Eliza echoed user-entered inputs and did so with the air of the AI acting like a therapist. To some degree, this was a surprise to everyone at that time. The mainstay of the surprise was that a barebones computer program could cause people to seemingly believe they were conversing with a highly capable mental health professional or psychologist.

Almost a dozen years later, the legendary astrophysicist and science communicator Carl Sagan made a prediction in 1975 about the eventuality and inevitability of AI acting as a psychotherapist for humans. As I have discussed regarding his prophecy at the link here, in many notable ways he was right, but in other facets he was a bit off, and we have not yet witnessed the fullness of his predictions.

During the heyday of expert systems, many efforts were launched to use rules-based capabilities to act as a therapist, see my discussion at the link here. The notion was that it might be feasible to identify all the rules that a human therapist uses to perform therapy and then embed those rules into a knowledge-based system. The upside of those expert systems was that it was reasonably plausible to test the AI and gauge whether it would dispense proper advice. A builder of such an AI system could exhaustively examine various paths and rules, doing so to try and ensure that the expert system would not produce improper advice.
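To make the Eliza-era, rules-based approach concrete, here is a minimal sketch in the spirit of those early programs: a handful of hand-written patterns that echo the user's wording back in therapist-like templates. The specific rules are invented for illustration; the original Eliza used a much richer script, but the deterministic, fully inspectable character is the point.

```python
import re

# A few invented rules in the spirit of 1960s Eliza: pattern -> reply template.
RULES = [
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Why do you feel {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bbecause (.+)", re.IGNORECASE), "Is that the real reason?"),
]

def eliza_reply(user_text: str) -> str:
    """Deterministic: the same input always yields the same reply, so every
    rule and every possible response can be inspected and tested."""
    for pattern, template in RULES:
        match = pattern.search(user_text)
        if match:
            return template.format(*match.groups())
    return "Tell me more."

print(eliza_reply("I feel anxious about work"))
# -> Why do you feel anxious about work?
```

Because every rule is written out explicitly, a builder can enumerate the paths and check each possible reply, which is exactly the testability advantage of expert systems described above.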
In AI-field parlance, rule-based systems of that kind were considered deterministic. In contrast, and this is a disconcerting issue, today's generative AI and LLMs tend to work on a non-deterministic basis. The AI uses statistics and probabilities to generate the responses being emitted to a user. In general, it isn't feasible to fully test such AI, since the outputs are somewhat unpredictable. It is for that reason that we need to be particularly cautious in promoting generative AI and LLMs as a handy aid for performing therapy. People are doing so anyway, and are often unaware that the AI could give out untoward advice, including so-called AI hallucinations that produce unsupported, made-up contrivances (see my explanation at the link here).

I've repeatedly noted that we are amid a grand experiment that involves the world's population and the use of generative AI for mental health advisement. This AI-based therapy is being used actively at scale. We don't know how many people are avidly using LLMs for this purpose, though guesses reach into the many millions of users (see my analysis at the link here). An intriguing tradeoff is taking place before our very eyes. On the one hand, having massively available AI-based therapy at near-zero cost to those using it, available anywhere at any time, might be a godsend for population-level mental health. The qualm is that we don't yet know whether this will end up as a positive outcome or a negative outcome. A kind of free-for-all is taking place, and seemingly only time will tell whether this unfettered, unfiltered use of AI will have a net positive ROI.

A recent research study opted to take a close look at how a specially tuned generative AI might perform, and did so in a thoughtfully designed experimental setting. We definitely need more such mindfully crafted studies. Much of the prevailing dialogue about this weighty topic is based on speculation and lacks rigor and care in analysis. In the study entitled 'Randomized Trial of a Generative AI Chatbot for Mental Health Treatment' by Michael V. Heinz, Daniel M. Mackin, Brianna M. Trudeau, Sukanya Bhattacharya, Yinzhou Wang, Haley A. Banta, Abi D. Jewett, Abigail J. Salzhauer, Tess Z. Griffin, and Nicholas C. Jacobson, published in New England Journal of Medicine AI on March 27, 2025, these key points were made (excerpts):

Readers deeply interested in this topic should consider reading the full study to get the details on the procedures used and the approach that was undertaken.

Some additional historical context on these matters might be beneficial. There have been a number of prior research studies focusing principally on expert-systems-based AI for mental health therapy, such as a well-known commercial app named Woebot (see my analysis at the link here), a rules-based app named Tessa for eating disorders (see my discussion at the link here), and many others. Those who have rules-based solutions are often seeking to augment their systems by incorporating generative AI capabilities. This makes sense, in that generative AI provides a fluency for interacting with users that conventional expert systems typically lack. The idea is that you might get the best of both worlds: the predictable nature of an expert system combined with the highly interactive nature of LLMs. The challenge is that generative AI tends to have the qualms I mentioned earlier due to its non-deterministic nature.
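The deterministic-versus-non-deterministic contrast can be seen in miniature below: instead of a fixed reply for a given input, a generative model samples the next token from a probability distribution, so repeated runs can differ. This is a toy sketch with an invented score table and a temperature knob; a real LLM does this over a vocabulary of tens of thousands of tokens at every step of a response.

```python
import math
import random

def sample_next_token(scores: dict, temperature: float = 1.0) -> str:
    """Turn raw scores into probabilities (softmax), then draw at random --
    the source of the 'somewhat unpredictable' outputs described above."""
    scaled = {tok: s / temperature for tok, s in scores.items()}
    max_s = max(scaled.values())
    weights = {tok: math.exp(s - max_s) for tok, s in scaled.items()}
    total = sum(weights.values())
    tokens = list(weights)
    probs = [weights[t] / total for t in tokens]
    return random.choices(tokens, weights=probs, k=1)[0]

# Invented next-token scores after a prompt like "Lately I have been feeling ..."
scores = {"anxious": 2.0, "tired": 1.5, "fine": 0.5}
print([sample_next_token(scores, temperature=0.8) for _ in range(5)])  # varies per run
```

Lowering the temperature makes the most likely token dominate; raising it spreads probability across alternatives, which is part of why exhaustively testing every possible output is not feasible.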
If you blend a tightly tested expert system with a more loosey-goosey generative AI capability, you are potentially taking chances on what the AI is going to do while dispensing mental health advice. It's quite a conundrum. Another angle is to see whether generative AI can be bounded sufficiently to keep it from going astray. It is conceivable that with various technological guardrails and human oversight, an LLM for mental health use can be reliably utilized in the wild. This has spurred an interest in devising highly customized AI foundational models that are tailored specifically to the mental health domain, see my discussion at the link here.

Let's shift gears and consider the myriad of research pursuits from a 30,000-foot level. We can garner useful insights into how such research has been conducted, and how it might be further conducted on an ongoing basis and in the future. Here are five notable considerations worth contemplating:

The research studies in this realm that aim to be highly methodical and systematic will typically make use of the longstanding, time-tested practice of the RCT (randomized controlled trial). This consists of devising an experimental design that randomly assigns some subjects to a treatment or experimental group and other subjects to a control group. Such a rigorous approach aims to prevent confounding factors from getting in the way of making suitable claims about the outcomes the research identifies and stipulates.

First, let's give suitably due credit to those studies using RCTs. The amount of time and energy required can be substantial. Unlike other research approaches that are more ad hoc, trying to do things the best way possible can be time-consuming and costly. A willingness and persistence to do AI-related research in this manner is exceedingly laudable. Thanks for doing so.

An issue or challenge with RCTs is that, since they tend to take a longer time to conduct, the time lag can be a kind of downfall or detractor from the results. This is especially the case in the high-tech field, including AI. Advances in AI are happening very quickly, on the order of days, weeks, and months. Meanwhile, some of these RCT studies can take a year or two to undertake and complete, along with writing up the study and getting it published. Some would say that these studies often have a rear-view-mirror perspective and are talking about the past rather than the present or the future.

RCT is somewhat immersed in a brouhaha right now. It is the gold standard, and top-notch science work relies on proceeding with RCTs. But does the time lag tend to produce results that are outdated or no longer relevant? In a provocative article entitled 'Fixing The Science Of Digital Technology Harms' by Amy Orben and J. Nathan Matias, Science, April 11, 2025, they make this intriguing assertion:

The problem that we seem to be faced with is an agonizing choice between longer research that has strong credibility and less rigorous approaches that can be undertaken faster but that would then be dicey when seeking to embrace the stated results. I certainly don't want newbie researchers in AI to think that this hotly debated issue gives them a free ride to ditch RCTs. Nope. That's not the answer. We need solid research in the field of AI for mental health. Period, end of story. The more, the better. Meanwhile, perhaps we can find some middle ground and be able to have our cake and eat it too.
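For readers who want to picture the randomization step at the heart of an RCT, here is a minimal sketch that splits a roster of invented participant IDs into treatment and control arms at random. Real trials layer stratification, blinding, and pre-registration on top of this, so this is only the bare mechanical core.

```python
import random

def randomize_arms(participant_ids, seed=2025):
    """Shuffle the roster, then split it evenly into two study arms."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Invented participant IDs for illustration.
roster = [f"P{n:03d}" for n in range(1, 11)]
print(randomize_arms(roster))
```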
Stay tuned as I cover this mind-bending puzzle in depth in an upcoming posting. A final thought for now comes from the legendary Marcus Aurelius, who famously made this telling remark alluding to the vital nature of research: 'Nothing has such power to broaden the mind as the ability to investigate systematically and truly all that comes under thy observation in life.' Let's fully embrace that credo.