Latest news with #performanceMetrics


Fox News
3 days ago
- Business
- Fox News
Top Dem demands answers from Social Security, claiming wait times spiked during DOGE cuts
EXCLUSIVE: The top Democrat on the Joint Economic Committee will demand Thursday that the Social Security Administration explain a spike in phone call wait times following the removal of an online tracking tool from its website.

New Hampshire Sen. Maggie Hassan wrote to SSA Commissioner Frank Bisignano at agency headquarters in Baltimore with "serious concerns regarding changes to the performance metrics that the SSA shares through its public dashboard." Hassan also questioned whether DOGE-driven cuts to the federal workforce and other government resources played into the situation, citing Washington Post reporting on the removal or replacement of publicly accessible tools and an independent analysis by minority staff on the Joint Economic Committee.

"Unlike the previous dashboard, the new version also lacks historical data, as well as general processing times for retirement, survivor, disability, and Medicare benefits," Hassan wrote in the letter obtained exclusively by Fox News Digital.

"These data and other metrics provide critical insight into the performance of your agency and served as guideposts for seniors and other beneficiaries navigating the benefits process," she wrote, adding that call and callback times, field-office casework processing times and other data are no longer easily accessible.

Hassan also questioned whether DOGE's actions had precipitated any of the changes. "Removing this information may also obscure the impact of deep staffing and resource cuts — driven by the Department of Government Efficiency — on SSA's ability to deliver for seniors," Hassan said. "For these reasons, I urge you to immediately restore all previous metrics to the SSA performance dashboard."
Earlier this month, DOGE won a major victory in its efficiency quest, as the Supreme Court issued an unsigned order lifting a Maryland federal court injunction on its efforts to access SSA systems – which critics argued was untoward because they contain Americans' sensitive data.

"[U]nder the present circumstances, SSA may proceed to afford members of the SSA DOGE Team access to the agency records in question in order for those members to do their work," the order read.

A purported screenshot of the now-removed call-time chart shared with Fox News Digital showed hold times rising from about 4% to 28% between February and March. However, SSA has claimed similar information cited in the Washington Post report is "false."

"The reality for callers to our 800-number is about 42 percent handle their business through automated, self-service options. And for those who want to speak to a representative, about 75 percent use the callback-assist feature—they do not wait on hold for long periods of time," said Stephen McGraw, an SSA spokesman. "Considering the experience of our customers electing to receive a callback, our average speed of answer on the 800 number is now about 19 minutes so far this year. Moreover, the monthly trends are improving and better than the previous two years, as callers waited only about 12 minutes on the phone before speaking to a representative in May compared to 30 minutes in January."

McGraw said wait times are forecast to improve for the rest of the year. "As Commissioner Bisignano evaluates the agency, we are updating our performance metrics to reflect the real-life experiences of the people we serve and highlight the fastest ways our customers can get service."

"It is critical that the agency measures what matters most to improve customer service while providing all Americans the information they need to select the service channel that works best for them," McGraw concluded.
Bisignano has said increased staffing is not the long-term solution to the agency's systemic woes. He also appeared to endorse the DOGE idea of upgrading tech systems, saying he wants SSA to be a "digital-first, technology-led organization that puts the public as our focal point." Bisignano acknowledged SSA's last-place ranking among government agencies in employee satisfaction prior to his taking the reins, and said he wants to improve that aspect too.

In her letter, Hassan also cited DOGE's elimination of 7,000 jobs at SSA – about 4,000 of which were voluntary departures. DOGE has also sought to upgrade and update SSA technology systems, including those written in COBOL, a programming language dating to the late 1950s.

"As a result, beneficiaries have faced service disruptions, error messages, and unprecedented failures in tools to schedule and manage appointments at field offices throughout the country," said Hassan, whose state had about 20% of its population on Social Security as of 2023.

Hassan outlined several questions for Bisignano, including a request for a real-time report on current callback and wait times by 5 p.m. ET Thursday. She also inquired about any adjustments or deletions to datasets, as well as Social Security processing times for retirement, survivor and Medicare benefits, plus specific data for New Hampshire.


Fast Company
7 days ago
- Business
- Fast Company
Why we're measuring AI success all wrong—and what leaders should do about it
Here's a troubling reality check: We are currently evaluating artificial intelligence in the same way that we'd judge a sports car. We act like an AI model is good if it is fast and powerful. But what we really need to assess is whether it makes for a trusted and capable business partner.

The way we approach assessment matters. As AI models begin to play a part in everything from hiring decisions to medical diagnoses, our narrow focus on benchmarks and accuracy rates is creating blind spots that could undermine the very outcomes we're trying to achieve. In the long term, it is effectiveness, not efficiency, that matters.

Think about it: When you hire someone for your team, do you only look at their test scores and the speed they work at? Of course not. You consider how they collaborate, whether they share your values, whether they can admit when they don't know something, and how they'll impact your organization's culture—all the things that are critical to strategic success. Yet when it comes to the technology that is increasingly making decisions alongside us, we're still stuck on the digital equivalent of standardized test scores.

The Benchmark Trap

Walk into any tech company today, and you'll hear executives boasting about their latest performance metrics: "Our model achieved 94.7% accuracy!" or "We reduced token usage by 20%!" These numbers sound impressive, but they tell us almost nothing about whether these systems will actually serve human needs effectively. Despite significant tech advances, evaluation frameworks remain stubbornly focused on performance metrics while largely ignoring ethical, social, and human-centric factors. It's like judging a restaurant solely on how fast it serves food while ignoring whether the meals are nutritious, safe, or actually taste good. This measurement myopia is leading us astray.
Many recent studies have found high levels of bias toward specific demographic groups when AI models are asked to make decisions about individuals in relation to tasks such as hiring, salary recommendations, loan approvals, and sentencing. These outcomes are not just theoretical. For instance, facial recognition systems deployed in law enforcement contexts continue to show higher error rates when identifying people of color. Yet these systems often pass traditional performance tests with flying colors. The disconnect is stark: We're celebrating technical achievements while people's lives are being negatively impacted by our measurement blind spots.

Real-World Lessons

IBM's Watson for Oncology was once pitched as a revolutionary breakthrough that would transform cancer care. When measured using traditional metrics, the AI model appeared to be highly impressive, processing vast amounts of medical data rapidly and generating treatment recommendations with clinical sophistication. However, as Scientific American reported, reality fell far short of this promise. When major cancer centers implemented Watson, significant problems emerged. The system's recommendations often didn't align with best practices, in part because Watson was trained primarily on a limited number of cases from a single institution rather than a comprehensive database of real-world patient outcomes. The disconnect wasn't in Watson's computational capabilities—according to traditional performance metrics, it functioned as designed. The gap was in its human-centered evaluation capabilities: Did it improve patient outcomes? Did it augment physician expertise effectively? When measured against these standards, Watson struggled to prove its value, leading many healthcare institutions to abandon the system.

Prioritizing dignity

Microsoft's Seeing AI is an example of what happens when companies measure success through a human-centered lens from the beginning.
As Time magazine reported, the Seeing AI app emerged from Microsoft's commitment to accessibility innovation, using computer vision to narrate the visual world for blind and low-vision users. What sets Seeing AI apart isn't just its technical capabilities but how the development team prioritized human dignity and independence over pure performance metrics. Microsoft worked closely with the blind community throughout the design and testing phases, measuring success not by accuracy percentages alone, but by how effectively the app enhanced the ability of users to navigate their world independently. This approach created technology that genuinely empowers users, providing real-time audio descriptions that help with everything from selecting groceries to navigating unfamiliar spaces. The lesson: When we start with human outcomes as our primary success metric, we build systems that don't just work—they make life meaningfully better.

Five Critical Dimensions of Success

Smart leaders are moving beyond traditional metrics to evaluate systems across five critical dimensions:

1. Human-AI Collaboration. Rather than measuring performance in isolation, assess how well humans and technology work together. Recent research in the Journal of the American College of Surgeons showed that AI-generated postoperative reports were only half as likely to contain significant discrepancies as those written by surgeons alone. The key insight: a careful division of labor between humans and machines can improve outcomes while leaving humans free to spend more time on what they do best.

2. Ethical Impact and Fairness. Incorporate bias audits and fairness scores as mandatory evaluation metrics. This means continuously assessing whether systems treat all populations equitably and impact human freedom, autonomy, and dignity positively.

3. Stability and Self-Awareness.
A Nature Scientific Reports study found performance degradation over time in 91 percent of the models it tested once they were exposed to real-world data. Instead of just measuring a model's out-of-the-box accuracy, track performance over time and assess the model's ability to identify performance dips and escalate to human oversight when its confidence drops.

4. Value Alignment. As the World Economic Forum's 2024 white paper emphasizes, AI models must operate in accordance with core human values if they are to serve humanity effectively. This requires embedding ethical considerations throughout the technology lifecycle.

5. Long-Term Societal Impact. Move beyond narrow optimization goals to assess alignment with long-term societal benefits. Consider how technology affects authentic human connections, preserves meaningful work, and serves the broader community good:
- Supporting genuine human connection and collaboration
- Preserving meaningful human choice and agency
- Serving human needs rather than reshaping humans to serve technological needs

The Path Forward

Forward-thinking leaders implement comprehensive evaluation approaches by starting with the desired human outcomes, then establishing continuous human input loops and measuring results against the goals of human stakeholders. The companies that get this right won't just build better systems—they'll build more trusted, more valuable, and ultimately more successful businesses. They'll create technology that doesn't just process data faster but that genuinely enhances human potential and serves societal needs.

The stakes couldn't be higher. As these AI models become more prevalent in critical decisions around hiring, healthcare, criminal justice, and financial services, our measurement approaches will determine whether these models serve humanity well or perpetuate existing inequalities. In the end, the most important test of all is whether using AI for a task makes human lives genuinely better.
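The stability dimension described above can be sketched in a few lines of code. This is an illustrative sketch only, not a framework from the article; the window size and the accuracy and confidence floors are assumptions a team would tune for its own system.

```python
from collections import deque

class DriftMonitor:
    """Track a deployed model's rolling accuracy and flag predictions
    that should be escalated to a human reviewer."""

    def __init__(self, window=100, accuracy_floor=0.90, confidence_floor=0.60):
        self.outcomes = deque(maxlen=window)   # 1 = correct, 0 = incorrect
        self.accuracy_floor = accuracy_floor
        self.confidence_floor = confidence_floor

    def record(self, correct: bool) -> None:
        """Log whether the latest verified prediction was correct."""
        self.outcomes.append(1 if correct else 0)

    def rolling_accuracy(self) -> float:
        """Accuracy over the most recent window of verified predictions."""
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def needs_human_review(self, confidence: float) -> bool:
        # Escalate if this prediction is low-confidence, or if the model
        # as a whole has drifted below its accuracy floor.
        return (confidence < self.confidence_floor
                or self.rolling_accuracy() < self.accuracy_floor)

monitor = DriftMonitor(window=50)
for _ in range(50):
    monitor.record(correct=True)
print(monitor.needs_human_review(confidence=0.55))  # True: uncertain prediction
```

The design choice mirrors the article's point: out-of-the-box accuracy is checked once, but rolling accuracy and per-prediction confidence are checked continuously, so degradation surfaces as escalations rather than silent errors.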
The question isn't whether your technology is fast enough but whether it's human enough. That is the only metric that ultimately matters.


Entrepreneur
03-06-2025
- Business
- Entrepreneur
5 Metrics Every Business Should Track to Maximise AI Investments
As the European AI landscape evolves, so too must the standards used to measure success.

Opinions expressed by Entrepreneur contributors are their own. You're reading Entrepreneur Europe, an international franchise of Entrepreneur Media.

The Office for National Statistics (ONS) updates its shopping basket every year to reflect how consumers spend, adding new items like VR headsets or yoga mats as habits evolve. Businesses need to do the same with performance metrics. Artificial intelligence is now a central force in driving growth, yet many companies still measure success using outdated KPIs. With 42% of European businesses now regularly using artificial intelligence (AI) — a 27% increase in just one year — the urgency is clear: if you don't measure what matters, you can't manage it.

To truly maximise AI investments, C-suite leaders must update their own shopping baskets and rethink the benchmarks used to judge value. Here are five metrics that every business should be tracking to ensure AI success.

1. Data quality

Even the most advanced AI models produce untrustworthy results if they're trained on inaccurate or irrelevant information. At best, this shortcoming is a temporary inconvenience that drains money and time. At worst, entrusting unsatisfactory data to AI systems leads to costly mistakes in user-level applications — all of which can damage an organisation's reputation and profits.

With the success of AI hinging on high-quality data, it's important to perform regular data audits focused on improving accuracy. Routine reviews like this are a way to patrol data pipelines, checking that they're free of inconsistencies that could otherwise undermine AI outputs.

2. Data coverage

Clean data is one priority; complete data is another. AI models without access to complete datasets are more vulnerable to blind spots, which limit their ability to detect trends and identify key opportunities.
For instance, insurers that automate their risk assessment processes with AI typically ingest data from operational logs, market patterns and even independent sources like weather forecasts. Neglecting just one of these could lead to misjudged, costly payout claims. To counter such risks, conduct regular assessments of your data landscape to uncover overlooked data points. Eliminating visibility gaps allows businesses to paint a full picture of their digital environment, ensuring all data channels are readily available for AI usage.

3. Operational efficiency gains

The clearest way to measure the success of a new initiative is to see how much time or money it saves compared to the previous approach. Put simply: a factory that installs a faster conveyor belt should see an increase in productivity. AI is no exception to that logic. From accelerating loan approvals to automating data entry, the long-term objective of AI in any industry is to reduce turnaround times and cut costs. Failure to gauge operational impact makes it difficult to justify ongoing investment. As such, it's sensible to measure process durations before and after AI integration — a benchmarking method DHL used to establish that its AI-powered robots had delivered a 40% increase in sorting capacity, quantifying the investment's contribution to business KPIs.

4. Adoption rate across teams

Just because a solution successfully goes live doesn't mean adoption is guaranteed. True value comes when AI is embedded into workflows across the whole company — not just the IT department. Some teams will immediately embrace the AI tools presented to them, whereas others need more support. To assess where training or change management might be necessary, it's helpful to track departmental usage data and run regular employee feedback surveys.
High-performing organisations take this approach: they are more likely to bring employees with them on their AI journey by providing extensive AI training. In this context, understanding digital behaviour is the starting point for extracting more engagement from AI.

5. Return on investment (ROI)

Naturally, business leaders need to understand the financial return they're getting from an investment. However, the ROI generated by AI initiatives is often complex, involving both tangible and intangible benefits. Take the Berlin-based online retailer Zalando, which recently shared that it uses generative AI to produce digital imagery at a rapid rate. Not only has that directly reduced costs by 90%, but the faster turnaround in editorial campaigns has also indirectly boosted the company's competitiveness in the fast-fashion market. Every possible performance metric must be considered when curating a digital strategy. That's why it's important to develop a well-rounded ROI framework for AI — one that factors in both the direct and indirect consequences of any planned change.

Measure what matters, scale what works

AI is already demonstrating its ability to reshape organisations, but the reality is that many still struggle to prove its concrete value. Without establishing the right criteria for success, businesses will lack accountability and struggle to align tech performance with financial gains. To maximise ROI on AI, you must clarify the standards on which you want your digital growth to be founded. This will unlock the insights needed to safely course-correct, scale success, and build long-term trust in your AI strategy. As the AI landscape evolves, so too must the standards used to measure success. Just like the ONS shopping basket reflects changing habits, businesses must ensure performance metrics reflect the realities of AI-driven operations.
By focusing on data quality, coverage, efficiency, adoption, and ROI, leaders can ensure AI investments aren't just tracked but transformed into long-term value.
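The before-and-after benchmarking described under operational efficiency gains can be sketched as a simple calculation. This is a minimal illustration with hypothetical loan-approval durations, not DHL's or any company's actual data.

```python
from statistics import mean

def efficiency_gain(before_durations, after_durations):
    """Percentage reduction in mean process duration after AI integration."""
    before, after = mean(before_durations), mean(after_durations)
    return 100 * (before - after) / before

# Hypothetical loan-approval turnaround times in hours, sampled
# before and after an AI-assisted workflow was introduced.
before = [48, 52, 50, 47, 53]
after = [30, 28, 32, 29, 31]
print(f"Mean turnaround cut by {efficiency_gain(before, after):.0f}%")  # prints "Mean turnaround cut by 40%"
```

Measuring the same process with the same units on both sides of the rollout is what makes the comparison meaningful; the calculation itself is trivial once those baseline durations have been captured.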