Author: admin

  • Researchers build a better AI model memory probe • The Register

    If you’ve ever wondered whether that chatbot you’re using knows the entire text of a particular book, answers are on the way. Computer scientists have developed a more effective way to coax memorized content from large language models, a…

  • Why trouble for the biggest foreign buyer of U.S. debt could ripple through America’s bond market

    By Vivien Lou Chen

    Developments in Japan are creating a risk that investors in the U.S. Treasury market may one day pull the rug out by keeping more of their savings at home

    Why turmoil around Japan’s new government could wash up in U.S. financial markets.

    Recent developments overseas have the potential to complicate the White House’s agenda to bring down borrowing costs, while heightening competition for investors in the U.S. and Japanese bond markets.

    Aggressive fiscal-stimulus efforts by the cabinet of Japan’s first female prime minister, Sanae Takaichi, have created a spike in long-dated yields of Japanese government bonds and further weakness in the yen (USDJPY) in the past few weeks. It’s a situation that is being likened to the September-October 2022 crisis in the U.K., which stemmed from a crisis in confidence over a package of unfunded tax cuts proposed by then-Prime Minister Liz Truss’s government.

    Read: Liz Truss redux? Simultaneous drop for Japanese currency and bonds draws eerie parallels

    The U.S. needs to manage the cost of interest payments given a more than $38 trillion national debt, and this is a primary motivation for why the Trump administration wants to bring down long-term Treasury yields. Last week, Treasury Secretary Scott Bessent said in a speech in New York that the U.S. is making substantial progress in keeping most market-based rates down. He also said the 10-year “term premium,” or additional compensation demanded by investors to hold the long-dated maturity, is basically unchanged. Longer-duration yields matter because they provide a peg for borrowing rates used by U.S. households, businesses and the government.

    Developments in Japan are now creating the risk that U.S. yields could rise alongside Japan’s yields. This week, Japanese government-bond yields hit their highest levels in almost two decades, with the country’s 10-year rate BX:TMBMKJP-10Y spiking above 1.78% to its highest level in more than 17 years. The 40-year yield BX:TMBMKJP-40Y climbed to an all-time high just above 3.7%.

In the U.S., 2- BX:TMUBMUSD02Y and 10-year yields BX:TMUBMUSD10Y finished Friday’s session at their lowest levels of the past three weeks, at 3.51% and almost 4.06%, respectively. The 30-year U.S. yield BX:TMUBMUSD30Y fell to 4.71%, its lowest level since Nov. 13.

    There’s a risk now that U.S. yields may not fall as much as they otherwise might after factoring in market-implied expectations for a series of interest-rate cuts by the Federal Reserve into 2026.

    Japan’s large U.S. footprint

    Treasury yields are not going to necessarily follow rates on Japanese government bonds higher “on a one-for-one basis,” but there might be a limit on how low they can go, said Adam Turnquist, chief technical strategist at LPL Financial. He added that the impact of Japanese developments on the U.S. bond market could take years to play out, but “we care now because of the direction Japan’s policy is going in” and the possibility that this impact might occur even sooner.

    Some of the catalysts that usually tend to push Treasury yields lower, such as any commentary from U.S. monetary policymakers that suggests the Fed might be inclined to cut rates, “might be muted because of the increased value of foreign debt,” Turnquist added.

    U.S. government debt rallied for a second day on Friday, pushing yields lower, after New York Fed President John Williams said there is room to cut interest rates in the near term.

    All three major U.S. stock indexes DJIA SPX COMP closed higher Friday, but notched sharp weekly losses, as investors attempted to calm doubts over the artificial-intelligence trade.

    The troubling spike in yields on Japanese government bonds hasn’t fully spilled over into the U.S. bond market yet, but it remains a risk. “A repeat of the Truss episode is what people are afraid of,” said Marc Chandler, chief market strategist and managing director at Bannockburn Capital Markets.

    Concerns about Japan gained added significance on Friday, when Takaichi’s cabinet approved a 21.3 trillion yen (or roughly $140 billion) economic stimulus package, which Reuters described as lavish. The amount of new spending being injected into the country’s economy from a supplementary budget, much of which is not repurposed from existing funds, is 17.7 trillion yen ($112 billion).

    Anxiety over Takaichi’s stimulus efforts has resulted in a Japanese yen that has weakened against its major peers and fallen to a 10-month low ahead of Friday’s session, and in a spike in the country’s long-dated yields. Yields on 30-year BX:TMBMKJP-30Y Japanese government debt have risen this month to 3.33%.

    Japan is the biggest foreign holder of Treasurys, with a roughly 13% share, according to the most recent data from the U.S. Treasury Department, and the concern is that the country’s investors might one day pull the rug by keeping more of their savings at home.

    Bond-auction anxiety

Earlier in the week, a weak 20-year auction in Japan was cited as one reason why U.S. Treasury yields were a touch lower in early New York trading, suggesting that demand for U.S. government paper remained in place. Global investors are often incentivized to move their money based on which country offers the highest yields and best overall value.

    “The conventional wisdom is that as yields rise in Japan, the Japanese are more likely to keep their savings at home rather than export it,” Chandler said. “The Japanese have been buyers of Treasurys and U.S. stocks, and if they decide to keep their money at home, those U.S. markets could lose a bid.”

    For now, Japanese investors, which include insurers and pension funds, appear to be continuing to export their savings by buying more foreign government debt like Treasurys. Data from the U.S. Treasury Department shows that as of September, Japanese investors held just under $1.19 trillion in Treasurys, a number which has been climbing every month this year and is up from about $1.06 trillion last December.

    One reason for this is the exchange rate. The yen has depreciated against almost every major currency this year. Japanese investors have been buying U.S. Treasurys because they can diversify against the yen, which is the weakest of the G-10 currencies on an unhedged basis, according to Chandler.

    If concerns about the Takaichi government’s stimulus efforts translate into even higher yields in Japan, this could incentivize local investors to keep more of their savings at home, but might also mean rising yields for countries like the U.S.

    -Vivien Lou Chen

    This content was created by MarketWatch, which is operated by Dow Jones & Co. MarketWatch is published independently from Dow Jones Newswires and The Wall Street Journal.

    (END) Dow Jones Newswires

    11-21-25 1609ET

    Copyright (c) 2025 Dow Jones & Company, Inc.

  • Pre-Conception Hypertension Linked to Adverse Pregnancy Outcomes in IgA Nephropathy

    Recent findings on pregnancy outcomes in women with IgA nephropathy (IgAN) suggest pre-conception use of non-renin-angiotensin-aldosterone system inhibitor (RASi) antihypertensive medications correlates with increased risk of severe hypertensive…

  • Zoomer: Powering AI Performance at Meta’s Scale Through Intelligent Debugging and Optimization

    • We’re introducing Zoomer, Meta’s comprehensive, automated debugging and optimization platform for AI. 
    • Zoomer works across all of our training and inference workloads at Meta and provides deep performance insights that enable energy savings, workflow acceleration, and efficiency gains in our AI infrastructure. 
    • Zoomer has delivered training-time reductions and significant QPS improvements, making it the de facto tool for AI performance optimization across Meta’s entire AI infrastructure.

    At the scale that Meta’s AI infrastructure operates, poor performance debugging can lead to massive energy inefficiency, increased operational costs, and suboptimal hardware utilization across hundreds of thousands of GPUs. The fundamental challenge is achieving maximum computational efficiency while minimizing waste. Every percentage point of utilization improvement translates to significant capacity gains that can be redirected to innovation and growth.

    Zoomer is Meta’s automated, one-stop-shop platform for performance profiling, debugging, analysis, and optimization of AI training and inference workloads. Since its inception, Zoomer has become the de facto tool across Meta for GPU workload optimization, generating tens of thousands of profiling reports daily for teams across all of our apps.

    Why Debugging Performance Matters

    Our AI infrastructure supports large-scale and advanced workloads across a global fleet of GPU clusters, continually evolving to meet the growing scale and complexity of generative AI.

    At the training level, it supports a diverse range of workloads, including powering models for ads ranking, content recommendations, and GenAI features.

    At the inference level, we serve hundreds of trillions of AI model executions per day.

    Operating at this scale means putting a high priority on eliminating GPU underutilization. Training inefficiencies delay model iterations and product launches, while inference bottlenecks limit our ability to serve user requests at scale. Removing resource waste and accelerating workflows helps us train larger models more efficiently, serve more users, and reduce our environmental footprint.

    AI Performance Optimization Using Zoomer

    Zoomer is an automated debugging and optimization platform that works across all of our AI model types (ads recommendations, GenAI, computer vision, etc.) and both training and inference paradigms, providing deep performance insights that enable energy savings, workflow acceleration, and efficiency gains.  

    Zoomer’s architecture consists of three essential layers that work together to deliver comprehensive AI performance insights: 

    Infrastructure and Platform Layer

    The foundation provides the enterprise-grade scalability and reliability needed to profile workloads across Meta’s massive infrastructure. This includes distributed storage systems using Manifold (Meta’s blob storage platform) for trace data, fault-tolerant processing pipelines that handle huge trace files, and low-latency data collection with automatic profiling triggers across thousands of hosts simultaneously. The platform maintains high availability and scale through redundant processing workers and can handle huge numbers of profiling requests during peak usage periods.

    Analytics and Insights Engine

    The core intelligence layer delivers deep analytical capabilities through multiple specialized analyzers. This includes: GPU trace analysis via Kineto integration and NVIDIA DCGM, CPU profiling through StrobeLight integration, host-level metrics analysis via dyno telemetry, communication pattern analysis for distributed training, straggler detection across distributed ranks, memory allocation profiling (including GPU memory snooping), request/response profiling for inference workloads, and much more. The engine automatically detects performance anti-patterns and also provides actionable recommendations.
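    To make the anti-pattern detection described above concrete, here is a minimal rule-based sketch: each rule pairs a predicate over summary metrics with a recommendation. The metric names, thresholds, and recommendations are invented for illustration and do not reflect Zoomer’s internal rules.

    ```python
    # A toy rule-based anti-pattern detector. Metric names and thresholds
    # are hypothetical, not Zoomer's actual rules.

    RULES = [
        ("cpu_bound",
         lambda m: m["gpu_busy_pct"] < 50 and m["cpu_util_pct"] > 90,
         "GPUs are starved; move data loading off the critical path."),
        ("comm_bound",
         lambda m: m["nccl_time_pct"] > 40,
         "Communication dominates; try overlapping collectives with compute."),
        ("low_tensor_core_use",
         lambda m: m["tensor_core_pct"] < 20,
         "Few Tensor Core kernels; check dtypes and matrix shapes."),
    ]

    def detect_anti_patterns(metrics):
        """Return (name, recommendation) pairs for every rule that fires."""
        return [(name, rec) for name, pred, rec in RULES if pred(metrics)]

    report = detect_anti_patterns({
        "gpu_busy_pct": 35, "cpu_util_pct": 97,
        "nccl_time_pct": 12, "tensor_core_pct": 55,
    })
    print([name for name, _ in report])  # ['cpu_bound']
    ```

    Keeping rules as data (predicate plus recommendation) is what makes the engine easy to extend with new anti-patterns without touching the analysis loop.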

    Visualization and User Interface Layer

    The presentation layer transforms complex performance data into intuitive, actionable insights. This includes interactive timeline visualizations showing GPU activity across thousands of ranks, multi-iteration analysis for long-running training workloads, drill-down dashboards with percentile analysis across devices, trace data visualization integrated with Perfetto for kernel-level inspection, heat map visualizations for identifying outliers across GPU deployments, and automated insight summaries that highlight critical bottlenecks and optimization opportunities.

    The three essential layers of Zoomer’s architecture.

    How Zoomer Profiling Works: From Trigger to Insights

    Understanding how Zoomer conducts a complete performance analysis provides insight into its sophisticated approach to AI workload optimization.

    Profiling Trigger Mechanisms

    Zoomer operates through both automatic and on-demand profiling strategies tailored to different workload types. For training workloads, which involve multiple iterations and can run for days or weeks, Zoomer automatically triggers profiling around iteration 550-555 to capture stable-state performance while avoiding startup noise. For inference workloads, profiling can be triggered on-demand for immediate debugging or through integration with automated load testing and benchmarking systems for continuous monitoring.
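    The iteration-window trigger described above can be sketched as a small hook that activates profiling for a fixed window of steps, skipping warm-up noise. The class and method names here are hypothetical; Zoomer’s internal API is not public.

    ```python
    # Minimal sketch of an iteration-window profiling trigger
    # (hypothetical names; not Zoomer's actual interface).

    class ProfilerTrigger:
        """Profiles iterations [start_iter, stop_iter], skipping the
        noisy warm-up phase of a training job."""

        def __init__(self, start_iter=550, stop_iter=555):
            self.start_iter = start_iter
            self.stop_iter = stop_iter
            self.active = False
            self.captured = []  # iterations profiled in this session

        def on_iteration(self, it):
            if it == self.start_iter:
                self.active = True   # a real hook would start the profiler here
            if self.active:
                self.captured.append(it)
            if it == self.stop_iter:
                self.active = False  # ...and here stop it and upload the trace

    trigger = ProfilerTrigger()
    for step in range(1000):
        trigger.on_iteration(step)

    print(trigger.captured)  # [550, 551, 552, 553, 554, 555]
    ```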

    Comprehensive Data Capture

    During each profiling session, Zoomer simultaneously collects multiple data streams to build a holistic performance picture: 

    • GPU Performance Metrics: SM utilization, GPU memory utilization, GPU busy time, memory bandwidth, Tensor Core utilization, clock frequencies, and power consumption data via DCGM integration.
    • Detailed Execution Traces: Kernel-level GPU operations, memory transfers, CUDA API calls, and communication collectives via PyTorch Profiler and Kineto.
    • Host-Level Performance Data: CPU utilization, memory usage, network I/O, storage access patterns, and system-level bottlenecks via dyno telemetry.
    • Application-Level Annotations: Training iterations, forward/backward passes, optimizer steps, data loading phases, and custom user annotations.
    • Inference-Specific Data: Rate of inference requests, server latency, active requests, GPU memory allocation patterns, request latency breakdowns via Strobelight’s Crochet profiler, serving parameter analysis, and thrift request-level profiling.
    • Communication Analysis: NCCL collective operations, inter-node communication patterns, and network utilization for distributed workloads.
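    Since these streams are sampled independently, building a “holistic performance picture” requires aligning them onto a shared timeline. A minimal sketch, with invented stream names and a simple bucketing scheme:

    ```python
    # Illustrative alignment of independently sampled streams (GPU counters,
    # host telemetry) into per-window records. Stream names are hypothetical.

    from collections import defaultdict

    def merge_streams(streams, bucket_ms=100):
        """streams: {name: [(timestamp_ms, value), ...]}.
        Returns {bucket_start_ms: {name: value}}, keeping the last sample
        seen in each bucket for each stream."""
        merged = defaultdict(dict)
        for name, samples in streams.items():
            for ts, value in samples:
                bucket = (ts // bucket_ms) * bucket_ms
                merged[bucket][name] = value
        return dict(merged)

    timeline = merge_streams({
        "gpu_sm_util":   [(10, 0.91), (120, 0.88)],
        "host_cpu_util": [(15, 0.40), (130, 0.55)],
    })
    # Each 100 ms window now holds a GPU and a host reading side by side.
    print(timeline[0])    # {'gpu_sm_util': 0.91, 'host_cpu_util': 0.4}
    print(timeline[100])  # {'gpu_sm_util': 0.88, 'host_cpu_util': 0.55}
    ```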

    Distributed Analysis Pipeline

    Raw profiling data flows through sophisticated processing systems that deliver multiple types of automated analysis including:

    • Straggler Detection: Identifies slow ranks in distributed training through comparative analysis of execution timelines and communication patterns.
    • Bottleneck Analysis: Automatically detects CPU-bound, GPU-bound, memory-bound, or communication-bound performance issues.
    • Critical Path Analysis: Systematically identifies the longest execution paths to focus optimization efforts on highest-impact opportunities.
    • Anti-Pattern Detection: Rule-based systems that identify common efficiency issues and generate specific recommendations.
    • Parallelism Analysis: Deep understanding of tensor, pipeline, data, and expert parallelism interactions for large-scale distributed training.
    • Memory Analysis: Comprehensive analysis of GPU memory usage patterns, allocation tracking, and leak detection.
    • Load Imbalance Analysis: Detects workload distribution issues across distributed ranks and provides recommendations for optimization.
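    As a toy version of the straggler detection above: compare per-rank step durations and flag ranks far above the typical duration. The real analysis also inspects communication patterns; this sketch uses timings only, with a median-based threshold for robustness to the outliers it is hunting.

    ```python
    # Toy straggler detector: flag ranks whose mean step time is far
    # above the fleet median (thresholds are illustrative).

    import statistics

    def find_stragglers(step_times_ms, slowdown=1.5):
        """step_times_ms: {rank: mean step duration in ms}.
        Returns ranks slower than `slowdown` times the median rank."""
        median = statistics.median(step_times_ms.values())
        return [r for r, t in step_times_ms.items() if t > slowdown * median]

    # Rank 3 is ~2x slower than its peers (e.g. sharding imbalance or a
    # degraded host) and will stall every synchronized collective.
    times = {0: 101.0, 1: 99.5, 2: 100.2, 3: 205.0, 4: 100.8}
    print(find_stragglers(times))  # [3]
    ```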

    Multi-Format Output Generation

    Results are presented through multiple interfaces tailored to different user needs: interactive timeline visualizations showing activity across all ranks and hosts, comprehensive metrics dashboards with drill-down capabilities and percentile analysis, trace viewers integrated with Perfetto for detailed kernel inspection, automated insights summaries highlighting key bottlenecks and recommendations, and actionable notebooks that users can clone to rerun jobs with suggested optimizations.

    Specialized Workload Support

    For massive distributed training of specialized workloads like GenAI, Zoomer contains a purpose-built platform supporting LLM workloads with specialized capabilities including GPU efficiency heat maps and N-dimensional parallelism visualization. For inference, specialized analysis covers single-GPU models today and will soon expand to massive distributed inference across thousands of servers.

    A Glimpse Into Advanced Zoomer Capabilities

    Zoomer offers an extensive suite of advanced capabilities designed for different AI workload types and scales. While a comprehensive overview of all features would require multiple blog posts, here’s a glimpse at some of the most compelling capabilities that demonstrate Zoomer’s depth:

    Training Powerhouse Features:

    • Straggler Analysis: Helps identify ranks in distributed training jobs that are significantly slower than others, causing overall job delays due to synchronization bottlenecks. Zoomer provides information that helps diagnose root causes like sharding imbalance or hardware issues.
    • Critical Path Analysis: Identification of the longest execution paths in PyTorch applications, enabling accurate performance improvement projections.
    • Advanced Trace Manipulation: Sophisticated tools for compression, filtering, combination, and segmentation of massive trace files (2GB+ per rank), enabling analysis of previously impossible-to-process large-scale training jobs.
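    One trace-manipulation primitive can be sketched concretely: filtering a Chrome-trace JSON (the format PyTorch/Kineto traces export to) down to events above a duration threshold, which can shrink multi-gigabyte per-rank traces enough to load in a viewer. The `traceEvents`, `ph`, and `dur` keys follow the Chrome trace format; the threshold is arbitrary.

    ```python
    # Filter a Chrome-trace JSON, dropping short complete events ("ph": "X",
    # durations in microseconds) while keeping metadata events intact.

    import json

    def filter_trace(trace_json, min_dur_us=100):
        trace = json.loads(trace_json)
        trace["traceEvents"] = [
            ev for ev in trace["traceEvents"]
            # 'X' marks complete events, which carry a duration; keep all
            # other phases (metadata, flow events, etc.) untouched.
            if ev.get("ph") != "X" or ev.get("dur", 0) >= min_dur_us
        ]
        return json.dumps(trace)

    raw = json.dumps({"traceEvents": [
        {"name": "big_gemm", "ph": "X", "dur": 1500},
        {"name": "tiny_copy", "ph": "X", "dur": 4},
        {"name": "process_name", "ph": "M"},  # metadata event, kept
    ]})
    filtered = json.loads(filter_trace(raw))["traceEvents"]
    print([ev["name"] for ev in filtered])  # ['big_gemm', 'process_name']
    ```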

    Inference Excellence Features:

    • Single-Click QPS Optimization: A workflow that identifies bottlenecks and triggers automated load tests with one click, reducing optimization time while delivering QPS improvements of +2% to +50% across different models, depending on model characteristics. 
    • Request-Level Deep Dive: Integration with Crochet profiler provides Thrift request-level analysis, enabling identification of queue time bottlenecks and serving inefficiencies that traditional metrics miss.
    • Realtime Memory Profiling: GPU memory allocation tracking, providing live insights into memory leaks, allocation patterns, and optimization opportunities.
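    In the spirit of the memory profiling described above, a toy allocation tracker can show the core idea: record every allocation and free, maintain live and peak byte counts, and report blocks that were never freed (a leak signature). All names here are illustrative, not Zoomer’s API.

    ```python
    # Toy allocation tracker: live/peak accounting plus leak reporting.

    class AllocTracker:
        def __init__(self):
            self.live = {}        # handle -> (size_bytes, tag)
            self.live_bytes = 0
            self.peak_bytes = 0
            self._next = 0

        def alloc(self, size, tag=""):
            handle = self._next
            self._next += 1
            self.live[handle] = (size, tag)
            self.live_bytes += size
            self.peak_bytes = max(self.peak_bytes, self.live_bytes)
            return handle

        def free(self, handle):
            size, _ = self.live.pop(handle)
            self.live_bytes -= size

        def leaks(self):
            """Blocks still live at shutdown, largest first."""
            return sorted(self.live.values(), reverse=True)

    t = AllocTracker()
    a = t.alloc(512, "kv_cache")
    b = t.alloc(128, "activation")
    t.free(b)
    print(t.peak_bytes, t.live_bytes)  # 640 512
    print(t.leaks())                   # [(512, 'kv_cache')]
    ```

    A real profiler would hook the allocator itself and attach stack traces to each block, but the accounting structure is the same.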

    GenAI Specialized Features:

    • LLM Zoomer for Scale: A purpose-built platform supporting 100k+ GPU workloads with N-dimensional parallelism visualization, GPU efficiency heat maps across thousands of devices, and specialized analysis for tensor, pipeline, data, and expert parallelism interactions.
    • Post-Training Workflow Support: Enhanced capabilities for GenAI post-training tasks including SFT, DPO, and ARPG workflows with generator and trainer profiling separation.

    Universal Intelligence Features:

    • Holistic Trace Analysis (HTA): Advanced framework for diagnosing distributed training bottlenecks across communication overhead, workload imbalance, and kernel inefficiencies, with automatic load balancing recommendations.
    • Zoomer Actionable Recommendations Engine (Zoomer AR): Automated detection of efficiency anti-patterns with machine learning-driven recommendation systems that generate auto-fix diffs, optimization notebooks, and one-click job re-launches with suggested improvements.
    • Multi-Hardware Profiling: Native support across NVIDIA GPUs, AMD MI300X, MTIA, and CPU-only workloads with consistent analysis and optimization recommendations regardless of hardware platform.

    Zoomer’s Optimization Impact: From Debugging to Energy Efficiency

    Performance debugging with Zoomer creates a cascading effect that transforms low-level optimizations into massive efficiency gains. 

    The optimization pathway flows from: identifying bottlenecks → improving key metrics → accelerating workflows → reducing resource consumption → saving energy and costs.

    Zoomer’s Training Optimization Pipeline

    Zoomer’s training analysis identifies bottlenecks in GPU utilization, memory bandwidth, and communication patterns. 

    Example of Training Efficiency Wins: 

    • Algorithmic Optimizations: We delivered power savings through systematic efficiency improvements across the training fleet by fixing reliability issues in low-efficiency jobs.
    • Training Time Reduction Success: In 2024, we observed a 75% training-time reduction for Ads relevance models, leading to a 78% reduction in power consumption.
    • Memory Optimizations: One-line code changes, fixing inefficient memory copies identified by Zoomer, delivered 20% QPS improvements with minimal engineering effort.

    Inference Optimization Pipeline:

    Inference debugging focuses on latency reduction, throughput optimization, and serving efficiency. Zoomer identifies opportunities in kernel execution, memory access patterns, and serving parameter tuning to maximize requests per GPU.
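    Serving-parameter tuning can be framed as a small search problem: pick the setting that maximizes QPS while staying under a latency budget. The sketch below uses a synthetic latency model for a single parameter (batch size); in practice Zoomer-style tooling drives real load tests rather than a formula.

    ```python
    # Sketch of serving-parameter tuning: choose the batch size that
    # maximizes throughput under a latency budget. The latency model is
    # synthetic and purely illustrative.

    def latency_ms(batch_size):
        # Fixed overhead plus per-item cost that worsens once the
        # (hypothetical) GPU saturates beyond batch size 32.
        return 8.0 + batch_size * (1.2 if batch_size <= 32 else 2.5)

    def tune_batch_size(candidates, latency_budget_ms=60.0):
        best, best_qps = None, 0.0
        for bs in candidates:
            lat = latency_ms(bs)
            if lat > latency_budget_ms:
                continue  # violates the serving SLA, skip
            qps = bs / (lat / 1000.0)  # requests served per second
            if qps > best_qps:
                best, best_qps = bs, qps
        return best, round(best_qps)

    print(tune_batch_size([1, 4, 8, 16, 32, 64]))  # (32, 690)
    ```

    The same sweep structure extends to multi-parameter tuning (threads, streams, precision), where automated load testing becomes essential.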

    Inference Efficiency Wins:

    • GPU and CPU Serving-Parameter Improvements: Automated GPU and CPU bottleneck identification and parameter tuning, leading to a 10% to 45% reduction in power consumption.
    • QPS Optimization: GPU trace analysis used to boost serving QPS and optimize serving capacity.

    Zoomer’s GenAI and Large-Scale Impact

    For massive distributed workloads, even small optimizations compound dramatically. 32k GPU benchmark optimizations achieved 30% speedups through broadcast issue resolution, while 64k GPU configurations delivered 25% speedups in just one day of optimization.

    The Future of AI Performance Debugging

    As AI workloads expand in size and complexity, Zoomer is advancing to meet new challenges on several innovation fronts: broadening unified performance insights across heterogeneous hardware (including MTIA and next-gen accelerators), building advanced analyzers for proactive optimization, enabling inference performance tuning through serving-parameter optimization, and democratizing optimization with automated, intuitive tools for all engineers. As Meta’s AI infrastructure continues its rapid growth, Zoomer plays an important role in helping us innovate efficiently and sustainably.


  • Here are real AI stocks to invest in and speculative ones to avoid

  • Robert F Kennedy Jr instructed CDC to change stance on vaccine and autism

    Robert F Kennedy Jr, the US health secretary, said in an interview with the New York Times that he personally instructed the federal Centers for Disease Control and Prevention (CDC) to change its longstanding position that vaccines do not cause…

  • The Stand With Public Media Gala Honored Truth-seekers and Storytellers

    Spoiler alert: when Manhattan eventually became their joint address, Colbert realized what he’d been missing. “WNYC got you going in the morning: all the information, all the culture, and all the things you needed to know about New York and…

  • Save $350 on the iPhone of Androids This Black Friday

    Black Friday deals alert: The Google Pixel 9 Pro is currently $350 off during early Black Friday sales, making it an impressive $649.

    MOBILE DEALS OF THE WEEK

    Deals are selected by the CNET Group commerce team, and may be unrelated to this…

  • Analog Devices to Participate in the UBS Global Technology Conference

    WILMINGTON, Mass., Nov. 21, 2025 /PRNewswire/ — Analog Devices, Inc. (Nasdaq: ADI) today announced that the Company’s Executive Vice President & Chief Financial Officer, Richard Puccio, will discuss business topics and trends at the UBS Global Technology Conference, taking place at the Phoenician Hotel, located in Scottsdale, Arizona on Tuesday, December 2, 2025, at 10:15 a.m. MST.

    The webcast for the conference may be accessed live via the Investor Relations section of Analog Devices’ website at investor.analog.com. An archived replay will also be available following the webcast for at least 30 days.

    About Analog Devices, Inc.

    Analog Devices, Inc. (NASDAQ: ADI) is a global semiconductor leader that bridges the physical and digital worlds to enable breakthroughs at the Intelligent Edge. ADI combines analog, digital, and software technologies into solutions that help drive advancements in digitized factories, mobility, and digital healthcare, combat climate change, and reliably connect humans and the world. With revenue of more than $9 billion in FY24 and approximately 24,000 people globally, ADI ensures today’s innovators stay Ahead of What’s Possible. Learn more at www.analog.com and on LinkedIn and Twitter (X).

    For more information, please contact:
    Jeff Ambrosi
    Senior Director of Investor Relations
    Analog Devices, Inc.
    781-461-3282
    [email protected]

    SOURCE Analog Devices, Inc.

