Computer Science (arXiv)

AuRA: Internalizing Audio Understanding into LLMs as LoRA

cs.LG Jun 09, 2026

Recent efforts to extend large language models (LLMs) to speech inputs typically rely on cascaded ASR-LLM pipelines, end-to-end speech-language models, or bridge/distillation-based adaptation. While these routes respectively reuse strong pretrained components, enable native speech-language interaction, or offer lightweight adaptation, they often suffer from transcript-interface latency, costly multimodal training, or sequential speech-language coupling. To address these limitations, we present AuRA, a method that distills audio encoding capability into the LLM. Specifically, AuRA feeds the same speech input to an ASR encoder (as a teacher) and a LoRA-adapted LLM (as a student) through a lightweight audio embedding layer, and uses layer-wise distillation to align the student's hidden states with corresponding teacher representations, thereby internalizing speech representations into lightweight LLM-side adaptations. Compared with cascaded and serial bridge methods, AuRA enables tighter speech-language joint modeling and efficient parallel end-to-end inference, while also reusing pretrained speech and language models rather than requiring large-scale multimodal training. On multiple speech-language benchmarks, AuRA consistently outperforms cascaded systems, speech-to-LLM adaptation baselines, and large-scale speech-language and multimodal models in both effectiveness and efficiency.

U-TTT: Towards Generalizable PET Image Denoising via Test-Time Training

cs.CV Jun 09, 2026

Existing deep learning models for Positron Emission Tomography (PET) image denoising often suffer from severe performance degradation under distribution shifts, fundamentally restricting their robust clinical deployment. This lack of generalization stems from the conventional paradigm of fixed-parameter models that cannot adapt to variations in test data (e.g., dose levels or scanner types) after training. To overcome this limitation and achieve robust generalization, we introduce U-TTT, a novel U-shaped model that integrates Test-Time Training (TTT) layers to dynamically adjust model parameters during inference through self-supervision, thereby adapting to the specific characteristics of each test instance. Furthermore, to comprehensively capture the complex degradations of 3D PET data, U-TTT features a dual-domain adaptation mechanism comprising a Spatial Test-Time Training (S-TTT) layer and a Frequency Test-Time Training (F-TTT) layer. The S-TTT layer captures and corrects spatial structural degradations, while the F-TTT layer suppresses global noise spectra and restores delicate high-frequency details. Extensive experiments demonstrate that U-TTT achieves state-of-the-art PET denoising performance and exhibits superior generalization under challenging distribution shifts, including both unseen dose levels and unseen scanners. Our code will be available at https://github.com/Yaziwel/U-TTT.

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

cs.LG Jun 09, 2026

Recent work has demonstrated that online reinforcement learning (RL) can substantially improve the quality and alignment of flow matching models for image and video generation. Methods such as Flow-GRPO and CPS cast the denoising process as a Markov Decision Process and apply PPO-style ratio clipping to enforce a trust region. However, we argue that ratio clipping is structurally ill-suited for flow models: the probability ratio between new and old policies is a noisy, single-sample estimate of the true policy divergence, leading to over-constraining in some regions of the trajectory and under-constraining in others. We propose Flow-DPPO (Flow Divergence Proximal Policy Optimization), which replaces ratio clipping with a divergence proximal constraint. A key observation is that the per-step policy in flow models is Gaussian, enabling exact and cheap computation of the KL divergence between old and new policies. Flow-DPPO employs an asymmetric divergence mask that blocks gradient updates only when they simultaneously move away from the trusted region and violate the divergence threshold. Experiments show that Flow-DPPO achieves higher rewards with better KL-proximal efficiency, alleviates catastrophic forgetting, promotes balanced multi-objective optimization, and enables stable multi-epoch training where ratio clipping degrades. Code and models are available at https://github.com/Tencent-Hunyuan/UniRL/tree/main/FlowDPPO.
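The closed-form Gaussian KL and the masking rule described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the old and new per-step policies share one fixed isotropic standard deviation, and it reads "moving away from the trusted region" as the probability ratio moving in the direction the advantage favors. The names `gaussian_kl`, `divergence_mask`, and `delta` are hypothetical.

```python
def gaussian_kl(mu_old, mu_new, sigma):
    """KL(N(mu_old, sigma^2 I) || N(mu_new, sigma^2 I)).

    Assumption: both policies share the same fixed isotropic sigma,
    so the KL reduces to a scaled squared distance between the means.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(mu_old, mu_new))
    return squared_distance / (2.0 * sigma ** 2)


def divergence_mask(kl, ratio, advantage, delta):
    """Return 0.0 to block a sample's gradient and 1.0 to keep it.

    Hypothetical reading of the asymmetric mask: block only when the
    update is pushing further from the old policy AND the divergence
    already exceeds the threshold delta.
    """
    moving_away = (advantage > 0 and ratio > 1.0) or (advantage < 0 and ratio < 1.0)
    return 0.0 if (moving_away and kl > delta) else 1.0


if __name__ == "__main__":
    kl = gaussian_kl([0.0, 0.0], [1.0, 1.0], sigma=1.0)
    print(kl, divergence_mask(kl, ratio=1.2, advantage=0.5, delta=0.1))
```

Under this reading, a sample whose divergence exceeds the threshold still contributes gradient when the update would pull the policy back toward the old one, which is what makes the mask asymmetric.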

Generative Archetype-Grounded Item Representations for Sequential Recommendation

cs.IR Jun 09, 2026

Sequential recommendation aims to predict users' next interaction with items by analyzing their historical behavior. However, the limited quality of item representations remains a critical bottleneck. While pre-trained large language models (LLMs) can provide rich semantic representations, existing approaches rely only on static encoding of fixed attributes, overlooking the crucial role of target audiences in defining item identity. Moreover, the semantic space struggles to reflect actual user behavior, resulting in a significant gap between semantic representations and behavioral patterns. To address these limitations, we propose GenAIR, a general framework that empowers sequential recommendation with Generative Archetype-grounded Item Representations. Specifically, we first leverage an LLM to analyze item metadata and infer a textual description of the Archetype, which represents the conceptual profile of the item's ideal target audience. We then extract the corresponding embeddings in a single forward pass. Further, to ground these generative archetypes in real-world behavior, we introduce a behavioral calibration objective, which explicitly incorporates behavioral signals from actual interactions. This objective adjusts the structure of the embedding space to reflect empirical patterns. GenAIR enables seamless integration with most existing models while maintaining high efficiency. Comprehensive experiments conducted on three real-world datasets demonstrate that GenAIR significantly improves the performance of various sequential recommendation models and consistently outperforms state-of-the-art baseline approaches. The implementation code is available at https://github.com/AI-Santiago/GenAIR.

When Discovery Outpaces Remediation: Modeling AI-Accelerated Vulnerability Discovery in Interconnected Systems

cs.CR Jun 09, 2026

Advanced AI systems for code analysis, binary analysis, fuzzing orchestration, and penetration-test planning may significantly increase the rate at which latent vulnerabilities are discovered. While improved discovery can benefit defenders, it can also overload remediation pipelines and accelerate adversarial weaponization. This paper develops a queueing and network-theoretic model of AI-accelerated vulnerability discovery in interconnected systems. We represent an enterprise as a weighted dependency graph with replenishing vulnerability pools, finite remediation capacity, triage degradation, exploit window compression, and dynamic compromise propagation. We derive stability conditions for vulnerability backlogs, formulate a dynamic coupling between unresolved backlog and cascade risk, and evaluate mitigation strategies through simulation. Results indicate that when actionable discovery arrivals exceed remediation throughput, backlogs grow rapidly and systemic risk increases nonlinearly. In hub-dominated topologies, segmentation can reduce propagated compromise more effectively than remediation speed alone, while the strongest defense combines remediation automation with reduced network coupling.
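The backlog stability condition can be illustrated with a deterministic fluid sketch. This is not the paper's model: the update rule, the `degradation` parameter (a stand-in for triage degradation that shrinks effective throughput as the backlog grows), and all names are assumptions made for illustration.

```python
def simulate_backlog(arrival_rate, remediation_rate, steps, degradation=0.0):
    """Track an unresolved-vulnerability backlog over discrete time steps.

    Each step adds `arrival_rate` findings and removes up to the
    effective remediation throughput. With degradation > 0, throughput
    falls as the backlog grows (a hypothetical triage-overload effect).
    """
    backlog = 0.0
    history = []
    for _ in range(steps):
        effective_rate = remediation_rate / (1.0 + degradation * backlog)
        backlog = max(0.0, backlog + arrival_rate - effective_rate)
        history.append(backlog)
    return history


if __name__ == "__main__":
    print(simulate_backlog(5, 8, 10)[-1])                   # arrivals below capacity
    print(simulate_backlog(8, 5, 10)[-1])                   # arrivals above capacity
    print(simulate_backlog(8, 5, 10, degradation=0.1)[-1])  # with overload feedback
```

In this sketch the backlog stays at zero while arrivals remain below capacity, grows linearly once they exceed it, and grows faster still when overload feeds back into throughput.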

Making a Name for Myself: On Academic Naming Policies and their Impact

cs.DL Jun 09, 2026

In academic publishing, names connect scholars to their work. When scholars change their names, including for marriage, academic recognition, or gender transition, they may lose credit for past publications. However, despite significant impacts on citation accuracy and researcher well-being, no existing studies examine how naming policies in computer science serve researchers who change their names. We use a mixed-methods approach combining surveys, interviews, and large-scale citation analysis of papers from eight major computer science venues from 2019 to 2025. We document the multi-year advocacy effort that established the first name change policies and identify implementation barriers, including incomplete publisher updates and months-long processing delays. Researchers continue being cited with misparsed and incorrect names despite publisher updates. When these citation errors happen, interviewees report significant mental health impacts, including stress, anxiety, and safety risks. Empirically, we find that venues with accessible and visible name change policies have significantly fewer citation errors than venues with inaccessible policies (899 vs. 996 errors per 1,000 papers). Our annotation analysis shows that deadnaming of transgender researchers in citations decreased by 92% from 2019 to 2024. Our findings demonstrate the importance of inclusive publishing policies, for which name change policy advocacy led by trans researchers has been a significant driver. We recommend that venues adopt proactive visible name change policies, support queer advocacy groups, and improve publication infrastructure to build an inclusive publishing landscape. The accompanying toolkit for checking errors in bibliographic LaTeX files is available at https://github.com/pranav-ust/cite-updater.

Diffusion Forcing Planner: History-Annealed Planning with Time-Dependent Guidance for Autonomous Driving

cs.RO Jun 09, 2026

Learning-based motion planners, despite recent progress, often suffer from temporal inconsistency. Small perturbations across frames can accumulate into unstable trajectories, degrading comfort and safety in closed-loop driving. Several methods attempt to inject history as a static conditioning signal to stabilize outputs, only to induce the planner to copy historical patterns instead of adapting to environment contexts. To address this limitation, we propose Diffusion Forcing Planner (DFP), a diffusion-based planning framework driven by history-guided control. Specifically, DFP decomposes the full trajectory into history, current, and future segments, and assigns independent noise levels to each segment. The model jointly denoises the historical and the future segments, enforcing a heterogeneous joint diffusion process. At inference, classifier-free guidance (CFG) is applied to steer future sampling using annealed history in a controllable manner. Closed-loop evaluation and comprehensive ablations on nuPlan show that DFP achieves competitive performance while producing continuous, stable, and controllable motion plans in complex driving scenarios.

Measuring Human Value Expression in Social Media Texts: Calibrated LLM Annotation and Encoder Transfer

cs.CL Jun 09, 2026

Measuring subjective constructs in naturally occurring social media text requires annotation procedures that are theoretically grounded, empirically validated, and transferable to an encoder model for scalable prediction. Using non-English social media posts annotated according to Schwartz's theory of basic human values, we investigate how different LLMs, prompts, and instruction languages operationalize the expression of values in text. We argue that although texts may permit multiple plausible interpretations, theory-based value definitions can constrain interpretations and reduce spurious value attributions. Beyond precision, recall, and F1, we evaluate structural alignment between values, error structure, confidence-ambiguity relations, and annotation stability. We show that different LLMs produce different value interpretations. Iterative prompt calibration through error analysis reduces misattributions and improves alignment with expert annotations. We also derive targeted expert verification rules from recurrent error structures and use them during corpus annotation. Finally, we show that LLM annotations can be transferred to an encoder model through soft-label training, retaining theory-based value interpretations and information about uncertainty in value expression.

Data-Driven Runway and Taxiway Exits Prediction of Landing Aircraft: A Case Study at Hartsfield-Jackson Atlanta International Airport

cs.LG Jun 09, 2026

Airport surface operations increasingly constrain performance at high-throughput hubs. This study examines arrival taxi-in decisions at Hartsfield-Jackson Atlanta International Airport (KATL) and proposes a two-stage, data-driven decision aid that mirrors controller workflow. Stage I predicts the runway exit selected by an arriving aircraft. Stage II predicts whether, given that exit, the aircraft will cross the active departure runway at a designated point or use the end-around taxiway. Models are trained using ASDE-X surface trajectories, aircraft characteristics, ramp destinations, short-horizon traffic rates, and weather across multiple look-back windows. We benchmark nine classifiers, including Random Forest, XGBoost, LightGBM, and CatBoost, and evaluate accuracy, macro-F1, precision-recall behavior, confusion matrices, Brier score, and Expected Calibration Error. Across east and west flows, XGBoost and LightGBM outperform Random Forest. Stage I achieves 0.86-0.89 accuracy with macro-F1 scores of 0.40-0.50, while Stage II achieves 0.70-0.74 accuracy with macro-F1 scores of 0.28-0.55. Feature-importance analysis shows that approach speed is the main driver of exit choice. Departure rate, crossing rate, ramp destination, and, for west flow, the selected exit are the strongest predictors of crossing versus end-around routing. Minority classes remain harder to predict because of feature-space overlap, as shown by t-SNE and UMAP analyses. The proposed framework supports controller situational awareness through calibrated, explainable predictions while preserving human responsibility for final routing decisions.
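The two calibration metrics named above can be sketched as follows. This is a generic illustration rather than the study's evaluation code: the Brier score is shown in its binary form, the Expected Calibration Error uses equal-width confidence bins, and the function names and the `n_bins` default are assumptions.

```python
def brier_score(probs, labels):
    """Mean squared gap between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |mean confidence - accuracy| over confidence bins.

    `confidences` holds the probability of the predicted class and
    `correct` holds 1 where the prediction was right, else 0.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if members:
            mean_conf = sum(c for c, _ in members) / len(members)
            accuracy = sum(ok for _, ok in members) / len(members)
            ece += len(members) / total * abs(mean_conf - accuracy)
    return ece


if __name__ == "__main__":
    print(brier_score([0.5, 0.5], [1, 0]))
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

A model can reach high accuracy while its macro-F1 and calibration remain weak on minority classes, which is why reporting these metrics alongside accuracy is informative.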

Superficial Beliefs in LLM Decision-Making

cs.AI Jun 09, 2026

We ask whether large language models (LLMs) merely imitate rationales when choosing between two options, or whether their choices reflect a systematic underlying decision structure. Using synthetic binary decision settings in which models choose between profiles defined by graded attributes, we compare the attribute a model says mattered most with the attribute that best explains its choice under a behavioural model fit to prior decisions. The behavioural model predicts held-out choices well, showing that model behaviour is systematically related to the visible attributes rather than being random. However, direct self-reports and a separate score-based judge recover the behaviourally inferred driver only partially. The resulting picture is neither one of arbitrary behaviour nor one of fully articulated belief: outputs are structured enough to support prediction, but explicit reasons track the recovered driver only imperfectly. This qualitative pattern persists across prompt-order and sampling perturbations, alternative behavioural models, targeted occlusion analyses, and structurally varied decision settings. We interpret this as evidence for "superficial belief" in LLM decision-making: models behave as if guided by probabilistic local priorities over attributes, while having only limited verbal access to the attributes that drive their decisions.

Structure from Reasoning, Numbers from Search: On-Premise Open LLMs as Structural Priors for Coupled MIMO Controller Tuning

cs.AI Jun 09, 2026

Tuning controllers for strongly coupled multi-input multi-output (MIMO) industrial processes is hard: decentralized classical auto-tuning ignores loop interaction, and local numerical optimization from natural initializations stalls in the resulting non-convex cost landscape. We ask whether on-premise open-source large language models (LLMs), which keep data on-site and need no plant model, can help. On a single-loop CSTR, classical relay-feedback tuning (IAE 0.106, near the 0.102 optimum) beats an LLM tuner (0.162): for simple loops the LLM adds nothing. The picture inverts on a strongly coupled quadruple-tank with conflicting set-points, scored by a penalized cost J = IAE + lambda*TV(u) that rewards tracking without chattering actuators. There, naive relay tuning (J ~ 28.6) and naive LLM tuning (29.7) are no better than open loop (22.7), and a local optimizer from balanced starts fails in 10/10 runs. A scaffolded open LLM instead reasons about the coupling, proposes the counter-intuitive asymmetric structure, and reaches J ~ 16.9 +/- 0.2 from any start; refining it with a classical optimizer attains the smooth global optimum (J ~ 12.0, 10/10 vs. 0/10), which even applies a non-obvious negative integral correction decentralized tuning cannot. A global optimizer (differential evolution) also reaches this optimum, so the LLM is not the only route; its advantage is sample efficiency and interpretability: a usable controller in 18 evaluations (where the global optimizer is worse than open loop) plus a stated rationale. This edge grows with dimension, reaching ~6x fewer evaluations on a 3x3 plant. The behaviour generalizes across four open models, and on a benign plant the LLM offers no advantage, sharpening the boundary. We contribute a reproducible benchmark delimiting when open LLMs help in control tuning: not as optimizers, but as a sample-efficient, interpretable structural prior.
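The penalized cost J = IAE + lambda*TV(u) can be sketched for sampled signals as follows. This is an illustrative discretization, not the benchmark's code: it assumes a single error signal and a single control signal sampled at a fixed step `dt`, a rectangle-rule integral for the IAE, and total variation taken as the sum of absolute control increments. For a MIMO plant one would presumably sum these terms over loops, which the abstract does not specify.

```python
def penalized_cost(errors, controls, dt, lam):
    """Discrete J = IAE + lam * TV(u).

    IAE: integral of |e(t)| approximated by a rectangle rule.
    TV(u): sum of |u[k+1] - u[k]|, which penalizes actuator chattering.
    """
    iae = sum(abs(e) for e in errors) * dt
    total_variation = sum(abs(b - a) for a, b in zip(controls, controls[1:]))
    return iae + lam * total_variation


if __name__ == "__main__":
    smooth = penalized_cost([1.0, 0.5, 0.0], [0.0, 0.5, 1.0], dt=0.1, lam=0.5)
    chatter = penalized_cost([1.0, 0.5, 0.0], [0.0, 1.0, 0.0], dt=0.1, lam=0.5)
    print(smooth, chatter)
```

With identical tracking error, the chattering control sequence scores worse than the smooth one, which is the behaviour the penalty term is meant to reward.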

An Uncertainty Estimation Framework for Dose Accumulation in Adaptive Radiotherapy: Application to CBCT-Guided Radiotherapy for Cervical Cancer

cs.CV Jun 09, 2026

Background and purpose: Online adaptive radiotherapy (oART) enables daily plan adaptation to interfraction anatomical variations, but cumulative dose estimation remains limited by deformable image registration (DIR), segmentation, and anatomical uncertainties. We introduce IMPACT-DoseAcc, an uncertainty-aware dose accumulation framework, within IMPACT for semantic feature-driven image analysis. The framework is modality- and disease-agnostic and is applied to CBCT-guided oART for locally advanced cervical cancer (LACC). Material and Methods: Nine LACC patients were retrospectively analyzed using daily CBCT-derived virtual CTs (vCTs) for dose recalculation. IMPACT-DoseAcc focuses on uncertainty from DIR, without modeling vCT-generation uncertainty. Two DIR uncertainty strategies were tested within IMPACT-Reg: a Bayesian segmentation-guided approach using one probabilistic model to quantify anatomical uncertainty, and an ensemble of segmentation models targeting structures to capture epistemic variability. Voxel-wise uncertainty maps were propagated through dose warping and accumulation to generate probabilistic dose-volume histograms (pDVHs). Ensemble uncertainty was quantified from voxel-wise standard deviation across deformation fields, and geometric error was assessed using surface distance between warped and validated contours. Anatomical-variability weighting refined aggregation. Results: Ensemble DIR uncertainty correlated with geometric error, with Pearson coefficients of 0.63 for CTVt and 0.66 for bladder. For CTVt, pDVHs achieved 96.3 +/- 3.9% coverage, showing calibration of propagated uncertainty. Weighting stabilized estimates across fractions and organs. Conclusions: IMPACT-DoseAcc propagates registration-driven uncertainty to cumulative dose metrics, improving interpretation of accumulated dose under anatomical variations. Its 3DSlicer integration supports reproducible, uncertainty-informed adaptive radiotherapy (ART) workflows.

Who Brought Easter Eggs to Eid? Auditing Cultural Translation of Math Word Problems Across Diverse Languages and Regions

cs.CL Jun 09, 2026

Large language models are increasingly used to adapt math word problems for personalized learning at scale, but it remains an open question whether those adaptations are consistent across models, preserve cultural diversity at scale, and reveal which cultural entities models treat as most salient. We analyze how Claude Opus 4, GPT-4.1, and Gemini 2.5 Pro adapt 60 English math word problems into Bengali, Hindi, Punjabi (India), Urdu, Sindhi (Pakistan), Italian, and Sicilian (Italy), a language set spanning the full resource spectrum, from high-resource Italian and Hindi to under-studied Sindhi, Sicilian, and Punjabi. We annotate 6,489 entity transformations, coding whether models preserve, localize, generalize, omit, or change entities such as names, foods, and places. Models agree on transformation type in 62.5% of cases and on specific substitutions in only 33.5%, meaning model choice directly shapes which cultural world students encounter. All 21 language-model combinations show entropy collapse, with adaptation compressing rather than expanding cultural diversity. Models prioritize surface markers such as names, foods, and currencies while preserving deeper structural features such as grade-level systems that embed culturally specific assumptions. Despite prompts specifying target countries, models misattribute regional context by using Bangladeshi taka for Indian Bengali students and produce cross-cultural contamination, such as adapting egg hunts as Eid activities. Some failures are visible in individual translations. Others, including diversity collapse, systematic preference for surface markers, and consistent regional misattribution, emerge only through corpus-level analysis. The surface plausibility that makes adapted problems look correct is precisely what makes deeper failures easy to overlook.

Understanding and mitigating the risks of OpenClaw for non-technical users: A practical guide with Skill

cs.CR Jun 09, 2026

OpenClaw has rapidly emerged as a transformative artificial intelligence (AI) agent framework, and its ability to autonomously execute complex, multi-step tasks has attracted an ever-growing and diverse user base. However, this capability comes with significant risks. While existing research has made important strides in characterizing these threats, such work is predominantly directed at technically sophisticated audiences. It remains largely inaccessible to non-technical users. This demographic now makes up an increasingly large and underserved portion of the community, yet it is these very users who most urgently need practical and straightforward guidance. In response, we bridge this gap through a series of interconnected efforts designed to lower the risk barrier for non-technical OpenClaw users. First, we identify and categorize seven core risks that OpenClaw users may encounter in daily usage, explaining each in plain language so that non-technical users can readily grasp the nature and potential consequences of these threats. Second, for each identified risk, we distill a set of corresponding defensive strategies into clear and actionable operational steps that are easy to follow. Third, to make protection even easier, we provide a companion OpenClaw Skill that automates key security configurations, enabling users to safeguard their systems with minimal manual intervention. Through this work, we demonstrate that safeguarding against the risks of intelligent agents need not be the exclusive domain of security experts, and that non-technical users can meaningfully participate in reducing these risks through simple, practical actions.

A Case Study Reexamining the Cold-Start Problem in Knowledge Tracing Models and Implications for SafeInsights, an Education Research Infrastructure

cs.HC Jun 09, 2026

Knowledge tracing (KT) models are widely used to predict students' evolving knowledge states from their learning history. However, many KT models are evaluated using specific datasets, platforms, and learning contexts, raising questions about whether reported model performance replicates and generalizes across newer datasets that vary in context. This paper replicates and extends Zhang et al. (2021), which examined the cold-start problem in KT models and found that deep-learning-based KT models performed better, partly because of stronger predictions when students began practicing a skill. Using a more recent ASSISTments dataset, FoundationalASSIST, we replicate the previous analysis by evaluating model performance across opportunities to practice and extend the analysis by examining performance across problem types, including fill-in-the-blank, multiple-choice select-one, multiple-choice select-all, and order/sort problems. Results show that KT model performance varies across both student practice trajectories and problem types. Beyond the empirical replication, this study identifies practical challenges in reproducing educational data mining studies and serves as a proof of concept, showing how privacy-preserving research infrastructures such as SafeInsights can be leveraged to facilitate educational research and support replication analyses.

Weighing Timed Regular Languages: The Final Step (long version)

cs.FL Jun 09, 2026

The bandwidth of a timed language characterizes the quantity of information per time unit (with a finite observation precision $\varepsilon$). The asymptotic behavior of the bandwidth as $\varepsilon \to 0$ classifies timed regular languages in three classes: meager, normal, and obese. Normal timed automata have a bounded frequency of events and some non-punctual transitions, and, up to now, were the only class of timed automata for which no algorithm was available for computing their bandwidth. In this article, we compute the bandwidth of any such automaton in the form $\approx \alpha \log(1/\varepsilon)$. Our approach reduces this problem to computing the best reward-to-cost ratio in a weighted finite graph constructed from the given timed automaton.

IPSM-Bench: A New Intermediate Phase Segmentation Benchmark in Microstructure Images of Zinc-Based Absorbable Biomaterials

cs.CV Jun 09, 2026

Zinc-based alloys are indispensable emerging absorbable metallic biomaterials, and their macroscopic performance is governed by microstructural characteristics. Intermediate phases, which are key microstructural constituents, are pivotal in regulating mechanical and functional properties. However, intermediate phase segmentation in zinc alloy microstructures faces formidable challenges: scarce annotated datasets, low contrast, difficulty detecting small targets, and heterogeneous morphologies. To this end, we construct IPSM-Bench, the largest high-quality dataset for zinc-alloy intermediate phase segmentation. Furthermore, we propose SCoP-SAM, a new Spatial Context Prior-guided SAM method that leverages the gradient structure and grayscale properties of intermediate phases to capture spatial context priors and incorporates them into the entire SAM encoding-decoding process, improving segmentation performance. Based on the proposed IPSM-Bench, we establish a new benchmark for intermediate phase segmentation to systematically evaluate state-of-the-art (SOTA) methods and advance research on zinc alloy microstructure analysis. Extensive experiments on IPSM-Bench and additional public alloy benchmarks demonstrate that our SCoP-SAM not only achieves SOTA performance for zinc-alloy intermediate phase segmentation but also generalizes remarkably well to other alloy scenarios.

Analog Quantum Asynchronous Event-Based Graph Neural Network

cs.LG Jun 09, 2026

Asynchronous, event-based graph neural networks (AEGNNs) have recently emerged as an efficient paradigm for processing the sparse and high-temporal-resolution data from event cameras. In this paper, we propose quantum analog AEGNNs (QA-AEGNNs), a novel framework to implement an AEGNN on a neutral-atom quantum computer. Neutral-atom quantum processors offer a programmable analog quantum computing platform based on controllable Rydberg-atom interactions. To this end, we map the streaming event data to an array of trapped neutral atoms, where each atom represents a graph node (event) and is positioned such that geometric proximity reflects the spatio-temporal neighborhood of events. The native Rydberg Hamiltonian of the quantum processor is programmed to mirror the message-passing computations of the AEGNN, with atomic qubit states serving as node feature embeddings and inter-atom interactions realizing graph edges. Furthermore, we propose a hybrid quantum-classical training scheme in which the analog Hamiltonian parameters (e.g., laser pulse amplitudes and detunings) are optimized using classical feedback to learn the quantum AEGNN model from data. Our approach leverages the continuous Hamiltonian dynamics and massive parallelism of neutral-atom quantum systems to natively execute event-based graph computations with potential accuracy improvements.

A Companion App for an Autonomous Family Vehicle: Identification of Values for an Autonomous Mobility System

cs.CY Jun 09, 2026

In this paper, we present a companion app for an autonomous vehicle aimed at user groups who would normally require an accompanying person to drive them. Two aspects of a companion app are presented in this paper: first, the possibility for a trusted person to track the ride of the person in need of support, and second, placing the vehicle settings for persons in need of support in the hands of a trusted person. In addition, this article describes the requirements and addressed values and discusses the safety-relevant aspects of such a companion app. We also discuss and identify the values that influence passengers and trusted persons using the companion app. Overall, a companion app can provide new perspectives and opportunities for people in need of support, allowing them to take advantage of the features offered by autonomous vehicles. It enables trusted individuals to configure the vehicle according to the passenger's needs. Such an app can also be a mechanism to involve trusted persons in the options given by the vehicle and give them the possibility to adapt the vehicle to the needs of the person in need of support.

Null-Space Constrained Low-Rank Adaptation for Response-Specified Large Language Model Unlearning

cs.AI Jun 09, 2026

Large language model unlearning aims to suppress designated undesirable knowledge while preserving benign capabilities. Many unlearning objectives focus on suppressing undesired answers, while recent target-guided variants specify replacement behavior but still leave update locality largely unconstrained. This paper introduces Null-Space Constrained Response-Specified Unlearning (NSRU), a projection-constrained low-rank framework for controlled LLM unlearning. NSRU uses an explicitly structured safe target response to specify the desired behavior for each forget query, while suppressing the original undesired content. To localize adaptation, NSRU estimates per-module retain subspaces from benign hidden representations and uses an orthogonal-projected low-rank parameterization to confine LoRA updates to the null space of the retain subspace. The resulting objective jointly optimizes safe-target learning, undesired-response suppression, and retention preservation under this constrained parameterization. We provide a local first-order analysis showing that the projected update reduces retain-side perturbations while preserving editable directions for shaping forget-query behavior. Experiments on TOFU show that NSRU effectively suppresses extractable forget-set knowledge while improving retain QA performance, model utility, and safe-target alignment over representative baselines. On WMDP, NSRU keeps hazardous-domain accuracy near the random-choice region while preserving broad and domain-adjacent MMLU utility. Ablation studies support the complementary roles of safe-target supervision, undesired-response suppression, retention loss, and null-space projected updates, while sensitivity and robustness analyses indicate stable behavior across the tested hyperparameter and prompt variations.
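The idea of confining a weight update to the null space of a retain subspace can be sketched with plain lists. This is a generic linear-algebra illustration, not the NSRU parameterization: it assumes the retain subspace is given as an orthonormal basis `retain_basis` (rows), and it right-multiplies a hypothetical update matrix `delta_w` by the projector I - U^T U so that retained input directions are left unchanged.

```python
def project_rows_to_null_space(delta_w, retain_basis):
    """Remove from each row of delta_w its components along the retain basis.

    Assumption: retain_basis rows are orthonormal, so subtracting each
    projection independently equals multiplying by (I - U^T U).
    """
    projected = []
    for row in delta_w:
        new_row = list(row)
        for u in retain_basis:
            coeff = sum(r * ui for r, ui in zip(row, u))
            new_row = [x - coeff * ui for x, ui in zip(new_row, u)]
        projected.append(new_row)
    return projected


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


if __name__ == "__main__":
    basis = [[1, 0, 0]]                       # retained input direction
    update = [[1, 2, 3], [4, 5, 6]]           # hypothetical weight update
    safe_update = project_rows_to_null_space(update, basis)
    print(matvec(safe_update, [2, 0, 0]))     # effect on a retained input
    print(matvec(safe_update, [0, 1, 0]))     # effect on an editable input
```

The projected update produces zero change for any input inside the retain subspace while still acting on directions orthogonal to it, which mirrors the retain-versus-editable split described in the abstract.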

AnimaSpark: A Feed-Forward Method for Animating Arbitrary 3D Objects

cs.CV Jun 09, 2026

While recent advancements in generative AI have substantially accelerated static 3D model creation workflows, the synthesis of category-agnostic 3D animations remains a significant bottleneck in 3D asset production. Current methods for category-agnostic animation generation exhibit critical limitations in inference speed, motion quality, and adherence to textual prompts, thereby leaving the process dependent on labor-intensive manual artistry. To address these challenges, this paper introduces AnimaSpark, a novel pipeline for category-agnostic 3D animation generation. Our approach is motivated by the key insight that for many fundamental motions in the 3D world, the corresponding joint transformations can often be effectively modeled within a two-dimensional subspace. The pipeline begins by rendering a rigged static 3D model into multi-layered image representations of its mesh and skeleton, which are subsequently fed into a video generation model. We then employ a keypoint tracking algorithm on the generated video to capture the motion of the skeletal joints projected onto the camera's viewing plane. In the final stage, we distill the planar translations and rotations from these tracked keypoints and lift them from the 2D domain into 3D space to animate the character. Comprehensive evaluations reveal that our method achieves superior performance over existing state-of-the-art techniques across key metrics, including text-motion alignment, quality of motion, and computational efficiency.

Multi-UAV Active Sensing with Information Gain-based Planning and Belief Fusion

cs.RO Jun 09, 2026

Unmanned aerial vehicles (UAVs) are increasingly used for active sensing and information gathering in spatially distributed environments. Their performance, however, is constrained by limited flight time, sensing uncertainty, and the trade-off between spatial coverage and observation accuracy. This paper presents a real-world validation of a multi-UAV active sensing framework for probabilistic binary terrain mapping, with precision agriculture used as the application case. The environment is represented as a probabilistic belief map, where spatial dependencies are modeled through a factor-graph formulation. UAV decision making is guided by Information Gain based Informative Path Planning (IGbIPP), and the approach is compared with Random Walk and Sweep coverage path planning baselines using both synthetic terrains and real UAV-derived agricultural imagery. The study also evaluates spatial correlation weights and several probabilistic belief-fusion rules for multi-UAV information sharing. Results show that IGbIPP reduces entropy and mapping error more effectively than the baselines, while a wider field of view improves real-world coverage and map accuracy. The results further show that simple equal or biased spatial weights can be more robust than adaptive weights, and that Bayesian, log-odds, and Dempster--Shafer fusion achieve the best cooperative mapping performance. These findings highlight the importance of uncertainty-driven planning, sensing geometry, spatial modeling, and probabilistic fusion for real-world UAV-based active sensing.
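As a rough illustration of the log-odds fusion rule and the entropy measure mentioned above, the following sketch fuses per-UAV beliefs about a single binary map cell. It assumes the UAVs' observations are conditionally independent and share a common prior; the function names and the example numbers are hypothetical and not taken from the paper.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))

def fuse_log_odds(beliefs, prior=0.5):
    """Fuse independent per-UAV posteriors for one binary cell.

    Each belief contributes its log-odds evidence relative to the shared prior.
    """
    l0 = logit(prior)
    return sigmoid(l0 + sum(logit(p) - l0 for p in beliefs))

def binary_entropy(p):
    """Shannon entropy (in bits) of a Bernoulli(p) belief."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

# Two UAVs each believe the cell is occupied with probability 0.8.
fused = fuse_log_odds([0.8, 0.8])
entropy_before = binary_entropy(0.8)
entropy_after = binary_entropy(fused)
```

Two agreeing observations push the fused belief further from 0.5 than either alone, so the cell's entropy drops. This is the quantity an information-gain planner would try to reduce.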

FairWave: A Fairness-Aware Asynchronous DAG-BFT Consensus

cs.DC Jun 09, 2026

Combining asynchronous Byzantine Fault Tolerant (BFT) consensus with Proof-of-Stake (PoS) creates a trilemma between Sybil resistance, reward distribution fairness, and protection against persistent plutocracy. Existing DAG-BFT approaches (Narwhal+Tusk, Bullshark, and Mysticeti) prioritize liveness over the fairness implications of stake-based selection, resulting in persistent longitudinal centralization. FairWave is a dual-channel DAG-BFT protocol that separates anchor selection from reward distribution. The selection channel is super-linear in stake, guaranteeing Sybil gain < 1 for all split factors K > 1. The reward channel is sub-linear, using square-root stake normalization to mitigate rich-get-richer dynamics. The finalized DAG structure provides deterministic uptime and latency factors, allowing honest validators to agree on operational quality without any external oracle. To avoid circular dependency between selection outcomes and selection weights, reputation is used in a lagged form: the active value at epoch e equals the prior epoch's final value. We derive closed-form constraints for both channels and validate them through nine empirical analyses (approximately 550,000 Monte Carlo rounds) against eight baselines. FairWave achieves a Gini coefficient of 0.149 (vs. Pure-PoS's 0.488), a monotone HHI reduction from 0.039 to 0.021 over 50,000 epochs, an optimal-adversary Sybil split of K* = 1, and a success-rate coefficient of variation of 5.2% under ±25% input perturbation. Safety (agreement and validity) is a formal consequence of the 2f+1 strong-support commit rule, holding unconditionally for f < n/3; the empirical differential is the monotone-continuous liveness-degradation curve, which decreases from 99.6% commit rate at b=0.20 to 71.1% at the theoretical bound b=1/3 without the discontinuous cliff characteristic of view-change-driven leader-BFT.
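The two channel shapes described above can be sketched on a toy stake distribution. The example below compares linear and square-root reward shares using a Gini coefficient, and computes how a hypothetical power-law selection weight `stake ** alpha` changes when one validator splits into `k` equal identities. The power-law form, the value of `alpha`, the stake values, and all function names are assumptions for illustration; the abstract does not give FairWave's actual formulas.

```python
import math

def linear_shares(stakes):
    """Reward shares proportional to stake (a pure-PoS style baseline)."""
    total = sum(stakes)
    return [s / total for s in stakes]

def sqrt_reward_shares(stakes):
    """Reward shares proportional to the square root of stake."""
    roots = [math.sqrt(s) for s in stakes]
    total = sum(roots)
    return [r / total for r in roots]

def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n

def split_weight_ratio(stake, k, alpha):
    """Combined raw weight of k equal identities relative to one unsplit identity,
    under the hypothetical selection weight stake ** alpha."""
    return k * (stake / k) ** alpha / stake ** alpha

stakes = [1, 4, 9, 16, 100]            # illustrative stake distribution
gini_linear = gini(linear_shares(stakes))
gini_sqrt = gini(sqrt_reward_shares(stakes))
ratio_superlinear = split_weight_ratio(100, k=4, alpha=2.0)
```

With these toy numbers the square-root shares are less concentrated than the linear ones. Under the assumed super-linear weight, splitting reduces the combined raw weight: the ratio equals `k ** (1 - alpha)`, which is below 1 whenever `alpha > 1` and `k > 1`.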

Bellman-Taylor Score Decoding for Markov Decision Processes with State-Dependent Feasible Action Sets

cs.AI Jun 09, 2026

Many Markov decision processes (MDPs) in operations research have feasible actions that are state dependent and defined implicitly by various operational constraints. These features make it difficult to use standard deep reinforcement learning (DRL) algorithms, whose action interfaces typically assume either a fixed finite action catalog or a simple Euclidean space. Motivated by a Taylor expansion of the optimal action-value function, we propose Bellman--Taylor score decoding, a framework that moves policy learning to a Euclidean score space while enforcing feasibility through an action decoder. The induced latent-score MDP can then be optimized by standard DRL algorithms without differentiating through the decoder. We provide a performance guarantee showing that the optimality gap of this approach decomposes into a structural approximation error and an algorithmic learning error. Lastly, we apply this framework to a queueing network control problem, where the policy essentially learns a state-dependent index-based dispatching rule. Numerical experiments show near-optimal performance in small instances and considerable improvements over benchmarks in larger systems.
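One way to picture an action decoder of this kind is the sketch below. It assumes the decoder picks, among the actions feasible in the current state, the one maximizing a linear score. That choice is consistent with a first-order expansion being linear in the action, but the abstract does not specify it. The toy queueing constraints, the brute-force enumeration, and the function names are all hypothetical.

```python
from itertools import product

def feasible_actions(queue_lengths, servers):
    """Enumerate dispatch vectors a with 0 <= a[i] <= queue_lengths[i]
    and sum(a) <= servers (a toy state-dependent feasible set)."""
    ranges = [range(min(q, servers) + 1) for q in queue_lengths]
    return [a for a in product(*ranges) if sum(a) <= servers]

def decode(scores, queue_lengths, servers):
    """Map an unconstrained score vector to the feasible action
    with the largest linear score sum(scores[i] * a[i])."""
    return max(
        feasible_actions(queue_lengths, servers),
        key=lambda a: sum(z * x for z, x in zip(scores, a)),
    )

# Queue 1 has the highest score but is empty, so it cannot be served.
action = decode(scores=(1.0, 5.0, 0.5), queue_lengths=(2, 0, 1), servers=2)
```

A learning algorithm would output only the score vector; feasibility is handled entirely by `decode`, so no gradient has to pass through it. With positive scores the decoder behaves like an index rule, serving queues in order of their score.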

Fixed-Parameter Tractability of Private Synthetic Data Generation

cs.DS Jun 09, 2026

We study the problem of generating synthetic data under differential privacy. We establish fixed-parameter tractability (FPT) for this problem where the parameter is the treewidth of the query family's incidence graph. Our algorithms attain optimal error rates across all regimes and are realized by two different approaches: the first is based on linear programming (LP) and the FPT of the separation problem for the LP dual; the second is based on a subsampled private multiplicative weights method, where we obtain FPT for sampling from Gibbs distributions. Both approaches are unified by a dynamic programming framework over a tree decomposition.
