<#>Recent arXiv Papers on Sparse Coding and Predictive Processing

Search Date: 2026-04-08

Total Papers Found: 15

<##>Papers (ordered by announcement date, newest first)

<###>1. LPC-SM: Local Predictive Coding and Sparse Memory for Long-Context Language Modeling

arXiv ID: arXiv:2604.03263

Authors: Keqin Xie

Categories: cs.CL, cs.AI, cs.LG, cs.NE

Date: April 2026

Abstract: Most current long-context language models still rely on attention to handle both local interaction and long-range state, which leaves relatively little room to test alternative decompositions of sequence modeling. We propose LPC-SM, a hybrid autoregressive architecture that separates local attention, persistent memory, predictive correction, and run-time control within the same block, and we use Orthogonal Novelty Transport (ONT) to govern slow-memory writes. We evaluate a 158M-parameter model...

<###>2. PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding

arXiv ID: arXiv:2604.00886

Authors: Nan Wang, Zhiwei Jin, Chen Chen, Haonan Lu

Categories: cs.CV, cs.AI, cs.CL

Date: April 2026

Abstract: Document understanding and GUI interaction are among the highest-value applications of Vision-Language Models (VLMs), yet they impose an exceptionally heavy computational burden: fine-grained text and small UI elements demand high-resolution inputs that produce tens of thousands of visual tokens. We observe that this cost is largely wasteful -- across document and GUI benchmarks, only 22-71% of image patches are pixel-unique...
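To make the "pixel-unique" statistic concrete, here is a minimal sketch that counts how many non-overlapping patches of an image have distinct pixel content. It is not the PixelPrune method: the function name `unique_patch_fraction`, the patch size, and the reading of "pixel-unique" as "distinct patches divided by total patches" are all assumptions made for illustration.

```python
import numpy as np

def unique_patch_fraction(image: np.ndarray, patch: int) -> float:
    """Fraction of non-overlapping patch x patch tiles with distinct pixel content.

    Assumption: "pixel-unique" is read as (number of distinct tiles) / (number of tiles).
    """
    h, w = image.shape[:2]
    h, w = h - h % patch, w - w % patch  # crop to a whole number of tiles
    seen = set()
    total = 0
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            # Raw bytes of the tile serve as an exact-match key.
            seen.add(image[y:y + patch, x:x + patch].tobytes())
            total += 1
    return len(seen) / total

# Hypothetical example: a blank 64x64 "page" with one small dark mark.
page = np.full((64, 64), 255, dtype=np.uint8)
page[0:8, 0:8] = 0
frac = unique_patch_fraction(page, patch=16)
print(frac)
```

On mostly blank pages such as this toy example, nearly all tiles are exact duplicates of one another, which is the kind of redundancy the abstract points to.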

<###>3. HCLSM: Hierarchical Causal Latent State Machines for Object-Centric World Modeling

arXiv ID: arXiv:2603.29090

Authors: Jaber Jaber, Osama Jaber

Categories: cs.LG, cs.CV, cs.RO

Date: March 2026

Abstract: World models that predict future states from video remain limited by flat latent representations that entangle objects, ignore causal structure, and collapse temporal dynamics into a single scale. We present HCLSM, a world model architecture that operates on three interconnected principles: object-centric decomposition via slot attention with spatial broadcast decoding, hierarchical temporal dynamics through a three-level engine combining selective state space models for continuous physics, sparse...

<###>4. Stop Probing, Start Coding: Why Linear Probes and Sparse Autoencoders Fail at Compositional Generalisation

arXiv ID: arXiv:2603.28744

Authors: Vitória Barin Pacela, Shruti Joshi, Isabela Camacho, Simon Lacoste-Julien, David Klindt

Categories: cs.LG

Date: March 2026

Abstract: The linear representation hypothesis states that neural network activations encode high-level concepts as linear mixtures. However, under superposition, this encoding is a projection from a higher-dimensional concept space into a lower-dimensional activation space, and a linear decision boundary in the concept space need not remain linear after projection. In this setting, classical sparse coding methods with per-sample iterative inference leverage compressed sensing guarantees to recover latent...
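As background for "classical sparse coding methods with per-sample iterative inference", the sketch below shows ISTA (iterative shrinkage-thresholding), one standard algorithm of that kind. It is offered as an illustration only; the abstract does not say which solver the authors use, and the dictionary `D`, the penalty `lam`, and the toy data are assumptions.

```python
import numpy as np

def soft_threshold(v: np.ndarray, t: float) -> np.ndarray:
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D: np.ndarray, x: np.ndarray, lam: float = 0.1, n_iter: int = 200) -> np.ndarray:
    """Approximately minimise 0.5 * ||x - D z||^2 + lam * ||z||_1 over z."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the smooth term's gradient
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ z - x)                    # gradient of the reconstruction term
        z = soft_threshold(z - grad / L, lam / L)   # gradient step, then shrinkage
    return z

# Hypothetical toy problem: 50 latent concepts projected into 20 dimensions.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)           # unit-norm dictionary atoms
z_true = np.zeros(50)
z_true[[3, 17, 41]] = [1.0, -2.0, 1.5]   # sparse ground-truth code
x = D @ z_true
z_hat = ista(D, x)
```

The inference loop runs separately for every sample `x`, in contrast to a sparse autoencoder, which amortises inference into a single encoder pass.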

<###>5. From Data Statistics to Feature Geometry: How Correlations Shape Superposition

arXiv ID: arXiv:2603.09972

Authors: Lucas Prieto, Edward Stevinson, Melih Barsbey, Tolga Birdal, Pedro A. M. Mediano

Categories: cs.LG, cs.AI, cs.CV

Date: March 2026

Abstract: A central idea in mechanistic interpretability is that neural networks represent more features than they have dimensions, arranging them in superposition to form an over-complete basis. This framing has been influential, motivating dictionary learning approaches such as sparse autoencoders. However, superposition has mostly been studied in idealized settings where features are sparse and uncorrelated. In these settings, superposition is typically understood as introducing interference that must...

<###>6. LISTA-Transformer Model Based on Sparse Coding and Attention Mechanism and Its Application in Fault Diagnosis

arXiv ID: arXiv:2603.04146

Authors: Shuang Liu, Lina Zhao, Tian Wang, Huaqing Wang

Categories: cs.CV

Date: March 2026

Abstract: Driven by the continuous development of models such as Multi-Layer Perceptron, Convolutional Neural Network (CNN), and Transformer, deep learning has made breakthrough progress in fields such as computer vision and natural language processing, and has been successfully applied in practical scenarios such as image classification and industrial fault diagnosis. However, existing models still have certain limitations in local feature modeling and global dependency capture. Specifically, CNN is limited...

<###>7. Active Inference for Physical AI Agents -- An Engineering Perspective

arXiv ID: arXiv:2603.20927

Authors: Bert de Vries

Categories: stat.ML, cs.LG

Date: March 2026

Abstract: Physical AI agents, such as robots and other embodied systems operating under tight and fluctuating resource constraints, remain far less capable than biological agents in open-ended real-world environments. This paper argues that Active Inference (AIF), grounded in the Free Energy Principle, offers a principled foundation for closing that gap. We develop this argument from first principles, following a chain from probability theory through Bayesian machine learning and variational inference to...

<###>8. Learning-Based Robust Control: Unifying Exploration and Distributional Robustness for Reliable Robotics via Free Energy

arXiv ID: arXiv:2603.06831

Authors: Hozefa Jesawada, Giovanni Russo, Abdalla Swikir, Fares Abu-Dakka

Categories: cs.RO, math.OC

Date: March 2026

Abstract: A key challenge towards reliable robotic control is devising computational models that can both learn policies and guarantee robustness when deployed in the field. Inspired by the free energy principle in computational neuroscience, to address these challenges, we propose a model for policy computation that jointly learns environment dynamics and rewards, while ensuring robustness to epistemic uncertainties. Expounding a distributionally robust free energy principle...

<###>9. NAB: Neural Adaptive Binning for Sparse-View CT reconstruction

arXiv ID: arXiv:2602.02356

Authors: Wangduo Xie, Matthew B. Blaschko

Categories: cs.CV, cs.LG

Date: February 2026

Abstract: Computed Tomography (CT) plays a vital role in inspecting the internal structures of industrial objects. Furthermore, achieving high-quality CT reconstruction from sparse views is essential for reducing production costs. While classic implicit neural networks have shown promising results for sparse reconstruction, they are unable to leverage shape priors of objects. Motivated by the observation that numerous industrial objects exhibit rectangular structures, we propose a novel Neural Adaptive Bi...

<###>10. Active inference and artificial reasoning

arXiv ID: arXiv:2512.21129

Authors: Karl Friston, Lancelot Da Costa, Alexander Tschantz, Conor Heins, Christopher Buckley, Tim Verbelen, Thomas Parr

Categories: q-bio.NC, physics.data-an, stat.ML

Date: December 2025

Abstract: This technical note considers the sampling of outcomes that provide the greatest amount of information about the structure of underlying world models. This generalisation furnishes a principled approach to structure learning under a plausible set of generative models or hypotheses. In active inference, policies - i.e., combinations of actions - are selected based on their expected free energy, which comprises expected information gain and value...
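For readers unfamiliar with the decomposition mentioned in the abstract, one common textbook form of the expected free energy of a policy $\pi$ is given below. The notation is an assumption and may differ from the paper's: $Q$ is the approximate posterior, $s$ hidden states, $o$ outcomes, and $C$ the prior preferences over outcomes.

$$
G(\pi) = -\underbrace{\mathbb{E}_{Q(o, s \mid \pi)}\big[\ln Q(s \mid o, \pi) - \ln Q(s \mid \pi)\big]}_{\text{expected information gain}} \; - \; \underbrace{\mathbb{E}_{Q(o \mid \pi)}\big[\ln P(o \mid C)\big]}_{\text{expected value}}
$$

Policies with lower $G(\pi)$ are preferred, so a policy is favoured when it is expected either to resolve uncertainty about hidden states or to yield preferred outcomes.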

<###>11. Developmental Symmetry-Loss: A Free-Energy Perspective on Brain-Inspired Invariance Learning

arXiv ID: arXiv:2512.10984

Authors: Arif Dönmez

Categories: q-bio.NC, cs.AI, cs.LG, nlin.AO

Date: December 2025

Abstract: We propose Symmetry-Loss, a brain-inspired algorithmic principle that enforces invariance and equivariance through a differentiable constraint derived from environmental symmetries. The framework models learning as the iterative refinement of an effective symmetry group, paralleling developmental processes in which cortical representations align with the world's structure. By minimizing structural surprise, i.e. deviations from symmetry consistency, Symmetry-Loss operationalizes a Free-Energy...

<###>12. Scalable predictive processing framework for multitask caregiving robots

arXiv ID: arXiv:2510.25053

Authors: Hayato Idei, Tamon Miyake, Tetsuya Ogata, Yuichi Yamashita

Categories: cs.RO, cs.AI, cs.LG, q-bio.NC

Date: October 2025

Abstract: The rapid aging of societies is intensifying demand for autonomous care robots; however, most existing systems are task-specific and rely on handcrafted preprocessing, limiting their ability to generalize across diverse scenarios. A prevailing theory in cognitive neuroscience proposes that the human brain operates through hierarchical predictive processing, which underlies flexible cognition and behavior by integrating multimodal sensory signals. Inspired by this principle, we introduce a hierar...

<###>13. Deep Active Inference with Diffusion Policy and Multiple Timescale World Model for Real-World Exploration and Navigation

arXiv ID: arXiv:2510.23258

Authors: Riko Yokozawa, Kentaro Fujii, Yuta Nomura, Shingo Murata

Categories: cs.RO, cs.AI, cs.LG

Date: October 2025

Abstract: Autonomous robotic navigation in real-world environments requires exploration to acquire environmental information as well as goal-directed navigation in order to reach specified targets. Active inference (AIF) based on the free-energy principle provides a unified framework for these behaviors by minimizing the expected free energy (EFE), thereby combining epistemic and extrinsic values. To realize this practically, we propose a deep AIF framework that integrates a diffusion policy as the policy...

<###>14. Distributionally Robust Free Energy Principle for Decision-Making

arXiv ID: arXiv:2503.13223

Authors: Allahkaram Shafiei, Hozefa Jesawada, Karl Friston, Giovanni Russo

Categories: cs.AI, eess.SY, math.OC

Date: March 2025

Abstract: Despite their groundbreaking performance, autonomous agents can misbehave when training and environmental conditions become inconsistent, with minor mismatches leading to undesirable behaviors or even catastrophic failures. Robustness towards these training-environment ambiguities is a core requirement for intelligent agents and its fulfillment is a long-standing challenge towards their real-world deployments. Here, we introduce a Distributionally Robust Free Energy model (DR-FREE) that instills...

<###>15. Free Energy and Network Structure: Breaking Scale-Free Behaviour Through Information Processing Constraints

arXiv ID: arXiv:2502.12654

Authors: Peter R Williams, Zhan Chen

Categories: cs.SI, physics.soc-ph

Date: February 2025

Abstract: In this paper we show how the Free Energy Principle (FEP) can provide an explanation for why real-world networks deviate from scale-free behaviour, and how these characteristic deviations can emerge from constraints on information processing. We propose a minimal FEP model for node behaviour that reveals three distinct regimes: when detection noise dominates, agents seek better information, reducing isolated agents compared to expectations from classical preferential attachment...