<#>Recent arXiv Papers on Sparse Coding and Predictive Processing#>
Search Date: 2026-04-08
Total Papers Found: 15
<##>Papers (ordered by announcement date, newest first)##>
<###>1. LPC-SM: Local Predictive Coding and Sparse Memory for Long-Context Language Modeling###>
arXiv ID: arXiv:2604.03263
Authors: Keqin Xie
Categories: cs.CL, cs.AI, cs.GL, cs.NE
Date: April 2026
Abstract: Most current long-context language models still rely on attention to handle both local interaction and long-range state, which leaves relatively little room to test alternative decompositions of sequence modeling. We propose LPC-SM, a hybrid autoregressive architecture that separates local attention, persistent memory, predictive correction, and run-time control within the same block, and we use Orthogonal Novelty Transport (ONT) to govern slow-memory writes. We evaluate a 158M-parameter model...
<###>2. PixelPrune: Pixel-Level Adaptive Visual Token Reduction via Predictive Coding###>
arXiv ID: arXiv:2604.00886
Authors: Nan Wang, Zhiwei Jin, Chen Chen, Haonan Lu
Categories: cs.CV, cs.AI, cs.CL
Date: April 2026
Abstract: Document understanding and GUI interaction are among the highest-value applications of Vision-Language Models (VLMs), yet they impose an exceptionally heavy computational burden: fine-grained text and small UI elements demand high-resolution inputs that produce tens of thousands of visual tokens. We observe that this cost is largely wasteful -- across document and GUI benchmarks, only 22-71% of image patches are pixel-unique...
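Note: the "pixel-unique" statistic in this abstract can be illustrated with a minimal sketch. This is not the PixelPrune method. It only assumes that an image is cut into non-overlapping square patches and that a patch counts as unique when no other patch has identical pixels. The function name `unique_patch_fraction`, the patch size, and the toy image are hypothetical.

```python
def unique_patch_fraction(image, patch):
    """Fraction of non-overlapping patch x patch tiles with distinct pixel content.

    `image` is a list of equal-length rows of hashable pixel values.
    Rows and columns that do not fill a whole tile are ignored.
    """
    rows, cols = len(image), len(image[0])
    tiles = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            # Freeze each tile as a tuple of tuples so it can be put in a set.
            tile = tuple(tuple(image[r + i][c:c + patch]) for i in range(patch))
            tiles.append(tile)
    return len(set(tiles)) / len(tiles)


# Toy 4x8 "document": blank background (0) with a few ink pixels (1).
page = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 1],
]
fraction = unique_patch_fraction(page, patch=2)
print(f"{fraction:.3f} of the 2x2 patches are pixel-unique")
```

On the toy page, six of the eight tiles are identical blank background, which is the kind of redundancy the abstract describes for documents and GUIs.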
<###>3. HCLSM: Hierarchical Causal Latent State Machines for Object-Centric World Modeling###>
arXiv ID: arXiv:2603.29090
Authors: Jaber Jaber, Osama Jaber
Categories: cs.LG, cs.CV, cs.RO
Date: March 2026
Abstract: World models that predict future states from video remain limited by flat latent representations that entangle objects, ignore causal structure, and collapse temporal dynamics into a single scale. We present HCLSM, a world model architecture that operates on three interconnected principles: object-centric decomposition via slot attention with spatial broadcast decoding, hierarchical temporal dynamics through a three-level engine combining selective state space models for continuous physics, sparse...
<###>4. Stop Probing, Start Coding: Why Linear Probes and Sparse Autoencoders Fail at Compositional Generalisation###>
arXiv ID: arXiv:2603.28744
Authors: Vitória Barin Pacela, Shruti Joshi, Isabela Camacho, Simon Lacoste-Julien, David Klindt
Categories: cs.LG
Date: March 2026
Abstract: The linear representation hypothesis states that neural network activations encode high-level concepts as linear mixtures. However, under superposition, this encoding is a projection from a higher-dimensional concept space into a lower-dimensional activation space, and a linear decision boundary in the concept space need not remain linear after projection. In this setting, classical sparse coding methods with per-sample iterative inference leverage compressed sensing guarantees to recover latent...
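Note: "per-sample iterative inference" in classical sparse coding can be sketched with ISTA (iterative shrinkage-thresholding), a textbook solver for the L1-regularised least-squares problem. The listing does not say which solver the paper uses, so ISTA is an assumption here, and the dictionary `D`, signal `y`, penalty `lam`, and step size are hypothetical toy values.

```python
def soft_threshold(values, threshold):
    """Shrink each entry towards zero by `threshold` (proximal map of the L1 norm)."""
    return [max(abs(v) - threshold, 0.0) * (1.0 if v > 0 else -1.0) for v in values]


def ista(D, y, lam, step, n_iter=200):
    """Minimise 0.5 * ||y - D z||^2 + lam * ||z||_1 over z.

    D is a list of m rows with k entries each; `step` should not exceed
    1 / (largest eigenvalue of D^T D) for the iteration to converge.
    """
    m, k = len(D), len(D[0])
    z = [0.0] * k
    for _ in range(n_iter):
        residual = [sum(D[i][j] * z[j] for j in range(k)) - y[i] for i in range(m)]
        grad = [sum(D[i][j] * residual[i] for i in range(m)) for j in range(k)]
        z = soft_threshold([z[j] - step * grad[j] for j in range(k)], step * lam)
    return z


# Overcomplete toy dictionary: 3 atoms (columns) in a 2-D signal space.
s = 2 ** -0.5
D = [[1.0, 0.0, s],
     [0.0, 1.0, s]]
y = [1.0, 0.0]  # generated by the first atom alone
# The largest eigenvalue of D^T D is 2 for this D, hence step = 0.5.
z = ista(D, y, lam=0.1, step=0.5)
print([round(v, 4) for v in z])
```

The code is recomputed for every input `y`, in contrast to a sparse autoencoder, which amortises inference into a single encoder pass.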
<###>5. From Data Statistics to Feature Geometry: How Correlations Shape Superposition###>
arXiv ID: arXiv:2603.09972
Authors: Lucas Prieto, Edward Stevinson, Melih Barsbey, Tolga Birdal, Pedro A. M. Mediano
Categories: cs.LG, cs.AI, cs.CV
Date: March 2026
Abstract: A central idea in mechanistic interpretability is that neural networks represent more features than they have dimensions, arranging them in superposition to form an over-complete basis. This framing has been influential, motivating dictionary learning approaches such as sparse autoencoders. However, superposition has mostly been studied in idealized settings where features are sparse and uncorrelated. In these settings, superposition is typically understood as introducing interference that must...
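Note: the idea of representing more features than dimensions can be made concrete with a small sketch. It places three feature directions in a two-dimensional activation space and reads each feature back with a dot product. The 120-degree layout and the helper names `encode` and `readout` are illustrative assumptions, not the construction analysed in the paper.

```python
import math

# Three unit-length feature directions in a 2-D space, 120 degrees apart.
directions = [
    (math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
    for k in range(3)
]


def encode(features):
    """Project a 3-feature vector into the 2-D activation space."""
    return tuple(
        sum(f * d[axis] for f, d in zip(features, directions)) for axis in range(2)
    )


def readout(activation):
    """Linear readout of each feature: dot product with its own direction."""
    return [activation[0] * d[0] + activation[1] * d[1] for d in directions]


# Only the first feature is active, yet the other readouts are non-zero.
only_first = readout(encode([1.0, 0.0, 0.0]))
# A rectifier removes the negative interference in this sparse case.
rectified = [max(r, 0.0) for r in only_first]
print(only_first, rectified)
```

The non-zero readouts for the inactive features are the interference the abstract refers to. Whether it is harmful depends on how the features co-occur, which is the question the paper raises.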
<###>6. LISTA-Transformer Model Based on Sparse Coding and Attention Mechanism and Its Application in Fault Diagnosis###>
arXiv ID: arXiv:2603.04146
Authors: Shuang Liu, Lina Zhao, Tian Wang, Huaqing Wang
Categories: cs.CV
Date: March 2026
Abstract: Driven by the continuous development of models such as Multi-Layer Perceptron, Convolutional Neural Network (CNN), and Transformer, deep learning has made breakthrough progress in fields such as computer vision and natural language processing, and has been successfully applied in practical scenarios such as image classification and industrial fault diagnosis. However, existing models still have certain limitations in local feature modeling and global dependency capture. Specifically, CNN is limited...
<###>7. Active Inference for Physical AI Agents -- An Engineering Perspective###>
arXiv ID: arXiv:2603.20927
Authors: Bert de Vries
Categories: stat.ML, cs.LG
Date: March 2026
Abstract: Physical AI agents, such as robots and other embodied systems operating under tight and fluctuating resource constraints, remain far less capable than biological agents in open-ended real-world environments. This paper argues that Active Inference (AIF), grounded in the Free Energy Principle, offers a principled foundation for closing that gap. We develop this argument from first principles, following a chain from probability theory through Bayesian machine learning and variational inference to...
<###>8. Learning-Based Robust Control: Unifying Exploration and Distributional Robustness for Reliable Robotics via Free Energy###>
arXiv ID: arXiv:2603.06831
Authors: Hozefa Jesawada, Giovanni Russo, Abdalla Swikir, Fares Abu-Dakka
Categories: cs.RO, math.OC
Date: March 2026
Abstract: A key challenge towards reliable robotic control is devising computational models that can both learn policies and guarantee robustness when deployed in the field. To address this challenge, and inspired by the free energy principle in computational neuroscience, we propose a model for policy computation that jointly learns environment dynamics and rewards, while ensuring robustness to epistemic uncertainties. Expounding a distributionally robust free energy principle...
<###>9. NAB: Neural Adaptive Binning for Sparse-View CT reconstruction###>
arXiv ID: arXiv:2602.02356
Authors: Wangduo Xie, Matthew B. Blaschko
Categories: cs.CV, cs.LG
Date: February 2026
Abstract: Computed Tomography (CT) plays a vital role in inspecting the internal structures of industrial objects. Furthermore, achieving high-quality CT reconstruction from sparse views is essential for reducing production costs. While classic implicit neural networks have shown promising results for sparse reconstruction, they are unable to leverage shape priors of objects. Motivated by the observation that numerous industrial objects exhibit rectangular structures, we propose a novel Neural Adaptive Bi...
<###>10. Active inference and artificial reasoning###>
arXiv ID: arXiv:2512.21129
Authors: Karl Friston, Lancelot Da Costa, Alexander Tschantz, Conor Heins, Christopher Buckley, Tim Verbelen, Thomas Parr
Categories: q-bio.NC, physics.data-an, stat.ML
Date: December 2025
Abstract: This technical note considers the sampling of outcomes that provide the greatest amount of information about the structure of underlying world models. This generalisation furnishes a principled approach to structure learning under a plausible set of generative models or hypotheses. In active inference, policies - i.e., combinations of actions - are selected based on their expected free energy, which comprises expected information gain and value...
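Note: the decomposition of expected free energy into expected information gain and value can be sketched for a small discrete model. The form below is the common textbook one (negative mutual information between states and outcomes, minus expected log preference). It does not implement the structure-learning generalisation in the note, and the two policies, their likelihood matrices, and the flat preferences are hypothetical.

```python
import math


def expected_free_energy(prior, likelihood, log_pref):
    """Return (G, info_gain, value) for one policy.

    prior[s]         : belief over hidden states
    likelihood[o][s] : p(o | s) under this policy
    log_pref[o]      : log prior preference over outcomes
    G = -info_gain - value, so lower G is better.
    """
    n_o, n_s = len(likelihood), len(prior)
    q_o = [sum(likelihood[o][s] * prior[s] for s in range(n_s)) for o in range(n_o)]
    info_gain = 0.0
    for o in range(n_o):
        for s in range(n_s):
            joint = likelihood[o][s] * prior[s]
            if joint > 0:
                # Mutual information between outcomes and hidden states.
                info_gain += joint * math.log(joint / (q_o[o] * prior[s]))
    value = sum(q_o[o] * log_pref[o] for o in range(n_o))
    return -info_gain - value, info_gain, value


prior = [0.5, 0.5]
log_pref = [math.log(0.5), math.log(0.5)]  # no outcome is preferred
policies = {
    "look": [[0.9, 0.1], [0.1, 0.9]],  # outcomes are informative about the state
    "wait": [[0.5, 0.5], [0.5, 0.5]],  # outcomes carry no information
}
scores = {name: expected_free_energy(prior, lik, log_pref) for name, lik in policies.items()}
best = min(scores, key=lambda name: scores[name][0])
print(best, {name: round(g, 4) for name, (g, _, _) in scores.items()})
```

With flat preferences the value term is identical for both policies, so the choice is driven entirely by information gain and the informative policy is selected.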
<###>11. Developmental Symmetry-Loss: A Free-Energy Perspective on Brain-Inspired Invariance Learning###>
arXiv ID: arXiv:2512.10984
Authors: Arif Dönmez
Categories: q-bio.NC, cs.AI, cs.LG, nlin.AO
Date: December 2025
Abstract: We propose Symmetry-Loss, a brain-inspired algorithmic principle that enforces invariance and equivariance through a differentiable constraint derived from environmental symmetries. The framework models learning as the iterative refinement of an effective symmetry group, paralleling developmental processes in which cortical representations align with the world's structure. By minimizing structural surprise, i.e. deviations from symmetry consistency, Symmetry-Loss operationalizes a Free-Energy...
<###>12. Scalable predictive processing framework for multitask caregiving robots###>
arXiv ID: arXiv:2510.25053
Authors: Hayato Idei, Tamon Miyake, Tetsuya Ogata, Yuichi Yamashita
Categories: cs.RO, cs.AI, cs.LG, q-bio.NC
Date: October 2025
Abstract: The rapid aging of societies is intensifying demand for autonomous care robots; however, most existing systems are task-specific and rely on handcrafted preprocessing, limiting their ability to generalize across diverse scenarios. A prevailing theory in cognitive neuroscience proposes that the human brain operates through hierarchical predictive processing, which underlies flexible cognition and behavior by integrating multimodal sensory signals. Inspired by this principle, we introduce a hierar...
<###>13. Deep Active Inference with Diffusion Policy and Multiple Timescale World Model for Real-World Exploration and Navigation###>
arXiv ID: arXiv:2510.23258
Authors: Riko Yokozawa, Kentaro Fujii, Yuta Nomura, Shingo Murata
Categories: cs.RO, cs.AI, cs.LG
Date: October 2025
Abstract: Autonomous robotic navigation in real-world environments requires exploration to acquire environmental information as well as goal-directed navigation in order to reach specified targets. Active inference (AIF) based on the free-energy principle provides a unified framework for these behaviors by minimizing the expected free energy (EFE), thereby combining epistemic and extrinsic values. To realize this practically, we propose a deep AIF framework that integrates a diffusion policy as the policy...
<###>14. Distributionally Robust Free Energy Principle for Decision-Making###>
arXiv ID: arXiv:2503.13223
Authors: Allahkaram Shafiei, Hozefa Jesawada, Karl Friston, Giovanni Russo
Categories: cs.AI, eess.SY, math.OC
Date: March 2025
Abstract: Despite their groundbreaking performance, autonomous agents can misbehave when training and environmental conditions become inconsistent, with minor mismatches leading to undesirable behaviors or even catastrophic failures. Robustness towards these training-environment ambiguities is a core requirement for intelligent agents and its fulfillment is a long-standing challenge towards their real-world deployments. Here, we introduce a Distributionally Robust Free Energy model (DR-FREE) that instills...
<###>15. Free Energy and Network Structure: Breaking Scale-Free Behaviour Through Information Processing Constraints###>
arXiv ID: arXiv:2502.12654
Authors: Peter R Williams, Zhan Chen
Categories: cs.SI, physics.soc-ph
Date: February 2025
Abstract: In this paper we show how the Free Energy Principle (FEP) can provide an explanation for why real-world networks deviate from scale-free behaviour, and how these characteristic deviations can emerge from constraints on information processing. We propose a minimal FEP model for node behaviour that reveals three distinct regimes: when detection noise dominates, agents seek better information, reducing isolated agents compared to expectations from classical preferential attachment...