
Computational Capability and Efficiency of Neural Networks: A Repository of Papers

Contributed by Kimon Fountoulakis

1. Simulation Results
1.1 Recurrent neural networks
1.2 Transformers
1.3 Feedforward neural networks
1.4 Graph neural networks
2. Learning Results
2.1 Transformers
2.2 Feedforward neural networks
2.3 Graph neural networks
3. Formal Languages
4. Empirical

Simulation Results

Recurrent neural networks

  1. On the Computational Power of Neural Nets. Journal of Computer and System Sciences 1995. paper

    H.T. Siegelmann, E.D. Sontag

Transformers

  1. Attention is Turing Complete. Journal of Machine Learning Research 2021. paper

    Jorge Pérez, Pablo Barceló, Javier Marinkovic

  2. Looped Transformers as Programmable Computers. ICML 2023. paper

    Angeliki Giannou, Shashank Rajput, Jy-Yong Sohn, Kangwook Lee, Jason D. Lee, Dimitris Papailiopoulos

  3. Exposing Attention Glitches with Flip-Flop Language Modeling. NeurIPS 2023. paper

    Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

  4. Transformers Learn Shortcuts to Automata. ICLR 2023. paper

    Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

  5. Memory Augmented Large Language Models are Computationally Universal. arXiv 2023. paper

    Dale Schuurmans

  6. Chain of Thought Empowers Transformers to Solve Inherently Serial Problems. ICLR 2024. paper

    Zhiyuan Li, Hong Liu, Denny Zhou, Tengyu Ma

  7. Representational Capabilities of Feed-Forward and Sequential Neural Architectures. PhD Thesis 2024. paper

    Clayton Hendrick Sanford

  8. Transformers, parallel computation, and logarithmic depth. ICML 2024. paper

    Clayton Sanford, Daniel Hsu, Matus Telgarsky

  9. Understanding Transformer Reasoning Capabilities via Graph Algorithms. NeurIPS 2024. paper

    Clayton Sanford, Bahare Fatemi, Ethan Hall, Anton Tsitsulin, Mehran Kazemi, Jonathan Halcrow, Bryan Perozzi, Vahab Mirrokni

  10. A Little Depth Goes a Long Way: The Expressive Power of Log-Depth Transformers. NeurIPS 2024 Workshop M3L. paper

    William Merrill, Ashish Sabharwal

  11. On Limitations of the Transformer Architecture. COLM 2024. paper

    Binghui Peng, Srini Narayanan, Christos Papadimitriou

  12. The Expressive Power of Transformers with Chain of Thought. ICLR 2024. paper

    William Merrill, Ashish Sabharwal

  13. Depth-Width tradeoffs in Algorithmic Reasoning of Graph Tasks with Transformers. arXiv 2025. paper

    Gilad Yehudai, Clayton Sanford, Maya Bechler-Speicher, Orr Fischer, Ran Gilad-Bachrach, Amir Globerson

  14. Positional Attention: Expressivity and Learnability of Algorithmic Computation. ICML 2025. paper

    Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis

  15. Round and Round We Go! What Makes Rotary Positional Encodings Useful? ICLR 2025. paper

    Federico Barbero, Alex Vitvitskyi, Christos Perivolaropoulos, Razvan Pascanu, Petar Veličković

  16. Reasoning with Latent Thoughts: On the Power of Looped Transformers. ICLR 2025. paper

    Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li, Sanjiv Kumar, Sashank J. Reddi

  17. ALTA: Compiler-Based Analysis of Transformers. TMLR 2025. paper

    Peter Shaw, James Cohan, Jacob Eisenstein, Kenton Lee, Jonathan Berant, Kristina Toutanova

Feedforward neural networks

  1. Provably good solutions to the knapsack problem via neural networks of bounded size. AAAI 2021. paper

    Christoph Hertrich, Martin Skutella

  2. ReLU Neural Networks of Polynomial Size for Exact Maximum Flow Computation. Integer Programming and Combinatorial Optimization 2023. paper

    Christoph Hertrich, Leon Sering

  3. Representational Capabilities of Feed-Forward and Sequential Neural Architectures. PhD Thesis 2024. paper

    Clayton Hendrick Sanford

Graph neural networks

  1. What graph neural networks cannot learn: depth vs width. ICLR 2020. paper

    Andreas Loukas

  2. Simulation of Graph Algorithms with Looped Transformers. ICML 2024. paper

    Artur Back de Luca, Kimon Fountoulakis

  3. Exploring the Power of Graph Neural Networks in Solving Linear Optimization Problems. AISTATS 2024. paper

    Chendi Qian, Didier Chételat, Christopher Morris

  4. MAGNOLIA: Matching Algorithms via GNNs for Online Value-to-go Approximation. ICML 2024. paper

    Alexandre Hayderi, Amin Saberi, Ellen Vitercik, Anders Wikum

  5. Aligning Transformers with Weisfeiler-Leman. ICML 2024. paper

    Luis Müller, Christopher Morris

  6. Graph Transformers Dream of Electric Flow. ICLR 2025. paper

    Xiang Cheng, Lawrence Carin, Suvrit Sra

  7. Primal-Dual Neural Algorithmic Reasoning. arXiv 2025. paper

    Yu He, Ellen Vitercik

Learning Results

Transformers

  1. Positional Attention: Expressivity and Learnability of Algorithmic Computation. ICML 2025. paper

    Artur Back de Luca, George Giapitzakis, Shenghao Yang, Petar Veličković, Kimon Fountoulakis

  2. Learning Compositional Functions with Transformers from Easy-to-Hard Data. COLT 2025. paper

    Zixuan Wang, Eshaan Nichani, Alberto Bietti, Alex Damian, Daniel Hsu, Jason D. Lee, Denny Wu

Feedforward neural networks

  1. Learning to Add, Multiply, and Execute Algorithmic Instructions Exactly with Neural Networks. arXiv 2025. paper

    George Giapitzakis, Artur Back de Luca, Kimon Fountoulakis

Graph neural networks

  1. Graph neural networks extrapolate out-of-distribution for shortest paths. arXiv 2025. paper

    Robert R. Nerem, Samantha Chen, Sanjoy Dasgupta, Yusu Wang

Formal Languages

  1. Neural Networks and the Chomsky Hierarchy. ICLR 2023. paper

    Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega

  2. Training Neural Networks as Recognizers of Formal Languages. ICLR 2025. paper

    Alexandra Butoi, Ghazal Khalighinejad, Anej Svete, Josef Valvoda, Ryan Cotterell, Brian DuSell

Empirical

  1. Learning to Execute. arXiv 2015. paper

    Wojciech Zaremba, Ilya Sutskever

  2. Neural Programmer-Interpreters. arXiv 2015. paper

    Scott Reed, Nando de Freitas

  3. Neural Programmer: Inducing Latent Programs with Gradient Descent. arXiv 2016. paper

    Arvind Neelakantan, Quoc V. Le, Ilya Sutskever

  4. Deep Neural Solver for Math Word Problems. arXiv 2017. paper

    Yan Wang, Xiaojiang Liu, Shuming Shi

  5. Analysing Mathematical Reasoning Abilities of Neural Models. arXiv 2019. paper

    David Saxton, Edward Grefenstette, Felix Hill, Pushmeet Kohli

  6. LSTM Networks Can Perform Dynamic Counting. ACL 2019 Workshop on Deep Learning and Formal Languages. paper

    Mirac Suzgun, Sebastian Gehrmann, Yonatan Belinkov, Stuart M. Shieber

  7. Thinking Like Transformers. ICML 2021. paper

    Gail Weiss, Yoav Goldberg, Eran Yahav

  8. Investigating the Limitations of Transformers with Simple Arithmetic Tasks. arXiv 2021. paper

    Rodrigo Nogueira, Zhiying Jiang, Jimmy Lin

  9. A Primer for Neural Arithmetic Logic Modules. arXiv 2022. paper

    Bhumika Mistry, Katayoun Farrahi, Jonathon Hare

  10. Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets. arXiv 2022. paper

    Alethea Power, Yuri Burda, Harri Edwards, Igor Babuschkin, Vedant Misra

  11. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv 2023. paper

    Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou

  12. Implicit Chain of Thought Reasoning via Knowledge Distillation. arXiv 2023. paper

    Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber

  13. Positional Description Matters for Transformers Arithmetic. arXiv 2023. paper

    Ruoqi Shen, Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Yuanzhi Li, Yi Zhang

  14. Learning Transformer Programs. NeurIPS 2023. paper

    Dan Friedman, Alexander Wettig, Danqi Chen

  15. Length Generalization in Arithmetic Transformers. arXiv 2023. paper

    Samy Jelassi, Stéphane d'Ascoli, Carles Domingo-Enrich, Yuhuai Wu, Yuanzhi Li, François Charton

  16. Transformers Can Do Arithmetic with the Right Embeddings. arXiv 2024. paper

    Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

  17. From Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step. arXiv 2024. paper

    Yuntian Deng, Yejin Choi, Stuart Shieber
