I love splines. I call them the flexible rulers of mathematical modeling. 😉
Splines are GG
This is a really useful article, thank you so much! 💗
Glad you liked it
Philipp Lahm is not underrated. 😂
Hell nah. Super underrated Baller.
People who know nothing about football don’t rate him. People who do, rate him. But is he better than Cafu? Dani Alves? He’s rated just fine. Plus he never played for Stoke.
Can't believe you put Dani Alves in the same conversation as Lahm. You Barca fans are wild
https://www.reddit.com/r/realmadrid/comments/1d6e1cn/dani_carvajal_the_only_player_to_start_win_six_6/
Nice to read a new take on KAN! Also very curious to see what people hack together with KANs
Very exciting times ahead
There are several wrong statements in this article. KAN is not new. First, an article was published in 2021 by Andrew Polar and Mike Poluektov. Find it. Second, it is not slower than MLP. Visit OpenKAN.org and find the source code there that runs quicker than MLP; there are comparisons.
Andrew. Thank you for your comments. A few things-
1) I call it new because the authors say they "introduced KANs". The related works section talks about how other approaches were not promising, and that "Our contribution lies in generalizing the network to arbitrary widths and depths, revitalizing and contextualizing them in today’s deep learning stream, as well as highlighting its potential role as a foundation model for AI + Science."
2) Similarly, I didn't make the claim about KANs being slower. The paper says that training is slower, and I reported that. I didn't say that inference would be slower or anything along those lines.
Both of those statements, if untrue, would be best taken up with the authors. If you'd like to publish a rebuttal, you are free to come on this newsletter and publish it (as long as you can show the results).
You say there are several wrong statements. Are there any others, besides these?
OK. The following is wrong:
1. Lack of research. We published two articles with examples several years before. There are others, for example: [KASAM: Spline Additive Models for Function Approximation. Heinrich van Deventer, Pieter Janse van Rensburg and Anna Bosman, Department of Computer Science, University of Pretoria]. They suggested a spline model.
2. Slow training. MIT offered only one training method. We used a completely different one (it is published) and see quicker training. Here is a comparison: http://openkan.org/triangles.html. It is quicker than MLP. And this is not the limit. I wrote this code quickly and did not work on it for years, while MLP people have polished their code for decades. There is room for improvement.
3. Catastrophic forgetting. That is complete nonsense. All networks and models forget previous training. In KAN the function values are arguments of other functions, so how can you remember previous training when you retrain? Such things should be proven theoretically in math papers, not just stated. The MIT article just says that without theoretical backup.
The MIT paper has a reference to our paper, which in turn has other references. Google the word KASAM and you'll find KAN with splines before the MIT papers.
I think you're misunderstanding something here. The purpose of this article is to break down a specific paper (the one quoted), not the entirety of the research in this space. There is only so much time and space I have to cover things, and of course some information is left out. The goal here is to make sure that people can understand this generalized KAN paper that attracted attention. If they're interested in the details, they can always Google more themselves. If you'd like to do a guest post here highlighting the background of KANs in more detail, you are more than welcome to.
Regarding catastrophic forgetting: the authors did present an experiment with promising results. Did you not find it compelling? Why not?
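For readers following this exchange: the experiment in the paper fits a 1D function made of several localized peaks, presented one region at a time, and reports that the KAN keeps earlier peaks while an MLP overwrites them. Below is a toy sketch of the locality argument behind that claim, not the paper's code: with local basis functions, only the coefficients whose basis overlaps the current data receive any gradient. The hat basis, peak layout, and hyperparameters here are illustrative assumptions.

```python
# Toy sketch (not the paper's code): sequential fitting with LOCAL basis functions.
# Coefficients whose basis does not overlap the current data get zero gradient,
# so earlier regions are left untouched. Basis, peaks, and lr are illustrative.
import numpy as np

knots = np.linspace(-1.0, 1.0, 41)      # centers of local "hat" basis functions
coeffs = np.zeros_like(knots)           # learnable coefficients

def hat(i, x):
    """Piecewise-linear hat basis centered at knots[i]; nonzero only near that knot."""
    e = np.zeros_like(knots)
    e[i] = 1.0
    return np.interp(x, knots, e)

def model(x):
    # piecewise-linear model == sum_i coeffs[i] * hat(i, x)
    return np.interp(x, knots, coeffs)

def train_on_region(center, steps=500, lr=0.5):
    """Fit one Gaussian peak using only data sampled near it."""
    global coeffs
    x = np.random.uniform(center - 0.15, center + 0.15, size=128)
    y = np.exp(-((x - center) ** 2) / (2 * 0.06 ** 2))
    for _ in range(steps):
        err = model(x) - y
        # exact MSE gradient per coefficient; hats far from the data contribute 0
        grads = np.array([np.mean(2.0 * err * hat(i, x)) for i in range(len(knots))])
        coeffs -= lr * grads

for c in [-0.8, -0.4, 0.0, 0.4, 0.8]:   # peaks presented one after another
    train_on_region(c)

# all five peaks remain represented after training on the last one
print(np.round(model(np.array([-0.8, -0.4, 0.0, 0.4, 0.8])), 2))
```

A global parameterization (an MLP's weights, say) has no such locality, so gradients from the last region move parameters that earlier regions depended on. Whether this toy argument settles the question is exactly the debate above, but it is what the paper's experiment is probing.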
Incredible article. Do you mind if I cross-post it sometime in the near future? I was thinking of writing something up about KANs but this is awesome.
Also, getting Yamcha'ed 🤣
Please do. I'd be super happy if you cross-posted it.
"KANs (Kolmogorov-Arnold Networks)- Unlike traditional MLPs (Multi-Layer Perceptrons), which have fixed node activation functions, KANs use learnable activation functions on edges, essentially replacing linear weights with non-linear ones. This makes KANs more accurate and interpretable, and especially useful for functions with sparse compositional structures (we explain this next), which are often found in scientific applications and daily life."
Can you please explain how "learnable functions on edges" make KANs more interpretable? I have been giving it some thought but cannot understand it yet. If it's not possible, no worries; we can think about it for a while and get back to it later :)
Hi. That's a great question. There are two dimensions to this-
1) The ability to see the univariate functions at every level tells you how the model is making decisions, which is interpretable.
2) The ability to tweak/nudge these functions more directly gives you the ability to nudge the model (which is a more naked form of interpretability).
This section explains it very well- https://www.dailydoseofds.com/a-beginner-friendly-introduction-to-kolmogorov-arnold-networks-kan/#interpretability
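To make those two points concrete, here is a minimal sketch of what a "learnable function on an edge" is and why it is something you can inspect. This is not the authors' pykan implementation; the KANEdge class, the knot grid, and the piecewise-linear parameterization are simplifying assumptions (the paper uses B-spline bases plus a base activation).

```python
# Minimal sketch (not the authors' pykan code) of "a learnable function on an edge".
# KANEdge, the knot grid, and the piecewise-linear form are illustrative choices.
import numpy as np

class KANEdge:
    """One KAN edge: a learnable univariate function phi(x), here piecewise linear."""
    def __init__(self, grid_min=-1.0, grid_max=1.0, n_knots=8):
        self.grid = np.linspace(grid_min, grid_max, n_knots)  # fixed knot locations
        self.coeffs = 0.1 * np.random.randn(n_knots)          # trainable values at the knots

    def __call__(self, x):
        # the "activation" lives on the edge and is itself what gets trained
        return np.interp(x, self.grid, self.coeffs)

# Contrast: an MLP edge is a single scalar weight, so the function on the edge
# is always the straight line w * x; there is nothing to look at beyond its slope.
w = 0.7

x = np.linspace(-1.0, 1.0, 5)
edge = KANEdge()
print("KAN edge phi(x):", np.round(edge(x), 3))  # an arbitrary learned 1-D curve
print("MLP edge w*x:   ", np.round(w * x, 3))
```

Because every edge carries its own one-dimensional curve, you can plot phi(x) for each edge after training and read off what transformation each input undergoes, which is point 1 above; and because the curve is parameterized by a handful of local coefficients, you can nudge it directly, which is point 2.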
Thank you for your article on Kolmogorov–Arnold Networks (KANs). I am not a specialist, but curious about one point: Why do you not directly cite the source article instead of just linking to it? A huge proportion of your article is a reprint of excerpts from the source. Surely the authors of the source should be formally credited? Did I miss something obvious?
Is there some hesitation because your article is more of a 'repackaging' of their work than a review of it?
Several times you refer to 'the paper' and 'the authors.' Is it this one?
Liu, Z., Wang, Y., Vaidya, S., Ruehle, F., Halverson, J., Soljačić, M., ... & Tegmark, M. (2024). KAN: Kolmogorov-Arnold networks. arXiv preprint arXiv:2404.19756.
Ziming Liu (1,4), Yixuan Wang (2), Sachin Vaidya (1), Fabian Ruehle (3,4), James Halverson (3,4), Marin Soljačić (1,4), Thomas Y. Hou (2), Max Tegmark (1,4)
1 Massachusetts Institute of Technology
2 California Institute of Technology
3 Northeastern University
4 The NSF Institute for Artificial Intelligence and Fundamental Interactions
Cheers,
Leo
When it comes to paper breakdowns, this is how I've always approached it. It's pretty obvious that, for a specific breakdown, a quote and/or a screenshot comes from the paper I'm breaking down unless explicitly stated otherwise (I always link to other sources as well). So it is credited/cited. I use this style because it's easy for me to read and not clunky. No reader has complained about it (including the original authors of many papers I've broken down), so I think this approach works fine.
This is the paper mentioned- https://arxiv.org/html/2404.19756v1#abstract. It's linked in the first sentence of the breakdown.