Who discovered grokking and why is the name hard to find?

Apologies if this is old news to everyone, but perhaps the hive mind knows the answer. I was watching the YouTube video "The most complex model we actually understand" by Welch Labs and heard the story about the researcher who left a model training while going on vacation, and the model then learned to generalize after thousands of training steps. But when I try to look up the discoverer's name, I can't find it; it seems never to have been made public, which seems a shabby way to treat someone. What's the real story?

2 points

asmodeuslucifer

a month ago


2 comments