February 25th, 2025
Artificially Lowering Generalization Abilities
The idea of using prediction differentials to highlight interesting data is very promising, and naturally related to the xent games.
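To make the notion concrete, here is a minimal sketch, assuming that a "prediction differential" means the per-token cross-entropy gap between a weaker and a stronger model scoring the same text. The function names and the probabilities are hypothetical illustrations, not taken from any experiment.

```python
import math

def token_xent(probs):
    """Per-token cross-entropy (in nats), given the probability
    a model assigned to each true token."""
    return [-math.log(p) for p in probs]

def prediction_differential(weak_probs, strong_probs):
    """Per-token loss gap: positive where the weak model is more
    surprised by the true token than the strong model."""
    if len(weak_probs) != len(strong_probs):
        raise ValueError("both models must score the same tokens")
    weak = token_xent(weak_probs)
    strong = token_xent(strong_probs)
    return [w - s for w, s in zip(weak, strong)]

# Hypothetical probabilities given to the true next token at four positions.
weak_probs = [0.50, 0.10, 0.40, 0.02]
strong_probs = [0.50, 0.20, 0.40, 0.64]

diff = prediction_differential(weak_probs, strong_probs)

# The position with the largest gap is the candidate "interesting" token.
most_interesting = max(range(len(diff)), key=diff.__getitem__)
print(most_interesting, [round(d, 3) for d in diff])
```

Positions where both models agree contribute nothing; the differential concentrates on tokens that only the stronger model finds predictable.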
Something a little disappointing in the initial experiments was that a 'dumber' model (obtained by taking a smaller model) meant lower memorization along with lower 'intelligence'.
What would be great, given a very large dataset, would be to artificially lower the generalization abilities without giving up memorization (maybe skip the grokking phase, or remove the W from AdamW, i.e. drop the weight decay?).
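As a reminder of what "removing the W" amounts to mechanically, here is a minimal scalar sketch of one AdamW update with the decoupled weight-decay term made explicit. The function `adamw_step` and its default values are illustrative assumptions; whether dropping the decay actually lowers generalization at equal training loss is the open question of this note, not something the sketch shows.

```python
import math

def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update on a scalar parameter.
    Setting weight_decay=0.0 recovers plain Adam."""
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for step t (t starts at 1).
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay: this shrink term is the "W".
    w = w - lr * weight_decay * w
    # Adam step on the loss gradient.
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Same gradient, with and without the decay term.
w_decay, _, _ = adamw_step(1.0, 1.0, 0.0, 0.0, t=1, lr=0.1, weight_decay=0.1)
w_plain, _, _ = adamw_step(1.0, 1.0, 0.0, 0.0, t=1, lr=0.1, weight_decay=0.0)
print(w_decay, w_plain)
```

The two results differ only by the shrink `lr * weight_decay * w`, which pulls the weights toward zero independently of the loss gradient.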
Then, if we get the same training loss, the difference between the two models lets us really appreciate nontrivial generalization: what is within its reach, and hence where the innovation lies.
This would naturally be a useful part of the xent game: how to create something that transcends a superficial understanding of the data.
Generally speaking, using 'silly fitters' is a direction that has probably not been explored much.