Gupta, Vima; Varma, Sashank

Learning to count: a neural network model of the successor function

2022

Abstract

What does it mean for a neural network to become a “cardinal principle knower”? We trained a multilayer perceptron to compute the successor of the numbers 0-99. N and N+1 were one-hot encoded on the input and output layers, respectively; the hidden layer had 8 units. 80% of the (N, N+1) pairs constituted the training data, the remaining 20% the test data. The mean cosine similarity of the hidden-layer representations of N and N+1 was 0.77 (0.79) when N was in the training (test) set. The network learned a continuous notion of number: the hidden-layer representations of N and N+1 were comparably similar whether they did (0.74) or did not (0.78) cross a tens boundary. Thus, the network did not “discover” place-value. Ongoing research is exploring place-value encoding of inputs and outputs, and also structuring of the training data to better reflect the numerical environment of the child.
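The setup described in the abstract can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: the abstract fixes only the one-hot encoding, the 8 hidden units, and the 80/20 split. The tanh activation, the softmax output with 101 units (so that 100 is representable), the random seed, the learning rate, and the number of epochs are all assumptions, so the similarities it prints should not be expected to match the reported values.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed is an assumption, for reproducibility only

# One-hot encode N (input) and N+1 (output) for N = 0..99.
# Assumption: 101 output units, so that the successor of 99 is representable.
N_VALUES = np.arange(100)
X = np.eye(100)[N_VALUES]          # shape (100, 100)
Y = np.eye(101)[N_VALUES + 1]      # shape (100, 101)

# Random 80/20 split of the (N, N+1) pairs.
perm = rng.permutation(100)
train_idx, test_idx = perm[:80], perm[80:]

# One hidden layer of 8 units (from the abstract); tanh/softmax are assumptions.
W1 = rng.normal(0, 0.1, (100, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.1, (8, 101)); b2 = np.zeros(101)

def hidden(x):
    """Hidden-layer representation of one-hot input(s) x."""
    return np.tanh(x @ W1 + b1)

def forward(x):
    """Return hidden activations and softmax output probabilities."""
    h = hidden(x)
    logits = h @ W2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return h, p / p.sum(axis=1, keepdims=True)

# Full-batch gradient descent on cross-entropy (hyperparameters are assumptions).
LR, EPOCHS = 0.5, 2000
xb, yb = X[train_idx], Y[train_idx]
for _ in range(EPOCHS):
    h, p = forward(xb)
    d_logits = (p - yb) / len(xb)            # gradient w.r.t. the logits
    d_h = (d_logits @ W2.T) * (1 - h ** 2)   # backprop through tanh
    W2 -= LR * h.T @ d_logits; b2 -= LR * d_logits.sum(axis=0)
    W1 -= LR * xb.T @ d_h;     b1 -= LR * d_h.sum(axis=0)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Similarity between the hidden representations of N and N+1, for N = 0..98
# (N+1 must itself be a valid input, so N = 99 is excluded).
H = hidden(X)
sims = np.array([cosine(H[n], H[n + 1]) for n in range(99)])
in_train = np.isin(np.arange(99), train_idx)
crosses_tens = (np.arange(99) % 10) == 9     # e.g. 9 -> 10, 19 -> 20

print("N in train:", sims[in_train].mean(), " N in test:", sims[~in_train].mean())
print("crosses tens:", sims[crosses_tens].mean(),
      " within decade:", sims[~crosses_tens].mean())
```

Grouping the similarities by `crosses_tens` mirrors the abstract's tens-boundary comparison: if the network had discovered place-value, pairs such as (19, 20) would be expected to differ systematically from pairs such as (14, 15).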

Proceedings of the Annual Meeting of the Cognitive Science Society
