Today, large-scale language pre-training neural network models like ChatGPT have become well-known names. The algorithmic core behind GPT—the artificial neural network algorithm—has experienced a tumultuous journey over the past 80 years. During this time, except for a few breakthrough moments, the theory remained dormant, ignored, and even considered “poison” in terms of funding.
The birth of artificial neural networks originated from the collaboration between the unconventional genius Pitts and the accomplished neurophysiologist McCulloch. However, their theory surpassed the technological capabilities of their time and failed to receive widespread attention and empirical validation.
Fortunately, in the initial two decades after its inception, researchers continuously contributed to the field, advancing artificial neural networks from simple mathematical models of neurons and learning algorithms to perceptron models with learning capabilities. However, skepticism from other researchers and the untimely demise of one of the pioneers of the “perceptron,” Rosenblatt, during a sailing trip dealt a heavy blow. Consequently, the field entered a cold winter lasting over twenty years, until the introduction of the backpropagation algorithm into the training process of artificial neural networks.
Afterward, following a dormant period of 20 years, research in artificial neural networks finally experienced a reboot. In the following two decades of accumulation, convolutional neural networks and recurrent neural networks successively emerged.
However, the rapid development of this field in both academia and industry had to wait for a breakthrough in hardware, which occurred 17 years ago with the introduction of general-purpose computing GPU chips. This paved the way for today’s well-known large-scale language pre-training neural network models like ChatGPT.
In a certain sense, the success of artificial neural networks is a stroke of luck because not all research can wait for the crucial breakthrough, for everything to align perfectly. In many other fields, technological breakthroughs either happen too early or too late, leading to gradual extinction. However, within this stroke of luck, we must not overlook the determination and perseverance of the researchers involved. It is through their idealism that artificial neural networks have navigated their roller coaster journey of 80 years and ultimately achieved success.
