Artificial intelligence for science: The easy and hard problems

Ruairidh M. Battleday, Samuel J. Gershman
2024-08-25
Abstract: A suite of impressive scientific discoveries have been driven by recent advances in artificial intelligence. These almost all result from training flexible algorithms to solve difficult optimization problems specified in advance by teams of domain scientists and engineers with access to large amounts of data. Although extremely useful, this kind of problem solving only corresponds to one part of science - the "easy problem." The other part of scientific research is coming up with the problem itself - the "hard problem." Solving the hard problem is beyond the capacities of current algorithms for scientific discovery because it requires continual conceptual revision based on poorly defined constraints. We can make progress on understanding how humans solve the hard problem by studying the cognitive science of scientists, and then use the results to design new computational agents that automatically infer and update their scientific paradigms.
Artificial Intelligence, Machine Learning, Neurons and Cognition
What problem does this paper attempt to address?
The paper examines the use of artificial intelligence in scientific research, and in particular distinguishes the "easy problems" of science from its "hard problems":

1. **Easy Problems**: Optimization problems specified in advance by domain scientists and engineers, which can be solved by training flexible algorithms. These problems remain technically challenging, but their form and the quantity to be optimized are clear. For example, predicting a protein's structure from its amino acid sequence is technically complex but has a clear problem definition. Current AI tools have made significant progress on such problems, including discovering new protein structures and antibiotics and designing nuclear fusion reactors.
2. **Hard Problems**: The challenge of formulating the research problem itself. This involves conceptual breakthroughs and requires continual conceptual revision based on poorly defined constraints. Current AI algorithms cannot yet solve this type of problem because they rely on human-defined inputs, outputs, and objective functions. The real difficulty lies in automatically inferring and updating scientific paradigms, rather than merely optimizing known problems.

The paper illustrates the distinction through three case studies (the discovery of oxygen, the development of electromagnetic field theory, and certain discoveries in molecular biology). It points out that existing computational models, while successful on easy problems, fall short on hard problems: they are typically built on preset data representations and assumptions and therefore lack the capacity to support conceptual innovation.
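As a rough illustration of what an "easy problem" looks like computationally, the sketch below fits a line to data by gradient descent. It is not taken from the paper: the function name `solve_easy_problem`, the linear model, the mean-squared-error objective, and the toy data are all assumptions chosen for brevity. The point is that the inputs, outputs, model form, and objective are all fixed by a human before any optimization starts.

```python
from typing import Sequence


def solve_easy_problem(
    inputs: Sequence[float],
    targets: Sequence[float],
    steps: int = 2000,
    lr: float = 0.05,
) -> tuple[float, float]:
    """Fit y = w * x + b by gradient descent on mean squared error.

    Everything that defines the problem (what counts as an input, what
    counts as an output, the model family, the objective) is specified
    by the caller in advance. The algorithm only searches for w and b.
    """
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(steps):
        # Gradients of the human-chosen objective: mean((w*x + b - y)^2)
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(inputs, targets)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(inputs, targets)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data (hypothetical) generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = solve_easy_problem(xs, ys)
```

Nothing in this loop can decide that a different variable should be measured, that the linear form is the wrong representation, or that a different question should be asked. That step of choosing and revising the problem is what the summary above calls the hard problem.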