Abram L. Friesen, Pedro Domingos
Abstract: Continuous optimization is an important problem in many areas of AI, including vision, robotics, probabilistic inference, and machine learning. Unfortunately, most real-world optimization problems are nonconvex, causing standard convex techniques to find only local optima, even with extensions like random restarts and simulated annealing. We observe that, in many cases, the local modes of the objective function have combinatorial structure, and thus ideas from combinatorial optimization can be brought to bear. Based on this, we propose a problem-decomposition approach to nonconvex optimization. Similarly to DPLL-style SAT solvers and recursive conditioning in probabilistic inference, our algorithm, RDIS, recursively sets variables so as to simplify and decompose the objective function into approximately independent sub-functions, until the remaining functions are simple enough to be optimized by standard techniques like gradient descent. The variables to set are chosen by graph partitioning, ensuring decomposition whenever possible. We show analytically that RDIS can solve a broad class of nonconvex optimization problems exponentially faster than gradient descent with random restarts. Experimentally, RDIS outperforms standard techniques on problems like structure from motion and protein folding.
What problem does this paper attempt to address?
This paper addresses non-convex optimization. Many continuous optimization problems in artificial intelligence are non-convex and have a large number of local optima, so standard convex optimization techniques find only local optima, even when extended with random restarts or simulated annealing. The paper proposes a problem-decomposition approach that exploits the combinatorial structure of the objective function's local modes: it recursively sets variables to simplify and decompose the objective function until the remaining sub-functions are simple enough to be optimized with standard techniques such as gradient descent.
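The decomposition idea can be written as a short identity. As an illustrative assumption (the notation below is not taken from the paper), suppose that fixing a block of variables $x_C$ leaves two terms that share no other variables:

$$
f(x_C, x_A, x_B) = f_1(x_C, x_A) + f_2(x_C, x_B)
\quad\Longrightarrow\quad
\min_{x_A,\, x_B} f(x_C, x_A, x_B) = \min_{x_A} f_1(x_C, x_A) + \min_{x_B} f_2(x_C, x_B).
$$

For each fixed value of $x_C$, the two inner minimizations are independent, so each can be solved separately over a smaller space.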
### Core contributions of the paper:
1. **Proposing the RDIS algorithm**: The paper proposes an algorithm named RDIS (Recursive Decomposition into locally Independent Subspaces). The algorithm recursively selects variables and fixes their values, which decomposes the original optimization problem into multiple approximately independent sub-problems. These sub-problems can then be solved with standard optimization techniques such as gradient descent.
2. **Theoretical analysis**: The paper shows analytically that, for a broad class of non-convex optimization problems, RDIS is exponentially faster than traditional methods such as gradient descent with random restarts and grid search. Specifically, on problems that decompose well, RDIS can run in polynomial time where the traditional methods may require exponential time.
3. **Experimental verification**: The paper evaluates RDIS on several applications, including Structure from Motion, high-dimensional sine wave functions, and protein folding. The experimental results show that RDIS significantly outperforms traditional non-convex optimization algorithms on these problems.
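A single level of the "fix some variables, then optimize the pieces independently" idea can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's implementation: the objective `g1 + g2`, the helper names, and the use of a fixed grid of candidate values for the conditioning variable `c` are all hypothetical stand-ins (RDIS chooses both the variables and their values differently and applies the step recursively).

```python
import math


def grad_descent_1d(f, x0, lr=0.05, steps=500, h=1e-6):
    """Plain 1-D gradient descent using a central-difference derivative."""
    x = x0
    for _ in range(steps):
        g = (f(x + h) - f(x - h)) / (2 * h)
        x -= lr * g
    return x


# Hypothetical toy objective: f(c, u, v) = g1(c, u) + g2(c, v).
# Once c is fixed, u and v no longer interact.
def g1(c, u):
    return (u - c) ** 2 + math.sin(3 * c)


def g2(c, v):
    return (v + c) ** 2 + 0.1 * c ** 2


def solve_by_conditioning(c_candidates):
    """Try each value of c; for each, minimize the two pieces separately."""
    best = None
    for c in c_candidates:
        u = grad_descent_1d(lambda u_: g1(c, u_), 0.0)  # sub-problem 1
        v = grad_descent_1d(lambda v_: g2(c, v_), 0.0)  # sub-problem 2
        value = g1(c, u) + g2(c, v)
        if best is None or value < best[0]:
            best = (value, c, u, v)
    return best


# Assumed search grid for the conditioning variable: -3.0, -2.9, ..., 3.0.
candidates = [i / 10 for i in range(-30, 31)]
value, c, u, v = solve_by_conditioning(candidates)
```

Each inner call works on a one-dimensional convex problem, even though the full objective is non-convex in `c`; the non-convexity is handled entirely by the outer choice of `c`.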
### Key concepts and techniques:
- **Local structure**: The paper defines the local decomposability of the objective function and proposes a method to exploit this local structure. By setting some variables, the objective function can be decomposed into multiple approximately independent sub-functions, thereby simplifying the optimization process.
- **Hypergraph partitioning**: RDIS uses hypergraph partitioning techniques to select variables, ensuring that the variables selected each time can maximize the decomposition of the remaining variables. This step is the key to RDIS being able to effectively decompose problems.
- **Simplification and decomposition**: RDIS further simplifies the objective function by checking whether each term can be simplified (i.e., the bounds on its value are tight enough given the variables already set) and replacing the simplifiable terms with constants. RDIS then identifies the resulting sub-functions as the connected components of a dynamic graph and optimizes them separately.
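The simplify-then-decompose step described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: each term is represented as a tuple of variable names plus a hypothetical `width` function that returns a bound on how much the term can still vary given the current assignment, and `eps` is an assumed threshold. The hypergraph-partitioning step that chooses which variables to set is not shown.

```python
from collections import defaultdict


def simplify(terms, assignment, eps):
    """Keep only terms that still matter; the rest are treated as constants."""
    kept = []
    for term_vars, width in terms:
        fully_assigned = all(v in assignment for v in term_vars)
        if fully_assigned or width(assignment) <= eps:
            continue  # term varies by at most eps: replace it with a constant
        kept.append(term_vars)
    return kept


def connected_components(variables, terms):
    """Group free variables that share at least one remaining term."""
    parent = {v: v for v in variables}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for term_vars in terms:
        free = [v for v in term_vars if v in parent]  # skip assigned variables
        for a, b in zip(free, free[1:]):
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for v in variables:
        groups[find(v)].add(v)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical terms. The last one is c * u2 * v1 with u2, v1 in [-1, 1],
# so its value can vary by at most 2 * |c| once c is set.
terms = [
    (("u1", "u2"), lambda a: 1.0),
    (("v1", "v2"), lambda a: 1.0),
    (("c", "u2", "v1"), lambda a: 2 * abs(a["c"])),
]
free_vars = ["u1", "u2", "v1", "v2"]

small_c = connected_components(free_vars, simplify(terms, {"c": 0.001}, eps=0.01))
large_c = connected_components(free_vars, simplify(terms, {"c": 1.0}, eps=0.01))
```

With a small value of `c` the coupling term is dropped and the free variables split into two independent groups; with a large value it is kept and all four variables stay in one component. This is why the decomposition is local: it depends on the values assigned, not only on which variables are assigned.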
### Experimental results:
- **Structure from Motion**: On the Structure from Motion problem, RDIS significantly outperforms Levenberg-Marquardt and the Block Coordinate Descent method (BCD-LM), especially when the problem scale is large.
- **High-dimensional sine wave functions**: On high-dimensional sine wave functions, RDIS can find better minima more quickly, especially when the problem dimension is high.
- **Protein folding**: On the protein folding problem, RDIS also performs well and can find lower-energy configurations, outperforming Conjugate Gradient Descent (CGD) and the Block Coordinate Descent method (BCD-CGD).
Overall, the RDIS algorithm offers a new approach to non-convex optimization that has theoretical advantages and also shows significant performance improvements in practical applications.