(2+f(n))-SAT and its properties
Yunlei Zhao, Xiaotie Deng, C. H. Lee, Hong Zhu
DOI: https://doi.org/10.1016/S0166-218X(03)00194-X
IF: 1.254
2004-01-01
Discrete Applied Mathematics
Abstract: Consider a formula that contains $n$ variables with the form $\Phi = \Phi_2 \wedge \Phi_3$, where $\Phi_2$ is an instance of 2-SAT containing $m_2$ 2-clauses and $\Phi_3$ is an instance of 3-SAT containing $m_3$ 3-clauses. $\Phi$ is an instance of $(2+f(n))$-SAT if $m_3/(m_2+m_3) \le f(n)$. We prove that $(2+f(n))$-SAT is in P if $f(n) = O(\log n/n^2)$, and in NPC if $f(n) = 1/n^{2-\varepsilon}$ ($\forall \varepsilon$: $0 < \varepsilon < 2$). Most interestingly, we give a candidate, $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), for natural problems in NP − NPC − P (denoted as NPI) with respect to this $(2+f(n))$-SAT model. We prove that a restricted version of it is not in NPC under P ≠ NP. In fact, it is in NPI under a stronger but plausible assumption, namely the exponential-time hypothesis.

Keywords: Computational complexity; SAT; Exponential-time hypothesis

1 Introduction

In 1975, Ladner showed that there exist languages in NP − NPC − P (denoted as NPI) under the assumption P ≠ NP [8]. But the language constructed there is not a natural one, because the construction needs to run all Turing machines. So far, no natural problem has been proven to be in NPI under P ≠ NP, and finding such a natural problem is considered an important open problem in complexity theory [13,4]. Graph isomorphism (GI) and factoring, both suggested by Karp, are regarded as the two most likely candidates [13,4].

The satisfiability problem for Boolean formulas (SAT) has played a central role in computational complexity theory. It was the first problem shown to be NP-complete. To date, all known algorithms for 3-SAT require time exponential in the problem size in the worst case. The fastest known algorithm for 3-SAT has time complexity $(4/3)^n$, where $n$ is the number of variables in the formula [14]. Whether sub-exponential time algorithms exist is also an important open question.

The plausibility of such a sub-exponential time algorithm for 3-SAT was investigated in [5], using sub-exponential time reductions. It was shown there that linear-size 3-SAT is complete for the class SNP (strict NP) with respect to such reductions. This implies that if there exists a sub-exponential time algorithm for 3-SAT, then all languages in SNP can be decided in sub-exponential time. Note that some well-studied problems, such as $k$-SAT and $k$-colorability for any $k \ge 3$, have been proven to be SNP-complete. In light of both the practical and the theoretical support, Impagliazzo and Paturi introduced the exponential-time hypothesis (ETH) for 3-SAT: 3-SAT does not have a sub-exponential time algorithm [6]. Although ETH is stronger than NP ≠ P, it is still quite reasonable. In recent advances in cryptography, many important cryptographic primitives and protocols have been constructed under the ETH for one-way functions such as DLP or RSA, e.g., verifiable pseudorandom functions [9], verifiable pseudorandom generators [3], and resettable zero-knowledge argument systems for NP [2,10].

On the other hand, there has recently been growing interest in the link between the computational hardness of decision problems and phase boundaries in physical systems [1,12]. It was observed that, as in physical systems, dramatic changes in computational difficulty and solution character occur across certain phase boundaries. NP-complete problems become easier to solve away from the boundary, and the hardest problems occur at the phase boundary [7,12]. To understand the onset of exponential complexity that occurs when going from a problem in P (2-SAT) to a problem that is NP-complete (3-SAT), the $(2+p)$-SAT model was introduced in [11,12], where $p$ is a constant and $0 \le p \le 1$. An instance of $(2+p)$-SAT is a formula with $m$ clauses, of which $(1-p)m$ contain two variables (2-clauses) and $pm$ contain three variables (3-clauses).
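To make the instance definition concrete, the following Python sketch draws a random $(2+p)$-SAT formula. It is an illustration only: the function name `random_2p_sat`, the signed-integer literal encoding, and the choice to round $pm$ to the nearest integer are assumptions made here, not details taken from [11,12].

```python
import random


def random_2p_sat(n, m, p, seed=None):
    """Draw a random (2+p)-SAT instance over variables 1..n (requires n >= 3).

    Returns (two_clauses, three_clauses). A literal is a nonzero int:
    +i stands for x_i and -i for its negation (an encoding assumed here).
    """
    rng = random.Random(seed)
    m3 = round(p * m)   # number of 3-clauses, p*m rounded (assumption)
    m2 = m - m3         # the remaining (1-p)*m clauses are 2-clauses

    def clause(width):
        # Pick `width` distinct variables, then negate each with probability 1/2.
        variables = rng.sample(range(1, n + 1), width)
        return tuple(v if rng.random() < 0.5 else -v for v in variables)

    return [clause(2) for _ in range(m2)], [clause(3) for _ in range(m3)]


if __name__ == "__main__":
    phi2, phi3 = random_2p_sat(n=20, m=100, p=0.4, seed=1)
    print(len(phi2), "2-clauses and", len(phi3), "3-clauses")
```

Setting `p=0` yields a pure 2-SAT formula and `p=1` a pure 3-SAT formula, matching the two endpoints of the model.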
$(2+p)$-SAT smoothly interpolates between 2-SAT ($p=0$) and 3-SAT ($p=1$) when the instances are generated randomly. The median computation cost scales linearly with $n$ (the number of variables) when $p < p_0$ and exponentially when $p > p_0$, where $p_0$ lies between 0.4 and 0.416 [12]. However, in terms of worst-case complexity, $(2+p)$-SAT is NP-complete for any constant $p > 0$ [12,1].

In this work, we further explore the worst-case complexity boundary between P and NPC when $p$ is further reduced (not a constant but a function of $n$). Somewhat surprisingly, such an extension allows us to suggest another candidate for natural problems in NPI under NP ≠ P. In fact, we present a natural problem in NPI under ETH.

In Section 2, we present the necessary definitions and the related properties for our study. In Section 3, we present a candidate for natural problems in NPI and prove that it is not in NPC under NP ≠ P. In Section 4, we prove that it is not in P under ETH. We conclude with a discussion in Section 5.

2 Properties of (2+f(n))-SAT

In this section, we introduce the $(2+f(n))$-SAT model. We are mainly concerned with the boundary of $f(n)$ that separates the problems in P from those in NPC. Let $\Phi$ be a formula and denote by $|\Phi|$ the number of clauses in $\Phi$. We introduce the definition of $(2+f(n))$-SAT:

Definition 2.1 ($(2+f(n))$-SAT). Consider a formula which contains $n$ variables and $m$ clauses with the form $\Phi = \Phi_2 \wedge \Phi_3$, where $\Phi_2$ is an instance of 2-SAT which contains $m_2$ 2-clauses, and $\Phi_3$ is an instance of 3-SAT which contains $m_3$ 3-clauses. An instance of $(2+f(n))$-SAT is one satisfying the condition

$$\frac{|\Phi_3|}{|\Phi|} = \frac{m_3}{m} = \frac{m_3}{m_2+m_3} \le f(n).$$

Throughout the paper, we restrict our discussion to instances with $f(n) = |\Phi_3|/|\Phi|$. Indeed, all our claims hold if they hold under this restriction. Let $n_2$ and $n_3$ denote the numbers of variables appearing in $\Phi_2$ and $\Phi_3$, respectively. Note that $m_2 \le 4n_2^2$, $m_3 \le 8n_3^3$, $n_2 \le 2m_2$, $n_3 \le 3m_3$, $n \le 3m$, and that the variables which appear in $\Phi_2$ may appear in $\Phi_3$, and vice versa, i.e., $n \le n_2+n_3 \le 2n$.

Theorem 2.1. For any constant $k > 0$, $(2+k\log n/n^2)$-SAT is in P.

Proof. Consider any instance of $(2+k\log n/n^2)$-SAT ($k > 0$), a formula $\Phi = \Phi_2 \wedge \Phi_3$, where $m_3/(m_2+m_3) = k\log n/n^2$. We get

$$m_3 = \frac{k \log n \cdot m_2}{n^2 - k\log n} \le \frac{k m_2 \log n + k \log n}{n^2} \le \frac{(4kn^2 + k)\log n}{n^2} = \left(4k + \frac{k}{n^2}\right)\log n \le 5k\log n.$$

Note that the variables which appear in $\Phi_2$ may appear in $\Phi_3$, and vice versa. For the $5k\log n$ variables which appear in $\Phi_3$, we can enumerate all of the at most $n^{5k}$ truth assignments, and for each truth assignment we can decide $\Phi_2$ in time polynomial in $n$. Thus $(2+k\log n/n^2)$-SAT ($k > 0$) is in P. □

Claim 1. Given $n$ variables, we can construct a satisfiable formula $\Phi$, where $\Phi$ is an instance of 2-SAT and $|\Phi| \le \frac{3}{2}n^2 - \frac{3}{2}n$.

Proof. We construct 2-clauses as follows: $\left(\frac{1}{2}n^2 - \frac{1}{2}n\right)$ clauses of the form $(x_i \vee x_j)$ ($i \ne j$, $1 \le i,j \le n$), and $(n^2 - n)$ clauses of the form $(x_i \vee \neg x_j)$ ($i \ne j$, $1 \le i,j \le n$). From all these 2-clauses, we select $k$ clauses, $1 \le k \le \frac{3}{2}n^2 - \frac{3}{2}n$, to construct the formula $\Phi$ we need. Then $\Phi$ is satisfiable when all $n$ variables are assigned the value "true". □

Theorem 2.2. $(2+1/n^{2-\varepsilon})$-SAT ($\forall \varepsilon$, $0 < \varepsilon < 2$) is in NPC.

Proof. We show that there is a many-one reduction from 3-SAT to $(2+1/n^{2-\varepsilon})$-SAT ($0 < \varepsilon < 2$). Let $\Phi_3$ be an instance of 3-SAT that contains $n_3$ variables and $m_3$ 3-clauses. Without loss of generality, we assume that $m_3 \ge 2$. We add $n_2 = m_3^{8/\varepsilon}$ new variables and use these new variables to construct a satisfiable formula $\Phi_2$ which contains $m_2$ 2-clauses. Let $m_3/(m_2+m_3) = 1/n^{2-\varepsilon}$ ($0 < \varepsilon < 2$). Then

$$\frac{m_3}{m_2+m_3} = \frac{1}{n^{2-\varepsilon}} \ge \frac{1}{(n_2+n_3)^{2-\varepsilon}},$$

$$m_2 \le \left((n_2+n_3)^{2-\varepsilon} - 1\right)m_3 \le (n_2+n_3)^{2-\varepsilon}\, m_3 \le \left(m_3^{8/\varepsilon} + 3m_3\right)^{2-\varepsilon} m_3.$$

Since $m_3 \ge 2$, we get

$$\left(m_3^{8/\varepsilon} + 3m_3\right)^{2} m_3 \le \left(\tfrac{3}{2}\left(m_3^{8/\varepsilon}\right)^2 - \tfrac{3}{2}\, m_3^{8/\varepsilon}\right) m_3^{8} = \left(\tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2\right) m_3^{8} \le \left(\tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2\right)\left(m_3^{8/\varepsilon} + 3m_3\right)^{\varepsilon}.$$

That is,

$$m_2 \le \left(m_3^{8/\varepsilon} + 3m_3\right)^{2-\varepsilon} m_3 \le \tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2 \;\Rightarrow\; m_2 \le \tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2.$$

The satisfiable formula $\Phi_2$ can therefore be constructed according to Claim 1. Let $\Phi = \Phi_2 \wedge \Phi_3$; then $\Phi$ is an instance of $(2+1/n^{2-\varepsilon})$-SAT ($0 < \varepsilon < 2$), and $\Phi$ is satisfiable if and only if $\Phi_3$ is satisfiable. Note that the above many-one reduction can indeed be constructed in time polynomial in $m_3$ (and also polynomial in $n_3$, since $n_3 \le 3m_3$ and $m_3 \le 8n_3^3$). Obviously, $(2+1/n^{2-\varepsilon})$-SAT ($0 < \varepsilon < 2$) is in NP, so the theorem holds. □

One open problem related to our $(2+f(n))$-SAT model is:

Open problem. Does there exist some $f(n)$ with $k\log n/n^2 < f(n) < 1/n^{2-\varepsilon}$, where $k \ge 0$ and $0 < \varepsilon < 2$, such that $(2+f(n))$-SAT is in (NP − NPC) − P (denoted as NPI) under the assumption P ≠ NP?

Note that, according to the above theorems, $(2+k\log n/n^2)$-SAT ($k \ge 0$) is in P and $(2+1/n^{2-\varepsilon})$-SAT ($0 < \varepsilon < 2$) is NP-complete. Now, we give another candidate, and also another open problem with regard to our $(2+f(n))$-SAT model, for natural problems in NPI under P ≠ NP:

Open problem. In the $(2+f(n))$-SAT model, is $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$) in (NP − NPC) − P under the assumption NP ≠ P?

Note that $k_1\log n/n^2 < (\log n)^k/n^2 < 1/n^{2-\varepsilon}$ for $k \ge 2$, where $k_1 \ge 0$ and $0 < \varepsilon < 2$.

3 A candidate for natural problems in NPI under NP ≠ P

Now, we give another candidate for natural problems in NPI under P ≠ NP, which is a restricted version of $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$). We will prove that it is not NP-complete under the assumption P ≠ NP. In fact, it is in NPI under a stronger but reasonable assumption.

Theorem 3.1. In the $(2+f(n))$-SAT model, if the variables which appear in $\Phi_2$ do not appear in $\Phi_3$, and vice versa, then $(2+(\log n)^k/n^2)$-SAT is not in NPC under the assumption NP ≠ P, $k \ge 2$.

Proof. Clearly, this problem is in NP. We prove the theorem by showing that 3-SAT cannot be reduced to $(2+(\log n)^k/n^2)$-SAT by a many-one reduction, where $k \ge 2$.
Assume that there exists a many-one reduction (denoted by $F$) from 3-SAT to $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$). This means that for any instance of 3-SAT, a formula $\Phi_0$ which contains $n_0$ variables and $m_0$ 3-clauses, we can construct $F(\Phi_0)$, an instance of $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), in time polynomial in $n_0$, where $F(\Phi_0)$ contains $n$ variables and $m$ clauses, and $F(\Phi_0)$ is satisfiable if and only if $\Phi_0$ is satisfiable. Let $F(\Phi_0) = \Phi_2 \wedge \Phi_3$, where $\Phi_2$ is an instance of 2-SAT which contains $m_2$ 2-clauses and $n_2$ variables, and $\Phi_3$ is an instance of 3-SAT which contains $m_3$ 3-clauses and $n_3$ variables. Then

$$\frac{(\log n)^k}{n^2} = \frac{|\Phi_3|}{|\Phi|} = \frac{m_3}{m} = \frac{m_3}{m_2+m_3}, \quad k \ge 2.$$

We consider the relation between $m_3$ and $m_0$. There are two cases.

Case 1: $m_3 \ge m_0$.

Claim 2. $m = m_2 + m_3$ cannot be expressed as a polynomial of $m_3$.

Proof of Claim 2. Firstly, for sufficiently large $n$, $(\log n)^k/n^2 = m_3/m \le \frac{1}{2}$ (i.e., $m \ge 2m_3$), where $k \ge 2$. Secondly,

$$m = m_2 + m_3 \le 4n^2 + m_3 \;\Rightarrow\; n^2 \ge \frac{m - m_3}{4}.$$

Then, for sufficiently large $n$, the following holds (using $n \le 3m$):

$$\frac{m_3}{m} = \frac{(\log n)^k}{n^2} \le \frac{4(\log 3m)^k}{m - m_3} \;\Rightarrow\; 4(\log 3m)^k \ge m_3\,\frac{m - m_3}{m} \ge \frac{1}{2}m_3 \;\Rightarrow\; m \ge \frac{1}{3}\, 2^{(m_3/8)^{1/k}}.$$

□

According to Claim 2, in Case 1 $m$ cannot be expressed as a polynomial of $m_3$, and since $m_3 \ge m_0$, $m$ also cannot be expressed as a polynomial of $m_0$ (and hence not as a polynomial of $n_0$ either, since $m_0 \le 8n_0^3$). This is absurd, since the many-one reduction $F(\Phi_0)$ must be computed in time polynomial in $n_0$.

Case 2: $m_3 < m_0$. Since we assume that $F(\Phi_0)$ can be constructed in time polynomial in $n_0$, $m_2$ must be expressible as $P(n_0)$, where $P(\cdot)$ is a polynomial. So $m_3 < m_0$ means that we can decrease the number of 3-clauses in $\Phi_0$ by adding $P(n_0)$ 2-clauses (by applying $F$ to $\Phi_0$). However, since we assume that the variables which appear in $\Phi_2$ do not appear in $\Phi_3$, and vice versa, we can in turn apply $F$ to $\Phi_3$, and so on.
Repeating the above process at most $m_0$ times, we can eliminate all 3-clauses in $F(\Phi_0)$ to get a formula $\Phi'$ such that $\Phi'$ is satisfiable if and only if $F(\Phi_0)$ is satisfiable if and only if $\Phi_0$ is satisfiable, where $\Phi'$ contains only 2-clauses and $|\Phi'|$ is at most $m_0 P(n_0)$, or at most $8n_0^3 P(n_0)$, another polynomial in $n_0$. This means that there exists a many-one reduction from 3-SAT to 2-SAT, which contradicts our assumption P ≠ NP.

So, from the arguments above, we can conclude that $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$) is not NP-complete under the assumption P ≠ NP. □

4 Can the candidate be in P?

In this section, we further show that the candidate presented in the previous section is indeed in NPI under ETH.

Definition 4.1 (SE). A language $L$ is in SE if, for any $x \in L$, there exists an algorithm that finds a $y$ with $|y| \le m(x)$ and $R(x,y)$ in time $\mathrm{poly}(|x|)\,2^{\varepsilon m(x)}$ for every fixed $\varepsilon$, $0 < \varepsilon < 1$, where $R$ is a polynomial-time relation called the constraint, and $m$ is a polynomial-time computable and polynomially bounded complexity parameter.

Definition 4.2 (SERF). A sub-exponential reduction family (SERF) from $A_1$ with parameter $m_1$ to $A_2$ with parameter $m_2$ is a collection of Turing reductions $M_\varepsilon^{A_2}$ such that for each $\varepsilon$, $0 < \varepsilon < 1$:
(1) $M_\varepsilon^{A_2}(x)$ runs in time at most $\mathrm{poly}(|x|)\,2^{\varepsilon m_1(x)}$.
(2) If $M_\varepsilon^{A_2}(x)$ queries $A_2$ with the input $x'$, then $m_2(x') = O(m_1(x))$ and $|x'| = |x|^{O(1)}$.

If such a reduction family exists, $A_1$ is SERF-reducible to $A_2$. If each problem in SNP is SERF-reducible to a problem $A$, then $A$ is SNP-hard under SERF-reductions. If $A$ is also in SNP, then we say $A$ is SNP-complete under SERF-reductions. Note that SERF-reducibility is transitive and that, if $(A_1, m_1)$ SERF-reduces to $(A_2, m_2)$ and $(A_2, m_2) \in$ SE, then $(A_1, m_1) \in$ SE [5].

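The last property, that membership in SE is preserved by SERF-reductions, can be checked by a short running-time calculation. The following is a sketch of the usual argument, not a quotation from [5]. The symbols $c$ (a constant with $m_2(x') \le c\,m_1(x)$ for every query $x'$, assumed to satisfy $c \ge 1$), $\delta$, and $T(x)$ (the total running time) are notation introduced here for illustration. Fix $\varepsilon$ with $0 < \varepsilon < 1$, run $M_{\varepsilon/2}^{A_2}$ on $x$, and answer each query with the SE algorithm for $A_2$ at parameter $\delta = \varepsilon/(2c)$:

```latex
\begin{align*}
\#\,\text{queries made by } M_{\varepsilon/2}^{A_2}(x)
  &\le \mathrm{poly}(|x|)\, 2^{(\varepsilon/2)\, m_1(x)}, \\
\text{time per query } x'
  &\le \mathrm{poly}(|x'|)\, 2^{\delta\, m_2(x')}
   \le \mathrm{poly}(|x|)\, 2^{\delta c\, m_1(x)}
   = \mathrm{poly}(|x|)\, 2^{(\varepsilon/2)\, m_1(x)}, \\
T(x)
  &\le \mathrm{poly}(|x|)\, 2^{(\varepsilon/2)\, m_1(x)} \cdot
       \mathrm{poly}(|x|)\, 2^{(\varepsilon/2)\, m_1(x)}
   = \mathrm{poly}(|x|)\, 2^{\varepsilon\, m_1(x)}.
\end{align*}
```

The second line uses $|x'| = |x|^{O(1)}$ from condition (2) of Definition 4.2, so a polynomial in $|x'|$ is also a polynomial in $|x|$.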
Definition 4.3 (Strong many-one reduction). Let $A_1$ be a problem with complexity parameter $m_1$ and constraint $R_1$, and let $A_2$ be a problem with complexity parameter $m_2$ and constraint $R_2$. A many-one reduction $f$ from $A_1$ to $A_2$ is called a strong many-one reduction if $m_2(f(x)) = O(m_1(x))$. A strong many-one reduction is a special case of a SERF-reduction [5].

Lemma 4.1. 3-SAT with complexity parameter $n$, the number of variables, is SERF-reducible to 3-SAT with complexity parameter $m$, the number of clauses [5].

Lemma 4.2. 3-SAT is SNP-complete under SERF-reductions, with either clauses or variables as the parameter [5].

Definition 4.4 (3-ESAT). 3-ESAT is the variant of 3-SAT in which, for every instance $\Phi$, the number of clauses is equal to the number of variables that appear in $\Phi$.

Claim 3. Given $n$ ($n \ge 5$) variables, we can construct a satisfiable formula $\Phi$ in time polynomial in $n$, where $\Phi$ is an instance of 3-SAT and $|\Phi| \le 2n$.

Proof. We construct $2n$ 3-clauses of the form $x_i \vee x_j \vee x_k$, where $1 \le i,j,k \le n$, $i \ne j$, $i \ne k$, $j \ne k$. This can be done since there are $\binom{n}{3} \ge 2n$ 3-clauses of this form. Then we select $k$ of these 3-clauses, $1 \le k \le 2n$, to construct the formula $\Phi$. $\Phi$ is satisfiable when all $n$ variables are assigned the value "true". □

Theorem 4.1. 3-ESAT is SNP-hard under SERF-reductions, with either clauses or variables as the parameter. Consequently, 3-ESAT ∈ SE implies SNP ⊆ SE.

Proof. According to Lemma 4.1, Lemma 4.2 and the definition of strong many-one reduction, we only need to show that there exists a strong many-one reduction from 3-SAT with $m$ (the number of clauses) as complexity parameter to 3-ESAT with $m$ as complexity parameter. For any given instance of 3-SAT, a formula $\Phi_0$ which contains $n_0$ variables and $m_0$ clauses, we construct the many-one reduction according to whether $m_0 > n_0$ or not.
Firstly, if $m_0 > n_0$, we add $\frac{3}{2}(m_0 - n_0)$ new variables and use them to construct a formula $\Phi_1$ which contains $\frac{1}{2}(m_0 - n_0)$ clauses, in which each of the $\frac{3}{2}(m_0 - n_0)$ new variables appears exactly once. This means that $\Phi_1$ is always satisfiable. Let $\Phi = \Phi_1 \wedge \Phi_0$. Then $\Phi$ is an instance of 3-ESAT, since $m_0 + \frac{1}{2}(m_0 - n_0) = n_0 + \frac{3}{2}(m_0 - n_0)$, $\Phi$ is satisfiable if and only if $\Phi_0$ is satisfiable, and the reduction can be done in time polynomial in $n_0$. Note that $m_0 + \frac{1}{2}(m_0 - n_0) < 2m_0$.

In the second case, we add $n_1$ new variables, where $n_1 = \max\{n_0 - m_0, 5\}$, and construct a satisfiable formula $\Phi_1$ containing $n_1 + n_0 - m_0$ clauses. This can be done according to Claim 3, since $n_1 + n_0 - m_0 \le 2n_1$. Then, as in the first case, let $\Phi = \Phi_1 \wedge \Phi_0$. We get an instance of 3-ESAT with parameter $n_1 + n_0$, and $\Phi$ is satisfiable if and only if $\Phi_0$ is satisfiable. Thus, the reduction is done in time polynomial in $n_0$. Note that

$$(n_1 + n_0 - m_0) + m_0 = \max\{2n_0 - m_0,\; 5 + n_0\} \le \max\{5m_0,\; 3m_0 + 5\}.$$

Then, according to the properties of SERF-reductions, the theorem holds. □

From the above proof, it is also easy to see that 3-ESAT is NP-complete.

Definition 4.5 (ETH). Define $s$ to be the infimum of $\{\delta$ : there exists an $O(2^{\delta n})$ algorithm for solving 3-ESAT$\}$. ETH for 3-ESAT is the statement that $s > 0$. In other words, 3-ESAT does not have a sub-exponential time algorithm.

Note that this hypothesis is stronger than NP ≠ P, yet plausible according to both the theoretical and the practical arguments presented in Section 1. Under this assumption, we have the following result.

Theorem 4.2. In the $(2+f(n))$-SAT model, if the variables which appear in $\Phi_2$ do not appear in $\Phi_3$, and vice versa, then $(2+(\log n)^k/n^2)$-SAT is in NPI under ETH for 3-ESAT, $k \ge 2$.

Proof. Consider the special case of $(2+(\log n)^k/n^2)$-SAT where $\Phi_3$ is an instance of 3-ESAT with $n_3 = m_3 = (\log n)^k$ and $\Phi_2$ is always satisfiable.
That is,

$$\frac{m_3}{m} = \frac{m_3}{m_2+m_3} = \frac{(\log n)^k}{n^2} = \frac{m_3}{(n_2+n_3)^2}, \qquad m_2 = (n_2+n_3)^2 - m_3 \le (n_2+n_3)^2.$$

Note that $n_2 = n - n_3 = n - (\log n)^k$ and $n_3 = (\log n)^k$, so for sufficiently large $n$ we get $(n_2+n_3)^2 \le \frac{3}{2}n_2^2 - \frac{3}{2}n_2$. This means that the special case of $(2+(\log n)^k/n^2)$-SAT indeed exists according to Claim 1. For this special case of $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), $\Phi_3$ cannot be solved in time polynomial in $n$ under ETH for 3-ESAT, since there are $(\log n)^k$ variables in $\Phi_3$ (a running time of $2^{\delta(\log n)^k}$ with $\delta > 0$ and $k \ge 2$ is superpolynomial in $n$). Neither can $\Phi = \Phi_2 \wedge \Phi_3$, since the variables which appear in $\Phi_2$ do not appear in $\Phi_3$, and vice versa. Thus, $(2+(\log n)^k/n^2)$-SAT is not in P under ETH for 3-ESAT, $k \ge 2$, and together with Theorem 3.1 the theorem holds. □

The more general case of $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), where the variables which appear in $\Phi_2$ may appear in $\Phi_3$, and vice versa, is currently under investigation.

5 Remarks and conclusion

In this work, we study the boundary between P and NPC for the $(2+p)$-SAT model when $p$ is considered as a function of $n$, the number of variables in the Boolean formula. The model allows us to obtain a natural problem in NPI under the ETH assumption. It is an interesting open problem whether this problem can be shown to be in NPI under the weaker assumption NP ≠ P.

Acknowledgements

The authors are grateful to the anonymous referees for their many valuable suggestions and constructive criticism, which have greatly improved earlier versions of this paper. The authors are also grateful to Shirley Cheung for her valuable help in preparing this paper.

References

[1] W. Anderson, Solving problems in finite time, Nature 400 (1999) 115–116.
[2] R. Canetti, O. Goldreich, S. Goldwasser, S. Micali, Resettable zero-knowledge, in: Frances Yao (Ed.), Proceedings of the STOC'00, ACM Press, Portland, OR, USA, 2000, pp. 235–244.
[3] C. Dwork, M. Naor, Zaps and their applications, in: Proceedings of the FOCS'00, IEEE Computer Society Press, Redondo Beach, CA, 2000, pp. 283–293.
[4] O. Goldreich, Introduction to complexity, Lecture Notes, Weizmann Institute, Israel, 1999, pp. 23–25, available from http://theory.lcs.mit.edu/~oded/ .
[5] R. Impagliazzo, R. Paturi, Which problems have strongly exponential complexity?, in: Proceedings of the FOCS'98, IEEE Computer Society Press, Palo Alto, CA, 1998, pp. 653–664.
[6] R. Impagliazzo, R. Paturi, Complexity of k-SAT, J. Comput. System Sci. 62 (2001) 367–375.
[7] S. Kirkpatrick, B. Selman, Critical behavior in the satisfiability of random Boolean expressions, Science 264 (1994) 1297–1301.
[8] R.E. Ladner, On the structure of polynomial time reducibility, J. Assoc. Comput. Mach. 22 (1975) 155–171.
[9] S. Micali, M. Rabin, S. Vadhan, Verifiable random functions, in: Proceedings of the FOCS'99, IEEE Computer Society Press, New York, USA, 1999, pp. 120–130.
[10] S. Micali, L. Reyzin, Soundness in the public-key model, in: Joe Kilian (Ed.), Proceedings of the Crypto'01, Lecture Notes in Computer Science, Vol. 2139, Springer, Berlin, 2001, pp. 542–565.
[11] R. Monasson, R. Zecchina, Tricritical points in random combinatorics: the 2+p SAT case, J. Phys. A 31 (1998) 9209–9217.
[12] R. Monasson, R. Zecchina, S. Kirkpatrick, B. Selman, L. Troyansky, Determining computational complexity from characteristic 'phase transitions', Nature 400 (1999) 133–137.
[13] H. Papadimitriou, Computational Complexity, Addison-Wesley, Reading, MA, 1994, pp. 329–332.
[14] U. Schöning, A probabilistic algorithm for k-SAT and constraint satisfaction problems, in: Proceedings of the FOCS'99, IEEE Computer Society Press, New York, USA, 1999, pp. 410–420.