(2+f(n))-SAT and its properties
Yunlei Zhao,Xiaotie Deng,C. H. Lee,Hong Zhu
DOI: https://doi.org/10.1016/S0166-218X(03)00194-X
IF: 1.254
2004-01-01
Discrete Applied Mathematics
Abstract: Consider a formula that contains $n$ variables with the form $\Phi = \Phi_2 \wedge \Phi_3$, where $\Phi_2$ is an instance of 2-SAT containing $m_2$ 2-clauses and $\Phi_3$ is an instance of 3-SAT containing $m_3$ 3-clauses. $\Phi$ is an instance of $(2+f(n))$-SAT if $m_3/(m_2+m_3) \le f(n)$. We prove that $(2+f(n))$-SAT is in P if $f(n) = O(\log n/n^2)$, and in NPC if $f(n) = 1/n^{2-\varepsilon}$ ($\forall \varepsilon$: $0 < \varepsilon < 2$). Most interestingly, we give a candidate, $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), for natural problems in NP − NPC − P (denoted as NPI) with respect to this $(2+f(n))$-SAT model. We prove that the restricted version of it is not in NPC under P ≠ NP. Actually, it is indeed in NPI under some stronger but plausible assumption, specifically, the exponential-time hypothesis.

Keywords: Computational complexity; SAT; Exponential-time hypothesis

1 Introduction

In 1975, Ladner showed that there exist languages in NP − NPC − P (denoted as NPI) under the assumption P ≠ NP [8]. But the language constructed there is not a natural one, because the construction needs to run all Turing machines. So far, no natural problem has been proven to be in NPI under P ≠ NP, and finding such a natural problem is considered an important open problem in complexity theory [13,4]. The problems of graph isomorphism (GI) and factoring, which were suggested by Karp, are regarded as the two most likely candidates [13,4].

The satisfiability problem of Boolean formulas (SAT) has played a central role in the field of computational complexity theory. It is the first NP-complete problem. Up to now, all known algorithms that find a solution for 3-SAT require exponential time in the problem size in the worst case. In practice, the time complexity of the fastest algorithm for 3-SAT is $(4/3)^n$, where $n$ is the number of variables in the formula [14]. It is also an important open question whether sub-exponential time algorithms exist.
The plausibility of such a sub-exponential time algorithm for 3-SAT was investigated in [5], using sub-exponential time reductions. It was shown there that linear-size 3-SAT is complete for the class SNP (strict NP) with respect to such reductions. This implies that if there exists a sub-exponential time algorithm for 3-SAT, then all the languages in SNP can be decided in sub-exponential time. Note that some well-studied problems, such as $k$-SAT and $k$-colorability for any $k \ge 3$, have been proven to be SNP-complete. In light of both the practical and the theoretical support, Impagliazzo and Paturi introduced the exponential-time hypothesis (ETH) for 3-SAT: 3-SAT does not have a sub-exponential-time algorithm [6]. Although ETH is stronger than NP ≠ P, it is still quite reasonable. In recent advances in cryptography, many important cryptographic primitives and protocols were constructed under the ETH for the one-way functions DLP or RSA, e.g., verifiable pseudorandom functions [9], verifiable pseudorandom generators [3], and resettable zero-knowledge argument systems for NP [2,10].

On the other hand, there has recently been growing interest in studying the link between the computational hardness of decision problems and the phase boundaries in physical systems [1,12]. It was observed that, as in physical systems, dramatic changes in computational difficulty and solution character occur across certain phase boundaries. NP-complete problems become easier to solve away from the boundary, and the hardest problems occur at the phase boundary [7,12].

To understand the onset of exponential complexity that occurs when going from a problem in P (2-SAT) to a problem that is NP-complete (3-SAT), the $(2+p)$-SAT model was introduced in [11,12], where $p$ is a constant and $0 \le p \le 1$. An instance of $(2+p)$-SAT is a formula with $m$ clauses, of which $(1-p)m$ contain two variables (2-clauses) and $pm$ contain three variables (3-clauses).
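As an illustration of the $(2+p)$-SAT model just described, the following Python sketch builds a random instance with $(1-p)m$ 2-clauses and $pm$ 3-clauses and reports its 3-clause fraction. The function names, the signed-integer literal encoding, the rounding of $pm$ to the nearest integer, and the uniform choice of variables and signs are all assumptions made for this example; they are not taken from [11,12].

```python
import random


def random_2p_sat(n, m, p, seed=0):
    """Return a random (2+p)-SAT instance as a list of clauses.

    A clause is a tuple of non-zero integers: +i stands for variable x_i
    and -i for its negation (hypothetical encoding chosen for this sketch).
    """
    if n < 3:
        raise ValueError("need at least 3 variables to form a 3-clause")
    if m < 1 or not 0 <= p <= 1:
        raise ValueError("need m >= 1 and 0 <= p <= 1")
    rng = random.Random(seed)
    m3 = round(p * m)   # number of 3-clauses (rounding is an assumption)
    m2 = m - m3         # number of 2-clauses
    clauses = []
    for width, count in ((2, m2), (3, m3)):
        for _ in range(count):
            # pick distinct variables, then give each a random sign
            variables = rng.sample(range(1, n + 1), width)
            clauses.append(
                tuple(v if rng.random() < 0.5 else -v for v in variables)
            )
    return clauses


def three_clause_fraction(clauses):
    """Fraction m3 / (m2 + m3) of 3-clauses in a non-empty formula."""
    m3 = sum(1 for clause in clauses if len(clause) == 3)
    return m3 / len(clauses)
```

With `p = 0` the generator yields a pure 2-SAT instance and with `p = 1` a pure 3-SAT instance, matching the two endpoints of the model.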
$(2+p)$-SAT smoothly interpolates between 2-SAT ($p=0$) and 3-SAT ($p=1$) when the instances are generated randomly. The median computation cost scales linearly with $n$ (the number of variables) when $p < p_0$ and exponentially for $p > p_0$, where $p_0$ lies between 0.4 and 0.416 [12]. However, for the worst-case complexity, $(2+p)$-SAT is NP-complete for any constant $p > 0$ [12,1].

In this work, we further explore the worst-case complexity boundary between P and NPC when $p$ is further reduced (not a constant but a function of $n$). Somewhat surprisingly, such an extension allows us to suggest another candidate for natural problems in NPI under NP ≠ P. In fact, we present a natural problem in NPI under ETH. In Section 2, we present the necessary definitions and the related important properties for our study. In Section 3, we present a candidate for natural problems in NPI and prove that it is not in NPC under NP ≠ P. In Section 4, we prove that it is not in P under ETH. We conclude with a discussion in Section 5.

2 Properties of $(2+f(n))$-SAT

In this section, we introduce the $(2+f(n))$-SAT model. We are mainly concerned with the boundary of $f(n)$ that separates the problems between P and NPC. Let $\Phi$ be a formula and denote by $|\Phi|$ the number of clauses in $\Phi$. We introduce the definition of $(2+f(n))$-SAT:

Definition 2.1 ($(2+f(n))$-SAT) Consider a formula which contains $n$ variables and $m$ clauses with the form $\Phi = \Phi_2 \wedge \Phi_3$, where $\Phi_2$ is an instance of 2-SAT which contains $m_2$ 2-clauses, and $\Phi_3$ is an instance of 3-SAT which contains $m_3$ 3-clauses. An instance of $(2+f(n))$-SAT is one satisfying the condition
$$\frac{|\Phi_3|}{|\Phi|} = \frac{m_3}{m} = \frac{m_3}{m_2+m_3} \le f(n).$$

Throughout the paper, we restrict our discussion to instances with $f(n) = |\Phi_3|/|\Phi|$. Indeed, all our claims hold if they hold under this restriction. Writing $n_2$ and $n_3$ for the numbers of variables that appear in $\Phi_2$ and $\Phi_3$, respectively, note that $m_2 \le 4n_2^2$, $m_3 \le 8n_3^3$, $n_2 \le 2m_2$, $n_3 \le 3m_3$, $n \le 3m$, and that the variables which appear in $\Phi_2$ may appear in $\Phi_3$, and vice versa, i.e., $n \le n_2 + n_3 \le 2n$.

Theorem 2.1 For any constant $k > 0$, $(2+k\log n/n^2)$-SAT is in P.

Proof Consider any instance of $(2+k\log n/n^2)$-SAT ($k > 0$), a formula $\Phi = \Phi_2 \wedge \Phi_3$, where $m_3/(m_2+m_3) = k\log n/n^2$. We get
$$m_3 = \frac{k\log n \cdot m_2}{n^2 - k\log n} \le \frac{k m_2 \log n + k\log n}{n^2} \le \frac{(4kn^2 + k)\log n}{n^2} = \left(4k + \frac{k}{n^2}\right)\log n \le 5k\log n.$$
Note that the variables which appear in $\Phi_2$ may appear in $\Phi_3$, and vice versa. For the $5k\log n$ variables which appear in $\Phi_3$, we can enumerate all the at most $n^{5k}$ truth assignments, and for each truth assignment we can decide $\Phi_2$ in time polynomial in $n$; thus $(2+k\log n/n^2)$-SAT ($k \ge 0$) is in P. □

Claim 1 Given $n$ variables, we can construct a satisfiable formula $\Phi$, where $\Phi$ is an instance of 2-SAT and $|\Phi| \le \tfrac{3}{2}n^2 - \tfrac{3}{2}n$.

Proof We construct 2-clauses as follows: $(\tfrac{1}{2}n^2 - \tfrac{1}{2}n)$ clauses of the form $(x_i \vee x_j)$ ($i \ne j$, $1 \le i,j \le n$), and $(n^2 - n)$ clauses of the form $(x_i \vee \neg x_j)$ ($i \ne j$, $1 \le i,j \le n$). From all these 2-clauses, we select $k$ clauses, $1 \le k \le \tfrac{3}{2}n^2 - \tfrac{3}{2}n$, to construct the formula $\Phi$ we need; then $\Phi$ is satisfiable when all these $n$ variables are assigned the value “true”. □

Theorem 2.2 $(2+1/n^{2-\varepsilon})$-SAT ($\forall \varepsilon$, $0 < \varepsilon < 2$) is in NPC.

Proof We show that there is a many-one reduction from 3-SAT to $(2+1/n^{2-\varepsilon})$-SAT ($0 < \varepsilon < 2$). Let $\Phi_3$ be an instance of 3-SAT that contains $n_3$ variables and $m_3$ 3-clauses. Without loss of generality, we assume that $m_3 \ge 2$. Then we add $n_2 = m_3^{8/\varepsilon}$ new variables and use these new variables to construct a satisfiable formula $\Phi_2$ which contains $m_2$ 2-clauses. Let $m_3/(m_2+m_3) = 1/n^{2-\varepsilon}$ ($0 < \varepsilon < 2$); then
$$\frac{m_3}{m_2+m_3} = \frac{1}{n^{2-\varepsilon}} \ge \frac{1}{(n_2+n_3)^{2-\varepsilon}},$$
$$m_2 \le \left((n_2+n_3)^{2-\varepsilon} - 1\right)m_3 \le (n_2+n_3)^{2-\varepsilon}\,m_3 \le \left(m_3^{8/\varepsilon} + 3m_3\right)^{2-\varepsilon} m_3.$$
Since $m_3 \ge 2$, we get
$$\left(m_3^{8/\varepsilon} + 3m_3\right)^{2} m_3 \le \left(\tfrac{3}{2}\left(m_3^{8/\varepsilon}\right)^2 - \tfrac{3}{2}m_3^{8/\varepsilon}\right) m_3^{8} = \left(\tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2\right) m_3^{8} \le \left(\tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2\right)\left(m_3^{8/\varepsilon} + 3m_3\right)^{\varepsilon}.$$
That is,
$$m_2 \le \left(m_3^{8/\varepsilon} + 3m_3\right)^{2-\varepsilon} m_3 \le \tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2 \;\Rightarrow\; m_2 \le \tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2.$$
The satisfiable formula $\Phi_2$ can be constructed according to Claim 1. Let $\Phi = \Phi_2 \wedge \Phi_3$; then $\Phi$ is an instance of $(2+1/n^{2-\varepsilon})$-SAT ($0 < \varepsilon < 2$), and $\Phi$ is satisfiable if and only if $\Phi_3$ is satisfiable. Note that the above many-one reduction can indeed be constructed in time polynomial in $m_3$ (and also polynomial in $n_3$, since $n_3 \le 3m_3$ and $m_3 \le 8n_3^3$). Obviously, $(2+1/n^{2-\varepsilon})$-SAT ($0 < \varepsilon < 2$) is in NP, so the theorem holds. □

One open problem related to our $(2+f(n))$-SAT model is:

Open problem Does there exist some $f(n)$ such that $k\log n/n^2 < f(n) < 1/n^{2-\varepsilon}$, where $k \ge 0$ and $0 < \varepsilon < 2$, so that $(2+f(n))$-SAT is in (NP − NPC) − P (denoted as NPI) under the assumption P ≠ NP?

Note that $(2+k\log n/n^2)$-SAT ($k \ge 0$) is in P and $(2+1/n^{2-\varepsilon})$-SAT ($0 < \varepsilon < 2$) is NP-complete according to the above theorems. Now, we give another candidate, and also another open problem with regard to our $(2+f(n))$-SAT model, for natural problems in NPI under P ≠ NP:

Open problem In the $(2+f(n))$-SAT model, is $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$) in (NP − NPC) − P under the assumption NP ≠ P?

Note that $k_1\log n/n^2 < (\log n)^k/n^2 < 1/n^{2-\varepsilon}$ for $k \ge 2$, where $k_1 \ge 0$ and $0 < \varepsilon < 2$.

3 A candidate for natural problems in NPI under NP ≠ P

Now, we give another candidate for natural problems in NPI under P ≠ NP, which is a restricted version of $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$). We will prove that it is not NP-complete under the assumption P ≠ NP. Actually, it is indeed in NPI under some stronger but reasonable assumptions.

Theorem 3.1 In the $(2+f(n))$-SAT model, if the variables which appear in $\Phi_2$ do not appear in $\Phi_3$, and vice versa, then $(2+(\log n)^k/n^2)$-SAT is not in NPC under the assumption NP ≠ P, $k \ge 2$.

Proof Clearly, this problem is in NP. We prove this theorem by showing that 3-SAT cannot be reduced to $(2+(\log n)^k/n^2)$-SAT by a many-one reduction, where $k \ge 2$.
Assume that there exists a many-one reduction (denoted as $F$) from 3-SAT to $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$). This means that for any instance of 3-SAT, a formula $\Phi_0$ which contains $n_0$ variables and $m_0$ 3-clauses, we can construct $F(\Phi_0)$, an instance of $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), in time polynomial in $n_0$, where $F(\Phi_0)$ contains $n$ variables and $m$ clauses, and $F(\Phi_0)$ is satisfiable if and only if $\Phi_0$ is satisfiable. Let $F(\Phi_0) = \Phi_2 \wedge \Phi_3$, where $\Phi_2$ is an instance of 2-SAT which contains $m_2$ 2-clauses and $n_2$ variables and $\Phi_3$ is an instance of 3-SAT which contains $m_3$ 3-clauses and $n_3$ variables; then $(\log n)^k/n^2 = |\Phi_3|/|\Phi| = m_3/m = m_3/(m_2+m_3)$, $k \ge 2$. We consider the relation between $m_3$ and $m_0$; there are two cases.

Case 1: $m_3 \ge m_0$.

Claim 2 $m = m_2 + m_3$ cannot be expressed as a polynomial of $m_3$.

Proof of Claim 2 Firstly, for sufficiently large $n$, $(\log n)^k/n^2 = m_3/m \le \tfrac{1}{2}$ (i.e. $m \ge 2m_3$), where $k \ge 2$. Secondly,
$$m = m_2 + m_3 \le 4n^2 + m_3 \;\Rightarrow\; n^2 \ge \frac{m - m_3}{4}.$$
Then, for sufficiently large $n$, the following holds:
$$\frac{m_3}{m} = \frac{(\log n)^k}{n^2} \le \frac{4(\log 3m)^k}{m - m_3} \;\Rightarrow\; 4(\log 3m)^k \ge m_3\,\frac{m - m_3}{m} \ge \tfrac{1}{2}m_3 \;\Rightarrow\; m \ge \tfrac{1}{3}\,2^{(m_3/8)^{1/k}}.$$
□

According to Claim 2, in Case 1, $m$ cannot be expressed as a polynomial of $m_3$, and since $m_3 \ge m_0$, $m$ also cannot be expressed as a polynomial of $m_0$ (nor, of course, as a polynomial of $n_0$, since $m_0 \le 8n_0^3$). This is absurd, since the many-one reduction $F(\Phi_0)$ must be computed in time polynomial in $n_0$.

Case 2: $m_3 < m_0$. Since we assume that $F(\Phi_0)$ can be constructed in time polynomial in $n_0$, $m_2$ must be expressible as $P(n_0)$, where $P(\cdot)$ is a polynomial. So, if $m_3 < m_0$, it means that we can decrease the number of 3-clauses in $\Phi_0$ by adding $P(n_0)$ 2-clauses (by imposing $F$ on $\Phi_0$). However, since we assume that the variables which appear in $\Phi_2$ do not appear in $\Phi_3$, and vice versa, we can impose $F$ on $\Phi_3$, and so on.
Repeating the above process at most $m_0$ times, we can eliminate all 3-clauses in $F(\Phi_0)$ to get a formula $\Phi'$ and guarantee that $\Phi'$ is satisfiable if and only if $F(\Phi_0)$ is satisfiable if and only if $\Phi_0$ is satisfiable, where $\Phi'$ contains only 2-clauses and $|\Phi'|$ is at most $m_0 P(n_0)$, or at most $8n_0^3 P(n_0)$, another polynomial of $n_0$. This means that there exists a many-one reduction from 3-SAT to 2-SAT, which contradicts our assumption P ≠ NP.

So, from the arguments above, we can conclude that $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$) is not NP-complete under the assumption P ≠ NP. □

4 Can the candidate be in P?

In this section, we further show that the candidate presented in the previous section is indeed in NPI under ETH.

Definition 4.1 (SE) A language $L \in \mathrm{SE}$ if for any $x \in L$ there exists an algorithm to find a $y$ so that $|y| \le m(x)$ and $R(x,y)$ in time $\mathrm{poly}(|x|)\,2^{\varepsilon m(x)}$ for every fixed $\varepsilon$, $1 > \varepsilon > 0$, where $R$ is a polynomial-time relation called the constraint, and $m$ is a polynomial-time computable and polynomially bounded complexity parameter.

Definition 4.2 (SERF) The sub-exponential reduction family SERF from $A_1$ with parameter $m_1$ to $A_2$ with parameter $m_2$ is defined as a collection of Turing reductions $M_\varepsilon^{A_2}$, such that for each $\varepsilon$, $1 > \varepsilon > 0$:
(1) $M_\varepsilon^{A_2}(x)$ runs in time at most $\mathrm{poly}(|x|)\,2^{\varepsilon m_1(x)}$.
(2) If $M_\varepsilon^{A_2}(x)$ queries $A_2$ with the input $x'$, then $m_2(x') = O(m_1(x))$ and $|x'| = |x|^{O(1)}$.

If such a reduction family exists, $A_1$ is SERF-reducible to $A_2$. If each problem in SNP is SERF-reducible to a problem $A$, then $A$ is SNP-hard under SERF-reductions. If $A$ is also in SNP, then we say $A$ is SNP-complete under SERF-reductions. Note that SERF-reducibility is transitive, and if $(A_1, m_1)$ SERF-reduces to $(A_2, m_2)$ and $(A_2, m_2) \in \mathrm{SE}$, then $(A_1, m_1) \in \mathrm{SE}$ [5].
Definition 4.3 (Strong many-one reduction) Let $A_1$ be a problem with complexity parameter $m_1$ and constraint $R_1$, and $A_2$ be a problem with complexity parameter $m_2$ and constraint $R_2$. A many-one reduction $f$ from $A_1$ to $A_2$ is called a strong many-one reduction if $m_2(f(x)) = O(m_1(x))$. A strong many-one reduction is a special case of a SERF-reduction [5].

Lemma 4.1 3-SAT with complexity parameter $n$, the number of variables, is SERF-reducible to 3-SAT with complexity parameter $m$, the number of clauses [5].

Lemma 4.2 3-SAT is SNP-complete under SERF-reductions, with either clauses or variables as the parameter [5].

Definition 4.4 (3-ESAT) 3-ESAT is a variant of 3-SAT in which, for any instance, say a formula $\Phi$, the number of clauses is equal to the number of variables that appear in $\Phi$.

Claim 3 Given $n$ ($n \ge 5$) variables, we can construct a satisfiable formula $\Phi$ in time polynomial in $n$, where $\Phi$ is an instance of 3-SAT and $|\Phi| \le 2n$.

Proof We construct $2n$ 3-clauses of the form $x_i \vee x_j \vee x_k$, where $1 \le i,j,k \le n$, $i \ne j$, $i \ne k$, $j \ne k$. This can be done since there are $\binom{n}{3} \ge 2n$ 3-clauses of this form. Then we select $k$ of these 3-clauses, $1 \le k \le 2n$, to construct the formula $\Phi$. $\Phi$ is satisfiable when all these $n$ variables are assigned the value “true”. □

Theorem 4.1 3-ESAT is SNP-hard under SERF-reductions, with either clauses or variables as the parameter. Consequently, 3-ESAT $\in$ SE implies SNP $\subseteq$ SE.

Proof According to Lemma 4.1, Lemma 4.2 and the definition of strong many-one reduction, we only need to show that there exists a strong many-one reduction from 3-SAT with $m$ (the number of clauses) as complexity parameter to 3-ESAT with $m$ as complexity parameter. For any given instance of 3-SAT, a formula $\Phi_0$ which contains $n_0$ variables and $m_0$ clauses, we construct the many-one reduction according to whether $m_0 > n_0$ or not.
Firstly, if $m_0 > n_0$, we add $\tfrac{3}{2}(m_0 - n_0)$ new variables and use them to construct a formula $\Phi_1$ which contains $\tfrac{1}{2}(m_0 - n_0)$ clauses, in which each of those $\tfrac{3}{2}(m_0 - n_0)$ new variables appears once and only once. This means that $\Phi_1$ is always satisfiable. Let $\Phi = \Phi_1 \wedge \Phi_0$; then we get an instance of 3-ESAT, since $m_0 + \tfrac{1}{2}(m_0 - n_0) = n_0 + \tfrac{3}{2}(m_0 - n_0)$, and $\Phi$ is satisfiable if and only if $\Phi_0$ is satisfiable, and the reduction can be done in time polynomial in $n_0$. Note that $m_0 + \tfrac{1}{2}(m_0 - n_0) < 2m_0$.

In the second case, we add $n_1$ new variables, where $n_1 = \max\{n_0 - m_0, 5\}$, and construct a satisfiable formula $\Phi_1$ of size $(n_1 + n_0 - m_0)$. This can be done according to Claim 3, since $n_1 + n_0 - m_0 \le 2n_1$. Then, as in the first case, let $\Phi = \Phi_1 \wedge \Phi_0$; we get an instance of 3-ESAT with parameter $n_1 + n_0$, and $\Phi$ is satisfiable if and only if $\Phi_0$ is satisfiable. Thus, the reduction is done in time polynomial in $n_0$. Note that $(n_1 + n_0 - m_0) + m_0 = \max\{2n_0 - m_0, 5 + n_0\} \le \max\{5m_0, 3m_0 + 5\}$. Then, according to the properties of SERF-reductions, the theorem holds. □

From the above proof, it is also easy to see that 3-ESAT is NP-complete.

Definition 4.5 (ETH) Define $s$ to be the infimum of $\{\delta$ : there exists an $O(2^{\delta n})$ algorithm for solving 3-ESAT$\}$. Define ETH for 3-ESAT to be the statement $s > 0$. In other words, 3-ESAT does not have a sub-exponential time algorithm.

Note that this hypothesis is stronger than NP ≠ P, yet plausible according to both the theoretical and the practical arguments presented in Section 1. Under this assumption, we have the following result.

Theorem 4.2 In the $(2+f(n))$-SAT model, if the variables which appear in $\Phi_2$ do not appear in $\Phi_3$, and vice versa, then $(2+(\log n)^k/n^2)$-SAT is indeed in NPI under ETH for 3-ESAT, $k \ge 2$.

Proof Consider the special case of $(2+(\log n)^k/n^2)$-SAT where $\Phi_3$ is an instance of 3-ESAT with $n_3 = m_3 = (\log n)^k$ and $\Phi_2$ is always satisfiable.
That is,
$$\frac{m_3}{m} = \frac{m_3}{m_2+m_3} = \frac{(\log n)^k}{n^2} = \frac{m_3}{(n_2+n_3)^2}, \qquad m_2 = (n_2+n_3)^2 - m_3 \le (n_2+n_3)^2.$$
Note that $n_2 = n - n_3 = n - (\log n)^k$ and $n_3 = (\log n)^k$, so for sufficiently large $n$ we get $(n_2+n_3)^2 \le \tfrac{3}{2}n_2^2 - \tfrac{3}{2}n_2$. This means that the special case of $(2+(\log n)^k/n^2)$-SAT indeed exists, according to Claim 1. Then, for this special case of $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), $\Phi_3$ cannot be solved in time polynomial in $n$ under ETH for 3-ESAT, since there are $(\log n)^k$ variables in $\Phi_3$; neither can $\Phi = \Phi_2 \wedge \Phi_3$, since the variables which appear in $\Phi_2$ do not appear in $\Phi_3$, and vice versa. Thus, $(2+(\log n)^k/n^2)$-SAT is indeed not in P under ETH for 3-ESAT, $k \ge 2$, and together with Theorem 3.1 the theorem holds. □

The more general case of $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), where the variables which appear in $\Phi_2$ may appear in $\Phi_3$, and vice versa, is currently under investigation.

5 Remarks and conclusion

In this work, we study the boundary between P and NPC for the model of $(2+p)$-SAT when $p$ is considered as a function of $n$, the number of variables in the Boolean formula. The model allows us to obtain a natural problem in NPI under the ETH assumption. It is an interesting open problem whether this problem can further be shown to be in NPI under the weaker assumption NP ≠ P.

Acknowledgements

The authors are grateful to the anonymous referees for their many valuable suggestions and constructive criticism, which have greatly improved earlier versions of this paper. The authors are also grateful to Shirley Cheung for her valuable help in preparing this paper.

References

[1] P.W. Anderson, Solving problems in finite time, Nature 400 (1999) 115–116.
[2] R. Canetti, O. Goldreich, S. Goldwasser, S. Micali, Resettable zero-knowledge, in: Frances Yao (Ed.), Proceedings of the STOC’00, ACM Press, Portland, OR, USA, 2000, pp. 235–244.
[3] C. Dwork, M. Naor, Zaps and their applications, in: Proceedings of the FOCS’00, IEEE Computer Society Press, Redondo Beach, CA, USA, 2000, pp. 283–293.
[4] O. Goldreich, Introduction to complexity, Lecture Notes, Weizmann Institute, Israel, 1999, pp. 23–25, available from http://theory.lcs.mit.edu/~oded/ .
[5] R. Impagliazzo, R. Paturi, Which problems have strongly exponential complexity?, in: Proceedings of the FOCS’98, IEEE Computer Society Press, Palo Alto, CA, USA, 1998, pp. 653–664.
[6] R. Impagliazzo, R. Paturi, Complexity of k-SAT, J. Comput. System Sci. 62 (2001) 367–375.
[7] S. Kirkpatrick, B. Selman, Critical behavior in the satisfiability of random Boolean expressions, Science 264 (1994) 1297–1301.
[8] R.E. Ladner, On the structure of polynomial time reducibility, J. Assoc. Comput. Mach. 22 (1975) 155–171.
[9] S. Micali, M. Rabin, S. Vadhan, Verifiable random functions, in: Proceedings of the FOCS’99, IEEE Computer Society Press, New York, USA, 1999, pp. 120–130.
[10] S. Micali, L. Reyzin, Soundness in the public-key model, in: Joe Kilian (Ed.), Proceedings of the Crypto’01, Lecture Notes in Computer Science, Vol. 2139, Springer, Berlin, 2001, pp. 542–565.
[11] R. Monasson, R. Zecchina, Tricritical points in random combinatorics: the 2+p SAT case, J. Phys. A 31 (1998) 9209–9217.
[12] R. Monasson, R. Zecchina, S. Kirkpatrick, B. Selman, L. Troyansky, Determining computational complexity from characteristic ‘phase transitions’, Nature 400 (1999) 133–137.
[13] C.H. Papadimitriou, Computational Complexity, Addison-Wesley, Reading, MA, 1994, pp. 329–332.
[14] U. Schöning, A probabilistic algorithm for k-SAT and constraint satisfaction problems, in: Proceedings of the FOCS’99, IEEE Computer Society Press, New York, USA, 1999, pp. 410–420.
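The enumeration argument in the proof of Theorem 2.1 can be sketched in Python as follows: try every truth assignment to the variables of the 3-clause part, substitute it into the 2-clause part, and decide what remains as a 2-SAT instance. This is only a sketch under stated assumptions. The paper says that the residual 2-clause formula can be decided in polynomial time without naming a method; the implication-graph test with strongly connected components used here is a standard 2-SAT technique swapped in for illustration. All function names and the signed-integer literal encoding (+i for $x_i$, -i for $\neg x_i$) are hypothetical.

```python
from itertools import product


def simplify(clauses, assignment):
    """Apply a partial assignment {var: bool}; None means a clause is falsified."""
    out = []
    for clause in clauses:
        rest, satisfied = [], False
        for lit in clause:
            var = abs(lit)
            if var not in assignment:
                rest.append(lit)
            elif assignment[var] == (lit > 0):
                satisfied = True
                break
        if satisfied:
            continue
        if not rest:
            return None
        out.append(tuple(rest))
    return out


def two_sat_satisfiable(n, clauses):
    """Decide clauses of width 1 or 2 over variables 1..n (Kosaraju SCC test)."""
    def node(lit):
        return 2 * (abs(lit) - 1) + (lit < 0)

    size = 2 * n
    graph = [[] for _ in range(size)]
    rgraph = [[] for _ in range(size)]
    for clause in clauses:
        a, b = clause[0], clause[-1]  # a unit clause (a) is read as (a or a)
        for u, v in ((node(-a), node(b)), (node(-b), node(a))):
            graph[u].append(v)
            rgraph[v].append(u)
    # First pass: record vertices in order of DFS completion (iterative DFS).
    seen, order = [False] * size, []
    for start in range(size):
        if seen[start]:
            continue
        seen[start] = True
        stack = [(start, 0)]
        while stack:
            u, i = stack.pop()
            if i < len(graph[u]):
                stack.append((u, i + 1))
                v = graph[u][i]
                if not seen[v]:
                    seen[v] = True
                    stack.append((v, 0))
            else:
                order.append(u)
    # Second pass: label components on the reversed graph.
    comp, label = [-1] * size, 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in rgraph[u]:
                if comp[v] == -1:
                    comp[v] = label
                    stack.append(v)
        label += 1
    # Satisfiable iff no variable shares a component with its negation.
    return all(comp[2 * v] != comp[2 * v + 1] for v in range(n))


def solve_2_plus_f(n, phi2, phi3):
    """Decide phi2 AND phi3 by enumerating assignments to the variables of phi3."""
    vars3 = sorted({abs(lit) for clause in phi3 for lit in clause})
    for values in product((False, True), repeat=len(vars3)):
        assignment = dict(zip(vars3, values))
        if simplify(phi3, assignment) is None:
            continue  # this assignment falsifies some 3-clause
        rest = simplify(phi2, assignment)
        if rest is not None and two_sat_satisfiable(n, rest):
            return True
    return False
```

The outer loop runs once per assignment to the variables of the 3-clause part, so the total running time is polynomial exactly when that number of assignments is polynomial in $n$, which is what the bound on $m_3$ in the proof provides.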
-
(2+ f(n))-SAT and Its Properties.
Xiaotie Deng,Chan H. Lee,Yunlei Zhao,Hong Zhu
DOI: https://doi.org/10.1007/3-540-45655-4_5
2002-01-01
Abstract: Consider a formula which contains $n$ variables and $m$ clauses with the form $\Phi = \Phi_2 \wedge \Phi_3$, where $\Phi_2$ is an instance of 2-SAT which contains $m_2$ 2-clauses and $\Phi_3$ is an instance of 3-SAT which contains $m_3$ 3-clauses. $\Phi$ is an instance of $(2+f(n))$-SAT if $m_3/(m_2+m_3) \le f(n)$. We prove that $(2+f(n))$-SAT is in P if $f(n) = O(\log n/n^2)$, and in NPC if $f(n) = 1/n^{2-\varepsilon}$ ($\forall \varepsilon$: $0 < \varepsilon < 2$). Most interestingly, we give a candidate, $(2+(\log n)^k/n^2)$-SAT ($k \ge 2$), for natural problems in NP − NPC − P (denoted as NPI) with respect to this $(2+f(n))$-SAT model. We prove that the restricted version of it is not in NPC under the assumption P ≠ NP. Actually, it is indeed in NPI under some stronger but plausible assumption, specifically, the Exponential-Time Hypothesis (ETH), which was introduced by Impagliazzo and Paturi.
-
Some variants of SAT and their properties
Yunlei Zhao
2002-01-01
Abstract: A new model for the well-known satisfiability problem of Boolean formulas (SAT) is introduced. Based on this model, some variants of SAT and their properties are presented. Denote by NP the class of all languages which can be decided by a non-deterministic polynomial-time Turing machine and by P the class of all languages which can be decided by a deterministic polynomial-time Turing machine. This model also allows us to give another candidate for the natural problems in ((NP-NPC)-P), denoted as NPI, under the assumption P≠NP, where NPC represents NP-complete. It is proven that this candidate is not in NPC under P≠NP. However, it is indeed in NPI under some stronger but reasonable assumption, specifically, under the Exponential-Time Hypothesis (ETH). Thus we can partially solve this long-standing important open problem.
-
On the Satisfaction Probabilities of $k$-CNF Formulas
Till Tantau
2024-08-10
Abstract:The satisfaction probability Pr[$\phi$] := Pr$_{\beta:vars(\phi) \to \{0,1\}}[\beta\models \phi]$ of a propositional formula $\phi$ is the likelihood that a random assignment $\beta$ makes the formula true. We study the complexity of the problem $k$SAT-Pr$_{>p}$ = {$\phi$ is a $k$CNF formula | Pr[$\phi$] > p} for fixed $k$ and $p$. While 3SAT-Pr$_{>0}$ = 3SAT is NP-complete and SAT-Pr$_{>1/2}$ is PP-complete, Akmal and Williams recently showed that 3SAT-Pr$_{>1/2}$ lies in P and 4SAT-Pr$_{>1/2}$ is NP-complete; but the methods used to prove these striking results stay silent about, say, 4SAT-Pr$_{>3/4}$, leaving the computational complexity of $k$SAT-Pr$_{>p}$ open for most $k$ and $p$. In the present paper we give a complete characterization in the form of a trichotomy: $k$SAT-Pr$_{>p}$ lies in AC$^0$, is NL-complete, or is NP-complete. The proof of the trichotomy hinges on a new order-theoretic insight: Every set of $k$CNF formulas contains a formula of maximum satisfaction probability. This deceptively simple statement allows us to (1) kernelize $k$SAT-Pr$_{\ge p}$ for the joint parameters $k$ and $p$, (2) show that the variables of the kernel form a backdoor set when the trichotomy states membership in AC$^0$ or NL, and (3) prove locality properties for $k$CNF formulas $\phi$, by which Pr[$\phi$] < $p$ implies that Pr[$\psi$] < $p$ holds already for a subset $\psi$ of $\phi$'s clauses whose size depends only on $k$ and $p$, and Pr[$\phi$] = $p$ implies $\phi \equiv \psi$ for some $k$CNF formula $\psi$ whose size once more depends only on $k$ and $p$.
Computational Complexity,Logic in Computer Science
-
A New Bound for 3-Satisfiable MaxSat and Its Algorithmic Application.
Gregory Gutin,Mark Jones,Dominik Scheder,Anders Yeo
DOI: https://doi.org/10.1016/j.ic.2013.08.008
IF: 1.24
2013-01-01
Information and Computation
Abstract: Let F be a CNF formula with n variables and m clauses. F is 3-satisfiable if for any 3 clauses in F, there is a truth assignment which satisfies all of them. Lieberherr and Specker (1982) and, later, Yannakakis (1994) proved that in each 3-satisfiable CNF formula at least 2/3 of its clauses can be satisfied by a truth assignment. We improve this result by showing that every 3-satisfiable CNF formula F contains a subset of variables U, such that some truth assignment τ will satisfy at least $\frac{2}{3}m + \frac{1}{3}m_U + \rho n'$ clauses, where m is the number of clauses of F, $m_U$ is the number of clauses of F containing a variable from U, n′ is the total number of variables in clauses not containing a variable in U, and ρ is a positive absolute constant. Both U and τ can be found in polynomial time. We use our result to show that the following parameterized problem is fixed-parameter tractable and, moreover, has a kernel with a linear number of variables. In 3-S-MaxSat-AE, we are given a 3-satisfiable CNF formula F with m clauses and asked to determine whether there is an assignment which satisfies at least $\frac{2}{3}m + k$ clauses, where k is the parameter.
-
PPSZ for K ≥ 5: More is Better.
Dominik Scheder
DOI: https://doi.org/10.1145/3349613
2019-01-01
ACM Transactions on Computation Theory
Abstract: We show that for $k \ge 5$, the PPSZ algorithm for $k$-SAT runs exponentially faster if there is an exponential number of satisfying assignments. More precisely, we show that for every $k \ge 5$, there is a strictly increasing function $f : [0,1] \to \mathbb{R}$ with $f(0) = 0$ that has the following property. If $F$ is a $k$-CNF formula over $n$ variables with $|\mathrm{sat}(F)| = 2^{\delta n}$ solutions, then PPSZ finds a satisfying assignment with probability at least $2^{-c_k n - o(n) + f(\delta) n}$. Here, $2^{-c_k n - o(n)}$ is the success probability proved by Paturi et al. [11] for $k$-CNF formulas with a unique satisfying assignment. Our proof rests on a combinatorial lemma: given a set $S \subseteq \{0,1\}^n$, we can partition $\{0,1\}^n$ into subcubes such that each subcube $B$ contains exactly one element of $S$. Such a partition $\mathcal{B}$ induces a distribution on itself, via $\Pr[B] = |B|/2^n$ for each $B \in \mathcal{B}$. We are interested in partitions that induce a distribution of high entropy. We show that, in a certain sense, the worst case ($\min_{S : |S| = s} \max_{\mathcal{B}} H(\mathcal{B})$) is achieved if $S$ is a Hamming ball. This lemma implies that every set $S$ of exponential size allows a partition of linear entropy. This in turn leads to an exponential improvement of the success probability of PPSZ.
-
Satisfiability With Exponential Families
Dominik Scheder,Philipp Zumstein
DOI: https://doi.org/10.1007/978-3-540-72788-0_17
2007-01-01
Abstract: Fix a set $S \subseteq \{0,1\}^*$ of exponential size, e.g. $|S \cap \{0,1\}^n| \in \Omega(\alpha^n)$, $\alpha > 1$. The S-SAT problem asks whether a propositional formula $F$ over variables $v_1, \dots, v_n$ has a satisfying assignment $(v_1, \dots, v_n) \in \{0,1\}^n \cap S$. Our interest is in determining the complexity of S-SAT. We prove that S-SAT is NP-complete for all context-free sets $S$. Furthermore, we show that if S-SAT is in P for some exponential $S$, then SAT and all problems in NP have polynomial circuits. This strongly indicates that satisfiability with exponential families is a hard problem. However, we also give an example of an exponential set $S$ for which the S-SAT problem is not NP-hard, provided P ≠ NP.
-
Properties of the satisfiability threshold of the strictly d-regular random (3,2s)-SAT problem
Yongping Wang,Daoyun Xu
DOI: https://doi.org/10.1007/s11704-020-9248-0
IF: 2.6688
2020-07-11
Frontiers of Computer Science
Abstract: A k-CNF (conjunctive normal form) formula is a regular (k, s)-CNF one if every variable occurs s times in the formula, where k ⩾ 2 and s > 0 are integers. Regular (3, s)-CNF formulas have some good structural properties, so carrying out a probability analysis of the structure for random formulas of this type is easier than conducting such an analysis for random 3-CNF formulas. Some subclasses of the regular (3, s)-CNF formula also have characteristics of intractability that differ from random 3-CNF formulas. For this purpose, we propose the strictly d-regular (k, 2s)-CNF formula, which is a regular (k, 2s)-CNF formula for which d ⩾ 0 is an even number and each literal occurs (s − d/2) or (s + d/2) times (the literals from a variable x are x and ¬x, where x is positive and ¬x is negative). In this paper, we present a new model to generate strictly d-regular random (k, 2s)-CNF formulas, and focus on the strictly d-regular random (3, 2s)-CNF formulas. Let F be a strictly d-regular random (3, 2s)-CNF formula such that 2s > d. We show that there exists a real number s0 such that the formula F is unsatisfiable with high probability when s > s0, and present a numerical solution for the real number s0. The result is supported by simulated experiments, and is consistent with the existing conclusion for the case of d = 0. Furthermore, we have a conjecture: for a given d, the strictly d-regular random (3, 2s)-SAT problem has a SAT-UNSAT (satisfiable-unsatisfiable) phase transition. Our experiments support this conjecture. Finally, our experiments also show that the parameter d is correlated with the intractability of the 3-SAT problem. Therefore, our research may be helpful for generating random hard instances of the 3-CNF formula.
computer science, information systems, theory & methods, software engineering
-
A Polynomial Time Algorithm for 3SAT
Robert Quigley
2024-02-13
Abstract:It is shown that any two clauses in an instance of 3SAT sharing the same terminal which is positive in one clause and negated in the other can imply a new clause composed of the remaining terms from both clauses. Clauses can also imply other clauses as long as all the terms in the implying clauses exist in the implied clause. It is shown an instance of 3SAT is unsatisfiable if and only if it can derive contradicting 1-terminal clauses in exponential time. It is further shown that these contradicting clauses can be implied with the aforementioned techniques without processing clauses of length 4 or greater, reducing the computation to polynomial time. Therefore there is a polynomial time algorithm that will produce contradicting 1-terminal clauses if and only if the instance of 3SAT is unsatisfiable. Since such an algorithm exists and 3SAT is NP-Complete, P = NP.
Computational Complexity
-
On the Complexity of Random Satisfiability Problems with Planted Solutions
Vitaly Feldman,Will Perkins,Santosh Vempala
DOI: https://doi.org/10.1137/16m1078471
2018-01-01
SIAM Journal on Computing
Abstract:The problem of identifying a planted assignment given a random $k$-satisfiability ($k$-SAT) formula consistent with the assignment exhibits a large algorithmic gap: while the planted solution becomes unique and can be identified given a formula with $O(n\log n)$ clauses, there are distributions over clauses for which the best-known efficient algorithms require $n^{k/2}$ clauses. We propose and study a unified model for planted $k$-SAT, which captures well-known special cases. An instance is described by a planted assignment $\sigma$ and a distribution on clauses with $k$ literals. We define its distribution complexity as the largest $r$ for which the distribution is not $r$-wise independent ($1 \le r \le k$ for any distribution with a planted assignment). Our main result is an unconditional lower bound, tight up to logarithmic factors, for statistical (query) algorithms [M. Kearns, J. ACM, 45 (1998), pp. 983--1006; V. Feldman, E. Grigorescu, L. Reyzin, S. S. Vempala, and Y. Xiao, J. ACM, 64 (2017), pp. 8:1--8:37], matching known upper bounds, which, as we show, can be implemented using a statistical algorithm. Since known approaches for problems over distributions have statistical analogues (spectral, Markov Chain Monte Carlo, gradient-based, convex optimization, etc.), this lower bound provides a rigorous explanation of the observed algorithmic gap. The proof introduces a new general technique for the analysis of statistical query algorithms. It also points to a geometric paring phenomenon in the space of all planted assignments. We describe consequences of our lower bounds to Feige's refutation hypothesis [U. Feige, Proceedings of the ACM Symposium on Theory of Computing, 2002, pp. 534--543] and to lower bounds on general convex programs that solve planted $k$-SAT. 
Our bounds also extend to other planted $k$-CSP models and, in particular, provide concrete evidence for the security of Goldreich's one-way function and the associated pseudorandom generator when used with a sufficiently hard predicate [O. Goldreich, preprint, ia.cr/2000/063, 2000].
computer science, theory & methods,mathematics, applied
-
A 3-CNF-SAT descriptor algebra and the solution of the P=NP conjecture
Marcel Rémon,Johan Barthélemy
DOI: https://doi.org/10.48550/arXiv.1609.05709
2016-07-26
Abstract:The relationship between the complexity classes P and NP is an unsolved question in the field of theoretical computer science. In this paper, we investigate a descriptor approach based on lattice properties. This paper proposes a new way to decide the satisfiability of any 3-CNF-SAT problem. The analysis of this exact [non-heuristic] algorithm shows a strictly bounded exponential complexity. The complexity of any 3-CNF-SAT solution is bounded by O(2^490). This over-estimated bound is reached by an algorithm working on the smallest description (via descriptor functions) of the evolving set of solutions as a function of the already considered clauses, without exploring these solutions. Any remark about this paper is warmly welcome.
Computational Complexity
-
Improving PPSZ for 3-SAT Using Critical Variables
Timon Hertli,Robin A. Moser,Dominik Scheder
DOI: https://doi.org/10.4230/LIPIcs.STACS.2011.237
2011-01-01
Abstract:A critical variable of a satisfiable CNF formula is a variable that has the same value in all satisfying assignments. Using a simple case distinction on the fraction of critical variables of a CNF formula, we improve the running time for 3-SAT from O(1.32216^n) by Rolf [10] to O(1.32153^n). Using a different approach, Iwama et al. [5] very recently achieved a running time of O(1.32113^n). Our method nicely combines with theirs, yielding the currently fastest known algorithm with running time O(1.32065^n). We also improve the bound for 4-SAT from O(1.47390^n) [6] to O(1.46928^n), where O(1.46981^n) can be obtained using the methods of [6] and [10].
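The definition of a critical variable in the abstract above can be illustrated by brute force on a tiny formula. This is only a sketch of the definition, not the algorithm of the paper: the function names, the signed-integer clause encoding (literal `i` for variable i, `-i` for its negation), and the example formula are assumptions made for illustration, and the enumeration takes time exponential in n.

```python
from itertools import product

def satisfying_assignments(clauses, n):
    """Yield every assignment (tuple of bools; index i-1 holds variable i) satisfying all clauses."""
    for bits in product([False, True], repeat=n):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            yield bits

def critical_variables(clauses, n):
    """Return {variable: value} for variables that take the same value in all satisfying assignments."""
    sols = list(satisfying_assignments(clauses, n))
    if not sols:
        raise ValueError("formula is unsatisfiable; critical variables are undefined")
    first = sols[0]
    return {i + 1: first[i] for i in range(n) if all(s[i] == first[i] for s in sols)}

# Hypothetical example: (x1) AND (NOT x1 OR x2) AND (x3 OR x4)
example = [[1], [-1, 2], [3, 4]]
```

Here `critical_variables(example, 4)` reports x1 and x2 as critical (both forced to true), while x3 and x4 are not, since each takes both values across the satisfying assignments.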
-
The Algorithmic Phase Transition of Random $k$-SAT for Low Degree Polynomials
Guy Bresler,Brice Huang
DOI: https://doi.org/10.48550/arXiv.2106.02129
2021-10-30
Abstract:Let $\Phi$ be a uniformly random $k$-SAT formula with $n$ variables and $m$ clauses. We study the algorithmic task of finding a satisfying assignment of $\Phi$. It is known that satisfying assignments exist with high probability up to clause density $m/n = 2^k \log 2 - \frac12 (\log 2 + 1) + o_k(1)$, while the best polynomial-time algorithm known, the Fix algorithm of Coja-Oghlan, finds a satisfying assignment at the much lower clause density $(1 - o_k(1)) 2^k \log k / k$. This prompts the question: is it possible to efficiently find a satisfying assignment at higher clause densities?
We prove that the class of low degree polynomial algorithms cannot find a satisfying assignment at clause density $(1 + o_k(1)) \kappa^* 2^k \log k / k$ for a universal constant $\kappa^* \approx 4.911$. This class encompasses Fix, message passing algorithms including Belief and Survey Propagation guided decimation (with bounded or mildly growing number of rounds), and local algorithms on the factor graph. This is the first hardness result for any class of algorithms at clause density within a constant factor of that achieved by Fix. Our proof establishes and leverages a new many-way overlap gap property tailored to random $k$-SAT.
Computational Complexity,Data Structures and Algorithms,Mathematical Physics,Probability,Machine Learning
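To get a feel for the gap between the clause densities quoted in the abstract above, the leading terms can be evaluated numerically. This is a rough sketch under stated assumptions: `log` is taken to be the natural logarithm, all o_k(1) corrections are dropped, κ* is replaced by the approximate value 4.911 from the abstract, and the function names are invented for illustration.

```python
import math

KAPPA_STAR = 4.911  # approximate value of the universal constant quoted in the abstract

def satisfiability_threshold(k):
    """Leading terms 2^k ln 2 - (ln 2 + 1)/2 of the satisfiability density (o_k(1) dropped)."""
    return 2**k * math.log(2) - 0.5 * (math.log(2) + 1)

def fix_density(k):
    """Leading term 2^k ln k / k of the density reached by Fix ((1 - o_k(1)) factor dropped)."""
    return 2**k * math.log(k) / k

def low_degree_barrier(k):
    """Leading term kappa* 2^k ln k / k of the low-degree hardness density."""
    return KAPPA_STAR * fix_density(k)

for k in (10, 20, 30):
    print(k, fix_density(k), low_degree_barrier(k), satisfiability_threshold(k))
```

With the correction terms dropped, the barrier formula exceeds the satisfiability formula for small k (for example k = 10), so this comparison is only informative for larger k, in line with the asymptotic nature of the statements.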
-
Exponential Lower Bounds for the PPSZ k-SAT Algorithm
Shiteng Chen,Dominik Scheder,Navid Talebanfard,Bangsheng Tang
DOI: https://doi.org/10.1137/1.9781611973105.91
2013-01-01
Abstract:In 1998, Paturi, Pudlák, Saks, and Zane presented PPSZ, an elegant randomized algorithm for k-SAT. Fourteen years on, this algorithm is still the fastest known worst-case algorithm. They proved that its expected running time on k-CNF formulas with n variables is at most $2^{(1-\epsilon_k)n}$, where $\epsilon_k \in \Omega(1/k)$. So far, no exponential lower bounds at all have been known. In this paper, we construct hard instances for PPSZ. That is, we construct satisfiable k-CNF formulas over n variables on which the expected running time is at least $2^{(1-\epsilon_k)n}$, for $\epsilon_k \in O(\log^2 k/k)$.
-
Unsatisfiable Linear k-CNFs Exist, for Every k
Dominik Scheder
DOI: https://doi.org/10.48550/arxiv.0708.2336
2007-01-01
Abstract:We call a CNF formula linear if any two clauses have at most one variable in common. Let Linear k-SAT be the problem of deciding whether a given linear k-CNF formula is satisfiable. Here, a k-CNF formula is a CNF formula in which every clause has size exactly k. It was known that for k >= 3, Linear k-SAT is NP-complete if and only if an unsatisfiable linear k-CNF formula exists, and that they do exist for k >= 4. We prove that unsatisfiable linear k-CNF formulas exist for every k. Let f(k) be the minimum number of clauses in an unsatisfiable linear k-CNF formula. We show that f(k) is Omega(k2^k) and O(4^k*k^4), i.e., minimum size unsatisfiable linear k-CNF formulas are significantly larger than minimum size unsatisfiable k-CNF formulas. Finally, we prove that, surprisingly, linear k-CNF formulas do not allow for a larger fraction of clauses to be satisfied than general k-CNF formulas.
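The linearity condition defined in the abstract above is easy to check mechanically. The following is a minimal sketch, assuming clauses are encoded as lists of signed integers (literal `i` for variable i, `-i` for its negation); the function names and example formulas are hypothetical, and requiring k distinct variables per clause is an added assumption beyond "size exactly k".

```python
from itertools import combinations

def is_linear(clauses):
    """True if any two distinct clauses share at most one variable."""
    var_sets = [{abs(l) for l in clause} for clause in clauses]
    return all(len(a & b) <= 1 for a, b in combinations(var_sets, 2))

def is_k_cnf(clauses, k):
    """True if every clause has exactly k literals over k distinct variables."""
    return all(len(c) == k and len({abs(l) for l in c}) == k for c in clauses)

# Each pair of clauses overlaps in exactly one variable, so the formula is linear.
linear_3cnf = [[1, 2, 3], [-1, 4, 5], [2, -4, 6]]
# The two clauses share the variables x1 and x2, so the formula is not linear.
not_linear = [[1, 2, 3], [-1, -2, 4]]
```

Note that linearity depends only on the variables in each clause, not on the signs of the literals, which is why the check works on absolute values.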
-
Going after the k-SAT threshold
Amin Coja-Oghlan,Konstantinos Panagiotou
DOI: https://doi.org/10.1145/2488608.2488698
2013-01-01
Abstract:Random k-SAT is the single most intensely studied example of a random constraint satisfaction problem. But despite substantial progress over the past decade, the threshold for the existence of satisfying assignments is not known precisely for any k≥3. The best current results, based on the second moment method, yield upper and lower bounds that differ by an additive k ⋅ (ln 2)/2, a term that is unbounded in k (Achlioptas, Peres: STOC 2003). The basic reason for this gap is the inherent asymmetry of the Boolean values 'true' and 'false' in contrast to the perfect symmetry, e.g., among the various colors in a graph coloring problem. Here we develop a new asymmetric second moment method that allows us to tackle this issue head on for the first time in the theory of random CSPs. This technique enables us to compute the k-SAT threshold up to an additive ln 2 − 1/2 + O(1/k) ≈ 0.19. Independently of the rigorous work, physicists have developed a sophisticated but non-rigorous technique called the "cavity method" for the study of random CSPs (Mezard, Parisi, Zecchina: Science 2002). Our result matches the best bound that can be obtained from the so-called "replica symmetric" version of the cavity method, and indeed our proof directly harnesses parts of the physics calculations.
-
Further Improvements for SAT in Terms of Formula Length
Junqiang Peng,Mingyu Xiao
DOI: https://doi.org/10.1016/j.ic.2023.105085
2022-01-01
SSRN Electronic Journal
Abstract:In this paper, we prove that the general CNF satisfiability problem can be solved in O*(1.0638^L) time, where L is the length of the input CNF-formula (i.e., the total number of literals in the formula), which improves the previous result of O*(1.0652^L) obtained in 2009. Our algorithm was analyzed by using the measure-and-conquer method. Our improvements are mainly attributed to the following two points: we carefully design branching rules to deal with degree-5 and degree-4 variables to avoid previous bottlenecks; we show that some worst cases will not always happen, and then we can use an amortized technique to get further improvements. In our analyses, we provide some general frameworks for analysis and several lower bounds on the decreasing of the measure to simplify the arguments. These techniques may be used to analyze more algorithms based on the measure-and-conquer method.
-
SARRIGUREN: a polynomial-time complete algorithm for random $k$-SAT with relatively dense clauses
Alfredo Goñi Sarriguren
2024-01-17
Abstract:SARRIGUREN, a new complete algorithm for SAT based on counting clauses (which is valid also for Unique-SAT and #SAT) is described, analyzed and tested. Although existing complete algorithms for SAT perform slower with clauses with many literals, that is an advantage for SARRIGUREN, because the more literals the clauses contain, the higher the probability of overlap among clauses, a property that makes the clause counting process more efficient. Actually, it provides an $O(m^2 \times n/k)$ time complexity for random $k$-SAT instances of $n$ variables and $m$ relatively dense clauses, where that density level is relative to the number of variables $n$, that is, clauses are relatively dense when $k\geq7\sqrt{n}$. Although theoretically there could be worst cases with exponential complexity, the probability of those cases occurring in random $k$-SAT with relatively dense clauses is practically zero. The algorithm has been empirically tested and that polynomial time complexity also holds for $k$-SAT instances with less dense clauses ($k\geq5\sqrt{n}$). That density could, for example, be of only 0.049 working with $n=20000$ variables and $k=989$ literals. In addition, two more complementary algorithms are presented that provide the solutions to $k$-SAT instances and valuable information about the number of solutions for each literal. Although this algorithm does not solve the NP=P problem (it is not a polynomial algorithm for 3-SAT), it broadens the knowledge about that subject, because $k$-SAT with $k>3$ and dense clauses is not harder than 3-SAT. Moreover, the Python implementation of the algorithms, and all the input datasets and obtained results in the experiments are made available.
Data Structures and Algorithms,Computational Complexity
-
Computations with polynomial evaluation oracle: ruling out superlinear SETH-based lower bounds
Tatiana Belova,Alexander S. Kulikov,Ivan Mihajlin,Olga Ratseeva,Grigory Reznikov,Denil Sharipov
2023-07-21
Abstract:The field of fine-grained complexity aims at proving conditional lower bounds on the time complexity of computational problems. One of the most popular assumptions, Strong Exponential Time Hypothesis (SETH), implies that SAT cannot be solved in $2^{(1-\epsilon)n}$ time. In recent years, it has been proved that known algorithms for many problems are optimal under SETH. Despite the wide applicability of SETH, for many problems, there are no known SETH-based lower bounds, so the quest for new reductions continues.
Two barriers for proving SETH-based lower bounds are known. Carmosino et al. (ITCS 2016) introduced the Nondeterministic Strong Exponential Time Hypothesis (NSETH) stating that TAUT cannot be solved in time $2^{(1-\epsilon)n}$ even if one allows nondeterminism. They used this hypothesis to show that some natural fine-grained reductions would be difficult to obtain: proving that, say, 3-SUM requires time $n^{1.5+\epsilon}$ under SETH, breaks NSETH and this, in turn, implies strong circuit lower bounds. Recently, Belova et al. (SODA 2023) introduced the so-called polynomial formulations to show that for many NP-hard problems, proving any explicit exponential lower bound under SETH also implies strong circuit lower bounds.
We prove that for a range of problems from P, including $k$-SUM and triangle detection, proving superlinear lower bounds under SETH is challenging as it implies new circuit lower bounds. To this end, we show that these problems can be solved in nearly linear time with oracle calls to evaluating a polynomial of constant degree. Then, we introduce a strengthening of SETH stating that solving SAT in time $2^{(1-\varepsilon)n}$ is difficult even if one has constant degree polynomial evaluation oracle calls. This hypothesis is stronger and less believable than SETH, but refuting it is still challenging: we show that this implies circuit lower bounds.
Computational Complexity
-
On the geometry of $k$-SAT solutions: what more can PPZ and Schöning's algorithms do?
Per Austrin,Ioana O. Bercea,Mayank Goswami,Nutan Limaye,Adarsh Srinivasan
2024-07-28
Abstract:Given a $k$-CNF formula and an integer $s$, we study algorithms that obtain $s$ solutions to the formula that are maximally dispersed. For $s=2$, the problem of computing the diameter of a $k$-CNF formula was initiated by Crescenzi and Rossi, who showed strong hardness results even for $k=2$. Assuming SETH, the current best upper bound [Angelsmark and Thapper '04] goes to $4^n$ as $k \rightarrow \infty$. As our first result, we give exact algorithms using the Fast Fourier Transform and clique-finding that run in $O^*(2^{(s-1)n})$ and $O^*(s^2 |\Omega_{F}|^{\omega \lceil s/3 \rceil})$ respectively, where $|\Omega_{F}|$ is the size of the solution space of the formula $F$ and $\omega$ is the matrix multiplication exponent.
As our main result, we re-analyze the popular PPZ (Paturi, Pudlak, Zane '97) and Schöning's ('02) algorithms (which find one solution in time $O^*(2^{\varepsilon_{k}n})$ for $\varepsilon_{k} \approx 1-\Theta(1/k)$), and show that in the same time, they can be used to approximate the diameter as well as the dispersion ($s>2$) problems. While we need to modify Schöning's original algorithm, we show that the PPZ algorithm, without any modification, samples solutions in a geometric sense. We believe that this property may be of independent interest.
Finally, we present algorithms to output approximately diverse, approximately optimal solutions to NP-complete optimization problems running in time $\text{poly}(s)O^*(2^{\varepsilon n})$ with $\varepsilon<1$ for several problems such as Minimum Hitting Set and Feedback Vertex Set. For these problems, all existing exact methods for finding optimal diverse solutions have a runtime with at least an exponential dependence on the number of solutions $s$. Our methods find bi-approximations with polynomial dependence on $s$.
Computational Complexity,Data Structures and Algorithms
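For readers unfamiliar with PPZ, which the abstract above re-analyzes, a textbook-style sketch of the basic algorithm may help: process the variables in random order, set a variable as forced when it appears in a unit clause, and guess it uniformly otherwise. This is not the modified analysis or the dispersion algorithms of the paper; the function names, the signed-integer clause encoding, the `rounds` and `seed` parameters, and the example formula are assumptions made for illustration.

```python
import random

def simplify(clauses, lit):
    """Make literal `lit` true: drop satisfied clauses and delete the falsified literal elsewhere."""
    return [[l for l in c if l != -lit] for c in clauses if lit not in c]

def ppz_round(clauses, n, rng):
    """One PPZ pass over variables 1..n; returns an assignment dict or None on failure."""
    order = list(range(1, n + 1))
    rng.shuffle(order)  # uniformly random variable order
    assignment = {}
    for v in order:
        if [v] in clauses:        # unit clause forces v to true
            lit = v
        elif [-v] in clauses:     # unit clause forces v to false
            lit = -v
        else:                     # otherwise guess uniformly at random
            lit = v if rng.random() < 0.5 else -v
        assignment[v] = lit > 0
        clauses = simplify(clauses, lit)
        if [] in clauses:         # some clause has been falsified
            return None
    return assignment

def ppz(clauses, n, rounds=1000, seed=0):
    """Repeat independent PPZ passes; returns a satisfying assignment or None."""
    rng = random.Random(seed)
    for _ in range(rounds):
        result = ppz_round(clauses, n, rng)
        if result is not None:
            return result
    return None

# Hypothetical satisfiable 3-CNF over x1..x3.
formula = [[1, 2, 3], [-1, 2, -3], [1, -2, 3], [-1, -2, 3]]
```

A returned `None` only means that no pass succeeded within `rounds` attempts; it certifies unsatisfiability only in the trivial sense that every pass hit a contradiction, so the sketch is a one-sided randomized search rather than a decision procedure.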