PushdownDB: Accelerating a DBMS using S3 Computation

Xiangyao Yu, Matt Youill, Matthew Woicik, Abdurrahman Ghanem, Marco Serafini, Ashraf Aboulnaga, Michael Stonebraker
DOI: https://doi.org/10.48550/arXiv.2002.05837
2020-02-14
Abstract: This paper studies the effectiveness of pushing parts of DBMS analytics queries into the Simple Storage Service (S3) engine of Amazon Web Services (AWS), using a recently released capability called S3 Select. We show that some DBMS primitives (filter, projection, aggregation) can always be cost-effectively moved into S3. Other more complex operations (join, top-K, group-by) require reimplementation to take advantage of S3 Select and are often candidates for pushdown. We demonstrate these capabilities through experimentation using a new DBMS that we developed, PushdownDB. Experimentation with a collection of queries including TPC-H queries shows that PushdownDB is on average 30% cheaper and 6.7X faster than a baseline that does not use S3 Select.
Databases