Monday, May 28, 2018

How Careful Engineering Led to Processing Over a Trillion Rows Per Second

SELECT stock_symbol, COUNT(*) AS c FROM trade GROUP BY stock_symbol ORDER BY c DESC LIMIT 10;

On March 13, we published a demonstration of MemSQL's performance on ad hoc analytical queries. Specifically, we showed that the query above can process 1,280,625,752,550 rows per second on a MemSQL cluster containing 448 Intel Skylake cores clocked at 2.5 GHz. In this blog post, we drill down into how this was made possible by careful code design and by exploiting distributed execution together with instruction-level and data-level parallelism.
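As a quick sanity check on that number, we can divide the aggregate throughput by the total core-cycles available per second (a back-of-the-envelope calculation using only the figures quoted above):

```python
# Back-of-the-envelope check of the quoted throughput.
total_rows_per_sec = 1_280_625_752_550   # measured aggregate throughput (rows/sec)
cores = 448                              # Intel Skylake cores in the cluster
clock_hz = 2.5e9                         # 2.5 GHz clock

cycles_per_sec = cores * clock_hz        # total core-cycles available per second
rows_per_core_cycle = total_rows_per_sec / cycles_per_sec
print(f"{rows_per_core_cycle:.2f} rows per core per cycle")  # ≈ 1.14
```

In other words, each core must average more than one row per clock cycle, which is only achievable when SIMD instructions process multiple rows per instruction.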

Why is such high throughput needed? Users of interactive applications expect a response time of under a quarter of a second, and higher throughput means more data can be processed within that window.

