Tuesday, June 26, 2018

Fun With SQL: Functions in Postgres

DZone Database Zone

In our previous Fun with SQL post on the Citus Data blog, we covered window functions. Window functions are a special class of functions that let you grab values across rows and then perform some logic on them. By jumping ahead to window functions, we skipped the many other handy functions that exist within Postgres natively. There are, in fact, several hundred built-in functions, and you can also create your own user-defined functions (UDFs) when you need something custom. Today, we’re going to walk through just a small sampling of SQL functions that can be extremely handy in PostgreSQL.

Arrays

First, arrays are a first-class datatype within Postgres. You can have an array of text or an array of numbers. Personally, I love using arrays when dealing with category tags. You can also index arrays, which can make querying extremely fast, but even if you’re not putting arrays directly into your database, you may want to build up arrays within your query.
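A minimal sketch of both patterns, assuming a hypothetical posts table with a text[] column of tags (table and column names are illustrative, not from the article):

```sql
-- Hypothetical posts table with a text[] column of category tags
CREATE TABLE posts (
    id    serial PRIMARY KEY,
    title text NOT NULL,
    tags  text[] NOT NULL DEFAULT '{}'
);

-- A GIN index makes containment queries on the array fast
CREATE INDEX posts_tags_idx ON posts USING GIN (tags);

-- Find posts tagged 'sql' (array containment operator @>)
SELECT title FROM posts WHERE tags @> ARRAY['sql'];

-- Build up an array inside a query instead of storing one
SELECT array_agg(title ORDER BY title) FROM posts;
```

The `@>` containment query is the case the GIN index accelerates; `array_agg` is the usual way to build arrays on the fly in a query.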

What’s New with AWS – Week of June 18, 2018

Data Videos

Chunk Change: InnoDB Buffer Pool Resizing

DZone Database Zone

Since MySQL 5.7.5, we have been able to resize the InnoDB Buffer Pool dynamically. This new feature also introduced a new variable, innodb_buffer_pool_chunk_size, which defines the chunk size by which the buffer pool is enlarged or reduced. This variable is not dynamic, and if it is incorrectly configured, it can lead to undesired situations.

Let's see first how innodb_buffer_pool_size, innodb_buffer_pool_instances, and innodb_buffer_pool_chunk_size interact:
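As a sketch of that interaction (the values below are illustrative, not recommendations): MySQL requires innodb_buffer_pool_size to be a multiple of innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances, and silently rounds it up to the next multiple otherwise.

```ini
# my.cnf sketch. The buffer pool is allocated in chunks, and MySQL
# rounds innodb_buffer_pool_size up to a multiple of
# innodb_buffer_pool_chunk_size * innodb_buffer_pool_instances.
[mysqld]
innodb_buffer_pool_size       = 8G
innodb_buffer_pool_instances  = 8
innodb_buffer_pool_chunk_size = 128M   # the default; not dynamic
# 128M * 8 instances = 1G; 8G is a multiple of 1G, so 8G is kept as-is.
# With innodb_buffer_pool_size = 7.5G, MySQL would round up to 8G.
```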

Power BI Sudoku, Custom fonts, DAX and more... (June 25, 2018)

Data Videos

Monday, June 25, 2018

Traditional Database Security Doesn’t Protect Data

DZone Database Zone

It seems every week there’s a new data breach to read (or tweet) about. I recently discovered this lovely visualization of the growing amount of private data about people like you and me that is being exposed. You can filter and/or sort the data by industry sector, method of leak, and data sensitivity. It makes for a beautifully depressing coffee break.

After reading that, you might like to check to see if your details have been included in any of the data breaches listed on haveibeenpwned.com. Thanks to this site, and an alert I received from it following the 2016 LinkedIn breach, I now use a password manager — and I recommend you do the same.

Sunday, June 24, 2018

Are You a Data Professional? It Pays to Stay Home!

DZone Database Zone
2018 Data Professionals Salary Survey Results

Earlier this year, Brent Ozar completed his 2018 Data Professionals Salary Survey and published the results:

I played with the results last year in my 2017 post titled, "When Does It Pay for a DBA to Have an Associate Degree?" — which was a fun post to write. This year, much of the Twitter conversation focused on a gender gap in the results (see "Female DBAs Make Less Money. Why?").

Saturday, June 23, 2018

AWS Summit Mexico City 2018 - Caso de Éxito Kueski [Spanish]

Data Videos

Codex KV: Properly Generating the File

DZone Database Zone

The previous post has a code sample in it that was figuratively physically painful for me to write. Setting aside the number of syscalls it invokes, the code isn’t all that efficient as I now measure things. It uses way too much managed memory, and it is subject to failures as we increase the amount of data we push through. For this post, I’m going to rewrite the CodexWriter class as I would for code that is going into RavenDB.

I’m sorry, but there is going to be a big jump in the complexity of the code, because I’m going to try to handle performance, parallelism, and resource utilization all at once. The first thing to do is to go into the project’s settings and enable both unsafe code (without which it is nearly impossible to write high-performance code) and C# 7.3 features; we’ll need both.

Amazon Neptune: Build Applications for Highly Connected Datasets

Data Videos

Friday, June 22, 2018

MongoDB Ruby Driver 2.5.x Case-Sensitivity Issues With Hostnames on Replica Sets

DZone Database Zone

Having trouble connecting to MongoDB replica sets after upgrading the MongoDB Ruby driver to 2.5.x? We've recently received a few inquiries about this issue with the latest MongoDB Ruby driver version and wrote this post to share our findings on the problem and cause.

The error message that was encountered on connection attempt was:

Thursday, June 21, 2018

How Realm is Better Compared To SQLite

DZone Database Zone

While starting a new application, we often wonder which database to use, especially if the application is database-intensive. Recently, I came across Realm, which is really well-built and surprisingly fast compared to SQLite. In this post, I aim to show how Realm compares to SQLite.

Let’s start with looking at basic CRUD operations in Realm.

Introducing Amazon EKS

Data Videos

Wednesday, June 20, 2018

Castilleja School Automates Data Protection and Shortens RTOs

Data Videos

Getting Started with MongoDB #3

DZone Database Zone

Hello everyone! In my previous article, I explained CRUD operations in MongoDB, which you can find here. In this article, I will explain some leftover parts like sorting, projection, comparison query operator, logical query operator, and many more.

Before starting, let's insert a document first:

Centralized and Externalized Logging Architecture for Modern Rack Scale Applications using NVMe Shared Storage

DZone Database Zone

“We are a log Management Company that happens to Stream Videos”

-Netflix Chief Architect

2-AWS for Microsoft Workloads: Monitoring of .NET Applications with AWS X-Ray

Data Videos

Tuesday, June 19, 2018

AWS Summit Madrid 2018 - Inteligencia artificial en AWS [Spanish]

Data Videos

Set Up a Database Diagram Using a Stored Procedure In SQL Server

DZone Database Zone
Steps To Be Followed

Create tables.

Create a stored procedure using an inner join between the two tables.
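A minimal sketch of those two steps, with hypothetical table and procedure names (not the article's actual schema):

```sql
-- Step 1: create two related tables
CREATE TABLE Departments (
    DeptId   INT PRIMARY KEY,
    DeptName NVARCHAR(50) NOT NULL
);

CREATE TABLE Employees (
    EmpId   INT PRIMARY KEY,
    EmpName NVARCHAR(50) NOT NULL,
    DeptId  INT FOREIGN KEY REFERENCES Departments(DeptId)
);
GO

-- Step 2: create a stored procedure joining the two tables
CREATE PROCEDURE dbo.GetEmployeesWithDepartments
AS
BEGIN
    SELECT e.EmpId, e.EmpName, d.DeptName
    FROM Employees AS e
    INNER JOIN Departments AS d
        ON d.DeptId = e.DeptId;
END
GO
```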

Monday, June 18, 2018

A Case for GraphQL in Enterprise

DZone Database Zone

GraphQL supports dynamic queries and is type-safe. This reduces the number of APIs to be developed and allows enforcing compile-time checks on the data being requested by consumers.

It was designed to be able to seamlessly front multiple sources of data, reducing the number of complex, cross-functional API dev iterations.

Sunday, June 17, 2018

Operating a Data Warehouse

DZone Database Zone

Having designed and built your data warehouse, I imagine that you’d like to deliver it successfully to the business and run it smoothly on a daily basis. That’s the topic of today’s article.

As digitalization continues apace across all industries, the role and value of a data warehouse — together with its attendant data marts and associated data lake — becomes ever more central to business success. With such informational systems now becoming as important as traditional operational systems, and often more so, it should be self-evident that highly reliable and efficient operating practices must be adopted.

What’s New with AWS – Week of June 11, 2018

Data Videos

Saturday, June 16, 2018

SQL Prompt Safety Net Features for Developers

DZone Database Zone

Occasionally, mistakes happen. You accidentally close an SSMS query tab without saving it before realizing it contained an essential bit of code. You're working late, switching between test and development servers, and accidentally execute code against the wrong server. SSMS conspires against you and crashes unexpectedly, and you lose all your currently open query tabs, some of which you hadn't saved.

We've all been there. I recall one such incident vividly. I was working at Redgate's offices, and a passing developer laughed at my howls of rage as SSMS crashed on me just when I had almost finished a particularly clever stored procedure. A short while later, a higher entity made it happen to him too, so after reflecting soberly for some time, he developed SQL Tab Magic, in a down-tools week project. It became a cult tool and eventually went mainstream as part of SQL Prompt.

Arquivei: From Ingestion to Processing, Data Lake for Fiscal Documents [Portuguese]

Data Videos

Friday, June 15, 2018

Rachel Mushahwar at Intel Talks About the Partnership with AWS Public Sector

Data Videos

BigQuery vs Redshift: Pricing Strategy

DZone Database Zone

In this article, we’re going to break down BigQuery vs Redshift pricing structures and see how they work in detail. 

You can also join a free webinar on managing BigQuery performance and costs. 

Thursday, June 14, 2018

Configuring Memory for Postgres

DZone Database Zone

work_mem is perhaps the most confusing setting within Postgres. work_mem is a configuration within Postgres that determines how much memory can be used during certain operations. At its surface, the work_mem setting seems simple: after all, work_mem just specifies the amount of memory available to be used by internal sort operations and hash tables before writing data to disk. And yet, leaving work_mem unconfigured can bring on a host of issues. What perhaps is more troubling, though, is when you receive an out of memory error on your database and you jump in to tune work_mem, only for it to behave in an unintuitive manner.

Setting Your Default Memory

The work_mem value defaults to 4MB in Postgres, and that’s likely a bit low. This means that each Postgres activity (each join, some sorts, etc.) can consume 4MB before it starts spilling to disk. When Postgres starts writing temp files to disk, things will obviously be much slower than in memory. You can find out whether you’re spilling to disk by searching for "temporary file" within your PostgreSQL logs, provided you have log_temp_files enabled. If you see "temporary file" entries, it can be worth increasing your work_mem.
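A few statements that put this into practice (the 64MB value is purely illustrative, not a recommendation):

```sql
-- Check the current value (defaults to 4MB)
SHOW work_mem;

-- Log every temporary file, regardless of size, so spills show up in the logs
ALTER SYSTEM SET log_temp_files = 0;
SELECT pg_reload_conf();

-- Raise work_mem for the current session only, while testing a query
SET work_mem = '64MB';
```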

Onica: Serverless Monitoring and Analytics for AWS IoT-connected Tanks

Data Videos

Reviewing the Bleve Search Library

DZone Database Zone

Bleve is a Go search engine library, and that means that it hits a few good points with me. It is interesting, it is familiar ground, and it is in a language that I’m not too familiar with, so that is a great chance to learn some more.

I reviewed revision: 298302a511a184dbab2c401e2005c1ce9589a001

Rehost or Rearchitect - Understanding the Why and How of Very Different Paths to Cloud Success

Data Videos

Wednesday, June 13, 2018

Virtual Log Files: 200 or 1000?

DZone Database Zone

Last week, I had the privilege of reviewing possibly the best SQL Server production environment I've seen in Canada. During the follow-up meeting, the senior DBA and I had a discussion about Virtual Log Files (VLFs), disagreeing on the maximum number of Virtual Log Files a transaction log should have. I said 200, and he said 1000.

Both numbers are arbitrary, so let's explore why VLFs exist and why we might prefer one over the other.
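Whichever limit you prefer, you first need to know where you stand. On SQL Server 2016 SP2 and later, sys.dm_db_log_info exposes VLFs per database (older versions can use DBCC LOGINFO); a sketch of a per-database count:

```sql
-- Count virtual log files per database (SQL Server 2016 SP2+)
SELECT d.[name]  AS database_name,
       COUNT(*)  AS vlf_count
FROM sys.databases AS d
CROSS APPLY sys.dm_db_log_info(d.database_id) AS li
GROUP BY d.[name]
ORDER BY vlf_count DESC;
```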

How the Cloud Enables the Future of Mobility

Data Videos

Tuesday, June 12, 2018

Amazon Sumerian: How to Change Entity Color

Data Videos

Getting Enterprise Features to Your MongoDB Community Edition

DZone Database Zone

Many of us need MongoDB Enterprise Edition features but may be short on resources, or would simply like to compare the value before paying for them.

I have summarized several key features of MongoDB Enterprise Edition and their alternatives:

Monday, June 11, 2018

jOOQ 3.11 Released With 4 New Databases, Implicit Joins, Diagnostics, and Much More

DZone Database Zone

Today, jOOQ 3.11 has been released with support for 4 new databases, implicit joins, diagnostics, and much more.

New Databases Supported

At last, 4 new SQL dialects have been added to jOOQ! These are:

Sunday, June 10, 2018

Spring Boot + CockroachDB in Kubernetes/OpenShift

DZone Database Zone
TL;DR: In this post, we look at how to use CockroachDB inside a Spring Boot application. Read on for the details.

In my previous post, I showed why CockroachDB might help you if you need a cloud-native SQL database for your application. I explained how to install it in Kubernetes/OpenShift and how to validate that the data is replicated correctly.

In this post, I am going to show you how to use CockroachDB in a Spring Boot application. Notice that CockroachDB is compatible with the PostgreSQL driver, so in terms of configuration, it is almost the same.
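As a sketch of that configuration (host, database name, and credentials below are assumptions, not the article's actual values), a Spring Boot application.properties can point the stock PostgreSQL JDBC driver at CockroachDB's default SQL port, 26257:

```properties
# application.properties sketch: CockroachDB speaks the PostgreSQL wire
# protocol, so the regular PostgreSQL driver works; only the port differs.
spring.datasource.url=jdbc:postgresql://cockroachdb:26257/bank?sslmode=disable
spring.datasource.username=root
spring.datasource.driver-class-name=org.postgresql.Driver
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgreSQL95Dialect
```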

Saturday, June 09, 2018

What’s New with AWS – Week of June 4, 2018

Data Videos

Where Less Is More But You Still Pay Less: Hosting Your Database on a Raspberry Pi

DZone Database Zone

Is there such a thing as too much performance? No way! You can never have too much of a good thing.

But what happens when your database software is so fast that it hits the limits of your hardware? Even if your database has the ability to double its performance, the nuts and bolts of what it’s running on simply can’t support it. It’s kind of like Scotty screaming at you, “You’ve got to cut power to the warp drive, the ship is breaking up!”

Friday, June 08, 2018

Consistency in Databases

DZone Database Zone

How will you know whether a database is strongly or eventually consistent?

The Rules

R + W > N

Here, N is the number of replicas, W is the number of replicas that must acknowledge a write, and R is the number of replicas consulted on a read. When R + W > N, the read set and write set always overlap in at least one replica, so every read sees the latest acknowledged write.
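The rule can be sketched in a few lines (N, W, and R as defined above; the function name is mine, not the article's):

```python
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    """Quorum rule: with n replicas, w write acks, and r read acks,
    the read and write quorums overlap in at least one replica
    exactly when r + w > n, so reads always see the latest write."""
    return r + w > n

# 3 replicas with majority reads and writes: quorums overlap -> strong
print(is_strongly_consistent(3, 2, 2))  # True
# Write to one replica, read from one: they may miss each other -> eventual
print(is_strongly_consistent(3, 1, 1))  # False
```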

Using Domain-Specific Language to Manipulate NoSQL Databases in Java With Eclipse JNoSQL

DZone Database Zone

From Wikipedia, "A domain-specific language (DSL) is a computer language specialized to a particular application domain." Several books cover DSLs; the most famous, by Martin Fowler, says, "DSLs are small languages, focused on a particular aspect of a software system." Such an API is often referred to as a fluent interface. In the NoSQL world, we have an issue, as the picture below shows: we have four different document NoSQL databases doing exactly the same thing, however, with different APIs. Does it make sense to have a standard for these habitual behaviors? In this article, we'll cover how to perform this manipulation with the Eclipse JNoSQL API.

Querying NoSQL Database Programmatically in Java

To manipulate any entity in all NoSQL types, there is a template interface. The template offers convenience operations to create, update, delete, and query NoSQL databases, and provides a mapping between your domain objects and JNoSQL. It resembles the template method pattern applied to NoSQL databases; however, no inheritance is necessary. There are DocumentTemplate, ColumnTemplate, GraphTemplate, and KeyValueTemplate.

Thursday, June 07, 2018

Meet our Team of Solutions Architects from AWS Japan

Data Videos

Automatic Data Versioning in MariaDB Server 10.3

DZone Database Zone

MariaDB Server 10.3 comes with a new, very useful feature that will ease the design of many applications. Data versioning matters from several perspectives. Compliance might require you to store data changes. For analytical queries, you may want to look at data as it was at a specific point in time, and for auditing purposes, what changes were made, and when, is important. Also, when data is deleted, it can be of great value to recover it from history. MariaDB Server now includes a feature named System-Versioned Tables, which is based on the specification in the SQL:2011 standard. It provides automatic versioning of table data.

I’ll walk through the concept of system-versioned tables with a very simple example, which will show you what it is all about. Let’s start by creating a database and a table.
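A sketch of what such an example might look like (database, table, and timestamp are illustrative, not the article's actual walkthrough):

```sql
CREATE DATABASE IF NOT EXISTS demo;
USE demo;

-- WITH SYSTEM VERSIONING turns on automatic history for the table
CREATE TABLE accounts (
    id      INT PRIMARY KEY,
    balance DECIMAL(10,2)
) WITH SYSTEM VERSIONING;

INSERT INTO accounts VALUES (1, 100.00);
UPDATE accounts SET balance = 50.00 WHERE id = 1;

-- Current data only
SELECT * FROM accounts;

-- The data as it looked at an earlier point in time
SELECT * FROM accounts
FOR SYSTEM_TIME AS OF TIMESTAMP '2018-06-07 12:00:00';

-- Full history, including each row version's validity period
SELECT id, balance, ROW_START, ROW_END
FROM accounts FOR SYSTEM_TIME ALL;
```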

How to Create and Deploy a Deep Learning Project With AWS DeepLens

Data Videos

Wednesday, June 06, 2018

Manage and respond to your GDPR Data Subject requests within Office 365

Data Videos

The Future Isn't in Databases, but in the Data

DZone Database Zone


In the past year, you may have heard me mention my certificates from the Microsoft Professional Program. One certificate was in Data Science, the other in Big Data. I'm currently working on a third certificate, this one in Artificial Intelligence.

You might be wondering why a database guy would be spending so much time on data science, analytics, and AI. Well, I'll tell you.

Tuesday, June 05, 2018

Creating a Sturdy Backup System

DZone Database Zone

At Foreach, we own a Synology RS815+ to store all our backups. These backups come from different sources in our network such as routers, switches, database servers, web servers, application log files, mail servers, and so on.

The Synology NAS makes it really easy to configure file shares and quotas for these backups. However, it lacked a few features:

Monday, June 04, 2018

Kubernetes: The State of Stateful Apps

DZone Database Zone

Over the past year, Kubernetes — also known as K8s — has become a dominant topic of conversation in the infrastructure world. Given its pedigree of literally working at Google-scale, it makes sense that people want to bring that kind of power to their DevOps stories; container orchestration turns many tedious and complex tasks into something as simple as a declarative config file.

The rise of orchestration is predicated on a few things, though. First, organizations have moved toward breaking up monolithic applications into microservices. However, the resulting environments have hundreds (or thousands) of these services that need to be managed. Second, infrastructure has become cheap and disposable — if a machine fails, it's dramatically cheaper to replace it than triage the problems.

How to Wrangle Data for Machine Learning on AWS

Data Videos

Sunday, June 03, 2018

Using Read-through and Write-through in Distributed Cache

DZone Database Zone

With the explosion of extremely high-transaction web apps, SOA, grid computing, and other server applications, data storage is unable to keep up. The reason is that data storage cannot simply keep adding servers to scale out, unlike application architectures, which are extremely scalable.

In these situations, in-memory distributed cache offers an excellent solution to data storage bottlenecks. It spans multiple servers (called a cluster) to pool their memory together and keep all cache synchronized across servers, and it can keep growing this cache cluster endlessly, just like the application servers. This reduces pressure on data storage so that it is no longer a scalability bottleneck.
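Two caching patterns make this transparent to the application: read-through (on a miss, the cache loads from the store itself) and write-through (the cache writes to the store synchronously, keeping both in sync). A minimal, store-agnostic sketch in Python, with a dict standing in for the backing database (class and parameter names are mine, for illustration):

```python
class ReadWriteThroughCache:
    """Sketch of a read-through/write-through cache: the application
    always talks to the cache; the cache talks to the backing store."""

    def __init__(self, load, store):
        self._cache = {}
        self._load = load    # called on a cache miss (read-through)
        self._store = store  # called on every write (write-through)

    def get(self, key):
        if key not in self._cache:          # miss: load from the store
            self._cache[key] = self._load(key)
        return self._cache[key]

    def put(self, key, value):
        self._store(key, value)             # write the store synchronously
        self._cache[key] = value            # keep the cache in sync

# Usage, with a plain dict standing in for the database
db = {"user:1": "alice"}
cache = ReadWriteThroughCache(load=db.__getitem__, store=db.__setitem__)
print(cache.get("user:1"))   # read-through miss, loaded from db: alice
cache.put("user:2", "bob")   # write-through: lands in db and cache
print(db["user:2"])          # bob
```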

Saturday, June 02, 2018

Data Warehouse-Friendly Database Design

DZone Database Zone

A data warehouse is a collection of data that facilitates the decision-making process. It is non-volatile, time-variant, and integrated. Database design, on the other hand, refers to the creation of a detailed data model of a database. The model includes logical and physical design choices along with physical storage parameters.

More and more technical experts are emphasizing the creation of database designs that are coherent with the data warehouse (data warehouse-friendly designs).

Friday, June 01, 2018

AWS Summit Benelux May 2018 Keynote

Data Videos

How to Decrypt Views in SQL Server

DZone Database Zone

“I am using SQL Server 2014. I have made views and I want to migrate those views from one server to another server. But, I need to decrypt my SQL Server 2014 view (encrypted view) as I want to modify the views as per my database requirement.”

Solution:

Sometimes we don’t want anyone to make changes to our views or other database objects.

Daisy Chaining Dataflow in RavenDB

DZone Database Zone

I have talked before about RavenDB’s MapReduce indexes and their ability to output results to a collection, as well as RavenDB’s ETL processes and how we can use them to push some data to another database (a RavenDB database or a relational one).

Bringing these two features together can be surprisingly useful when you start talking about global distributed processing. A concrete example might make this easier to understand.

Tim Bray and Friends | Messaging for Real-Time User Engagement | Guest: Georgie Matthews

Data Videos

Thursday, May 31, 2018

Hands-On: MariaDB ColumnStore Spark Connector

DZone Database Zone

In February, with the release of MariaDB ColumnStore 1.1.3, we introduced a new Apache Spark connector (Beta) that exports data from Spark into MariaDB ColumnStore. The Spark connector is available as part of our MariaDB AX analytics solution and complements our suite of rapid-paced data ingestion tools such as a Kafka data adapter and MaxScale CDC data adapter. The connector empowers users to directly export machine learning results stored in Spark DataFrames to ColumnStore for high performance analytics. Internally, it utilizes ColumnStore's Bulk Data Adapters to inject data directly into MariaDB ColumnStore's WriteEngine.

In this blog, we'll explain how to export the results of a simple machine learning pipeline on the classification example of the well-known mnist handwritten digits dataset. Feel free to start your own copy of our lab environment by typing:

Amazon Sumerian | Ep 5: Creating an Interactive Digital Signage Experience

Data Videos

Wednesday, May 30, 2018

Databases on Kubernetes: How to Recover from Failures and Scale Up and Down in a Few Line Commands

DZone Database Zone

A month ago, Kubernetes launched a beta for Local Persistent Volumes. In summary, it means that if a Pod using a local disk gets killed, no data will be lost (let's ignore edge cases here). The secret is that a new Pod will be rescheduled to run on the same node, leveraging the disk that already exists there.

Of course, its downside is that we are tying our Pod to a specific node, but if we consider the time and effort spent on loading a copy of the data somewhere else, being able to leverage the same disk becomes a big advantage.

Amazon Sumerian - Getting Started 01: User Interface Overview

Data Videos

Tuesday, May 29, 2018

RDS Vs. MySQL on EC2 [Comic]

DZone Database Zone
Is It Really About Cost Every Time?


Which privilege do you want to have:

Saama: Clinical Trials with Saama's Life Science Analytics Cloud

Data Videos

3 Day Coding Challenge: Creating MySQL Admin. for ASP.NET

DZone Database Zone

I have just finished my "3 days coding challenge" and released a new version of Phosphorus Five, containing a stable, extremely secure, and unbelievably lightweight MySQL admin module that allows you to do most of the important tasks you'd normally need MySQL Workbench or PHPMyAdmin to perform. It features the following:

Automatic Syntax Highlighting of your SQL.

Monday, May 28, 2018

How Careful Engineering Led to Processing Over a Trillion Rows Per Second

DZone Database Zone
SELECT stock_symbol, count(*) as c FROM trade GROUP BY stock_symbol ORDER BY c desc LIMIT 10;

On March 13, we published a demonstration of the performance of MemSQL in the context of ad hoc analytical queries. Specifically, we showed that the query above can process 1,280,625,752,550 rows per second on a MemSQL cluster containing 448 Intel Skylake cores clocked at 2.5GHz. In this blog post, we drill down into how this was made possible by carefully designing the code and exploiting distributed execution as well as instruction-level and data-level parallelism.

Why is such a high throughput needed? Users of applications expect a response time of less than a quarter of a second. Higher throughput means more data can be processed within that time frame.

Sunday, May 27, 2018

Streaming Data From MariaDB Server Into MariaDB ColumnStore via MariaDB MaxScale

DZone Database Zone

In this blog post, we look at how to configure Change Data Capture (CDC) from the MariaDB Server to MariaDB ColumnStore via MariaDB MaxScale. Our goal in this blog post is to have our analytical ColumnStore instance reflect the changes that happen on our operational MariaDB Server.

MariaDB MaxScale Configuration

We start by creating a MaxScale configuration with binlogrouter and avrorouter instances. The former acts as a replication slave and fetches binary logs, while the latter processes the binary logs into CDC records.
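A skeletal maxscale.cnf along those lines (service names, server ID, credentials, and paths are all assumptions for illustration, not the article's actual configuration):

```ini
# binlogrouter: connects to the MariaDB Server as a replication slave
# and stores the fetched binary logs locally.
[replication-service]
type=service
router=binlogrouter
router_options=server-id=4000,binlogdir=/var/lib/maxscale,mariadb10-compatibility=1
user=maxuser
passwd=maxpwd

# avrorouter: reads the binlogs fetched by the service above and
# converts them into Avro-format CDC records for ColumnStore ingestion.
[avro-service]
type=service
router=avrorouter
source=replication-service
```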

Saturday, May 26, 2018

Create Inline CRUD Using jQuery and AJAX

DZone Database Zone

Create, read, update, and delete (CRUD): these four actions make up a significant part of any PHP project. By the time developers reach mid-level, they have typically created dozens of CRUD grids. In many cases, CRUD operations are an important part of CMS, inventory, and accounts management systems.

The idea behind the CRUD operations is to empower the users so that they could use the app to the maximum. All the information generated or modified through CRUD operations is stored in a database (generally MySQL).

How a Rideshare Giant Uses AI to Detect Business Anomalies

Data Videos

Mssql-cli Command-Line Query Tool

DZone Database Zone

A recent announcement on the release of several SQL Server tools has raised expectations across various groups. Product requirements and business are almost always a trade-off, and striking the right balance in a product in terms of the toolset is a sign of a successful product. After testing the SQL Operations Studio, I feel that it's a promising tool for many developers, administrators, and DevOps specialists. In my opinion, the mssql-cli tool adds another feature to SQL Server in order to make it a leading database product.

Microsoft announced mssql-cli, a SQL Server user-friendly, command line interactive tool hosted by the dbcli-org community on GitHub. It's an interactive, cross-platform command line query tool. The public preview release of mssql-cli is available for testing. Mssql-cli is based on Python and the command-line interface projects such as pgcli and mycli. Microsoft released this tool under the OSF (Open Source Foundation) BSD 3 license. We can find its source code on GitHub. Now, the tool and the code are available for public preview. The tool is officially supported on Windows, Linux, and MacOS, and is compatible with Python versions 2.7, 3.4, and higher.

Friday, May 25, 2018

The best cloud for your Windows Server workloads

Data Videos

RedisConf18 in Review

DZone Database Zone

Over 1,200 Redis enthusiasts took over Pier 27 on the San Francisco waterfront for three days of training, talks, and fun at RedisConf18. The theme of this year's conference was "Everywhere" and with over 60 breakout sessions across six concurrent tracks, Redis really was everywhere.

This year, RedisConf moved to the beautiful Pier 27, with panoramic views of the San Francisco Bay, both bridges, and several iconic San Francisco landmarks. Pier 27 is the San Francisco Cruise Terminal, originally built as a staging site for the 2013 America's Cup; it serves as a cruise terminal and, during the off-season, a conference venue.

Thursday, May 24, 2018

The Hidden Costs of Half a Database

DZone Database Zone
The Hidden Costs of Half a Database

It costs plenty to install, secure, and implement a good database solution. What balloons the cost is needing additional plugins to make the database meet your business requirements. Even if the database itself is within your budget, you have to factor in buying new hardware, diverting developers who know the plugin technology, and, of course, paying the database vendor for "support" in integrating all these separate parts.

On top of all this mess, these plugins add additional operational layers to your data, killing your performance.

Fender: Teaching How to Play a Guitar with Serverless Technologies

Data Videos
#Data -Fender: Teaching How to Play a Guitar with Serverless Technologies

Wednesday, May 23, 2018

AWS Stockholm Summit May 2018 Keynote

Data Videos
#Data -AWS Stockholm Summit May 2018 Keynote

Mule 4: Database Connector (Part 2)

DZone Database Zone
Mule 4: Database Connector (Part 2)

Dynamic Queries:

To guard against SQL injection, we parameterize the WHERE clause of our SQL query. But what if we need to make not only the WHERE clause but other parts of the query dynamic as well? In Mule 3 this isn't possible: the DB connector's drop-down forces you to choose between a dynamic query and a parameterized one. In Mule 4, the DB connector lets you combine a parameterized WHERE clause with dynamic parts of the query. In this example, a full expression builds the query string so that the table name depends on a variable; the important thing to notice is that although the query text is dynamic, it still uses input parameters:
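Mule's XML/DataWeave configuration isn't reproduced here, but the underlying idea — building the query text dynamically while still binding values as input parameters — can be sketched in Python with the built-in sqlite3 module (the table names and schema are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for t in ("products_us", "products_eu"):
    conn.execute(f"CREATE TABLE {t} (name TEXT, price REAL)")
    conn.execute(f"INSERT INTO {t} VALUES ('widget', 9.99)")

def find_cheaper_than(region: str, max_price: float):
    # The table name is dynamic, but it is chosen from a fixed whitelist --
    # never taken from raw user input -- while the value stays a bound
    # input parameter, so the query remains safe from SQL injection.
    table = {"us": "products_us", "eu": "products_eu"}[region]
    query = f"SELECT name FROM {table} WHERE price < ?"
    return [row[0] for row in conn.execute(query, (max_price,))]
```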

How to do Storage Replica within the same region in Azure (Pt. 2)

Data Videos
#Data -How to do Storage Replica within the same region in Azure (Pt. 2)

Database Connectivity and Transaction in Python

DZone Database Zone
Database Connectivity and Transaction in Python

It is very easy to establish a database connection and execute DML and PL/SQL statements in Python. Here, I am going to cover two modules for connecting to different databases: "cx_Oracle" for Oracle Database, and "pyodbc" for MS SQL Server, Sybase, MySQL, and others.

So, my first example is with "cx_Oracle." I am not going to describe the module in detail; my focus will mainly be on how to connect to the database and execute SQL statements through it. For detailed documentation, please refer to https://cx-oracle.readthedocs.io/en/latest/.
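Both cx_Oracle and pyodbc implement the Python DB-API 2.0 interface, so the connect/cursor/execute/commit pattern is the same across them. The pattern is sketched below with the standard library's sqlite3 module, which follows the same interface, since an Oracle connection requires environment-specific credentials (the cx_Oracle connection string in the comment is an assumed example, and note that cx_Oracle uses `:name` bind placeholders rather than `?`):

```python
import sqlite3

# cx_Oracle equivalent (assumed credentials, shown for illustration only):
#   conn = cx_Oracle.connect("scott/tiger@localhost/orclpdb1")
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("CREATE TABLE emp (empno INTEGER, ename TEXT)")
cur.execute("INSERT INTO emp (empno, ename) VALUES (?, ?)", (7839, "KING"))
conn.commit()  # DML is transactional: changes are not durable until commit

cur.execute("SELECT ename FROM emp WHERE empno = ?", (7839,))
row = cur.fetchone()
cur.close()
```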

Tuesday, May 22, 2018

Automatic Provisioning of Developer Databases with SQL Provision

DZone Database Zone
Automatic Provisioning of Developer Databases with SQL Provision

The GDPR and other regulations require that we be careful in how we handle sensitive data. One of the easiest ways to avoid a data breach incident, and any accompanying fine, is to limit the sensitive data your organization collects and then restrict the "exposure" of that data within your organization. Many high-profile incidents in the last few years have been caused by sensitive data leaking out of database copies held on test and development servers, which are typically less well protected than production servers.

If you want to avoid being mentioned in the news for lax security, then a good start is to ensure you keep PII and other sensitive data away from any less secure environments. One way the GDPR recommends we do this is by pseudonymizing or anonymizing sensitive data before it enters these insecure systems.
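As a concrete illustration, pseudonymization can be as simple as replacing identifiers with a keyed hash before data leaves production. The sketch below is a hypothetical example — the column names and key are invented, and SQL Provision itself applies configurable masking rules rather than this exact code:

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me"  # hypothetical key; kept out of dev/test systems

def pseudonymize(value: str) -> str:
    # A keyed hash is repeatable (so joins across tables still work) but
    # not reversible without the key, unlike a plain hash of a
    # low-entropy value such as an email address.
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()[:16]

rows = [{"email": "jane@example.com"}, {"email": "bob@example.com"}]
masked = [{"email": pseudonymize(r["email"])} for r in rows]
```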

How to re:Invent | Episode 1: re:Invent 2018 - What's New? (AWS Online Tech Talks)

Data Videos
#Data -How to re:Invent | Episode 1: re:Invent 2018 - What's New? (AWS Online Tech Talks)

Monday, May 21, 2018

AWS Knowledge Center Video: How do I install PHP 5.6 and Apache in RHEL 7.2?

Data Videos
#Data -AWS Knowledge Center Video: How do I install PHP 5.6 and Apache in RHEL 7.2?

KSQL Deep Dive — The Open Source Streaming SQL Engine for Apache Kafka

DZone Database Zone
KSQL Deep Dive — The Open Source Streaming SQL Engine for Apache Kafka

I had a workshop at Kafka Meetup Tel Aviv in May 2018: "KSQL Deep Dive — The Open Source Streaming Engine for Apache Kafka".

Here is the agenda, the slides, and the video recording.
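To give a flavor of what the workshop covers, a typical KSQL pair of statements declares a stream over a Kafka topic and then queries it continuously (the topic and column names below are made up for illustration):

```sql
-- Declare a stream over an existing Kafka topic (names are hypothetical)
CREATE STREAM pageviews (viewtime BIGINT, userid VARCHAR, pageid VARCHAR)
  WITH (KAFKA_TOPIC = 'pageviews', VALUE_FORMAT = 'JSON');

-- A continuous query: results keep arriving as new events are produced
SELECT userid, COUNT(*) FROM pageviews GROUP BY userid;
```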

Sunday, May 20, 2018

Finding Code Smells Using SQL Prompt

DZone Database Zone
Finding Code Smells Using SQL Prompt

Using TOP in a SELECT statement without a subsequent ORDER BY clause is legal in SQL Server, but meaningless: asking for the TOP 10 rows implies the data is guaranteed to be in a certain order, and tables have no implicit logical order. You must specify the order explicitly.

In a SELECT statement, you should always use an ORDER BY clause with the TOP clause, to specify which rows are affected by the TOP filter. If you need to implement a paging solution in an application widget, to send chunks or "pages" of data to the client so a user can scroll through it, it is better and easier to use the OFFSET-FETCH subclause of the ORDER BY clause instead of the TOP clause.
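For example, in T-SQL (the table and column names are illustrative):

```sql
-- Deterministic: TOP paired with ORDER BY
SELECT TOP (10) name, price
FROM products
ORDER BY price DESC;

-- Paging: skip the first two pages of 10 rows, fetch the third
SELECT name, price
FROM products
ORDER BY price DESC
OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;
```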
