Tuesday, June 26, 2018

Fun With SQL: Functions in Postgres

DZone Database Zone

In our previous Fun with SQL post on the Citus Data blog, we covered window functions. Window functions are a special class of function that allows you to grab values across rows and then perform some logic. By jumping ahead to window functions, we missed many of the other handy functions that exist natively within Postgres. There are, in fact, several hundred built-in functions, and you can also create your own user-defined functions (UDFs) if you need something custom. Today, we’re going to walk through just a small sampling of SQL functions that can be extremely handy in PostgreSQL.

Arrays

First, arrays are a first-class datatype within Postgres. You can have an array of text or an array of numbers. Personally, I love using arrays when dealing with category tags. You can also index arrays, which can make querying extremely fast, but even if you’re not putting arrays directly into your database, you may want to build up arrays within your query.

What’s New with AWS – Week of June 18, 2018

Data Videos

Chunk Change: InnoDB Buffer Pool Resizing

DZone Database Zone

Since MySQL 5.7.5, we have been able to dynamically resize the InnoDB buffer pool. This feature also introduced a new variable, innodb_buffer_pool_chunk_size, which defines the chunk size by which the buffer pool is enlarged or reduced. This variable is not dynamic, and if it is incorrectly configured, it could lead to undesired situations.

Let's first see how innodb_buffer_pool_size, innodb_buffer_pool_instances, and innodb_buffer_pool_chunk_size interact:
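
As a rough sketch of that interaction: the MySQL reference manual states that the buffer pool size must be equal to, or a multiple of, innodb_buffer_pool_chunk_size × innodb_buffer_pool_instances, and that a value which is not gets adjusted to one that is. The Python below models this as rounding up. The function name is hypothetical and the arithmetic is a simplification for illustration, not MySQL's own code.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def effective_buffer_pool_size(requested, chunk_size, instances):
    """Round the requested size up to a multiple of chunk_size * instances.

    Hypothetical helper: it models the documented adjustment rule only.
    """
    unit = chunk_size * instances
    units_needed = -(-requested // unit)  # ceiling division
    return units_needed * unit


# Asking for 9 GiB with 128 MiB chunks and 16 instances (unit = 2 GiB).
size = effective_buffer_pool_size(9 * GIB, 128 * MIB, 16)
print(size // GIB)  # 10
```

With these assumed values, a request for 9 GiB ends up as 10 GiB, which is why a large chunk size combined with many instances can allocate noticeably more memory than was asked for.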

Power BI Sudoku, Custom fonts, DAX and more... (June 25, 2018)

Data Videos

Monday, June 25, 2018

Traditional Database Security Doesn’t Protect Data

DZone Database Zone

It seems every week there’s a new data breach to read (or tweet) about. I recently discovered this lovely visualization of the growing amount of private data about people like you and me that is being exposed. You can filter and/or sort the data by industry sector, method of leak, and data sensitivity. It makes for a beautifully depressing coffee break.

After reading that, you might like to check to see if your details have been included in any of the data breaches listed on haveibeenpwned.com. Thanks to this site, and an alert I received from it following the 2016 LinkedIn breach, I now use a password manager — and I recommend you do the same.

Sunday, June 24, 2018

Are You a Data Professional? It Pays to Stay Home!

DZone Database Zone
2018 Data Professionals Salary Survey Results

Earlier this year, Brent Ozar completed his 2018 Data Professionals Salary Survey and published the results:

I played with the results last year in my 2017 post titled, "When Does It Pay for a DBA to Have an Associate Degree?" — which was a fun post to write. This year, much of the Twitter conversation focused on a gender gap in the results (see "Female DBAs Make Less Money. Why?").

Saturday, June 23, 2018

AWS Summit Mexico City 2018 - Caso de Éxito Kueski [Spanish]

Data Videos

Codex KV: Properly Generating the File

DZone Database Zone

The previous post has a code sample in it that was figuratively physically painful for me to write. Leaving aside the number of syscalls that are invoked, the code isn’t all that efficient as I now measure things. It uses way too much managed memory, and it is subject to failures as we increase the amount of data we push through. For this post, I’m going to be rewriting the CodexWriter class as I would for code that is going into RavenDB.

I’m sorry, there is going to be a big jump in the complexity of the code because I’m going to try to handle performance, parallelism, and resource utilization all at once. The first thing to do is to go into the project’s settings and enable both unsafe code (without which it is nearly impossible to write high-performance code) and C# 7.3 features; we’ll need both.

Amazon Neptune: Build Applications for Highly Connected Datasets

Data Videos

Friday, June 22, 2018

MongoDB Ruby Driver 2.5.x Case-Sensitivity Issues With Hostnames on Replica Sets

DZone Database Zone

Having trouble connecting to MongoDB replica sets after upgrading the MongoDB Ruby driver to 2.5.x? We've recently received a few inquiries about this issue with the latest MongoDB Ruby driver version and wrote this post to share our findings on the problem and cause.

The error message encountered on the connection attempt was:

Thursday, June 21, 2018

How Realm is Better Compared To SQLite

DZone Database Zone

When starting a new application, we often wonder which database to use, especially if the application is database intensive. Recently, I came across Realm, which is really well-built and surprisingly fast compared to SQLite. In this post, I aim to show how Realm compares to SQLite.

Let’s start by looking at basic CRUD operations in Realm.

Introducing Amazon EKS

Data Videos

Wednesday, June 20, 2018

Castilleja School Automates Data Protection and Shortens RTOs

Data Videos

Getting Started with MongoDB #3

DZone Database Zone

Hello everyone! In my previous article, I explained CRUD operations in MongoDB, which you can find here. In this article, I will explain some leftover parts like sorting, projection, comparison query operators, logical query operators, and many more.

Before starting, let's insert a document first:

Centralized and Externalized Logging Architecture for Modern Rack Scale Applications using NVMe Shared Storage

DZone Database Zone

“We are a log Management Company that happens to Stream Videos”

-Netflix Chief Architect

2-AWS for Microsoft Workloads: Monitoring of .NET Applications with AWS X-Ray

Data Videos

Tuesday, June 19, 2018

AWS Summit Madrid 2018 - Inteligencia artificial en AWS [Spanish]

Data Videos

Set Up a Database Diagram Using a Stored Procedure In SQL Server

DZone Database Zone
Steps To Be Followed

Create tables.

Create stored procedure using inner join between two tables.

Monday, June 18, 2018

A Case for GraphQL in Enterprise

DZone Database Zone

GraphQL supports dynamic queries and is type-safe. This reduces the number of APIs to be developed and allows enforcing compile-time checks on the data being requested by consumers.

It was designed to be able to seamlessly front multiple sources of data, reducing the number of complex, cross-functional API dev iterations.

Sunday, June 17, 2018

Operating a Data Warehouse

DZone Database Zone

Having designed and built your data warehouse, I imagine that you’d like to deliver it successfully to the business and run it smoothly on a daily basis. That’s the topic of today’s article.

As digitalization continues apace across all industries, the role and value of a data warehouse — together with its attendant data marts and associated data lake — becomes ever more central to business success. With such informational systems now becoming as important as traditional operational systems, and often more so, it should be self-evident that highly reliable and efficient operating practices must be adopted.

What’s New with AWS – Week of June 11, 2018

Data Videos

Saturday, June 16, 2018

SQL Prompt Safety Net Features for Developers

DZone Database Zone

Occasionally, mistakes happen. You accidentally close an SSMS query tab without saving it before realizing it contained an essential bit of code. You're working late, switching between test and development servers, and accidentally execute code against the wrong server. SSMS conspires against you and crashes unexpectedly, and you lose all your currently open query tabs, some of which you hadn't saved.

We've all been there. I recall one such incident vividly. I was working at Redgate's offices, and a passing developer laughed at my howls of rage as SSMS crashed on me just when I had almost finished a particularly clever stored procedure. A short while later, a higher entity made it happen to him too, so after reflecting soberly for some time, he developed SQL Tab Magic, in a down-tools week project. It became a cult tool and eventually went mainstream as part of SQL Prompt.

Arquivei: From Ingestion to Processing, Data Lake for Fiscal Documents [Portuguese]

Data Videos

Friday, June 15, 2018

Rachel Mushahwar at Intel Talks About the Partnership with AWS Public Sector

Data Videos

BigQuery vs Redshift: Pricing Strategy

DZone Database Zone

In this article, we’re going to break down BigQuery vs Redshift pricing structures and see how they work in detail. 

You can also join a free webinar on managing BigQuery performance and costs. 

Thursday, June 14, 2018

Configuring Memory for Postgres

DZone Database Zone

work_mem is perhaps the most confusing setting within Postgres. It determines how much memory can be used during certain operations. On its surface, the work_mem setting seems simple: after all, work_mem just specifies the amount of memory available to be used by internal sort operations and hash tables before writing data to disk. And yet, leaving work_mem unconfigured can bring on a host of issues. What is perhaps more troubling, though, is when you receive an out-of-memory error on your database and you jump in to tune work_mem, only for it to behave in an unintuitive manner.

Setting Your Default Memory

The work_mem value defaults to 4MB in Postgres, and that’s likely a bit low. This means that each Postgres activity (each join, some sorts, etc.) can consume 4MB before it starts spilling to disk. When Postgres starts writing temp files to disk, things will be much slower than in memory. You can find out if you’re spilling to disk by searching for temporary file within your PostgreSQL logs when you have log_temp_files enabled. If you see temporary file, it can be worth increasing your work_mem.
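
Because the limit applies per operation rather than per connection or per server, it can help to do some back-of-the-envelope arithmetic before raising it. The sketch below is only that: the function name is hypothetical, and the connection count and operations-per-query figures are made-up inputs, not measurements. It also ignores parallel workers and every other memory consumer.

```python
MB = 1024 ** 2


def worst_case_work_mem(work_mem_bytes, connections, ops_per_query):
    """Upper bound if every connection runs ops_per_query memory-hungry
    operations (sorts, hash joins) at once, each using its full work_mem."""
    return work_mem_bytes * connections * ops_per_query


# Assumed workload: 100 connections, 3 sort/hash operations per query.
total = worst_case_work_mem(4 * MB, connections=100, ops_per_query=3)
print(total // MB)  # 1200
```

Under these assumed numbers, the 4MB default already allows up to 1,200MB in the worst case, and raising work_mem to 64MB would multiply that bound by 16.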

Onica: Serverless Monitoring and Analytics for AWS IoT-connected Tanks

Data Videos

Reviewing the Bleve Search Library

DZone Database Zone

Bleve is a Go search engine library, and that means that it hits a few good points with me. It is interesting, it is familiar ground, and it is in a language that I’m not too familiar with, so that is a great chance to learn some more.

I reviewed revision: 298302a511a184dbab2c401e2005c1ce9589a001

Rehost or Rearchitect - Understanding the Why and How of Very Different Paths to Cloud Success

Data Videos

Wednesday, June 13, 2018

Virtual Log Files: 200 or 1000?

DZone Database Zone

Last week, I had the privilege of reviewing possibly the best SQL Server production environment I've seen in Canada. During the follow-up meeting, the senior DBA and I had a discussion about Virtual Log Files (VLFs), disagreeing on the maximum number of Virtual Log Files a transaction log should have. I said 200, and he said 1000.

Both numbers are arbitrary, so let's explore why VLFs exist and why we might prefer one over the other.

How the Cloud Enables the Future of Mobility

Data Videos

Tuesday, June 12, 2018

Amazon Sumerian: How to Change Entity Color

Data Videos

Getting Enterprise Features to Your MongoDB Community Edition

DZone Database Zone

Many of us need MongoDB Enterprise Edition but might be short of resources or would like to compare the value.

I have summarized several key features of MongoDB Enterprise Edition and their alternatives:

Monday, June 11, 2018

jOOQ 3.11 Released With 4 New Databases, Implicit Joins, Diagnostics, and Much More

DZone Database Zone

Today, jOOQ 3.11 has been released with support for 4 new databases, implicit joins, diagnostics, and much more.

New Databases Supported

At last, 4 new SQL dialects have been added to jOOQ! These are:

Sunday, June 10, 2018

Spring Boot + CockroachDB in Kubernetes/OpenShift

DZone Database Zone
TL;DR: In this post, we look at how to use CockroachDB inside a Spring Boot application. Read on for the details.

In my previous post, I showed why CockroachDB might help you if you need a cloud-native SQL database for your application. I explained how to install it in Kubernetes/OpenShift and how to validate that the data is replicated correctly.

In this post, I am going to show you how to use CockroachDB in a Spring Boot application. Notice that CockroachDB is compatible with the PostgreSQL driver, so in terms of configuration, it is almost the same.

Saturday, June 09, 2018

What’s New with AWS – Week of June 4, 2018

Data Videos

Where Less Is More But You Still Pay Less: Hosting Your Database on a Raspberry Pi

DZone Database Zone

Is there such a thing as too much performance? No way! You can never have too much of a good thing.

But what happens when your database software is so fast that it hits the limits of your hardware? Even if your database has the ability to double its performance, the nuts and bolts of what it’s running on simply can’t support it. It’s kind of like Scotty screaming at you, “You’ve got to cut power to the warp drive, the ship is breaking up!”

Friday, June 08, 2018

Consistency in Databases

DZone Database Zone

How will you know if a database is strongly or eventually consistent?

The Rules

R + W > N
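
Reading the rule with the usual quorum notation (an assumption here, since the excerpt does not define the symbols): N is the number of replicas, W the number of replicas that must acknowledge a write, and R the number consulted on a read. When R + W > N, every read set overlaps every write set in at least one replica, so a read always reaches a replica holding the latest acknowledged write. A minimal sketch, with a hypothetical function name:

```python
def is_strongly_consistent(n, r, w):
    """Return True when read and write quorums must overlap (R + W > N)."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


print(is_strongly_consistent(n=3, r=2, w=2))  # True: quorum reads and writes
print(is_strongly_consistent(n=3, r=1, w=1))  # False: eventual consistency
print(is_strongly_consistent(n=3, r=1, w=3))  # True: write to all, read one
```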

Using Domain-Specific Language to Manipulate NoSQL Databases in Java With Eclipse JNoSQL

DZone Database Zone

From Wikipedia, "A domain-specific language (DSL) is a computer language specialized to a particular application domain." Several books have been written about DSLs, and the most famous one, by Martin Fowler, says, "DSLs are small languages, focused on a particular aspect of a software system." That is often referred to as a fluent interface. In the NoSQL world, we have an issue, as the picture below shows. We have four different document NoSQL databases doing exactly the same thing, but with different APIs. Does it make sense to have a standard for these habitual behaviors? In this article, we'll cover how to do this manipulation with the Eclipse JNoSQL API.

Querying NoSQL Database Programmatically in Java

To manipulate any entity in all NoSQL types, there is a template interface. The template offers convenience operations to create, update, delete, and query for NoSQL databases and provides a mapping between your domain objects and JNoSQL. That looks like a template method for NoSQL databases; however, no inheritance is necessary. There are DocumentTemplate, ColumnTemplate, GraphTemplate, and KeyValueTemplate.

Thursday, June 07, 2018

Meet our Team of Solutions Architects from AWS Japan

Data Videos

Automatic Data Versioning in MariaDB Server 10.3

DZone Database Zone

MariaDB Server 10.3 comes with a new, very useful feature that will ease the design of many applications. Data versioning is important from several perspectives. Compliance might require that you store data changes. For analytical queries, you may want to look at data at a specific point in time, and for auditing purposes, it is important to know what changes were made and when. Also, in the case of a table being deleted, it can be of great value to recover it from history. MariaDB Server now includes a feature named System-Versioned Tables, which is based on the specification in the SQL:2011 standard. It provides automatic versioning of table data.

I’ll walk through the concept of system-versioned tables with a very simple example, which will show you what it is all about. Let’s start by creating a database and a table.

How to Create and Deploy a Deep Learning Project With AWS DeepLens

Data Videos

Wednesday, June 06, 2018

Manage and respond to your GDPR Data Subject requests within Office 365

Data Videos

The Future Isn't in Databases, but in the Data

DZone Database Zone

In the past year, you may have heard me mention my certificates from the Microsoft Professional Program. One certificate was in Data Science, the other in Big Data. I'm currently working on a third certificate, this one in Artificial Intelligence.

You might be wondering why a database guy would be spending so much time on data science, analytics, and AI. Well, I'll tell you.

Tuesday, June 05, 2018

Creating a Sturdy Backup System

DZone Database Zone

At Foreach, we own a Synology RS815+ to store all our backups. These backups come from different sources in our network such as routers, switches, database servers, web servers, application log files, mail servers, and so on.

The Synology NAS makes it really easy to configure file shares and quotas for these backups. However, it lacked a few features:

Monday, June 04, 2018

Kubernetes: The State of Stateful Apps

DZone Database Zone

Over the past year, Kubernetes — also known as K8s — has become a dominant topic of conversation in the infrastructure world. Given its pedigree of literally working at Google-scale, it makes sense that people want to bring that kind of power to their DevOps stories; container orchestration turns many tedious and complex tasks into something as simple as a declarative config file.

The rise of orchestration is predicated on a few things, though. First, organizations have moved toward breaking up monolithic applications into microservices. However, the resulting environments have hundreds (or thousands) of these services that need to be managed. Second, infrastructure has become cheap and disposable — if a machine fails, it's dramatically cheaper to replace it than triage the problems.

How to Wrangle Data for Machine Learning on AWS

Data Videos

Sunday, June 03, 2018

Using Read-through and Write-through in Distributed Cache

DZone Database Zone

With the explosion of extremely high transaction web apps, SOA, grid computing, and other server applications, data storage is unable to keep up. The reason is that data storage cannot keep adding more servers to scale out, unlike application architectures, which are extremely scalable.

In these situations, in-memory distributed cache offers an excellent solution to data storage bottlenecks. It spans multiple servers (called a cluster) to pool their memory together and keep all cache synchronized across servers, and it can keep growing this cache cluster endlessly, just like the application servers. This reduces pressure on data storage so that it is no longer a scalability bottleneck.
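
To make the two terms in the title concrete, here is a minimal single-process sketch. It is an illustration under stated assumptions, not any product's API: the class name is hypothetical, a plain dict stands in for the database, and nothing here is distributed or synchronized across servers. Read-through means the cache itself loads a missing key from the store; write-through means the cache writes to the store before updating its own copy.

```python
class ReadWriteThroughCache:
    """Hypothetical in-memory cache placed in front of a backing store."""

    def __init__(self, store):
        self._store = store   # a dict standing in for the database
        self._cache = {}
        self.store_reads = 0  # counts how often the store is actually hit

    def get(self, key):
        # Read-through: on a miss, the cache fetches from the store itself.
        if key not in self._cache:
            self.store_reads += 1
            self._cache[key] = self._store[key]
        return self._cache[key]

    def put(self, key, value):
        # Write-through: update the store first, then the cached copy.
        self._store[key] = value
        self._cache[key] = value


database = {"user:1": "Ada"}
cache = ReadWriteThroughCache(database)
cache.get("user:1")
cache.get("user:1")
print(cache.store_reads)  # 1
cache.put("user:2", "Linus")
print(database["user:2"])  # Linus
```

The application only ever talks to the cache, which is what takes the read pressure off the data store.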

Saturday, June 02, 2018

Data Warehouse-Friendly Database Design

DZone Database Zone

A data warehouse is a collection of data that facilitates the decision-making process. It is non-volatile, time-variant, and integrated. Database design, on the other hand, refers to the creation of a detailed data model of a database. The model includes logical and physical design choices along with physical storage parameters.

More and more technical experts are emphasizing the creation of a database design that is coherent with the data warehouse (data warehouse-friendly designs).

Friday, June 01, 2018

AWS Summit Benelux May 2018 Keynote

Data Videos

How to Decrypt Views in SQL Server

DZone Database Zone

“I am using SQL Server 2014. I have made views and I want to migrate those views from one server to another server. But, I need to decrypt my SQL Server 2014 view (encrypted view) as I want to modify the views as per my database requirement.”

Solution:

Sometimes we don’t want anyone to make changes to our views or don’t want anyone to make changes to our database object.

Daisy Chaining Dataflow in RavenDB

DZone Database Zone

I have talked before about RavenDB’s MapReduce indexes and their ability to output results to a collection, as well as RavenDB’s ETL processes and how we can use them to push some data to another database (a RavenDB database or a relational one).

Bringing these two features together can be surprisingly useful when you start talking about global distributed processing. A concrete example might make this easier to understand.

Tim Bray and Friends | Messaging for Real-Time User Engagement | Guest: Georgie Matthews

Data Videos
