Thursday, May 31, 2018

Hands-On: MariaDB ColumnStore Spark Connector

DZone Database Zone

In February, with the release of MariaDB ColumnStore 1.1.3, we introduced a new Apache Spark connector (Beta) that exports data from Spark into MariaDB ColumnStore. The Spark connector is available as part of our MariaDB AX analytics solution and complements our suite of rapid-paced data ingestion tools such as a Kafka data adapter and MaxScale CDC data adapter. The connector empowers users to directly export machine learning results stored in Spark DataFrames to ColumnStore for high performance analytics. Internally, it utilizes ColumnStore's Bulk Data Adapters to inject data directly into MariaDB ColumnStore's WriteEngine.

In this blog, we'll explain how to export the results of a simple machine learning pipeline on the classification example of the well-known MNIST handwritten digits dataset. Feel free to start your own copy of our lab environment by typing:

Amazon Sumerian | Ep 5: Creating an Interactive Digital Signage Experience

Data Videos

Wednesday, May 30, 2018

Databases on Kubernetes: How to Recover from Failures and Scale Up and Down in a Few Line Commands

DZone Database Zone

A month ago, Kubernetes launched a beta for Local Persistent Volumes. In summary, it means that if a Pod using a local disk gets killed, no data will be lost (let's ignore edge cases here). The secret is that a new Pod will be rescheduled to run on the same node, leveraging the disk which already exists there.

Of course, its downside is that we are tying our Pod to a specific node, but if we consider the time and effort spent on loading a copy of the data somewhere else, being able to leverage the same disk becomes a big advantage.
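The pinning works through a node-affinity rule on the PersistentVolume itself. A minimal sketch of such a local volume follows; names, sizes, and paths are illustrative, and the exact schema should be checked against your Kubernetes version:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-example
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1          # the disk that already exists on the node
  nodeAffinity:                    # ties the volume (and thus the Pod) to one node
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - my-node-1
```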

Amazon Sumerian - Getting Started 01: User Interface Overview

Data Videos

Tuesday, May 29, 2018

RDS Vs. MySQL on EC2 [Comic]

DZone Database Zone
Is It Really About Cost Every Time?


Which privilege do you want to have:

Saama: Clinical Trials with Saama's Life Science Analytics Cloud

Data Videos

3 Day Coding Challenge: Creating MySQL Admin. for ASP.NET

DZone Database Zone

I have just finished my "3-day coding challenge" and released a new version of Phosphorus Five, containing a stable, extremely secure, and unbelievably lightweight MySQL admin module that allows you to do most of the important tasks you'd normally need MySQL Workbench or phpMyAdmin to perform. It features the following:

Automatic Syntax Highlighting of your SQL.

Monday, May 28, 2018

How Careful Engineering Led to Processing Over a Trillion Rows Per Second

DZone Database Zone
SELECT stock_symbol, count(*) as c FROM trade GROUP BY stock_symbol ORDER BY c desc LIMIT 10;

On March 13, we published a demonstration of the performance of MemSQL in the context of ad hoc analytical queries. Specifically, we showed that the query above can process 1,280,625,752,550 rows per second on a MemSQL cluster containing 448 Intel Skylake cores clocked at 2.5GHz. In this blog post, we drill down into how this was made possible by carefully designing code and exploiting distributed execution as well as instruction-level and data-level parallelism.

Why is such a high throughput needed? Users of applications expect a response time of less than a quarter of a second. Higher throughput means more data can be processed within that time frame.
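The aggregation in the query shown above (group by symbol, count, order by count descending, take the top 10) is easy to mimic in a few lines of Python, which makes the scale of the benchmark easier to appreciate; the tiny trade list here is, of course, made up:

```python
from collections import Counter

# Toy stand-in for the `trade` table: one element per trade row.
trades = ["AAPL", "MSFT", "AAPL", "GOOG", "AAPL", "MSFT"]

# GROUP BY stock_symbol / count(*) AS c / ORDER BY c DESC / LIMIT 10
top10 = Counter(trades).most_common(10)
print(top10)  # [('AAPL', 3), ('MSFT', 2), ('GOOG', 1)]
```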

Sunday, May 27, 2018

Streaming Data From MariaDB Server Into MariaDB ColumnStore via MariaDB MaxScale

DZone Database Zone

In this blog post, we look at how to configure Change Data Capture (CDC) from the MariaDB Server to MariaDB ColumnStore via MariaDB MaxScale. Our goal in this blog post is to have our analytical ColumnStore instance reflect the changes that happen on our operational MariaDB Server.

MariaDB MaxScale Configuration

We start by creating a MaxScale configuration with binlogrouter and avrorouter instances. The former acts as a replication slave and fetches binary logs; the latter processes those binary logs into CDC records.
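A MaxScale configuration along those lines might look like the sketch below. Section names are arbitrary, and the router parameters shown are illustrative; parameter names vary between MaxScale versions, so check the MaxScale CDC tutorial for your release:

```ini
# Acts as a replication slave and fetches binary logs from the MariaDB Server
[replication-service]
type=service
router=binlogrouter
user=maxuser
password=maxpwd
router_options=server-id=4000,binlogdir=/var/lib/maxscale,mariadb10-compatibility=1

# Processes the fetched binary logs into Avro-format CDC records
[avro-service]
type=service
router=avrorouter
source=replication-service

# CDC clients connect here to stream the records
[avro-listener]
type=listener
service=avro-service
protocol=CDC
port=4001
```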

Saturday, May 26, 2018

Create Inline CRUD Using jQuery and AJAX

DZone Database Zone

Create, read, update, and delete (CRUD) are the four actions that make up a significant part of most PHP projects. By the time developers reach the mid-level, they have created dozens of CRUD grids. In many cases, CRUD operations are an important part of CMS, inventory, and accounts management systems.

The idea behind the CRUD operations is to empower the users so that they could use the app to the maximum. All the information generated or modified through CRUD operations is stored in a database (generally MySQL).
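The four operations map one-to-one onto SQL statements. A self-contained sketch of the full cycle, using Python's built-in sqlite3 module as a stand-in for MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

cur.execute("INSERT INTO users (name) VALUES (?)", ("Alice",))     # Create
row = cur.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()  # Read
print(row)  # ('Alice',)
cur.execute("UPDATE users SET name = ? WHERE id = ?", ("Bob", 1))  # Update
cur.execute("DELETE FROM users WHERE id = ?", (1,))                # Delete
remaining = cur.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(remaining)  # 0
conn.close()
```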

How a Rideshare Giant Uses AI to Detect Business Anomalies

Data Videos

Mssql-cli Command-Line Query Tool

DZone Database Zone

A recent announcement on the release of several SQL Server tools has raised expectations across various groups. Product requirements and business needs are almost always a trade-off, and striking the right balance in a product's toolset is a sign of a successful product. After testing SQL Operations Studio, I feel that it's a promising tool for many developers, administrators, and DevOps specialists. In my opinion, the mssql-cli tool adds another feature to SQL Server that helps make it a leading database product.

Microsoft announced mssql-cli, a user-friendly, interactive, cross-platform command-line query tool for SQL Server, hosted by the dbcli community on GitHub. The public preview release of mssql-cli is available for testing. Mssql-cli is based on Python and on command-line interface projects such as pgcli and mycli. Microsoft released this tool under the open-source BSD 3 license, and its source code can be found on GitHub. The tool is officially supported on Windows, Linux, and macOS, and is compatible with Python versions 2.7, 3.4, and higher.

Friday, May 25, 2018

The best cloud for your Windows Server workloads

Data Videos

RedisConf18 in Review

DZone Database Zone

Over 1,200 Redis enthusiasts took over Pier 27 on the San Francisco waterfront for three days of training, talks, and fun at RedisConf18. The theme of this year's conference was "Everywhere" and with over 60 breakout sessions across six concurrent tracks, Redis really was everywhere.

This year, RedisConf moved to the beautiful Pier 27 with panoramic views of the San Francisco Bay, both bridges and several iconic San Francisco landmarks. Pier 27 is the San Francisco Cruise Terminal originally built as a staging site for the 2013 America's Cup. The pier serves as a cruise terminal and a conference venue during the off-season.

Thursday, May 24, 2018

The Hidden Costs of Half a Database

DZone Database Zone

It costs plenty to install, secure, and implement a good database solution. What balloons the costs is when you need additional plugins to make the database meet your business needs. Even if the database itself is within your budget, you have to factor in buying new hardware, diverting additional developers who know the plugin technology, and of course, paying the database provider to provide “support” in integrating all these separate parts.

On top of all this mess, these plugins add additional operational layers to your data, killing your performance.

Fender: Teaching How to Play a Guitar with Serverless Technologies

Data Videos

Wednesday, May 23, 2018

AWS Stockholm Summit May 2018 Keynote

Data Videos

Mule 4: Database Connector (Part 2)

DZone Database Zone

Dynamic Queries:

To guard against SQL injection, we need to parameterize the "where" clause in our SQL query. But what do we do when we need to parameterize not only the "where" clause but other parts of the query as well? In Mule 3, the DB connector forced us to choose from a drop-down whether a query was dynamic or parameterized; we couldn't have both. In the Mule 4 DB connector, we can use a parameterized "where" clause and dynamic parts of the query simultaneously. In this example, you can see how a full expression produces the query by building a string in which the table name depends on a variable. An important thing to notice is that although the query text is dynamic, it still uses input parameters:
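The same combination, dynamic query text plus bound input parameters, can be sketched outside Mule as well. In this hypothetical Python/sqlite3 example, the table name is chosen dynamically (validated against a whitelist, since identifiers cannot be bound), while the user-supplied value is always passed as a bound parameter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table in ("products", "products_archive"):
    conn.execute(f"CREATE TABLE {table} (id INTEGER, price REAL)")
conn.execute("INSERT INTO products VALUES (1, 9.5)")

def find_cheaper_than(table, max_price):
    if table not in ("products", "products_archive"):   # dynamic part, whitelisted
        raise ValueError("unknown table")
    sql = f"SELECT id, price FROM {table} WHERE price < ?"  # dynamic text...
    return conn.execute(sql, (max_price,)).fetchall()       # ...with a bound value

rows = find_cheaper_than("products", 10.0)
print(rows)  # [(1, 9.5)]
```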

How to do Storage Replica within the same region in Azure (Pt. 2)

Data Videos

Database Connectivity and Transaction in Python

DZone Database Zone

It is very easy to establish a connection to a database and execute various DML and PL/SQL statements in Python. Here, I am going to explain two different modules through which we can connect to different databases. The first module is "cx_Oracle" for Oracle Database, and the second is the "pyodbc" module, used to connect to MS SQL Server, Sybase, MySQL, etc.

So, my first example is with "cx_Oracle." I am not going to describe this module in detail; my focus will mainly be on how to connect to the database and execute SQL statements against it. For detailed documentation, please refer to https://cx-oracle.readthedocs.io/en/latest/.
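cx_Oracle follows the Python DB-API specification (PEP 249), so the connect/cursor/execute/commit pattern carries over between drivers. The runnable sketch below uses the standard library's sqlite3 module (also a DB-API module) in place of a live Oracle instance; with cx_Oracle the main difference would be the connect call, e.g. `cx_Oracle.connect(user, password, dsn)`:

```python
import sqlite3  # with cx_Oracle, the equivalent would be: import cx_Oracle

conn = sqlite3.connect(":memory:")  # cx_Oracle.connect(user, password, dsn)
cur = conn.cursor()
cur.execute("CREATE TABLE emp (id INTEGER, name TEXT)")
cur.execute("INSERT INTO emp VALUES (?, ?)", (1, "Scott"))
conn.commit()  # DML becomes permanent only after commit

name = cur.execute("SELECT name FROM emp WHERE id = ?", (1,)).fetchone()
print(name)  # ('Scott',)
conn.close()
```

Note that cx_Oracle uses `:name`-style bind variables rather than `?`, but the overall flow is identical.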

Tuesday, May 22, 2018

Automatic Provisioning of Developer Databases with SQL Provision

DZone Database Zone

The GDPR and other regulations require that we be careful in how we handle sensitive data. One of the easiest ways to avoid a data breach incident, and any accompanying fine, is to limit the sensitive data your organization collects and then restrict the "exposure" of that data, within your organization. Many high-profile incidents in the last few years have been caused by sensitive data leaking out of database copies held on test and development servers, which are typically less well protected than the production servers.

If you want to avoid being mentioned in the news for lax security, then a good start is to ensure you keep PII and other sensitive data away from any less secure environments. One way the GDPR recommends we do this is by pseudonymizing or anonymizing sensitive data before it enters these insecure systems.

How to re:Invent | Episode 1: re:Invent 2018 - What's New? (AWS Online Tech Talks)

Data Videos

Monday, May 21, 2018

AWS Knowledge Center Video: How do I install PHP 5.6 and Apache in RHEL 7.2?

Data Videos

KSQL Deep Dive — The Open Source Streaming SQL Engine for Apache Kafka

DZone Database Zone

I had a workshop at Kafka Meetup Tel Aviv in May 2018: "KSQL Deep Dive — The Open Source Streaming Engine for Apache Kafka".

Here is the agenda, the slides, and the video recording.

Sunday, May 20, 2018

Finding Code Smells Using SQL Prompt

DZone Database Zone

Using TOP in a SELECT statement without a subsequent ORDER BY clause is legal in SQL Server, but meaningless, because asking for the TOP 10 rows implies that the data is guaranteed to be in a certain order, and tables have no implicit logical order. You must specify the order.

In a SELECT statement, you should always use an ORDER BY clause with the TOP clause, to specify which rows are affected by the TOP filter. If you need to implement a paging solution in an application widget, to send chunks or “pages” of data to the client so a user can scroll through data, it is better and easier to use the OFFSET-FETCH subclause of the ORDER BY clause, instead of the TOP clause.
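As a runnable illustration of the paging pattern, SQLite's LIMIT/OFFSET plays the same role as T-SQL's OFFSET-FETCH; in T-SQL the equivalent clause would read `ORDER BY id OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)", [(i,) for i in range(1, 101)])

page, page_size = 3, 10  # fetch the third page of ten rows
rows = conn.execute(
    "SELECT id FROM items ORDER BY id LIMIT ? OFFSET ?",
    (page_size, (page - 1) * page_size),
).fetchall()
print(rows[0], rows[-1])  # (21,) (30,)
```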

Saturday, May 19, 2018

Tim Bray and Friends | Messaging Fanout for Parallel Processing | Guest Expert: Sam Dengler

Data Videos

Top 5 SQL and Database Courses to Learn Online — Best of the Lot

DZone Database Zone

Hello guys, if you are a computer science graduate or new to the programming world and are interested in learning SQL, and you're looking for some awesome resources — e.g. books, courses, and tutorials — to start with, then you have come to the right place. In the past, I have shared some of the best SQL books and tutorials, and today, I am going to share some of the best SQL and database courses so you can master this useful technology. If you don't know what SQL is and why you should learn it, let me give you a brief overview for everyone's benefit. SQL is a programming language for working with databases. You can use SQL to create database objects — e.g. tables, stored procedures, etc. — and also to store and retrieve data from the database.

SQL is one of the most important skills for any programmer, irrespective of technology, framework, and domain. It is even more popular than mainstream programming languages like Java and Python, and it definitely adds a lot of value to your CV.

Friday, May 18, 2018

Capped Collection in MongoDB

DZone Database Zone

What Is a Capped Collection?

As the name suggests, a collection created with a cap (a limit on its size and the number of documents it holds) is called a capped collection.
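In the MongoDB shell, such a collection is created with, for example, `db.createCollection("log", {capped: true, size: 4096, max: 1000})`. Conceptually it behaves like a fixed-size ring buffer: once the cap is reached, the oldest documents are overwritten by new inserts. Python's `collections.deque` with a `maxlen` shows the same eviction behavior in miniature:

```python
from collections import deque

capped = deque(maxlen=3)  # a "cap" of three documents
for doc in ({"n": 1}, {"n": 2}, {"n": 3}, {"n": 4}):
    capped.append(doc)    # the fourth insert evicts the oldest document

print(list(capped))  # [{'n': 2}, {'n': 3}, {'n': 4}]
```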

Thursday, May 17, 2018

Redis Enterprise Service on Kops Managed Kubernetes Cluster

DZone Database Zone

This tutorial will show you how to easily set up a Kubernetes cluster on a public cloud using a tool called “Kops.” This post is a complement to our Kubernetes webinar, in which we explained the basic Kubernetes primitives and our previous blog posts about Redis Enterprise Service and local Kubernetes development. For this tutorial, we will use the latest publicly available container image of Redis Enterprise Software. You can read about the high performance, in-memory Redis Enterprise 5.0.2 software release here.

What is Kops?

AWS Summit Milano 2018 | Keynote [Italian]

Data Videos

Wednesday, May 16, 2018

Masking Data in Practice — Part 1

DZone Database Zone

Even small extracts of data need to be created with caution if they are for public consumption. Sensitive data can 'hide' in unexpected places, and apparently innocuous data can be combined with other information to expose information about identifiable individuals. If we need to deliver an entire database in obfuscated form, the problems can get harder. Phil Factor examines some of the basic data masking techniques and the challenges inherent in masking certain types of sensitive and personal data while ensuring it still looks like the real data and preserving its referential integrity and distribution characteristics.

This article describes the practicalities of data masking, the various methods we can use, and the potential pitfalls. In subsequent articles, I'll demonstrate how we can mask or sanitize different types of data using tools such as SQL Clone, Data Masker for SQL Server, and SQL Data Generator.

What’s It like to Work at AWS as a Technical Account Manager?

Data Videos

RavenDB 4.1 Features: JavaScript Indexes

DZone Database Zone

Note: This feature is an experimental one. It will be included in 4.1, but it will be behind an experimental feature flag. It is possible that this will change before full inclusion in the product.

RavenDB now supports multiple operating systems, and we spend a lot of effort to bring RavenDB client APIs to more platforms. C#, JVM, and Python are already done and Go, Node.JS, and Ruby are in various beta stages. One of the things that this brought up was our indexing structure. Right now, if you want to define a custom index in RavenDB, you use C# Linq syntax to do so. When RavenDB was primarily focused on .NET, that was a perfectly fine decision. However, as we are pushing for more platforms, we wanted to avoid forcing users to learn the C# syntax when they create indexes.

Build on Serverless | Process Tweets Like a Pro

Data Videos

Tuesday, May 15, 2018

Nordstrom: Event-Sourced Serverless Architectures

Data Videos

Tarantool Queues (Part 3): The Art of Queue Parsing

DZone Database Zone

In our previous article, we used the tarantool-authman module to implement an authentication server. Tarantool-authman is quite good and has almost all of our desired functionality, but in complex distributed systems (including microservice ones), many subsystems must work asynchronously. So, if we implement a system with multiple services and a single authentication tool, the latter can become a bottleneck and will slow down the entire operation. Fortunately, queues provide a solution to our problem.

Our goal in this article is to achieve an implementation of an authentication server, i.e. a Tarantool-based instance, with the tarantool-authman module. To help this server cope with an inevitable flood of web-service requests, we’ll add a queue server made with Tarantool Queue.

Monday, May 14, 2018

What Is a Serverless Database? (Overview of Providers, Pros, and Cons)

DZone Database Zone
Get To Know Serverless Architecture

Serverless computing is a cloud computing execution model in which the cloud provider dynamically manages the allocation of compute resources; what consumes those resources is function execution. Both AWS and Azure charge based on a combination of allocated memory and function execution time, with the elapsed time rounded up to the nearest 100 ms. AWS Lambda's current pricing is $0.00001667 for every used GB-second, while Azure Functions cost $0.000016 for each GB-second. That gives you an idea of how fast the cost can climb. Since the amount of allocated memory is configurable between 128 MB and 1.5 GB, the price of a function execution varies with that setting. The cost per 100 ms of execution time at the most powerful configuration will be around 12 times that of the basic 128 MB option.
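A quick back-of-the-envelope check of those numbers, using the AWS rate quoted above; the helper function is hypothetical, and billed duration is rounded up to the nearest 100 ms:

```python
import math

AWS_RATE = 0.00001667  # USD per GB-second, as quoted above

def lambda_compute_cost(memory_mb, duration_ms, invocations):
    billed_ms = math.ceil(duration_ms / 100) * 100          # round up to 100 ms
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000) * invocations
    return gb_seconds * AWS_RATE

# One million 120 ms invocations at 128 MB versus 1536 MB:
small = lambda_compute_cost(128, 120, 1_000_000)
large = lambda_compute_cost(1536, 120, 1_000_000)
print(round(small, 2), round(large, 2), round(large / small))  # 0.42 5.0 12
```

The 12x ratio between the largest and smallest memory settings matches the figure in the text.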

Serverless computing still requires servers, and that's where serverless databases come in. Knowing your needs will undoubtedly make it easier to choose the right database service and to start using the most advanced technological solutions of today.

AWS for Automotive - Cloud Connected Vehicles and Applications

Data Videos

Sunday, May 13, 2018

AWS KC Video: Why am I being charged for EC2 when all my instances have been terminated?

Data Videos

You Shouldn’t Manage Your Business Processes With Excel

DZone Database Zone
Why Do Employees in Companies Repeatedly Use Microsoft Excel or PowerPoint to Manage Business Data and Tasks?

Companies today have a variety of business applications in use: some for legacy reasons, some for strategic reasons, and some just because they are hip and fun. Most of the time, these applications are not enough to cover all aspects of daily working life. A lot of tasks and data that need to be processed on a daily basis are instead managed with Excel and PowerPoint. This is not wrong in general, and it gives employees a way to prepare business figures in tables or display them clearly in a presentation. But these applications are often abused for a kind of data management; Excel is no database, and PowerPoint is not a messaging tool. The problem is that these tools end up collecting data that is needed to implement certain business processes and for which there is no suitable application available. In a short time, these solutions become indispensable and are passed down from generation to generation.

What is the Difference Between Business Data and a Business Process?

Now the question is: what is wrong with this situation? The problem comes from the field of computer science and data processing. We often try to describe our environment in a simple way: we combine the properties of our environment and compare them with each other so we can react faster. A spreadsheet or database provides a seemingly simple way to do this, but it can only record and document business data. The data is in fact only the basis for achieving a specific business goal, and those goals are represented by business processes within a company. When looking at a business process, we want to see where data came from, how it was processed, and what needs to happen next in order to achieve the business goal.

Saturday, May 12, 2018

Up-to-Date Cache With EclipseLink and Oracle

DZone Database Zone

One of the most useful features provided by ORM libraries is a second-level cache, usually called L2. An L2 object cache reduces database access for entities and their relationships. It is enabled by default in the most popular JPA implementations like Hibernate or EclipseLink. That won't be a problem unless a table inside the database is modified directly by third-party applications, or by another instance of the same application in a clustered environment. One of the available solutions to this problem is an in-memory data grid, which stores all data in memory, distributed across many nodes inside a cluster. Tools like Hazelcast and Apache Ignite have been described several times on my blog; if you are interested in one of them, I recommend reading one of my previous articles about it: Hazelcast Hot Cache with Striim.

However, we won't discuss that in this article. Today, I would like to talk about the Continuous Query Notification feature provided by Oracle Database. It solves the problem of updating or invalidating a cache when data changes in the database. Oracle JDBC drivers have provided support for it since 11g Release 1. The functionality is based on receiving invalidation events from the JDBC drivers. Fortunately, EclipseLink extends this feature in its solution, EclipseLink Database Change Notification. In this article, I'm going to show you how to implement it using Spring Data JPA together with the EclipseLink library.

AWS Summit London 2018: AWS Partners to Help with Your Machine Learning Needs

Data Videos

Friday, May 11, 2018

The State of SQL Server Monitoring 2018

DZone Database Zone

Over 600 technology professionals who work in organizations that use SQL Server recently responded to our survey to discover the current state of SQL Server monitoring.

We asked people across a range of sectors, in organizations of every size around the globe, about how they monitor SQL Server, the technologies they work with, and what they thought the biggest challenges were for them and their estates over the next 12 months.

AWS Summit London 2018: Amazon Aurora Backtrack

Data Videos

Thursday, May 10, 2018

Learn how Adobe runs its vast open-source application portfolio in Azure

Data Videos

Using DATABASEPROPERTYEX to Find the Last Good DBCC CHECKDB Time

DZone Database Zone

For decades, a pain point for SQL Server administrators has been determining when the last known DBCC CHECKDB was run against a database. Microsoft has not exposed this information in an easily digestible format. You can find a handful of options available online for returning this information. My favorite was this post by Rob Sewell.

It was my favorite.

Next Games Powers Global Augmented Reality Game in Azure

Data Videos

How Binary Logs Affect MySQL 8.0 Performance

DZone Database Zone

As part of my benchmarks of binary logs, I've decided to check how the performance of the recently released MySQL 8.0 is affected in similar scenarios, especially as binary logs are enabled by default. It is also interesting to check how MySQL 8.0 performs against the claimed performance improvements in the redo log subsystem.

I will use a similar setup as in my last article with MySQL 8.0 using the utf8mb4 charset.

Wednesday, May 09, 2018

Getting Started With MongoDB (Part 2)

DZone Database Zone

Hello everyone! In my previous article, I explained what MongoDB is and why you should use it. In this article, I will try to explain CRUD operations in MongoDB using MongoShell.

To check available databases:

Tuesday, May 08, 2018

Meet Tadao Nagasaki – Managing Director & President of Amazon Web Services Japan

Data Videos

When Simple Parameterization... Isn't

DZone Database Zone

I'm desperately working to finish up a new version of my book on Execution Plans. We're close, so close. However, you do hit snags. Here's one. My editor decided to change one of my queries. I used a local variable so that I got one set of behaviors. He used a hard-coded value to get a different set. However, the really interesting thing was that his query, at least according to the execution plan, went to simple parameterization. Or did it?

Simple Parameterization

The core concept of simple parameterization is easy enough to understand. You have a trivial query using a hard-coded value like this:
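The excerpt is cut off before the query itself, but the two forms at issue can be illustrated with a hypothetical sketch (Python's sqlite3 here; in SQL Server, simple parameterization is the optimizer rewriting the first, hard-coded form into something like the second so the cached plan can be reused for other values):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, name TEXT)")
conn.execute("INSERT INTO person VALUES (42, 'Grant')")

# Hard-coded literal: the value is part of the query text itself.
hard_coded = conn.execute("SELECT name FROM person WHERE id = 42").fetchone()

# Explicitly parameterized: one cached plan can serve any value of id.
parameterized = conn.execute(
    "SELECT name FROM person WHERE id = ?", (42,)
).fetchone()

print(hard_coded, parameterized)  # ('Grant',) ('Grant',)
```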

Monday, May 07, 2018

An Implementation of Phase-Fair Reader/Writer Locks

DZone Database Zone

We were in search of a C++ read-write lock implementation that allows a thread to acquire a lock and then optionally pass it on to another thread. The C++11 and C++14 standard library lock implementations std::mutex and std::shared_mutex do not allow that (it would be undefined behavior; incidentally, it's also undefined behavior when doing this with the pthreads library).

Additionally, we were looking for locks that would prefer neither readers nor writers, so that there will be neither reader starvation nor writer starvation. And then, we wanted concurrently queued read and write requests that compete for the lock to be brought into some defined execution order. Ideally, queued operations that cannot instantly acquire the lock should be processed in approximately the same order in which they were queued.
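As a rough illustration of the bookkeeping involved (in Python rather than C++, and writer-preferring rather than truly phase-fair; a genuine phase-fair lock additionally bounds how long incoming readers can be held back to a single writer phase), note that the lock state lives in the lock object rather than being owned by any thread, so one thread can acquire it and another can release it, which is the hand-off property described above:

```python
import threading

class WriterPreferringRWLock:
    """Sketch of a reader/writer lock whose state is not tied to the
    acquiring thread, so ownership can be handed off between threads."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_readers = 0
        self._waiting_writers = 0
        self._writer_active = False

    def acquire_read(self):
        with self._cond:
            # New readers wait while a writer is active *or* waiting,
            # which prevents writer starvation.
            while self._writer_active or self._waiting_writers:
                self._cond.wait()
            self._active_readers += 1

    def release_read(self):
        with self._cond:
            self._active_readers -= 1
            if self._active_readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            self._waiting_writers += 1
            while self._writer_active or self._active_readers:
                self._cond.wait()
            self._waiting_writers -= 1
            self._writer_active = True

    def release_write(self):
        with self._cond:
            self._writer_active = False
            self._cond.notify_all()
```

A true phase-fair lock would additionally queue waiters in arrival order and release them in alternating read/write phases; std::shared_mutex-style locks leave that ordering unspecified.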

The Allscripts Prescription for Agility: Lift and Shift to the Cloud

Data Videos

Effective Database Testing With SQL Test and SQL Cover

DZone Database Zone

A well-established technique for improving application code quality during software development is to run unit tests in conjunction with a code coverage tool. The aim is not only to test that your software components behave as you would expect but also that your suite of tests gives your code a thorough workout.

Errors encountered within the most common routes through your logic will usually reveal themselves during the development process, long before they ever reach deployment. It's in the darker corners that bugs are more likely to live and thrive: within unusual code paths that are triggered by specific inputs the code can't handle, but which should (in theory, at least) never arise in everyday use. Code coverage gives the developers a measure of how effectively they're delving into these areas.

Sunday, May 06, 2018

Open-Sourcing Code Is a BAD Default Policy

DZone Database Zone

I ran into this Medium post that asks: why is this code open-sourced? Let's flip the question. The premise of the post is interesting, given that the author argues that the default mode for code should be open source. I find myself in the strange position of being a strong open-source adherent who very strongly disagrees with pretty much every point in this article. Please sit tight; this may take a while. This article really annoyed me.

Just to clear the fields, I have been working on open-source software for the past 15 years. The flagship product that we make is open-source and available on GitHub and we practice a very open development process. I was also very active in a number of high-profile open-source projects for many years and had quite a few open-source projects that I had built and released on my own. I feel that I'm quite qualified to talk from experience on this subject.

Saturday, May 05, 2018

Self-Contained Deployments and Embedded RavenDB

DZone Database Zone
Self-Contained Deployments and Embedded RavenDB

In previous versions of RavenDB, we offered a way to run RavenDB inside your process. Basically, you reference a NuGet package and are able to create a RavenDB instance that runs in your own process. That can simplify deployment concerns immensely and we have a bunch of customers who rely on this feature to just take their database engine with their application.

In 4.0, we don’t provide this ability out of the box. It didn’t make the cut for the release, even though we consider it a very important utility feature. We are now adding it in for the next release, but in a somewhat different form.

Friday, May 04, 2018

Verizon: Managed Virtual Network Services

Data Videos
#Data -Verizon: Managed Virtual Network Services

Matching Modern Databases With ML and AI

DZone Database Zone
Matching Modern Databases With ML and AI

Machine learning (ML) and artificial intelligence (AI) have stirred the technology sector into a flurry of activity over the past couple of years.

However, it is important to remember that it all comes back to data. As Hilary Mason, a prominent data scientist, noted in Harvard Business Review,

Generate Dapper Queries On-The-Fly With C#

DZone Database Zone
Generate Dapper Queries On-The-Fly With C#

ORMs are very common when developing with .NET. According to Wikipedia:

Object-relational mapping (ORM, O/RM, and O/R mapping tool) in computer science is a programming technique for converting data between incompatible type systems using object-oriented programming languages. This creates, in effect, a "virtual object database" that can be used from within the programming language. There are both free and commercial packages available that perform object-relational mapping, although some programmers opt to construct their own ORM tools.

Dapper is a simple object mapper for .NET: lightweight and fast, with performance as its headline feature. According to their website:
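Dapper itself is a C# library, but the core trick it performs (materializing query rows into typed objects) is easy to sketch. Below is a minimal, hypothetical analogue using Python's standard-library `sqlite3` module; none of these names come from Dapper, and a real mapper adds caching, parameter binding helpers, and much more:

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class User:
    id: int
    name: str

def query(conn, cls, sql, params=()):
    # Run the query and map each result row onto the given class, Dapper-style.
    conn.row_factory = sqlite3.Row
    rows = conn.execute(sql, params).fetchall()
    return [cls(**dict(row)) for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada'), (2, 'Grace')")

users = query(conn, User, "SELECT id, name FROM users WHERE id = ?", (1,))
print(users)  # [User(id=1, name='Ada')]
```

The whole "ORM-lite" idea is in that one comprehension: column names from the row become constructor arguments, so the caller works with typed objects instead of raw tuples.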

Creating AR/VR Experiences | Ep 1: Intro to Amazon Sumerian

Data Videos
#Data -Creating AR/VR Experiences | Ep 1: Intro to Amazon Sumerian

Thursday, May 03, 2018

Automating Automatic Indexing in Azure SQL Database

DZone Database Zone
Automating Automatic Indexing in Azure SQL Database

I’ve been in love with the concept of a database-as-a-service ever since I first laid eyes on Azure SQL Database. It just makes sense to me. Take away the mechanics of server management and database management. Focus on the guts of your database. Backups, consistency checks — these easily automated aspects can just be taken care of. The same thing goes with some, not all, but some, index management. Azure SQL Database can manage your indexes for you. I call it weaponizing Query Store.

Anyway, I needed a way to automate this for the book I’m writing. I couldn’t find any good examples online, so I built my own.
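The switch being automated here is itself a documented T-SQL option. As a config fragment (verify the option names against the current Azure SQL Database documentation before relying on it):

```sql
-- Enable Azure SQL Database's automatic index management for the current database.
ALTER DATABASE CURRENT
SET AUTOMATIC_TUNING (CREATE_INDEX = ON, DROP_INDEX = ON);

-- Inspect what automatic tuning has been asked to do, and what it is actually doing.
SELECT name, desired_state_desc, actual_state_desc
FROM sys.database_automatic_tuning_options;
```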

How to Enhance Your Application Security Strategy with F5 on AWS

Data Videos
#Data -How to Enhance Your Application Security Strategy with F5 on AWS

Wednesday, May 02, 2018

How Inovalon Uses Sophos to Control Security Costs on AWS

Data Videos
#Data -How Inovalon Uses Sophos to Control Security Costs on AWS

9 DevOps KPIs to Optimize Your Database

DZone Database Zone
9 DevOps KPIs to Optimize Your Database

DevOps people like things to be automated, testable, and disaster-proof — and stateless services can achieve those features quite easily.

After all, you only need to recreate the instances from the images, and you're good to go. In a best-case scenario, this can take as little time as a few seconds.

Tuesday, May 01, 2018

Amazon Redshift Spectrum: Diving Into the Data Lake!

DZone Database Zone
Amazon Redshift Spectrum: Diving Into the Data Lake!

Amazon’s Simple Storage Service (S3) has been around since 2006. Enterprises have been pumping their data into this data lake at a furious rate. Within ten years of its birth, S3 stored over two trillion objects, each up to five terabytes in size. Enterprises know their data is valuable and worth preserving. But much of this data lies inert, in “cold” data lakes, unavailable for analysis, so-called “dark data.”

The Dark Data Problem. Source: Amazon AWS.

Security at scale with Azure Advanced Threat Protection

Data Videos
#Data -Security at scale with Azure Advanced Threat Protection

Binlog and Replication Improvements in Percona Server for MySQL

DZone Database Zone
Binlog and Replication Improvements in Percona Server for MySQL

Thanks to continuous development, Percona Server for MySQL incorporates a number of improvements related to binary log handling and replication. As a result, its replication behavior differs in some specifics from MySQL Server's.

Temporary Tables and Mixed Logging Format

Summary of the Fix

As soon as a statement involving temporary tables is encountered when using the mixed binlog format, MySQL switches to row-based logging for all statements until the end of the session (or until all temporary tables used in the session are dropped). This is inconvenient with long-lasting connections, including replication-related ones. Percona Server for MySQL fixes the situation by switching between statement-based and row-based logging when necessary.
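The upstream behavior described above is easy to reproduce with standard MySQL statements (the table name below is just an illustration):

```sql
-- With mixed-format logging, upstream MySQL pins the session to row-based
-- logging as soon as a temporary table appears:
SET SESSION binlog_format = 'MIXED';
CREATE TEMPORARY TABLE tmp_orders (id INT PRIMARY KEY);

-- From here until the session ends (or every temporary table is dropped),
-- upstream MySQL keeps logging in row format; Percona Server instead switches
-- back to statement-based logging whenever it is safe to do so.
DROP TEMPORARY TABLE tmp_orders;
```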

What's It like Working in an AWS Data Center? Meet Our Japan Team in Tokyo

Data Videos
#Data -What's It like Working in an AWS Data Center? Meet Our Japan Team in Tokyo

Fun With SQL: Functions in Postgres

DZone Database Zone
Fun With SQL: Functions in Postgres

In our previous Fun with SQL post on the Citus Data blog, we covered w...