Saturday, September 30, 2017

What's New with AWS - Week of Sep 29, 2017

Data Videos
#Data -What's New with AWS - Week of Sep 29, 2017

Azure Functions and Lazy Initialization With Couchbase Server

DZone Database Zone
Azure Functions and Lazy Initialization With Couchbase Server
Azure Functions and Lazy Initialization With Couchbase Server

Azure Functions are still new to me, and I’m learning as I’m going. I blogged about my foray into Azure Functions with Couchbase over a month ago. Right after I posted that, I got some helpful feedback about the way I was instantiating a Couchbase cluster (and bucket).

I had (wrongly) assumed that there was no way to save state between Azure Function calls. This is why I created a GetCluster() method that was called each time the function ran. But initializing a Couchbase Cluster object is an expensive operation. The less often you instantiate it, the better.
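
The fix, in short, is to initialize the connection lazily and reuse it across invocations. The original post does this in C#; the sketch below shows the same idea in Python, with a hypothetical connect_cluster() helper standing in for the actual Couchbase SDK calls, so the expensive setup happens only once per warm function host.

_bucket = None  # module-level state survives across warm invocations

def connect_cluster():
    # Hypothetical stand-in for the expensive Cluster()/open_bucket() work.
    print("connecting (expensive)...")
    return {"name": "travel-sample"}  # pretend this is a Bucket object

def get_bucket():
    global _bucket
    if _bucket is None:       # pay the connection cost only once
        _bucket = connect_cluster()
    return _bucket

def main(req=None):
    bucket = get_bucket()     # cheap on every invocation after the first
    return "using bucket %s" % bucket["name"]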

Reading Nested Parquet File in Scala and Exporting to CSV

DZone Database Zone
Reading Nested Parquet File in Scala and Exporting to CSV
Reading Nested Parquet File in Scala and Exporting to CSV

Recently, we were working on a problem where a Parquet compressed file had lots of nested tables. Some of the tables had columns with an Array type. Our objective was to read the file and save it to CSV.

We wrote a script in Scala that does the following:
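
The original script is in Scala, but as a rough sketch of the same flow (read the nested Parquet file, flatten the Array columns, write CSV), here is a PySpark version; the input path, output path, and the name of the array column ("items") are placeholders, not the actual schema from the article.

from pyspark.sql import SparkSession
from pyspark.sql.functions import explode

spark = SparkSession.builder.appName("nested-parquet-to-csv").getOrCreate()

# Paths and column names are illustrative placeholders.
df = spark.read.parquet("/data/input/nested.parquet")

# CSV cannot represent Array columns, so flatten each one first;
# here we assume a single array column called "items".
flat = df.withColumn("item", explode("items")).drop("items")

flat.write.mode("overwrite").option("header", "true").csv("/data/output/csv")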

Friday, September 29, 2017

Live from the London Loft | Using Amazon Machine Learning for Fraud Detection

Data Videos
#Data -Live from the London Loft | Using Amazon Machine Learning for Fraud Detection

Live Coding with AWS | Implementing Well Architected Performance Efficiency

Data Videos
#Data -Live Coding with AWS | Implementing Well Architected Performance Efficiency

Recover Deleted Data From SQL Table Using Transaction Log and LSNs

DZone Database Zone
Recover Deleted Data From SQL Table Using Transaction Log and LSNs
Recover Deleted Data From SQL Table Using Transaction Log and LSNs

At times, a user may perform an UPDATE or DELETE operation on a SQL Server database without applying a WHERE condition. This is a very common cause of data loss from SQL Server tables. Because SQL Server is a highly popular relational DBMS among corporations and businesses, the data-loss problem is magnified even further. Users should therefore be aware of the methods to recover deleted data from a SQL Server table in case of any mishaps.

Deleted rows can be recovered if the time of their deletion is known. This can be done through the use of Log Sequence Numbers (LSNs). LSN is a unique identifier given to every record present in the SQL Server transaction log. The upcoming section will discuss the process to recover deleted SQL Server data and tables with the help of Transaction log and LSNs.

AWS Artifact - Offline BAAs

Data Videos
#Data -AWS Artifact - Offline BAAs

ConfigMgr @ 25

Data Videos
#Data -ConfigMgr @ 25

Parameters and Templates with Power BI Desktop

Data Videos
#Data -Parameters and Templates with Power BI Desktop

This Week in Neo4j: RDF, APOC, and Alternative Facts

DZone Database Zone
This Week in Neo4j: RDF, APOC, and Alternative Facts
This Week in Neo4j: RDF, APOC, and Alternative Facts

Welcome to this week in Neo4j, where we round up what’s been happening in the world of graph databases in the last seven days.

Featured Community Member: Alessandro Negro

This week’s featured community member is Alessandro Negro, Chief Scientist at Neo4j Solutions Partner GraphAware.

Cool SQL Optimizations That Do Not Depend on the Cost Model (Part 1)

DZone Database Zone
Cool SQL Optimizations That Do Not Depend on the Cost Model (Part 1)
Cool SQL Optimizations That Do Not Depend on the Cost Model (Part 1)

Cost-based optimization is the de-facto standard way to optimize SQL queries in most modern databases. It is the reason why it is really hard to implement a complex, hand-written algorithm in a 3GL (third-generation programming language) such as Java that outperforms a dynamically calculated database execution plan generated by a modern optimizer. I’ve recently delivered a talk about that topic:

Today, we don’t want to talk about cost-based optimization, i.e. optimizations that depend on a database’s cost model. We’ll look into much simpler optimizations that can be implemented purely based on metadata (i.e. constraints) and the query itself. They’re usually no-brainers for a database to optimize because the optimization will always lead to a better execution plan, independently of whether there are any indexes, or how much data you have, or how skewed your data distribution is.

Couchbase's History of Everything: DCP

DZone Database Zone
Couchbase's History of Everything: DCP
Couchbase's History of Everything: DCP

Hiding behind an acronym (DCP), Couchbase has a secret superpower. Most people think of databases as storage locations for data at a certain moment in time. But with Database Change Protocol (DCP), a Couchbase cluster can be viewed as an ongoing stream of changes.

Essentially, Couchbase can "rewind history" and replay everything that happened to the database from the beginning. In doing so, it can reconstruct its internal state at any point since then. In this article, we're going to cover why anyone would want to do such a crazy thing in the first place, and how we can exploit this superpower to do extra cool stuff with our documents.

Thursday, September 28, 2017

Databases and Distributed Deadlocks: An FAQ

DZone Database Zone
Databases and Distributed Deadlocks: An FAQ
Databases and Distributed Deadlocks: An FAQ

Since Citus is a distributed database, we often hear questions about distributed transactions. Specifically, people ask us about transactions that modify data living on different machines. So we started to work on distributed transactions. We then identified distributed deadlock detection as the building block to enable distributed transactions in Citus.

As we began working on distributed deadlock detection, we realized that we needed to clarify certain concepts. So we created a simple FAQ for the Citus development team. And we found ourselves referring back to the FAQ over and over again. So we decided to share it here on our blog, in the hopes you find it useful.

MongoDB: How to Build a Global Database as a Service

Data Videos
#Data -MongoDB: How to Build a Global Database as a Service

The Story of Multi-Model Databases

DZone Database Zone
The Story of Multi-Model Databases
The Story of Multi-Model Databases

The world of databases has changed significantly in the last eight years or so. Do you remember the time when the word database was synonymous with a relational database? Relational databases ruled this niche for more than forty years. And for a good reason. They have strong consistency, transactions, and expressiveness, they are a good integration tool, and so on.

But forty years is a long period of time. A number of things have changed during this time, especially in the technology world. Today, we can see that relational databases cannot satisfy every need of today's IT world. A fixed database schema, a static representation of data, and the impedance mismatch are just some of the obstacles that users of relational databases have faced. That, in turn, gave space for a completely new branch of databases to develop: NoSQL databases.

Pagination in Couchbase Server With N1QL and PHP

DZone Database Zone
Pagination in Couchbase Server With N1QL and PHP
Pagination in Couchbase Server With N1QL and PHP

When building applications that deal with a large number of documents, it is important to use pagination to get rows by page.

In this article, I'll demonstrate how to implement pagination when working with N1QL and PHP.
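
The article's code is in PHP, but the pagination arithmetic is the same everywhere: translate a page number into LIMIT and OFFSET values and bind them as parameters. A minimal Python sketch, with the well-known `travel-sample` bucket standing in for whatever bucket you actually query:

def page_query(page, page_size=10):
    # Classic offset pagination: page numbers start at 1.
    offset = (page - 1) * page_size
    # Parameterized N1QL; $limit and $offset are bound server-side.
    statement = ("SELECT b.* FROM `travel-sample` AS b "
                 "ORDER BY b.name "
                 "LIMIT $limit OFFSET $offset")
    params = {"limit": page_size, "offset": offset}
    return statement, params

stmt, params = page_query(page=3, page_size=10)
print(stmt, params)  # hand these to your SDK's N1QL query call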

Benchmarking Google Cloud Spanner, CockroachDB, and NuoDB

DZone Database Zone
Benchmarking Google Cloud Spanner, CockroachDB, and NuoDB
Benchmarking Google Cloud Spanner, CockroachDB, and NuoDB

As a NuoDB solution architect, I constantly talk to people about what they're looking for in a database. More and more often, architects and CTOs tell us that they're building their next-generation data center — often using containers or cloud infrastructure — and they need a database that fits this model.

In their ideal world, they want a familiar relational database — but they want one that can deliver elastic capacity, maintain consistent transactions across multiple data centers, and span multiple public clouds at once. They're asking for a database of the future.

Wednesday, September 27, 2017

SaaS in the AWS Marketplace Enables ThingLogix to Grow to Next Level

Data Videos
#Data -SaaS in the AWS Marketplace Enables ThingLogix to Grow to Next Level

How a National Transportation Software Provider Migrated Test Infrastructure to AWS with Cascadeo

Data Videos
#Data -How a National Transportation Software Provider Migrated Test Infrastructure to AWS with Cascadeo

How the FT Accelerates Platform and Product Delivery with Salesforce Heroku and AWS

Data Videos
#Data -How the FT Accelerates Platform and Product Delivery with Salesforce Heroku and AWS

MongoDB: How to Build a Global Database as a Service

Data Videos
#Data -MongoDB: How to Build a Global Database as a Service

How Did MongoDB Get Its Name?

DZone Database Zone
How Did MongoDB Get Its Name?
How Did MongoDB Get Its Name?

Curious how MongoDB got its name? Here's your quick history lesson for the day.

Example of a MongoDB query. Source: MongoDB.

Modernizing SQL Server Applications With Tarantool

DZone Database Zone
Modernizing SQL Server Applications With Tarantool
Modernizing SQL Server Applications With Tarantool

Designed by a team led by a former MySQL engineer, Tarantool is an in-memory cache database that maintains many of the best aspects of Relational Database Management Systems (RDBMS).

Using a single core, Tarantool can process one million queries per second — a benchmark that would require at least 20 cores to achieve with a relational database.

Transitioning From Equivalent Indexes to Index Replicas

DZone Database Zone
Transitioning From Equivalent Indexes to Index Replicas
Transitioning From Equivalent Indexes to Index Replicas

In the previous post, we saw the benefits of using index replicas over equivalent indexes. Let’s say you are on Couchbase Server 4.x and have the following three equivalent indexes spread across three nodes. With Couchbase 5.0 Beta available, you want to migrate these equivalent indexes to index replicas.

// old 4.x equivalent indexes
create index eq_index1 on bucket(field1);
create index eq_index2 on bucket(field1);
create index eq_index3 on bucket(field1);

Note: If you want to use the same nodes to create the replicas, do make sure that they have the required memory and compute resources for both the index replicas and the equivalent indexes to coexist.

Tuesday, September 26, 2017

Best Practices for Automating Cloud Security Processes with Evident.io and AWS

Data Videos
#Data -Best Practices for Automating Cloud Security Processes with Evident.io and AWS

ConfigMgr @ 25

Data Videos
#Data -ConfigMgr @ 25

How EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud

Data Videos
#Data -How EIS Reduced Costs by 20% and Optimized SAP by Leveraging the Cloud

How to Initialize Database Variables and Assign Them Values in JMeter

DZone Database Zone
How to Initialize Database Variables and Assign Them Values in JMeter
How to Initialize Database Variables and Assign Them Values in JMeter

A variable is an information storage element (for example, storing numeric values, strings, etc.) whose value can be changed. In order to create a variable, it must be declared (specifying a name and the type of data stored) and initialized (assigned a value). Creating a variable and assigning a value to it is important for writing test cases that use database queries. This is because we need to get data from the database and also use the values during the execution of test cases.

In our last blog post, we created and asserted a basic data configuration to our MySQL database with Apache JMeter™. Now, we are ready to move on to more advanced scenarios. In this blog post, we will learn to initialize variables and assign values to them in a database. We will do this for one Thread Group.

Live Coding with AWS | Well Architected Reliability

Data Videos
#Data -Live Coding with AWS | Well Architected Reliability

Database Fundamentals #12: Adding Data With SSMS GUI

DZone Database Zone
Database Fundamentals #12: Adding Data With SSMS GUI
Database Fundamentals #12: Adding Data With SSMS GUI

In the previous Database Fundamentals, I argued that you should be learning T-SQL — yet the very next post I'm showing you how to use the GUI. What's up?

Why the GUI?

It's a very simple reason. I want to show you what it is so that I'm not hiding things. However, showing it to you will quickly expose the weaknesses inherent in using the SSMS GUI for direct data manipulation. It's a poor choice. Still, by the end of this post, you'll understand how it works. I'll also cover it in other posts, showing how to UPDATE and DELETE data using the GUI; those will further illustrate the weaknesses. You will, however, know how it works.

What IntelliJ IDEA 2017.3 EAP Brings to Database Tools

DZone Database Zone
What IntelliJ IDEA 2017.3 EAP Brings to Database Tools
What IntelliJ IDEA 2017.3 EAP Brings to Database Tools

We've recently announced the opening of the EAP for IntelliJ IDEA 2017.3, and we have already looked at some interesting EAP features. However, we haven't yet covered the changes in the database tools, so let's explore the major changes in this area.

Selecting Schema When Running a SQL File

IntelliJ IDEA 2017.3 now prompts you to choose database/schema along with a data source when you try to run an SQL file from the tool window:

Comparing Oracle and N1QL Support for the Date-Time Feature (Part 1)

DZone Database Zone
Comparing Oracle and N1QL Support for the Date-Time Feature (Part 1)
Comparing Oracle and N1QL Support for the Date-Time Feature (Part 1)

Date and time formats/types are very different for different databases. In this article, we will compare Couchbase N1QL date-time functions with Oracle's date-time support.

Oracle contains multiple data types associated with date-time support — namely DATE, TIMESTAMP, TIMESTAMP WITH TIME ZONE, and TIMESTAMP WITH LOCAL TIME ZONE. The TIMESTAMP data type is an extension of the DATE type.

Monday, September 25, 2017

Edmondo: Optimizing Infrastructure Costs with Spot Fleet and On Demand Instances

Data Videos
#Data -Edmondo: Optimizing Infrastructure Costs with Spot Fleet and On Demand Instances

SQL Server 2017 on Linux

Data Videos
#Data -SQL Server 2017 on Linux

A Note on Native Graph Databases

DZone Database Zone
A Note on Native Graph Databases
A Note on Native Graph Databases

It's fun to watch the graph database category evolve from being a seemingly niche category a decade ago (despite the valiant efforts of the Semantic Web community) to a modest — but important! — pillar of the data world as it is today.

But as the category has grown, we see some technical folks who attempt to "educate" engineers on artificial distinctions as if they were real and factual.

Quantum Computing

Data Videos
#Data -Quantum Computing

How to Write Efficient TOP N Queries in SQL

DZone Database Zone
How to Write Efficient TOP N Queries in SQL
How to Write Efficient TOP N Queries in SQL

A very common type of SQL query is the TOP-N query, where we need the "TOP N" records ordered by some value, possibly per category. In this blog post, we're going to look into a variety of different aspects of this problem, as well as how to solve them with standard and non-standard SQL.
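
As a small taste of the problem, here is a self-contained "top two per category" example using the standard ROW_NUMBER() window function, run through Python's built-in sqlite3 module (window functions require SQLite 3.25 or newer); the table and data are made up.

import sqlite3

conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, amount INT);
    INSERT INTO sales VALUES
      ('EU', 'a', 50), ('EU', 'b', 80), ('EU', 'c', 20),
      ('US', 'x', 90), ('US', 'y', 10), ('US', 'z', 70);
""")

top_n = 2
rows = conn.execute("""
    SELECT region, product, amount
    FROM (
      SELECT region, product, amount,
             ROW_NUMBER() OVER (PARTITION BY region ORDER BY amount DESC) AS rn
      FROM sales
    )
    WHERE rn <= ?
""", (top_n,)).fetchall()

for row in rows:
    print(row)  # the two largest sales per region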

These are the different aspects we'll discuss:

What Are Some Database Use Cases?

DZone Database Zone
What Are Some Database Use Cases?
What Are Some Database Use Cases?

To gather insights on the state of databases today, and their future, we spoke to 27 executives at 23 companies who are involved in the creation and maintenance of databases.

We asked these executives, "What are real world problems you, or your clients, are solving with databases?" Here's what they told us:

Sunday, September 24, 2017

RavenDB 4.0 Unsung Heroes: Field Compression

DZone Database Zone
RavenDB 4.0 Unsung Heroes: Field Compression
RavenDB 4.0 Unsung Heroes: Field Compression

I have been talking a lot about major features and making things visible and all sorts of really cool things. What I haven’t been talking about is a lot of the work that has gone into the backend and all the stuff that isn’t sexy and bright. After all, you probably don’t really care how the piping system in your house works — at least until the toilet doesn’t flush.

A lot of the work that we did with RavenDB 4.0 involved looking at all the pain points that we have run into and trying to resolve them. This series of posts is meant to expose some of these hidden features. If we did our job right, you will never even know that these features exist — they are that good.

Saturday, September 23, 2017

Microsoft Flow Guided Learning - Introduction

Data Videos
#Data -Microsoft Flow Guided Learning - Introduction

How to Run a MongoDB Replica Set on Kubernetes PetSet or StatefulSet

DZone Database Zone
How to Run a MongoDB Replica Set on Kubernetes PetSet or StatefulSet
How to Run a MongoDB Replica Set on Kubernetes PetSet or StatefulSet

Running and managing stateful applications or databases such as MongoDB, Redis, and MySQL with Docker containers is no simple task. Stateful applications must retain their data after a container has been shut down or migrated to a new node (for example, if during a failover or scaling operation, the container was shut down and re-created on a new host).

By default, Docker containers use their root disk as ephemeral storage, a chunk of disk space from the host filesystem that runs the container. This disk space can’t be shared with other processes nor can it be easily migrated to a new host. While you can save the changes made within the container using the “Docker commit” command (which creates a new Docker image that will include your modified data), it can’t be used as a de facto way to store content.

Friday, September 22, 2017

Record Your ConfigMgr Memories @ Microsoft Ignite!

Data Videos
#Data -Record Your ConfigMgr Memories @ Microsoft Ignite!

How to Stop a Runaway Index Build in MongoDB

DZone Database Zone
How to Stop a Runaway Index Build in MongoDB
How to Stop a Runaway Index Build in MongoDB

Index builds in MongoDB can have an adverse impact on the availability of your MongoDB cluster. If you trigger a foreground index build on a large collection on your production server, you may find that your cluster is unresponsive until the index build is complete. On a large collection, this could take several hours or days, as described in the perils of index building in MongoDB.

The recommended best practice is to trigger index builds in the background. However, on large collection indexes, we've seen multiple problems with this approach. In the case of a three-node cluster, both secondaries start building the index and stop responding to any requests. Consequently, the primary does not have quorum and moves to the secondary state, taking your cluster down. Also, the default index builds triggered from the command line are foreground index builds — making this a widespread problem. In future releases, we're hopeful that this becomes background behavior by default.

Amazon EC2 Systems Manager Introduction

Data Videos
#Data -Amazon EC2 Systems Manager Introduction

Adding Laplace or Gaussian Noise to Database for Privacy

DZone Database Zone
Adding Laplace or Gaussian Noise to Database for Privacy
Adding Laplace or Gaussian Noise to Database for Privacy

The idea of differential privacy is to guarantee bounds on how much information may be revealed by someone's participation in a database. These bounds are described by two numbers: ε (epsilon) and δ (delta). We're primarily interested in the multiplicative bound described by ε. This number is roughly the number of bits of information an analyst might gain regarding an individual.

The multiplicative bound is exp(ε) and so ε, the natural log of the multiplicative bound, would be the information measure, though technically in nats rather than bits since we're using natural logs rather than logs base 2.
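
As a minimal illustration of the mechanism the title refers to, here is how Laplace noise calibrated to a query's sensitivity and to ε is typically added; the numbers are made up, and this is a sketch rather than a complete differential-privacy implementation.

import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    # Laplace mechanism: noise with scale b = sensitivity / epsilon yields
    # epsilon-differential privacy for a single numeric query with that L1 sensitivity.
    rng = rng or np.random.default_rng()
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Example: a counting query (sensitivity 1) released with a budget of epsilon = 0.5.
print(laplace_mechanism(true_value=1000, sensitivity=1.0, epsilon=0.5))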

Partitioning Behavior of DynamoDB

DZone Database Zone
Partitioning Behavior of DynamoDB
Partitioning Behavior of DynamoDB

This is the third part of a three-part series on working with DynamoDB. The previous article, Querying and Pagination With DynamoDB, focuses on different ways you can query in DynamoDB, when to choose which operation, the importance of choosing the right indexes for query flexibility, and the proper way to handle errors and pagination.

As discussed in the first article, Working With DynamoDB, the reason I chose to work with DynamoDB was primarily its ability to handle massive data with single-digit millisecond latency. Scaling, throughput, architecture, and hardware provisioning are all handled by DynamoDB.
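
To make the partition-key idea concrete, here is a small boto3 sketch that queries items by partition key; all items sharing a partition key live on the same partition, which is what keeps such lookups fast. The table name, key names, and values are placeholders.

import boto3
from boto3.dynamodb.conditions import Key

dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("GameScores")  # placeholder table

# Supplying the partition key lets DynamoDB route the request straight
# to the partition that holds those items.
resp = table.query(
    KeyConditionExpression=Key("user_id").eq("user-42") & Key("game_ts").gt(1500000000)
)
for item in resp["Items"]:
    print(item)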

Dependency Injection With ASP.NET Core and Couchbase

DZone Database Zone
Dependency Injection With ASP.NET Core and Couchbase
Dependency Injection With ASP.NET Core and Couchbase

Dependency injection is a design pattern that makes coding easier. It saves you the hassle of instantiating objects with complex dependencies, and it makes it easier for you to write tests. With the Couchbase.Extensions.DependencyInjection library, you can use Couchbase clusters and buckets within the ASP.NET Core dependency injection framework.

In my last blog post on distributed caching with ASP.NET, I mentioned the DependencyInjection library. Dependency injection will be explored in-depth in this post. Feel free to follow along with the code samples I've created, available on GitHub.

Thursday, September 21, 2017

Accelerate Your SAP HANA Migration with Capgemini & AWS FAST

Data Videos
#Data -Accelerate Your SAP HANA Migration with Capgemini & AWS FAST

AWS Partner Success: Corent

Data Videos
#Data -AWS Partner Success: Corent

Quantifying Privacy Loss in a Statistical Database

DZone Database Zone
Quantifying Privacy Loss in a Statistical Database
Quantifying Privacy Loss in a Statistical Database

In the previous post, we looked at a simple randomization procedure to obscure individual responses to yes/no questions in a way that retains the statistical usefulness of the data. In this post, we'll generalize that procedure, quantify the privacy loss, and discuss the utility/privacy trade-off.

More General Randomized Response

Suppose we have a binary response to some question as a field in our database. With probability t, we leave the value alone. Otherwise, we replace the answer with the result of a fair coin toss. In the previous post, what we now call t was implicitly equal to 1/2. The value recorded in the database could have come from a coin toss and so the value is not definitive — and yet it does contain some information. The posterior probability that the original answer was 1 ("yes") is higher if a 1 is recorded. We did this calculation for t = 1/2 last time, and here we'll look at the result for general t.
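
That posterior falls out of Bayes' theorem: P(recorded = 1 | true = 1) = t + (1 − t)/2, while P(recorded = 1 | true = 0) = (1 − t)/2. A small Python sketch of the calculation for general t and an arbitrary prior; with t = 1/2 and a uniform prior, a recorded "yes" works out to a posterior of 3/4.

def posterior_true_yes(t, prior, recorded=1):
    # Likelihoods of the recorded value under each possible true answer.
    like_yes = t + (1 - t) / 2.0   # answer kept, or coin toss came up "yes"
    like_no = (1 - t) / 2.0        # only a coin toss can record a "yes"
    if recorded == 0:
        like_yes, like_no = 1 - like_yes, 1 - like_no
    num = like_yes * prior
    return num / (num + like_no * (1 - prior))

# t = 1/2 with a uniform prior: a recorded "yes" gives a posterior of 0.75.
print(posterior_true_yes(t=0.5, prior=0.5, recorded=1))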

Official Preview: “ConfigMgr @ 25”

Data Videos
#Data -Official Preview: “ConfigMgr @ 25”

How to Create a Cypher Map With Dynamic Keys

DZone Database Zone
How to Create a Cypher Map With Dynamic Keys
How to Create a Cypher Map With Dynamic Keys

I was recently trying to create a map in a Cypher query but wanted to have dynamic keys in that map. I started off with this query:

WITH "a" as dynamicKey, "b" as dynamicValue RETURN { dynamicKey: dynamicValue } AS map ╒══════════════════╕ │"map" │ ╞══════════════════╡ │{"dynamicKey":"b"}│ └──────────────────┘

Not quite what we want! We want dynamicKey to be evaluated rather than treated as a literal. As usual, APOC comes to the rescue!

Database Fundamentals #11: Why Learn T-SQL?

DZone Database Zone
Database Fundamentals #11: Why Learn T-SQL?
Database Fundamentals #11: Why Learn T-SQL?

If you've been following along with the previous 10 Database Fundamentals blog posts, you have a SQL Server installed and a database with a table in it. You may have more if you've been practicing. Now would be the time to start adding data to the database, but first, I want to talk about the importance of T-SQL!

Why T-SQL?

The way SQL Server accepts information is very different from most programs you're used to using. Most programs focus on the graphical user interface as a mechanism for enabling data entry. While there is a GUI within SQL Server that you can use for data entry (and I will do a blog post on it), the primary means of manipulating data within SQL Server is the Transact-Structured Query Language (or T-SQL).

"Lunch Break" @ Microsoft Ignite!

Data Videos
#Data -"Lunch Break" @ Microsoft Ignite!

When and Why I Use an In-Memory Database or a Traditional Database

DZone Database Zone
When and Why I Use an In-Memory Database or a Traditional Database
When and Why I Use an In-Memory Database or a Traditional Database

In this article, I’d like to talk about when I use an in-memory database, when I prefer a traditional DBMS, and why.

When I need to decide which DBMS to use — in-memory (let’s call it IMDB) or traditional (I’ll be calling it RDBMS) — I usually make a choice based on the type of storage where my data is going to be kept. I divide all the options into three groups: RAM, solid-state drive or flash memory (SSD), and hard disk drive (HDD). First, I pick a type (or types) of storage to use and then, I start thinking about what database (or databases) I want to have on top of that.

Wednesday, September 20, 2017

"Lunch Break" @ Ignite!

Data Videos
#Data -"Lunch Break" @ Ignite!

Create a Competitive Advantage for Your Startup with AWS Marketplace

Data Videos
#Data -Create a Competitive Advantage for Your Startup with AWS Marketplace

The MySQL High Availability Landscape in 2017, Part 3: The Babies

DZone Database Zone
The MySQL High Availability Landscape in 2017, Part 3: The Babies
The MySQL High Availability Landscape in 2017, Part 3: The Babies

This post is the third of a series focusing on the MySQL high availability solutions available in 2017.

The first post looked at the elders, the technologies that have been around for more than ten years. The second post talked about the adults, the more recent and mature technologies. In this post, we will look at the emerging MySQL high availability solutions. The "baby" MySQL high-availability solutions I chose for the blog are group replication, proxies, and distributed storage.

AWS Knowledge Center Videos: How do I restore Glacier objects with restore tiers in the S3 Console?

Data Videos
#Data -AWS Knowledge Center Videos: How do I restore Glacier objects with restore tiers in the S3 Console?

Understanding Preceding Loads

DZone Database Zone
Understanding Preceding Loads
Understanding Preceding Loads

Some of the features of QlikView are not frequently blogged about. These features are used on a day-to-day basis, and developers often don't give them any thought. If you're unfamiliar with these techniques, you should review them before reading this article. One feature that we will be talking about today is the Preceding Load.

What Is a Preceding Load?

Knowing exactly what a Preceding Load is will be important here. As the name implies, a Preceding Load occurs before, prior to, or in front of another load. Even if you're not aware of it, you've probably used a Preceding Load before. For a basic example, imagine an SQL SELECT statement that loads from an OLEDB or ODBC data source; as an option, the wizard can add a LOAD section prior to the SELECT. Using one of these ahead of a database load is always recommended, as doing so opens up a complete range of syntax that is unavailable in the SQL statement.

The Biggest Challenges of Moving to NoSQL

DZone Database Zone
The Biggest Challenges of Moving to NoSQL
The Biggest Challenges of Moving to NoSQL

We are in the midst of a shift towards NoSQL data stores across the software world, especially in the web and mobile space. Many developers and enterprises are migrating or looking to do so, yet a great chasm exists between traditional SQL databases and NoSQL databases. What are the challenges facing developers moving from SQL to NoSQL?

Personally, I've experienced both worlds, having developed and maintained SQL and NoSQL apps and migrated several large apps to NoSQL. In this article, I pass that wisdom on to you, dear reader: pitfalls to avoid, riches to reap, new ways of thinking. My experience is primarily with RavenDB, a NoSQL database built on .NET Core, but the lessons here are general enough to apply to many NoSQL databases.

Top 6 Ways GPU Acceleration Is Disrupting Financial Analytics

DZone Database Zone
Top 6 Ways GPU Acceleration Is Disrupting Financial Analytics
Top 6 Ways GPU Acceleration Is Disrupting Financial Analytics

It is widely reported that in the early 1800s, the Rothschild banking family of France set up a vast network of pigeon lofts spread across Europe and deployed a prized coop of racing pigeons to fly between its financial houses, carrying the latest news. From the pigeons, Nathan Rothschild was reportedly the first to learn of the British victory at Waterloo. The story goes that while other traders on the stock exchange prepared for a British loss, he went long and made millions. At the time, pigeons were the fastest way to send and receive information ahead of the competition.

Fast-forward a few hundred years and the financial services community is still on the forefront of using newer and more sophisticated technologies for competitive advantage. Banks, credit card companies, hedge funds, brokerages, and insurance firms are all aggressively testing and deploying the latest cutting-edge technologies to enable real-time analytics and AI use cases. Their primary goals are to reduce risk, execute smarter trades, increase profitability, and eliminate fraud.

Tuesday, September 19, 2017

Microsoft Advanced Threat Analytics - Overview of ATA Deployment in 10mn

Data Videos
#Data -Microsoft Advanced Threat Analytics - Overview of ATA Deployment in 10mn

Gigamon: Visibility Platform for AWS

Data Videos
#Data -Gigamon: Visibility Platform for AWS

GameStop/Pariveda: Building Smart Deployment Pipelines with Lambda and Codebuild on AWS

Data Videos
#Data -GameStop/Pariveda: Building Smart Deployment Pipelines with Lambda and Codebuild on AWS

How Important Is the Database in Game Development?

DZone Database Zone
How Important Is the Database in Game Development?
How Important Is the Database in Game Development?

Thanks to Ben Ballard, Customer Success Manager at VoltDB, for sharing his insights on the current and future state of game development. VoltDB provides an in-memory, translytical database that’s able to ingest millions of transactions per second. They have several large game development companies as clients.

Q: What are the keys to developing successful games?

Microsoft Machine Learning Server - All Up

Data Videos
#Data -Microsoft Machine Learning Server - All Up

9 Things to Consider When Considering Amazon Athena

DZone Database Zone
9 Things to Consider When Considering Amazon Athena
9 Things to Consider When Considering Amazon Athena

Amazon has generated a lot of excitement around their release of Athena, an ANSI-standard query tool that works with data stored in Amazon S3. Athena and S3 can deliver results quickly and with the power of sophisticated data warehousing systems. This article covers nine things that you should know about Athena when considering it as a query service.

1. Schema and Table Definitions

To be able to query data with Athena, you will need to make sure you have data residing on S3. With data on S3, you will need to create a database and tables. When creating schemas for data on S3, the positional order is important. For example, if you have a source file with ID, DATE, CAMPAIGNID, RESPONSE, ROI, and OFFERID columns, then your schema should reflect that structure.
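
For instance, a table definition for that file layout might look roughly like the following, submitted here through boto3 (the same DDL can be pasted into the Athena console instead). The database name, S3 locations, and delimiter are placeholders.

import boto3

ddl = """
CREATE EXTERNAL TABLE IF NOT EXISTS marketing.responses (
  id         string,
  `date`     string,
  campaignid string,
  response   string,
  roi        double,
  offerid    string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://my-bucket/responses/'
"""

athena = boto3.client("athena", region_name="us-east-1")
athena.start_query_execution(
    QueryString=ddl,
    ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
)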

AWS Knowledge Center Videos: How do I restore Glacier objects with restore tiers in the S3 Console?

Data Videos
#Data -AWS Knowledge Center Videos: How do I restore Glacier objects with restore tiers in the S3 Console?

Data Encryption and Decryption With Oracle

DZone Database Zone
Data Encryption and Decryption With Oracle
Data Encryption and Decryption With Oracle

In this article, I would like to talk about Oracle's data encryption support. Oracle offers two different packages for data encryption to software developers. They are:

DBMS_CRYPTO (came with Oracle 10g)

Is Your Postgres Query Starved for Memory?

DZone Database Zone
Is Your Postgres Query Starved for Memory?
Is Your Postgres Query Starved for Memory?

For years or even decades, I’ve heard about how important it is to optimize my SQL statements and database schema. When my application starts to slow down, I look for missing indexes; I look for unnecessary joins; I think about caching results with a materialized view.

But instead, the problem might be my Postgres server was not installed and tuned properly. Buried inside the postgresql.conf file is an obscure, technical setting called work_mem. This controls how much “working memory” your Postgres server allocates for each sort or join operation. The default value for this is only 4MB:
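
That default corresponds to the line work_mem = 4MB in postgresql.conf. As a rough sketch of how you might experiment with the setting from application code, using psycopg2 — the connection string, table, and the 64MB figure are all placeholders:

import psycopg2

conn = psycopg2.connect("dbname=appdb user=app")  # placeholder DSN
cur = conn.cursor()

cur.execute("SHOW work_mem;")            # typically '4MB' out of the box
print(cur.fetchone())

cur.execute("SET work_mem = '64MB';")    # per-session override, no restart needed
cur.execute("EXPLAIN ANALYZE SELECT * FROM events ORDER BY created_at;")
for line in cur.fetchall():
    print(line[0])  # watch for 'Sort Method: quicksort' vs. an external merge on disk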

Monday, September 18, 2017

Live Coding with AWS | AWS IoT App Challenge Kick Off

Data Videos
#Data -Live Coding with AWS | AWS IoT App Challenge Kick Off

Microsoft Cloud App Security – Working with the Regex Engine

Data Videos
#Data -Microsoft Cloud App Security – Working with the Regex Engine

How Hatco Protects Against Ransomware with Druva on AWS

Data Videos
#Data -How Hatco Protects Against Ransomware with Druva on AWS

AWS Knowledge Center Videos: How do I expand an EBS root volume of a Windows instance?

Data Videos
#Data -AWS Knowledge Center Videos: How do I expand an EBS root volume of a Windows instance?

3 Approaches to Creating a SQL-Join Equivalent in MongoDB

DZone Database Zone
3 Approaches to Creating a SQL-Join Equivalent in MongoDB
3 Approaches to Creating a SQL-Join Equivalent in MongoDB

While there's no such operation as a SQL-style table join in MongoDB, you can achieve the same effect without relying on table schema. Here are three techniques for combining data stored in MongoDB document collections with minimal query-processing horsepower required.

The signature relational-database operation is the table join: Combine data from table 1 with data from table 2 to create table 3. The schemaless document-container structure of MongoDB and other non-relational databases makes such table joins impossible.
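
One of the usual techniques is the $lookup aggregation stage (available since MongoDB 3.2), which produces a left-outer-join-like result. A minimal pymongo sketch; the database, collections, and field names are placeholders:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["shop"]

# Attach the matching customer document to each order.
pipeline = [
    {"$lookup": {
        "from": "customers",           # the "right-hand" collection
        "localField": "customer_id",   # field in orders
        "foreignField": "_id",         # field in customers
        "as": "customer",              # name of the resulting array field
    }},
    {"$unwind": "$customer"},          # flatten the single-element array
]

for doc in db.orders.aggregate(pipeline):
    print(doc)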

Lock, Stock, and MySQL Backups [Q+A]

DZone Database Zone
Lock, Stock, and MySQL Backups [Q+A]
Lock, Stock, and MySQL Backups [Q+A]

Hello again! On August 16, we delivered a webinar on MySQL backups. As always, we had a number of interesting questions. Some of them we answered during the webinar, but we'd like to share some of them here in writing.

Q: What is the best way to maintain daily full backups, but selective restores omitting certain archive tables?

Next-Level MySQL Performance: Tarantool as a Replica

DZone Database Zone
Next-Level MySQL Performance: Tarantool as a Replica
Next-Level MySQL Performance: Tarantool as a Replica

Refactoring your MySQL stack by adding an in-memory NoSQL solution can improve throughput, allow scalability, and result in substantial hardware savings.

Tarantool is particularly suited as a MySQL addition, as it offers all of the speed of the basic cache databases while maintaining many features of a traditional Relational Database Management System (RDBMS).

Sunday, September 17, 2017

Monitoring Open-Source Databases [Q+A]

DZone Database Zone
Monitoring Open-Source Databases [Q+A]
Monitoring Open-Source Databases [Q+A]

Welcome to another post in our series of interview blogs for the upcoming Percona Live Europe 2017 in Dublin. This series highlights a number of talks that will be at the conference and gives a short preview of what attendees can expect to learn from the presenter.

This blog post is with Bernd Erk, CEO of Icinga. His talk is titled Monitoring Open-Source Databases With Icinga. Icinga is a popular open-source successor of Nagios that checks hosts and services and notifies you of their statuses. But you also need metrics for performance and growth to deal with your scaling needs. Adding conditional behaviors and configuration in Icinga is not just intuitive but also intelligently adaptive at runtime. In our conversation, we discuss how to intelligently monitor open-source databases.

Saturday, September 16, 2017

Introduction to the Morpheus DataFrame

DZone Database Zone
Introduction to the Morpheus DataFrame
Introduction to the Morpheus DataFrame

The Morpheus library is designed to facilitate the development of high-performance analytical software involving large datasets for both offline and real-time analysis on the Java Virtual Machine (JVM). The library is written in Java 8 with extensive use of lambdas but is accessible to all JVM languages.

Motivation

At its core, Morpheus provides a versatile two-dimensional memory-efficient tabular data structure called a DataFrame, similar to the one first popularized in R. While dynamically typed scientific computing languages like R, Python, and Matlab are great for doing research, they are not well-suited for large-scale production systems, as they become extremely difficult to maintain and dangerous to refactor. The Morpheus library attempts to retain the power and versatility of the DataFrame concept while providing a much more type-safe and self-describing set of interfaces, which should make developing, maintaining, and scaling complex code much easier.

Friday, September 15, 2017

Top 3 Errors of SQL Server That Might Corrupt Your Database

DZone Database Zone
Top 3 Errors of SQL Server That Might Corrupt Your Database
Top 3 Errors of SQL Server That Might Corrupt Your Database

Do you suspect corruption in your SQL data? Do you know that there are different errors that indicate an unhealthy SQL Server database? In this blog, we are going to cover three major errors associated with SQL Server, along with the best solutions.

But first, let's explore some basic information about database corruption.

Massive Parallel Query Log Processing With ClickHouse

DZone Database Zone
Massive Parallel Query Log Processing With ClickHouse
Massive Parallel Query Log Processing With ClickHouse

In this blog, I'll look at how to use ClickHouse for parallel log processing.

Percona is known primarily for our expertise in MySQL and MongoDB (at this time), but neither is quite suitable for heavy analytical workloads. There is a need to analyze data sets, and a very popular task is crunching log files. Below, I'll show how ClickHouse can be used to efficiently perform this task. ClickHouse is attractive because it has multi-core parallel query processing, and it can even execute a single query using multiple CPUs in the background.
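
ClickHouse also exposes a plain HTTP interface (port 8123 by default), so even a small script can fire analytical queries at a log table. A sketch using requests; the table and columns are placeholders for whatever schema your log loader creates:

import requests

query = """
SELECT status, count() AS hits
FROM access_log
WHERE event_date = today()
GROUP BY status
ORDER BY hits DESC
FORMAT TabSeparated
"""

resp = requests.post("http://localhost:8123/", data=query)
resp.raise_for_status()
print(resp.text)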

Bringing Continuous Delivery to the Database

DZone Database Zone
Bringing Continuous Delivery to the Database
Bringing Continuous Delivery to the Database

Last month at Jenkins World, the annual gathering of DevOps devotees and Jenkins users, I sat down with executives at Datical, a group that focuses on database deployment automation. Ben Geller, vice president of marketing, and Pete Pickerell, co-founder of Datical and its vice president of product strategy, discussed the streamlining of their continuous integration/continuous delivery process with Jenkins 2.6. DevOps is coming to the database.

As changes to applications happen faster and faster, it’s logical that database releases would contribute to delays. And as database administrators scramble to keep up the pace, error rates inevitably continue to increase. In a survey of IT managers conducted by IDG Research and commissioned by Datical, almost a third (30 percent) of IT executives reported seeing an increase over the last year in error rates in production caused by bad database changes. 

PL/SQL Record Types and the Node.js Driver

DZone Database Zone
PL/SQL Record Types and the Node.js Driver
PL/SQL Record Types and the Node.js Driver

The current version of the Node.js driver (v1.12) doesn't support binding record types directly. Does that mean you can't invoke stored procedures and functions that use record types? Of course not! For now, you just have to decompose the record types for binding and then recompose them inside your PL/SQL block. Let's have a look at an example...

Imagine we have the following PL/SQL package spec and body:

The Need for Speed: Access Existing Data 1,000x Faster

DZone Database Zone
The Need for Speed: Access Existing Data 1,000x Faster
The Need for Speed: Access Existing Data 1,000x Faster

Web and mobile applications are sometimes slow because the backing database is slow and/or the connection to the database imposes latencies. Modern UIs and interactive applications require fast back-ends with ideally no observable latency or else users will move on to other services or will just get tired and stop using the service altogether.

In this article, we will learn how analytic database applications can be sped up by orders of magnitude using standard Java 8 streams and Speedment's in-JVM-memory acceleration technology. At the end, we will run a JMH test suite with representative benchmarks that indicate a speedup factor exceeding 1,000 times. Learn more about Speedment's open-source solution here.

Thursday, September 14, 2017

Intro to Redisq: A Java Library for Asynchronous Messaging in Redis

DZone Database Zone
Intro to Redisq: A Java Library for Asynchronous Messaging in Redis
Intro to Redisq: A Java Library for Asynchronous Messaging in Redis

This blog post is about a solution that we built here at GRAKN.AI for executing tasks asynchronously. We thought it would be nice to release it as a generic Java library built on Redis.

The Task of Choosing a Task Queue

We will discuss the design choices behind the engine in another post, but — motivated by the need to simplify our distribution — we decided to use Redis as a task queue.

AWS IPC Core Security Q3 2017

Data Videos
#Data -AWS IPC Core Security Q3 2017

8 Things We Learned Running BuzzFeed on Amazon ECS

Data Videos
#Data -8 Things We Learned Running BuzzFeed on Amazon ECS

Top 3 Errors of SQL Server That Might Corrupt Your Database

DZone Database Zone
Top 3 Errors of SQL Server That Might Corrupt Your Database
Top 3 Errors of SQL Server That Might Corrupt Your Database

Do you suspect corruption in your SQL data? Do you know that there are different errors that indicate an unhealthy SQL Server database? In this blog, we are going to cover three major errors associated with SQL Server, along with the best solutions.

But first, let's explore some basic information about database corruption.

Fun With SQL: Functions in Postgres

DZone Database Zone
Fun With SQL: Functions in Postgres

In our previous Fun with SQL post on the Citus Data blog, we covered w...