Questions About Amazon Database Migration Service (AWS DMS) That You Were Too Shy to Ask

Can you imagine it? You are in a group of smart database people who are debating the finer points of AWS DMS, and you don't even know what the letters stand for. You just feel too shy to ask those basic questions that seem ridiculous once you're up to speed. Laerte Junior answers all the questions you need answers for when facing the prospect of getting familiar with Amazon's useful Database Migration Servic… Read more

The Quick and the Dead Slow: Importing CSV Files into Azure Data Warehouse

Although Azure Data Warehouse is part of the bright new jewellery of the Microsoft Data Platform, the old Data Warehouse rules still apply where data imports are concerned. It pays to choose the fastest import method first, and then prepare your data to ensure that it is compatible with your choice. The subtlety is in the details, as Feodor explains.… Read more

Questions About RDS SQL Server That You Were Too Shy to Ask

There are a number of different ways that you can host SQL Server. RDS SQL Server, for example, uses SQL Server within AWS as a simple database service, much like a more versatile alternative to MySQL. Obviously, it is a compromise, in that you lose many of the extras beyond the database. Laerte Junior answers those questions about RDS that people seem to assume you know the answers to, but which you may be too shy to ask.… Read more

Using the Copy Wizard for the Azure Data Factory

Creating a feed for a data warehouse used to be a considerable task. Now, it just takes a few minutes to work through a series of screens that, in this example, create a pipeline that brings data from a remote FTP server, decompresses it and imports it in a structured format, ready for data analysis. The Copy Wizard for the Azure Data Factory is a great time-saver, as Feodor Georgiev explains.… Read more

Pseudonymization and the Inference Attack

It is surprising that so much can be identified by deduction from data. You may assume that you can safely distribute partially masked data for reporting, development or testing when the original data contains personal information. Without this sort of information, much medical or scientific research would be vastly more difficult. However, the more useful the data is, the easier it is to mount an inference attack on it to identify personal information. Phil Factor explains.… Read more

Data in Motion and Data at Rest

Microsoft StreamInsight and Azure Stream Analytics represent a very different model for processing data. They are concerned with complex event processing (CEP): processing streams of event data from such things as sensors to deduce significant patterns and apply filters. Joe Celko discusses the background to the intriguing technology of complex event processing to establish the difference between data at rest and data on the move.… Read more

Automating the Synchronization of RDS SQL Server Agent Jobs in a Multi-AZ Environment

Although Azure is the obvious Cloud service to host SQL Server, Amazon Relational Database Service (RDS) for SQL Server is a good choice when your organisation uses AWS. RDS deals with maintenance and monitoring, and supports the use of PowerShell to automate routine tasks. What if a script needs to be triggered by an unscheduled event? Even in this case, RDS can be configured to run scripts to react when something like a failover happens. Laerte Junior shows how easy it is to set up Lambda functions and some PowerShell scripts to automatically synchronise agent jobs after a failover.… Read more

Getting What You Need From Azure Storage Disks

If you need persistent data disks for Azure IaaS VMs that are supported on both Windows and Linux, then you will be interested in Azure Storage Disks. These can increase the storage capacity of your VMs by up to a terabyte per disk, and they not only allow several availability options, but also offer a range of performance in terms of I/O throughput and latency. With the right configuration, you can create as much of the right sort of storage as you need.… Read more

Personal Data, Privacy, and the GDPR

Now that there have been well-publicised examples of the awful consequences of data breaches and data misuse, there is increasing public pressure for legislation on privacy and personal data that has enough clout to prosecute serious offenders. In the vanguard has been the EU data protection regulation, soon to be succeeded by the GDPR. It defines IT practices for data that are likely to extend worldwide. William Brewer gives a rundown of what he sees as the implications for IT practice.… Read more

Azure Load Balancers and SQL Server

Load balancing in Azure has more importance for the DBA, because it is essential for Windows Server Failover Clustering in Azure, whether it is for AlwaysOn Availability Groups, Failover Clustered Instances, or any other highly available solution. Azure load balancing works out the location of the availability group, and routes traffic there. When the load balancer detects a failure, it routes traffic to the new primary replica. Joshua Feierman gives an overview of what is required.… Read more

How to Secure Your Azure Storage Infrastructure

Azure storage is an essential foundation for the more sophisticated services that Microsoft Azure provides. It is therefore important to understand how to make access to your data in Azure storage secure, to control access appropriately, to log activity and to get metrics on usage. Security in Azure can be easily managed and controlled via policies. There are a variety of ways to achieve the types of control over access that your applications need, as Christos Matskas explains.… Read more

Azure Networking for SQL Server DBAs

The network is important to any DBA because so much performance is dependent on I/O, because of the importance of security, and because of the need to ensure that everyone gets the right access. DBAs generally need not become experts in Azure networks, but it helps to understand the concepts and language. If you are running a SQL Server Virtual Machine in Azure, then VNets, Subnets, Network Security Groups, VNet peering and VPN gateways are all worth knowing about in order to keep SQL Servers running smoothly.… Read more

How to Build Your First SQL Server Virtual Lab in Windows Azure

If you are a DBA who hasn't so far dived head-first into using Azure, it is worth setting up an Azure 'Virtual Lab' environment the easy way, using a template. This will then allow you to experiment, try things out with SQL Azure, and get familiar with Resource Groups. Joshua shows how to build a virtual lab from the ground up, in the first of a series that aims to give you a grounding in Azure.… Read more

Why Would I Ever Need to Partition My Big ‘Raw’ Data?

Whether you are running an RDBMS or a Big Data system, it is important to consider your data-partitioning strategy. As the volume of data grows, it becomes increasingly important to match the way you partition your data to the way it is queried, to allow 'pruning' optimisation. When you have huge imports of data to consider, it can get complicated. Bartosz explains how to get things right; not perfectly, but wisely.… Read more

How to Start Big Data with Apache Spark

It is worth getting familiar with Apache Spark because it is a fast and general engine for large-scale data processing, and you can use your existing SQL skills to get going with analysis of the type and volume of semi-structured data that would be awkward for a relational database. With an IDE such as Databricks you can very quickly get hands-on experience with an interesting technology.… Read more

Azure SQL Data Warehouse: Explaining the Architecture Through System Views

The architecture of Azure SQL Data Warehouse isn't easy to explain briefly, but if you have some useful queries that access the management and catalog views, and diagrams that show how they relate together, you can very quickly get a feel for what is going on under the hood. By using and extending the queries on these views, you can check on a variety of waits, blocking, status, table distribution and data movement in ASDW.… Read more

SQL Database: How to Configure Active Geo-Replication

Active Geo-Replication is powerful magic for ensuring the high availability of an Azure SQL database, and for disaster recovery. In choosing the best options, you need to accurately understand the value that the business places on the service you're running, how long it will take for a secondary replica to be in synch with the primary replica, the importance of spreading the location of replicas widely, and the maximum tolerable unscheduled downtime. Just clicking all the options could prove to be expensive.… Read more