Feodor has a background of many years working with SQL Server and is now mainly focusing on data analytics, data science and R.
Over more than 15 years Feodor has worked on assignments involving database architecture, Microsoft SQL Server data platform, data model design, database design, integration solutions, business intelligence, reporting, as well as performance optimization and systems scalability.
In the past 3 years he has expanded his focus to coding in R for assignments relating to data analytics and data science.
This article pulls together the concepts from two previous articles to demonstrate a way to automate the “on-demand” creation and deletion of an HDInsight cluster. This not only serves as a demonstration of the power of automating Azure provisioning, but is also a practical solution for the user who only occasionally requires the power of HDInsight for scalable computations.… Read more
Because Azure is designed from the ground up to allow automated provisioning, it provides a number of interesting opportunities for use. With a .NET language such as C#, it is possible to create a very flexible 'on-demand' infrastructure in Azure, using Azure Resource Manager templates, that allows users to economically deploy the resources they need just for the interval of time that they are required, and then destroy them. … Read more
Although Azure Data Warehouse is part of the bright new jewellery of the Microsoft Data Platform, the old Data Warehouse rules still apply where data imports are concerned. When it comes to data import, it pays to choose the fastest import method first and then prepare your data to ensure that it is compatible with your choice. The subtlety is in the details, as Feodor explains.… Read more
Creating a feed for a data warehouse used to be a considerable task. Now, it just takes a few minutes to work through a series of screens that, in this example, create a pipeline that brings data from a remote FTP server, decompresses the data and imports the data in a structured format, ready for data analysis. The Copy Wizard for the Azure Data Factory is a great time-saver, as Feodor Georgiev explains.… Read more
Azure Data Factory provides a radical new cloud-based way of collecting and preparing data for storage and analysis. How do you get started with it to explore the possibilities it provides? Feodor Georgiev shows the practicalities of how to go about the task of preparing a pipeline for use, from preparing the Azure environment to downloading a file from an FTP server to blob storage in the Azure environment.… Read more
Statistics holds out the promise of teasing out significant relationships and determining cause and effect. Now that so much data is easily available, does that mean that we can at last harness these techniques to predict future trends and outcomes? Feodor gives an introduction to some simple techniques of predictive analytics, and explains the 'whys', 'hows' and 'wherefores'. … Read more
SQL Server Audit has grown in functionality over the years but it can be tricky to maintain and use because it lacks centralization and analysis tools. It can do a fast and lightweight audit of many different activities including DML and DDL at both Instance and Database Levels - even the work of the DBAs. How do you check logins and permissions? How do you script an enterprise-wide audit solution? How can you hope to analyse the log data you get? Feodor gets you started.… Read more
It often pays to use a tool like R, in conjunction with a relational database, to quickly perform a range of analyses and produce graphs, in order to ensure that you're answering the right question, to explore alternative hypotheses, or to provide insight into a problem. Feodor demonstrates how to quickly, and interactively, explore the ways that customers purchase goods and services using cohort analysis.… Read more
What's the best way for a SQL programmer to learn about R? It's probably by trying out various experiments with an interesting set of data, and some helpful suggestions that point out the parallels with SQL. Feodor provides the data and the helpful suggestions. The rest is up to you.… Read more
R and SQL Server are a match made in heaven. You don't need anything special to get started beyond the basic instructions. Once you have jumped the hurdle of reliably and quickly transferring data between R and SQL Server you are ready to discover the power of a relational database when combined with statistical computing and graphics.… Read more
OLTP databases work best when data that is no longer current is transferred to a separate database for analysis and reporting. There are many ways to do this, but Feodor describes a rapid technique that takes advantage of partitions to automate the rotation of the data and move it to the analysis server.… Read more
ETL (Extract, Transform, Load) doesn't have to be like a spell in hell. To make a success of ETL systems, you need the freedom and ability to make graceful U-turns when you detect a mistake in architecture or configuration: to fix the root problem rather than to merely tackle the symptoms. Feodor lists the eight most common root causes of failure in ETL systems, and how to fix them.… Read more
For data to be usefully analyzed, it must be consistent, accurate, and trustworthy. When incoming data is non-uniform, duplicated records are created and the data starts losing its value. In order to counteract this issue, SQL Server's Data Quality Services (DQS) helps monitor and maintain incoming data, and deduplicates existing data using rules-based matching. Feodor Georgiev provides a thorough walkthrough on setting up DQS and creating the rules it uses to function as a first step towards data cleansing.… Read more
The Project Deployment Model introduced in SSIS 2012, which was explained in the first part of this series, speeds up the deployment of database projects in which there may be hundreds of SSIS packages per project. Not only that, but deployments can be configured differently for each environment, such as test and staging, and there are now ways of monitoring the status and performance of packages and of versioning the SSIS Catalog.… Read more
It used to be that SQL Server Integration Services (SSIS) packages had to be deployed individually. Now, they can all be deployed together from a single file by means of the Project Deployment Model introduced in SSIS 2012. Where there are tens or even hundreds of SSIS packages to deploy, this system is essential. Feodor Georgiev talks us through the basics in the first of a three-part series.… Read more
If you have a number of SQL Server instances with versions ranging from 2005 upwards, with a whole host of databases, and you want to be alerted about a number of diverse events that are useful for first-line problem-diagnosis and auditing, then Feodor's homebrew solution, using SSIS and Robocopy, is likely to be what you're looking for.… Read more
The default trace is still the best way of getting important information to provide a security audit of SQL Server, since it records such information as logins, changes to users and roles, changes in object permissions, error events and changes to both database settings and schemas. The only trouble is that the information is volatile. Feodor shows how to squirrel the information away to provide reports, check for unauthorised changes and provide forensic evidence.… Read more
In most databases, a very small percentage of data changes between SQL Server backups. We tolerate this because it keeps things simple, and we have grown used to cheap storage and fast networks. To exploit the value of Cloud services, should we be rethinking this in favour of backup strategies that optimise resilience and minimise downtime?… Read more
One way of getting the advantages of the Cloud without having to migrate the entire database is to just maintain a copy of the data that needs to be accessible to internet-based users in Windows Azure SQL Database. There are various ways of keeping the two in sync, and Feodor describes a solution based on SSIS.… Read more
Data Protection and Disaster Recovery (DR) are IT tasks that seldom get the same level of attention as development... until disaster strikes. Only if planning is adequate can an organisation be resilient in the face of unexpected problems. There are several steps that are needed to achieve an adequate DR process and the ability to restore business operations after a disaster.… Read more