20 September 2017

Guest post

This is a guest post from Coeo. Europe's most trusted analytics and data management expert, Coeo is the number one provider of database strategy in the Retail, Financial services and Gaming industries, and delivers technology strategy and support for businesses who need to get the most from their data.

The Coeo team hold more Microsoft certifications than any other data platform specialist in Europe and are passionate about sharing their knowledge and expertise to help customers become industry leaders.

Coeo

20 September 2017

1 Comment

The importance of bringing Python to SQL Server

Guest post

Python and SQL Server When SQL Server 2017 launched, Microsoft proudly announced that it was the first relational database engine to ship with built-in artificial intelligence (AI) capabilities.

In the future, database servers running its software won’t just be able to store, query, and analyse data. Instead, they’ll also be able to perform machine learning, natural language, and neural network processing – the foundations of modern AI. To make this happen, Microsoft has built support for the Python programming language into its database engine. But why Python and SQL Server? What is Python? And how does this compare to the R integration in SQL Server 2016?

What is Python?

Python is a general-purpose programming language, named after Monty Python. It favors code readability over formality, can be extended to do almost anything using libraries of additional functionality, and is popular with a very broad range of audiences. In case it risks sounding too lightweight to be powerful, history says otherwise – the Reddit.com web site is written using it and Google created a Python library that scales data science calculations to concurrently use thousands of processors. What makes it stand out from the crowd right now is the quantity and quality of AI related libraries for it. Very complex Artificial Intelligence functionality, such as machine learning, is available in relatively easy to consume add-ons, such as those listed here.

Why integrate Python and SQL Server?

One of the most likely reasons why Microsoft has now integrated Python into SQL Server is because it’s increasingly an application developer’s preferred language for performing AI operations. It’s broad language capabilities make it much easier for them to embed data analytics, machine learning and AI functionality into an existing application or database’s code – and then do something with the output, such as reformat it for further use.

This is a significant difference compared to other analytical languages, such as R, which are primarily used by data scientists to discover and analyse data in their own time, rather than perform near-real-time operations within an application. That’s not to say they couldn’t do that because they can, but it doesn’t change the fact that they’re more suited to being used for offline analysis rather than playing a part in application activity.

If anything sums the difference up best, then it’s a saying that the more you use the R language, the more of a PhD data scientist you are, while the more Python you use, the more of an engineer or developer you are. Similar capabilities yet different audiences, which is why Microsoft is now offering access to both.

How has Microsoft integrated it?

Microsoft has used the same integration approach for Python with SQL Server as it did for R. In a nutshell, that means the Python computation processes run outside of the database engine, but are connected to it using a very fast, secure and proprietary in-memory communications channel. This channel is used to send data from the database to the Python processes and return the results back to SQL Server. The Python code users want to execute, just like R code, is passed as a parameter to a T-SQL stored procedure. This means any application that can execute T-SQL can now also execute Python.

Finally, just like it did with R, Microsoft has provided a series of RevoScalePy packages which allow Python operations to use parallel processing and data sets without size limits.

Bringing the compute to the data

Despite historic reservations, bringing the compute to the data is the new model for enterprise-scale data analytics, machine learning and AI. Processor power in most database servers is now plentiful compared to just a few years ago, while the security and performance implications of sending data in and out of a database server are now far more problematic.

Another benefit is that this model allows all of the data in a database to be used as an AI model’s training data, whereas in data sent to the remote compute model, network bandwidth and remote compute resources often limited the amount, and therefore the accuracy, of the training data analyzed.

How soon?

Bringing Python capabilities to SQL Server is almost certainly a long term investment for Microsoft, just like integrating R was. It’ll take time for Python developers to consider how they can use a SQL Server platform while it’ll take time for SQL Server developers to consider how they can use Python. But neither are going away and innovation trends show us AI is being used in more and more everyday situations. It’s worth then remembering the capability is there.

Why data anonymization is important to organizations – and their customers

The rise of the digital era has led to increasing concerns among individuals around the protection of their personal data. In 2022, for example, 24% of individuals exercised their right to Data Subject Access Requests, and this rose to 28% in 2023 according to Statista research results. Alongside this has been the creation and... Read more

Data anonymization

James Wood

28 February 2024

James Wood

28 February 2024

Blog

5 things to look for in a database monitoring tool

Databases are the backbone of enterprise-level organizations, facilitating efficient data management, supporting critical business processes, and providing a foundation for secure innovation and growth. Effective database monitoring is critical for maintaining a high level of performance, security, and reliability in database systems, particularly in enterprise-level environments where large volumes of data and critical business... Read more

Maxime Drobot

11 August 2023

Maxime Drobot

11 August 2023

Blog

How to mitigate the challenges of data growth

Over the last decade, I’ve rarely met a data professional whose organization wasn’t experiencing data growth and making more demands of their data. We build and deploy new applications faster than we retire old ones, and new data is accumulating dramatically faster on our existing systems than our ability to decide to delete older... Read more

data growth

postgres

SQL Server

Redgate Test Data Manager

Flyway

Redgate Monitor

Overview

Test Data Management

Automate

Monitor

Redgate Blog

Guest post

The importance of bringing Python to SQL Server

Guest post

What is Python?

Why integrate Python and SQL Server?

How has Microsoft integrated it?

Bringing the compute to the data

How soon?

Further reading

Redgate Test Data Manager

Flyway

Redgate Monitor

Overview

Test Data Management

Automate

Monitor

Guest post

Guest post

What is Python?

Why integrate Python and SQL Server?

How has Microsoft integrated it?

Bringing the compute to the data

How soon?

Further reading

You may also like