Embedding IronRuby and the DLR into SQL Data Generator

You can embed DLR-based languages such as IronPython and IronRuby into applications. This gives you instant scripting, and with it, the potential for greatly simplifying the development process. You can even take advantage of existing applications that have a plug-in architecture, such as SQL Data Generator and NET Reflector. Be Hall takes SQL Data Generator and shows how this technique allows you to create simple generators specifically for your needs.

Embedding IronRuby and the DLR into SQL Data Generator

SQL Data Generator is already a very versatile application for developing databases. With a few clicks you can populate your database with realistic data. However, there is one problem – it is static.  The built-in generators do an amazing job, the SQL Statement generator, for example, will execute a SQL statement and use the results in the generation process. Yet, if you want to build a more complex structure and have more control over the data that you are generating, then you will have to write our own generator. Up to now, writing a plug-in generator meant doing a separate development that used a compiled .NET language in Visual Studio. Now, there is a neater way of doing this. Microsoft have released the Dynamic Language Runtime (DLR) which allows you to use dynamic languages together with the .Net platform. Part of this work includes Microsoft building implementations of dynamic languages including Python, known as IronPython, and Ruby, known as IronRuby. The result is that you can write Python or Ruby code while taking advantage of existing C# applications and the entire .Net framework.

This opens up an interesting opportunity for SQL Data Generator. By embedding a dynamic language into the application, we can give the users more control over the type of data that they want to generate. Users can now dynamically generate the type of data they wish based on their own requirements. They can still take advantage of the existing generators, but when required, they can write Ruby code in order to produce exactly the data they require.

In this article, I will discuss how to create a custom generator and how embedding the DLR and IronRuby language can open up new functionality. The compiled assembly and source code are attached to this article. Feel free to use the generator within your own projects and let us know what you think.

Creating the plug-in  Generator

The first task is to create the basic plug-in generator. These generators are simple to create and there are several examples that are already available on http://www.codeplex.com/SDGGenerators . First, we need a standard class library – in this case C#.  Creating a  generator in C#  is described in an article on our support pages.

Embedding the DLR Engine

With the basic plug-in generator in place, we can then start the fun task of embedding the DLR into the plug-in generator, and thereby into the SQL Data Generator application itself.  The DLR has a hosting API that is designed to allow languages to be embedded into C# applications. By making use of this, it is extremely simple to embed DLR-based languages into plug-in generators. The first task is to reference the DLR and IronRuby. The assemblies can be downloaded from www.codeplex.com/IronRuby. The first two assemblies are the IronRuby language implementation, while the second two is the DLR itself.

  • IronRuby.dll
    IronRuby.Libraries.dll
    Microsoft.Scripting.dll
    Microsoft.Scripting.Core.dll

In our constructor for the generator, we can now ask IronRuby to create us our DLR ScriptEngine.

The script engine has a number of responsibilities, one of which is to maintain different levels of scope which hold variables and method declarations. The other is to execute the actual code.

Defining Scope

Once we have access to the language engine, we can create our scope. This allows us to create variables to store values or create new methods to use, so as to start writing code as you would in a normal application. Within the application, we could have different scopes based on different parts of the application. In the case of SQL Data Generator, we will just have a single scope.  As with the engine, creating the scope is simple – we call the method CreateScope on the engine.

Executing IronRuby code

Once we have created our scope, we can execute our code. Before we can execute code using the DLR, we need to create a ScriptSource object. This is responsible for the actual execution. In a similar fashion to our Scope object, the engine is responsible for the creation of the ScriptSource based on a string containing the code we wish to execute. Along with the code, we need to tell the engine what type of code we want to execute.

The DLR handles a number of dynamic languages, each with different ways to execute code. It can, for example, execute code as part of an interactive session, from a file or a standalone statement. From my experience, the most common way is by the Statements option.  If you’re not sure how the dynamic language is executing code, then there is an AutoDetect property to allow the Engine to determine the type. In this example, we will be executing statements.

Once we have created our ScriptSource object we can call our Execute method. As an argument, we provide the appropriate Scope object to use during execution.

Because we are using the same scope, we could execute different blocks of code at different points in time, each with knowledge of the previous execution. This allows us to define variables in one context while use them in a different setting.

 Our execution method is completed as below. 

Within our application, we can now execute the IronRuby code and use the result. For example, given the execution below the result in the variable would be 2.

However,  we are quite likely to mistype the code we want to execute. If we execute invalid IronRuby code, an exception will be thrown from our ScriptSource object. There are a number of different methods which could be thrown; MethodMissingException or SyntaxErrorException, for example. Because we want to return all the errors back to the user, I will simply catch all the exceptions. I know this is wrong and we should use an application like Exception Hunter to identify and catch the correct exceptions: However, for the project in this article we can catch and report everything. The exception message property contains the reason why the code failed.  Within our generator, we need to add them to a collection so as to report errors back to the user

To be able to display the error on the UI we need to inherit from the IErrorsAndWarnings.

Within this in place, if our code errors during execution then we receive a nice notification.

697-image002.jpg

 We now have everything in place to allow us to execute IronRuby code from our C# application. This means we can use the built-in ruby functions, for example the Random number generator to obtain a value.

Within our GetEnumerator method we can use the response as the result to use during our data generation. But I will now show that we can be cleverer than this. The first part is handing control to the user.

Giving the users access

While it is fine to be able to execute hardcoded strings, the power comes by offering this functionality to the user. To add a custom UI onto our generator, we need to implement IGeneratorUIStyle and IGeneratorUI. In terms of implementation, I simply provide a textbox for the user to enter the code into. In order to know when to execute the code I check to see if a new line has been inserted, if it has, and the cursor location is at the end of the text box, then I execute my code.

Creating Ruby helper methods

We have now allowed our users to write and execute IronRuby code.  They  have the means to generate a value which we will use to populate the table. However, this isn’t very compelling. We have created an interesting concept but we have yet to take full advantage of it.

One of the ways to take advantage is to execute additional code on behalf of the user. For example, we could automatically assign a variable value for them via executing the Ruby code.

Another approach is to take advantage of our Scope to create and pre-define a variable available during execution.

We could even define Ruby methods which people can use when they are writing their own code.

While adding additional Ruby methods is great, it still leads to a disconnection between the C# world and the Ruby world within the application. We want the languages to be at one with each other.

Calling C# methods from Ruby

I want to take the approach of having additional methods available to my Ruby code but I want the methods to be written in C#.  The result is that users will be able to write more advanced code, while our methods will have more influence on how the data is generated.

To work the magic I use the Scope object to define a new variable. The value of the variable is the instance of generator itself.

In order to control the data being executed, I create a new method on my generator called set. This will set the data to use during the generation process. The method is shown below.

This allows us to write code such as Execute(“column.set [1,2,3,4,5,6,7,8,9,0]”); When the IronRuby code is executed, it will call the set method on our generator object passing in the array of ints as an IEnumberable object.  

This means that in our GetEnumerate method we can use this data as the basis for the data being generated.

697-image004.jpg

This is only the start of the power we can now provide. When GetEnumerator is called, a GenerationSession object is provided that has knowledge about the data within the other columns being generated for that particular row. Being able to access this information from a different column is extremely useful. In a similar approach to our column variable I created a second variable called table.  The reason I assigned the same object to two different variables is readability. We want the code being written in the generator UI to be as natural as possible – having appropriately named variables supports this. 

I then define a new method called row which takes in the column number. I also created an overload which takes in the column name and determines the index before calling the method.  This method accesses our Session object and returns the value in the row for that column.

However, this leads to a problem. Instead of writing IronRuby code to generate data for our entire column we are now just generating data for that particular column in that particular row. As we are now using single objects I created an overload method to convert our object into an IEnumerable to use during generation.

Within our GetEnumerator we loop forever: You don’t need to worry about this because the SQL Data Generator Engine knows how and when to stop us. We execute the IronRuby code, provided by the user, multiple times to simulate the effect of generating it once per row. I’ve kept the loop around the Data in as this allows the ‘generate’ to return multiple values per execution. 

If the code fails to execute, then the Data variable will be null; as such we only want DBNull to be used in order to alert the user to a problem.

The opportunities this opens

Once this generator is in place, it opens up some extremely exciting opportunities. It is simple to use Ruby to generate data. The real fun part comes when you start combining different sources and creating more interesting data from them.

Our RegEx generator allows you to combine the results of two columns. However, we can now combine our existing columns together, performing string manipulation in the process….

We can use the builtin Ruby options to generate data…

We can also start to perform more complex logic on the data we are generating. Unfortunately, we need to insert the value into a string before converting it into a float because row will return a SqlMoney object in this case…

We can also use conditional logic in order to generate our results. 

697-image006.jpg

Real World Examples

After completing the generator I decided to take a look on our support calls to find real problems problems the generator would solve. The first one I spotted was a standard use case.  The user had a table with a number of columns. The data within FileAs column was dependent on the IsCompany column.

If IsCompany is true, then it should populate the column with the value of the CompanyName column. Otherwise, it should be a combination of ‘LastName, FirstName’.  With our new generator this is a simple block of code.

The data required based on their own business requirements can now be generated.

A second example is a user who wants to populate each column within the table with slightly different combinations of data.  Each column within the table required a different set of data. Our built-in generators do not support being unable to combine different data sources. An example of the data they wanted to produce was this:

  • “First, add 1000 addresses of type “company address” and fill the columns A,B and C”
  • “Second, add 1000 addresses of type “private address” and fill the columns A,D and E”
  • “Third, add 1000 addresses of type “additional address” and fill the columns B,C and F”

By using our new generator, our user could easily produce data to match their requirements. By assigning each column with the IronRuby Generator we can control the data being produced. We could have our 1000 addresses as external text files and based on our requirements we could simply read the lines in using the standard ruby library. For example, this could be the code for column A.

While column B could be:

Once we have read the lines in, we combine the two arrays as the data to use during population.

While some of this functionality is provided out the box, the ability to use IronRuby as part of the data generation process definitely opens up new opportunities for the application and our users.

NOTE: The DLR requires .Net 2.0 SP1. As such, you will need SP1 installed to use this generator.

NOTE: This Plug-in generator is provided ‘as-is’ and is completely unsupported