{"id":108932,"date":"2026-04-09T14:02:00","date_gmt":"2026-04-09T14:02:00","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=108932"},"modified":"2026-03-03T16:34:23","modified_gmt":"2026-03-03T16:34:23","slug":"data-analysis-in-python-and-esproc-spl-compared-what-are-the-differences-and-which-is-best","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/development\/data-analysis-in-python-and-esproc-spl-compared-what-are-the-differences-and-which-is-best\/","title":{"rendered":"Data analysis in Python and esProc SPL compared &#8211; what are the differences, and which is best?"},"content":{"rendered":"\n<p><strong>In this article, part 3 of the <a href=\"https:\/\/www.red-gate.com\/simple-talk\/collections\/moving-from-python-to-esproc-spl\/\" target=\"_blank\" rel=\"noreferrer noopener\">&#8220;Moving from Python to esProc SPL&#8221;<\/a> series, you&#8217;ll discover the key differences between Python and SPL. Whether you&#8217;re considering adding SPL to your skillset or simply curious about alternative approaches to data analysis, this comparison will help you understand when and why you might choose one over the other.<\/strong><\/p>\n\n\n\n<p>In the <a href=\"https:\/\/www.red-gate.com\/simple-talk\/collections\/moving-from-python-to-esproc-spl\/\" target=\"_blank\" rel=\"noreferrer noopener\">first two articles<\/a>, we looked at setting up the <a href=\"https:\/\/www.esproc.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">esProc SPL<\/a> environment and its syntax and data structures. Now that you have a foundation in SPL basics, it&#8217;s time to address a question some data analysts may ask: &#8220;How does SPL compare to <a href=\"https:\/\/www.python.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Python<\/a>, and why might I want to add it to my toolkit?&#8221;<\/p>\n\n\n\n<p>As a Python developer, you&#8217;ve likely mastered libraries like <a href=\"https:\/\/pandas.pydata.org\/\" target=\"_blank\" rel=\"noreferrer noopener\">Pandas<\/a> for data analysis. Python&#8217;s flexibility and extensive ecosystem make it a tool for everything from <a href=\"https:\/\/www.tableau.com\/learn\/articles\/what-is-data-cleaning\" target=\"_blank\" rel=\"noreferrer noopener\">data cleaning<\/a> to machine learning. However, esProc SPL offers a different approach to data processing that can be more intuitive and efficient for certain tasks.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-programming-paradigms-do-python-and-esproc-spl-use\">What programming paradigms do Python and esProc SPL use?<\/h2>\n\n\n\n<p>The first and most fundamental difference between Python and SPL are their programming paradigms. Python follows an <a href=\"https:\/\/en.wikipedia.org\/wiki\/Imperative_programming\" target=\"_blank\" rel=\"noreferrer noopener\">imperative programming<\/a> model, where you specify a sequence of operations to transform data. SPL, on the other hand, uses a <a href=\"https:\/\/devopedia.org\/dataflow-programming\" target=\"_blank\" rel=\"noreferrer noopener\">dataflow programming<\/a> model, where you define a series of steps that data flows through.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"670\" height=\"478\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-71.png\" alt=\"An image showing the differences between Python and SPL's programming paradigms.\" class=\"wp-image-108933\" style=\"width:670px;height:auto\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-71.png 670w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-71-300x214.png 300w\" sizes=\"auto, (max-width: 670px) 100vw, 670px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-python-s-imperative-programming-model-explained\">Python&#8217;s imperative programming model explained<\/h3>\n\n\n\n<p>In Python, you typically write code that explicitly states how to perform operations:<\/p>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">import pandas as pd\n\n# Load data\nsales_data = pd.read_csv(\"sales.csv\")\n\n# Filter data\nhigh_value_sales = sales_data[sales_data['AMOUNT'] &gt; 1000]\n\n# Group and aggregate\nregion_totals = high_value_sales.groupby('REGION')['AMOUNT'].sum().reset_index()\n\n# Sort results\nsorted_totals = region_totals.sort_values('AMOUNT', ascending=False)\n\n# Display results\nprint(sorted_totals)<\/pre><\/div>\n\n\n\n<p>In this Python example, you create a series of variables that hold the intermediate results of your data transformations. The focus is on how to perform each step, and you need to track the flow of data through these variables. If you want to see intermediate results, you need to explicitly print them.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-esproc-spl-s-data-flow-programming-model-explained\">esProc SPL&#8217;s data-flow programming model explained<\/h3>\n\n\n\n<p>In SPL, you define a sequence of cells, each representing a step in your data processing workflow:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.select(AMOUNT&gt;2000)<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>=A2.groups(REGION;sum(AMOUNT):TOTAL)<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 4<\/td><td>=A3.sort(TOTAL:-1)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>In SPL, each cell represents a transformation of the data, and the results flow naturally from one cell to the next. The focus is on what happens to the data at each step, rather than how to perform each operation. The results of each step are immediately visible in the IDE (<a href=\"https:\/\/aws.amazon.com\/what-is\/ide\/\" target=\"_blank\" rel=\"noreferrer noopener\">integrated development environment<\/a>) &#8211; unlike Python, where you must manually print intermediate results. This makes it easier to understand the data flow and verify that each step is working as expected in SPL.<\/p>\n\n\n\n<p>Additionally, instead of variables, SPL uses cell references like A1, A2, and A3 to represent each step in the workflow, making the data flow more structured and transparent. This approach contrasts with Python, where the sequence of operations determines the flow implicitly. <\/p>\n\n\n\n<p>By focusing on defining what needs to be done rather than detailing every step of how to do it, SPL can make data transformations more intuitive, especially when working with large datasets that require multiple processing steps.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-filtering-and-sorting-list-comprehensions-in-python-vs-spl-s-a-select-and-a-sort\">Filtering and sorting: list comprehensions in Python vs. SPL&#8217;s A.select() and A.sort()<\/h2>\n\n\n\n<p>Python offers multiple ways to filter and sort data, including <a href=\"https:\/\/docs.python.org\/3\/tutorial\/datastructures.html#:~:text=5.1.3.-,List%20Comprehensions,-%C2%B6\" target=\"_blank\" rel=\"noreferrer noopener\">list comprehensions<\/a> and Pandas methods. Let&#8217;s compare these with SPL&#8217;s approach.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-filtering-data-in-python-pandas-option-1\">Filtering data in Python\/Pandas (option 1)<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Using boolean indexing\nlaptops = sales_df[sales_df['PRODUCT'] == 'Laptop']\n\n# Using query method\nhigh_value_laptops = sales_df.query(\"PRODUCT == 'Laptop' and AMOUNT &gt; 1000\")\n\n# Using list comprehension (with records)\nlaptop_records = [row for row in sales_df.to_dict('records') if row['PRODUCT'] == 'Laptop']\n\nprint(f\"Number of laptop sales: {len(laptops)}\")\nprint(f\"Number of high-value laptop sales: {len(high_value_laptops)}\")<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-filtering-data-in-esproc-spl-option-1\">Filtering data in esProc SPL (option 1)<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.select(PRODUCT==&#8221;Laptop&#8221;)<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>=A2.len()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 18<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 4<\/td><td>=A2.select(AMOUNT&gt;1000)<\/td><td>Filtered for high-value laptops &nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5<\/td><td>=A4.len()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 12<\/td><td>&nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output shows that there are 18 laptop sales in total (A3) and 12 high-value laptop sales (A5). This means that two-thirds of all laptop sales (12 out of 18) are high-value sales over $1,000. SPL&#8217;s `select` method provides a concise way to filter data based on conditions, and the results are immediately visible in the IDE.<\/p>\n\n\n\n<section id=\"my-first-block-block_7659066a95d8335d3918aeae7cd17727\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Simple Talk is brought to you by Redgate Software<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            Take control of your databases with the trusted Database DevOps solutions provider. Automate with confidence, scale securely, and unlock growth through AI.                                    <\/div>\n            <\/div>\n                                            <a href=\"https:\/\/www.red-gate.com\/solutions\/overview\/\" class=\"btn btn--secondary btn--lg\" aria-label=\"Discover how Redgate can help you: Simple Talk is brought to you by Redgate Software\">Discover how Redgate can help you<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<h3 class=\"wp-block-heading\" id=\"h-filtering-data-in-python-pandas-option-2\">Filtering data in Python\/Pandas (option 2)<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Filtering with multiple conditions\ncomplex_filter = sales_df[\n    (sales_df['REGION'].isin(['East', 'West'])) &amp; \n    (sales_df['AMOUNT'] &gt; 1000) &amp; \n    ((sales_df['PRODUCT'] == 'Laptop') | (sales_df['PRODUCT'] == 'Server'))\n]\n\nprint(f\"Complex filter results: {len(complex_filter)}\")\nprint(complex_filter.head(3))\n<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-filtering-data-in-esproc-spl-option-2\">Filtering data in esProc SPL (option 2)<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>&nbsp;<\/strong><\/td><td><strong>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/strong>A<\/td><td><strong>&nbsp;<\/strong><\/td><\/tr><tr><td><strong>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/strong>1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><td>&nbsp;<\/td><\/tr><tr><td><strong>&nbsp;&nbsp;&nbsp;&nbsp; <\/strong>2<\/td><td>=A1.select((REGION==&#8221;East&#8221; || REGION==&#8221;West&#8221;) &amp;&amp; AMOUNT&gt;1000 &amp;&amp; (PRODUCT==&#8221;Laptop&#8221; || PRODUCT==&#8221;Server&#8221;))&nbsp; &nbsp;<\/td><td>&nbsp;<\/td><\/tr><tr><td><strong>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <\/strong>3<\/td><td>=A2.len()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 14 &nbsp;<\/td><\/tr><tr><td><strong>&nbsp;&nbsp;&nbsp;&nbsp; <\/strong>4<\/td><td>=A2.to(3)<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; First 3 rows of filtered data &nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output of A3 shows that there are 14 rows matching the complex filter. These are high-value sales (over $1,000) of laptops or servers in the East or West regions. The output of A4 might look like:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"210\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-72.png\" alt=\"An image showing what the output of A4 may look like.\" class=\"wp-image-108934\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-72.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-72-300x65.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-72-768x165.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<p>SPL&#8217;s syntax for complex conditions is more concise and readable than Pandas&#8217; syntax, especially for <a href=\"https:\/\/www.ibm.com\/docs\/en\/tnpm\/1.4.2?topic=rules-example-nested-conditions\" target=\"_blank\" rel=\"noreferrer noopener\">nested conditions<\/a>. The use of familiar operators like `==`, `&amp;&amp;`, and `||` makes the code more intuitive, particularly for those coming from programming languages like <a href=\"https:\/\/developer.mozilla.org\/en-US\/docs\/Web\/JavaScript\" target=\"_blank\" rel=\"noreferrer noopener\">JavaScript<\/a> or <a href=\"https:\/\/learn.microsoft.com\/en-us\/dotnet\/csharp\/\" target=\"_blank\" rel=\"noreferrer noopener\">C#<\/a>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-do-you-sort-data-in-python-pandas\">How do you sort data in Python\/Pandas?<\/h2>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Sort by a single column\nsorted_by_amount = sales_df.sort_values('AMOUNT', ascending=False)\n\n# Sort by multiple columns\nmulti_sorted = sales_df.sort_values(['REGION', 'AMOUNT'], ascending=[True, False])\n\nprint(\"Sorted by AMOUNT (descending):\")\nprint(sorted_by_amount.head(3))\nprint(\"\\nSorted by REGION (ascending) and AMOUNT (descending):\")\nprint(multi_sorted.head(3))<\/pre><\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-do-you-sort-data-in-esproc-spl\">How do you sort data in esProc SPL?<\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.sort(AMOUNT:-1)<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>=A2.to(3)<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 4<\/td><td>=A1.sort(REGION,AMOUNT:-1)<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5<\/td><td>=A4.to(3)<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output of A3 might look like:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"246\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-73.png\" alt=\"Image showing what the output of A3 might look like.\" class=\"wp-image-108935\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-73.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-73-300x76.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-73-768x194.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<p>This shows the three highest-value sales in the dataset, sorted in descending order by amount. The highest sale is a server for $2,550, followed by another server sale for $2,500.<\/p>\n\n\n\n<p>The output of A5 might look like:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"227\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-74.png\" alt=\"Image showing what the output of A5 might look like.\" class=\"wp-image-108936\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-74.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-74-300x70.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-74-768x179.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<p>This shows the first three rows sorted by region (alphabetically) and then by amount (descending). All three rows are from the East region, with the highest-value sales listed first. SPL&#8217;s `sort` method provides a concise way to sort data by multiple columns in different directions.<\/p>\n\n\n\n<p>When it comes to filtering and sorting, SPL offers a more concise and readable syntax compared to Pandas, particularly when dealing with complex conditions. It uses familiar logical operators like ==, &amp;&amp;, and ||, making expressions more intuitive, whereas Pandas relies on &amp; and |, which require careful use of parentheses to avoid errors. <\/p>\n\n\n\n<p>Sorting in SPL is also straightforward, as you can specify descending order with -1, while Pandas requires setting the ascending parameter to False. Both languages support chaining operations, but SPL\u2019s cell-based execution allows you to inspect intermediate results more easily, making debugging and analysis more transparent.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-grouping-and-aggregation-in-python-vs-esproc-spl-what-are-the-differences\">Grouping and aggregation in Python vs esProc SPL &#8211; what are the differences?<\/h2>\n\n\n\n<p>Grouping and <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/sql-data-aggregation-aggravation\/\" target=\"_blank\" rel=\"noreferrer noopener\">aggregation<\/a> are common operations in data analysis, and both Python\/Pandas and SPL provide powerful tools for these tasks. Let&#8217;s compare their approaches.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-grouping-and-aggregation-in-python-pandas-option-1\">Grouping and aggregation in Python\/Pandas (option 1)<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Group by REGION and sum AMOUNT\nregion_totals = sales_df.groupby('REGION')['AMOUNT'].sum().reset_index()\n\n# Group by REGION and calculate multiple aggregates\nregion_stats = sales_df.groupby('REGION').agg({\n    'AMOUNT': ['sum', 'mean', 'count']\n}).reset_index()\n\n# Rename columns for clarity\nregion_stats.columns = ['REGION', 'TOTAL', 'AVERAGE', 'COUNT']\n\nprint(\"Region Totals:\")\nprint(region_totals)\nprint(\"\\nRegion Statistics:\")\nprint(region_stats)<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-grouping-and-aggregation-in-esproc-spl-option-1\">Grouping and aggregation in esProc SPL (option 1)<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.groups(REGION;sum(AMOUNT):TOTAL)<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>=A1.groups(REGION; sum(AMOUNT):TOTAL, avg(AMOUNT):AVERAGE, count(AMOUNT):COUNT)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Multiple aggregates<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output of A2 might look like:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"292\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-75.png\" alt=\"Image showing what the output of A2 might look like.\" class=\"wp-image-108937\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-75.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-75-300x90.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-75-768x230.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<p>This shows the total sales amount for each region. The West region has the highest total sales at $37,730, followed by North at $34,370, East at $26,990, and South at $23,340.<\/p>\n\n\n\n<p>The output of A3:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"333\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-76.png\" alt=\"Image showing the output of A3.\" class=\"wp-image-108938\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-76.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-76-300x102.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-76-768x262.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<p>This provides a more comprehensive view of sales by region, including the total sales amount, average sale amount, and number of sales for each region. SPL&#8217;s `groups` method provides a concise way to calculate multiple aggregates in a single operation, with clear syntax for naming the resulting columns.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-grouping-and-aggregation-in-python-pandas-option-2\">Grouping and aggregation in Python\/Pandas (option 2)<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Group by multiple columns\nproduct_region = sales_df.groupby(['REGION', 'PRODUCT']).agg({\n    'AMOUNT': ['sum', 'count']\n}).reset_index()\n\n# Rename columns\nproduct_region.columns = ['REGION', 'PRODUCT', 'TOTAL', 'COUNT']\n\n# Filter groups after aggregation\nhigh_volume = product_region[product_region['COUNT'] &gt; 3]\n\nprint(\"Product sales by region (high volume only):\")\nprint(high_volume)<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-grouping-and-aggregation-in-esproc-spl-option-2\">Grouping and aggregation in esProc SPL (option 2)<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.groups(REGION, PRODUCT; sum(AMOUNT):TOTAL, count(AMOUNT):COUNT) &nbsp; &nbsp; Group by REGION and PRODUCT<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>&nbsp; =A2.select(COUNT&gt;3)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Filter for high-volume products<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output of A3 might look like:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"363\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-77.png\" alt=\"Image showing what the output of A3 might look like.\" class=\"wp-image-108939\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-77.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-77-300x112.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/02\/image-77-768x286.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<p>This shows the region-product combinations with more than 3 sales. Laptops are popular across all regions, with the highest total sales in the East region ($22,690). SPL&#8217;s approach to grouping and filtering is more concise and readable.<\/p>\n\n\n\n<section id=\"my-first-block-block_571f5d8eb3598b8a2a5b9e8e327aebe1\" class=\"my-first-block alignwide\">\n    <div class=\"bg-brand-600 text-base-white py-5xl px-4xl rounded-sm bg-gradient-to-r from-brand-600 to-brand-500 red\">\n        <div class=\"gap-4xl items-start md:items-center flex flex-col md:flex-row justify-between\">\n            <div class=\"flex-1 col-span-10 lg:col-span-7\">\n                <h3 class=\"mt-0 font-display mb-2 text-display-sm\">Enjoying this article? Subscribe to the Simple Talk newsletter<\/h3>\n                <div class=\"child:last-of-type:mb-0\">\n                                            Get selected articles, event information, podcasts and other industry content delivered straight to your inbox.                                    <\/div>\n            <\/div>\n                                            <a href=\"https:\/\/www.red-gate.com\/simple-talk\/subscribe\/\" class=\"btn btn--secondary btn--lg\" aria-label=\"Subscribe now: Enjoying this article? Subscribe to the Simple Talk newsletter\">Subscribe now<\/a>\n                    <\/div>\n    <\/div>\n<\/section>\n\n\n<h2 class=\"wp-block-heading\" id=\"h-what-is-python-and-esproc-spl-s-equivalent-to-the-sql-having-clause\">What is Python and esProc SPL&#8217;s equivalent to the SQL HAVING clause?<\/h2>\n\n\n\n<p>In SQL, the <a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/t-sql-programming-sql-server\/summarizing-data-using-group-by-and-having-clauses\/\" target=\"_blank\" rel=\"noreferrer noopener\">HAVING clause<\/a> filters groups after aggregation. Let&#8217;s compare how Python and SPL handle this:<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-does-the-having-clause-work-in-python-pandas\">How does the HAVING clause work in Python\/Pandas?<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Two-step process: group, then filter\nregion_totals = sales_df.groupby('REGION')['AMOUNT'].sum().reset_index()\nhigh_total_regions = region_totals[region_totals['AMOUNT'] &gt; 30000]\n\nprint(\"Regions with total sales over $30,000:\")\nprint(high_total_regions)<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-does-the-having-clause-work-in-esproc-spl\">How does the HAVING clause work in esProc SPL?<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.groups(REGION;sum(AMOUNT):TOTAL)&nbsp;&nbsp;&nbsp;&nbsp; Group by REGION and sum AMOUNT<\/td><td>&nbsp;<\/td><\/tr><tr><td>3<\/td><td>=A2.select(TOTAL&gt;30000)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/td><td>Filter for high-total regions &nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>SPL&#8217;s approach to post-aggregation filtering is similar to Pandas&#8217;. In esProc SPL, the groups method offers a clear and intuitive way to perform grouping and aggregation, separating grouping columns from aggregate expressions for better readability. <\/p>\n\n\n\n<p>Unlike Pandas, where renaming aggregated columns often requires an additional step, SPL allows direct renaming within the groups method using the : syntax. Both languages support filtering results after aggregation, but SPL&#8217;s approach tends to be more concise.<\/p>\n\n\n\n<p>Additionally, SPL simplifies calculating multiple aggregates in a single operation with a straightforward syntax for naming the resulting columns, reducing the need for extra processing steps.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-python-s-datetime-vs-esproc-spl-s-built-in-functions\">Python&#8217;s datetime vs. esProc SPL&#8217;s built-in functions<\/h2>\n\n\n\n<p>Date and time manipulation is a common task in data analysis. Let&#8217;s compare how Python and SPL handle these operations.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-do-basic-date-operations-work-in-python-pandas\">How do basic date operations work in Python\/Pandas?<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">import pandas as pd\nfrom datetime import datetime\n\n# Load CSV file and parse dates\nsales_df = pd.read_csv(\"sales.csv\", parse_dates=['DATE'])\n\n# Define date range\nstart_date = datetime(2023, 5, 1)\nend_date = datetime(2023, 5, 31)\n\n# Filter sales data within the date range\nmay_sales = sales_df[(sales_df['DATE'] &gt;= start_date) &amp; (sales_df['DATE'] &lt;= end_date)]\n\n# Display the number of sales in May 2023\nprint(f\"Number of sales in May 2023: {len(may_sales)}\")<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-do-basic-date-operations-work-in-esproc-spl\">How do basic date operations work in esProc SPL?<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.select(DATE&gt;=date(&#8220;2023-05-01&#8221;) &amp;&amp; DATE&lt;=date(&#8220;2023-05-31&#8221;))&nbsp;&nbsp;<\/td><td>Filter for May 2023 &nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>=A2.len()&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/td><td>28<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output of A3 shows that there are 28 sales in May 2023. SPL&#8217;s date functions make it easy to filter data by date range, with a syntax that&#8217;s similar to filtering by other types of values.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-extracting-date-components-in-python-vs-esproc-spl\">Extracting date components in Python vs esProc SPL<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-does-extracting-date-components-work-in-python-pandas\">How does extracting date components work in Python\/Pandas?<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Extract date components\nsales_df['YEAR'] = sales_df['DATE'].dt.year\nsales_df['MONTH'] = sales_df['DATE'].dt.month\nsales_df['DAY'] = sales_df['DATE'].dt.day\n\n# Group by month\nmonthly_sales = sales_df.groupby('MONTH')['AMOUNT'].sum().reset_index()\n\nprint(\"Monthly sales totals:\")\nprint(monthly_sales)<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-does-extracting-date-components-work-in-esproc-spl\">How does extracting date components work in esProc SPL?<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()&nbsp;<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.groups(month(DATE):MONTH; sum(AMOUNT):TOTAL)<\/td><td>&nbsp; &nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output of A2 will show the total sales for each month. SPL&#8217;s date functions make it easy to extract components from dates and use them for grouping and analysis.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-is-the-data-arithmetic-for-extracting-date-components-in-python-pandas\">What is the data arithmetic for extracting date components in Python\/Pandas?<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Add 30 days to each date\nsales_df['FUTURE_DATE'] = sales_df['DATE'] + timedelta(days=30)\n\n# Calculate days between dates\ntoday = datetime.now().date()\nsales_df['DAYS_AGO'] = (today - sales_df['DATE'].dt.date).dt.days\n\nprint(\"Dates with days ago:\")\nprint(sales_df[['DATE', 'DAYS_AGO']].head(3))<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-is-the-data-arithmetic-for-extracting-date-components-in-esproc-spl\">What is the data arithmetic for extracting date components in esProc SPL?<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct()<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.derive(FUTURE_DATE:date(DATE).addDays(30))&nbsp;&nbsp;<\/td><td>Add 30 days to each date &nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp; 3<\/td><td>=A2.derive(interval(DATE,now()):DAYS_AGO)&nbsp;&nbsp;<\/td><td>Calculate days between dates &nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp; 4<\/td><td>=A3.to(3).new(DATE,DAYS_AGO).peek(3)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/td><td>First 3 rows<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output of A4 shows the original date and the number of days between that date and today. SPL&#8217;s `interval@d` function calculates the number of days between two dates, similar to subtracting dates in Pandas.<\/p>\n\n\n\n<p>When working with dates, both Pandas and esProc SPL offer a range of functions, but their approaches differ. In Pandas, you need to parse dates when loading data, while in SPL, the string dates will be automatically converted into date objects. SPL provides built-in functions like year(), month(), and day() that are directly called on date objects, whereas Pandas requires using the .dt accessor for similar operations. <\/p>\n\n\n\n<p>Date arithmetic is also handled differently: SPL uses methods like elapse(), while Pandas relies on operators combined with timedelta. When calculating intervals between dates, SPL offers the interval@d function to return the number of days between two dates, which is similar to subtracting date objects in Pandas.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-string-manipulation-in-python-and-esproc-spl-what-are-the-differences-and-how-does-it-work-in-each\">String manipulation in Python and esProc SPL: what are the differences, and how does it work in each?<\/h2>\n\n\n\n<p><a href=\"https:\/\/www.red-gate.com\/simple-talk\/databases\/sql-server\/learn\/robyn-pages-sql-server-string-manipulation-workbench\/\" target=\"_blank\" rel=\"noreferrer noopener\">String manipulation<\/a> is another common task in data analysis. Let&#8217;s compare how Python and SPL handle these operations.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-are-the-basic-string-operations-in-python-pandas\">What are the basic string operations in Python\/Pandas?<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Convert to uppercase\nsales_df['REGION_UPPER'] = sales_df['REGION'].str.upper()\n\n# Extract substring\nsales_df['REGION_FIRST_3'] = sales_df['REGION'].str[:3]\n\n# Concatenate strings\nsales_df['PRODUCT_REGION'] = sales_df['PRODUCT'] + \" - \" + sales_df['REGION']\n\nprint(\"String operations:\")\nprint(sales_df[['REGION', 'REGION_UPPER', 'REGION_FIRST_3', 'PRODUCT_REGION']].head(3))<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-what-are-the-basic-string-operations-in-esproc-spl\">What are the basic string operations in esProc SPL?<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct(REGION)<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.derive(upper(REGION):REGION_UPPER,substr(REGION,1,3): REGION_FIRST_3,PRODUCT+&#8221; &#8211; &#8220;+REGION:PRODUCT_REGION)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/td><td>&nbsp; String operations &nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>&nbsp; =A2.to(3)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/td><td>&nbsp; First three rows<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>The output of A3 might look like:<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"229\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image.png\" alt=\"What the output of A3 could look like.\" class=\"wp-image-108945\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-300x70.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-768x180.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><\/figure>\n\n\n\n<p>This shows the results of various string operations on the REGION column. The `upper` function converts the region to uppercase, the `substr` function extracts the first three characters, and the `+` operator concatenates the product and region with a separator. SPL&#8217;s string functions are similar to Python&#8217;s, but with a more integrated approach that allows you to perform multiple operations in a single `derive` call.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-to-string-search-and-replace-in-python-and-esproc-spl\">How to string search and replace in Python and esProc SPL<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-do-you-string-search-and-replace-in-python-pandas\">How do you string search and replace in Python\/Pandas?<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \"># Check if string contains substring\nsales_df['HAS_EAST'] = sales_df['REGION'].str.contains('East')\n\n# Replace substring\nsales_df['REGION_MODIFIED'] = sales_df['REGION'].str.replace('East', 'Eastern')\n\nprint(\"String searching and replacing:\")\nprint(sales_df[['REGION', 'HAS_EAST', 'REGION_MODIFIED']].head(5))<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-do-you-string-search-and-replace-in-esproc-spl\">How do you string search and replace in esProc SPL?<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct(REGION)<\/td><td>&nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.derive(pos(REGION,&#8221;East&#8221;)&gt;0: HAS_EAST, replace(REGION,&#8221;East&#8221;,&#8221;Eastern&#8221;):REGION_MODIFIED)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/td><td>&nbsp; String searching and replacing &nbsp;<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>=A2.to(5)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<\/td><td>First five rows &nbsp;<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"975\" height=\"269\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-1.png\" alt=\"What the output of A3 might look like.\" class=\"wp-image-108946\" srcset=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-1.png 975w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-1-300x83.png 300w, https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2026\/03\/image-1-768x212.png 768w\" sizes=\"auto, (max-width: 975px) 100vw, 975px\" \/><figcaption class=\"wp-element-caption\"><em>What the output of A3 might look like<\/em><\/figcaption><\/figure>\n\n\n\n<p>This shows the results of string searching and replacing operations. The `pos` function returns the position of a substring within a string, and we use `pos(REGION,&#8221;East&#8221;)&gt;0` to check if the region contains &#8220;East&#8221;. The `replace` function replaces all occurrences of a substring with another string. SPL&#8217;s string functions provide similar capabilities to Python&#8217;s, but with a syntax that integrates well with other data operations.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-how-regular-expressions-work-in-python-and-esproc-spl\">How regular expressions work in Python and esProc SPL<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-do-regular-expressions-work-in-python-pandas\">How do regular expressions work in Python\/Pandas?<\/h3>\n\n\n\n<div class=\"wp-block-urvanov-syntax-highlighter-code-block\"><pre class=\"lang:tsql decode:true \">import re\n\n# Extract digits from product names\ndef extract_digits(text):\n    match = re.search(r'\\d+', str(text))\n    return match.group(0) if match else \"\"\n\nsales_df['PRODUCT_DIGITS'] = sales_df['PRODUCT'].apply(extract_digits)\n\nprint(\"Regular expression extraction:\")\nprint(sales_df[['PRODUCT', 'PRODUCT_DIGITS']].head(5))<\/pre><\/div>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-how-do-regular-expressions-work-in-esproc-spl\">How do regular expressions work in esProc SPL?<\/h3>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><thead><tr><td>&nbsp;<\/td><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; A<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 1<\/td><td>=file(&#8220;document\/sales.csv&#8221;).import@ct(PRODUCT)<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2<\/td><td>=A1.derive(PRODUCT.regex(&#8220;[^\\d]*(\\d+)&#8221;):PRODUCT_DIGITS)<\/td><\/tr><tr><td>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 3<\/td><td>=A2.to(3)<\/td><\/tr><\/thead><\/table><\/figure>\n\n\n\n<p>esProc SPL allows for efficient string manipulation and <a href=\"https:\/\/www.red-gate.com\/simple-talk\/blogs\/regular-expressions\/\" target=\"_blank\" rel=\"noreferrer noopener\">regular expression<\/a> operations directly within its functional framework. In this example, we first append a version number to each product name and then extract the digits using a regex-based approach. Unlike Python\u2019s Pandas, where string operations require the str accessor and method chaining, SPL applies transformations directly to data columns, making expressions more concise. <\/p>\n\n\n\n<p>And, while Python offers greater flexibility in handling complex regex patterns, SPL integrates string processing directly into data manipulation steps, reducing the need for additional function calls like `apply()&#8217;.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-summary-and-next-steps\">Summary and next steps<\/h2>\n\n\n\n<p>In this article, we&#8217;ve compared Python and esProc SPL across various aspects of data analysis, from basic operations to complex transformations. While both languages are good tools for data analysis, they approach the task from different perspectives.<\/p>\n\n\n\n<p>As a Python developer, adding SPL to your toolkit doesn&#8217;t mean abandoning Python. Instead, it gives you another perspective on data analysis and another tool for specific tasks where SPL&#8217;s approach might be more efficient or intuitive.<\/p>\n\n\n\n<p><a href=\"https:\/\/www.red-gate.com\/simple-talk\/collections\/moving-from-python-to-esproc-spl\/\" target=\"_blank\" rel=\"noreferrer noopener\">Click here for more in the &#8220;Moving from Python to esProc SPL&#8221; series.<\/a><\/p>\n\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>FAQs: Data analysis in Python and esProc SPL<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. Why should a Python user consider esProc SPL for data analysis?<\/h3>\n            <div class=\"faq-answer\">\n                <p>esProc SPL offers a different approach to data analysis that can be more intuitive for certain tasks. Its cell-based, data-flow programming model makes complex data transformations easier to understand and debug. The immediate visibility of results at each step helps you identify issues early and iterate quickly. While Python excels at general-purpose programming and has a vast ecosystem, esProc SPL can be more efficient for data transformation workflows, especially when working with tabular data.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. Does esProc SPL have built-in functions like Pandas?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Yes, esProc SPL provides built-in filtering, sorting, grouping, and aggregation functions.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">3. How does data grouping in esProc SPL compare to Python\u2019s `groupby()`?<\/h3>\n            <div class=\"faq-answer\">\n                <p>esProc SPL\u2019s `groups()` function achieves grouping and aggregation in one step.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">4. Is esProc SPL faster than Pandas for large datasets?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Performance depends on the specific operation and dataset size. esProc SPL is optimized for data processing operations and can be faster than Pandas for certain tasks, especially those involving complex transformations of tabular data. Also, its memory management is designed specifically for data processing, which can lead to better performance for large datasets. However, for large datasets that exceed memory capacity, both tools offer options for processing data in chunks or connecting to external databases.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>Compare esProc SPL and Python for data analysis. Discover how esProc simplifies grouping, filtering, and data processing compared to Pandas.&hellip;<\/p>\n","protected":false},"author":342096,"featured_media":105472,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[143526,146042],"tags":[159117,159374,5021],"coauthors":[159002],"class_list":["post-108932","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-development","category-python","tag-esproc-spl","tag-moving-from-python-to-esproc-spl","tag-python"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/108932","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/342096"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=108932"}],"version-history":[{"count":6,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/108932\/revisions"}],"predecessor-version":[{"id":108953,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/108932\/revisions\/108953"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media\/105472"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=108932"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=108932"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=108932"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=108932"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}