{"id":95113,"date":"2022-12-02T00:00:03","date_gmt":"2022-12-02T00:00:03","guid":{"rendered":"https:\/\/www.red-gate.com\/simple-talk\/?p=95113"},"modified":"2026-04-15T18:56:01","modified_gmt":"2026-04-15T18:56:01","slug":"subqueries-in-mysql","status":"publish","type":"post","link":"https:\/\/www.red-gate.com\/simple-talk\/databases\/mysql\/subqueries-in-mysql\/","title":{"rendered":"MySQL Subqueries: Scalar, Correlated, Row, Column, and Derived Tables"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\" id=\"h-executive-summary\">Executive Summary<\/h2>\n\n\n\n<p><strong>A MySQL subquery is a SELECT statement nested inside another SQL statement &#8211; used to filter results based on data from another query, compute intermediate values, or build derived tables. MySQL supports five main subquery patterns: scalar subqueries (return a single value for use in a WHERE, HAVING, or SELECT clause), correlated subqueries (reference the outer query &#8211; evaluated once per outer row), row subqueries (return a single row for multi-column comparison), column subqueries (return a single column for IN or ANY\/ALL comparisons), and derived tables (return a full table result used in a FROM clause like a temporary table). This article covers all five with working MySQL 8 examples.<\/strong><\/p>\n\n\n\n<p><strong>This article is part of Robert Sheldon's continuing series on Learning MySQL. To see all of the items in the series, <a href=\"https:\/\/www.red-gate.com\/simple-talk\/mysql-coding-basics\/\">click here<\/a>.<\/strong><\/p>\n\n\n\n\n<p>With that in mind, let\u2019s dive into the subquery and take a look at several different ones in action. In this article, I focus primarily on how the subquery is used in <code>SELECT<\/code> statements to retrieve data in different ways from one table or from multiple tables. Like the previous articles in this series, this one is meant to introduce you to the basic concepts of working with subqueries so you have a solid foundation on which to build your skills. Also like the previous articles, it includes a number of examples to help you better understand how to work with subqueries so you can start using them in your DML statements.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-preparing-your-mysql-environment\">Preparing your MySQL environment<\/h2>\n\n\n\n<p>For the examples in this article, I used the same database and tables that I used for the previous article. The database is named <code>travel<\/code> and it includes two tables: <code>manufacturers<\/code> and <code>airplanes<\/code>. However, the sample data I use for this article is different from the last article, so I recommend that you once again rebuild the database and tables to keep things simple for this article\u2019s examples. You can set up the database by running the following script:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">DROP DATABASE IF EXISTS travel;\n\nCREATE DATABASE travel;\nUSE travel;\nCREATE TABLE manufacturers (\n  manufacturer_id INT UNSIGNED NOT NULL AUTO_INCREMENT,\n  manufacturer VARCHAR(50) NOT NULL,\n  create_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,\n  last_update TIMESTAMP NOT NULL \n    DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,\n  PRIMARY KEY (manufacturer_id) )\nENGINE=InnoDB AUTO_INCREMENT=1001;\n\nCREATE TABLE airplanes (\n  plane_id INT UNSIGNED NOT NULL AUTO_INCREMENT,\n  plane VARCHAR(50) NOT NULL,\n  manufacturer_id INT UNSIGNED NOT NULL,\n  engine_type VARCHAR(50) NOT NULL,\n  engine_count TINYINT NOT NULL,\n  max_weight MEDIUMINT UNSIGNED NOT NULL,\n  wingspan DECIMAL(5,2) NOT NULL,\n  plane_length DECIMAL(5,2) NOT NULL,\n  parking_area INT GENERATED ALWAYS AS ((wingspan * plane_length)) STORED,\n  icao_code CHAR(4) NOT NULL,\n  create_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,\n  last_update TIMESTAMP NOT NULL \n    DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,\n  PRIMARY KEY (plane_id),\n  CONSTRAINT fk_manufacturer_id FOREIGN KEY (manufacturer_id) \n    REFERENCES manufacturers (manufacturer_id) ) \nENGINE=InnoDB AUTO_INCREMENT=101;<\/pre>\n\n\n\n<p>After you\u2019ve created the database, you can add the sample data so you can follow along with the exercises in this article. Start by running the following <code>INSERT<\/code> statement to add data to the <code>manufacturers<\/code> table:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">INSERT INTO manufacturers (manufacturer)\nVALUES ('Airbus'), ('Beagle Aircraft Limited'), ('Beechcraft'), ('Boeing'), \n  ('Bombardier'), ('Cessna'), ('Dassault Aviation'), ('Embraer'), ('Piper');<\/pre>\n\n\n\n<p>The statement adds nine rows to the <code>manufacturers<\/code> table, which you can confirm by querying the table. The <code>manufacturer_id<\/code> value for the first row should be <code>1001<\/code>. After you confirm the data in the <code>manufacturers<\/code> table, you can run the following <code>INSERT<\/code> statement to populate the <code>airplanes<\/code> table, using the <code>manufacturer_id<\/code> values from the <code>manufacturers<\/code> table:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">INSERT INTO airplanes \n  (plane, manufacturer_id, engine_type, engine_count, \n    wingspan, plane_length, max_weight, icao_code)\nVALUES\n  ('A340-600',1001,'Jet',4,208.17,247.24,837756,'A346'),\n  ('A350-800 XWB',1001,'Jet',2,212.42,198.58,546700,'A358'),\n  ('A350-900',1001,'Jet',2,212.42,219.16,617295,'A359'),\n  ('A380-800',1001,'Jet',4,261.65,238.62,1267658,'A388'),\n  ('A380-843F',1001,'Jet',4,261.65,238.62,1300000,'A38F'),\n  ('A.109 Airedale',1002,'Piston',1,36.33,26.33,2750,'AIRD'),\n  ('A.61 Terrier',1002,'Piston',1,36,23.25,2400,'AUS6'),\n  ('B.121 Pup',1002,'Piston',1,31,23.17,1600,'PUP'),\n  ('B.206',1002,'Piston',2,55,33.67,7500,'BASS'),\n  ('D.5-108 Husky',1002,'Piston',1,36,23.17,2400,'D5'),\n  ('Baron 56 TC Turbo Baron',1003,'Piston',2,37.83,28,5990,'BE56'),\n  ('Baron 58 (and current G58)',1003,'Piston',2,37.83,29.83,5500,'BE58'),\n  ('Beechjet 400 (same as MU-300-10 Diamond II)',1003,'Jet',2,43.5,48.42,15780,'BE40'),\n  ('Bonanza 33 (F33A)',1003,'Piston',1,33.5,26.67,3500,'BE33'),\n  ('Bonanza 35 (G35)',1003,'Piston',1,32.83,25.17,3125,'BE35'),\n  ('747-8F',1004,'Jet',4,224.42,250.17,987000,'B748'),\n  ('747-SP',1004,'Jet',4,195.67,184.75,696000,'B74S'),\n  ('757-300',1004,'Jet',2,124.83,178.58,270000,'B753'),\n  ('767-200',1004,'Jet',2,156.08,159.17,315000,'B762'),\n  ('767-200ER',1004,'Jet',2,156.08,159.17,395000,'B762'),\n  ('Learjet 24',1005,'Jet',2,35.58,43.25,13000,'LJ24'),\n  ('Learjet 24A',1005,'Jet',2,35.58,43.25,12499,'LJ24'),\n  ('Challenger (BD-100-1A10) 350',1005,'Jet',2,69,68.75,40600,'CL30'),\n  ('Challenger (CL-600-1A11) 600',1005,'Jet',2,64.33,68.42,36000,'CL60'),\n  ('Challenger (CL-600-2A12) 601',1005,'Jet',2,64.33,68.42,42100,'CL60'),\n  ('414A Chancellor',1006,'Piston',2,44.17,36.42,6750,'C414'),\n  ('421C Golden Eagle',1006,'Piston',2,44.17,36.42,7450,'C421'),\n  ('425 Corsair-Conquest I',1006,'Turboprop',2,44.17,35.83,8600,'C425'),\n  ('441 Conquest II',1006,'Turboprop',2,49.33,39,9850,'C441'),\n  ('Citation CJ1 (Model C525)',1006,'Jet',2,46.92,42.58,10600,'C525'),\n  ('EMB 175 LR',1008,'Jet',2,85.33,103.92,85517,'E170'),\n  ('EMB 175 Standard',1008,'Jet',2,85.33,103.92,82673,'E170'),\n  ('EMB 175-E2',1008,'Jet',2,101.67,106,98767,'E170'),\n  ('EMB 190 AR',1008,'Jet',2,94.25,118.92,114199,'E190'),\n  ('EMB 190 LR',1008,'Jet',2,94.25,118.92,110892,'E190');<\/pre>\n\n\n\n<p>The <code>manufacturer_id<\/code> values from the <code>manufacturers<\/code> table provide the foreign key values needed for the <code>manufacturer_id<\/code> column in the <code>airplanes<\/code> table. After you run the second <code>INSERT<\/code> statement, you can query the <code>airplanes<\/code> table to confirm that 35 rows have been added. The first row should have been assigned <code>101<\/code> for the <code>plane_id<\/code> value, and the <code>plane_id<\/code> values for the other rows should have been incremented accordingly.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-building-a-basic-scalar-subquery\">Building a basic scalar subquery<\/h2>\n\n\n\n<p>A scalar subquery is one that returns only a single value, which is then passed into the outer query through one of its clauses. The subquery is used in place of other possible expressions, such as a constants or column names. For example, the following <code>SELECT<\/code> statement (the outer query) includes a subquery in search condition of the <code>WHERE<\/code> clause:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT plane_id, plane\nFROM airplanes\nWHERE manufacturer_id = \n  (SELECT manufacturer_id FROM manufacturers \n    WHERE manufacturer = 'Beechcraft');<\/pre>\n\n\n\n<p>The subquery is the expression on the right side of the equal sign, enclosed in parentheses. A subquery must always be enclosed in parentheses, no matter where it\u2019s used in the outer statement.<\/p>\n\n\n\n<p>Make certain that your subquery does indeed return only one value, if that\u2019s what it\u2019s supposed to do. If the subquery were to return multiple values and your <code>WHERE<\/code> clause is not set up to handle them (as in this example), MySQL will return an error letting you know that you messed up.<\/p>\n\n\n\n<p>In this case, the subquery is a simple <code>SELECT<\/code> statement that returns the <code>manufacturer_id<\/code> value for the manufacturer named Beechcraft. This value, <code>1003<\/code>, is then passed into the <code>WHERE<\/code> clause as part of its search condition. If a row in the <code>airplanes<\/code> table contains a <code>manufacturer_id<\/code> value that matches <code>1003<\/code>, the row is included in the query results, which are shown in the following figure.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"800\" height=\"260\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-5.jpeg\" alt=\"\" class=\"wp-image-95114\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>The subquery in this example retrieves data from a second table, in this case, <code>manufacturers<\/code>. However, a subquery can also retrieve data from the same table, which can be useful if the data must be handled in different ways. For example, the subquery in the following <code>SELECT<\/code> statement retrieves the average <code>max_weight<\/code> value from the <code>airplanes<\/code> table:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT plane_id, plane, max_weight\nFROM airplanes\nWHERE max_weight &gt; \n  (SELECT AVG(max_weight) FROM airplanes);<\/pre>\n\n\n\n<p>The <code>WHERE<\/code> clause search condition in the outer statement uses the average to return only rows with a <code>max_weight<\/code> value greater than that average. If you were to run the subquery on its own, you would see that the average maximum weight is 227,499 pounds. As a result, the outer <code>SELECT<\/code> statement returns only those rows with a <code>max_weight<\/code> value that exceeds 227,499 pounds, as shown in the following figure.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"747\" height=\"403\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-6.jpeg\" alt=\"\" class=\"wp-image-95115\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>As I mentioned earlier in the article, you can use subqueries in statements other than <code>SELECT<\/code>. One of those statements is the <code>SET<\/code> statement, which lets you assign variable values. You can use a <code>SET<\/code> statement to define a value that can then be passed into other statements. For example, the following <code>SET<\/code> and <code>SELECT<\/code> statements implement the same logic as the previous example and return the same results:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SET @avg_weight = \n  (SELECT ROUND(AVG(max_weight)) FROM airplanes);\nSELECT plane_id, plane, max_weight\nFROM airplanes WHERE max_weight &gt; @avg_weight;<\/pre>\n\n\n\n<p>The <code>SET<\/code> statement defines a variable named <code>@avg_weight<\/code> and uses a subquery to assign a value to that variable. This is the same subquery that is in the previous example. The <code>SELECT<\/code> statement then uses that variable in its <code>WHERE<\/code> clause (in place of the original subquery) to return only those rows with a <code>max_weight<\/code> value greater than 227,499 pounds.<\/p>\n\n\n\n<p>The examples in this section focused on using scalar subqueries in <code>SELECT<\/code> statements, but be aware that you can also use them in <code>UPDATE<\/code> and <code>DELETE<\/code> statements, as well as the <code>SET<\/code> clause of the <code>UPDATE<\/code>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-working-with-correlated-subqueries\">Working with correlated subqueries<\/h2>\n\n\n\n<p>One of the most valuable features of a subquery is its ability to reference the outer query from within the subquery. Referred to as a correlated subquery, this type of subquery can return data that is specific to the current row being evaluated by the outer statement. For example, the following <code>SELECT<\/code> statement uses a correlated subquery to calculate the average weight of the planes for each manufacturer, rather than for all planes:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT m.manufacturer_id, m.manufacturer, \n  a.plane_id, a.plane, a.max_weight\nFROM airplanes a INNER JOIN manufacturers m\n  ON a.manufacturer_id = m.manufacturer_id\nWHERE a.max_weight &gt; \n  (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id);<\/pre>\n\n\n\n<p>Unlike the previous two examples, the subquery now includes a <code>WHERE<\/code> clause that limits the returned rows to those with a <code>manufacturer_id<\/code> value that matches the current <code>manufacturer_id<\/code> value in the outer query. This is accomplished by assigning an alias (<code>a<\/code>) to the table in the outer query and using that alias when referencing the table\u2019s <code>manufacturer_id<\/code> column within the subquery.<\/p>\n\n\n\n<p>In this case, I also assigned an alias (<code>a2<\/code>) to the table referenced within the subquery, but strictly speaking, you do not need to do this. I like to include an alias for consistency and code readability, but certainly take whatever approach works for you.<\/p>\n\n\n\n<p>To better understand how the subquery works logically (as opposed to how the optimizer might actually execute it), consider the first row in the <code>airplanes<\/code> table, which has a <code>manufacturer_id<\/code> value of <code>1001<\/code>.<\/p>\n\n\n\n<p>When the outer query evaluates the first row, it compares the <code>max_weight<\/code> value to the value returned by the subquery. To carry out this comparison, the database engine first matches the <code>manufacturer_id<\/code> value in the outer query to the <code>manufacturer_id<\/code> values returned by the subquery&#8217;s <code>SELECT<\/code> statement. It then finds all rows associated with the current manufacturer and returns the average <code>max_weight<\/code> value for that manufacturer, repeating the process for each manufacturer returned by the outer query.<\/p>\n\n\n\n<p>The following figure shows the results returned by the outer <code>SELECT<\/code> statement. The <code>max_weight<\/code> values are now compared only with the manufacturer-specific averages.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1466\" height=\"568\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-8.png\" alt=\"\" class=\"wp-image-95116\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>You can also use this same subquery in the <code>SELECT<\/code> clause as one of the column expressions. In the following example, I added the subquery after the <code>max_weight<\/code> column and assigned the alias <code>avg_weight<\/code> to the new column:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT manufacturer_id, plane_id, plane, max_weight,\n  (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id) AS avg_weight\nFROM airplanes a\nWHERE max_weight &gt; \n  (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id);<\/pre>\n\n\n\n<p>The new subquery works the same as in the preceding example. It returns the average weight only for the planes from the current manufacturer, as shown in the following figure.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1298\" height=\"548\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-7.jpeg\" alt=\"\" class=\"wp-image-95117\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>In the previous example, I used a subquery to create a generated column. However, you can use a subquery when creating an even more complex generated column. In the following example, I\u2019ve added a generated column named <code>amt_over<\/code>, which subtracts the average weight returned by the subquery from the weight in the <code>max_weight<\/code> column:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT manufacturer_id, plane_id, plane, max_weight,\n  (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id) AS avg_weight,\n  -- subtracts average weight from max weight\n  (max_weight - (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id)) AS amt_over\nFROM airplanes a\nWHERE max_weight &gt; \n  (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id);<\/pre>\n\n\n\n<p>The <code>avg_weight<\/code> column in the <code>SELECT<\/code> list is a generated column that uses a subquery to return the average weight of the current manufacturer. The <code>amt_over<\/code> column is also a generated column and it uses the same subquery to return the average weight. Only this time, the column subtracts that average from the <code>max_weight<\/code> column to return the amount that exceeds the average.<\/p>\n\n\n\n<p>The following figure shows the results now returned by the outer <code>SELECT<\/code> statement. As you can see, they include the <code>amt_over<\/code> generated column, which shows the differences in the weights.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1401\" height=\"537\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-8.jpeg\" alt=\"\" class=\"wp-image-95118\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>If you want, you can also retrieve the name of the manufacturer from the <code>manufacturers<\/code> table and include that in your <code>SELECT<\/code> clause, as shown in the following example:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT manufacturer_id, \n  (SELECT manufacturer FROM manufacturers m \n    WHERE a.manufacturer_id = m.manufacturer_id) manufacturer,\n  plane_id, plane, max_weight,\n  (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id) AS avg_weight,\n  (max_weight - (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id)) AS amt_over\nFROM airplanes a\nWHERE max_weight &gt; \n  (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id);<\/pre>\n\n\n\n<p>This statement is similar to the previous statement except that it adds the <code>manufacturer<\/code> column, a generated column. The new column uses a subquery to retrieve the name of the manufacturer from the <code>manufacturers<\/code> table, based on the <code>manufacturer_id <\/code>value returned by the outer query.<\/p>\n\n\n\n<p>In some cases, a subquery might not perform as well as other types of constructs. For example, MySQL can often optimize a left outer join better than a subquery that carries out comparable logic. If you\u2019re using a subquery to perform an operation that can be achieved in another way and are concerned about performance, you should consider testing both options under a realistic workload to determine which is the best approach.<\/p>\n\n\n\n<p>The following figure shows the results with the additional column, which I\u2019ve named <code>manufacturer<\/code>.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1658\" height=\"538\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-9.jpeg\" alt=\"\" class=\"wp-image-95119\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>One other detail I want to point out about using subqueries in the <code>SELECT<\/code> list is that you can include them even if you\u2019re grouping and aggregating data. For instance, the following <code>SELECT<\/code> statement includes a subquery that retrieves the name of the manufacturer associated with each group:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT manufacturer_id, \n  (SELECT manufacturer FROM manufacturers m \n    WHERE a.manufacturer_id = m.manufacturer_id) manufacturer,\n  COUNT(*) AS plane_amt\nFROM airplanes a\nWHERE engine_type = 'piston'\nGROUP BY manufacturer_id\nORDER BY plane_amt DESC;<\/pre>\n\n\n\n<p>The subquery in this statement is similar to those you\u2019ve seen in other examples, except that now you\u2019re dealing with aggregated data, so the subquery must use a column that is mentioned in the <code>GROUP<\/code> <code>BY<\/code> clause of the outer statement, which it does (the <code>manufacturer_id<\/code> column). The statement returns the results shown in the following figure.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"757\" height=\"235\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-10.jpeg\" alt=\"\" class=\"wp-image-95120\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>As you can see, the data has been grouped based on the <code>manufacturer_id<\/code> column, and the name of the manufacturer is included with each ID. In addition, the number of airplanes with piston engines is provided for each manufacturer, with the results sorted in descending order, based on that amount.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-working-with-a-row-of-data\">Working with a row of data<\/h2>\n\n\n\n<p>So far, all the examples in this article have returned scalar values, but your subqueries can also return multiple values, as noted earlier. For example, it might be useful to use a subquery to aggregate a table\u2019s data, calculate specific averages in that data (returned as a single row), and then retrieve rows from the same table that exceed those averages, which is what I\u2019ve done in the following <code>SELECT<\/code> statement:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT \n  (SELECT manufacturer FROM manufacturers m \n    WHERE a.manufacturer_id = m.manufacturer_id) manufacturer,\n\tplane_id, plane, max_weight, parking_area,\n  (SELECT ROUND(AVG(max_weight)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id) AS avg_weight,\n  (SELECT ROUND(AVG(parking_area)) FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id) AS avg_area\nFROM airplanes a\nWHERE (max_weight, parking_area) &gt; \n  (SELECT ROUND(AVG(max_weight)), ROUND(AVG(parking_area)) \n    FROM airplanes a2\n    WHERE a.manufacturer_id = a2.manufacturer_id);<\/pre>\n\n\n\n<p>Notice that the <code>WHERE<\/code> clause in the outer statement includes the <code>max_weight<\/code> and <code>parking_area<\/code> columns in parentheses. In MySQL, this is how you create what is called a <em>row constructor,<\/em> a structure that supports simultaneous comparisons of multiple values. In this case, the row constructor is compared to the results from the subquery, which returns two corresponding values. The first value is the average weight, as you saw earlier. The second value returns the average parking area, which is based on the values in the <code>parking_area<\/code> column.<\/p>\n\n\n\n<p>In both cases, the averages are specific to the current <code>manufacturer_id<\/code> value in the outer query. For the outer query to return a row, the two values in the row constructor must be greater than both corresponding values returned by the subquery. Notice also that the <code>SELECT<\/code> list now includes both the <code>max_weight<\/code> and <code>parking<\/code>_<code>area<\/code> columns, along with the average for each one. The outer statement returns the results shown in the following figure.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"1595\" height=\"533\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-11.jpeg\" alt=\"\" class=\"wp-image-95121\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>One thing you might have noticed about the subqueries in the preceding examples, particularly those used in the <code>WHERE<\/code> clauses, is that all the search conditions in those clauses use basic comparison operators, either equal (<code>=<\/code>) or greater than (<code>&gt;<\/code>). However, you can use any of the other comparison operators, including special operators such as <code>IN<\/code> and <code>EXISTS<\/code>, as you\u2019ll see in the next section.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-working-with-a-column-of-data\">Working with a column of data<\/h2>\n\n\n\n<p>The previous section covered row subqueries. A row subquery can return only a single row, although that row can include one or more columns. This is in contrast to a scalar subquery , which returns only a single row and single column. In this section, we\u2019ll look at the column subquery, which returns only a one column with one or more rows.<\/p>\n\n\n\n<p>As with row subqueries, column subqueries are often used in the <code>WHERE<\/code> clause when building your search conditions. Also like row subqueries, your search condition must take into account that the subquery is returning more than one value.<\/p>\n\n\n\n<p>For example, the <code>WHERE<\/code> clause in the following <code>SELECT<\/code> statement uses the <code>IN<\/code> operator to compare current the <code>manufacturer_id<\/code> value in the outer statement with the list of <code>manufacturer_id<\/code> values returned by the subquery:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT manufacturer_id, manufacturer \nFROM manufacturers\nWHERE manufacturer_id IN \n  (SELECT DISTINCT manufacturer_id FROM airplanes\n    WHERE engine_type = 'piston');<\/pre>\n\n\n\n<p>The column data returned by the subquery includes only those manufacturers that offer planes with piston engines (as reflected in the current data set). The <code>IN<\/code> operator determines whether the current <code>manufacturer_id<\/code> value is included in that list. If it is, that row is returned, as shown in the following figure.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"657\" height=\"211\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-12.jpeg\" alt=\"\" class=\"wp-image-95122\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>As with many operations in MySQL, you can take different approaches to achieve the same results. For example, you can replace the <code>IN<\/code> operator with an equal comparison operator, followed by the <code>ANY<\/code> keyword, as shown in the following example:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT manufacturer_id, manufacturer \nFROM manufacturers\nWHERE manufacturer_id = ANY \n  (SELECT DISTINCT manufacturer_id FROM airplanes\n    WHERE engine_type = 'piston');<\/pre>\n\n\n\n<p>You could have instead used another comparison operator, such as greater than (<code>&gt;<\/code>) or lesser than (<code>&lt;<\/code>) or even <code>ALL<\/code> instead of <code>ANY<\/code>. The statement returns the same results as the previous example. In fact, you can also achieve the same results by rewriting the entire statement as an inner join:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT DISTINCT m.manufacturer_id, m.manufacturer\nFROM manufacturers m INNER JOIN airplanes a\n  ON m.manufacturer_id = a.manufacturer_id\nWHERE a.engine_type = 'piston';<\/pre>\n\n\n\n<p>I\u2019m not going to go into joins here, but as I mentioned earlier, joins can sometimes provide performance benefits over subqueries, so you should be familiar with how they work and how they differ from subqueries (a topic that could easily warrant its own article).<\/p>\n\n\n\n<p>With that in mind, be aware that you can also use the <code>NOT<\/code> keyword with some operators to return different results. For example, the <code>WHERE<\/code> clause in the following <code>SELECT<\/code> statement uses the <code>NOT<\/code> <code>IN<\/code> operator ensure the current <code>manufacturer_id<\/code> value is not in the list of values returned by the subquery:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT manufacturer_id, manufacturer \nFROM manufacturers\nWHERE manufacturer_id NOT IN \n  (SELECT DISTINCT manufacturer_id FROM airplanes);<\/pre>\n\n\n\n<p>In this case, the subquery returns a distinct list of all <code>manufacturer_id<\/code> values in the <code>airplanes<\/code> table. The list is then compared to each <code>manufacturer_id<\/code> value in the <code>manufacturers<\/code> table, as specified by the outer query. If the value is not in the list, the search condition evaluates to true and the row is returned. In this way, you can determine which manufacturers are in the <code>manufacturers<\/code> table but are not in the <code>airplanes<\/code> table. The query results are shown in the following figure.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"638\" height=\"184\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-13.jpeg\" alt=\"\" class=\"wp-image-95123\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Be very careful when using <code>NOT<\/code> <code>IN<\/code> with your subqueries. If the subquery returns a <code>NULL<\/code> value, your <code>WHERE<\/code> expression will never evaluate to true.<\/p>\n\n\n\n<p>Another operator you can use in the <code>WHERE<\/code> clause is <code>EXISTS<\/code> (and its counterpart <code>NOT<\/code> <code>EXISTS<\/code>). The <code>EXISTS<\/code> operator simply checks whether the subquery returns any rows. If it does, the search condition evaluates to true, otherwise it evaluates to false. For example, the following <code>SELECT<\/code> statement defines similar logic as the preceding example, except that it checks for which manufacturers are included in both tables:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT manufacturer_id, manufacturer \nFROM manufacturers m\nWHERE EXISTS\n  (SELECT * FROM airplanes A\n    WHERE a.manufacturer_id = m.manufacturer_id);<\/pre>\n\n\n\n<p>Notice that you need only specify the <code>EXISTS<\/code> operator followed by the subquery. If the subquery returns a row for the current <code>manufacturer_id<\/code> value, the search condition evaluates true and the outer query returns a row for that manufacturer. The following figure shows the results returned by the statement.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"686\" height=\"328\" src=\"https:\/\/www.red-gate.com\/simple-talk\/wp-content\/uploads\/2022\/11\/word-image-14.jpeg\" alt=\"\" class=\"wp-image-95124\"\/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>When working with column subqueries, it\u2019s important to understand how to use comparison operators such as <code>IN<\/code>, <code>NOT<\/code> <code>IN<\/code>, <code>ANY<\/code>, <code>ALL<\/code>, and <code>EXISTS<\/code>. If you\u2019re not familiar with them, be sure to refer to the MySQL documentation to learn more.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-using-subqueries-in-the-from-clause\">Using subqueries in the <code><strong>FROM<\/strong><\/code> clause<\/h2>\n\n\n\n<p>In addition to rows, columns, and scalar values, subqueries can also return tables, which are referred to as derived tables. Like other subqueries, a table subquery must be enclosed in parentheses. In addition, it must also be assigned an alias, similar to specifying a table alias when building correlated subqueries. For example, the following <code>SELECT<\/code> statement includes a table subquery named <code>total_planes<\/code>, which is included in the <code>FROM<\/code> clause of the outer statement:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">SELECT ROUND(AVG(amount), 2) avg_amt\nFROM \n  (SELECT manufacturer_id, COUNT(*) AS amount \n  FROM airplanes\n  GROUP BY manufacturer_id) AS total_planes;<\/pre>\n\n\n\n<p>Notice that the outer statement does not specify a table other than the derived table returned by the subquery. The subquery itself groups the data in the <code>airplanes<\/code> table by the <code>manufacturer_id<\/code> values and then returns the ID and total number of planes in each group. The outer statement then finds the average number of planes across all groups. In this case, the statement returns a value of <code>5.00<\/code>.<\/p>\n\n\n\n<p>Now let\u2019s look at another example of a table subquery. Although this next example is similar to the previous one in several ways, it includes something you have not seen yet, one subquery nested within another. In this case, I\u2019ve nested a table subquery within another table subquery to group data based on custom categories and then find the average across those groups:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">-- outer SELECT calculates average count across all categories\nSELECT\n  ROUND(AVG(amount), 2) AS avg_amt\nFROM\n  -- outer subquery aggregates categories and calculates count for each one\n  (SELECT category, COUNT(*) amount \n  FROM \n    -- inner subquery categorizes planes based on parking area\n    (SELECT CASE\n      WHEN parking_area &gt; 50000 THEN 'A'\n      WHEN parking_area &gt;= 20000 THEN 'B'\n      WHEN parking_area &gt;= 10000 THEN 'C'\n      WHEN parking_area &gt;= 5000 THEN 'D'\n      WHEN parking_area &gt;= 1000 THEN 'E'\n      ELSE 'F'\n    END AS category\n    FROM airplanes) AS plane_size\n  GROUP BY category\n  ORDER BY category) AS plane_cnt;<\/pre>\n\n\n\n<p>The innermost subquery\u2014the one with the <code>CASE<\/code> expression\u2014assigns one of five category values (<code>A<\/code>, <code>B<\/code>, <code>C<\/code>, <code>D<\/code>, <code>E<\/code>, and <code>F<\/code>) to each range of <code>parking<\/code> <code>area<\/code> values. The subquery returns a derived table named <code>plane_size<\/code>, which contains a single column named <code>category<\/code>. The column contains a category value for each plane in the <code>airplanes<\/code> table.<\/p>\n\n\n\n<p>The data from the <code>plane_size<\/code> table is then passed to the outer subquery. This subquery groups the <code>plane_size<\/code> data based on the <code>category<\/code> values and generates a second column named <code>amount<\/code>, which provides the total number of planes in each category. The outer subquery returns a derived table named <code>plane_cnt<\/code>. The outer statement then finds the average number of planes across all groups in the derived table, returning a value of <code>5.83<\/code>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"h-working-with-mysql-subqueries\">Working with MySQL subqueries<\/h2>\n\n\n\n<p>Like many aspects of MySQL, the topic of subqueries is a much broader than what can be covered in a single article. To help you complete the picture, I recommend that you also check out the MySQL <a href=\"https:\/\/dev.mysql.com\/doc\/refman\/8.0\/en\/subqueries.html\">documentation on subqueries<\/a>, which covers all aspects of how to use subqueries. In the meantime, you should have learned enough here to get a sense of how subqueries work and some of the ways you can use them in your SQL statements. Once you have a good foundation, you can start building more complex subqueries and use them in statements other than <code>SELECT<\/code> queries. Just be sure to keep performance in mind and consider alternative statement strategies, when necessary, especially if working with larger data sets.<\/p>\n\n\n\n<section id=\"faq\" class=\"faq-block my-5xl\">\n    <h2>FAQs: Subqueries in MySQL<\/h2>\n\n                        <h3 class=\"mt-4xl\">1. What is a MySQL subquery and when should I use one?<\/h3>\n            <div class=\"faq-answer\">\n                <p>A subquery is a SELECT statement nested inside another SELECT, INSERT, UPDATE, or DELETE statement. Use subqueries when: a query&#8217;s condition depends on the result of another query (a customer&#8217;s orders where the customer lives in the same city as another customer); you need to compute an intermediate result to filter against; or you want a derived table in the FROM clause. When a JOIN would produce the same result and is more readable, prefer the JOIN &#8211; subqueries and joins are often interchangeable, but JOINs generally have better optimiser support in MySQL for simple cases.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">2. What is a correlated subquery in MySQL?<\/h3>\n            <div class=\"faq-answer\">\n                <p>A correlated subquery references columns from the outer query &#8211; it is evaluated once for each row processed by the outer query. Example: SELECT * FROM orders o WHERE amount &gt; (SELECT AVG(amount) FROM orders WHERE customer_id = o.customer_id) &#8211; the inner query calculates the average order amount for each specific customer and the outer query returns orders above that customer&#8217;s average. Correlated subqueries can be slow on large tables because of the per-row evaluation; for better performance, consider rewriting as a JOIN against a derived table or CTE that pre-computes the aggregation.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">3. What is a derived table in MySQL?<\/h3>\n            <div class=\"faq-answer\">\n                <p>A derived table is a subquery in the FROM clause that returns a full result set used like a temporary table. Example: SELECT * FROM (SELECT customer_id, COUNT(*) AS order_count FROM orders GROUP BY customer_id) AS customer_orders WHERE order_count &gt; 5. The subquery in parentheses (followed by an alias) creates a temporary named result set that the outer query filters. Derived tables are materialised in memory for the duration of the query. They are the non-CTE equivalent of a WITH clause expression &#8211; CTEs (using WITH) are generally more readable for complex derived table scenarios.<\/p>\n            <\/div>\n                    <h3 class=\"mt-4xl\">4. When should I use a MySQL subquery versus a JOIN or CTE?<\/h3>\n            <div class=\"faq-answer\">\n                <p>Use a subquery when the inner query produces a single value or list of values for a comparison (WHERE column IN (SELECT &#8230;)) &#8211; this is clean and readable. Use a JOIN when you need columns from both tables in the output &#8211; JOINs are generally better optimised and more readable for multi-table retrieval. Use a CTE (WITH clause) when the same derived result is used multiple times in the query, or when the query becomes deeply nested &#8211; CTEs improve readability significantly for complex subquery chains. In MySQL, CTEs and derived tables often produce the same execution plan; the choice is primarily about readability.<\/p>\n            <\/div>\n            <\/section>\n","protected":false},"excerpt":{"rendered":"<p>Complete MySQL subquery guide: scalar subqueries for single values, correlated subqueries that reference the outer query, row and column subqueries, and derived tables in the FROM clause &#8211; with working MySQL 8 examples.&hellip;<\/p>\n","protected":false},"author":221841,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":true,"footnotes":""},"categories":[53,145792],"tags":[],"coauthors":[6779],"class_list":["post-95113","post","type-post","status-publish","format-standard","hentry","category-featured","category-mysql"],"acf":[],"_links":{"self":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/95113","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/users\/221841"}],"replies":[{"embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/comments?post=95113"}],"version-history":[{"count":7,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/95113\/revisions"}],"predecessor-version":[{"id":109836,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/posts\/95113\/revisions\/109836"}],"wp:attachment":[{"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/media?parent=95113"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/categories?post=95113"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/tags?post=95113"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.red-gate.com\/simple-talk\/wp-json\/wp\/v2\/coauthors?post=95113"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}