Trimmed mean calculation in MySQL

I want to write a function that computes a simple trimmed mean in MySQL. The function will (obviously) be an aggregate function. I am new to writing functions in MySQL, so I could do with some help.

The algorithm of the trimmed mean will be as follows (pseudocode):

CREATE AGGREGATE FUNCTION trimmed_mean(elements DOUBLE[], trim_size INTEGER)
RETURNS DOUBLE
BEGIN
   -- determine number of elements
   -- ensure that number of elements is greater than 2 * trim_size else return error
   -- order elements in ASC order
   -- chop off smallest trim_size elements and largest trim_size elements
   -- calculate arithmetic average of the remaining elements
   -- return arithmetic average
END

Can anyone help with how to write the function above correctly, for use with MySQL?
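
As a language-neutral reference for checking any SQL implementation, the steps in the pseudocode can be sketched in plain Python. This only illustrates the algorithm and is not a MySQL function; raising `ValueError` for the error case and rejecting a negative `trim_size` are assumptions, since the pseudocode only says "return error".

```python
def trimmed_mean(elements, trim_size):
    """Average of `elements` after dropping the `trim_size` smallest
    and the `trim_size` largest values (illustrative sketch only)."""
    if trim_size < 0:
        raise ValueError("trim_size must be non-negative")
    n = len(elements)
    # The pseudocode requires more than 2 * trim_size elements.
    if n <= 2 * trim_size:
        raise ValueError("need more than 2 * trim_size elements")
    ordered = sorted(elements)                # ascending order
    kept = ordered[trim_size:n - trim_size]   # chop off both tails
    return sum(kept) / len(kept)              # arithmetic average


if __name__ == "__main__":
    # Drops 1 and 100, then averages 2, 3, 4.
    print(trimmed_mean([1, 2, 3, 4, 100], 1))  # 3.0
```

A trim size of 0 leaves the data untouched, so the result is the ordinary mean.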

Have a look at this example (for MySQL):

Create test table:

CREATE TABLE test_table (
  id INT(11) NOT NULL AUTO_INCREMENT,
  value INT(11) DEFAULT NULL,
  PRIMARY KEY (id)
);

INSERT INTO test_table(value) VALUES 
  (10), (2), (3), (5), (4), (7), (1), (9), (3), (5), (9);

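Now calculate the trimmed average (edited variant):

SET @trim_size = 3;

-- Number the rows in ascending order of value, then average only the
-- rows whose position lies inside the untrimmed middle range.
SELECT AVG(value) AS avg
FROM (
  SELECT value, @pos := @pos + 1 AS pos
  FROM (SELECT * FROM test_table ORDER BY value) t1,
       (SELECT @pos := 0) t2
) t
-- Once the derived table is built, @pos holds the total row count.
WHERE pos > @trim_size AND pos <= @pos - @trim_size;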

+--------+
| avg    |
+--------+
| 4.8000 |
+--------+
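
On MySQL 8.0 or later, window functions offer an alternative that avoids user variables (assigning them inside a SELECT is deprecated there) and does not rely on the ORDER BY of a derived table being honoured. Below is a minimal sketch of that idea, run through Python's built-in `sqlite3` module so that it is self-contained. The function name `trimmed_mean_sql` is hypothetical, SQLite 3.25+ is assumed for window-function support, and the table definition is adapted to SQLite syntax. The SELECT uses only `ROW_NUMBER()` and `COUNT(*) OVER ()`, which MySQL 8.0 also provides, but it has not been run against MySQL here.

```python
import sqlite3

# Same sample values as the test table above.
VALUES = [10, 2, 3, 5, 4, 7, 1, 9, 3, 5, 9]

# Window-function query: number rows by ascending value, carry the total
# row count on every row, and keep only the untrimmed middle positions.
# ":trim" is SQLite's named-parameter syntax; a MySQL client would bind
# the parameter differently.
QUERY = """
SELECT AVG(value)
FROM (
  SELECT value,
         ROW_NUMBER() OVER (ORDER BY value) AS pos,
         COUNT(*) OVER () AS n
  FROM test_table
) AS ranked
WHERE pos > :trim AND pos <= n - :trim
"""


def trimmed_mean_sql(trim_size, values=VALUES):
    """Return the trimmed mean computed by the query above,
    or None when trimming leaves no rows (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE test_table ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, value INTEGER)"
        )
        conn.executemany(
            "INSERT INTO test_table(value) VALUES (?)",
            [(v,) for v in values],
        )
        row = conn.execute(QUERY, {"trim": trim_size}).fetchone()
        return row[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(trimmed_mean_sql(3))
```

With the sample data and a trim size of 3, the rows kept are 3, 4, 5, 5 and 7, so the result should match the 4.8 shown above. If the trim size removes every row, `AVG` over the empty set yields NULL rather than an error.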

Comments
  • Is there a specific reason that you want to do this as a function, rather than as a query? Also, given that you preferred an answer to a previous question because it used standard SQL, will you need to be able to use this across multiple different RDBMSs (i.e. not just MySQL)?
  • @MarkBannister I intended to work with PG (my favorite db!), but I had to jump through too many hoops to get PG to work with PHP (recompiling PHP [or similar crazy asks] etc.), so I opted for MySQL, which I already have working with PHP. The reason I wanted it as a function is that I want to return a trimmed mean as a column in a query. I suppose (if I had an SQL solution) I could hack together some SQL to 'paste' the trimmed mean values as a column onto my returned dataset.
  • @MarkBannister: short answer to your question: an ANSI SQL version would be ideal. But since I happen to be working with MySQL, if I have to be db-centric, MySQL-flavored SQL will take precedence.
  • I think this question might be useful: stackoverflow.com/questions/8639073/…
  • I don't mind going down the C/C++ route (as a last resort), but I'd rather not, simply because I don't want to spend hours familiarising myself with MySQL internal data types etc. If there is a 'hello world' example for an aggregate written in C/C++, that would be a very good starting point (in reducing the learning curve), since what I want to do is relatively trivial (the algorithm part, that is).
  • In the link from Dems: "The MySQL source distribution includes a file sql/udf_example.c that defines 5 new functions."
  • Surely it ought to be 4.8 - the trim should eliminate one but not both of the 3s? i.e. ( not(1 + 2 + 3) + 3 + 4 + 5 + 5 + 7 + not(9 + 9 + 10) ) / 5
  • @Mark Bannister You are right. I missed the point "order elements in ASC order": the rows should be ordered by the value field. I have edited the query. Thanks ;-)