Comparing Sum Methods in C#


I am working on a section of a project that uses a large number of sum operations. These sums are applied to a DataTable.

To find the best method, I used the following test.

DataTable structure

class LogParser
{
     public DataTable PGLStat_Table = new DataTable();
     public LogParser()
     {
         PGLStat_Table.Columns.Add("type", typeof(string)); 
         PGLStat_Table.Columns.Add("desc", typeof(string)); 
         PGLStat_Table.Columns.Add("count", typeof(int));
         PGLStat_Table.Columns.Add("duration", typeof(decimal));
         PGLStat_Table.Columns.Add("cper", typeof(decimal));
         PGLStat_Table.Columns.Add("dper", typeof(decimal));
         PGLStat_Table.Columns.Add("occurancedata", typeof(string));  
     }       
}

The following code is used to fill the table:

LogParser pglp = new LogParser();
Random r2 = new Random();
for (int i = 1; i < 1000000; i++)
{
    int c2 = r2.Next(1, 1000);
    pglp.PGLStat_Table.Rows.Add("Type" + i.ToString(), "desc" + i , c2, 0, 0, 0, " ");
}
  • The sum is applied to the count column, which holds the value of c2.

The following methods are used to calculate the sum:

Method 1 using Compute

Stopwatch s2 = new Stopwatch();
s2.Start();
object sumObject;
sumObject = pglp.PGLStat_Table.Compute("Sum(count)", " ");
s2.Stop();
long d1 = s2.ElapsedMilliseconds;

Method 2 using Foreach loop

s2.Restart();
int totalcount = 0;
foreach (DataRow dr in pglp.PGLStat_Table.Rows)
{
   int c = Convert.ToInt32(dr["count"].ToString());
   totalcount = totalcount + c;
}
s2.Stop();
long d2 = s2.ElapsedMilliseconds;

Method 3 using Linq

s2.Restart();
var sum = pglp.PGLStat_Table.AsEnumerable().Sum(x => x.Field<int>("count"));
MessageBox.Show(sum.ToString());
s2.Stop();
long d3 = s2.ElapsedMilliseconds;

After comparison, the results are:

a) foreach is the fastest: 481 ms

b) LINQ is next: 1016 ms

c) Compute is the slowest: 2253 ms


Query 1

I accidentally changed c2 to i in the following statement:

 pglp.PGLStat_Table.Rows.Add("Type" + i.ToString(), "desc" + i , i, 0, 0, 0, " ");

The LINQ statement then produces an error:

Arithmetic operation resulted in an overflow.

Compute and the foreach loop, on the other hand, still complete the computation, although the result may be incorrect.

Is such behaviour a cause for concern, or am I missing a directive? (The computed figures are also large.)

Query 2

I was under the impression that LINQ would be the fastest. Is there an optimized method or parameter that makes it perform better?

thanks for advice

arvind

The fastest sum is the following (with a precomputed DataColumn and a direct cast to int):

  static int Sum(LogParser pglp)
  {
    var column = pglp.PGLStat_Table.Columns["count"];
    int totalcount = 0;
    foreach (DataRow dr in pglp.PGLStat_Table.Rows)
    {
      totalcount += (int)dr[column];
    }
    return totalcount;
  }

Timings (average of 3 runs):

00:00:00.1442297, for/each, by column, (int)
00:00:00.1595430, for/each, by column, Field<int>
00:00:00.6961964, for/each, by name, Convert.ToInt
00:00:00.1959104, linq, cast<DataRow>, by column, (int)

Other code:

  static int Sum_ForEach_ByColumn_Field(LogParser pglp)
  {
    var column = pglp.PGLStat_Table.Columns["count"];
    int totalcount = 0;
    foreach (DataRow dr in pglp.PGLStat_Table.Rows)
    {
      totalcount += dr.Field<int>(column);
    }
    return totalcount;
  }
  static int Sum_ForEach_ByName_Convert(LogParser pglp)
  {
    int totalcount = 0;
    foreach (DataRow dr in pglp.PGLStat_Table.Rows)
    {
      int c = Convert.ToInt32(dr["count"].ToString());
      totalcount = totalcount + c;
    }
    return totalcount;
  }
  static int Sum_Linq(LogParser pglp)
  {
    var column = pglp.PGLStat_Table.Columns["count"];
    return pglp.PGLStat_Table.Rows.Cast<DataRow>().Sum(row => (int)row[column]);
  }


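    var data = GenerateData();
    // Warm-up calls so JIT compilation is not included in the timings.
    Sum(data);
    Sum_Linq(data);
    // Number of timed runs averaged per method.
    var count = 3;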
    foreach (var info in new[]
      {
        new {Name = "for/each, by column, (int)", Method = (Func<LogParser, int>)Sum},
        new {Name = "for/each, by column, Field<int>", Method = (Func<LogParser, int>)Sum_ForEach_ByColumn_Field},
        new {Name = "for/each, by name, Convert.ToInt", Method = (Func<LogParser, int>)Sum_ForEach_ByName_Convert},
        new {Name = "linq, cast<DataRow>, by column, (int)", Method = (Func<LogParser, int>)Sum_Linq},
      })
    {
      var watch = new Stopwatch();
      for (var i = 0; i < count; ++i)
      {
        watch.Start();
        var sum = info.Method(data);
        watch.Stop();
      }
      Console.WriteLine("{0}, {1}", TimeSpan.FromTicks(watch.Elapsed.Ticks / count), info.Name);
    }
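
The harness above calls GenerateData(), but its body is not shown in the answer. A minimal sketch of what it might look like, assuming it fills the table the same way as the loop in the question (the class name BenchmarkData, the rowCount and seed parameters, and the trimmed three-column table are assumptions made for illustration):

```csharp
using System;
using System.Data;

// Same idea as the LogParser class in the question, trimmed to three columns.
class LogParser
{
    public DataTable PGLStat_Table = new DataTable();

    public LogParser()
    {
        PGLStat_Table.Columns.Add("type", typeof(string));
        PGLStat_Table.Columns.Add("desc", typeof(string));
        PGLStat_Table.Columns.Add("count", typeof(int));
    }
}

static class BenchmarkData
{
    // Hypothetical stand-in for GenerateData(); rowCount and seed are assumptions.
    public static LogParser GenerateData(int rowCount = 1000000, int seed = 12345)
    {
        var pglp = new LogParser();
        var random = new Random(seed); // fixed seed so repeated runs see the same data
        for (int i = 1; i <= rowCount; i++)
        {
            // Values stay in [1, 999], so a million rows sum to at most
            // 999,000,000, which still fits in an int.
            pglp.PGLStat_Table.Rows.Add("Type" + i, "desc" + i, random.Next(1, 1000));
        }
        return pglp;
    }
}
```

Keeping the values small matters here: every method in the comparison accumulates into an int, so the test data has to stay below Int32.MaxValue in total for the timings to compare like with like.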


Well, you could improve the LINQ example a bit (AsEnumerable), but this is expected behavior: LINQ to Objects cannot be faster than a plain loop (you might do even better with a for (var i = ...) loop instead of foreach). I guess what you meant to use was LINQ to SQL; there the aggregation (sum) is done in the database and should be faster. But you don't seem to be using database data...
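
The index-based loop suggested here could be sketched as follows. This is only an illustration: the names IndexedSum and SumByIndex are made up, and whether it actually beats foreach on a DataRowCollection is not established by this thread, so it should be measured rather than assumed.

```csharp
using System;
using System.Data;

static class IndexedSum
{
    // Sums an int column with an index-based for loop instead of foreach.
    public static int SumByIndex(DataTable table, string columnName)
    {
        DataColumn column = table.Columns[columnName]; // resolve the column once
        DataRowCollection rows = table.Rows;
        int total = 0;
        for (int i = 0; i < rows.Count; i++)
        {
            total += (int)rows[i][column]; // direct unboxing cast, no string round trip
        }
        return total;
    }

    static void Main()
    {
        var table = new DataTable();
        table.Columns.Add("count", typeof(int));
        for (int i = 1; i <= 10; i++)
        {
            table.Rows.Add(i);
        }
        Console.WriteLine(SumByIndex(table, "count")); // prints 55
    }
}
```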


Query 1.

As the documentation states, the Enumerable.Sum extension method throws an OverflowException on integer overflow. DataTable.Compute has no such check, and neither do the integer operations you use in Method 2 (C# integer arithmetic is unchecked by default, so it wraps around silently).
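
The difference can be reproduced without a DataTable. The sketch below uses a plain int array holding 1 to 999999 (the values the question's loop stores when c2 is replaced by i); the class and method names are made up for illustration. It shows the silent wrap-around of an unchecked loop, the exception from a checked loop and from Enumerable.Sum, and the correct result obtained by accumulating into a long.

```csharp
using System;
using System.Linq;

static class OverflowDemo
{
    // Plain int accumulation; unchecked is the C# default and is spelled out
    // here so the behaviour does not depend on compiler settings.
    public static int SumUnchecked(int[] values)
    {
        int total = 0;
        unchecked
        {
            foreach (int v in values) total += v;
        }
        return total;
    }

    // The same loop in a checked block throws OverflowException on overflow.
    public static int SumChecked(int[] values)
    {
        int total = 0;
        checked
        {
            foreach (int v in values) total += v;
        }
        return total;
    }

    // Widening each value to long avoids the overflow for this data.
    public static long SumAsLong(int[] values)
    {
        return values.Sum(v => (long)v);
    }

    static void Main()
    {
        int[] values = Enumerable.Range(1, 999999).ToArray();

        Console.WriteLine(SumAsLong(values));                         // prints 499999500000
        Console.WriteLine(SumUnchecked(values) == SumAsLong(values)); // prints False

        try { values.Sum(); }
        catch (OverflowException) { Console.WriteLine("Enumerable.Sum overflowed"); }

        try { SumChecked(values); }
        catch (OverflowException) { Console.WriteLine("checked loop overflowed"); }
    }
}
```

If the totals can exceed Int32.MaxValue, accumulating into a long (or wrapping the loop in checked) is the safer choice; the fast foreach variants keep their speed advantage only as long as a wrong result is not acceptable.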

UPDATE: Query 2.

I was under the impression that LINQ would be the fastest. Is there an optimized method or parameter that makes it perform better?

AFAIK, there is no way to optimize the summation algorithm itself (without using parallel computing). LINQ takes about twice as long as foreach, so I don't think the issue is LINQ performance but rather the inefficiency of Compute (note that there is overhead for interpreting the query string).


Comments
  • I accept this as the answer because you showed another way to calculate the result.
  • I mean, it seems to throw an exception for big figures. I even tried changing the random min/max to a bigger range (1, 100000); both foreach and Compute give correct results, but the LINQ command throws an error. Although a database is not used in this scenario, I will check whether this error is also produced for a table column.
  • I agree LINQ may be better in SQL-related computations, and maybe when there is an orderby and a where clause.
  • It is the same after a certain range. Please note that the int type is still used for foreach and Compute, and they both calculate and produce correct results.
  • I would like to add that the overall performance difference is really big when summing at least 5 columns for a frequently used report. Foreach performs almost 3x better than LINQ and 5x better than Compute, although Compute improves if the same computation is applied to the same column again, but that rarely occurs within one report.
  • @arvind, what do you mean by correct? When you change c2 to i, you get the sum of all integers from 1 to 1000000. It's 500,000,500,000, while Int32.MaxValue is 2,147,483,647. The difference is that LINQ detects the overflow, while Compute and integer summation silently calculate a wrong result.
  • Yes, that is what I mean, and I agree that foreach and Compute are fast. As you can see, I mention in the question that these are larger figures. For numbers in range, the difference in speed is noteworthy, also if long is used in place of int. Still, an error should have been produced.