How to fetch, process and save a huge record set in C# efficiently?


I am trying to do the following:

  • Get the data from the SQL database.
  • Pass each record to a PerformStuff method, which calls a third-party method MethodforResponse (it checks the input and provides a response).
  • Save the response (XML) back to the SQL database.

Below is sample code. Performance-wise it is not good: with 1,000,000 records in the database it is very slow.

Is there a better way of doing this? Any ideas or hints to make it faster would be appreciated.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Text;
using thirdpartylib;
 public class Program
    {

        static void Main(string[] args)
        {
            var response = PerformStuff();
            Save(response);


        }

        public class TestRequest
        {
            public int col1 { get; set; }
            public bool col2 { get; set; }
            public string col3 { get; set; }
            public bool col4 { get; set; }

            public string col5 { get; set; }
            public bool col6 { get; set; }
            public string col7 { get; set; }

        }
        public class TestResponse
        {
            public int col1 { get; set; }
            public string col2 { get; set; }
            public string col3 { get; set; }
            public int col4 { get; set; }

        }
        public static TestRequest GetDataId(int id)
        {
            TestRequest testReq = null;

            SqlCommand cmd = DB.GetSqlCommand("proc_name");
            cmd.AddInSqlParam("@Id", SqlDbType.Int, id);
            // SqlDataReader has no public constructor; use the reader the helper returns.
            using (SqlDataReader dr = DB.GetDataReader(cmd))
            {
                while (dr.Read())
                {
                    testReq = new TestRequest();

                    // GetInt32/GetBoolean/GetString take column ordinals, not names.
                    testReq.col1 = dr.GetInt32(dr.GetOrdinal("col1"));
                    testReq.col2 = dr.GetBoolean(dr.GetOrdinal("col2"));
                    testReq.col3 = dr.GetString(dr.GetOrdinal("col3"));
                    testReq.col4 = dr.GetBoolean(dr.GetOrdinal("col4"));
                    testReq.col5 = dr.GetString(dr.GetOrdinal("col5"));
                    testReq.col6 = dr.GetBoolean(dr.GetOrdinal("col6"));
                    testReq.col7 = dr.GetString(dr.GetOrdinal("col7"));
                }
            }
            return testReq;
        }
        public static TestResponse PerformStuff()
        {
            var response = new TestResponse();
            TestRequest request = null;

            // get the ids in a list
            var ids = thirdpartylib.Methodforid();

            foreach (int id in ids)
            {
                request = GetDataId(id);

                var output = thirdpartylib.MethodforResponse(request);

                foreach (var data in output.Elements())
                {
                    response.col4 = Convert.ToInt32(data.Id().Class());
                    response.col2 = data.Id().Name().ToString();
                }
            }

            // request details (from the last request processed);
            // note response.col2 is a string while request.col2 is a bool
            response.col1 = request.col1;
            response.col2 = request.col2.ToString();
            response.col3 = request.col3;

            return response;
        }

        public static void Save(TestResponse response)
        {
            var sb = new StringBuilder();

            sb.Append("<ROOT>");
            sb.Append("<Id");
            sb.Append(" col1='" + response.col1 + "'");
            sb.Append(" col2='" + response.col2 + "'");
            sb.Append(" col3='" + response.col3 + "'");
            sb.Append(" col4='" + response.col4 + "'");
            sb.Append(" />");
            sb.Append("</ROOT>");

            var cmd = DB.GetSqlCommand("saveproc");
            cmd.AddInSqlParam("@Data", SqlDbType.VarChar, sb.ToString());
            DB.ExecuteNoQuery(cmd);
        }

    }

Thanks!


I think the root of your problem is that you fetch and insert data record by record. There is no way to optimize that; you need to change the approach in general.

You should think of a solution that:

  1. Gets all the data in one command to the database.
  2. Processes it.
  3. Saves it back to the database in one command, using a technique like BULK INSERT. Please be aware that BULK INSERT has certain limitations, so read the documentation carefully.
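As a sketch of the set-based save in step 3, SqlBulkCopy from System.Data.SqlClient can push all rows to the server in one streamed operation instead of one INSERT per record. The destination table name (ResponseTable) is a placeholder, not something from the question:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public class TestResponse
{
    public int col1 { get; set; }
    public string col2 { get; set; }
    public string col3 { get; set; }
    public int col4 { get; set; }
}

public static class BulkSaver
{
    // Stage all responses in an in-memory DataTable whose columns
    // mirror the destination table...
    public static DataTable BuildResponseTable(IEnumerable<TestResponse> responses)
    {
        var table = new DataTable();
        table.Columns.Add("col1", typeof(int));
        table.Columns.Add("col2", typeof(string));
        table.Columns.Add("col3", typeof(string));
        table.Columns.Add("col4", typeof(int));
        foreach (var r in responses)
            table.Rows.Add(r.col1, r.col2, r.col3, r.col4);
        return table;
    }

    // ...then write them to SQL Server in one bulk operation.
    public static void BulkSave(DataTable table, string connectionString)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "ResponseTable"; // hypothetical table name
            bulk.BatchSize = 10000;                      // tune for your workload
            bulk.WriteToServer(table);
        }
    }
}
```

SqlBulkCopy bypasses per-row command overhead entirely, which is why it scales to millions of rows where row-at-a-time inserts do not.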



Your question is very broad, but the method PerformStuff() will be fundamentally slow because it performs O(n) database lookups, one per id. So, to me it seems you're going about this problem the wrong way.

Database query languages are designed to optimize data traversal. Iterating over ids in C# and then checking values bypasses that, producing close to the slowest lookup time possible.

Instead, leverage SQL's powerful query language and use clauses like WHERE id < 10 AND value > 100, because you ultimately want to limit the size of the data set that C# needs to process.

So:

  1. Read just the smallest amount of data you need from the DB.
  2. Process this data as a unit; parallelism might help.
  3. Write back the modifications in one DB connection.

Hope this sets you in the right direction.
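The steps above could be sketched like this; the table and column names (Requests, id, value) are hypothetical, and the question's DB helper is replaced by plain SqlClient calls to keep the example self-contained:

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;

public static class FilteredReader
{
    // One parameterized query that pushes the filtering into SQL Server,
    // instead of issuing one lookup per id from C#.
    public static string BuildQuery()
    {
        return "SELECT id, value FROM Requests " +            // hypothetical table
               "WHERE id < @maxId AND value > @minValue";     // limit the set in SQL
    }

    // Stream the reduced result set in a single round trip.
    public static IEnumerable<(int Id, int Value)> ReadFiltered(
        string connectionString, int maxId, int minValue)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(BuildQuery(), conn))
        {
            cmd.Parameters.AddWithValue("@maxId", maxId);
            cmd.Parameters.AddWithValue("@minValue", minValue);
            conn.Open();
            using (var dr = cmd.ExecuteReader())
            {
                while (dr.Read())
                    yield return (dr.GetInt32(0), dr.GetInt32(1));
            }
        }
    }
}
```

The point is that the WHERE clause runs where the indexes live; only the rows that survive it ever cross the wire into C#.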



Based on your comment, there are multiple things you can enhance in your solution, from memory consumption to CPU usage.

  1. Take advantage of paging at the database level. Do not fetch all records at once; with 1+ million records that means very high memory consumption. Instead, take the data chunk by chunk and do whatever you need to do with each chunk.

  2. Since you don't strictly need to save the XML into the database, you can save the response to a file instead. Saving XML to a file gives you an opportunity to stream the data onto your local disk.

  3. Instead of assembling the XML yourself, use XmlSerializer to do that job for you. XmlSerializer works nicely with XmlWriter, which in turn can work with any stream, including a FileStream. There is a thread about it, which you can take as an example.

To conclude, the PerformStuff method won't only be faster; it will require far fewer resources (memory, CPU), and most importantly, you'll easily be able to constrain your program's resource consumption (by changing the size of the database page).
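A minimal sketch of point 3, assuming a response shape like the one in the question: XmlSerializer writes through an XmlWriter to any stream, including a FileStream, so no hand-assembled XML strings are needed:

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public class TestResponse
{
    public int col1 { get; set; }
    public string col2 { get; set; }
    public string col3 { get; set; }
    public int col4 { get; set; }
}

public static class XmlSaver
{
    // Serialize one response to an XML string; escaping and element
    // structure are handled by the serializer, not by string concatenation.
    public static string ToXml(TestResponse response)
    {
        var serializer = new XmlSerializer(typeof(TestResponse));
        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb))
            serializer.Serialize(writer, response);
        return sb.ToString();
    }

    // The same serializer can stream straight to disk for large outputs.
    public static void ToFile(TestResponse response, string path)
    {
        var serializer = new XmlSerializer(typeof(TestResponse));
        using (var stream = new FileStream(path, FileMode.Create))
        using (var writer = XmlWriter.Create(stream))
            serializer.Serialize(writer, response);
    }
}
```

Because the writer flushes to the underlying stream as it goes, the whole document never has to sit in memory at once.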



Observation: your requirement looks like it matches the map / reduce pattern.

If the values in your ids collection returned by thirdpartylib.Methodforid() are reasonably dense, and the number of rows in the table behind your proc_name stored procedure has close to the same number of items in the ids collection, you can retrieve all the records you need with a single SQL query (and a many-row result set) rather than retrieving them one by one. That might look something like this:

public static TestResponse PerformStuff()
{
    var response = new TestResponse();

    var idHash = new HashSet<int>(thirdpartylib.Methodforid());

    SqlCommand cmd = DB.GetSqlCommand("proc_name_for_all_ids");
    using (SqlDataReader dr = DB.GetDataReader(cmd)) {
        while (dr.Read()) {
            var id = dr.GetInt32(dr.GetOrdinal("id"));
            if (idHash.Contains(id)) {
                var testReq = new TestRequest();

                testReq.col1 = dr.GetInt32(dr.GetOrdinal("col1"));
                testReq.col2 = dr.GetBoolean(dr.GetOrdinal("col2"));
                testReq.col3 = dr.GetString(dr.GetOrdinal("col3"));
                testReq.col4 = dr.GetBoolean(dr.GetOrdinal("col4"));
                testReq.col5 = dr.GetString(dr.GetOrdinal("col5"));
                testReq.col6 = dr.GetBoolean(dr.GetOrdinal("col6"));
                testReq.col7 = dr.GetString(dr.GetOrdinal("col7"));

                var output = thirdpartylib.MethodforResponse(testReq);
                foreach (var data in output.Elements())  {
                    response.col4 = Convert.ToInt32(data.Id().Class());
                    response.col2 = data.Id().Name().ToString();
                }
            } /* end if idHash.Contains(id) */
        }  /* end while dr.Read() */
    } /* end using() */
    return response;
}

Why might this be faster? It makes far fewer database queries, and instead streams in many rows of data to process at once. This will be far more efficient than your example.

Why might it not work?

  1. If the id values must be processed in the same order produced by thirdpartylib.Methodforid(), it won't work.
  2. If there's no way to retrieve all the rows, that is, no proc_name_for_all_ids stored procedure is available, you won't be able to stream the rows.
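If the ids are sparse rather than dense, an alternative to filtering client-side with a HashSet is to ship the id list to SQL Server as a table-valued parameter, so only the matching rows come back at all. The stored procedure name (proc_name_for_id_list) and table type (dbo.IdList) below are hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

public static class IdListHelper
{
    // Pack the ids into a one-column DataTable matching the TVP's shape,
    // so the filtering happens inside SQL Server rather than in C#.
    public static DataTable ToIdTable(IEnumerable<int> ids)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        foreach (var id in ids)
            table.Rows.Add(id);
        return table;
    }

    // Bind the DataTable as a structured (table-valued) parameter.
    public static SqlCommand BuildCommand(SqlConnection conn, DataTable idTable)
    {
        var cmd = new SqlCommand("proc_name_for_id_list", conn); // hypothetical proc
        cmd.CommandType = CommandType.StoredProcedure;
        var p = cmd.Parameters.AddWithValue("@Ids", idTable);
        p.SqlDbType = SqlDbType.Structured;
        // Hypothetical server-side type: CREATE TYPE dbo.IdList AS TABLE (Id int)
        p.TypeName = "dbo.IdList";
        return cmd;
    }
}
```

The procedure would then JOIN the TVP against the source table, which keeps both the round trips and the result set as small as possible.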
