I have a fairly huge table (180 million records) in a SQL Server database. Something like below:

my_table >> columns: Date, Value1, Value2, Value3
I also have a Python script that runs concurrently with pool.map(). In each child process (iteration), a connection is made to my_table to fetch a slice of it with the query below and do other computations:
SELECT * FROM my_table WHERE Date BETWEEN a1 AND a2
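To make the setup concrete, here is a minimal sketch of the pattern I mean: each child process opens its own connection and runs the BETWEEN query for its own date slice. (sqlite3 stands in for SQL Server here just so the snippet is self-contained; the real script would use a SQL Server driver such as pyodbc with a server connection string. The table contents, date ranges, and file path are made up for illustration.)

```python
# Sketch of the per-process slicing pattern; sqlite3 is a stand-in for SQL Server.
import multiprocessing as mp
import os
import sqlite3
import tempfile

DB_PATH = os.path.join(tempfile.gettempdir(), "my_table_demo.db")

def fetch_slice(date_range):
    """Run in a child process: open a fresh connection, fetch one date slice."""
    a1, a2 = date_range
    conn = sqlite3.connect(DB_PATH)
    try:
        rows = conn.execute(
            "SELECT * FROM my_table WHERE Date BETWEEN ? AND ?", (a1, a2)
        ).fetchall()
        # ... other computations on `rows` would happen here ...
        return len(rows)
    finally:
        conn.close()

def setup_demo_table():
    """Create a tiny stand-in table (100 rows instead of 180 million)."""
    conn = sqlite3.connect(DB_PATH)
    conn.execute("DROP TABLE IF EXISTS my_table")
    conn.execute(
        "CREATE TABLE my_table (Date INT, Value1 REAL, Value2 REAL, Value3 REAL)"
    )
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?, ?, ?)",
        [(d, 1.0, 2.0, 3.0) for d in range(100)],
    )
    conn.commit()
    conn.close()

if __name__ == "__main__":
    setup_demo_table()
    # Each tuple is one child process's (a1, a2) date slice.
    ranges = [(0, 24), (25, 49), (50, 74), (75, 99)]
    with mp.Pool(4) as pool:
        counts = pool.map(fetch_slice, ranges)
    print(counts)
```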
My question is: when the Python script runs in parallel, does each child process load the whole table (180 million rows) into memory and then slice it based on the WHERE condition?

If that is the case, each child process would load 180 million rows into memory, and that would freeze everything.
I am pretty sure that if I query a huge table in SQL Server a couple of times, SQL Server loads the data into memory only once, for the first query, and the subsequent queries reuse the data that the first query already loaded into RAM.