Re your first question: you could create however many threads you need and store each one in a container (e.g. a vector).
When you need all the data, you can call join on each one, in a for loop perhaps. See the answer to this question. It uses boost, but the idea is similar.
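Here's a rough sketch of that idea using std::thread instead of boost (I'm assuming C++11 or later is available to you). The names `work` and `run_workers` are made up for illustration, and the "work" is just squaring the thread's index. Each thread writes to its own slot in the results vector, so no locking is needed in this particular case.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical worker: stores the square of its index in its own slot.
void work(std::size_t index, std::vector<int>& results) {
    results[index] = static_cast<int>(index * index);
}

// Launches `count` threads, waits for all of them, then returns the results.
std::vector<int> run_workers(std::size_t count) {
    std::vector<int> results(count, 0);  // sized up front, one slot per thread
    std::vector<std::thread> threads;    // the container holding the threads
    threads.reserve(count);

    for (std::size_t i = 0; i < count; ++i) {
        // std::ref is needed because std::thread copies its arguments by default.
        threads.emplace_back(work, i, std::ref(results));
    }

    // Join each thread in turn; once the loop ends, all results are ready.
    for (std::thread& t : threads) {
        t.join();
        assert(!t.joinable());  // join() leaves the thread non-joinable
    }
    return results;
}
```

Calling `run_workers(4)` gives a vector holding 0, 1, 4, 9. Depending on your compiler you may need to link the thread library (e.g. `-pthread` with g++).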