1

I have a table "Links" with some download links.

My .NET application reads this table, takes the link, creates a web client and downloads the associated file. I want to create several threads that do this, but each one should read a different record, otherwise two threads are trying to download the same file. How can do this?

I have tried this but it doesn't work:

    public static Boolean Get_NextProcessingVideo(ref Int32 idVideo, ref String youtubeId, ref String title)
    {
        Boolean result = false;

        using (NpgsqlConnection conn = new NpgsqlConnection(ConfigurationDB.GetInstance().ConnectionString))
        {
            conn.Open();
            NpgsqlTransaction transaction = conn.BeginTransaction();

            String query = "BEGIN WORK; LOCK TABLE links IN ACCESS EXCLUSIVE MODE; SELECT v.idlink,  v.title " +
                " FROM video v  WHERE v.schedulingflag IS FALSE AND v.errorflag IS FALSE ORDER BY v.idvideo  LIMIT 1; " + 
                " COMMIT WORK;";
            NpgsqlCommand cmd = new NpgsqlCommand(query, conn, transaction);

            NpgsqlDataReader dr = cmd.ExecuteReader();

            if (dr.HasRows)
            {
                dr.Read();


                idVideo = Convert.ToInt32(dr["idvideo"]);
                title = dr["title"].ToString();

                Validate_Scheduling(idVideo); // 
                result = true;
            }

            transaction.Commit();
            conn.Close();

        }
        return result;
    }
2
  • Have a look at advisory locks: postgresql.org/docs/current/static/… Commented Feb 5, 2013 at 13:15
  • 2
    Somewhat of an alternative approach: what if you read the set of pending records into the main thread, then pass the IDs to the new worker threads? Each thread is simply responsible for fetching the details from the DB, doing the work, and then recording the result. You'll have removed any chance of overlap this way and I am guessing the performance is going to be the same since the work done in the thread is going to be the slow part. Commented Feb 5, 2013 at 15:06

1 Answer 1

1

You have a few options here. The one thing you don't want to be doing, as you note, is locking the table.

  1. Advisory locks. The advantage is that these are extra-transactional. The disadvantage is that they are not closed at the transaction and must be closed specifically, and that leakage can eventually cause problems (essentially a shared memory leak on the back-end). Generally speaking I do not like extra-transactional locks like this and while advisory locks are cleared when the db session ends, there are still possible issues with stale locks.

  2. You can have a dedicated thread pull the pending files first, and then delegate specific retrievals to child threads. This is probably the best approach both in terms of db round-trips and simplicity of operation. I would expect that this would perform best of any of the solutions.

  3. You can SELECT FOR UPDATE NOWAIT in a stored procedure which can handle exception handling. See Select unlocked row in Postgresql for an examples.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.