
I have a working program that loops through a list of UIDs (20,000+ items), building a request, connecting, serializing and saving the item properties that were found. It works fine.

What I would like to do is speed it up. It has to make those 20,000+ HTTP requests and do everything after them, so it's not particularly fast.

I've tried reading up on multithreading and on the code below, which uses a connection manager and re-uses the HttpClient, but I'm unable to understand it or apply it to my situation.

How can I change my code so that it sends out multiple HTTP requests at the same time to speed up the process?

PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        CloseableHttpClient httpClient = HttpClients.custom()
                .setConnectionManager(cm)
                .build();

Below is my current code. How can I make this process faster?

JSONObject httpJSONObject;
            for (int i = 0; i < missingUIDList.size(); i++)
                try {
                    HttpGet get = new HttpGet("https://api.guildwars2.com/v2/items/" + missingUIDList.get(i));
                    HttpClient client = HttpClientBuilder.create().build();
                    HttpResponse response = client.execute(get);
                    HttpEntity entity = response.getEntity();
                    String result = EntityUtils.toString(entity);
                    httpJSONObject = new JSONObject(result);

                    itemRoot items = new Gson().fromJson(httpJSONObject.toString(), itemRoot.class);
                    String name = items.getName().replaceAll("'","''");
                    connection = DriverManager.getConnection("jdbc:sqlite:gw2.db");
                    Statement statement = connection.createStatement();
                    statement.setQueryTimeout(30);  // set timeout to 30 sec.
                    String cookie = "INSERT INTO item VALUES('" + name +
                            "','" + items.getDescription() +
                            "','" + items.getType() +
                            "'," + items.getLevel() +
                            ",'" + items.getRarity() +
                            "'," + items.getVendor_value() +
                            "," + items.getDefault_skin() +
                            "," + items.getId() +
                            ",'" + items.getChat_link() +
                            "','" + items.getIcon() +
                            "');";
                    System.out.println(cookie);
                    statement.executeUpdate(cookie);
                } catch (ClientProtocolException e) {
                    e.printStackTrace();
                } catch (IOException e) {
                    e.printStackTrace();
                } catch (JSONException e) {
                    e.printStackTrace();
                } catch (SQLException e) {
                    System.err.println(e.getMessage());
                }
        }

EDIT:

With tips from Vadim, this is the (hopefully) more optimized single-threaded code.

private void addMissingItems(List<Integer> missingUIDList) {
    HttpClient client = HttpClientBuilder.create().build();
    HttpResponse response;
    HttpEntity entity;
    String result;
    try {
        connection = DriverManager.getConnection("jdbc:sqlite:gw2.db");
        statement = connection.createStatement();
        statement.setQueryTimeout(30);  // set timeout to 30 sec.
    } catch (SQLException e) {
        System.err.println(e.getMessage());
    }

    for (int i = 0; i < missingUIDList.size(); i++)
        try {
            HttpGet get = new HttpGet("https://api.guildwars2.com/v2/items/" + missingUIDList.get(i));
            response = client.execute(get);
            entity = response.getEntity();
            result = EntityUtils.toString(entity);
            JSONObject httpJSONObject = new JSONObject(result);
            itemRoot items = new Gson().fromJson(httpJSONObject.toString(), itemRoot.class);

            System.out.println(httpJSONObject.getInt("id"));
            String cookie = "INSERT INTO item VALUES('" + items.getName().replaceAll("'","''") +
                    "','" + items.getDescription() +
                    "','" + items.getType() +
                    "'," + items.getLevel() +
                    ",'" + items.getRarity() +
                    "'," + items.getVendor_value() +
                    "," + items.getDefault_skin() +
                    "," + items.getId() +
                    ",'" + items.getChat_link() +
                    "','" + items.getIcon() +
                    "');";
            statement.executeUpdate(cookie);

        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (JSONException e) {
            e.printStackTrace();
        } catch (SQLException e) {
            System.err.println(e.getMessage());
        }
}

2 Answers

Solution using an ExecutorService:

PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
// The pooling manager allows only 2 connections per route by default,
// so raise the limits to match the thread pool size.
cm.setMaxTotal(poolSize);
cm.setDefaultMaxPerRoute(poolSize);
CloseableHttpClient httpClient = HttpClients.custom()
            .setConnectionManager(cm)
            .build();

ExecutorService pool = Executors.newFixedThreadPool(poolSize);

for (int i = 0; i < missingUIDList.size(); i++) {
    HttpGet get = new HttpGet("https://api.guildwars2.com/v2/items/" + missingUIDList.get(i));
    pool.execute(new Worker(httpClient, get));  // share one client, one request per task
}
pool.shutdown();  // accept no new tasks; already queued tasks still run

class Worker implements Runnable {
    private final HttpGet get;
    private final CloseableHttpClient httpClient;
    Worker(CloseableHttpClient httpClient, HttpGet get) {
        this.get = get;
        this.httpClient = httpClient;
    }
    public void run() {
        try {
            HttpResponse response = httpClient.execute(get);
            HttpEntity entity = response.getEntity();
            String result = EntityUtils.toString(entity);
            // Local variable: must not be a field shared between threads.
            JSONObject httpJSONObject = new JSONObject(result);
            ....
            //rest of your code
            ....
            statement.executeUpdate(cookie);
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (JSONException e) {
            e.printStackTrace();
        } catch (SQLException e) {
            System.err.println(e.getMessage());
        }
    }
}
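
For reference, the submit-and-collect pattern can be sketched on its own with just the JDK. This is a minimal sketch, not the code from the question: `ParallelFetchSketch`, `fetchAll` and the `fetcher` parameter are hypothetical names, and `fetcher` stands in for the HTTP call (for example `httpClient.execute(get)` plus `EntityUtils.toString`). Each task works only on its own id and local variables, so nothing is shared between threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

class ParallelFetchSketch {

    // Runs `fetcher` once per id on a fixed-size pool and returns the results in input order.
    // `fetcher` is a hypothetical stand-in for the real HTTP request.
    static List<String> fetchAll(List<Integer> ids, Function<Integer, String> fetcher, int poolSize)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (Integer id : ids) {
                // The task captures only its own id; there is no shared mutable state.
                Callable<String> task = () -> fetcher.apply(id);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                // get() blocks until that task has finished and rethrows its failure, if any.
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown(); // accept no new tasks; lets the worker threads exit
        }
    }

    public static void main(String[] args) throws Exception {
        // Fake fetcher that builds a JSON-like string instead of calling the API.
        List<String> bodies = fetchAll(List.of(1, 2, 3), id -> "{\"id\":" + id + "}", 2);
        System.out.println(bodies);
    }
}
```

Collecting the results through `Future` objects also means the database writes can stay on a single thread, which may be simpler than sharing one `Statement` between workers.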

5 Comments

With some slight adjustments this works almost perfectly for me, thank you! There is a side effect, though: I'm getting duplicate primary key errors. It looks to me like it's requesting the same URL more than once. How can I prevent this?
It's most likely caused by an object shared between the threads (assuming your items are indeed unique). Avoid global objects such as httpJSONObject and create new objects only inside the thread's run() method.
Which column is the primary key?
@Vadim The column "id" is the primary key.
So, does the first "single thread" solution work fine, and are there never any duplicates (with the same id) in missingUIDList? Duplicate key errors are also possible if you run your code twice against the same data (i.e. your item was inserted in a previous run). That is why it is always better not to set the primary key (like id) yourself, but rather to have it auto-generated and leave it out of the INSERT statement.

May I recommend optimizing your existing single-threaded code first, before diving into multithreading? After that, moving it to multiple threads will be much easier.

You have two parts inside your for loop:

  1. HTTP call for data
  2. Database call to store the data

In both parts you perform very expensive operations by opening new connections.

Instead, for the HTTP part (at least), move the client creation out of the loop like this:

HttpClient client = HttpClientBuilder.create().build();
HttpResponse response;
HttpEntity entity;
String result;

then reuse them inside the loop:

for (int i = 0; i < missingUIDList.size(); i++)
    try {
        HttpGet get = new HttpGet("https://api.guildwars2.com/v2/items/" + missingUIDList.get(i));
        response = client.execute(get);
        entity = response.getEntity();
        result = EntityUtils.toString(entity);
        httpJSONObject = new JSONObject(result);
        ...

For the DB part (at least):

  • move the connection creation out of the loop (similar to the above)
  • write the INSERT SQL with parameters instead of concatenating values (never do that; SQL injection is a real threat)
  • create the PreparedStatement outside of the loop as well
  • inside the loop, set the parameters and execute the same query over and over again.
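
The DB steps above could look roughly like the following. This is a sketch under assumptions: `PreparedInsertSketch`, `insertAll` and the `Item` class are hypothetical (the latter stands in for the question's `itemRoot`), the field types are guessed from which values the question quotes in its SQL, and the ten placeholders follow the column order of the question's INSERT.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class PreparedInsertSketch {

    // Hypothetical stand-in for the question's itemRoot class; field types are assumed.
    static class Item {
        String name, description, type, rarity, chatLink, icon;
        int level, vendorValue, defaultSkin, id;
    }

    // One placeholder per column, in the same order as the question's INSERT.
    static final String INSERT_SQL = "INSERT INTO item VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)";

    // Prepares the statement once, then rebinds the parameters and executes it per item.
    // The connection is passed in, so it is also created only once by the caller.
    static int insertAll(Connection connection, List<Item> items) throws SQLException {
        int inserted = 0;
        try (PreparedStatement ps = connection.prepareStatement(INSERT_SQL)) {
            ps.setQueryTimeout(30); // same 30 second timeout as in the question
            for (Item item : items) {
                ps.setString(1, item.name); // the driver handles quotes, no replaceAll needed
                ps.setString(2, item.description);
                ps.setString(3, item.type);
                ps.setInt(4, item.level);
                ps.setString(5, item.rarity);
                ps.setInt(6, item.vendorValue);
                ps.setInt(7, item.defaultSkin);
                ps.setInt(8, item.id);
                ps.setString(9, item.chatLink);
                ps.setString(10, item.icon);
                inserted += ps.executeUpdate();
            }
        }
        return inserted;
    }
}
```

With the connection from the question it would be called as `insertAll(DriverManager.getConnection("jdbc:sqlite:gw2.db"), items)`.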

Optionally, there are many different ways to do a bulk INSERT, which inserts many records in one DB call rather than running them one by one.
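
One of those ways is JDBC's own batching (`addBatch()` / `executeBatch()`) inside a single transaction. The sketch below is only an illustration: `BatchInsertSketch`, `insertInBatches` and the `batchSize` parameter are hypothetical names, and each row is passed as an `Object[]` with one value per placeholder. How much batching itself helps depends on the driver; with SQLite I would expect most of the gain to come from committing once instead of once per row, but I have not measured it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

class BatchInsertSketch {

    // Queues rows with addBatch() and sends them in groups of batchSize, all in one transaction.
    static void insertInBatches(Connection connection, String sql, List<Object[]> rows, int batchSize)
            throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false); // one commit for the whole run instead of one per row
        try (PreparedStatement ps = connection.prepareStatement(sql)) {
            int pending = 0;
            for (Object[] row : rows) {
                for (int i = 0; i < row.length; i++) {
                    ps.setObject(i + 1, row[i]); // JDBC parameter indexes are 1-based
                }
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch(); // send a full batch to the database
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // flush the last, partial batch
            }
            connection.commit();
        } catch (SQLException e) {
            connection.rollback(); // leave no partially inserted run behind
            throw e;
        } finally {
            connection.setAutoCommit(previousAutoCommit);
        }
    }
}
```

Because a failure rolls back the whole run, a re-run starts from a clean table rather than hitting primary key conflicts on rows from a half-finished attempt.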

6 Comments

Thank you for the suggestions. I'm going to try to smooth out the code with them and, if successful, combine it with the other suggestion.
I think, with your suggestions, I have made it as optimized as possible now? I put the code as an edit at the bottom of my question. Could you take a look?
Yes, something like that. Perhaps for SQLite it is good enough. BTW, for a stronger DB like Oracle it is better to use a PreparedStatement with parameters; calling it over and over with different parameters is much more efficient. The Oracle optimizer caches the query, and once it is prepared it runs much faster on subsequent calls... but SQLite... I do not know whether there is a difference or not :-). Still, concatenating data into the SQL text is a very, very bad idea (SQL injection, characters treated as part of the SQL rather than as data, and so on...) - very bad. Any input data must go through parameters. Period. :-)
OK, so for SQLite that is not the case :-)
Next you may think about bulk insert, if you'd like. It has helped me a lot several times in the past. For example, each thread takes a group of items, then inserts them into the DB in one shot... good luck!