Looking on the W3 Schools URL encoding webpage, it says that @ should be encoded as %40, and that space should be encoded as %20.

I've tried both URLEncoder and URI, but neither does the above properly:

import java.net.URI;
import java.net.URLEncoder;

public class Test {
    public static void main(String[] args) throws Exception {

        // Prints me%40home.com (CORRECT)
        System.out.println(URLEncoder.encode("me@home.com", "UTF-8"));

        // Prints Email+Address (WRONG: Should be Email%20Address)
        System.out.println(URLEncoder.encode("Email Address", "UTF-8"));

        // Prints http://www.home.com/test?Email%20Address=me@home.com
        // (WRONG: it has not encoded the @ in the email address)
        URI uri = new URI("http", "www.home.com", "/test", "Email Address=me@home.com", null);
        System.out.println(uri.toString());
    }
}

For some reason, URLEncoder encodes the email address correctly but not spaces, and URI encodes spaces correctly but not email addresses.

How should I encode these 2 parameters to be consistent with what w3schools says is correct (or is w3schools wrong?)

4 Comments

  • If you are looking at w3schools.com, then you are doing it wrong. Refer to this Commented Jan 14, 2013 at 15:59
  • @Srinivas the webservice I am using explicitly ignores requests unless parameters are encoded as explained on the w3schools webpage :( Commented Jan 14, 2013 at 16:02
  • URLEncoder does not encode as per the URL specification but as per the application/x-www-form-urlencoded MIME format (which is what most application servers expect for parameter keys/values). The URI type encodes as per its documentation; that is, it isn't a complete URL builder. Note that different parts of the URI have different rules. See this post for more analysis. Commented Jan 14, 2013 at 16:05
  • @McDowell Yes, I think I should have asked how do I get Java to do what JavaScript's encodeURIComponent() does. I'll check out your lib. Commented Jan 14, 2013 at 16:30

2 Answers

Although I think the answer from @fge is the right one, I was using a 3rd party webservice that relied on the encoding outlined in the W3Schools article, so I followed the answer from "Java equivalent to JavaScript's encodeURIComponent that produces identical output?":

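import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public static String encodeURIComponent(String s) {
    String result;

    try {
        // Form-encode first, then undo the differences from encodeURIComponent():
        // "+" (an encoded space) becomes "%20", and ! ' ( ) ~ are left unescaped.
        result = URLEncoder.encode(s, "UTF-8")
                .replaceAll("\\+", "%20")
                .replaceAll("\\%21", "!")
                .replaceAll("\\%27", "'")
                .replaceAll("\\%28", "(")
                .replaceAll("\\%29", ")")
                .replaceAll("\\%7E", "~");
    } catch (UnsupportedEncodingException e) {
        // UTF-8 is always available, so this fallback should never be reached.
        result = s;
    }

    return result;
}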
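
As a rough illustration of how a helper like the one above could be applied to the two parameters from the question, here is a minimal sketch. The class name QueryStringSketch and the methods encodeComponent and buildQuery are hypothetical names made up for this example, the email address is assumed to be me@home.com (as the me%40home.com output in the question suggests), and only the "+" to "%20" replacement is kept for brevity. The key and the value are encoded separately and then joined with "=" and "&", so those separators themselves are never escaped.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryStringSketch {

    // Hypothetical helper: form-encode, then turn "+" (an encoded space) into "%20".
    // A literal "+" in the input is already "%2B" at this point, so it is not affected.
    public static String encodeComponent(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a required charset on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical helper: encode each key and value on its own, then join the pairs.
    public static String buildQuery(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encodeComponent(entry.getKey()))
              .append('=')
              .append(encodeComponent(entry.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("Email Address", "me@home.com");
        // Prints http://www.home.com/test?Email%20Address=me%40home.com
        System.out.println("http://www.home.com/test?" + buildQuery(params));
    }
}
```

Because each component is encoded before the separators are added, a value that itself contains "&" or "=" comes out as "%26" or "%3D" and cannot be mistaken for a separator.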

2 Comments

You forgot the & symbol, which is important for decoding the URL (for either the GET or POST method), because it's the symbol that separates the keys in the request.
I am compelled to point out that w3schools is not the W3C. They are quite, quite different.
URI syntax is defined by RFC 3986 (permissible content for a query string is defined in section 3.4). Java's URI complies with this RFC, with a few caveats mentioned in its Javadoc.

The query rule in section 3.4 is built from pchar:

query = *( pchar / "/" / "?" )

and you will notice that the pchar grammar rule is defined by:

pchar = unreserved / pct-encoded / sub-delims / ":" / "@"

Which means a @ is legal in a query string.

Trust URI. It will do the correct, "legal" stuff.

Finally, if you have a look at the Javadoc of URLEncoder, you see that it states:

This class contains static methods for converting a String to the application/x-www-form-urlencoded MIME format.

Which is not the same thing as a query string as defined by the URI specification.
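
To make the difference concrete, here is a small sketch that runs the same text through both classes. The class name EncoderComparison and the methods formEncode and uriQuery are hypothetical names chosen for this example, and the query text assumes the address in the question is me@home.com. It only uses URLEncoder.encode and the five-argument URI constructor that the question already uses.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class EncoderComparison {

    // application/x-www-form-urlencoded: space becomes "+", and "=", "&", "@" are escaped.
    public static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // java.net.URI quotes only characters that are illegal in a query,
    // so the space is escaped while "=", "&" and "@" are left alone.
    public static String uriQuery(String query) {
        try {
            return new URI("http", "www.home.com", "/test", query, null).getRawQuery();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String query = "Email Address=me@home.com";
        // Prints Email+Address%3Dme%40home.com
        System.out.println(formEncode(query));
        // Prints Email%20Address=me@home.com
        System.out.println(uriQuery(query));
    }
}
```

URI treats its query argument as an already assembled query string, which is why it leaves "=", "&" and "@" untouched, whereas URLEncoder is meant to be applied to one key or one value at a time.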

3 Comments

I think the question I should have asked is how do I get java to encode a URL the same way as JavaScript encodeURIComponent, since this is what the receiving webservice expects: stackoverflow.com/questions/607176/…
Since then, I have developed a library which does URI templates (RFC 6570), which is even more powerful ;)
This is weird... the Javadoc for URI states that it follows RFC 2396, even in Java 8, although RFC 2396 is from 1998 and has been obsoleted by RFC 3986 since 2005.
