
Problems transferring UTF-8 form data in JAX-RS (REST)

Introduction


There are several ways to send data from a web application's interface, but perhaps the most common is submitting a form with the MIME type application/x-www-form-urlencoded. Another option is multipart/form-data.

On the server side, the form can be received by controllers in MVC frameworks such as Spring MVC; among the basic Java technologies, Java Server Pages and Java Server Faces should also be mentioned. With such frameworks the choice of template engine is much wider, for example Apache Velocity or FreeMarker (it is worth noting that Spring MVC integrates well with the latter), and the form data is written on the server side into a bean associated with the view/controller. True, JSF has an inherent problem with encodings as well, but that is a topic for a separate article.

These frameworks, however, make life difficult for an interface developer who is not familiar with Java or who needs to step outside what the framework allows. When the application's backend exposes a REST interface, front-end work is simplified: it can be done by someone who knows basic JavaScript and jQuery, independently of backend development.

A brief introduction to JAX-RS was given in a previous article. An interface exposed via JAX-RS can receive GET and POST requests with form data. This article discusses the problems that arise with this approach when the data contains characters outside Latin-1.


Retrieving form data in JAX-RS


To inject a form parameter into a method there are several annotations: @Form and @FormParam. For a POST request, @FormParam behaves the way @QueryParam does for GET, with one small exception: the handling of UTF-8. With GET, the urlencoded value is decoded into a string without problems. With POST, the Content-Type must additionally carry charset=utf-8, so that the byte stream obtained from urlencoded decoding is interpreted as UTF-8 rather than Latin-1 (the default behavior).
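The effect of the wrong charset can be reproduced without any JAX-RS code. The following is a minimal sketch using only the JDK (Java 10 or newer for the Charset overloads of URLEncoder and URLDecoder); the class and method names are hypothetical and chosen for illustration only. It percent-encodes a Cyrillic string as UTF-8 and then decodes it twice, once as UTF-8 and once as Latin-1 (ISO-8859-1).

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of JAX-RS or RESTEasy.
class FormCharsetDemo {

    // Percent-encode a value as a client sending UTF-8 form data would.
    static String encodeUtf8(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Decode a urlencoded value, interpreting the resulting bytes in the given charset.
    static String decode(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String original = "\u043f\u0440\u0438\u0432\u0435\u0442"; // Cyrillic "privet", 6 characters
        String encoded = encodeUtf8(original);

        String asUtf8 = decode(encoded, StandardCharsets.UTF_8);
        String asLatin1 = decode(encoded, StandardCharsets.ISO_8859_1);

        System.out.println(encoded);
        System.out.println("UTF-8 round trip intact:   " + asUtf8.equals(original));
        System.out.println("Latin-1 round trip intact: " + asLatin1.equals(original));
        // Each UTF-8 byte becomes its own Latin-1 character, so the length doubles here.
        System.out.println("Latin-1 result length:     " + asLatin1.length());
    }
}
```

Every Cyrillic letter occupies two bytes in UTF-8, so the Latin-1 interpretation yields twice as many characters, none of them the intended ones. This is the kind of corruption a server produces when the POST body carries no charset.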

Browsers

Browsers (and jQuery, for example) do not specify the encoding, which leads to form field values being misinterpreted when they contain non-ASCII characters encoded in UTF-8. With jQuery the problem is avoided by explicitly setting the header Content-Type: application/x-www-form-urlencoded; charset=utf-8. For a plain form without JavaScript, I did not manage to achieve the same result by specifying the form's enctype.

JBoss RESTEasy

Of course, the question arises: is the @Form annotation needed at all if the data comes from other applications? In our case its use is justified by the DRY principle: the same method can serve both the client-side form and external applications that use this API.

With the RESTEasy client framework, several options are possible. For example, you can add @HeaderParam("Content-Type") to the methods that need it:

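    import javax.ws.rs.FormParam;
    import javax.ws.rs.HeaderParam;
    import javax.ws.rs.POST;
    import javax.ws.rs.Path;

    @Path("/")
    public interface Rest {

        // The caller passes the Content-Type (including charset) explicitly.
        @POST
        @Path("test")
        public String test(@FormParam("q") String query,
                           @HeaderParam("Content-Type") String contentType);
    }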


It is then used as follows:

    Rest client = ProxyFactory.create(Rest.class, url);
    // The charset parameter makes the server decode the form body as UTF-8.
    client.test(query, "application/x-www-form-urlencoded; charset=utf-8");


But it is more appropriate and convenient to use a client interceptor that adds the charset parameter to a Content-Type: application/x-www-form-urlencoded header. It can be implemented as follows:

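    import javax.ws.rs.core.HttpHeaders;

    import org.jboss.resteasy.annotations.interception.ClientInterceptor;
    import org.jboss.resteasy.annotations.interception.HeaderDecoratorPrecedence;
    import org.jboss.resteasy.client.ClientResponse;
    import org.jboss.resteasy.spi.interception.ClientExecutionContext;
    import org.jboss.resteasy.spi.interception.ClientExecutionInterceptor;

    @ClientInterceptor
    @HeaderDecoratorPrecedence
    public class RestInterceptor implements ClientExecutionInterceptor {

        public static final String FORM_CONTENT_TYPE =
                "application/x-www-form-urlencoded";
        public static final String FORM_CONTENT_TYPE_WITH_CHARSET =
                "application/x-www-form-urlencoded; charset=utf-8";

        @Override
        public ClientResponse execute(ClientExecutionContext context) throws Exception {
            String contentType = context.getRequest().getHeaders()
                    .getFirst(HttpHeaders.CONTENT_TYPE);
            // Only touch form requests that do not declare a charset themselves.
            if (formWithoutCharset(contentType)) {
                context.getRequest().header(HttpHeaders.CONTENT_TYPE,
                        FORM_CONTENT_TYPE_WITH_CHARSET);
            }
            return context.proceed();
        }

        private boolean formWithoutCharset(String contentType) {
            return contentType != null
                    && contentType.contains(FORM_CONTENT_TYPE)
                    && !contentType.contains("charset");
        }
    }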


This version of the interceptor is, of course, not perfect: it is possible that you really do want to send a form in Latin-1. But it is a first step toward simplifying the client code.
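The header rule itself can be isolated into a pure function, which makes it easy to unit-test without RESTEasy on the classpath. The sketch below is one possible variant, not the interceptor's actual code: the class and method names are hypothetical, and it differs from the check above in that it compares the media type case-insensitively and looks for a charset parameter rather than the substring "charset".

```java
import java.util.Locale;

// Hypothetical helper, not part of RESTEasy.
class FormContentType {

    static final String FORM = "application/x-www-form-urlencoded";

    // Returns the header value to send: "; charset=utf-8" is appended only to
    // form content types that carry no charset parameter; any other value,
    // including null, is returned unchanged.
    static String withDefaultCharset(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);
        if (!mediaType.equals(FORM)) {
            return contentType; // not a urlencoded form, leave it alone
        }
        for (int i = 1; i < parts.length; i++) {
            String parameter = parts[i].trim().toLowerCase(Locale.ROOT);
            if (parameter.startsWith("charset=")) {
                return contentType; // an explicit charset (e.g. Latin-1) is respected
            }
        }
        return contentType.trim() + "; charset=utf-8";
    }

    public static void main(String[] args) {
        System.out.println(withDefaultCharset("application/x-www-form-urlencoded"));
        System.out.println(withDefaultCharset("application/x-www-form-urlencoded; charset=ISO-8859-1"));
        System.out.println(withDefaultCharset("text/plain"));
    }
}
```

A caller that really wants Latin-1 only has to state the charset explicitly; requests that say nothing get UTF-8, which matches the behavior described above.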

To activate it, the framework initialization procedure has to be extended slightly:

    public static void initResteasy() {
        ResteasyProviderFactory factory = ResteasyProviderFactory.getInstance();
        RegisterBuiltin.register(factory);
        InterceptorRegistry<ClientExecutionInterceptor> registry =
                factory.getClientExecutionInterceptorRegistry();
        registry.register(new RestInterceptor());
    }


After that, both when using ClientRequest directly and when using proxy objects, any outgoing request whose Content-Type header is application/x-www-form-urlencoded without a charset gets a header that includes the charset.

Source: https://habr.com/ru/post/140270/

