Fixing garbled file names in RESTEasy form uploads

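Since version 2.6, Dubbo has merged the RESTEasy code from dubbox and can publish REST-style interfaces. However, when a file is uploaded through a form, the file name received on the server is garbled.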

Below, I trace the cause through the source code and offer one workable fix.

First, here is some code that handles an upload with RESTEasy:

    @POST
    @Path("/upload")
    @Consumes(MediaType.MULTIPART_FORM_DATA)
    @Override
    public Object uploadfile(MultipartFormDataInput input, @Context HttpServletRequest request, @Context HttpServletResponse response) {
        System.out.println("Entering business logic");

        Map<String, List<InputPart>> uploadForm = input.getFormDataMap();
        // Parts submitted under the form field named "file"
        List<InputPart> inputParts = uploadForm.get("file");
        if (inputParts == null) {
            return Response.status(Response.Status.BAD_REQUEST).build();
        }

        final String DIRECTORY = "D:/temp/datainput/";
        initDirectory(DIRECTORY);
        for (InputPart inputPart : inputParts) {
            // File name taken from the part's headers
            String fileName = getFileName(inputPart.getHeaders());
            File file = new File(DIRECTORY + fileName);
            // try-with-resources closes both streams even if the copy fails
            try (InputStream inputStream = inputPart.getBody(InputStream.class, null);
                 OutputStream outStream = new FileOutputStream(file)) {
                // Copy in chunks; available() does not report the total stream size
                byte[] buffer = new byte[8192];
                int read;
                while ((read = inputStream.read(buffer)) != -1) {
                    outStream.write(buffer, 0, read);
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        return Response.ok().build();
    }
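
The listing above calls getFileName, which the article does not show. As a rough sketch of what such a helper might do, the class below pulls the filename parameter out of a Content-Disposition header value. The class name ContentDispositionNames, the method fileNameOf, and the "unknown" fallback are all hypothetical and chosen for illustration; the real helper may differ.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the filename parameter from a Content-Disposition value. */
final class ContentDispositionNames {
    // Matches filename="quoted value" or filename=token (case-insensitive).
    private static final Pattern FILENAME =
            Pattern.compile("(?i)(?:^|;)\\s*filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))");

    static String fileNameOf(String contentDisposition) {
        if (contentDisposition == null) {
            return "unknown"; // assumed fallback, not from the article
        }
        Matcher m = FILENAME.matcher(contentDisposition);
        if (!m.find()) {
            return "unknown";
        }
        String name = m.group(1) != null ? m.group(1) : m.group(2);
        // Keep only the last path segment, since some browsers send a full client path.
        int slash = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        return slash >= 0 ? name.substring(slash + 1) : name;
    }

    public static void main(String[] args) {
        System.out.println(fileNameOf("form-data; name=\"file\"; filename=\"report.txt\""));
    }
}
```

In the resource method, the header value could be obtained with inputPart.getHeaders().getFirst("Content-Disposition") and passed to this helper. Note that it only splits the string; it cannot repair characters that were already decoded with the wrong charset.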


The upload endpoint consumes the MULTIPART_FORM_DATA media type: @Consumes(MediaType.MULTIPART_FORM_DATA).

The provider for this media type is org.jboss.resteasy.plugins.providers.multipart.MultipartFormDataReader, whose readFrom entry point is:

   public MultipartFormDataInput readFrom(Class<MultipartFormDataInput> type, Type genericType, Annotation[] annotations, MediaType mediaType, MultivaluedMap<String, String> httpHeaders, InputStream entityStream) throws IOException, WebApplicationException
   {
     
      String boundary = mediaType.getParameters().get("boundary");
      if (boundary == null) throw new IOException(Messages.MESSAGES.unableToGetBoundary());
      MultipartFormDataInputImpl input = new MultipartFormDataInputImpl(mediaType, workers);
      input.parse(entityStream);
      return input;
   }


Following input.parse(entityStream) from the code above leads into the new BinaryMessage() constructor, where MultipartInputImpl parses the HTTP headers:

private static class BinaryMessage extends Message
   {
      private BinaryMessage(InputStream is) throws IOException, MimeIOException
      {
         try {
            MimeStreamParser parser = new MimeStreamParser(null);
            
            StorageProvider storageProvider;
            if (System.getProperty(DefaultStorageProvider.DEFAULT_STORAGE_PROVIDER_PROPERTY) != null) {
               storageProvider = DefaultStorageProvider.getInstance();
            } else {
               StorageProvider backend = new CustomTempFileStorageProvider();
               storageProvider = new ThresholdStorageProvider(backend, 1024);
            }
            parser.setContentHandler(new BinaryOnlyMessageBuilder(this, storageProvider));
            parser.parse(is); // parsing starts here, but no charset is passed in
         } catch (MimeException e) {
            throw new MimeIOException(e);
         }

      }
   }


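The line parser.parse(is); uses the stream parser from apache-mime4j-1.6. The method MultipartInputImpl calls in apache-mime4j offers no way to specify a character encoding, so any encoding setting is lost at this point. (PS: the defaultPartCharset in MultipartInputImpl can be set from an interceptor with request.setAttribute(InputPart.DEFAULT_CHARSET_PROPERTY, ENCODING_UTF_8);.)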

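Further down the call chain, where no charset has been specified, apache-mime4j parses the uploaded content using its default ASCII encoding. The relevant method is RawField.parseBody():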

 private String parseBody() {
        int offset = colonIdx + 1;
        int length = raw.length() - offset;
        return ContentUtil.decode(raw, offset, length);
    }


This decode method uses a hard-coded ASCII charset:

public static String decode(ByteSequence byteSequence, int offset,
            int length) {
        return decode(CharsetUtil.US_ASCII, byteSequence, offset, length);
    }


This explains why the file name comes out garbled, and also why setting the encoding through an interceptor elsewhere does not fix the file name.
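
To see the effect in isolation, here is a small sketch that decodes the UTF-8 bytes of a Chinese file name first as US-ASCII and then as UTF-8. The class HeaderCharsetDemo and its decode method are illustrative stand-ins, not mime4j code; they only assume that the header bytes arrive as UTF-8 and are decoded through a java.nio Charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Illustrative stand-in for a charset-based header decode (not mime4j's code). */
final class HeaderCharsetDemo {
    static String decode(Charset charset, byte[] raw) {
        // Charset.decode substitutes a replacement character for bytes it cannot map.
        return charset.decode(ByteBuffer.wrap(raw)).toString();
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587.txt" is a two-character Chinese name followed by ".txt".
        String original = "\u4e2d\u6587.txt";
        byte[] raw = original.getBytes(StandardCharsets.UTF_8);

        String asAscii = decode(StandardCharsets.US_ASCII, raw);
        String asUtf8 = decode(StandardCharsets.UTF_8, raw);

        System.out.println("ASCII decode matches original: " + asAscii.equals(original));
        System.out.println("UTF-8 decode matches original: " + asUtf8.equals(original));
    }
}
```

The ASCII decode is lossy: every byte outside the ASCII range is replaced, so the original name cannot be recovered afterwards by re-encoding the resulting string. That is consistent with the observation that the fix has to happen at the point of decoding.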

One workable fix (tested and confirmed to work) is to import the apache-mime4j-1.6 source into the project and change the decode method in ContentUtil as follows:

public static String decode(ByteSequence byteSequence, int offset,
            int length) {
        return decode(CharsetUtil.UTF_8 // default charset changed here
        , byteSequence, offset, length);
    }

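The downside of this approach is that it duplicates open-source code inside your own project, and the project's package layout ends up looking odd. Alternatively, you can build the modified library and publish it to your company's Nexus repository.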

Original article: https://www.cnblogs.com/loveyou/p/9529856.html