SOAP Services With MTOM

SOAP is an XML-based protocol, which means that all data inside the SOAP envelope must be text. If you want to include binary data in a SOAP message, it must be converted to text too. The usual way to do this is to base64 encode the binary data and embed the resulting string inside the SOAP message. The diagram below shows a sample SOAP message with binary data embedded as a base64 string.

Sending Binary Data with SOAP Without MTOM

While this is a simple approach for dealing with binary data in SOAP, there are a few things to consider. Base64 encoding turns every 3 bytes into 4 characters, so the data grows by approximately 33%. For small amounts of binary data this probably won’t be an issue, but for larger volumes of data the increased message size can significantly impact performance.
Something else to consider is the overhead for the XML parsers that will consume the SOAP messages. A large binary object results in a huge base64 encoded string and more CPU-intensive parsing for the consumer.
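To put a number on that overhead, here is a minimal sketch using the JDK's java.util.Base64. The class name Base64Overhead and the 3,000,000 byte zero-filled array standing in for a PDF are assumptions made purely for illustration.

```java
import java.util.Base64;

class Base64Overhead {

    /** Length in characters of the padded base64 encoding of rawBytes bytes. */
    static long encodedLength(long rawBytes) {
        // Every group of up to 3 input bytes becomes 4 output characters.
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a binary document of about 3 MB.
        byte[] raw = new byte[3_000_000];
        String encoded = Base64.getEncoder().encodeToString(raw);

        System.out.println("raw bytes:     " + raw.length);        // 3000000
        System.out.println("encoded chars: " + encoded.length());  // 4000000
        System.out.println("formula:       " + encodedLength(raw.length));
    }
}
```

The encoded form is a third larger than the raw bytes, and that is before any line breaks or XML escaping are taken into account.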

Introducing MTOM

Message Transmission Optimization Mechanism, or MTOM for short, can be used to efficiently handle binary data transmission via SOAP. Rather than base64 encoding binary data and embedding it in the SOAP body, the binary data is sent as a MIME attachment. As shown in the diagram below, the binary data (a PDF in this case) is sent in the HTTP request/response as a MIME attachment. The SOAP message contains a unique key used to reference the MIME attachment.

Sending Binary Data over SOAP With MTOM

The SOAP message is pretty lean, as we’ve avoided the base64 encoded bloat that we saw previously. A smaller XML payload also means less resource-intensive parsing by the consumer.

Sample Code

In the next few sections I’ll show you how MTOM can be configured in a CXF service. The source code for this post includes a fully working MTOM enabled CXF service and integration test. Feel free to pull it from GitHub before reading on.

Schema Definition For Binary Elements

Our sample SOAP service returns simple bank account data, represented by the Account XSD type defined below.  On line 7 a new base64Binary typed Statement element has been added to represent a PDF document.
<xsd:complexType name="Account">
    <xsd:sequence>
        <xsd:element name="AccountNumber" type="xsd:string"/>
        <xsd:element name="AccountName" type="xsd:string"/>
        <xsd:element name="AccountBalance" type="xsd:double"/>
        <xsd:element name="AccountStatus" type="EnumAccountStatus"/>
        <xsd:element name="Statement" type="xsd:base64Binary"/>
    </xsd:sequence>
</xsd:complexType>

When the WSDL2Java process is run, JAXB generates a POJO with a Statement instance variable of type byte array as shown below.

JAXB Generated Model For Base64 Binary

Using a byte array for the Statement element means that consumers have to read the entire binary statement into memory in one go. This can be improved by telling JAXB to use a DataHandler instead of a byte array. A DataHandler exposes an InputStream, which allows the client application to stream the binary data if need be. This is particularly useful when dealing with large volumes of binary data.
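The difference is easiest to see on the consuming side. The sketch below copies an InputStream to an OutputStream in fixed-size chunks, so only one buffer's worth of data is held in memory at a time. It uses plain JDK streams so that it runs on its own; the class name StatementStreamer and the 8 KB buffer size are my own choices, and in a real client the InputStream would come from the DataHandler's getInputStream() method rather than from a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class StatementStreamer {

    /** Copies everything from in to out in 8 KB chunks and returns the byte count. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        // read() returns -1 once the end of the stream is reached.
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for the stream a DataHandler would provide.
        byte[] fakeStatement = new byte[20_000];
        try (InputStream in = new ByteArrayInputStream(fakeStatement);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            System.out.println("copied " + copy(in, out) + " bytes");
        }
    }
}
```

In practice the OutputStream would typically be a file or an HTTP response, which is where streaming pays off for large statements.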

To switch from byte array to DataHandler you need to update the XSD with the expected MIME content types. On line 7 below, xmime:expectedContentTypes="application/pdf" indicates that we are expecting the binary data to be of MIME type application/pdf. The xmime:expectedContentTypes attribute can be set to any valid MIME type or a comma-separated list of MIME types.

<xsd:complexType name="Account">
    <xsd:sequence>
        <xsd:element name="AccountNumber" type="xsd:string"/>
        <xsd:element name="AccountName" type="xsd:string"/>
        <xsd:element name="AccountBalance" type="xsd:double"/>
        <xsd:element name="AccountStatus" type="EnumAccountStatus"/>
        <xsd:element name="Statement" type="xsd:base64Binary" xmime:expectedContentTypes="application/pdf"/>
    </xsd:sequence>
</xsd:complexType>

When I run the WSDL2Java process, JAXB regenerates the domain model and the Statement element is now typed as a DataHandler and annotated with @XmlMimeType("application/pdf").

Generated JAXB Model When Expected MIME Type Specified

Configuring MTOM with CXF

The Spring configuration below defines an EndpointImpl bean using an injected CXF Bus. The endpoint is configured with two interceptors for logging, and MTOM is enabled on line 21 by simply setting the mtom-enabled key to true.

@Configuration
@ImportResource({ "classpath:META-INF/cxf/cxf.xml" })
@PropertySource("classpath:application.properties")
public class Config {
  
  @Bean
  public ServletRegistrationBean servletRegistrationBean(ApplicationContext context) {
    return new ServletRegistrationBean(new CXFServlet(), "/*");
  }

  @Bean
  public EndpointImpl serviceEndpoint(Bus cxfBus,
                                      AccountServiceEndpoint accountServiceEndpoint,
                                      @Value("${mtom-enabled}") Boolean mtomEnabled,
                                      LoggingInInterceptor inInterceptor,
                                      LoggingOutInterceptor outInterceptor) {

    EndpointImpl endpoint = new EndpointImpl(cxfBus, accountServiceEndpoint);
    endpoint.getInInterceptors().add(inInterceptor);
    endpoint.getOutInterceptors().add(outInterceptor);
    endpoint.getProperties().put("mtom-enabled", mtomEnabled);
    endpoint.publish("http://localhost:8080/mtom-demo/service");
    return endpoint;
  }
}
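
The ${mtom-enabled} placeholder in the configuration above is resolved from application.properties, so a minimal version of that file might contain just the single entry below. This is a sketch inferred from the @PropertySource and @Value annotations; the file in the sample project may well contain other keys.

```properties
# Read by @Value("${mtom-enabled}") and passed to the endpoint properties
mtom-enabled=true
```

Keeping the flag in a properties file makes it easy to switch MTOM off and compare the logged responses with and without it.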

The sample code includes an integration test that calls the service and logs the SOAP response. Below is an extract from the logged response showing the SOAP body and MIME attachment. Note that there is no binary data in the Statement element, but an xop:Include element instead. The href value on line 20 references the Content-ID value of the MIME attachment on line 30.
The MIME attachment content type on line 28 is application/pdf. This is consistent with the MIME content type we set in the XSD.

ID: 1
Response-Code: 200
Encoding: UTF-8
Content-Type: multipart/related; type="application/xop+xml"; boundary="uuid:74cee6e6-296e-44f8-9de6-a2d5da7b4f2b"; start="<root.message@cxf.apache.org>"; start-info="text/xml"
Headers: {}
Payload: --uuid:74cee6e6-296e-44f8-9de6-a2d5da7b4f2b
Content-Type: application/xop+xml; charset=UTF-8; type="text/xml"
Content-Transfer-Encoding: binary
Content-ID: <root.message@cxf.apache.org>

<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <AccountDetailsResponse xmlns="http://com/blog/samples/webservices/accountservice">
      <AccountDetails>
        <AccountNumber>12345</AccountNumber>
        <AccountName>Joe Bloggs</AccountName>
        <AccountBalance>3400.0</AccountBalance>
        <AccountStatus>Active</AccountStatus>
        <Statement>
          <xop:Include xmlns:xop="http://www.w3.org/2004/08/xop/include" href="cid:eec6e626-5536-4778-a821-37de8e1f018b-1@com"/>
        </Statement>
      </AccountDetails>
    </AccountDetailsResponse>
   </soap:Body>
</soap:Envelope>

--uuid:74cee6e6-296e-44f8-9de6-a2d5da7b4f2b
Content-Type: application/pdf
Content-Transfer-Encoding: binary
Content-ID: <eec6e626-5536-4778-a821-37de8e1f018b-1@com>

%PDF-1.4
%????
2 0 obj
<</Filter/FlateDecode/Length 565>>stream
x????j?0 E?z
-?E]???-??B? ?L ?,? ????_?H??2 ? ]?|t?????T?< ?=??K? B?? 7?????????a??:?9?~)C?????f$??w
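The link between the xop:Include href and the attachment's Content-ID header can be sketched in a few lines of Java. A cid: URI is the URL-encoded form of the Content-ID without its angle brackets, so resolving it means dropping the scheme, decoding any percent escapes and wrapping the result in angle brackets. The class name XopReference and method name toContentId are hypothetical; CXF does this matching for you, so this is only an illustration of the rule, not how CXF implements it.

```java
import java.net.URI;

class XopReference {

    /** Converts an xop:Include href such as "cid:abc@com" to the Content-ID value "<abc@com>". */
    static String toContentId(String href) {
        URI uri = URI.create(href);
        if (!"cid".equalsIgnoreCase(uri.getScheme())) {
            throw new IllegalArgumentException("not a cid: URI: " + href);
        }
        // getSchemeSpecificPart() returns the part after "cid:" with percent escapes decoded.
        return "<" + uri.getSchemeSpecificPart() + ">";
    }

    public static void main(String[] args) {
        // href taken from the logged response above
        String href = "cid:eec6e626-5536-4778-a821-37de8e1f018b-1@com";
        System.out.println(toContentId(href));
        // prints <eec6e626-5536-4778-a821-37de8e1f018b-1@com>
    }
}
```

The printed value is exactly the Content-ID header of the PDF attachment in the log extract.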

Sample Code

Feel free to grab the source from GitHub and have a play around. If you have any comments or questions, feel free to leave them below.