Thursday, April 25, 2013

Tokenize string into node list with XSLT 1.0 and custom XPath function

For XSLT 2.0, use the built-in tokenize function.

For XSLT 1.0, I'll show two ways below.

At the end of the post, I also demonstrate a way to test XSLT directly from the command line. It can prove invaluable if you need to debug a custom XPath function, as I am doing.

== Part I - use template ==

For XSLT 1.0, here is a sample that uses a recursive template. You can put the template in the same file, but I split it into two files and used import to keep the main XSL cleaner.

main.xsl

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:car="http://foo/bar">
  <xsl:import href="serialNumber.xsl"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- not(...) is used because a bare "false" in XPath 1.0 is a child-element name test, not a boolean -->
      <xsl:when test="not(contains(/inputStr, ','))">
        <car:Serial>
          <xsl:value-of select="/inputStr"/>
        </car:Serial>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="serial">
          <xsl:with-param name="commaStr" select="/inputStr"/>
        </xsl:call-template>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>

serialNumber.xsl:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:car="http://foo/bar">

  <xsl:template name="serial">
    <xsl:param name="commaStr"/>
    <xsl:if test="normalize-space($commaStr) != ''">
      <xsl:choose>
        <xsl:when test="contains($commaStr, ',')">
          <car:Serial>
            <xsl:value-of select="substring-before($commaStr, ',')"/>
          </car:Serial>
          <xsl:call-template name="serial">
            <xsl:with-param name="commaStr" select="substring-after($commaStr, ',')"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
          <car:Serial>
            <xsl:value-of select="$commaStr"/>
          </car:Serial>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

Ignoring the namespace details, here are some sample inputs and outputs:

<inputStr>Aaa</inputStr>
results in
<Serial>Aaa</Serial>

<inputStr>Aaa,bbb,ccc</inputStr>
results in
<Serial>Aaa</Serial><Serial>bbb</Serial><Serial>ccc</Serial>

<inputStr>Aaa,bbb,ccc,</inputStr>
will yield the same result.
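
The recursion is easier to follow when written out in a general-purpose language. Here is a minimal Java sketch that mirrors the `serial` template step by step. The class and method names are made up for illustration and are not part of the stylesheet, and `trim()` only approximates `normalize-space`. It also shows an edge case the samples above don't cover: an empty token in the middle, such as `Aaa,,ccc`, still produces an empty `Serial` element, because the `normalize-space` test looks at the whole remaining string rather than at the individual token.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that mirrors the recursive "serial" template; not part of the stylesheet.
class SerialTemplateSketch {

    // Equivalent of XPath substring-before(s, ','): "" when there is no comma.
    static String substringBefore(String s) {
        int i = s.indexOf(',');
        return i < 0 ? "" : s.substring(0, i);
    }

    // Equivalent of XPath substring-after(s, ','): "" when there is no comma.
    static String substringAfter(String s) {
        int i = s.indexOf(',');
        return i < 0 ? "" : s.substring(i + 1);
    }

    // Same control flow as the template: stop on a blank remainder,
    // otherwise emit the head and recurse on the tail while a comma is present.
    static void serial(String commaStr, List<String> out) {
        if (commaStr.trim().isEmpty()) {   // approximates normalize-space($commaStr) != ''
            return;
        }
        if (commaStr.contains(",")) {
            out.add(substringBefore(commaStr));
            serial(substringAfter(commaStr), out);
        } else {
            out.add(commaStr);
        }
    }

    // Convenience wrapper that collects the tokens into a list.
    static List<String> tokens(String input) {
        List<String> out = new ArrayList<>();
        serial(input, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("Aaa,bbb,ccc"));   // [Aaa, bbb, ccc]
        System.out.println(tokens("Aaa,bbb,ccc,"));  // [Aaa, bbb, ccc]
        System.out.println(tokens("Aaa,,ccc"));      // [Aaa, , ccc]
    }
}
```

Note that neither the template nor this sketch trims the individual tokens, so `Aaa, bbb` yields ` bbb` with its leading space.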

== Part II - use custom Java XPath function ==

I developed the project based on these two posts: https://blogs.oracle.com/bwb/resource/custom_xpath_functions/Creating_Custom_XPath_functions_with_JDeveloper_FullPost.html and https://blogs.oracle.com/reynolds/entry/building_your_own_path

My intention was to create a generic tokenizer like Java's StringTokenizer. I made it work, but with mixed results.

So far I have only made it work with XSLT. I still need to figure out how to make it work with BPEL XPath. I suspect something is not quite right when I pass a single node to the XPath function whereas the signature expects a List. That's an experiment I need to do later. When I do, I'll update this post.
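
For context on what "like Java StringTokenizer" implies, here is a small sketch of how the JDK class behaves. `java.util.StringTokenizer` silently skips empty tokens, and it treats every character of the delimiter argument as a separate delimiter. The class name `StringTokenizerDemo` and the helper `split` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; only java.util.StringTokenizer itself is real JDK API.
class StringTokenizerDemo {

    // Collects the trimmed tokens of str, using each character of delims as a delimiter.
    static List<String> split(String str, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(str, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken().trim());
        }
        return out;
    }

    public static void main(String[] args) {
        // Empty tokens (",,") and the trailing delimiter are skipped.
        System.out.println(split("aaa,,bb, cc,", ","));  // [aaa, bb, cc]
        // Both ',' and ';' act as delimiters here.
        System.out.println(split("a;b,c", ",;"));        // [a, b, c]
    }
}
```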

Anyway, here is the Java class and the descriptor (just a quick and dirty implementation for demo purposes, with two input parameters):

Tokenizer.java

package com.foo.util.StringTokenizer;

import java.io.ByteArrayInputStream;
import java.util.StringTokenizer;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class Tokenizer {
    // Splits str on the delimiter characters in token and returns the
    // resulting <node> elements as a NodeList (null if parsing fails).
    public static Object tokenizeString(String str, String token) {
        NodeList nList = null;
        StringBuilder xmlStr = new StringBuilder("<root>");
        try {
            StringTokenizer st = new StringTokenizer(str, token);
            while (st.hasMoreTokens()) {
                String nt = st.nextToken().trim();
                System.out.println(nt); // debug output, visible when run from the command line
                // Tokens are not XML-escaped, so '<' or '&' in the input will break the parse.
                xmlStr.append("<node>").append(nt).append("</node>");
            }
            xmlStr.append("</root>");
            Element root = DocumentBuilderFactory
                    .newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlStr.toString().getBytes("UTF-8")))
                    .getDocumentElement();
            nList = root.getChildNodes();
        } catch (Exception e) {
            System.out.println("***ex=" + e.getMessage());
        }
        return nList;
    }

    public static void main(String[] args) throws Exception {
        NodeList nList = (NodeList) Tokenizer.tokenizeString("aaa,bb,cc", ",");
        for (int i = 0; i < nList.getLength(); i++) {
            Node tNode = nList.item(i);
            System.out.println("***" + tNode.getTextContent());
        }
    }
}
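
One weakness of building the XML by string concatenation is that a token containing `<` or `&` produces malformed XML, and the parse then fails. A possible alternative is to create the nodes with the DOM API, which takes care of escaping. The following is a minimal sketch under that assumption and has not been verified inside SOA Suite. The class name `DomTokenizer` is hypothetical; the method keeps the same `tokenizeString(str, token)` shape as above.

```java
import java.util.StringTokenizer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical variant of the Tokenizer class that builds nodes directly
// instead of parsing a concatenated XML string.
class DomTokenizer {

    static NodeList tokenizeString(String str, String token) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);

            StringTokenizer st = new StringTokenizer(str, token);
            while (st.hasMoreTokens()) {
                Element node = doc.createElement("node");
                // setTextContent stores the raw text; no manual escaping is needed.
                node.setTextContent(st.nextToken().trim());
                root.appendChild(node);
            }
            return root.getChildNodes();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("Could not create a DocumentBuilder", e);
        }
    }

    public static void main(String[] args) {
        NodeList nodes = tokenizeString("a&b, c<d ,e", ",");
        for (int i = 0; i < nodes.getLength(); i++) {
            System.out.println(nodes.item(i).getTextContent());
        }
    }
}
```

Unlike the version above, this sketch throws on failure instead of returning null, so a broken call shows up as an error rather than as an empty result.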

descriptor:
<?xml version="1.0" encoding="UTF-8"?>
<soa-xpath-functions
  xmlns="http://xmlns.oracle.com/soa/config/xpath"
  xmlns:tn="http://www.oracle.com/XSL/Transform/java/com.foo.util.StringTokenizer.Tokenizer">
  <function name="tn:tokenizeString">
    <className>com.foo.util.StringTokenizer.Tokenizer</className>
    <return type="node-set"/>
    <params>
      <param name="str" type="string"/>
      <param name="token" type="string"/>
    </params>
    <desc/>
    <detail>
      <![CDATA[This function breaks up a delimiter-separated string and returns a tokenized list of nodes.]]>
    </detail>
  </function>
</soa-xpath-functions>

Snippet of test.xsl:
...
                xmlns:tn="http://www.oracle.com/XSL/Transform/java/com.foo.util.StringTokenizer.Tokenizer"
...
<xsl:variable name="sNodes" select="tn:tokenizeString(/foo_ESB/DATA/ESB_ORDER/SERIAL_NUMBER_RECEIVED, ',')"/>

<xsl:for-each select="$sNodes">
  <xsl:value-of select="'&lt;serial&gt;'"/>
  <xsl:value-of select="."/>
  <xsl:value-of select="'&lt;/serial&gt;'"/>
</xsl:for-each>

== Testing from the command line ==

One problem with testing inside the JDev XSL mapper is that you can't see your Java debug output. You can test your implementation directly from the command line instead, which also gives you additional error messages if something is wrong.

Here is how I do it in my environment. 

1. set classpath=C:\Oracle\Middleware\oracle_common\modules\oracle.xdk_11.1.0\xmlparserv2.jar;c:\aproj\src\StringTokenizer\deploy\stringTokenizer.jar;C:\Oracle\Middleware\jdeveloper\soa\modules\oracle.soa.fabric_11.1.1\fabric-runtime.jar

2. test with an actual input source and XSLT file:
    java oracle.xml.parser.v2.oraxsl    input.xml    test.xsl

3. test with Java only (no XSLT):
      java com.foo.util.StringTokenizer.Tokenizer
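
If the Oracle XDK jar is not at hand, a plain template such as the one in Part I can also be exercised with the XSLT 1.0 processor bundled with the JDK (the standard `javax.xml.transform` API). The sketch below is a stand-in under stated assumptions, not a replacement for `oraxsl`: the class name `JdkXsltCheck` is made up, the stylesheet is an inlined, namespace-free copy of the `serial` template, and the JDK processor most likely will not resolve the `tn:` extension function from Part II, since that namespace convention belongs to Oracle's processor.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical test harness; runs an inlined copy of the Part I template with the JDK's XSLT engine.
class JdkXsltCheck {

    // Namespace-free copy of the recursive "serial" template, kept in one stylesheet.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<xsl:call-template name='serial'>"
      + "<xsl:with-param name='commaStr' select='/inputStr'/>"
      + "</xsl:call-template>"
      + "</xsl:template>"
      + "<xsl:template name='serial'>"
      + "<xsl:param name='commaStr'/>"
      + "<xsl:if test=\"normalize-space($commaStr) != ''\">"
      + "<xsl:choose>"
      + "<xsl:when test=\"contains($commaStr, ',')\">"
      + "<Serial><xsl:value-of select=\"substring-before($commaStr, ',')\"/></Serial>"
      + "<xsl:call-template name='serial'>"
      + "<xsl:with-param name='commaStr' select=\"substring-after($commaStr, ',')\"/>"
      + "</xsl:call-template>"
      + "</xsl:when>"
      + "<xsl:otherwise><Serial><xsl:value-of select='$commaStr'/></Serial></xsl:otherwise>"
      + "</xsl:choose>"
      + "</xsl:if>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML string and returns the serialized result.
    static String transform(String inputXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // Should print one Serial element per token, ignoring the trailing comma.
        System.out.println(transform("<inputStr>Aaa,bbb,ccc,</inputStr>"));
    }
}
```

Compile and run it with nothing but the JDK on the classpath, for example `javac JdkXsltCheck.java` followed by `java JdkXsltCheck`.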
