XML parsing of words in several languages


I'm trying to read a Unicode string from an XML file.

When the string is extracted, each "\" is transformed into "\\".

It is important to leave the string exactly as it is, since it is a Unicode string.

I expected the string to be extracted with the single "\".


For example:

The string "\u65E5\u672C\u8A9E" is the Japanese word for "Japanese".

But the string is being extracted as "\\u65E5\\u672C\\u8A9E", so the Unicode characters are not shown correctly.
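To make the example concrete, here is a minimal sketch of one possible reading of it; it is an assumption, not a confirmed diagnosis. In Java source, "\u65E5\u672C\u8A9E" is decoded by the compiler into three characters. An XML parser applies no such decoding, so the same text in an XML file arrives as 18 literal characters, and a debugger or escaped printout would show each backslash doubled. The class `EscapeDemo` and the helper `decodeUnicodeEscapes` are hypothetical names introduced only for this illustration.

```java
public class EscapeDemo {

    // Returns true if the four characters starting at 'from' are all hex digits.
    static boolean isHex4(String s, int from) {
        if (from + 4 > s.length()) {
            return false;
        }
        for (int k = from; k < from + 4; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: replaces each literal backslash-u escape (backslash,
    // 'u', four hex digits) with the single character it names.
    // Everything else is copied through unchanged.
    public static String decodeUnicodeEscapes(String raw) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < raw.length()) {
            if (raw.startsWith("\\u", i) && isHex4(raw, i + 2)) {
                out.append((char) Integer.parseInt(raw.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(raw.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Escapes in a Java literal are decoded by the compiler: 3 characters.
        String fromSource = "\u65E5\u672C\u8A9E";
        // The same text read from a file keeps its backslashes: 18 characters.
        String fromXml = "\\u65E5\\u672C\\u8A9E";

        System.out.println(fromSource.length());
        System.out.println(fromXml.length());
        System.out.println(decodeUnicodeEscapes(fromXml).equals(fromSource));
    }
}
```

Under this reading the parser has not changed the string at all. The text in the file was never a Java escape to begin with, so it has to be decoded explicitly if the file cannot be changed.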


Here is the XML I am using and the Java code to parse it.





The Java code is:

    import java.io.File;
    import java.net.URL;
    import javax.xml.parsers.DocumentBuilder;
    import javax.xml.parsers.DocumentBuilderFactory;
    import org.w3c.dom.Document;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;

    URL fileUrl = PPMCulture.class.getResource(fileName);
    File file = new File(fileUrl.getPath());
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    Document doc = db.parse(file);
    Node root = doc.getFirstChild();
    NodeList childNodes = root.getChildNodes();
    Node childNode;

    for (int i = 0; i < childNodes.getLength(); i++) {
        childNode = childNodes.item(i);
        // Skip the whitespace text nodes between elements; they have no children.
        if (childNode.getNodeType() == Node.TEXT_NODE) {
            continue;
        }
        NodeList childProperties = childNode.getChildNodes();
        String nativeName = "";
        for (int j = 0; j < childProperties.getLength(); j++) {
            Node childProperty = childProperties.item(j);
            if (childProperty.getNodeName().equals("languageCode")) {
                languageCode =

            } else if (childProperty.getNodeName().equals("nativeName")) {
                // Strip the leading backslash from the extracted text.
                nativeName = removeFirstChar(getNodeTextValue(childProperty), '\\');
            }
        }
    }
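For comparison, the following self-contained sketch parses a small in-memory document with the same DOM API. The XML in it is an assumption made for illustration: only the element names `languageCode` and `nativeName` come from the code above, while the `cultures` and `culture` wrappers, the class `CultureXmlSketch`, and the helper names are hypothetical. It stores the native name as XML numeric character references (`&#x65E5;` and so on), which the parser decodes itself, so no backslash handling is needed.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CultureXmlSketch {

    // Assumed document layout; only languageCode and nativeName come from the post.
    public static final String SAMPLE_XML =
          "<cultures>"
        + "  <culture>"
        + "    <languageCode>ja</languageCode>"
        + "    <nativeName>&#x65E5;&#x672C;&#x8A9E;</nativeName>"
        + "  </culture>"
        + "</cultures>";

    // Maps each languageCode to its nativeName, in document order.
    public static Map<String, String> readNativeNames(String xml) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));

        Map<String, String> result = new LinkedHashMap<>();
        NodeList children = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Only element nodes carry properties; whitespace text nodes are skipped.
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue;
            }
            Element culture = (Element) child;
            result.put(textOf(culture, "languageCode"), textOf(culture, "nativeName"));
        }
        return result;
    }

    // Text content of the first descendant with the given tag, or "" if absent.
    static String textOf(Element parent, String tag) {
        NodeList matches = parent.getElementsByTagName(tag);
        return matches.getLength() == 0 ? "" : matches.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> names = readNativeNames(SAMPLE_XML);
        // The character references arrive already decoded as three characters.
        System.out.println(names.get("ja").length());
    }
}
```

Writing the characters directly into a UTF-8 encoded file should work the same way. Character references are used here only so that the sample stays pure ASCII.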



