Wednesday, August 21, 2013

JSON parsing in JavaScript

Recently I've had some blocking issues with code I wrote a while ago. A colleague tried to reuse that code with a different back-end and kept getting the exception:
JSON.parse: unexpected character
Using Firebug, I promptly inspected the JSON content that was retrieved from the new service and by copying and pasting it into http://jsonlint.com/ it seemed to be valid JSON. However, the exception was still there and clearly indicated an issue with the format of the incoming content. I therefore inspected the code of both client and server and it turns out that (1) the old JavaScript snippet was using the eval function and (2) the new back-end, mimicking what my old testing code was doing, was generating through a servlet some testing JSON just by concatenating some strings and serializing the results as JSON.

The serialization done by the service included some '\n' (newlines) that, in the old ecosystem, made the JSON content easier to read, apparently without disrupting the eval function. The serialized content also included some dates in the format 'MM/dd/yyyy HH:mm:ss Z'.
The eval function invokes the JavaScript compiler. Since JSON is, for practical purposes, a subset of JavaScript, the compiler will parse the text and produce an object structure (read more on json.org).
Strangely, the same code that worked fine in my configuration was failing in my colleague's configuration, where it seemed to fail while interpreting the format of the dates.

As I had a similar problem months ago, I decided to move away from the eval function:
    In the old GWT code, making use of JSNI:

    public static native JavaScriptObject parseJson(String jsonStr) /*-{
        return eval('(' + jsonStr + ')');
    }-*/;

    The equivalent in pure JavaScript would be (remember the parentheses,
    as they turn the code into an expression that returns a value, rather
    than just code to run):

    function parseJson(jsonStr) {
         return eval('(' + jsonStr + ')');
    }
The replacement is the more recent JSON.parse, which validates the JSON content, unlike eval, which may be faster but allows the string being parsed to contain absolutely anything, including function calls.
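A minimal sketch of that difference, assuming an environment with native JSON.parse. The payload strings, the tryParse helper and the sideEffect flag are made up for illustration and are not part of the original code:

```javascript
// Hypothetical payloads, not taken from the original service.
var validJson = '{"name":"Paolo Ciccarese"}';
var notJson = '(function () { sideEffect = true; return {}; })()';

// Flag that the second payload tries to flip when it is executed.
var sideEffect = false;

// Returns the parsed value, or null when the text is not valid JSON.
function tryParse(text) {
    try {
        return JSON.parse(text);
    } catch (e) {
        return null;
    }
}

var parsed = tryParse(validJson);   // an object with a "name" property
var rejected = tryParse(notJson);   // null: JSON.parse refuses to run code

// eval, by contrast, executes whatever it is given.
eval('(' + notJson + ')');          // sideEffect is now true
```

The point is not the helper itself but that the second payload never runs under JSON.parse, while eval runs it without complaint.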
Native JSON support is included in newer browsers and in the newest ECMAScript (JavaScript) standard. Similar features were already available in some JS libraries such as jQuery (http://api.jquery.com/jQuery.parseJSON/).
See http://www.w3schools.com/json/json_eval.asp for browser and software support.
When using JSON.parse, however, control characters inside string values must be escaped.
According to section 2.5 of the JSON spec at ietf.org/rfc/rfc4627.txt: "All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F)."
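The rule quoted above can be checked with a small sketch, assuming native JSON.parse is available. The wrap and parses helpers are hypothetical names introduced only for this example:

```javascript
// Build a JSON string literal containing one character between "a" and "b".
function wrap(ch) {
    return '"a' + ch + 'b"';
}

// Returns true when JSON.parse accepts the text.
function parses(text) {
    try {
        JSON.parse(text);
        return true;
    } catch (e) {
        return false;
    }
}

var rawAccepted = 0;
var escapedAccepted = 0;
for (var code = 0x00; code <= 0x1f; code++) {
    // The control character itself, placed unescaped inside the string.
    var raw = String.fromCharCode(code);
    // The \u00XX escape form of the same character.
    var hex = code.toString(16);
    var escaped = '\\u00' + (hex.length < 2 ? '0' + hex : hex);
    if (parses(wrap(raw))) rawAccepted++;
    if (parses(wrap(escaped))) escapedAccepted++;
}
// rawAccepted stays 0; escapedAccepted counts all 32 characters
```

Every unescaped character in the range U+0000 through U+001F is rejected inside a string, while the escaped form of each one is accepted.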
In that specific testing case the newline character could have been present in both the JSON values and in between JSON elements.
    // 1) Example of JSON with newline in the value
    [{"name":"Paolo Ciccarese \n"}]

    // 2) Example of JSON with newline between elements
    [\n{"name":"Paolo Ciccarese"}]
Case 1) above can be addressed by escaping the content (for instance, replacing '\n' with '\\n'). Case 2) is illegal in JSON if the text really contains a backslash followed by n outside of a string; an actual line-feed character between elements is insignificant whitespace under RFC 4627 and is accepted by JSON.parse as well as by eval.
In my case the servlets were generating the '\n' because of the use of out.println instead of the harmless out.print.
So in my case the parseJson function became:
    In the old GWT code, making use of JSNI:

    public static native JavaScriptObject parseJson(String jsonStr) /*-{
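        try {
            // Backslashes are escaped first so the ones added below are not doubled.
            // The whole text is rewritten, so this assumes no whitespace between
            // elements and no already-escaped sequences in the input.
            var jsonStr = jsonStr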
                .replace(/[\\]/g, '\\\\')
                .replace(/[\/]/g, '\\/')
                .replace(/[\b]/g, '\\b')
                .replace(/[\f]/g, '\\f')
                .replace(/[\n]/g, '\\n')
                .replace(/[\r]/g, '\\r')
                .replace(/[\t]/g, '\\t')
                .replace(/\\'/g, "\\'");
            return JSON.parse(jsonStr);
        } catch (e) {
            alert("Error while parsing the JSON message: " + e);
        }
    }-*/;

    In pure JavaScript this would be:

    function parseJson(jsonStr) {
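        try {
            // Backslashes are escaped first so the ones added below are not doubled.
            // The whole text is rewritten, so this assumes no whitespace between
            // elements and no already-escaped sequences in the input.
            var jsonStr = jsonStr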
                .replace(/[\\]/g, '\\\\')
                .replace(/[\/]/g, '\\/')
                .replace(/[\b]/g, '\\b')
                .replace(/[\f]/g, '\\f')
                .replace(/[\n]/g, '\\n')
                .replace(/[\r]/g, '\\r')
                .replace(/[\t]/g, '\\t')
                .replace(/\\'/g, "\\'");
            return JSON.parse(jsonStr);
        } catch (e) {
            alert("Error while parsing the JSON message: " + e);
        }
    }
  
Therefore in the case:
    [{"name":"Paolo Ciccarese \n"}]
the newline is not interpreted as a newline in the JSON source anymore but as a newline in the JSON data, which is perfectly fine.
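As a small illustration, assuming native JSON.parse, here is what the parsed data looks like once the newline is written in escaped form. The variable names are only for this example:

```javascript
// JSON text in which the newline is written as the two characters \ and n.
var jsonText = '[{"name":"Paolo Ciccarese \\n"}]';

var data = JSON.parse(jsonText);
var parsedName = data[0].name;

// The last character of the parsed value is a single line feed (code 10).
var lastCharCode = parsedName.charCodeAt(parsedName.length - 1);
```

The escape sequence lives only in the JSON source; the resulting JavaScript string holds a real newline character.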

In conclusion, I am still not sure why the interpretation of the dates was failing with the eval method. But with the JSON.parse approach the problem is gone.

I also found this: 'A fast and secure JSON parser in JavaScript'. I have not had time to check it out yet, but it looks promising: it 'does not attempt to validate the JSON, so may return a surprising result given a syntactically invalid input, but it does not use eval so is deterministic and is guaranteed not to modify any object other than its return value'.

Saturday, July 20, 2013

Domeo, Annotation Framework, Catch Annotation Hub and Grails Plugins architecture

I have always found organizing big projects into components a reassuring idea.
Component Oriented Programming? I'll let you decide if that is what I mean. I’ve read several discussions on the topic of Component Oriented Programming vs. Object Oriented Programming, and I am personally one of those who believe the two strategies are complementary and not in competition. As I am not interested in debating the theoretical differences, I will stick to what I normally do rather than what I think.
That is one of the reasons I’ve always liked - and I still like - OSGi, and that is also one of the reasons I’ve always been attracted to the Grails Plugins architecture.
The component-oriented approach did not always pay off. Occasionally I just gave up when I found myself fighting with the technology of the moment, which was getting a little in the way. I am sure most of my problems were related to my limited knowledge of that particular technology... still, deadlines are deadlines and I needed to get things done.
I am certainly not the first nor the last developer celebrating the Grails Plugin Oriented Architecture. Here is a blog post that shows how a domain class defined in a plugin can be reused by other components of the architecture.

However, I have been thinking about the OSGi-based Eclipse architecture for a long time and I even tried to develop a lighter Java framework for developing applications along the same lines. Naturally, since I've been using Grails, I’ve been thinking about how to reproduce the same behavior in web applications by using Grails plugins. Basically I am talking about conveniently leveraging plugins to benefit from all the perks of the Grails platform: domain classes, services, controllers and views. I will defer some of the technical details to future posts. Meanwhile I wish to provide a little context.

I am thinking of leveraging the plugin architecture for a project called CATCH that I’ve been working on for a tiny grant awarded by Harvard Library Labs. As the Domeo Annotation Tool already provided some of the features I need for CATCH, I've decided to refactor and spin off some of its components. I've created a new GitHub project called Annotation Framework, which will collect all the new improved modules that will later be used by both Domeo and CATCH.


CATCH Annotation Hub

The goal of CATCH is to provide a hub for collecting, searching and sharing annotations produced by several clients. These include the Domeo annotation client, HighBrow - an annotation client developed at Harvard by Reinhard Engels - and annotator.js - an open-source JavaScript library and tool developed by Nick Stenning - that can be added to any webpage to make it annotatable.

Both CATCH and the older sister project Domeo are meant to be installed in several instances that should be able to communicate with each other in a federated architecture. You can think of this as a series of Annotation Framework Nodes that are distributed and connected so that when a user performs a search on one of the nodes, it can also find results that have been created and stored in other authorized/linked nodes. All with access control...
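To make the idea a little more concrete, here is a toy in-memory sketch of such a search. Everything in it is an assumption made for illustration: the createNode, link and federatedSearch names, the boolean authorized flag standing in for access control, and the one-hop lookup do not come from CATCH, Domeo or the Annotation Framework.

```javascript
// Hypothetical in-memory model of a node holding plain-text annotations.
function createNode(name, annotations) {
    return { name: name, annotations: annotations, links: [] };
}

// Link `from` to `to`; `authorized` stands in for whatever access control is used.
function link(from, to, authorized) {
    from.links.push({ node: to, authorized: authorized });
}

// Search the local node, then every authorized linked node (one hop only).
function federatedSearch(node, term) {
    var matches = function (n) {
        return n.annotations
            .filter(function (a) { return a.indexOf(term) !== -1; })
            .map(function (a) { return { source: n.name, text: a }; });
    };
    var results = matches(node);
    node.links.forEach(function (l) {
        if (l.authorized) {
            results = results.concat(matches(l.node));
        }
    });
    return results;
}

var nodeA = createNode('A', ['protein binding note']);
var nodeB = createNode('B', ['protein folding note']);
var nodeC = createNode('C', ['protein private note']);
link(nodeA, nodeB, true);    // B shares its annotations with A
link(nodeA, nodeC, false);   // C is linked but not authorized

var found = federatedSearch(nodeA, 'protein');
```

A search started on node A returns its own match plus the one from B, while the annotation stored on the unauthorized node C is left out.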