Safe JSON

Update, March 5th 2007: Important change to the Safe JSON recommendation detailed below. Plain JSON is not as safe as people think, but it can still be made safe.

We have been investigating the security implications of having a JSON API in Connections. It turns out that it is very easy to leave pretty big security exposures in an application if the API isn’t designed carefully. The exposure in this case is rogue sites being able to get at data made available via a JSON API. The truly frightening part is that applications installed on a corporate intranet can leak data to internet sites should a user visit a rogue site. BTW, these exposures apply equally to formally published APIs such as Yahoo’s and to any internal JSON APIs used for AJAX tricks.

As far as I can make out there are three different approaches used with JSON APIs. Before detailing the vulnerabilities I’ll highlight the three approaches using the Yahoo examples (you might want to familiarize yourself with the examples before reading any further). The three approaches are:

Approach 1 – Plain JSON

Simply return JSON, i.e.

{
  "Image": {
    "Width":800,
    "Height":600,
    "Title":"View from 15th Floor",
    "Thumbnail":
    {
      "Url":"http:\/\/scd.mm-b1.yimg.com\/image\/481989943",
      "Height": 125,
      "Width": "100"
    },
    "IDs":[ 116, 943, 234, 38793 ]
  }
}

Approach 2 – var assignment

Assign the JSON object to some variable that can then be accessed by the embedding application (not an approach used by Yahoo).

var result = {
  "Image": {
    "Width":800,
    "Height":600,
    "Title":"View from 15th Floor",
    "Thumbnail":
    {
      "Url":"http:\/\/scd.mm-b1.yimg.com\/image\/481989943",
      "Height": 125,
      "Width": "100"
    },
    "IDs":[ 116, 943, 234, 38793 ]
  }
}

Approach 3 – function callback

When calling the JSON web service, pass the name of a callback function as a parameter. The response then passes the JSON object as an argument to this callback function.

callbackFunction( {
  "Image": {
    "Width":800,
    "Height":600,
    "Title":"View from 15th Floor",
    "Thumbnail":
    {
      "Url":"http:\/\/scd.mm-b1.yimg.com\/image\/481989943",
      "Height": 125,
      "Width": "100"
    },
    "IDs":[ 116, 943, 234, 38793 ]
  }
})
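
To make the three response shapes concrete, here is a minimal sketch of how a server might serialize the same data in each style. The helper name `renderResponse` and its `style` and `name` parameters are hypothetical, invented for this illustration; they are not part of Yahoo's or any other real API.

```javascript
// Hypothetical helper: serialize one data object in each of the three styles.
function renderResponse(data, style, name) {
  const json = JSON.stringify(data);
  switch (style) {
    case "plain":    // Approach 1: the body is JSON and nothing else
      return json;
    case "var":      // Approach 2: a JavaScript statement assigning to a variable
      return "var " + name + " = " + json + ";";
    case "callback": // Approach 3: a JavaScript call to a consumer-supplied function
      return name + "(" + json + ");";
    default:
      throw new Error("unknown style: " + style);
  }
}

const image = { Image: { Width: 800, Height: 600, IDs: [116, 943, 234, 38793] } };
const plain = renderResponse(image, "plain");
const viaVar = renderResponse(image, "var", "result");
const viaCallback = renderResponse(image, "callback", "callbackFunction");
```

Only the first body is JSON. The other two are JavaScript programs that happen to contain JSON, which is what the rest of this post turns on.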

All approaches can be used via an XMLHttpRequest followed by a JavaScript eval, but as Yahoo points out, Approaches 2 & 3, unlike Approach 1, don’t "run afoul of browser security restrictions that prevent files from being loaded across domains." as…

"Using JSON and callbacks, you can place the Yahoo! Web Service request inside a <script> tag, and operate on the results with a function elsewhere in the JavaScript code on the page. Using this mechanism, the JSON output from the Yahoo! Web Services request is loaded when the enclosing web page is loaded. No proxy or server trickery is required."

Indeed they have successfully navigated the browser security restrictions, which I should point out is probably fine for Yahoo as ALL their services only expose publicly available data. However, if a developer coding up an application that contains private data uses the same approach (i.e. Approach 2 or 3) then they have exposed the application to a pretty simple attack. BTW, I’m defining private data to be any data that should not be publicly accessible to the entire world (this probably covers most data on a corporate intranet but also includes any data that requires authentication prior to access). Here’s an example.

A user logs into a wiki on the corporate intranet. This wiki provides a JSON API with a callback function (Approach 3). The user then visits a rogue site on the internet. The page from the rogue site, when rendered in the user’s browser, performs a JavaScript include (a script tag) pointing at the wiki’s JSON API and passes the name of a callback function. The browser makes that request as the logged-in user, so data from the wiki is handed to the rogue page’s JavaScript function via the callback. Further JavaScript on the page can then form POST the data back to the rogue site, and so the data can be stolen. Not good.

Approach 1, on the other hand, does not contain this vulnerability as it can’t be used via a JavaScript include. If attempted, it does not make any data available on the page because a bare JSON object is not valid JavaScript; it results in a JavaScript error instead, and so is safe for JSON APIs that contain private data (but see the March 5th update below for a caveat about top-level arrays).
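
The difference can be reproduced outside a browser. The sketch below is a simulation only, not code aimed at any real service: `runAsScript` is a hypothetical stand-in for what a script include does with a response body, using `new Function` to parse and run the text with a single function in scope. In a real browser the body would run in the page's global scope instead.

```javascript
// Simulates a <script> include: parse the response body as JavaScript and run
// it with one function in scope, standing in for a global defined by the
// including page.
function runAsScript(body, callbackName, callback) {
  const script = new Function(callbackName, body);
  script(callback);
}

const secret = { Title: "View from 15th Floor" };
const seen = [];

// Approach 3: the body is a function call, so the including page receives the data.
runAsScript("steal(" + JSON.stringify(secret) + ");", "steal", (data) => seen.push(data));

// Approach 1: the body is a bare JSON object, which is not a valid program.
let plainError = null;
try {
  runAsScript(JSON.stringify(secret), "steal", (data) => seen.push(data));
} catch (err) {
  plainError = err;
}
```

After this runs, `seen` holds one object (from the callback body) and `plainError` is a `SyntaxError` (from the plain JSON body). The opening brace is read as the start of a block statement, so the colon after the first key cannot be parsed.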

Recommendation

I’m going to tentatively propose the following recommendation and would welcome feedback.

When developing a JSON API that contains data that should not be publicly accessible to the world, use Approach 1, i.e. return plain JSON. Update: The JSON returned MUST be of type "Serialized Object" and not of type "Array" (as defined by the JSON spec). (See the March 5th update below for the rationale behind this change.) If the data can be publicly exposed then Approaches 2 & 3 have significant advantages in terms of consumability.
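
One way to enforce the recommendation is to funnel every private response through a single serializer that refuses anything other than a top-level object. This is a minimal sketch under that assumption; `toSafeJson` is a hypothetical name, and a real application would hook the check into whatever framework writes its responses.

```javascript
// Hypothetical guard: serialize a value only if its top level is a JSON object.
function toSafeJson(value) {
  const isObject =
    value !== null && typeof value === "object" && !Array.isArray(value);
  if (!isObject) {
    throw new TypeError("top-level JSON value must be an object, not an array or primitive");
  }
  return JSON.stringify(value);
}

// A list is wrapped in an object rather than returned as a bare array.
const contacts = [["ct", "Your Name", "foo@gmail.com"]];
const safeBody = toSafeJson({ contacts: contacts });
```

Arrays nested inside the object are untouched; only the outermost value is checked.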

Update: March 5th 2007

Joe has pointed out that care still needs to be taken even when using a plain JSON return (Approach 1). From my testing, and as others have pointed out, the vulnerability that Joe is referring to only applies when returning JSON of type "array" (section 2.3 of the JSON standard). If you return JSON of type "serialized object" (section 2.2) then, at the moment, I know of no vulnerability. It’s worth mentioning that arrays can still be present in the JSON as long as they are not at the top level. The example in Approach 1 above is not vulnerable to attack even though it contains an embedded array. The following structure is vulnerable, though

[["ct","Your Name","foo@gmail.com"], ["ct","Another Name","bar@gmail.com"] ]

as Google knows only too well.
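
The reason the two JSON types behave differently is that a top-level array is, on its own, a valid JavaScript program, whereas a top-level object is not. The sketch below checks only that precondition; it does not reproduce the attack itself, whose mechanism is not described in this post. `parsesAsScript` is a hypothetical helper name.

```javascript
// Returns true if the text parses as a JavaScript program body, false on a
// syntax error. new Function only parses here; the function is never called.
function parsesAsScript(text) {
  try {
    new Function(text);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const topLevelArray =
  '[["ct","Your Name","foo@gmail.com"], ["ct","Another Name","bar@gmail.com"] ]';
const topLevelObject = '{"contacts": [["ct","Your Name","foo@gmail.com"]]}';

// An array literal is a valid expression statement, so a script include runs it.
const arrayParses = parsesAsScript(topLevelArray);
// Braces open a block statement, and the ':' after the key is then a syntax error.
const objectParses = parsesAsScript(topLevelObject);
```

Here `arrayParses` is `true` and `objectParses` is `false`, which is why wrapping the array in an object closes the hole.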

Anyway, I have updated my recommendation.  It remains tentative.

38 thoughts on “Safe JSON”

  1. Approach 2 and 3 should, simply, NEVER, EVER, EVER be used. There are plenty of libraries available today to parse JSON data structures, and none of them will EVER, EVER be able to read the whacked out Approach 2 and 3 styles. EVAR.

    Data, baby, data!

  2. Patrick,

    I have to disagree with you here. Approaches 2 & 3 are much more consumable than Approach 1, and if the data is publicly available then there is no security exposure to the JSON-producing site. Yahoo’s API is a great example here.

    However, I do agree that the consuming site now exposes itself to attacks from the producer. When using APIs such as Yahoo’s, the consuming site needs to assess the risk and the data that it is now exposing, e.g. Yahoo’s JavaScript could sniff the DOM of the page that it finds itself included in and send data back to Yahoo. So it’s very important that the consumer “trusts” the producer.

  3. I’m with Patrick – use Approach 1 all the way. If you want to increase consumability, then *separately provide* a consumable API that a consumer may choose to use if they want to.

    Besides, Approaches 2 and 3 aren’t valid JSON. They are, however, valid JavaScript.

  4. True, but whatever it’s called there are folks doing things that way. It’s actually not so bad if the data involved is not sensitive; there are really quite a few very interesting things that can be done with the callback approach. For myself, I think the larger issue of trust is far more important. Any JavaScript embedded into a page can steal just about whatever it wants from that page and send it somewhere you don’t want it to go. That’s bad.

  5. It’s worth noting the data feed can be consumed by JavaScript, Perl, Ruby, Python, Java, and about a bazillion other languages. Approach two and three can’t. They can only be used from JavaScript.

  6. Maybe I’m missing something but where’s the security risk? If you provide a resource via a publicly available URL, doesn’t it seem obvious that people can make HTTP requests to “steal” that data?

  7. Jonathan,

    I probably could have explained it a bit better, but there are two risks, namely

    1) Data on a corporate intranet accessible via approach 2 or 3 can be stolen by rogue scripts on the public internet. So while the data is available on a url in the intranet that does not mean that the data should be available to the entire world.
    2) The user may have previously authenticated with a system that has a JSON based api. This is often in the form of a cookie that will be subsequently sent on further requests. As such this private data that requires authentication can again be stolen by rogue scripts.

    Make sense?

  8. Approach 3 can be further sub-divided into JSONP and not. A static given callback name, as (presumably) in the example is less consumable than a full JSONP API where the consumer specifies the name of the callback in the URL query parameter “callback”.

    Doing it that way lets the consumer mix and match concurrent requests for different URLs, both yours and others, and name callbacks to have them consumed by callbacks aware of which request they were tied to, without getting them mixed up due to random timing effects.

    Letting consumers specify the callback name however they please exposes your site’s cookies to theft, though, unless you restrict the callback name so it can not pass the variable “document.cookie” to itself.

    Yahoo restricts it to case insensitive alphanumerics, underscore, period and angle brackets, which is sane.

    Your concern for JSON APIs seems to miss a subtle yet important point: the only thing case 1 guards against is people doing browser side mashups of your published data with javascript alone; whichever variant you pick, any perl, ruby, java or unix shell power user can consume your data and do whatever they want with it.

    To implement *security*, you should use forms of HTTP authentication, i.e. basic auth, cookies, or exchanging other auth tokens between client and server. If the scheme you pick lets browsers logged on to your system leak the data to third-party web pages (I presume this is your fear, though it does not come out very clearly in the post), approach 1 would require the attacker to have trojaned the browser to steal the data, which approaches 2 and 3 would give them for free without any spying software.
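
The callback-name restriction described in comment 8 can be sketched as a whitelist check applied before the response is built. The pattern below assumes letters, digits, underscore and period only, which is narrower than the Yahoo rule the comment quotes; `CALLBACK_NAME` and `wrapInCallback` are hypothetical names for this illustration.

```javascript
// Hypothetical whitelist for a consumer-supplied JSONP callback name.
const CALLBACK_NAME = /^[A-Za-z0-9_.]+$/;

function wrapInCallback(callbackName, data) {
  if (typeof callbackName !== "string" || !CALLBACK_NAME.test(callbackName)) {
    throw new Error("invalid callback name");
  }
  return callbackName + "(" + JSON.stringify(data) + ");";
}

const accepted = wrapInCallback("app.handlers.onImage", { Width: 800 });

let rejected = false;
try {
  // Without the check, this "name" would inject an extra statement into the response.
  wrapInCallback("alert(document.cookie);foo", { Width: 800 });
} catch (err) {
  rejected = true;
}
```

Rejecting the request outright, rather than trying to strip bad characters, keeps the rule easy to reason about.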

  9. I’m not really getting it either. You seem to be saying that with JSON-enabled services, if arbitrary code is being run on the website, it has access to that data.

    Wouldn’t it be reasonable to expect that if someone is able to execute script in a user’s browser, then their data is not safe regardless? No matter how you call your webservice, if I can execute script on your page, I have access to it.

  10. Quoting from http://www.json.org/js.html, “Since JSON is a subset of JavaScript, it can be used in the language with no muss or fuss.” This page has said that as long as I can remember: since 2003. I don’t know about anybody else but I consider it “muss and fuss” to have to use iframes/innerHTML/parsing/eval to make JSON data from a file local to my web server, available to Javascript in my HTML pages: even Douglas Crockford himself calls the DOM “an inconvenient API.” I can’t find the URL, it might be on json.org, but Crock has also said that while JSON API’s should be strict in what they emit, they can be less so in what they accept. A JSON decoder that accepts an optional var assignment can do away with the need to jump through the iframes/innerHTML/parsing/eval hoops; var assignment could for instance be perfect for declarative form validation: one http/script/src call rather than using the AJAX hammer on a problem that is probably not a nail. That being said, you just need to remain vigilant against exposing sensitive data, where the operative word is *remain*.

  11. Kris Gray: “No matter how you call your webservice, if I can execute script on your page, I have access to it.”

    Yep… regardless of any specific exploit, the fundamental problem is that potentially untrustworthy script has full rights to do pretty much whatever it wants on a page. There’s no way of sandboxing the execution of the script or limiting what a script can do.

  13. “The attack that was used in the post you referenced relied on the JSON being in ().”

    No it didn’t, as far as I can see.

  14. One thing that wasn’t emphasized enough in this article is that the assumptions here are that XSRF is at play:

    1. The attack is launched from a malicious page opened by the user in their web browser or some other user-agent that supports cookies
    2. The user is already “logged in” to the secure service
    3. The JSON API accepts that login cookie for authentication

    If your secure JSON API does not accept cookies for authentication, I believe that these XSRF attacks are no longer a problem. You simply have to change the API so that instead of a cookie it uses a parameter directly on the URL or in a custom HTTP header, and the valid non-malicious JavaScript accessing the data must include this parameter/header with each request. It can scan the cookies on its own page to find that parameter for convenience.

    The third-party site will not have access to this secure token (stored in a cookie or otherwise) and won’t be able to submit it with the request.

    Does that seem like a reasonable solution or am I missing something?
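
The scheme proposed in comment 14 might look like the following on the server side. This is a sketch under stated assumptions, not a complete defense: the request shape, the `x-api-token` header name, the `session` cookie name and the `isAuthorized` function are all hypothetical.

```javascript
// Hypothetical check: authorize a request only if it carries the session's
// token in a custom header. A cookie on its own is never sufficient, because
// the browser attaches cookies to cross-site script includes automatically.
function isAuthorized(request, sessions) {
  const session = sessions.get(request.cookies.session);
  if (!session) return false;
  const token = request.headers["x-api-token"];
  return typeof token === "string" && token === session.token;
}

const sessions = new Map([["s1", { token: "t-123" }]]);

// A script include from a rogue page sends the cookie but cannot add the header.
const forged = { cookies: { session: "s1" }, headers: {} };
// The site's own JavaScript reads the token and sends it explicitly.
const legit = { cookies: { session: "s1" }, headers: { "x-api-token": "t-123" } };

const forgedAllowed = isAuthorized(forged, sessions);
const legitAllowed = isAuthorized(legit, sessions);
```

Here `forgedAllowed` is `false` and `legitAllowed` is `true`.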

