detect duplicate keys in a JSON file

March 6, 2016

I want to edit a JSON file by hand, but I’m afraid that I might accidentally introduce a duplicate key somewhere. If that happens, the second key silently overwrites the first one. Example:

$ cat input.json
{
    "content": {
        "a": 1,
        "a": 2
    }
}

Naive approach:

import json

with open("input.json") as f:
    d = json.load(f)

print(d)
# {'content': {'a': 2}}

If there is a duplicate key, it should fail! But it remains silent and you have no idea that you just lost some data.
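To convince yourself that the parser does read both entries and only drops one when it builds the dict, you can ask it for the raw key/value pairs instead. This is just a small sketch using the standard `object_pairs_hook` parameter of `json.loads`; the `text` variable is made up for the illustration and stands in for the file above.

```python
import json

# Hypothetical in-memory copy of input.json from the example above.
text = '{"content": {"a": 1, "a": 2}}'

# Default behaviour: for a repeated key, the last value wins.
plain = json.loads(text)

# With object_pairs_hook=list, every JSON object comes back as a list
# of (key, value) tuples, so both "a" entries are still visible.
pairs = json.loads(text, object_pairs_hook=list)

print(plain)
print(pairs)
```

The first line printed is the lossy dict, the second one still contains both `('a', 1)` and `('a', 2)`.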

I found the solution here.

import json

def dict_raise_on_duplicates(ordered_pairs):
    """Reject duplicate keys."""
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            raise ValueError("duplicate key: %r" % (k,))
        d[k] = v
    return d

def main():
    with open("input.json") as f:
        d = json.load(f, object_pairs_hook=dict_raise_on_duplicates)

    print(d)

if __name__ == "__main__":
    main()

Now you get a nice error message:

Traceback (most recent call last):
  File "./", line 28, in <module>
  File "./", line 21, in main
    d = json.load(f, object_pairs_hook=dict_raise_on_duplicates)
  File "/usr/lib64/python3.5/json/", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib64/python3.5/json/", line 332, in loads
    return cls(**kw).decode(s)
  File "/usr/lib64/python3.5/json/", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.5/json/", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "./", line 13, in dict_raise_on_duplicates
    raise ValueError("duplicate key: %r" % (k,))
ValueError: duplicate key: 'a'

If your JSON file has no duplicates, then the code above simply prints its content.
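If you would rather get a short message than a full traceback, the same hook can be wrapped in a small helper. This is only a sketch: `check_json_text` is a hypothetical name, and it works on a string via `json.loads` so that it can be tried without any file on disk.

```python
import json


def dict_raise_on_duplicates(ordered_pairs):
    """Reject duplicate keys (same idea as the hook above)."""
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            raise ValueError("duplicate key: %r" % (k,))
        d[k] = v
    return d


def check_json_text(text):
    """Hypothetical helper: return None if text is OK, else an error message."""
    try:
        json.loads(text, object_pairs_hook=dict_raise_on_duplicates)
    except ValueError as e:
        # json.JSONDecodeError is a subclass of ValueError, so plain
        # syntax errors end up here as well.
        return str(e)
    return None


print(check_json_text('{"content": {"a": 1, "a": 2}}'))  # duplicate key: 'a'
print(check_json_text('{"content": {"a": 1, "b": 2}}'))  # None
```

Since the hook is called for every JSON object, duplicates inside nested objects are caught too, as in the `content` example.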

