
detect duplicate keys in a JSON file

I want to edit a JSON file by hand, but I'm afraid that I'll accidentally introduce a duplicate key somewhere. If that happens, the second value silently overwrites the first one. Example:

$ cat input.json
{
    "content": {
        "a": 1,
        "a": 2
    }
}

Naive approach:

import json

with open("input.json") as f:
    d = json.load(f)

print(d)
# {'content': {'a': 2}}

If there is a duplicate key, it should fail! But it remains silent and you have no idea that you just lost some data.

I found the solution here.

import json

def dict_raise_on_duplicates(ordered_pairs):
    """Reject duplicate keys."""
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            raise ValueError("duplicate key: %r" % (k,))
        # No duplicate so far: store the pair.
        d[k] = v
    return d

def main():
    with open("input.json") as f:
        d = json.load(f, object_pairs_hook=dict_raise_on_duplicates)

    print(d)

if __name__ == "__main__":
    main()

Now you get a nice error message:

Traceback (most recent call last):
  File "./check_duplicates.py", line 28, in <module>
  File "./check_duplicates.py", line 21, in main
    d = json.load(f, object_pairs_hook=dict_raise_on_duplicates)
  File "/usr/lib64/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib64/python3.5/json/__init__.py", line 332, in loads
    return cls(**kw).decode(s)
  File "/usr/lib64/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.5/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "./check_duplicates.py", line 13, in dict_raise_on_duplicates
    raise ValueError("duplicate key: %r" % (k,))
ValueError: duplicate key: 'a'

If your JSON file has no duplicates, then the code above simply prints its content.
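
To try the hook without creating a file, the same check can be run on strings with json.loads. This is only a minimal sketch: the check() helper is a name made up for this example, and it returns the error message instead of letting the traceback through. Note that the hook is called once for every JSON object, nested ones included.

```python
import json

def dict_raise_on_duplicates(ordered_pairs):
    """Reject duplicate keys."""
    d = {}
    for k, v in ordered_pairs:
        if k in d:
            raise ValueError("duplicate key: %r" % (k,))
        d[k] = v
    return d

def check(text):
    """Hypothetical helper: return the parsed data, or the error message."""
    try:
        return json.loads(text, object_pairs_hook=dict_raise_on_duplicates)
    except ValueError as e:
        # Also catches json.JSONDecodeError, which subclasses ValueError.
        return str(e)

print(check('{"content": {"a": 1, "b": 2}}'))  # {'content': {'a': 1, 'b': 2}}
print(check('{"content": {"a": 1, "a": 2}}'))  # duplicate key: 'a'
```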

Categories: python
  1. October 24, 2016 at 22:40

    The problem is that it will also apply to dictionaries that are values.
    {"a": {"b": "b", "b": "c"}} will fail. Is there a way to make it only look at the keys?
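
One possible way to approach the question above, as a sketch rather than a definitive answer: the hook itself has no idea how deeply nested the current object is, so this version keeps every object as a list of pairs while parsing, checks only the outermost object for duplicates, and then converts everything to plain dicts. The names Pairs, to_plain and load_top_level_strict are made up for this example. In nested objects the last value wins, as it does with plain json.load.

```python
import json

class Pairs(list):
    """Marker type: a JSON object kept as a list of (key, value) pairs."""

def to_plain(value):
    """Recursively turn Pairs into dicts (last duplicate wins)."""
    if isinstance(value, Pairs):  # must be tested before plain list
        return {k: to_plain(v) for k, v in value}
    if isinstance(value, list):
        return [to_plain(v) for v in value]
    return value

def load_top_level_strict(text):
    """Reject duplicate keys in the outermost object only."""
    parsed = json.loads(text, object_pairs_hook=Pairs)
    if isinstance(parsed, Pairs):
        seen = set()
        for k, _ in parsed:
            if k in seen:
                raise ValueError("duplicate key: %r" % (k,))
            seen.add(k)
    return to_plain(parsed)

print(load_top_level_strict('{"a": {"b": "b", "b": "c"}}'))  # {'a': {'b': 'c'}}
```

With '{"a": 1, "a": 2}' as input, the same function raises ValueError.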
