How to limit the number of float digits JSONEncoder produces?

I am trying to use Python's json library to save to a file a dictionary whose elements are other dictionaries. It contains many floating-point numbers, and I would like to limit the number of digits to, for example, 7.

According to other posts on SO, json.encoder.FLOAT_REPR should be used for this. However, it is not working.

For example, the code below, run in Python 3.7.1, prints all the digits:

import json
json.encoder.FLOAT_REPR = lambda o: format(o, '.7f' )
d = dict()
d['val'] = 5.78686876876089075543
d['name'] = 'kjbkjbkj'
f = open('test.json', 'w')
json.dump(d, f, indent=4)
f.close()

How can I solve that?

It might be irrelevant but I am on macOS.

EDIT

This question was marked as a duplicate. However, the accepted answer to the original post (so far the only one) clearly states:

Note: This solution doesn't work on python 3.6+

So that solution is not the proper one. Besides, it uses the simplejson library, not the json library.

Option 1: Use regular expression matching to round.

You can dump your object to a string using json.dumps and then use the technique shown on this post to find and round your floating point numbers.

To test it out, I added some more complicated nested structures on top of the example you provided:

import json
import re

d = dict()
d['val'] = 5.78686876876089075543
d['name'] = 'kjbkjbkj'
d["mylist"] = [1.23456789, 12, 1.23, {"foo": "a", "bar": 9.87654321}]
d["mydict"] = {"bar": "b", "foo": 1.92837465}

# dump the object to a string
d_string = json.dumps(d, indent=4)

# find numbers with 8 or more digits after the decimal point
pat = re.compile(r"\d+\.\d{8,}")
def mround(match):
    return "{:.7f}".format(float(match.group()))

# write the modified string to a file
with open('test.json', 'w') as f:
    f.write(re.sub(pat, mround, d_string))

The output test.json looks like:

{
    "val": 5.7868688,
    "name": "kjbkjbkj",
    "mylist": [
        1.2345679,
        12,
        1.23,
        {
            "foo": "a",
            "bar": 9.8765432
        }
    ],
    "mydict": {
        "bar": "b",
        "foo": 1.9283747
    }
}

One limitation of this method is that it will also match numbers that are within double quotes (floats represented as strings). You could come up with a more restrictive regex to handle this, depending on your needs.
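One way to make the pattern more restrictive is sketched below. It is only an illustration, not part of the original answer: the regex also matches complete JSON string literals and hands them back unchanged, so digits inside quotes never reach the rounding step. The names TOKEN and round_outside_strings are made up for this example.

```python
import json
import re

# Alternation: a complete JSON string literal, or a number with 8 or more
# digits after the decimal point. The string branch comes first and consumes
# the whole literal, so digits inside quotes never reach the number branch.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|-?\d+\.\d{8,}')

def round_outside_strings(json_text, ndigits=7):
    """Round long decimals in json_text, leaving string literals untouched."""
    def repl(match):
        token = match.group()
        if token.startswith('"'):
            return token  # string literal: return it unchanged
        return "{:.{}f}".format(float(token), ndigits)
    return TOKEN.sub(repl, json_text)

data = {"val": 5.78686876876089075543, "text": "pi is 3.14159265358979"}
result = round_outside_strings(json.dumps(data))
print(result)
```

Here the number stored under "val" is rounded, while the digits inside the "text" string are left as they are.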

Option 2: subclass json.JSONEncoder

Here is something that will work on your example and should handle the nested dictionaries and lists you are likely to encounter:

import json

class MyCustomEncoder(json.JSONEncoder):
    # json.dumps() reaches iterencode() through encode(), which passes
    # _one_shot=True, so the parameter has to be accepted here.
    def iterencode(self, obj, _one_shot=False):
        if isinstance(obj, float):
            # fixed-point notation, 7 digits after the decimal point
            yield format(obj, '.7f')
        elif isinstance(obj, dict):
            last_index = len(obj) - 1
            yield '{'
            for i, (key, value) in enumerate(obj.items()):
                # json.dumps() quotes and escapes the key (string keys assumed)
                yield json.dumps(key) + ': '
                yield from MyCustomEncoder.iterencode(self, value)
                if i != last_index:
                    yield ", "
            yield '}'
        elif isinstance(obj, list):
            last_index = len(obj) - 1
            yield "["
            for i, o in enumerate(obj):
                yield from MyCustomEncoder.iterencode(self, o)
                if i != last_index:
                    yield ", "
            yield "]"
        else:
            # everything else (str, int, bool, None, ...) uses the default encoder
            yield from json.JSONEncoder.iterencode(self, obj)

Now write the file using the custom encoder.

with open('test.json', 'w') as f:
    json.dump(d, f, cls = MyCustomEncoder)

The output file test.json:

{"val": 5.7868688, "name": "kjbkjbkj", "mylist": [1.2345679, 12, 1.2300000, {"foo": "a", "bar": 9.8765432}], "mydict": {"bar": "b", "foo": 1.9283747}}

In order to get other keyword arguments like indent to work, the easiest way would be to read in the file that was just written and write it back out using the default encoder:

# write d using custom encoder
with open('test.json', 'w') as f:
    json.dump(d, f, cls = MyCustomEncoder)

# load output into new_d
with open('test.json', 'r') as f:
    new_d = json.load(f)

# write new_d out using default encoder
with open('test.json', 'w') as f:
    json.dump(new_d, f, indent=4)

Now the output file is the same as shown in option 1.
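If the write, reload, rewrite round trip above feels heavy, a different technique (not the encoder subclass shown in this option) is to round the floats in the data before handing it to the default encoder, so that indent and the other keyword arguments keep working. A minimal sketch, assuming the data consists only of dicts, lists, tuples and scalars; round_floats is a hypothetical helper name:

```python
import json

def round_floats(obj, ndigits=7):
    """Return a copy of obj with every float rounded to ndigits places."""
    if isinstance(obj, float):
        return round(obj, ndigits)
    if isinstance(obj, dict):
        return {key: round_floats(value, ndigits) for key, value in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [round_floats(item, ndigits) for item in obj]
    return obj  # str, int, bool, None, ... are returned as they are

d = {
    "val": 5.78686876876089075543,
    "name": "kjbkjbkj",
    "mylist": [1.23456789, 12, 1.23, {"foo": "a", "bar": 9.87654321}],
}

# The default encoder does the writing, so indent works as usual.
text = json.dumps(round_floats(d), indent=4)
print(text)
```

The original dictionary is not modified, and values such as 1.23 are written without trailing zeros because the default encoder still uses the float's normal repr.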

Here's something that you may be able to use that's based on my answer to the question:

    Write two-dimensional list to JSON file.

I say may because it requires "wrapping" all the float values in the Python dictionary (or list) before JSON encoding it with dump().

(Tested with Python 3.7.2.)

from _ctypes import PyObj_FromPtr
import json
import re


class FloatWrapper(object):
    """ Float value wrapper. """
    def __init__(self, value):
        self.value = value


class MyEncoder(json.JSONEncoder):
    FORMAT_SPEC = '@@{}@@'
    regex = re.compile(FORMAT_SPEC.format(r'(\d+)'))  # regex: r'@@(\d+)@@'

    def default(self, obj):
        return (self.FORMAT_SPEC.format(id(obj)) if isinstance(obj, FloatWrapper)
                else super(MyEncoder, self).default(obj))

    def iterencode(self, obj, **kwargs):
        for encoded in super(MyEncoder, self).iterencode(obj, **kwargs):
            # Check for marked-up float values (FloatWrapper instances).
            match = self.regex.search(encoded)
            if match:  # Get FloatWrapper instance.
                id = int(match.group(1))
                float_wrapper = PyObj_FromPtr(id)
                json_obj_repr = '%.7f' % float_wrapper.value  # Create alt repr.
                encoded = encoded.replace(
                            '"{}"'.format(self.FORMAT_SPEC.format(id)), json_obj_repr)
            yield encoded


d = dict()
d['val'] = FloatWrapper(5.78686876876089075543)  # Must wrap float values.
d['name'] = 'kjbkjbkj'

with open('float_test.json', 'w') as file:
    json.dump(d, file, cls=MyEncoder, indent=4)

Contents of file created:

{
    "val": 5.7868688,
    "name": "kjbkjbkj"
}

Update:

As I mentioned, the above requires all the float values to be wrapped before calling json.dump(). Fortunately doing that could be automated by adding and using the following (minimally tested) utility:

def wrap_type(obj, kind, wrapper):
    """ Recursively wrap instances of type kind in dictionary and list
        objects.
    """
    if isinstance(obj, dict):
        new_dict = {}
        for key, value in obj.items():
            if not isinstance(value, (dict, list)):
                new_dict[key] = wrapper(value) if isinstance(value, kind) else value
            else:
                new_dict[key] = wrap_type(value, kind, wrapper)
        return new_dict

    elif isinstance(obj, list):
        new_list = []
        for value in obj:
            if not isinstance(value, (dict, list)):
                new_list.append(wrapper(value) if isinstance(value, kind) else value)
            else:
                new_list.append(wrap_type(value, kind, wrapper))
        return new_list

    else:
        return obj


d = dict()
d['val'] = 5.78686876876089075543
d['name'] = 'kjbkjbkj'

with open('float_test.json', 'w') as file:
    json.dump(wrap_type(d, float, FloatWrapper), file, cls=MyEncoder, indent=4)

This doesn't answer the question, but for the decoding side you could do something like this, or override the hook method.

Solving this problem that way, though, would require encoding, decoding, then encoding again, which is overly convoluted and no longer the best choice. I assumed the encoder had all the bells and whistles the decoder does; my mistake.

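import json

d = {'val': 5.78686876876089075543, 'name': 'kjbkjbkj'}

class Round7FloatEncoder(json.JSONEncoder):
    def iterencode(self, obj, _one_shot=False):
        if isinstance(obj, float):
            # only a top-level float ever reaches this branch
            yield format(obj, '.7f')
        else:
            # without this branch nothing is yielded and the file is empty;
            # floats nested in a dict or list still keep all their digits
            yield from super().iterencode(obj, _one_shot)


with open('test.json', 'w') as f:
    json.dump(d, f, cls=Round7FloatEncoder)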
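For reference, the encode, decode, re-encode round trip mentioned above could look roughly like the sketch below. It relies on the parse_float argument of json.loads, which is called with the text of every JSON float being decoded; the variable names are only illustrative.

```python
import json

d = {
    "val": 5.78686876876089075543,
    "name": "kjbkjbkj",
    "mylist": [1.23456789, 12, 1.23],
}

# 1. Encode with the default encoder (all digits are kept).
raw = json.dumps(d)

# 2. Decode again, rounding each float as it is parsed. Integers such as 12
#    do not go through parse_float and stay integers.
rounded = json.loads(raw, parse_float=lambda s: round(float(s), 7))

# 3. Encode the rounded data; indent works because the default encoder is used.
text = json.dumps(rounded, indent=4)
print(text)
```

This works, but it serializes the data twice, which is why it is described above as convoluted.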

Comments
  • @Tomas Farias In the answer of the question you posted it is clearly stated: Note: This solution doesn't work on python 3.6+ so I don't think it is a duplicate, unless of course you are sure it works: if so please tell me how.
  • I agree it wasn't a duplicate of that other question. FWIW, I've spent a fair amount of time looking into doing similar things with the json.JSONEncoder class in the past, and my conclusion from looking at its source code is that doing this kind of thing is not really feasible without changing the data structure before passing it in. That said, since the source code is available, you could create a custom version of the library. Also note that simplejson is very similar to Python's own json module, to the point that you can almost use them interchangeably.
  • Would you take a look at my solution? If I understand the question correctly, I believe what I posted is the simplest and best.
  • @SwimBikeRun the simplest might be but you should never say your answer is the best: it's kind of arrogant and here there are people who might have more experience than you.
  • @FrancescoBoi Yep, I looked like a fool on this one :D. I was nearly sure Encode had all the bells and whistles Decode has, as I just did this exact thing but in the other direction. That's what I get for pasting untested code. Oops!
  • First of all thanks and sorry for being late in replying. Seems good but now indent=4 has no effect.
  • @Francesco Not the most elegant solution, but the easiest thing would be to read in the file you wrote and write it back out using the default encoder. The other (more complicated) option would be to update the custom encoder to handle the kwargs like indent.
  • which method(s) should be overloaded for handling indent?
  • @FrancescoBoi you could do that inside of iterencode similar to how it's done in json.JSONEncoder. However, I just had a thought. You could also dump the object to a string and then use regex to round - I'll post an update if I can get that working easily.
  • @pault Is there any difference between this complicated answer and simply class MyCustomEncoder(json.JSONEncoder): def iterencode(self, obj): if isinstance(obj, float): return format(obj, '.7f')? What kind of input are you expecting in addition to this? I don't see that in the main question.
  • Why FloatWrapper? I don't need to apply it to each of my dictionary's float values, do I?
  • I am not seeing any change in the output