Creating a namedtuple with a custom hash function

python namedtuple
namedtuple vs tuple
python namedtuple vs class
namedtuple constructor
pyspark namedtuple
unpack namedtuple
collections.counter python 3
python subclass namedtuple

Say I have a namedtuple like this:

FooTuple = namedtuple("FooTuple", "item1, item2")

And I want the following function to be used for hashing:

    return hash(self.item1) * (self.item2)

I want this because I want the order of item1 and item2 to be irrelevant (I will do the same for the comparison-operator). I thought of two ways to do this. The first would be:

FooTuple.__hash__ = foo_hash

This works, but it feels hacked. So I tried subclassing FooTuple:

class EnhancedFooTuple(FooTuple):
    def __init__(self, item1, item2):
        FooTuple.__init__(self, item1, item2)

    # custom hash function here

But then I get this:

DeprecationWarning: object.__init__() takes no parameters

So, what can I do? Or is this a bad idea altogether and I should just write my own class from scratch?

I think there is something wrong with your code (my guess is that you created an instance of the tuple with the same name, so fooTuple is now a tuple, not a tuple class), because subclassing the named tuple like that should work. Anyway, you don't need to redefine the constructor. You can just add the hash function:

In [1]: from collections import namedtuple

In [2]: Foo = namedtuple('Foo', ['item1', 'item2'], verbose=False)

In [3]: class ExtendedFoo(Foo):
   ...:     def __hash__(self):
   ...:         return hash(self.item1) * hash(self.item2)

In [4]: foo = ExtendedFoo(1, 2)

In [5]: hash(foo)
Out[5]: 2

Writing Clean Python With Namedtuples –, Python comes with a specialized “namedtuple” container type that doesn't seem you decide, you can now create new “car” objects with the Car factory function. were named that way to avoid naming collisions with user-defined tuple fields. More efficient solution: Use the special _make alternate constructor to directly construct the namedtuple from an arbitrary iterable without creating additional intermediate tuples (as star-unpacking to the main constructor requires). Runs faster, less memory churn: Point._make(t)

Starting in Python 3.6.1, this can be achieved more cleanly with the typing.NamedTuple class (as long as you are OK with type hints):

from typing import NamedTuple, Any

class FooTuple(NamedTuple):
    item1: Any
    item2: Any

    def __hash__(self):
        return hash(self.item1) * hash(self.item2)

How to provide additional initialization for a subclass of namedtuple?, EdgeBase = namedtuple("EdgeBase", "left, right"). I want to implement a custom hash-function for this, so I create the following subclass: class Edge(EdgeBase):​  1. _make() :-This function is used to return a namedtuple() from the iterable passed as argument. 2. _asdict() :-This function returns the OrdereDict() as constructed from the mapped values of namedtuple(). 3. using “**” (double star) operator :- This function is used to convert a dictionary into the namedtuple().

A namedtuple with a custom __hash__ function is useful to store immutable data models into dict and set

For example:

class Point(namedtuple('Point', ['label', 'lat', 'lng'])):
    def __eq__(self, other):
        return self.label == other.label

    def __hash__(self):
        return hash(self.label)

    def __str__(self):
        return ", ".join([str(, str(self.lng)])

Override both __eq__ and __hash__ allows grouping businesses into a set, ensuring that each business line is unique in the collection:

walgreens = Point(label='Drugstore', lat = 37.78735890, lng = -122.40822700)
mcdonalds = Point(label='Restaurant', lat = 37.78735890, lng = -122.40822700)
pizza_hut = Point(label='Restaurant', lat = 37.78735881, lng = -122.40822713)

businesses = [walgreens, mcdonalds, pizza_hut]
businesses_by_line = set(businesses)

assert len(business) == 3
assert len(businesses_by_line) == 2

Are namedtuples that popular? They always felt awkward to me. If , If you like writing in functional style, namedtuples are much more you are creating the namedtuple you can create your own namedtuple by  Different hash functions are given below: Hash Functions. The following are some of the Hash Functions − Division Method. This is the easiest method to create a hash function. The hash function can be described as − h(k) = k mod n. Here, h(k) is the hash value obtained by dividing the key value k by size of hash table n using the remainder.

collections — Container datatypes, factory function for creating tuple subclasses with named fields dict-like class for creating a single view of multiple mappings If module is defined, the __​module__ attribute of the named tuple is set to that value. a decorator and functions for automatically adding generated special methods to user-defined classes. In its most general form, a hash function projects a value from a set with many members to a value from a set with a fixed number of members. We basically convert the input into a different form by applying a transformation function. Hash functions map data of arbitrary length to data of a fixed length.

Namedtuple in Python, Python supports a type of container like dictionaries called “namedtuples()” present _make() :- This function is used to return a namedtuple() from the iterable  Creating a Cryptographic Hash Function. A Cryptographic hash function is something that mechanically takes an arbitrary amount of input, and produces an "unpredictable" output of a fixed size. The unpredictableness isn't in the operation itself. Obviously, due to its mechanical nature, every time a given input is used the same output will result.

namedtuple, Each kind of namedtuple is represented by its own class, created by using the namedtuple() factory function. The arguments are the name of the new class and a  namedtuple() Factory Function for Tuples with Named Fields¶ Named tuples assign meaning to each position in a tuple and allow for more readable, self-documenting code. They can be used wherever regular tuples are used, and they add the ability to access fields by name instead of position index.