Python: Creating a 2D histogram from a numpy matrix

I'm new to Python.

I have a numpy matrix, of dimensions 42x42, with values in the range 0-996. I want to create a 2D histogram using this data. I've been looking at tutorials, but they all seem to show how to create 2D histograms from random data and not a numpy matrix.

So far, I have imported:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib import colors

I'm not sure if these are the correct imports; I'm just trying to pick up what I can from the tutorials I've seen.

I have the numpy matrix M with all of the values in it (as described above). In the end, I want it to look something like this:

[image]

Obviously, my data will be different, so my plot should look different. Can anyone give me a hand?

Edit: For my purposes, Hooked's example below, using matshow, is exactly what I'm looking for.

If you have the raw data behind the counts, you could use plt.hexbin to create the plot for you (IMHO this is better than a square lattice). Adapted from the matplotlib hexbin example:

import numpy as np
import matplotlib.pyplot as plt

n = 100000
x = np.random.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n)
plt.hexbin(x,y)

plt.show()

If you already have the Z-values in a matrix as you mention, just use plt.imshow or plt.matshow:

XB = np.linspace(-1,1,20)
YB = np.linspace(-1,1,20)
X,Y = np.meshgrid(XB,YB)
Z = np.exp(-(X**2+Y**2))
plt.imshow(Z,interpolation='none')
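
For the 42x42 case from the question, a minimal sketch might look like the following. The matrix here is random stand-in data (the real M is not shown in the question), and the colormap, colorbar label, and output file name are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for an on-screen window
import matplotlib.pyplot as plt
import numpy as np

# Stand-in for the asker's matrix: 42x42 integers in the range 0-996 (random here).
rng = np.random.default_rng(0)
M = rng.integers(0, 997, size=(42, 42))

fig, ax = plt.subplots()
im = ax.matshow(M, cmap="viridis")      # one colored cell per matrix entry
fig.colorbar(im, ax=ax, label="value")  # maps colors back to the stored values
fig.savefig("matrix_heatmap.png")
```

With an interactive backend, plt.show() can replace the savefig call.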

If you have not only the 2D histogram matrix but also the underlying (x, y) data, then you could make a scatter plot of the (x, y) points and color each point according to its binned count value in the 2D-histogram matrix:

import numpy as np
import matplotlib.pyplot as plt

n = 10000
x = np.random.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n)
xedges, yedges = np.linspace(-4, 4, 42), np.linspace(-25, 25, 42)
hist, xedges, yedges = np.histogram2d(x, y, (xedges, yedges))
xidx = np.clip(np.digitize(x, xedges), 0, hist.shape[0]-1)
yidx = np.clip(np.digitize(y, yedges), 0, hist.shape[1]-1)
c = hist[xidx, yidx]
plt.scatter(x, y, c=c)

plt.show()

@unutbu's answer contains a mistake: xidx and yidx are calculated the wrong way (at least on my data sample). The correct way should be:

xidx = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1)
yidx = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1)

This is because np.digitize returns 1-based bin numbers (between 1 and len(xedges) - 1 for points inside the edges), whereas c = hist[xidx, yidx] needs 0-based indices between 0 and hist.shape[i] - 1.
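
The off-by-one can be seen on a tiny 1D example. This is only an illustrative sketch; the edges and sample points are made up so that exactly one point falls in each bin.

```python
import numpy as np

edges = np.array([0.0, 1.0, 2.0, 3.0])   # three bins: [0,1), [1,2), [2,3]
x = np.array([0.5, 1.5, 2.5])            # one point in each bin
hist, _ = np.histogram(x, bins=edges)    # counts per bin

raw = np.digitize(x, edges)                  # 1-based bin numbers
shifted = np.clip(raw, 0, len(hist) - 1)     # without "- 1": indices point one bin too far
idx = np.clip(raw - 1, 0, len(hist) - 1)     # with "- 1": 0-based indices into hist

print(raw)      # [1 2 3]
print(shifted)  # [1 2 2]
print(idx)      # [0 1 2]
```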


Below is a comparison of the results. As you can see, the two are similar but not the same.

import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)

n = 10000
x = np.random.standard_normal(n)
y = 2.0 + 3.0 * x + 4.0 * np.random.standard_normal(n)
xedges, yedges = np.linspace(-4, 4, 42), np.linspace(-25, 25, 42)
hist, xedges, yedges = np.histogram2d(x, y, (xedges, yedges))

xidx = np.clip(np.digitize(x, xedges), 0, hist.shape[0] - 1)
yidx = np.clip(np.digitize(y, yedges), 0, hist.shape[1] - 1)
c = hist[xidx, yidx]
old = ax1.scatter(x, y, c=c, cmap='jet')

xidx = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1)
yidx = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1)

c = hist[xidx, yidx]
new = ax2.scatter(x, y, c=c, cmap='jet')


plt.show()

I'm a big fan of the 'scatter histogram', but I don't think the other solutions fully do them justice. Here is a function that implements them. The major advantage of this function compared to the other solutions is that it sorts the points by the hist data (see the mode argument). This means that the result looks more like a traditional histogram (i.e., you don't get the chaotic overlap of markers in different bins).
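
The function itself is not reproduced on this page. As a rough sketch of the idea only (not the answer's actual implementation), the hypothetical helper below colors each point by the count of its bin and draws low-count points first so that dense bins end up on top. The name scatter_hist2d_sketch, its signature, and the sample data are all assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for an on-screen window
import matplotlib.pyplot as plt
import numpy as np


def scatter_hist2d_sketch(x, y, bins=10, ax=None, **scatter_kwargs):
    """Scatter (x, y), coloring each point by the count of its 2D bin.

    Hypothetical helper for illustration; not the answer's own function.
    """
    x, y = np.asarray(x), np.asarray(y)
    hist, xedges, yedges = np.histogram2d(x, y, bins=bins)
    # np.digitize is 1-based, so subtract 1; clip keeps out-of-range points valid.
    xidx = np.clip(np.digitize(x, xedges) - 1, 0, hist.shape[0] - 1)
    yidx = np.clip(np.digitize(y, yedges) - 1, 0, hist.shape[1] - 1)
    counts = hist[xidx, yidx]
    order = np.argsort(counts)  # low counts first, so dense bins are drawn last
    if ax is None:
        ax = plt.gca()
    return ax.scatter(x[order], y[order], c=counts[order], **scatter_kwargs)


# Sample data in the same style as the other answers (assumed, seeded for repeatability).
rng = np.random.default_rng(0)
x = rng.standard_normal(10000)
y = 2.0 + 3.0 * x + 4.0 * rng.standard_normal(10000)

fig, ax = plt.subplots(figsize=(5, 4))
scat = scatter_hist2d_sketch(
    x, y,
    bins=[np.linspace(-4, 4, 42), np.linspace(-25, 25, 42)],
    ax=ax, s=5, cmap="viridis",
)
fig.colorbar(scat, ax=ax)
fig.savefig("scatter_hist2d_sketch.png")
```

Sorting by count is what keeps markers from sparse bins from being scattered over the dense regions; the trade-off is the overlap discussed further down.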

MCVE for this figure (using my function):

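import numpy as np
import matplotlib.pyplot as plt
from hist_scatter import scatter_hist2d  # the answer's own helper module

# randgen and npoint were not defined in the original snippet; the values
# below are assumptions chosen to match the other answers.
randgen = np.random.RandomState(0)
npoint = 10000

fig = plt.figure(figsize=[5, 4])
ax = plt.gca()

x = randgen.randn(npoint)
y = 2 + 3 * x + 4 * randgen.randn(npoint)

# Scatter the points, colored (and sorted) by their 2D-histogram bin count.
scat = scatter_hist2d(x, y,
                      bins=[np.linspace(-4, 4, 42),
                            np.linspace(-25, 25, 42)],
                      s=5,
                      cmap=plt.get_cmap('viridis'))
ax.axhline(0, color='k', linestyle='--', zorder=3, linewidth=0.5)
ax.axvline(0, color='k', linestyle='--', zorder=3, linewidth=0.5)
plt.colorbar(scat)

plt.show()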

Room for improvement?

The primary drawback of this approach is that the points in the densest areas overlap the points in lower density areas, leading to somewhat of a misrepresentation of the areas of each bin. I spent quite a bit of time exploring two approaches for resolving this:

1) using smaller markers for higher density bins

2) applying a 'clipping' mask to each bin

The first one gives results that are way too crazy. The second one looks nice -- especially if you only clip bins that have >~20 points -- but it is extremely slow (this figure took about a minute).

So, ultimately I've decided that by carefully selecting the marker size and bin size (s and bins), you can get results that are visually pleasing and not too bad in terms of misrepresenting the data. After all, these 2D histograms are usually intended to be visual aids to the underlying data, not strictly quantitative representations of it. Therefore, I think this approach is far superior to 'traditional 2D histograms' (e.g., plt.hist2d or plt.hexbin), and I presume that if you've found this page you're also not a fan of traditional (single color) scatter plots.

If I were king of science, I'd make sure all 2D histograms did something like this for the rest of forever.

Comments
  • What are your x and y in the numpy matrix? It's basically 42 rows and 42 columns. What are your x, y values?
  • Yes, my matrix is 42 rows and 42 columns. In each index, there is an integer from 0-996 that was computed and placed there earlier in the program.
  • matshow is exactly what I'm looking for. Thank you so much!
  • To get the axes labeled correctly, you can pass extent to imshow with the min and max values of the bin edges.
  • When posting an answer, make sure it doesn't rely on information given in other answers. You correctly described the problem in another answer (which is a good thing to do) but then only provided a partial solution that cannot be used without looking at the other answer. Please edit your answer and include a complete snippet, or your post will probably be deleted.
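
The comment above about passing extent to imshow could be sketched as follows. This is only an assumption-laden illustration: the data is random, the bin count of 42 mirrors the question, and the transpose plus origin="lower" is one common way to orient the output of np.histogram2d.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line for an on-screen window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(10000)
y = 2.0 + 3.0 * x + 4.0 * rng.standard_normal(10000)
hist, xedges, yedges = np.histogram2d(x, y, bins=42)

fig, ax = plt.subplots()
# hist is indexed [x_bin, y_bin]; transpose so rows run along y, and use
# origin="lower" so small y values sit at the bottom of the image.
im = ax.imshow(
    hist.T,
    origin="lower",
    aspect="auto",
    extent=[xedges[0], xedges[-1], yedges[0], yedges[-1]],  # axes in data units
)
fig.colorbar(im, ax=ax)
fig.savefig("hist2d_extent.png")
```

Without extent, the tick labels would show bin indices (0 to 41) rather than the data values.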