Get Image size WITHOUT loading image into memory

I understand that you can get the image size using PIL in the following fashion:

from PIL import Image
im = Image.open(image_filename)
width, height = im.size

However, I would like to get the image width and height without having to load the image into memory. Is that possible? I am only doing statistics on image sizes and don't care about the image contents. I just want to make my processing faster.

As the comments allude, PIL does not load the image into memory when calling .open. Looking at the docs of PIL 1.1.7, the docstring for .open says:

def open(fp, mode="r"):
    "Open an image file, without loading the raster data"

There are a few file operations in the source like:

 ...
 prefix = fp.read(16)
 ...
 fp.seek(0)
 ...

but these hardly constitute reading the whole file. In fact .open simply returns a file object and the filename on success. In addition the docs say:

open(file, mode="r")

Opens and identifies the given image file.

This is a lazy operation; this function identifies the file, but the actual image data is not read from the file until you try to process the data (or call the load method).

Digging deeper, we see that .open calls _open, which is an image-format-specific overload. Each implementation of _open lives in its own file, e.g. the one for .jpeg files is in JpegImagePlugin.py. Let's look at that one in depth.

Here things seem to get a bit tricky: it contains an infinite loop that is broken out of when the JPEG start-of-scan marker is found:

    while True:

        s = s + self.fp.read(1)
        i = i16(s)

        if i in MARKER:
            name, description, handler = MARKER[i]
            # print hex(i), name, description
            if handler is not None:
                handler(self, i)
            if i == 0xFFDA: # start of scan
                rawmode = self.mode
                if self.mode == "CMYK":
                    rawmode = "CMYK;I" # assume adobe conventions
                self.tile = [("jpeg", (0,0) + self.size, 0, (rawmode, ""))]
                # self.__offset = self.fp.tell()
                break
            s = self.fp.read(1)
        elif i == 0 or i == 65535:
            # padded marker or junk; move on
            s = "\xff"
        else:
            raise SyntaxError("no marker found")

This looks like it could read the whole file if the file is malformed. If it reads the info marker OK, however, it should break out early. The handler function ultimately sets self.size, which holds the dimensions of the image.
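To make the "break out early" point concrete, here is a minimal standard-library sketch of the same idea: walk the JPEG segments, skip each payload, and stop at the first start-of-frame (SOF) segment. It is not PIL's code. The function name jpeg_size and the hand-built demo bytes are assumptions made for illustration, and the sketch assumes a seekable stream.

```python
import io
import struct

# Start-of-frame markers carry the frame dimensions. 0xC4 (DHT), 0xC8 (JPG)
# and 0xCC (DAC) share the 0xC0-0xCF range but are not frame headers.
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_size(fp):
    """Return (width, height, bytes_consumed) for a seekable JPEG stream."""
    if fp.read(2) != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    while True:
        byte = fp.read(1)
        if not byte:
            raise ValueError("reached end of data before a SOF marker")
        if byte != b"\xff":
            continue  # skip junk until the next marker prefix
        marker = fp.read(1)
        while marker == b"\xff":  # fill bytes may precede the marker code
            marker = fp.read(1)
        if not marker:
            raise ValueError("reached end of data before a SOF marker")
        code = marker[0]
        if code in (0x00, 0x01) or 0xD0 <= code <= 0xD9:
            continue  # stuffed byte or standalone marker: no length field
        # A short read here raises struct.error, which signals truncated data.
        (length,) = struct.unpack(">H", fp.read(2))
        if code in SOF_MARKERS:
            _precision, height, width = struct.unpack(">BHH", fp.read(5))
            return width, height, fp.tell()
        fp.seek(length - 2, io.SEEK_CUR)  # skip the payload without reading it


# Hypothetical demo data: SOI, one APP0 segment, one SOF0 segment, then
# 100,000 bytes standing in for the compressed scan data.
demo = (
    b"\xff\xd8"
    + b"\xff\xe0" + struct.pack(">H", 16) + b"\x00" * 14
    + b"\xff\xc0" + struct.pack(">HBHH", 17, 8, 1488, 2240) + b"\x00" * 10
    + b"\x00" * 100000
)

print(jpeg_size(io.BytesIO(demo)))
```

On this demo input the function stops after 29 bytes, so the 100,000 trailing bytes are never touched. Real files usually carry larger metadata segments before the SOF marker, but those are skipped by seeking rather than read.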

If you don't care about the image contents, PIL is probably overkill.

I suggest parsing the output of the python magic module:

>>> import re, magic
>>> t = magic.from_file('teste.png')
>>> t
'PNG image data, 782 x 602, 8-bit/color RGBA, non-interlaced'
>>> re.search(r'(\d+) x (\d+)', t).groups()
('782', '602')

This is a wrapper around libmagic, which reads as few bytes as possible in order to identify a file type signature.

Relevant version of script:

https://raw.githubusercontent.com/scardine/image_size/master/get_image_size.py

[update]

Hmmm, unfortunately, when applied to jpegs, the above gives "'JPEG image data, EXIF standard 2.21'". No image size! – Alex Flint

Seems like jpegs are magic-resistant. :-)

I can see why: in order to get the image dimensions for JPEG files, you may have to read more bytes than libmagic likes to read.

I rolled up my sleeves and came up with this very untested snippet (get it from GitHub) that requires no third-party modules.

#!/usr/bin/env python
#-------------------------------------------------------------------------------
# Name:        get_image_size
# Purpose:     extract image dimensions given a file path using just
#              core modules
#
# Author:      Paulo Scardine (based on code from Emmanuel VAÏSSE)
#
# Created:     26/09/2013
# Copyright:   (c) Paulo Scardine 2013
# Licence:     MIT
#-------------------------------------------------------------------------------
import os
import struct

class UnknownImageFormat(Exception):
    pass

def get_image_size(file_path):
    """
    Return (width, height) for a given img file content - no external
    dependencies except the os and struct modules from core
    """
    size = os.path.getsize(file_path)

    # Open in binary mode and compare against bytes literals: in text mode,
    # Python 3 tries to decode the header and raises UnicodeDecodeError.
    with open(file_path, "rb") as input:
        height = -1
        width = -1
        data = input.read(25)

        if (size >= 10) and data[:6] in (b'GIF87a', b'GIF89a'):
            # GIFs
            w, h = struct.unpack("<HH", data[6:10])
            width = int(w)
            height = int(h)
        elif ((size >= 24) and data.startswith(b'\211PNG\r\n\032\n')
              and (data[12:16] == b'IHDR')):
            # PNGs
            w, h = struct.unpack(">LL", data[16:24])
            width = int(w)
            height = int(h)
        elif (size >= 16) and data.startswith(b'\211PNG\r\n\032\n'):
            # older PNGs?
            w, h = struct.unpack(">LL", data[8:16])
            width = int(w)
            height = int(h)
        elif (size >= 2) and data.startswith(b'\377\330'):
            # JPEG
            msg = " raised while trying to decode as JPEG."
            input.seek(0)
            input.read(2)
            b = input.read(1)
            try:
                while (b and ord(b) != 0xDA):
                    while (ord(b) != 0xFF): b = input.read(1)
                    while (ord(b) == 0xFF): b = input.read(1)
                    if (ord(b) >= 0xC0 and ord(b) <= 0xC3):
                        input.read(3)
                        h, w = struct.unpack(">HH", input.read(4))
                        break
                    else:
                        input.read(int(struct.unpack(">H", input.read(2))[0])-2)
                    b = input.read(1)
                width = int(w)
                height = int(h)
            except struct.error:
                raise UnknownImageFormat("StructError" + msg)
            except ValueError:
                raise UnknownImageFormat("ValueError" + msg)
            except Exception as e:
                raise UnknownImageFormat(e.__class__.__name__ + msg)
        else:
            raise UnknownImageFormat(
                "Sorry, don't know how to get information from this file."
            )

    return width, height
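The GIF and PNG header layouts that the function relies on can be checked against hand-built bytes, without any image file on disk. The sketch below works on an in-memory buffer. The name size_from_header and the two test headers are assumptions made for illustration and are not part of the script above.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def size_from_header(data):
    """Return (width, height) from the first bytes of a GIF or PNG, else None."""
    if len(data) >= 10 and data[:6] in (b"GIF87a", b"GIF89a"):
        # GIF logical screen size: two little-endian 16-bit integers.
        return struct.unpack("<HH", data[6:10])
    if (len(data) >= 24 and data.startswith(PNG_SIGNATURE)
            and data[12:16] == b"IHDR"):
        # PNG IHDR chunk: two big-endian 32-bit integers.
        return struct.unpack(">II", data[16:24])
    return None


# Hand-built headers (hypothetical test data, not complete image files).
png_header = (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
              + struct.pack(">II", 782, 602))
gif_header = b"GIF89a" + struct.pack("<HH", 320, 200)

print(size_from_header(png_header))
print(size_from_header(gif_header))
```

Because everything is compared as bytes, the same checks behave identically on Python 2.7 and Python 3.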

[update 2019]

Check out a Rust implementation: https://github.com/scardine/imsz

There is a package on PyPI called imagesize that currently works for me, although it doesn't look like it is very active.

Install:

pip install imagesize

Usage:

import imagesize

width, height = imagesize.get("test.png")
print(width, height)

Homepage: https://github.com/shibukawa/imagesize_py

PyPI: https://pypi.org/project/imagesize/

I often need the sizes of images hosted on the Internet. Downloading each image in full and then loading it just to parse the header is too time consuming. My method is to feed chunks to an image parser and test after every chunk whether it can already identify the image, stopping the loop as soon as I have the information I want.

I extracted the core of my code and modified it to parse local files.

from PIL import ImageFile

with open(r"D:\testpic\test.jpg", "rb") as f:
    ImPar = ImageFile.Parser()
    chunk = f.read(2048)
    count = 2048
    # In binary mode read() returns bytes, so test for emptiness directly;
    # comparing against "" never matches on Python 3.
    while chunk:
        ImPar.feed(chunk)
        if ImPar.image:
            break
        chunk = f.read(2048)
        count += 2048
    print(ImPar.image.size)
    print(count)

Output:

(2240, 1488)
38912

The actual file size is 1,543,580 bytes and you only read 38,912 bytes to get the image size. Hope this will help.
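The same feed-until-identified loop can be sketched without PIL for formats whose header is trivial to parse. The version below takes any object with a read() method, so an io.BytesIO stands in for the network stream here. The names probe_size and try_gif_size, the max_bytes cap, and the fake GIF data are assumptions made for illustration.

```python
import io
import struct


def try_gif_size(buf):
    """Return (width, height) once buf holds a GIF header, else None."""
    if len(buf) >= 10 and bytes(buf[:6]) in (b"GIF87a", b"GIF89a"):
        return struct.unpack("<HH", buf[6:10])
    return None


def probe_size(stream, parse, chunk_size=2048, max_bytes=1 << 20):
    """Read chunk by chunk until parse() succeeds.

    Returns (size, bytes_read); size is None if the stream ended or the
    max_bytes cap was reached before the header could be parsed.
    """
    buf = bytearray()
    while len(buf) < max_bytes:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of stream
            break
        buf.extend(chunk)
        size = parse(buf)
        if size is not None:
            return size, len(buf)
    return None, len(buf)


# Hypothetical stand-in for a download: a GIF header plus 1 MB of filler.
fake_gif = b"GIF89a" + struct.pack("<HH", 320, 200) + b"\x00" * 1000000

print(probe_size(io.BytesIO(fake_gif), try_gif_size))
```

Passing the parser as an argument keeps the loop independent of the image format, and the cap bounds how much is read when the data never parses.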

Another short way of doing it on Unix systems. It depends on the output of the file command, which I am not sure is standardized across systems, so it should probably not be used in production code. Moreover, for many JPEGs file does not report the image size.

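import subprocess, re
# filename is the path of the image to inspect. The raw-string pattern accepts
# both "782 x 602" and "782x602", since file prints either form.
image_size = list(map(int, re.findall(r'(\d+)\s*x\s*(\d+)', subprocess.getoutput("file " + filename))[-1]))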
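Since the description text may contain no dimensions at all, a slightly more defensive version of the parsing step is sketched below. It works on the description string only, so it applies equally to the output of file and of the magic module. The function name size_from_description is an assumption made for illustration.

```python
import re

# Accept both "782 x 602" and "782x602".
_DIMENSIONS = re.compile(r"(\d+)\s*x\s*(\d+)")


def size_from_description(description):
    """Return (width, height) parsed from a file(1)-style description, or None."""
    matches = _DIMENSIONS.findall(description)
    if not matches:
        return None  # e.g. a JPEG description that reports no size
    width, height = matches[-1]  # take the last match, like the one-liner
    return int(width), int(height)


print(size_from_description(
    "PNG image data, 782 x 602, 8-bit/color RGBA, non-interlaced"))
print(size_from_description("JPEG image data, EXIF standard 2.21"))
```

Returning None instead of raising lets the caller fall back to another method when no size is reported.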

A check of memory usage before open, before size and after size confirms that the 22MB image hasn't been loaded into memory (ipython started out using 32538 17252). That figure then jumps up to ~57k following im.load().

Comments
  • I'm not 100% sure but I don't believe that .open() reads the entire file into memory (that's what .load() does), so as far as I know this is as good as it gets using PIL
  • Even if you think you have a function that only reads the image header information, filesystem readahead code may still load the whole image. Worrying about performance is unproductive unless your application requires it.
  • I became convinced of your answers. Thanks @JonClements and stark
  • A quick memory test using pmap to monitor the memory used by a process shows me that indeed PIL does not load the entire image in memory.
  • See also: Get image dimensions with Python
  • True enough, but does open get the size of the image or is that a lazy operation too? And if it's lazy, does it read the image data at the same time?
  • The doc link points to Pillow a fork from PIL. I cannot however find an official doc link on the web. If someone posts it as a comment I'll update the answer. The quote can be found in the file Docs/PIL.Image.html.
  • @MarkRansom I've attempted to answer your question, however to be 100% sure it looks like we have to dive into each image-specific implementation. The .jpeg format looks OK as long as the header is found.
  • @Hooked: Thanks very much for looking into this. I accept that you are correct although I quite like Paulo's rather minimal solution below (even though to be fair the OP did not mention wanting to avoid the PIL dependency)
  • @AlexFlint No problem, it's always fun to poke around the code. I'd say that Paulo earned his bounty though, that's a nice snippet he wrote for you there.
  • I also added the capability to retrieve number of channels (not to be confused w/ bit depth) in the comment after the version @EJEHardenberg provides above.
  • Great thing. I added support for bitmaps in the GitHub project. Thanks!
  • NOTE: the current version does not work for me. @PauloScardine has an updated working version on github.com/scardine/image_size
  • Get UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte on MacOS, python3 on data = input.read(25) , file on image gives PNG image data, 720 x 857, 8-bit/color RGB, non-interlaced