Erlang snippet: Image type detection using binary matching

| | Comments (2)

I've been devoting more time to I Play WoW lately and the next feature in the skunkworks is much better image hosting. For a while I was using Facebook's albums to manage all of that stuff but in the end, I think its going to be better if I do it myself. The backend service is going to accept image uploads, track and manage them internally and use amazon s3 for storage.

Tonight I made a breakthrough in figuring out how to make sure that a given file is an image and that its dimensions fit within the expected scope. Basically, I want to restrict image uploads to .jpg, .gif and .png files that are less than 2048x2048 (that might change) and that are under 4 megs in size. I've been looking for a while for Erlang snippets or code that tie into imagemagik or gd to do this sort of detection, but in the end I didn't find anything that fit my needs so I decided to write a small library.

To use the module, call the image_type/1 function with a string representing the path of the image file. You can also call the image_type/1 function with the binary data as is.

I've run it through a sample of 3k+ images without an issues and its picked up the type and dimensions each time. Depending on how much more image related functionality is added, I might release it on GitHub. This code is made available under the MIT license.

So this is what it looks like:

-module(ipwfiles_image).
-compile(export_all).

image_type(Filename) when is_list(Filename) ->
    case file:read_file(Filename) of
        {ok, FileHandle} -> image_type(FileHandle);
        _ -> error
    end;

%% Gif header, width and then height
image_type(<<71,73,70,56,55,97,Width:16/little,Height:16/little,_/binary>>) -> {gif, Width, Height};
image_type(<<71,73,70,56,57,97,Width:16/little,Height:16/little,_/binary>>) -> {gif, Width, Height};

%% PNG header, chunk1, width and height
%% spec: http://www.w3.org/TR/PNG/#
image_type(<<137,80,78,71,13,10,26,10,_:4/signed-integer-unit:8,73,72,68,82,Width:4/signed-integer-unit:8,Height:4/signed-integer-unit:8,_/binary>>) -> {png, Width, Height};

%% JPEG
%% ref: http://www.obrador.com/essentialjpeg/headerinfo.htm
%% thanks http://www.ravenna.com/gifs/isize.pl
image_type(Data = <<255,216,255,224,_/binary>>) ->
    {H, W} = parse_jpeg(Data),
    {jpeg, H, W};

image_type(_) -> unknown.

%% SOF0: 255 192, SOF3: 255 195
%% SOI : 255 216, SOS : 255 218, SOF3: 255 195, EOI : 255 217, COM: 255 254

parse_jpeg(Data) -> parse_jpeg(Data, {}).
parse_jpeg(<<>>, Results) -> Results;
parse_jpeg(<<255,192,_:16/signed-integer,_:8,Width:16/signed-integer,Height:16/signed-integer,Rest/binary>>, _) ->
    %% io:format("Marker SOF0 hit ~p ~p(~p)~n", [Height, Width, Pos]),
    parse_jpeg(Rest, {Height, Width});
%% parse_jpeg(<<255,195,Rest/binary>>, Pos) -> parse_jpeg(Rest, Pos);
%% parse_jpeg(<<255,216,Rest/binary>>, Pos) -> parse_jpeg(Rest, Pos);
%% parse_jpeg(<<255,218,Rest/binary>>, Pos) -> parse_jpeg(Rest, Pos);
%% parse_jpeg(<<255,217,Rest/binary>>, Pos) -> parse_jpeg(Rest, Pos);
%% parse_jpeg(<<255,254,Rest/binary>>, Pos) -> parse_jpeg(Rest, Pos);
parse_jpeg(<<_,Rest/binary>>, Pos) -> parse_jpeg(Rest, Pos).

2 Comments

Tobias Löfgren said:


Nice code snippet! I think I'll need exactly this for an image upload feature later in the project I'm working on now.

Might be a small bug in it though: seems like you return {Type, Width, Height} for gif and png but {Type, Height, Width} for jpeg?

Nick Gerakines Author Profile Page said:

Thanks for catching that!

Leave a comment

About this Entry

This page contains a single entry by Nick Gerakines published on July 8, 2008 12:59 AM.

A funeral is not the place to tweet. was the previous entry in this blog.

FriendFeed, BackPack and Identi.ca is the next entry in this blog.

Find recent content on the main index or look in the archives to find all content.