Automatically add type signatures to top-level functions


I was lazy and wrote a Haskell module (using the excellent EclipseFP IDE) without giving type signatures to my top-level functions.

EclipseFP uses HLint to automatically label every offending function, and I can fix each one with 4 mouse clicks. Effective, but tedious.

Is there a utility program that will scan a .hs file, and emit a modified version that adds type signatures to each top-level function?

Example:

./addTypeSignatures Foo.hs 

would read a file Foo.hs:

foo x = x + 1

and emit

foo :: Num a => a -> a
foo x = x + 1

Bonus points if the tool automatically edits Foo.hs in place and saves a backup Foo.bak.hs

There's haskell-mode for Emacs, which has a shortcut to insert the type signature of a function: C-u C-c C-t. It is not automatic; you have to do it for each function. But if you have only one module, it will probably take you just a few minutes to go through it.


Here's a variation of the above script, that uses ":browse" instead of ":type", per ehird's comment.

One major problem with this solution is that ":browse" displays fully qualified type names, whereas ":type" uses the imported (abbreviated) type names. Thus, if your module uses unqualified imported types (a common case), the output of this script will fail compilation.

That shortcoming is fixable (using some parsing of imports), but this rabbit hole is getting deep.

#!/usr/bin/env perl

use warnings;
use strict;

sub trim {
   my $string = shift;
   $string =~ s/^\s+|\s+$//g;
   return $string;
}


my $sig=0;
my $file;

my %funcs_seen = ();
my %keywords = ();
# qw() must be wrapped in parentheses here; the bare form is a syntax error in Perl 5.18+
for my $kw (qw(type newtype data class)) { $keywords{$kw} = 1;}

foreach $file (@ARGV) 
{
  if ($file =~ /\.lhs$/) 
  {
    print STDERR "$file: .lhs is not supported. Skipping.\n";
    next;
  }

  if ($file !~ /\.hs$/) 
  {
    print STDERR "$file is not a .hs file. Skipping.\n";
    next;
  }

  my $module = $file;
  $module =~ s/\.hs$//;

  my $browseInfo = `echo :browse | ghci $file`;
  if ($browseInfo =~ /Failed, modules loaded:/)
  {
   print STDERR "$browseInfo\n";
   print STDERR "$file is not valid Haskell source file. Skipping.\n";
   next;
  }

  my @browseLines = split("\n", $browseInfo);
  my $browseLine;
  my $func = undef;
  my %dict = ();
  for $browseLine  (@browseLines) { 
   chomp $browseLine;
   if ($browseLine =~ /::/) {
    my ($data, $type) = split ("::", $browseLine);
    $func = trim($data);
    $dict{$func} = $type;
    print STDERR "$func :: $type\n";
   } elsif ($func && $browseLine =~ /^  /) { # indent on continuation
    $dict{$func} .= " " . trim($browseLine);
    print STDERR "$func ... $browseLine\n";
   } else {
    $func = undef;
   }
  }



  my $backup = "$file.bak";
  my $new = "$module.New.hs";
  -e $backup and die "Backup $backup file exists. Refusing to overwrite. Quitting";
  open(OLD, '<', $file) or die "Could not open $file: $!";
  open(NEW, '>', $new) or die "Could not write $new: $!";

  print STDERR "Functions in $file:\n";
  my $block_comment = 0;
  while (<OLD>) 
  {
    my $original_line = $_;
    my $line = $_;
    my $skip = 0;
    $line =~ s/--.*//;
    if ($line =~ /{-/) { $block_comment = 1;} # start block comment
    $line =~ s/{-.*//;
    if ($block_comment and $line =~ /-}/) { $block_comment=0; $skip=1} # end block comment

    if ($line =~ /^ *$/) { $skip=1; } # comment/blank
    if ($block_comment) { $skip = 1};
    if (!$skip) 
    {
      if (/^(('|\w)+)( +(('|\w)+))* *=/ ) 
      { 
        my $object = $1;
        if ((! $keywords{$object}) and !($funcs_seen{$object})) 
        {
          $funcs_seen{$object} = 1;
          print STDERR "$object\n";
          my $type = $dict{$1};

          unless ($sig) 
          {
            if ($type) {
              print NEW "$1 :: $type\n";
              print STDERR "$1 :: $type\n";
            } else {
              print STDERR "no type for $1\n";
            }
          }
        }
      }

    $sig = /^(('|\w)+) *::/; 
    }
    print NEW $original_line;
  }
  close OLD;
  close NEW;

  my $ghciPostTest = `echo 1 | ghci $new`;
  if ($ghciPostTest !~ /Ok, modules loaded: /)
  {
   print $ghciPostTest;
   print STDERR "$new is not valid Haskell source file. Will not replace original (but you might find it useful)";
   next;
  } else {
    rename ($file, $backup) or die "Could not make backup of $file -> $backup";
    rename ($new, $file) or die "Could not make new file $new";
  }
}
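
The qualified-name shortcoming noted above could be softened with a naive filter that drops module qualifiers from a signature. This is only a sketch: `strip_qualifiers` is a hypothetical helper, the sample signature is made up, and it assumes every type in the module is imported unqualified. It does no parsing of imports, so qualified imports would come out wrong.

```sh
# Naive sketch: turn "Data.Map.Map" into "Map" by deleting any run of
# capitalised, dot-terminated module components in front of a type name.
# Assumption: all types are imported unqualified.
strip_qualifiers() {
    sed -E 's/([A-Z][A-Za-z0-9_]*\.)+([A-Z])/\2/g'
}

# Made-up sample of a signature with fully qualified names
printf '%s\n' 'lookupAge :: Data.Map.Map GHC.Base.String Int -> Maybe Int' \
  | strip_qualifiers
# prints: lookupAge :: Map String Int -> Maybe Int
```

Lowercase type variables are left alone, so a signature such as `forall a. Maybe a` passes through unchanged.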


This Perl script does a hack job at it, making some assumptions about source file structure (such as: a .hs file, not .lhs; signatures are on the line immediately preceding definitions; definitions are flush on the left margin; etc.).

It tries to handle (skip over) comments, equation-style definitions (with repeated left-hand-sides), and types that generate multi-line output in ghci.

No doubt, many interesting valid cases are not handled properly. The script isn't close to respecting the actual syntax of Haskell.

It is incredibly slow, as it launches a ghci session for each function that needs a signature. It makes a backup file File.hs.bak, prints the functions it finds to stderr, as well as signatures for functions missing signatures, and writes the upgraded source code to File.hs. It uses an intermediate file File.New.hs, and has a few safety checks to avoid overwriting your content with garbage.

USE AT YOUR OWN RISK.

This script might format your hard drive, burn your house down, unsafePerformIO, and have other impure side effects. In fact, it probably will.

I feel so dirty.

Tested on Mac OS X 10.6 Snow Leopard with a couple of my own .hs source files.

#!/usr/bin/env perl

use warnings;
use strict;

my $sig=0;
my $file;

my %funcs_seen = ();
my %keywords = ();
# qw() must be wrapped in parentheses here; the bare form is a syntax error in Perl 5.18+
for my $kw (qw(type newtype data class)) { $keywords{$kw} = 1;}

foreach $file (@ARGV) 
{
  if ($file =~ /\.lhs$/) 
  {
    print STDERR "$file: .lhs is not supported. Skipping.\n";
    next;
  }

  if ($file !~ /\.hs$/) 
  {
    print STDERR "$file is not a .hs file. Skipping.\n";
    next;
  }

  my $ghciPreTest = `echo 1 | ghci $file`;
  if ($ghciPreTest !~ /Ok, modules loaded: /)
  {
   print STDERR $ghciPreTest;
   print STDERR "$file is not valid Haskell source file. Skipping.\n";
   next;
  }

  my $module = $file;
  $module =~ s/\.hs$//;

  my $backup = "$file.bak";
  my $new = "$module.New.hs";
  -e $backup and die "Backup $backup file exists. Refusing to overwrite. Quitting";
  open(OLD, '<', $file) or die "Could not open $file: $!";
  open(NEW, '>', $new) or die "Could not write $new: $!";

  print STDERR "Functions in $file:\n";
  my $block_comment = 0;
  while (<OLD>) 
  {
    my $original_line = $_;
    my $line = $_;
    my $skip = 0;
    $line =~ s/--.*//;
    if ($line =~ /{-/) { $block_comment = 1;} # start block comment
    $line =~ s/{-.*//;
    if ($block_comment and $line =~ /-}/) { $block_comment=0; $skip=1} # end block comment

    if ($line =~ /^ *$/) { $skip=1; } # comment/blank
    if ($block_comment) { $skip = 1};
    if (!$skip) 
    {
      if (/^(('|\w)+)( +(('|\w)+))* *=/ ) 
      { 
        my $object = $1;
        if ((! $keywords{$object}) and !($funcs_seen{$object})) 
        {
          $funcs_seen{$object} = 1;
          print STDERR "$object\n";
          my $dec=`echo ":t $1" | ghci $file  | grep -A100 "^[^>]*$module>" | grep -v "Leaving GHCi\." | sed -e "s/^[^>]*$module> //"`;

          unless ($sig) 
          {
            print NEW $dec;
            print STDERR $dec;
          }
        }
      }

    $sig = /^(('|\w)+) *::/; 
    }
    print NEW $original_line;
  }
  close OLD;
  close NEW;

  my $ghciPostTest = `echo 1 | ghci $new`;
  if ($ghciPostTest !~ /Ok, modules loaded: /)
  {
   print $ghciPostTest;
   print STDERR "$new is not valid Haskell source file. Will not replace original (but you might find it useful)";
   next;
  } else {
    rename ($file, $backup) or die "Could not make backup of $file -> $backup";
    rename ($new, $file) or die "Could not make new file $new";
  }
}
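
The detection heuristic used by the script above can be looked at in isolation. The following is a rough sketch, not a port of the Perl: `missing_signatures` is a hypothetical helper that makes the same assumptions (definitions start in column 1, a signature sits on the immediately preceding line, repeated equations count once), ignores comments entirely, and does not handle names containing primes.

```sh
# Sketch: print each top-level name that is defined without a signature
# on the line directly above its first equation.
missing_signatures() {
    awk '
        BEGIN {
            keyword["type"] = 1; keyword["newtype"] = 1
            keyword["data"] = 1; keyword["class"] = 1
        }
        # Signature line: remember the name it declares, then move on
        /^[a-z_][A-Za-z0-9_]* *::/ { sig = $1; sub(/::.*/, "", sig); next }
        # Definition line: name, optional simple arguments, then "="
        /^[a-z_][A-Za-z0-9_]*( +[A-Za-z0-9_]+)* *=/ {
            name = $1; sub(/=.*/, "", name)
            if (name != sig && !(name in seen) && !(name in keyword)) print name
            seen[name] = 1
        }
        # Any line that is not a signature cancels the remembered signature
        { sig = "" }
    '
}

# Made-up sample module fed in on stdin
printf '%s\n' \
    'module Foo where' \
    '' \
    'data Color = Red | Green' \
    '' \
    'double :: Int -> Int' \
    'double x = x * 2' \
    '' \
    'fact 0 = 1' \
    'fact n = n * fact (n - 1)' \
  | missing_signatures
# prints: fact
```

Definitions that use guards or pattern-match on constructors are not recognised by this pattern, which is one of the many valid cases such a line-based approach misses.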


For the Atom editor it's possible to automatically insert the type signature per function with the package haskell-ghc-mod, which provides:

 'ctrl-alt-T': 'haskell-ghc-mod:insert-type'

https://atom.io/packages/haskell-ghc-mod#keybindings


Here's another hacky attempt based on parsing GHC -Wmissing-signatures warnings, so the script doesn't have to parse Haskell. It transforms the warnings into a sed script that does the insertions and prints its result to stdout, or modifies the file in place if -i is given.

Requires a Stack project as configured below, but you can change the buildCmd.

Works on the few files I tried it on with GHC 8.2.2 and 8.4.3, but the same warnings as in @misterbee's first answer apply :) Also, it will obviously break with older or newer GHCs if they produce differently formatted warnings (but for me, the more sophisticated tooling seems to break all the time too, so...).

#!/bin/zsh

set -eu
setopt rematchpcre

help="Usage: ${0:t} [-d] [-i | -ii] HASKELL_FILE

Options:
  -d   Debug
  -i   Edit target file inplace instead of printing to stdout
           (Warning: Trying to emulate this option by piping from 
            and to the same file probably won't work!)
  -ii  Like -i, but no backup
"


### CONFIG ###

buildCmd() {
    touch $inputFile
    stack build --force-dirty --ghc-options='-fno-diagnostics-show-caret -Wmissing-signatures'
}

# First group must be the filename, second group the line number
warningRegexL1='^(.*):([0-9]+):[0-9]+(-[0-9]+)?:.*-Wmissing-signatures'

# First group must be the possible same-line type signature (can be empty)
warningRegexL2='Top-level binding with no type signature:\s*(.*)'

# Assumption: The message is terminated by a blank line or an unindented line
messageEndRegex='^(\S|\s*$)'


### END OF CONFIG ###


zparseopts -D -E d=debug i+=inplace ii=inplaceNoBackup h=helpFlag

[[ -z $helpFlag ]] || { printf '%s' $help; exit 0 }

# Make -ii equivalent to -i -i
[[ -z $inplaceNoBackup ]] || inplace=(-i -i)

inputFile=${1:P} # :P takes the realpath

[[ -e $inputFile ]] || { echo "Input file does not exist: $inputFile" >&2; exit 2 }

topStderr=${${:-/dev/stderr}:P}

debugMessage()
{
    [[ -z $debug ]] || printf '[DBG] %s\n' "$*" > $topStderr
}

debugMessage "inputFile = $inputFile"

makeSedScript() 
{
    local line

    readline() {
        IFS= read -r line || return 1
        printf '[build] %s\n' $line >&2
    }

    while readline; do
        [[ $line =~ $warningRegexL1 ]] || { debugMessage "^ Line doesn't match warningRegexL1"; continue }
        file=${match[1]}
        lineNumber=${match[2]}

        [[ ${file:P} = $inputFile ]] || { debugMessage "^ Not our file: $file"; continue }

        # Begin sed insert command
        printf '%d i ' $lineNumber

        readline

        [[ $line =~ $warningRegexL2 ]] ||\
            { printf 'WARNING: Line after line matching warningRegexL1 did not match warningRegexL2:\n %s\n' $line >&2
              continue }

        inlineSig=${match[1]}

        debugMessage "^ OK, inlineSig = $inlineSig"

        printf '%s' $inlineSig

        readline


        if [[ ! ($line =~ $messageEndRegex) ]]; then

            [[ $line =~ '^(\s*)(.*)$' ]]

            indentation=${match[1]}

            [[ -z $inlineSig ]] || printf '\\n'

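            # Pass the text as data, not as the format string, so that
            # '%' or '\' in a signature is not interpreted by printf
            printf '%s' ${match[2]}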

            while readline && [[ ! ($line =~ $messageEndRegex) ]]; do
                printf '\\n%s' ${line#$indentation}
            done
        fi

        debugMessage "^ OK, Type signature ended above this line"

        # End sed insert command
        printf '\n'

    done
}

prepend() {
    while IFS= read -r line; do printf '%s%s\n' $1 $line; done
}

sedScript="$(buildCmd |& makeSedScript)"

if [[ -z $sedScript ]]; then
    echo "No type-signature warnings for the given input file were detected (try -d option to debug)" >&2
    exit 1
fi

printf "\nWill apply the following sed script:\n" >&2
printf '%s\n' $sedScript | prepend "[sed] " >&2

sedOptions=()

if [[ $#inplace -ge 1 ]]; then 
    sedOptions+=(--in-place)
    [[ $#inplace -ge 2 ]] || cp -p --backup=numbered $inputFile ${inputFile}.bak
fi


sed $sedOptions -f <(printf '%s\n' $sedScript) $inputFile
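
To make the core transformation of the script above concrete, here is a stripped-down sketch that turns a single warning into one sed insert command. `warning_to_sed` is a hypothetical helper, the warning text is a made-up sample in the format the script's regexes expect, and only signatures that fit on the warning's second line are handled. File names containing a colon are not supported.

```sh
# Sketch: convert a -Wmissing-signatures warning into "LINE i SIGNATURE".
warning_to_sed() {
    awk '
        # First line of a warning: FILE:LINE:COL: ... -Wmissing-signatures
        /-Wmissing-signatures/ { split($0, parts, ":"); line = parts[2]; next }
        # Second line carries the inferred signature after a fixed prefix
        line != "" && /Top-level binding with no type signature:/ {
            sub(/^.*Top-level binding with no type signature:[ \t]*/, "")
            printf "%d i %s\n", line, $0
            line = ""
        }
    '
}

# Made-up sample warning (assumed GHC 8.x layout)
printf '%s\n' \
    'Foo.hs:3:1: warning: [-Wmissing-signatures]' \
    '    Top-level binding with no type signature: foo :: Integer -> Integer' \
  | warning_to_sed
# prints: 3 i foo :: Integer -> Integer
```

The resulting line is a one-line insert command in the GNU sed dialect, telling sed to place the signature before line 3 of the source file.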


Comments
  • The hs-lint command in Emacs will automatically apply suggestions if hs-lint-replace-without-ask is set to t. I’m not sure how to restrict it to just type signatures, but surely there must be a way. And I’m only posting this as a comment because it’s not an EclipseFP solution.
  • This is true, but it's only barely better than the EclipseFP+hlint solution.
  • You should be able to make it a lot faster by only loading the file into GHCi once and using :browse.
  • @ehird: Thanks for the pointer. I posted another answer that uses ":browse", but has other problems :-(