## Lambda calculus predecessor function reduction steps

beta reduction lambda calculus

lambda calculus multiplication

lambda calculus decrement

lambda calculus successor function

lambda calculus programming language

lambda calculus calculator

integers in lambda calculus

I am getting stuck with the Wikipedia description of the predecessor function in lambda calculus.

What Wikipedia says is the following:

PRED := λnfx.n (λgh.h (g f)) (λu.x) (λu.u)

Can someone explain reduction processes step-by-step?

Thanks.

Ok, so the idea of Church numerals is to encode "data" using functions, right? The way that works is by representing a value by some generic operation you'd perform with it. We can therefore go in the other direction as well, which can sometimes make things clearer.

Church numerals are a unary representation of the natural numbers. So, let's use `Z`

to mean zero and `Sn`

to represent the successor of `n`

. Now we can count like this: `Z`

, `SZ`

, `SSZ`

, `SSSZ`

... The equivalent Church numeral takes two arguments--the first corresponding to `S`

, and second to `Z`

--then uses them to construct the above pattern. So given arguments `f`

and `x`

, we can count like this: `x`

, `f x`

, `f (f x)`

, `f (f (f x))`

...

Let's look at what PRED does.

First, it creates a lambda taking three arguments--`n`

is the Church numeral whose predecessor we want, of course, which means that `f`

and `x`

are the arguments to the resulting numeral, which thus means that the body of that lambda will be `f`

applied to `x`

one time fewer than `n`

would.

Next, it applies `n`

to *three* arguments. This is the tricky part.

The second argument, that corresponds to `Z`

from earlier, is `λu.x`

--a constant function that ignores one argument and returns `x`

.

The first argument, that corresponds to `S`

from earlier, is `λgh.h (g f)`

. We can rewrite this as `λg. (λh.h (g f))`

to reflect the fact that only the outermost lambda is being applied `n`

times. What this function does is take the accumulated result so far as `g`

and return a new function taking one argument, which applies that argument to `g`

applied to `f`

. Which is absolutely baffling, of course.

So... what's going on here? Consider the direct substitution with `S`

and `Z`

. In a non-zero number `Sn`

, the `n`

corresponds to the argument bound to `g`

. So, remembering that `f`

and `x`

are bound in an outside scope, we can count like this: `λu.x`

, `λh. h ((λu.x) f)`

, `λh'. h' ((λh. h ((λu.x) f)) f)`

... Performing the obvious reductions, we get this: `λu.x`

, `λh. h x`

, `λh'. h' (f x)`

... The pattern here is that a function is being passed "inward" one layer, at which point an `S`

will apply it, while a `Z`

will ignore it. So we get one application of `f`

for each `S`

*except* the outermost.

The third argument is simply the identity function, which is dutifully applied by the outermost `S`

, returning the final result--`f`

applied one fewer times than the number of `S`

layers `n`

corresponds to.

**Lambda calculus,** Run. Also see Hai (Paul) Liu's step-by-step Lambda Viewer. In lambda calculus, this is called beta reduction, and we'd write this example as: ( λ a b . a 2 + b 2 ) 3 4 With the Scott encoding, we have a fast and simple predecessor function:. Lambda calculus. Lambda calculus has a way of spiraling into a lot of steps, making solving problems tedious, and it can look real hard, but it isn't actually that bad. In lambda calculus, there are only lambdas, and all you can do with them is substitution.

McCann's answer explains it pretty well. Let's take a concrete example for Pred 3 = 2:

Consider expression: n (λgh.h (g f)) (λu.x). Let K = (λgh.h (g f))

For n = 0, we encode `0 = λfx.x`

, so when we apply the beta reduction for `(λfx.x)(λgh.h(gf))`

means `(λgh.h(gf))`

is replaced 0 times. After further beta-reduction we get:

`λfx.(λu.x)(λu.u)`

reduces to

`λfx.x`

where `λfx.x = 0`

, as expected.

For n = 1, we apply K for 1 times:

```
(λgh.h (g f)) (λu.x)
=> λh. h((λu.x) f)
=> λh. h x
```

For n = 2, we apply K for 2 times:

```
(λgh.h (g f)) (λh. h x)
=> λh. h ((λh. h x) f)
=> λh. h (f x)
```

For n = 3, we apply K for 3 times:

```
(λgh.h (g f)) (λh. h (f x))
=> λh.h ((λh. h (f x)) f)
=> λh.h (f (f x))
```

Finally, we take this result and apply an id function to it, we got

```
λh.h (f (f x)) (λu.u)
=> (λu.u)(f (f x))
=> f (f x)
```

This is the definition of number 2.

The list based implementation might be easier to understand, but it takes many intermediate steps. So it is not as nice as the Church's original implementation IMO.

**Lambda calculus - Lambda Calculus,** Lambda calculus is a calculus with its core features of function definition and see through β reduction steps, how the computation progresses. applying a successor function to any number function should get us a function that represents. Note that the number of beta-reduction steps that this predecessor function takes is proportioned to the size of input number m, unlike Scott encoded predecessor, which take three steps. 2.3 Parigot Encoded Numbers Parigot encoding was proposed to enable constant time predecessor function while allowing programming with the iterators.

After Reading the previous answers (good ones), I’d like to give my own vision of the matter in hope it helps someone (corrections are welcomed). I’ll use an example.

First off, I’d like to add some parenthesis to the definition that made everything clearer to me. Let’s redifine the given formula to:

PRED := λn λf λx.(n (λgλh.h (g f)) (λu.x)) (λu.u)

Let’s also define three Church numerals that will help with the example:

Zero := λfλx.x One := λfλx. f (Zero f x) Two := λfλx. f (One f x) Three := λfλx. f (Two f x)

In order to understand how this works, let's focus first on this part of the formula:

n (λgλh.h (g f)) (λu.x)

From here, we can extract this conclusions:
`n`

is a Church numeral, the function to be applied is `λgλh.h (g f)`

and the starting data is `λu.x`

With this in mind, let's try an example:

PRED Three := λf λx.(Three (λgλh.h (g f)) (λu.x)) (λu.u)

Let's focus first on the reduction of the numeral (the part we explained before):

Three (λgλh.h (g f)) (λu.x)

Which reduces to:

(λgλh.h (g f)) (Two (λgλh.h (g f)) (λu.x)) (λgλh.h (g f)) ((λgλh.h (g f)) (One (λgλh.h (g f)) (λu.x))) (λgλh.h (g f)) ((λgλh.h (g f)) ((λgλh.h (g f)) (Zero (λgλh.h (g f)) (λu.x)))) (λgλh.h (g f)) ((λgλh.h (g f)) ((λgλh.h (g f)) ((λfλx.x) (λgλh.h (g f)) (λu.x)))) -- Here we lose one application of f (λgλh.h (g f)) ((λgλh.h (g f)) ((λgλh.h (g f)) (λu.x))) (λgλh.h (g f)) ((λgλh.h (g f)) (λh.h ((λu.x) f))) (λgλh.h (g f)) ((λgλh.h (g f)) (λh.h x)) (λgλh.h (g f)) (λh.h ((λh.h x) f)) (λgλh.h (g f)) (λh.h (f x)) (λh.h ((λh.h (f x) f)))

Ending up with:

λh.h f (f x)

So, we have:

PRED Three := λf λx.(λh.h (f (f x))) (λu.u)

Reducing again:

PRED Three := λf λx.((λu.u) (f (f x))) PRED Three := λf λx.f (f x)

As you can see in the reductions, we end up applying the function one time less thanks to a clever way of using functions.

Using add1 as `f`

and 0 as `x`

, we get:

PRED Three add1 0 := add1 (add1 0) = 2

Hope this helps.

**[PDF] CS 329 Notes on Untyped Lambda Calculus 1 ,** sive functions, even though functions in λ calculus are not given names and thus In normal order reduction we try to reduce always the left most expression of a series of The successor function applied to our representation for zero yields. The purpose of β-reduction is to calculate a value. A value in lambda calculus is a function. So β-reduction continues until the expression looks like a function abstraction. A lambda expression that cannot be reduced further, by either β-redex, or η-redex is in normal form. Note that alpha-conversion may convert functions.

You can try to understand this definition of the predecessor function (not my favourite one) in terms of continuations.

To simplify the matter a bit, let us consider the following variant

PRED := λn.n (λgh.h (g S)) (λu.0) (λu.u)

then, you can replace S with f, and 0 with x.

The body of the function iterates n times a transformation M over an argument N. The argument N is a function of type (nat -> nat) -> nat that expects a continuation for nat and returns a nat. Initially, N = λu.0, that is it ignores the continuation and just returns 0. Let us call N the current computation.

The function M: (nat -> nat) -> nat) -> (nat -> nat) -> nat modifies the computation g: (nat -> nat)->nat as follows. It takes in input a continuation h, and applies it to the result of continuing the current computation g with S.

Since the initial computation ignored the continuation, after one application of M we get the computation (λh.h 0), then (λh.h (S 0)), and so on.

At the end, we apply the computation to the identity continuation to extract the result.

**[PDF] A Tutorial Introduction to the Lambda Calculus,** Lambda calculus is a notation for describing mathematical functions and The β reduction (λx.e) e' → e{e'/x} is the basic computational step of the λ-calculus. In the assuming 3, 4, and S (the successor function) are appropriately defined:. Lambda calculus predecessor function reduction steps (3) I am getting stuck with the Wikipedia description of the predecessor function in lambda calculus. What Wikipedia says is the following: PRED := λnfx.n (λgh.h (g f)) (λu.x) (λu.u)

I'll add my explanation to the above good ones, mostly for the sake of my own understanding. Here's the definition of PRED again:

PRED := λnfx. (n (λg (λh.h (g f))) ) λu.x λu.u

The stuff on the right side of the first dot is supposed to be the (n-1) fold composition of f applied to x: f^(n-1)(x).

Let's see why this is the case by incrementally grokking the expression.

λu.x is the constant function valued at x. Let's just denote it const_x.

λu.u is the identity function. Let's call it id.

λg (λh.h (g f)) is a weird function that we need to understand. Let's call it F.

Ok, so PRED tells us to evaluate the n-fold composition of F on the constant function and then to evaluate the result on the identity function.

PRED := λnfx. F^n const_x id

Let's take a closer look at F:

F:= λg (λh.h (g f))

F sends g to evaluation at g(f). Let's denote evaluation at value y by ev_y. That is, ev_y := λh.h y

So

F = λg. ev_{g(f)}

Now we figure out what F^n const_x is.

F const_x = ev_{const_x(f)} = ev_x

and

F^2 const_x = F ev_x = ev_{ev_x(f)} = ev_{f(x)}

Similarly,

F^3 const_x = F ev_{f(x)} = ev_{f^2(x)}

and so on:

F^n const_x = ev_{f^(n-1)(x)}

Now,

PRED = λnfx. F^n const_x id = λnfx. ev_{f^(n-1)(x)} id = λnfx. id(f^(n-1)(x)) = λnfx. f^(n-1)(x)

which is what we wanted.

Super goofy. The idea is to turn doing something n times into doing f n-1 times. The solution is to apply F n times to const_x to obtain ev_{f^(n-1)(x)} and then to extract f^(n-1)(x) by evaluating at the identity function.

**The Lambda Calculus,** Using lambda calculus to write simple functions; Implementing The predecessor function on Church numerals; Recursion using the Y combinator Showing each reduction step, reduce the following term to normal form. The λ calculus is developed as a theory of functions for manipulating functions in a purely syntactic manner. All functional programming languages can be viewed as syntactic variations of the lambda calculus, so that both their semantics and implementation can be analysed in the context of the lambda calculus.

**Lambda Calculus,** The lambda calculus is a theory of functions as formulas. It is a system Finally, β-equivalence is obtained by allowing reduction steps as well as inverse reduction predecessor function is sometimes referred to as the “wisdom teeth trick”.). The Lambda Calculus can also be used to compute neural networks with arbitrary accuracy, by expressing the strengths of the connections between individual neurons, and the activation values of the neurons as numbers, and by calculating the spreading of activation through the network in very small time steps.

**[PDF] Lecture Notes on the Lambda Calculus,** Functional programming languages are based on the Lambda-calculus step with the beta-reduction rule in the lambda calculus and we shall see that, if one wish integers as results, and computes the successor function (of course the type All functions are unary in the lambda calculus. Functions of higher arity are simulated by currying. A function of two arguments is simulated by a function of one argument that returns a function of one argument that returns the result. That is, we would simulate (lambda (x y) E) in curried form by (lambda (x) (lambda (y) E)) = lx.ly.E

**Lambda-calculus,** Church developed the lambda calculus in the 1930s as a theory of functions that provides rules for (the successor function), and sqr (the squaring function). reduced. Later we show a step-by-step reduction of this lambda expres- sion to 7. Here is the sequence of $\beta$-reductions without the intermediate steps: $$(\lambda x.x\ y)(\lambda y.y\ z) \rightarrow_\beta (\lambda y.y\ z) y \rightarrow_\beta y\ z.$$ The second problem can be dealt with in the same way.

##### Comments

- I find lambda notation hard to follow, so I've done step-by-step reductions with S-expressions using Clojure for
`(pred zero)`

,`(pred one)`

and`(pred two)`

and published it on Github. - +1 for the important insight that "the body of that lambda will be
`f`

applied to`x`

one time fewer than`n`

would." But how does it achieve this goal is still beyond me from your description. It would probably help with adding some more abstractions to this formula, abstracting the ideas in a bit higher level. For example, by replacing that`Lu.u`

with`I`

, the Identity function. And maybe some others too. I saw an interesting explanation of it here: mactech.com/articles/mactech/Vol.07/07.05/LambdaCalculus/… which deciphers these lambdas as list operations (cons,car,cdr). - I think the list version ends up being a different, more elaborate implementation, albeit an easier to understand one. The predecessor definition here really is just hard to comprehend, and the best way to see how it works might be to step through the evaluation manually to get a feel for what's going on.
- I don't understand how you get from "For n = so and so many times we apply K for so and so many times" to each of the first lines of the derivations.
- Yeah it's not explicitly stated; but it comes from the
`f`

term in a numeral. For instance 2 is defined as`λfx.f(fx)`

. So that means when the beta reduction is done for`(λfx.f(fx))(λgh.h (g f))`

the`(λgh.h (g f))`

function is applied twice.