Proposal: Introducing a `nat` Datatype in Koka #616

chtenb · 2024-12-12T21:37:08Z

The nat type (natural numbers ≥ 0) provides a safer and more precise alternative to int when negative values are invalid. By introducing nat, Koka can improve type safety, enhance domain modeling, and enable performance optimizations.

Benefits of `nat`

Accurate Domain Modeling: User-defined data types can explicitly reflect fields that are inherently non-negative, improving clarity and correctness.
Type-Safe Documentation: Function signatures can specify non-negativity constraints, making code self-documenting and reducing ambiguity.
Eliminating Redundant Checks: Manual runtime validations for non-negative values become unnecessary, reducing boilerplate code.
Support for Bitwise Operations: nat enables efficient bitwise operations by leveraging base-2 representations for large naturals (as opposed to base-10 which is used for large integers). This is enables efficient bitset datastructures, but also other datastructures and algorithms could benefit from this.
Performance Benefits: The lack of a need to represent negative numbers give the opportunity to choose a memory representation that allows a larger payload than the sofa technique.

Literals

(Taken from the Unison programming language)

Natural literals (0, 1, 42) would be parsed as nat.
Negative literals (-0, -1, -42) remain int.
Positive literals (+0, +1, +42) would be parsed as int.

Integration into the Standard Library

Once Koka supports the nat datatype, the standard library can be updated to take advantage of it. Many existing functions and types in kklib currently use int where nat would be more appropriate. Examples include:

Indices: Functions operating on indices can safely use nat.

fun foreach-indexed( xs : list<a>, action : (nat, a) -> e () ) : e ()

Quantities and Sizes: Lengths, counts, and similar values can be modeled as nat.

fun length( xs : list<a> ) : nat
fun drop( n : nat, xs : list<a> ) : list<a>

Conversions between int and nat

Conversions between int and nat are (and should be) explicit because they are different types. But we want to define these conversions in the most ergonomic manner possible, so that the difference between int and nat never gets in the way.

// If x < 0 return 0, otherwise x
fun nat( x : int ) : nat
// Return n as integer
fun int( n : nat ) : int

Implementation

For large naturals the same approach can be taken as for integers, except that base-2 digits should be used instead of base-10 digits, to allow efficient bitwise operations.
For small naturals (61 bits or less), a variant of the litbit technique can be deployed, where we use the sign bit to indicate overflow.

s	payload	0	t

For any 61 bit number n, the formula 4n+1 can be used to represent the number in this fashion.

As an example, this is how addition: nat + nat -> nat could be implemented.

#include <stdint.h>
#include <stdio.h>

#define unlikely(x)     (__builtin_expect(!!(x),false))
#define likely(x)       (__builtin_expect(!!(x),true))

uint64_t generic(uint64_t x, uint64_t y) {
  printf("generic\n");
  return 0; // stub
}

inline uint64_t add(uint64_t x, uint64_t y) {
  const uint64_t z = x + y;
  if likely( (((1LLU << 63) + 3) & z) == 2 ) return z - 1;
  return generic(x,y);
} 

uint64_t run(uint64_t x, uint64_t y){
  return add(x, y);
}

The text was updated successfully, but these errors were encountered:

TimWhiting modified the milestones: Feature Requests, Considered Features Dec 21, 2024

TimWhiting added proposal stdlib labels Dec 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Introducing a `nat` Datatype in Koka #616

Proposal: Introducing a `nat` Datatype in Koka #616

chtenb commented Dec 12, 2024

Proposal: Introducing a nat Datatype in Koka #616

Proposal: Introducing a nat Datatype in Koka #616

Comments

chtenb commented Dec 12, 2024

Benefits of nat

Literals

Integration into the Standard Library

Conversions between int and nat

Implementation

Proposal: Introducing a `nat` Datatype in Koka #616

Proposal: Introducing a `nat` Datatype in Koka #616

Benefits of `nat`