Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Introducing a nat Datatype in Koka #616

Open
chtenb opened this issue Dec 12, 2024 · 0 comments
Open

Proposal: Introducing a nat Datatype in Koka #616

chtenb opened this issue Dec 12, 2024 · 0 comments

Comments

@chtenb
Copy link
Contributor

chtenb commented Dec 12, 2024

The nat type (natural numbers ≥ 0) provides a safer and more precise alternative to int when negative values are invalid. By introducing nat, Koka can improve type safety, enhance domain modeling, and enable performance optimizations.

Benefits of nat

  • Accurate Domain Modeling: User-defined data types can explicitly reflect fields that are inherently non-negative, improving clarity and correctness.
  • Type-Safe Documentation: Function signatures can specify non-negativity constraints, making code self-documenting and reducing ambiguity.
  • Eliminating Redundant Checks: Manual runtime validations for non-negative values become unnecessary, reducing boilerplate code.
  • Support for Bitwise Operations: nat enables efficient bitwise operations by leveraging base-2 representations for large naturals (as opposed to base-10 which is used for large integers). This is enables efficient bitset datastructures, but also other datastructures and algorithms could benefit from this.
  • Performance Benefits: The lack of a need to represent negative numbers give the opportunity to choose a memory representation that allows a larger payload than the sofa technique.

Literals

(Taken from the Unison programming language)

  • Natural literals (0, 1, 42) would be parsed as nat.
  • Negative literals (-0, -1, -42) remain int.
  • Positive literals (+0, +1, +42) would be parsed as int.

Integration into the Standard Library

Once Koka supports the nat datatype, the standard library can be updated to take advantage of it. Many existing functions and types in kklib currently use int where nat would be more appropriate. Examples include:

Indices: Functions operating on indices can safely use nat.

fun foreach-indexed( xs : list<a>, action : (nat, a) -> e () ) : e ()

Quantities and Sizes: Lengths, counts, and similar values can be modeled as nat.

fun length( xs : list<a> ) : nat
fun drop( n : nat, xs : list<a> ) : list<a>

Conversions between int and nat

Conversions between int and nat are (and should be) explicit because they are different types. But we want to define these conversions in the most ergonomic manner possible, so that the difference between int and nat never gets in the way.

// If x < 0 return 0, otherwise x
fun nat( x : int ) : nat
// Return n as integer
fun int( n : nat ) : int

Implementation

For large naturals the same approach can be taken as for integers, except that base-2 digits should be used instead of base-10 digits, to allow efficient bitwise operations.
For small naturals (61 bits or less), a variant of the litbit technique can be deployed, where we use the sign bit to indicate overflow.

s payload 0 t

For any 61 bit number n, the formula 4n+1 can be used to represent the number in this fashion.

As an example, this is how addition: nat + nat -> nat could be implemented.

#include <stdint.h>
#include <stdio.h>

#define unlikely(x)     (__builtin_expect(!!(x),false))
#define likely(x)       (__builtin_expect(!!(x),true))

uint64_t generic(uint64_t x, uint64_t y) {
  printf("generic\n");
  return 0; // stub
}

inline uint64_t add(uint64_t x, uint64_t y) {
  const uint64_t z = x + y;
  if likely( (((1LLU << 63) + 3) & z) == 2 ) return z - 1;
  return generic(x,y);
} 

uint64_t run(uint64_t x, uint64_t y){
  return add(x, y);
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants