Re-organizing the Random Library for V

One of my biggest contributions to an Open Source Project

by Subhomoy Haldar

Reading time: 6 minutes

Update (April 2021)

My work on the rand module is reasonably complete. I believe the foundation is solid enough for future developers to build on. I only added a few non-uniform distribution functions; several more remain. Adding one is a matter of implementing the function correctly, writing sufficient unit tests, informing the people on the Discord server, and getting the Pull Request merged.

The updated documentation is available here: rand | vdoc.

Backlit keyboard with monitors
Photo by Fotis Fotopoulos on Unsplash

TL;DR

My team and I are in charge of V’s rand library, and we’re working on improving it.
In this article, I break down all of the work on the module, both complete and ongoing.

Why I Chose V

The Java Programming Language
Java - Where it all began

I wrote code exclusively in Java from 2013 to 2019. From 2016 onwards, Oracle started collecting royalties ⟨1⟩ at the cost of losing developers, and the Java community was increasingly pushed to rely on OpenJDK binaries. New Java versions were being released so quickly that many breaking changes were introduced, splitting developers into those who relied on the older stable versions and those who wanted new features in their projects. Several software projects broke. I didn’t want to side with either camp and decided to look for other languages to call home. I tried several languages before I realised a few things:

  1. I had my start in a strongly typed language. Therefore, I ended up preferring strongly-typed languages.
  2. I enjoyed Python ⟨2⟩ because it was quick and easy to learn. It remains my language of choice when I want to prototype something. However, I do not like duck typing ⟨3⟩.
  3. I did not like JavaScript because of the myriad ways it can cause runtime errors without telling me what the problem is. TypeScript ⟨4⟩, however, is amazing. It addresses many of the complaints I have with JS. And I never use JS-based languages outside the browser.
  4. C (not C++) is simple and therapeutic. I come back to C frequently when I need to write software that is not too ambitious but needs to be simple and fast (for example: bump ⟨5⟩).

I discovered V Lang ⟨6⟩ in late 2019 and it piqued my interest. I didn’t get around to using it though. It was only after the pandemic hit and cancelled my MTTS Summer Camp that I decided to give Open Source development a shot.

Getting My Feet Wet

I joined The (Official) V Language & Apps Discord Server ⟨7⟩ towards the beginning of 2020. I lurked around, asked questions, got to know the community and the language. The compiler had just been rewritten from scratch and everyone was engrossed in ironing out the issues. I took a few weeks to study the codebase and came across the math.fractions package. It was a bit neglected in the sense that it was missing operator overloading and lacked several functions that I thought were important.

I ended up creating my first 5 Pull Requests ⟨8⟩ with the intention of improving the module and adding unit tests. All of them were merged. This is how I got my start in Open Source and I enjoyed the process. After I was satisfied with the work, I decided to tackle the rand module. It was a collection of several functions that were added by people who specifically needed those and nothing else. While they may have done a good job with their individual PRs, the module was lacking structure and cohesion. I decided to do something about it.

Meet the Team

I teamed up with @Delta456 and @x0r19x91 on this project. We decided to split the work into several phases to make everything easier to manage and to make sure that we did not inconvenience the other developers working on the main codebase.

From the organisation, @medvednikov and @spytheman were very helpful as well; they merged the pull requests, answered my queries and discussed viable implementation strategies. And thanks to @Larpon for being a great support. 😄

Deep Dive

I knew that reorganizing the existing rand module was a monumental task and that it could not be done in a single PR. Hence, I proposed three phases to execute this plan:

  1. Phase 1: Add PRNG structs and encapsulate the default libc PRNG into the proposed SysRNG struct.
  2. Phase 2: A clean-up of the existing rand module involving updating all usages of the module and putting the new structs to use. Also added a bunch of global functions for ease of use.
  3. Phase 3 (Ongoing): Further refinement of the module and adding functions for non-uniform distributions and generic array utilities.

Phase 1

Link to the Pull Request: Phase 1

As stated before, the module lacked cohesion. I made a list of all the Pseudo Random Number Generators (PRNGs) that were either already present in the module, or V users would benefit from. I then proposed a common interface for all of them and we set to work implementing them and adding unit tests. I chose to support uniform functions only and minimize the amount of redundancy. The emphasis was to make each RNG independent and as fast as possible.

This phase involved the most grunt work and required a lot of unit tests to ensure that the code works correctly. The pull request added 3,800+ lines to the codebase and removed very little in comparison. I made sure that the generator outputs were uniform, had the correct sample mean and variance, and so on. That said, I was pleased to have an assignment that justified both of my majors, Mathematics and Computing, at the same time.
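The kind of statistical sanity check mentioned above can be sketched roughly as follows. This is only an illustration, not the actual test suite from the PR: the helper name sample_stats, the sample size, and the tolerances are assumptions made for this example, and it uses the module-level rand.f64() rather than a specific generator. For a uniform distribution on [0, 1), the expected mean is 1/2 and the expected variance is 1/12.

```v
module main

import math
import rand

// sample_stats is a hypothetical helper (not part of vlib). It draws n
// values from rand.f64() and returns their sample mean and variance.
fn sample_stats(n int) (f64, f64) {
	mut sum := 0.0
	mut sum_sq := 0.0
	for _ in 0 .. n {
		x := rand.f64() // uniform in [0, 1)
		sum += x
		sum_sq += x * x
	}
	mean := sum / f64(n)
	// Variance computed as E[x^2] - (E[x])^2.
	variance := sum_sq / f64(n) - mean * mean
	return mean, variance
}

fn main() {
	mean, variance := sample_stats(1_000_000)
	// Loose tolerances (an assumption) so the check does not fail by chance.
	assert math.abs(mean - 0.5) < 0.01
	assert math.abs(variance - 1.0 / 12.0) < 0.01
	println('mean = ${mean}, variance = ${variance}')
}
```

The tolerances are deliberately generous: with a million samples the standard error of the mean is roughly 0.0003, so a correct generator should pass comfortably.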

Owing to this PR, the following code works:

module main

import rand.seed
import rand.wyrand

fn main() {
    mut rng := wyrand.WyRandRNG{}
    rng.seed(seed.time_seed_array(2))

    for _ in 0 .. 10 {
        println(rng.intn(100))
    }
}

Phase 2

Link to the Pull Request: Phase 2

Clean up time. This PR added 592 lines of code and removed 397. This is small compared to Phase 1, but it’s still a large PR.

I removed all the old functions and all their usages in the library. I updated the global API of the module to mirror the API of the individual RNGs, and I added a lot of documentation as well. It is because of this PR that the random API is greatly simplified.

module main

import rand

fn main() {
    for _ in 0 .. 10 {
        println(rand.intn(100))
    }
}

The default_rng used is the WyRandRNG. The code is functionally the same but it is a lot cleaner than in the last example.

Phase 3

Note: Check the update section at the top.

Link to the Issue/Roadmap: Phase 3

Until now, the redundancy in the module was acceptable owing to potential performance gains. However, the library is far from complete. It does not have any non-uniform distributions currently. A bunch of new functions need to be introduced. I was considering the best way to ensure that programmers can use whatever generator they want while having access to these functions. We had a discussion on the Discord server and it was decided to use interfaces for the job (a relatively new feature in V).

Although there is a fair amount of work left, I am confident we can pull this off relatively quickly and have a feature-rich, easy-to-use random library in the end. We also need to fix the outdated documentation as soon as possible.

After this phase, users will be able to:

  1. Change the generator for the default RNG.
  2. Use non-uniform distribution functions like Gaussian (Normal), Poisson, etc.
  3. Use a single interface for all RNGs.
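To make the interface idea more concrete, here is a minimal sketch of how a distribution function could be written once and work with any generator. None of these names come from vlib: Generator, next_f64, Lcg, and normal are hypothetical, the toy linear congruential generator stands in for a real PRNG, and the Box-Muller transform is just one common way to obtain Gaussian samples from uniform ones.

```v
module main

import math

// Generator is a hypothetical minimal interface, not the one used in vlib.
interface Generator {
mut:
	next_f64() f64
}

// Lcg is a toy linear congruential generator, for illustration only.
struct Lcg {
mut:
	state u64
}

// next_f64 advances the state and returns a float in [0, 1).
fn (mut g Lcg) next_f64() f64 {
	g.state = g.state * u64(6364136223846793005) + u64(1442695040888963407)
	// Use the top 53 bits of the state as the mantissa.
	return f64(g.state >> 11) / f64(u64(1) << 53)
}

// normal draws one Gaussian sample using the Box-Muller transform.
// It depends only on the interface, so any Generator can be passed in.
fn normal(mut g Generator, mu f64, sigma f64) f64 {
	u1 := 1.0 - g.next_f64() // in (0, 1], so the logarithm is finite
	u2 := g.next_f64()
	z := math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
	return mu + sigma * z
}

fn main() {
	mut g := Lcg{
		state: 42
	}
	for _ in 0 .. 5 {
		println(normal(mut g, 0.0, 1.0))
	}
}
```

The point of the design is that normal never needs to know which generator it received; swapping Lcg for another struct that provides next_f64 requires no change to the distribution code.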

What Next?

After I’m done with rand, I’m planning to help with the development of the V Scientific Library ⟨9⟩. In the future, I also plan to lead the project on adding arbitrary precision numeric types like Integer, Decimal and Fraction.

  1. Oracle asking devs to pay↗ - This started in 2016 with Oracle asking for payment for some commercial features of Java SE. It escalated to suing Google for Android. ⟨1⤴⟩
  2. The Python Programming Language↗ - I enjoy using Python because it is open source, easy to learn, and quick to prototype with. ⟨2⤴⟩
  3. Duck Typing in Python↗ - I do not like a few things about Python, this being one of them. ⟨3⤴⟩
  4. TypeScript Homepage↗ - When I’m not programming in V, I’m using TypeScript. ⟨4⤴⟩
  5. Version bump utility↗ - A simple project written in C that allows version numbers to be automatically incremented through a simple command. I also experimented with cross-platform builds. ⟨5⤴⟩
  6. The V Programming Language↗ - A language similar to Go Lang but with generics and other useful features. ⟨6⤴⟩
  7. The Official V Language (and Apps) Discord Server↗ - I am relatively active in this community; I answer queries and post questions from time to time. ⟨7⤴⟩
  8. List of Pull Requests↗ - This query shows all the Pull Requests I’ve made for the main V repository. ⟨8⤴⟩
  9. The V Scientific Library↗ - I’m contributing to this library, especially in the Linear Algebra module. ⟨9⤴⟩
