Treechat

A full stack, peer-to-peer OCaml chat server.

Project overview.

What do people want most? A highly efficient chat server made entirely in a functional programming language of course! In that case, look no further, this is a pure OCaml chat server for peer-to-peer messaging. I primarily used Jane Street’s OCaml Async module and the B+ Tree data structure for highly efficient message retrieval. A high level description of the project is below, but if you want to talk more technical, feel free to check out the code at the bottom of the page.

 

The interface.

Let’s start with the user interface, Client.ml. This file is run by itself in its own terminal and actually does not need to have access to any other code through the module system. The architecture revolves around two main functions: [repl] and [do’]. Essentially, [repl] prompts the user for commands, and [do’] executes those commands.

Client, implemented as a finite-state machine, iterates [repl]. Each time [repl] indexes, the internal state of the execution marches forward until it reaches ‘Repl’. Because the Client file uses the Async library, most of the functionality is monadic in nature. Therefore, any time Client.ml wants to communicate with the server, it must wait to receive a response.

After Client receives the user’s message it pushes it to the pipe, Chat.ml.

 

The pipe.

Chat.ml, the data stream pipe, uses the interface of Client to communicate with the B+ Tree through the use of asynchronous pipes.

It accomplishes this by creating a TCP Server which then calls a repl-esque handler function to deal with connections to the server. This handler uses pushback to read input and write output to the clients making request through the pipes. The pushback guarantees that the pipe does not overflow, and it also ensures seamless coordination in the communication between our server and its clients. After this is handled, Chat pushes the data to the B+ Tree.

 

The memory.

A B+ tree was used as the primary data structure to store messages within unique conversations. For the readers that aren’t familiar with the B+ Tree data structure, here is a great explanation. Given the relative ratio of inserting new data into the database compared to the need to remove data, it was found more useful to implement a Persistent B+ (unable to delete data) tree after researching the different styles in modern databases.

When the B+ Tree receives the user’s message from the pipe, it stores that message in the unique conversation pointer created during conversation inception. It may seem bloated to be unable to delete conversation histories, but the B+ Plus can store n*(n + 1)^(L−1)** pointers at a depth of L, for example our tree can hold a 3,500 unique conversations at a depth of only 4. Thus, the longest path to each node is guaranteed to be log​10(​n). This is a highly efficient data retrieval structure that can store large amounts of data such as frequent chats, with unique keys, between numerous users of the chat system.

**n = order of tree (pointers possible in each node, in this case it is 7)

 


Codebase can be found here.

Using Format