enlist[q]

Enumeration in kdb+

I was planning on writing a blog post on how to save data to disk from a q process but then realized that I first need to cover enumeration, because a table with a sym column must be enumerated before it can be written to disk as a splayed table. If you try to splay a table with columns of type symbol without enumerating them first, you will get a type error. I am sure you are familiar with the generic concept of enumeration. If not, you can read up more about it here.

There are many benefits of enumeration, but one main benefit is that it normalizes data. How is that helpful? Suppose you have a trade table with a million rows that contains data for syms AAPL, IBM, and MSFT. Your table will have numerous repeating records per sym. Think of how many times you will have to save the AAPL sym. It can get a bit overwhelming for large tables…for multiple dates…for thousands of syms. If you were to apply enumeration, you would only have to save this value once. Let’s elaborate on that.
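As a rough illustration of the idea, here is a minimal sketch in q. The table contents and the name symdomain are made up for this example; the point is only that each distinct sym is stored once and the column itself becomes a list of integer indices into that list.

```q
/ a small trade table with a repeating sym column (hypothetical data)
trade:([] sym:`AAPL`IBM`MSFT`AAPL`IBM`AAPL; price:150.1 130.2 250.3 150.4 130.5 150.6)

/ the domain: each distinct sym is stored exactly once
symdomain:distinct trade`sym     / `AAPL`IBM`MSFT

/ enumerate the column against the domain with $
e:`symdomain$trade`sym

/ underneath, the enumerated column is just indices into symdomain
`int$e                           / 0 1 2 0 1 0i

/ the original values are recovered by resolving the indices against the domain
value e                          / `AAPL`IBM`MSFT`AAPL`IBM`AAPL
```

With a million rows and three syms, the domain still holds only three symbols; everything else is integers.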

Subscribing to a table in ticker plant

I am currently working on writing a stats process that will calculate some stats such as high, low, avg, min, max price in real-time. I have a whole stack running (fh/tp/rdb/hdb) where the fh regularly sends updates to my ticker plant. You can see my previous post about how to code that.

As I started to write code to calculate stats, I first wanted to be able to capture the trade table within this process in real time as well. In other words, I wanted my code in rstats.q to subscribe directly to my tp. It doesn’t seem like a difficult task and it really isn’t. I have known the theory behind achieving this for a while but never really coded it myself because it had already been done in my team. But when you sit down to code it yourself, a lot of unanticipated problems get in your way.

Coding a sample feed handler

Feed handlers sound scary. They are often written in some language other than q (or at least the ones I was exposed to initially were), such as Java, which is a pain to look at once you get used to a compact language like q. I always avoided feed handlers until the day I read a white paper on them written by Nathan Perrem from First Derivatives as part of their lecture series. Suddenly, they didn’t seem so frightening. They were like any other piece of code that I dealt with. Surprisingly, there wasn’t much to getting a sample feed going. In fact, it was really easy.

In this post, I present to you my code for a feed handler (inspired by the white paper mentioned earlier).

Tables, keyed tables and dictionaries

When I first started learning q, I had a difficult time understanding the differences between tables, keyed tables and dictionaries. The differences seemed very subtle at the time. Just recently, I was explaining to a colleague (a Java developer with some exposure to q/kdb) how you can check the meta of a table and look up its keys and types. All this seems trivial to those of us who are full-time q developers, but someone who only touches the surface of q/kdb in their daily job will have no idea about these features.

q is all about tables/dictionaries (and lists) and if you don’t know how to differentiate them properly, then you are going to have a tough time. In this post, I have highlighted some of the key similarities and differences between tables/keyed tables and dictionaries.

Using Attributes in kdb+

One of the biggest advantages of kdb+ is fast data retrieval. As a client, you simply run a qsql query (or call some API) and you are given thousands and thousands of rows of data within seconds.

How does kdb+ do that?

There are many factors that contribute to high retrieval speed and one of them is table attributes. Attributes don’t actually perform any operations. They are simply a way of tagging your data (metadata) so that kdb+ knows how it is arranged before doing any lookup.
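To make the tagging idea concrete, here is a minimal sketch in q using a made-up trade table. It applies the parted attribute to the sym column and then checks that the column carries the tag; the data itself is not changed.

```q
/ hypothetical trade table in which equal syms are already contiguous
trade:([] sym:`AAPL`AAPL`IBM`IBM`MSFT; price:150.1 150.4 130.2 130.5 250.3)

/ no attribute yet: the a column of meta is empty for sym
meta trade

/ tag sym as parted with #; this works only because equal syms sit together
update `p#sym from `trade

/ attr reports the tag now carried by the column
attr trade`sym                   / `p
```

If the syms were not grouped together, applying the parted attribute would fail rather than reorder the data, which fits the point that attributes describe an arrangement and do not create one.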