qSQL queries for performing analytics

I realize that there are many developers out there that are not looking to get into q completely and are simply using q/kdb+ along with qsql to perform analytics (i.e. quants). My job requires all of this so I have some good experience running qsql queries. Of course, the type of query you need to run really depends on what kind of data you are looking to retrieve so I can’t possibly cover them all in this post. But I will mention some common queries you can run.

All the examples will focus on these two tables:

q)q1:([];time:5?.z.t;sym:5?`AAPL`MSFT`IBM;ask:5?100;bid:5?100)
q)t1:([];time:5?.z.t;sym:5?`AAPL`MSFT`IBM;price:5?100;size:5?200)
q)q1
time         sym  ask bid
-------------------------
02:59:16.636 IBM  40  2
14:35:31.860 AAPL 88  39
16:36:29.214 AAPL 77  64
08:31:52.958 IBM  30  49
07:14:12.294 AAPL 17  82
q)t1
time         sym  price size
----------------------------
10:25:30.322 AAPL 8     36
14:17:41.480 AAPL 97    12
08:50:31.645 MSFT 52    45
15:20:08.925 AAPL 66    83
09:01:27.840 MSFT 24    94

Examples:
Find avg price for each sym

q)select avg price by sym from t1
sym | price
----| -----
AAPL| 12
IBM | 66.5
MSFT| 81

Find max ask,bid for each sym

q)select mA:max ask, mB:max bid by sym from q1
sym | mA mB
----| -----
AAPL| 88 39
IBM | 49 68
MSFT| 82 58

Find delta of price column in t1

q)update diff:price-prev price from t1
time         sym  price size diff
---------------------------------
10:25:30.322 AAPL 8     36
14:17:41.480 AAPL 97    12   89
08:50:31.645 MSFT 52    45   -45
15:20:08.925 AAPL 66    83   14
09:01:27.840 MSFT 24    94   -42

You can also use the deltas keyword:

q)update diff:deltas price from t1
time         sym  price size diff
---------------------------------
10:25:30.322 AAPL 8     36   8
14:17:41.480 AAPL 97    12   89
08:50:31.645 MSFT 52    45   -45
15:20:08.925 AAPL 66    83   14
09:01:27.840 MSFT 24    94   -42

Note that first diff is 82 and not null.

Find trades where delta in price was positive

q)select from (update diff:price-prev price from t1) where diff>0
time         sym  price size diff
---------------------------------
14:17:41.480 AAPL 97    12   89
15:20:08.925 AAPL 66    83   14

or

q)select from t1 where 0<price-prev price
select from t1 where 0<price-prev price
time         sym  price size
----------------------------
14:17:41.480 AAPL 97    12
15:20:08.925 AAPL 66    83

Find rows where price is greater than avg price

q)select time,sym,price,size from t1 where price>avg price
time         sym  price size
----------------------------
14:17:41.480 AAPL 97    12
08:50:31.645 MSFT 52    45
15:20:08.925 AAPL 66    83

Pair each row for IBM in t1 with its max bid/price from q1

q)q2:select from `sym`time xasc q1
q)q2
time         sym  ask bid
-------------------------
07:14:12.294 AAPL 17  82
14:35:31.860 AAPL 88  39
16:36:29.214 AAPL 77  64
02:59:16.636 IBM  40  2
08:31:52.958 IBM  30  49
q)wj[-2 1+\:t1.time;`sym`time;t1;(q2;(max;`ask);(min;`bid))]
time         sym  price size ask bid
------------------------------------
10:25:30.322 AAPL 8     36   17  82
14:17:41.480 AAPL 97    12   17  82
08:50:31.645 MSFT 52    45   -0W 0W
15:20:08.925 AAPL 66    83   88  39
09:01:27.840 MSFT 24    94   -0W 0W

One of the requirements for wj is that q table needs to be sorted on `sym`time.
-2 1+\:t1.time gives you the time windows for aggregation. It takes the time values from t1 and for each value it returns a value 1ms in future and 2ms in past.

q)-2 1+\:17:26:48.626
17:26:48.624 17:26:48.627

select rows where price is greater than average price…but this time, do it by sym.
Long way:

q)update avP:avg price by sym from `t1
`t1
q)t1
time         sym  price size avP
--------------------------------
10:25:30.322 AAPL 8     36   57
14:17:41.480 AAPL 97    12   57
08:50:31.645 MSFT 52    45   38
15:20:08.925 AAPL 66    83   57
09:01:27.840 MSFT 24    94   38
q)select from t1 where price>avP
time         sym  price size avP
--------------------------------
14:17:41.480 AAPL 97    12   57
08:50:31.645 MSFT 52    45   38
15:20:08.925 AAPL 66    83   57

Short way (group the two queries together by using fby):

q)select from t1 where price>(avg;price) fby sym
time         sym  price size avP
--------------------------------
14:17:41.480 AAPL 97    12   57
08:50:31.645 MSFT 52    45   38
15:20:08.925 AAPL 66    83   57

Leave a Reply

Your email address will not be published. Required fields are marked *