enlist[q]

Book Review: The Quants

Regardless of which industry you work in, it’s always a good idea to know its background and history. The Quants by Scott Patterson does this for quantitative trading, covering its origins through the lives of some of the most famous quant researchers. The book focuses on the stories of Ed Thorp, Ken Griffin, Peter Muller and Jim Simons.

The book begins with an introduction to Ed Thorp, a brilliant mathematician who went from using math to win at blackjack to using complicated formulas and pricing techniques to beat the stock market. He is the author of the famous books Beat the Dealer and Beat the Market. Most of the quants mentioned in the book have some connection to Ed Thorp, whether they knew him personally or were inspired by his books.

Getting started with data science

Recently, a few of my friends have shown interest in what I do and the skill set required for my job. For those who don’t know, I am a market data developer. This means that I work with time-series databases to capture and store both real-time and historical data. I am also responsible for writing queries to help my users (i.e. researchers) analyze this data efficiently. Most of the time, when people say they work with big data, they are exaggerating. But I can promise you, market data is big data. In case you don’t believe me, our system captures around 4 billion rows daily.

Once I explain this to my friends, they are interested in finding out more. How do I capture so much data? What tools do I use to analyze this data? How can they get into data analysis? Where should they begin?

Book Review: Fooled by Randomness

I stumbled upon a book called Fooled by Randomness: The Hidden Role of Chance in Life and in the Markets and decided to give it a shot. As the title suggests, the book is about the role of randomness in our lives and in the markets. It was written by Nassim Nicholas Taleb, a trader and mathematics professor.

Using kdb+ for big data analytics in the pharmaceutical industry

I attended Kx’s latest meetup in New York this Monday at J.P. Morgan. This meetup was different: it focused on using kdb+ for data analysis in the pharmaceutical industry. I have barely any knowledge of how pharma companies work or what type and size of data they deal with. This meetup was an incredible introduction to the growing need for powerful analytical tools in the pharmaceutical industry.

Kx’s new CEO, Mark Sykes, at the meetup

The event started with a quick introduction by Kx’s new CEO, Mark Sykes. I don’t know much about him, but he seemed like a nice, humble guy. He also announced KxCon, which will be held in 2016 in Montauk, New York. Mark then made way for Larry Pickett, CIO of Purdue Pharma, to discuss Purdue’s big data strategy. The talk was quite high level, and I am not sure how many people found it interesting; I saw many attendees on their phones during the presentation. I, however, was quite interested in seeing how pharmaceutical companies think differently from financial firms when it comes to big data.

Book Review: Inside the Black Box

Ever since I made the switch from an investment bank to a hedge fund, my interest in quantitative trading and hedge funds has grown tremendously. At my new firm, there is almost no discretionary trading. All positions are based on quantitative models that have been extensively tested before going live.

To familiarize myself with the hedge fund industry, I have started reading some related books. The fact that my commute is now more than twice as long as it used to be was also a compelling reason to read. I figured I would share my thoughts here in case my readers are looking for book recommendations.

Listing alphabets and numbers using predefined variables

When I was trying to solve Challenge#4, I was trying to remember the variable that contains a string of all the uppercase letters, i.e. “ABCDEFGHIJKLMNOPQRSTUVWXYZ”. After some digging, I remembered that it was .Q.A. The only reason I knew about it was that I had used it earlier at my job. If you are new to kdb, then you probably didn’t know about it. I also couldn’t find it on the Kx reference wiki page for the DotQ namespace. In fact, there are quite a few other variables that are not listed there but can be very handy.

My experience at The Trading Show – quant, automated trading and big data conference

I was lucky enough to get a complimentary pass to The Trading Show from one of our vendors – OneMarketData. They were one of the sponsors and had some spare tickets. The Trading Show is a conference on “quant, automated trading, big data and high performance computing (HPC)”. I had never attended a conference of such a large scale, so it was definitely a new experience for me. The event was hosted at Three Sixty Tribeca.

I spent the entire day attending keynotes, panel discussions and round-table discussions, and networking. I will give you a brief summary of what each session covered. For reference, this was the agenda.

Understanding types of market data

You can, most likely, work with market data without knowing much about the data itself. You can capture it, save it and deliver it to your clients, who can use it for transaction cost analysis, trade surveillance, P&L and a lot of other things. But in the long run, you will find yourself in a position where knowing the data inside out would help a lot. It can even be the deciding factor in who gets the next promotion! To do your job effectively, it’s crucial to know the data itself. Understand the life cycle of a trade: how the data is generated, how it is provided and how it is analyzed for different purposes by different teams.

I will be writing a series of posts about market data itself. This post will focus on types of data such as trade, quote and market depth. In a future post, I will cover types of securities as well.

The most common way of approaching this topic is probably to start at the higher level with trades and quotes and then dive deeper into depth and order book data. However, I am going to explain it in the reverse order because that’s the natural flow of the data. It will help you understand market data with respect to the trading life cycle.

Note: I excluded time stamps from examples below as they were unnecessary.

Challenge#4 What’s in a name?

I will admit…I have been having a tough time coming up with challenges lately. So, if you have any ideas please contact me! 🙂

This week’s challenge has been borrowed from Project Euler (Problem#22). Here is the challenge:

Begin by sorting the names list into alphabetical order. Then working out the alphabetical value for each name, multiply this value by its alphabetical position in the list to obtain a name score. For example, when the list is sorted into alphabetical order, ELIA, which is worth 5 + 12 + 9 + 1 = 27, is the 5th name in the list. So, ELIA would obtain a score of 27 × 5 = 135. What is the total of all the name scores in the file?

Here is the list of names:

`GUS`SHAWN`LASSI`JULES`HENRY`GALINA`ERIN`EMMETT`ONITA`LORNA`ROSALINE`ELIA`BRITNI`NARCISA
`SHANNA`STEPHANY`SIOBHAN`ANNABELLE`KARLY`LASHAUN`MICHELINA`REA`SUZAN`BLAINE`LORILEE
`GABRIELLA`JOANIE`MARIELA`DIANNE`KAROL

The score should be 30822.
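As a quick cross-check outside q, the scoring rule above can be sketched in Python. This is only an illustration, not the q solution the challenge asks for; the helper names `name_value` and `total_score` are made up for this sketch, and the names are copied from the list above as plain strings.

```python
# Names from the challenge, written as plain strings instead of q symbols.
NAMES = """
GUS SHAWN LASSI JULES HENRY GALINA ERIN EMMETT ONITA LORNA ROSALINE ELIA
BRITNI NARCISA SHANNA STEPHANY SIOBHAN ANNABELLE KARLY LASHAUN MICHELINA
REA SUZAN BLAINE LORILEE GABRIELLA JOANIE MARIELA DIANNE KAROL
""".split()


def name_value(name):
    """Alphabetical value of a name: A=1, B=2, ..., Z=26, summed over its letters."""
    return sum(ord(ch) - ord("A") + 1 for ch in name)


def total_score(names):
    """Sort the names, then sum (1-based position) * (alphabetical value)."""
    ranked = enumerate(sorted(names), start=1)
    return sum(position * name_value(name) for position, name in ranked)


print(name_value("ELIA"))   # 27, as in the worked example
print(total_score(NAMES))   # 30822, the expected score
```

The same two steps (sort, then weight each name's letter sum by its position) are what a q solution needs to express as well.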

Analyzing NYC traffic data using q

In my previous post, I analyzed some NYC traffic data using Pandas. In this post, I would like to perform the same analysis using q. As far as graphing is concerned, I won’t be showing that. q is not really used for graphing. You can use GUIs like qPad or qStudio to chart the data on your own.

This post will help you see how one can achieve the same results using different methods. It’s nice to have these kinds of options. A major reason for not using q used to be its cost, but now you can just use the 32-bit version, which is available for free!

One thing to note is that the analysis done in this post is very straightforward. Both Pandas and q are capable of handling much more complex analysis. The point of these posts is to show you the different tools available to you.

For my analysis, I will be using the same data set that I used for the earlier post.

Let’s begin!
