Python API for getting market and financial data from IEX

Most of you have probably heard about IEX: The Investors Exchange. IEX is the exchange started by Brad Katsuyama, who was the protagonist of Michael Lewis’s famous book Flash Boys (review). Just last year, IEX scored a major win when the SEC approved its application to register as a national securities exchange. As time passes, IEX continues to gain more and more market share.

Just like any other exchange, one of IEX’s most valuable assets is the market data generated by all the trading. However, unlike other exchanges, IEX makes its data available to the public for free via a web API. On February 22, 2017, IEX wrote a blog post announcing the release of its web API. Since then, IEX has made quite a few enhancements and added support for newer datasets as well.

As of today, some of the data that IEX provides includes:

  • pricing data (latest trade and quote data as well as summary data going back up to 5 years),
  • reference data,
  • news data,
  • earnings data, and
  • financial data.

Continue…

Understanding sets in python

As I learn more and more about python’s different data types, I find myself surprised that not enough people use (or even know about) sets. At my job, I am often taking some data and transforming it. Once it is transformed, I have to analyze how the data may have changed, and sets are great for such comparisons.

In this post, I will cover how to create sets and show some examples of how to use them.

What is a set?
A set is an unordered collection of unique items in python. Sets are sort of like lists, but they only contain unique items and don’t maintain order. They also support a lot of helpful operations of their own, such as union, intersection and difference.
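Here is a minimal sketch of what that looks like in practice. The ticker symbols and variable names are made up for illustration; the idea is to compare a dataset before and after a transformation.

```python
# Duplicates are dropped automatically when a set is created.
letters = set(["a", "b", "a"])

# Hypothetical symbols seen before and after some transformation.
before = {"AAPL", "MSFT", "IBM"}
after = {"AAPL", "IBM", "GOOG"}

removed = before - after     # difference: in before but not in after
added = after - before       # difference the other way around
unchanged = before & after   # intersection: present in both
everything = before | after  # union: present in either
```

Since sets are unordered, compare them with other sets rather than relying on the order in which items print.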

Continue…

Understanding list, set and dict comprehensions

Just a few days ago, you were having a good time with your friends and counting down to 2017. A few days have passed and you are left with a typical cold, snowy day in January. You are busy writing code for a high-profile project at work. Suddenly, a situation arises where you need to create a new list from an existing one. You code it like you have always been coding:

>>> old = ['adam', 'mike', 'olga']
>>> new = []  # the new list must exist before we append to it
>>> for name in old:
...     new.append(name + ' last')
...
>>> new
['adam last', 'mike last', 'olga last']

But then you realize that one of your 5 new year’s resolutions is to start using list comprehensions! You have heard about them but were always a little intimidated by them. You were also not really sure of their point.

This post will help you with your new year’s resolution. However, it won’t help you with the other one about going to the gym 4 times a week.
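As a quick preview, here is a sketch of how the loop above could be written as a list comprehension, along with dict and set comprehensions built from the same list. The names name_lengths and first_letters are only for illustration.

```python
old = ['adam', 'mike', 'olga']

# List comprehension: build the new list in a single expression.
new = [name + ' last' for name in old]

# Dict comprehension: map each name to its length.
name_lengths = {name: len(name) for name in old}

# Set comprehension: collect the unique first letters.
first_letters = {name[0] for name in old}
```

All three follow the same pattern: an expression, then a for clause, wrapped in the brackets of the collection you want back.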

Continue…

10 python idioms to help you improve your code

If you have ever tried to learn a new language (not a programming language), you know that we always think in our native language before we translate it to the new language. This can lead to you forming some sentences that don’t make sense in the new language but are perfectly normal in your native language. For example, in a lot of languages, you ‘open’ an electronic gadget such as a fan, AC or cell phone. When you say that in English, it means to literally open the gadget instead of turning it on.

The same is true for programming languages. As we pick up new languages, such as python, we are using our prior knowledge of programming in another language (q, java, c++ etc) and translating that to python. Many times, your code will work but it won’t be ‘pretty’ or fast. In python terms, your code won’t be ‘pythonic’.

In this post, I would like to cover some python idioms that can be very helpful. These idioms will:

  1. Help your code look better,
  2. Speed up your code, and
  3. Set you apart from beginners

Let’s begin!

Note: All examples are written in python 2.

Update: Thanks to Diane and my other readers for pointing out some errors in my examples!

Continue…

Getting started with regex in python

I have been wanting to learn regex. Not just because it reminds me of a bit but because it’s actually very useful and can be used within numerous languages including python. In case you don’t know, regex stands for regular expression. A regex is “a sequence of characters that define a search pattern.” The most popular use case is to search strings for a pattern.

I was looking at videos from this year’s PyCon and stumbled upon a video of Trey Hunner conducting a regex workshop at PyCon 2016. If you are looking to learn regex, I encourage you to watch the video. You can also find most of the stuff mentioned in the video on this website.

After I watched the video, I did some practice on my own which I wanted to share here.
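To give a flavor of what searching a string for a pattern looks like, here is a small sketch using python’s built-in re module. The sample text and the date pattern are made up for illustration and are not taken from the workshop.

```python
import re

# Hypothetical text to search.
text = "Order 66 shipped on 2016-05-30; order 67 ships 2016-06-02."

# Four digits, a dash, two digits, a dash, two digits.
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

# search() returns a match object for the first hit, or None if there is none.
first_match = date_pattern.search(text)
first_date = first_match.group() if first_match else None

# findall() returns every non-overlapping hit as a list of strings.
all_dates = date_pattern.findall(text)
```

Note that this pattern only checks the shape of a date, not whether it is a valid calendar date.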

Continue…

Getting started with data science

Recently, a few of my friends have shown interest in what I do and the skill set required for my job. For those who don’t know, I am a market data developer. This means that I work with time-series databases to capture and store both real-time and historical data. I am also responsible for writing queries to help my users (i.e. researchers) analyze this data efficiently. Most of the time, when people say they work with big data, they are exaggerating. But I can promise you, market data is big data. In case you don’t believe me, let me tell you that our system captures around 4 billion rows daily.

Once I explain this to my friends, they are interested in finding out more. How do I capture so much data? What tools do I use to analyze this data? How can they get into data analysis? Where should they begin?

Continue…

Analyzing NYC traffic data using Pandas

I wrote a post earlier about how to analyze data using Pandas. In that post, I just introduced some simple Pandas functions to analyze some random data. In this post, I am using real-world traffic data from NYC. NYC has recently made a lot of data available to the public on this website. There is plenty of quality data for you to play around with. I chose to look at the traffic data, which you can download from here.

I performed the analysis using IPython Notebook, which I have embedded here.

Continue…

Introducing data analysis with Python and Pandas

Recently, I have been playing around with Python and its data analysis library, Pandas, which is built on another library called NumPy. Most of you have probably heard of Python (if not, then I don’t know what’s wrong with you. Get out there and make some programmer friends or read some blogs). Python has been in existence for a while (since 1991), though it has gained a lot of traction just recently. A lot of startups are into Python. The great thing about Python is that you can use it as a functional language or an OOP language. I am more of a functional guy and prefer writing straightforward code. Moreover, I am into data analysis…as opposed to…let’s say…designing ugly GUIs.

Anyways, if you want to do data analysis with Python, you must use Pandas. I mean you could use other methods but then you will face serious issues and probably not be good at your job. Pandas is quite pleasant. Coming from a kdb background, I missed seeing data in tabular format. Pandas displays data in dataframes (tables) and allows you to perform operations on columns just like kdb.
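Here is a minimal sketch of what column operations on a dataframe look like, assuming Pandas is installed. The trades table and its column names are made up for illustration.

```python
import pandas as pd

# A small, hypothetical trades table.
trades = pd.DataFrame({
    "sym": ["AAPL", "MSFT", "AAPL"],
    "price": [100.0, 50.0, 102.0],
    "size": [10, 20, 30],
})

# Column arithmetic is applied to the whole column at once.
trades["notional"] = trades["price"] * trades["size"]

# Group by symbol and sum the new column.
total_by_sym = trades.groupby("sym")["notional"].sum()
```

Printing trades shows it as a table with one row per trade, and total_by_sym holds one total per symbol.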

Continue…