{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# <center>EC3382 - Assignment 3</center>" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false, "run_control": { "frozen": false, "read_only": false } }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this assignment, I want you to take the skills you have learned to define a trading strategy and get used to how it works. Then, I want you to optimize it on some price data. In the process, you will write a mini broker. This gets you thinking about what happens once your trading program interacts with Quantopian. This is clearly an abstraction but all the questions that you will ask doing this assignment are exactly the ones that you should be asking when you code on Quantopian. \n", "\n", "We downloaded some daily historical prices from Yahoo Finance. These data were downloaded via navigating to the URL:\n", "http://ichart.finance.yahoo.com/table.csv?s=BAC&a=00&b=01&c=2000&d=11&e=31&f=2009&g=d&ignore=.csv\n", "Where the letters BAC (CMA,USB,WFC) represent the ticker and the parameters (a-g) represent the from and to dates. loadData.py loads the downloaded data as a pandas frame. Take a look to see how this works. For the grading, I will use different tickers and different dates as the data." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false, "run_control": { "frozen": false, "read_only": false } }, "outputs": [], "source": [ "from loadData import BAC, CMA, USB, WFC\n", "#You should print these out to see what you are working with" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part A: Combining the data into a Pandas dataframe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Write a function which takes as inputs: \n", "a) a list of Pandas dataframes formatted like the one given above \n", "b) a list of ticker names \n", "\n", "and returns a new dataframe containing the adjusted prices of the input dataframes and the date. Rename adjusted prices so that the column name is the ticker symbol. Use the df.merge command to join the dataframes together. http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html\n", "\n", "Use an inner join using 'Date' as a key. In doing this, dates where one of the stocks was not traded will be excluded. This is strictly not kosher for trading and you should be aware of it. In commercial trading systems, you will have to make sure that the algorithm only runs when all the data is \"as expected\". Code defensively. When something unexpected, your code needs to choke. For now, we ignore this issue.\n", "\n", "```python \n", "allPrices = combineData([BAC, CMA, USB],['BAC','CMA','USB'])\n", "print allPrices[0:3]\n", "```\n", "\n", "the output would be\n", "\n", "```python \n", " Date BAC CMA USB\n", "0 2000-01-03 15.289114 27.197519 10.278392\n", "1 2000-01-04 14.381631 26.064289 9.931932\n", "2 2000-01-05 14.539455 25.988740 9.903059\n", "```" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false, "run_control": { "frozen": false, "read_only": false } }, "outputs": [], "source": [ "#Example answer. Replace with pass to release\n", "def combineData(inputList,colName):\n", " pass\n", "\n", "#allPrices = combineData([BAC, CMA, USB],['BAC','CMA','USB'])\n", "#print allPrices[0:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part B: Mean reversion strategy\n", "\n", "Create a function that takes as input \n", "a) a data frame formatted like BAC (Columsn: Date, BAC) given above. This is a subset of allPrices that you have just constructed. \n", "b) a parameter for how many lag periods to use in calculating moving average \n", "c) a parameter for the multiples of standard deviation as a threshold \n", "\n", "and returns a dataframe with an additional column called Order. Order is -1 if the current price strictly exceeds its k day moving average plus m times the k day moving standard deviation and returns 1 if the price is strictly lower than the k day moving average minus m times the k day moving standard deviation. \n", "\n", "Remember that a k day moving average for the mean and standard deviation will require k datapoints. Hence, your function should return 0 whenever there is not enough data to evaluate the strategy. \n", "\n", "Assume that the first column is always Date, and the second column is always some price of a ticker symbol (not necc. BAC).\n", "\n", "If the input is...\n", "```python\n", "print allPrices.ix[0:10,0:2]\n", " Date BAC\n", "0 2000-01-03 15.289114\n", "1 2000-01-04 14.381631\n", "2 2000-01-05 14.539455\n", "3 2000-01-06 15.782312\n", "4 2000-01-07 15.368026\n", "5 2000-01-10 14.835373\n", "6 2000-01-11 14.519727\n", "7 2000-01-12 14.677550\n", "8 2000-01-13 15.012924\n", "9 2000-01-14 15.940135\n", "10 2000-01-18 15.151019 \n", "\n", "\n", "dfTrades = meanRevert(df,k = 3,m = 0.1)\n", "print dfTrades\n", " Date BAC Order\n", " Date BAC Order\n", "0 2000-01-03 15.289114 0\n", "1 2000-01-04 14.381631 0\n", "2 2000-01-05 14.539455 1\n", "3 2000-01-06 15.782312 -1\n", "4 2000-01-07 15.368026 -1\n", "5 2000-01-10 14.835373 1\n", "6 2000-01-11 14.519727 1\n", "7 2000-01-12 14.677550 0\n", "8 2000-01-13 15.012924 -1\n", "9 2000-01-14 15.940135 -1\n", "10 2000-01-18 15.151019 1\n", "```\n", "\n" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false, "run_control": { "frozen": false, "read_only": false } }, "outputs": [], "source": [ "def meanRevert(df,k,m):\n", " pass\n", "\n", "#dfTrades = meanRevert(df = allPrices.ix[0:10,0:2],k = 3, m = 0.1)\n", "#print dfTrades\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part C: Trading\n", "\n", "Construct a simple function (robotBroker) that takes \n", "a) a dataframe formatted like dfTrades \n", "b) Money. eg: https://www.youtube.com/watch?v=l91ISfcuzDw \n", "\n", "and outputs \n", "a) the sharpe ratio\n", "\n", "Your function will simply keep track of the value of your portfolio (including money starting from the initial amount). Finally, it computes the returns and the unadjusted Sharpe Ratio from the Value column.\n", "\n", "To illustrate, I made the prices easy to work with. The initial cash in this example is 10\n", "```python \n", " Date BAC Order Position Value\n", "1 2009-12-28 1 0 0 10 = starting amount\n", "2 2009-12-29 5 1 1 10 = 5 cash + 5 long\n", "3 2009-12-30 4 -2 -1 9 = 13 cash - 4 cover\n", "```\n", "\n", "You *may* want to follow the data structure illustrated here. I am only interested in the Sharpe Ratio at the end. While there is a lot of code to write, a certain concept called copy-paste will come in useful! Take a look at Assignment 2. \n", "\n", "```python\n", "A more realistic intermediate step. I only care about the Sharpe ratio\n", " Date BAC Order Value\n", "0 2000-01-03 15.289114 0 10.000000\n", "1 2000-01-04 14.381631 0 10.000000\n", "2 2000-01-05 14.539455 1 10.000000\n", "3 2000-01-06 15.782312 -1 11.242857\n", "4 2000-01-07 15.368026 -1 11.242857\n", "5 2000-01-10 14.835373 1 11.775510\n", "6 2000-01-11 14.519727 1 11.775510\n", "7 2000-01-12 14.677550 0 11.933333\n", "8 2000-01-13 15.012924 -1 12.268707\n", "9 2000-01-14 15.940135 -1 12.268707\n", "10 2000-01-18 15.151019 1 13.057823\n", "\n", "sharpe = robotBroker(dfTrades,initCash = 10)\n", "print sharpe\n", "0.67749700464\n", "```\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "run_control": { "frozen": false, "read_only": false } }, "outputs": [], "source": [ "def robotBroker(df,initCash):\n", " pass\n", "\n", "def get_mean(a):\n", " pass\n", "\n", "def get_stddev(a):\n", " pass\n", "\n", "def get_ratio(a, baseline):\n", " pass\n", "\n", "def get_returns(a):\n", " pass\n", "\n", "def get_excess_returns(returns, baseline):\n", " pass\n", "\n", "#sharpe = robotBroker(dfTrades,initCash = 10)\n", "#print sharpe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Part D: Optimization\n", "Let's put it all together! We want to try out different strategy parameters on different stocks. I want you to think about why you are finding what you are finding as you play with your strategy. Think BIG. Think about what is possible. This is just a simple motivational example.\n", "\n", "Construct a function that takes as inputs:\n", "\n", "a) A dataframe formatted like \"allPrices\" \n", "b) A list of tickers contained within allPrice, ie, ['BAC','USB']. These are the tickers to test. \n", "c) A list of lags k to try out, eg [3,4,5]\n", "d) A list of cutoffs m to try out, eg [0.1,0.2,0.5]\n", "e) An initial sum of money eg 100\n", "\n", "\n", "The output: \n", "i) kBest a list containing the best choice of kBest for each of the input tickers in (b) \n", "ii) mBest a list containing the best choice of m for each of the input tickers in (b) \n", "iii) The corresponding sharpe ratio\n", "\n", "Example\n", "```python\n", "[kBest, mBest, maxSharpe] = optimizeParams(allPrices[0:10],['BAC','USB'],[3,4,5],[0.1,0.2,0.5],100)\n", "print kBest, mBest, maxSharpe\n", "[3, 3] [0.5, 0.1] [0.75110882013827451, 0.51505589272506302]\n", "```" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "run_control": { "frozen": false, "read_only": false } }, "outputs": [], "source": [ "def optimizeParams(df,tickerList,kList,mList,initCash):\n", " pass\n", "\n", "#[kBest, mBest, maxSharpe] = optimizeParams(allPrices[0:10],['BAC','USB'],[3,4,5],[0.1,0.2,0.5],100)\n", "#print kBest, mBest, maxSharpe" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [default]", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.12" } }, "nbformat": 4, "nbformat_minor": 0 }
import pandas as pd BAC = pd.read_csv('BAC.csv')[::-1] CMA = pd.read_csv('CMA.csv')[::-1] USB = pd.read_csv('USB.csv')[::-1] WFC = pd.read_csv('WFC.csv')[::-1]