This was originally presented on 4/29/2007 for the NYC Financial Engineering Meetup. It illustrates the use of J to manipulate data quickly and easily in order to explore topics in quantitative financial research. This work continues here.

How Long to Hold Winners and Losers

Background: One well-known behavioral shortcoming of investors is a tendency to take profits too early on investments that are doing well and, conversely, to hold on to losing positions for too long. One of the strengths of quantitative investing is overcoming bad investing behavior by sticking to rules that correct an investor’s natural tendency to do the wrong thing. An example of this is how the regular re-balancing discipline of asset allocation corrects the tendency to over-invest in recent winners and avoid recent losers.

Questions to be addressed: How do we define winners versus losers? Are there simple rules to determine sales? Do the rules work the same – but in reverse – for short positions? Can we identify groups of assets for which the rules work better or worse? What is the form of a good rule (e.g. flat percentage, volatility- or market-related change)? Are there exogenous factors that influence rules? Do similar rules apply if we look at excess-return space? Are the best rules good enough to be worth implementing?

A Preliminary Look at Some Data

Code is included in a fixed (Courier) font; user input is indented three spaces; output is left-justified.

   load '~User/code/SPXData.ijs'
   $CODAT
848

We have data on 848 different companies over the whole period.

   ({.,{:)DTS
+----------+----------+
| 1/02/1990|12/03/2004|
+----------+----------+

The data begins at the start of 1990 and goes almost to the end of 2004.

   $COINFO
849 5

COINFO has one more row than CODAT because its first row holds the column titles. We have five columns of data on each company, e.g. looking at the first three rows in the table:

   3{.COINFO
+------+-------------------+--------+------+----------+
|Symbol|Company Name       |CUSIP   |GVKEY |Date      |
+------+-------------------+--------+------+----------+
|MMM   |3M Co.             |88579Y10|007435| 1/02/1990|
+------+-------------------+--------+------+----------+
|ABT   |Abbott Laboratories|00282410|001078| 1/02/1990|
+------+-------------------+--------+------+----------+

Selecting only companies with data for the entire period,

   'dattit seldat inftit selinf'=: selCompleteSets ''
   $seldat
245

gives us 245 companies with data for the whole period.

   $clpxs=. ((dattit i. <'Close'){"1 &> seldat)%(dattit i. <'Adjust'){"1 &> seldat
245 3765

So, the 245 x 3765 table "clpxs" consists of prices for 245 companies over 3,765 days.
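
The expression that builds clpxs can be tried on a toy data set. The sketch below is only an illustration under assumed names: tit and dat are hypothetical stand-ins for dattit and seldat, assuming each company is a boxed table with one row per day and one column per title. The first toy company reuses three Factset Close and Adjust values from the AT&T table later in this piece; the second is made up.

```j
   tit =: 'Date';'Close';'Adjust'              NB. hypothetical column titles
   att =: 3 3 $ 1 66 5  2 67.55 5  3 27.2 1    NB. day number, Close, Adjust
   toy =: 3 3 $ 1 10 2  2 12 2  3 14 1         NB. a made-up second company
   dat =: att ; toy                            NB. one box per company
   tit i. <'Close'                             NB. column index of the Close title
1
   $ (tit i. <'Close') {"1 &> dat              NB. one row of closes per company
2 3
   0 { ((tit i. <'Close') {"1 &> dat) % (tit i. <'Adjust') {"1 &> dat
13.2 13.51 27.2
```

Reading right to left, &> opens each company's box, {"1 picks the chosen column from every row, and % divides the Close values by the Adjust values item by item.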

To pick a few companies on which to concentrate initially, look at the standard deviations of the prices and choose the companies with the lowest and highest.

   usus sds=. stddev"1 clpxs
4.7264748 46.584648 14.71278 7.0387579
   (sds i. (<./,>./)sds){selinf  NB. lowest and highest stddevs
+---+---------------------------+--------+------+----------+
|WOR|Worthington Industries Inc.|98181110|011600| 1/02/1990|
+---+---------------------------+--------+------+----------+
|GLW|Corning Inc.               |21935010|003532| 1/02/1990|
+---+---------------------------+--------+------+----------+
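
The verbs usus and stddev come from the loaded script and are not shown here. Judging from the four numbers printed, usus appears to report the minimum, maximum, mean, and standard deviation of its argument; that reading is an assumption. A minimal sketch of such utilities, using the sample (n-1) standard deviation as a further assumption, might look like this:

```j
   mean   =: +/ % #                               NB. arithmetic mean
   stddev =: 3 : '%: (+/ *: y - mean y) % <: # y' NB. sample standard deviation (n-1), assumed
   usus   =: 3 : '(<./ , >./ , mean , stddev) y'  NB. assumed: min, max, mean, stddev
   stddev"1 ] 2 3 $ 1 3 5 10 30 50                NB. one value per row
2 20
   usus 1 3 5
1 5 3 2
```

Defining these names in a session replaces any versions already loaded, so the sketch is best tried in a fresh session.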

Do the same thing for the standard deviation of the share volume.

   clvol=. ((dattit i. <'Volume'){"1 &> seldat)*(dattit i. <'Adjust'){"1 &> seldat
   usus sds=. stddev"1 clvol
0.097023249 26.049271 1.5918885 2.6359499
   (sds i. (<./,>./)sds){selinf  NB. lowest and highest stddevs
+----+--------------+--------+------+----------+
|MDP |Meredith Corp.|58943310|007260| 1/02/1990|
+----+--------------+--------+------+----------+
|INTC|Intel Corp.   |45814010|006008| 1/02/1990|
+----+--------------+--------+------+----------+
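
The selection idiom used twice above reads right to left: the fork (<./ , >./) yields the smallest and largest values, i. finds their positions, and { picks the matching rows. A small sketch with made-up values and hypothetical tickers:

```j
   sds  =: 7 4.7 46.6 14.7
   syms =: 'AAA';'BBB';'CCC';'DDD'     NB. hypothetical tickers
   (<./ , >./) sds
4.7 46.6
   sds i. (<./ , >./) sds
1 2
   (sds i. (<./ , >./) sds) { syms
+---+---+
|BBB|CCC|
+---+---+
```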

So, we'll experiment a little with just these four companies (Worthington, Corning, Meredith, and Intel or WOR, GLW, MDP, INTC) for now.

First, look at a plot of the prices:

HiLoPxVoly.png

However, we know that it’s more “quant” to look at returns:

HiLoVolyChgDlyRets.png
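
For reference, simple daily returns can be derived from a row of prices by dividing each price by its predecessor and subtracting one. This is a sketch with made-up prices; the name rets is hypothetical, and the code that produced the chart is not shown in this piece.

```j
   rets =: 3 : '<: (}. % }:) y'     NB. hypothetical helper: price over previous price, minus 1
   rets 100 110 99
0.1 _0.1
   $ rets"1 ] 2 4 $ 100 110 99 99 50 55 55 44   NB. applied to each row of a price table
2 3
```

Each row of n prices yields n-1 returns, so a table shaped like clpxs would lose one column.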

Since we selected two of the companies by the variation in their trading volume, let’s look at that too:

HiLoVolmVoly.png

We could also look at the second derivative of price, that is, change in daily returns:

HiLoVolyChgDlyRets.png
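
The change in daily returns is the first difference of the return series. A sketch with made-up prices follows; rets and chg are hypothetical names, not taken from the original script.

```j
   rets =: 3 : '<: (}. % }:) y'     NB. hypothetical: simple daily returns
   chg  =: 3 : '(}. - }:) y'        NB. hypothetical: each value minus its predecessor
   chg 1 4 9 16
3 5 7
   chg rets 100 110 99
_0.2
   # chg rets 100 110 99 108        NB. n prices give n-2 changes in return
2
```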

Here are the 20-day linear regressions of the prices for each of our four stocks over the first year or so:

Px1990LinAprox20DayWdw_20.png
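
How the 20-day regressions were computed is not shown. One possible approach, offered only as a sketch, is to fit a least-squares line to each window with matrix divide (%.) and to apply that fit to successive windows with infix (\). The names fit and px are hypothetical, and a window of 3 stands in for 20.

```j
   fit =: 3 : 'y %. 1 ,. i. # y'    NB. hypothetical: intercept and slope of a line through y
   px  =: 1 3 5 7 9                 NB. toy series that is exactly linear
   $ fit px
2
   $ 3 fit\ px                      NB. overlapping 3-day windows: one fit per window
3 2
   $ _3 fit\ 1 3 5 7 9 11           NB. negative width gives non-overlapping windows
2 2
```

Each row of the windowed result holds an intercept and a slope; for this toy series every window has a slope of 2. Whether the chart used overlapping or non-overlapping windows is not stated.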

Doing the same for returns presents a less clear picture:

Rets1990Lin20Dy.png

An Example of Some Problems with Data Consistency

The following may give some idea of how hard it is to get good, consistent data. Here are two samples of purported price and volume data for AT&T, not exactly an obscure stock, for a period of about a month-and-a-half. This particular period was chosen because it covers the time of the company's (reverse) stock split of 1-for-5 shares, so there are some interesting discrepancies between the two versions of the same series.

Price and Volume Data for AT&T from Two Different Sources

            From Yahoo Finance          From Factset/Compustat  Ratios
Date        Close    Volume  Adj Close  Close   Volume  Adjust  FPx/YPx  F/Yadj  Yvol/Fvol  YV/FVAdj
10/30/2002  25.96   6618700      20.72  66.00    1.908       5     2.54    3.19       3.47      0.69
10/31/2002  25.66   6984600      20.48  65.20    3.095       5     2.54    3.18       2.26      0.45
11/1/2002   27.25  10612200      21.75  67.60    2.819       5     2.48    3.11       3.76      0.75
11/4/2002   27.87  10199400      22.25  69.45    3.277       5     2.49    3.12       3.11      0.62
11/5/2002   27.72   7174500      22.13  71.50    2.594       5     2.58    3.23       2.77      0.55
11/6/2002   27.80   6334400      22.19  70.30    3.089       5     2.53    3.17       2.05      0.41
11/7/2002   26.77   6267400      21.37  68.25    2.989       5     2.55    3.19       2.10      0.42
11/8/2002   27.21   8021400      21.72  69.50    2.279       5     2.55    3.20       3.52      0.70
11/11/2002  26.20   5076400      20.91  67.60    1.665       5     2.58    3.23       3.05      0.61
11/12/2002  25.01  12683200      19.96  69.30     2.65       5     2.77    3.47       4.79      0.96
11/13/2002  24.32  12866800      19.41  67.35    4.057       5     2.77    3.47       3.17      0.63
11/14/2002  24.40   7596800      19.48  68.35    2.499       5     2.80    3.51       3.04      0.61
11/15/2002  25.19   7715400      20.11  69.30    3.209       5     2.75    3.45       2.40      0.48
11/18/2002  25.64   8173000      20.47  67.55    5.065       5     2.63    3.30       1.61      0.32
11/19/2002  25.48   6724500      20.34  27.20   16.506       1     1.07    1.34       0.41      0.41
11/20/2002  26.20   9795800      20.91  27.66   13.585       1     1.06    1.32       0.72      0.72
11/21/2002  27.48  11146700      21.94  28.00   10.683       1     1.02    1.28       1.04      1.04
11/22/2002  27.55   8017700      21.99  27.97    8.639       1     1.02    1.27       0.93      0.93
11/25/2002  28.40   8141600      22.67  27.99    8.436       1     0.99    1.23       0.97      0.97
11/26/2002  27.45   7918400      21.91  27.83     8.28       1     1.01    1.27       0.96      0.96
11/27/2002  28.73   5098900      22.93  28.00    11.55       1     0.97    1.22       0.44      0.44
11/29/2002  28.50   3332300      22.75  28.04    2.157       1     0.98    1.23       1.54      1.54
12/2/2002   28.35   7492300      22.63  28.00    5.366       1     0.99    1.24       1.40      1.40
12/3/2002   26.93   8025100      21.50  28.08    5.273       1     1.04    1.31       1.52      1.52
12/4/2002   26.45   8882400      21.11  28.06    4.125       1     1.06    1.33       2.15      2.15
12/5/2002   26.25   6162700      20.95  27.95    3.943       1     1.06    1.33       1.56      1.56
12/6/2002   26.43   6156600      21.10  28.00    3.927       1     1.06    1.33       1.57      1.57
12/9/2002   25.69   5830800      20.51  27.89    4.453       1     1.09    1.36       1.31      1.31
12/10/2002  25.70   5328900      20.51  26.64    7.496       1     1.04    1.30       0.71      0.71

As you can see, the numbers are very different. Looking at the "adjustment" column for both series, we see that Yahoo is adjusting the price for the dividends (in "Adj Close") whereas Factset has a share multiplier (in "Adjust") to adjust for splits. However, neither the ratios of the unadjusted nor the adjusted prices seem consistent. Not only this, but the ratios of the share volume numbers, which should be affected by splits but not dividends, seem to vary too much to be able to simply reconcile the two series.
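
The ratio columns can be reproduced from the raw columns, which makes the inconsistency easy to check. The sketch below uses the first row (10/30/2002) and the first post-split row (11/19/2002) of the table. It assumes that the Factset volume is quoted in millions of shares and that the adjusted volume is Volume times Adjust, as in clvol earlier. The helper round2 and the variable names are hypothetical.

```j
   round2 =: 3 : '0.01 * <. 0.5 + 100 * y'   NB. hypothetical: round to two decimals
   ypx  =: 25.96 25.48                       NB. Yahoo Close
   yadj =: 20.72 20.34                       NB. Yahoo Adj Close
   yvol =: 6618700 6724500 % 1e6             NB. Yahoo Volume, in millions (assumed unit match)
   fpx  =: 66 27.2                           NB. Factset Close
   fvol =: 1.908 16.506                      NB. Factset Volume
   fadj =: 5 1                               NB. Factset Adjust
   round2 fpx % ypx                          NB. FPx/YPx
2.54 1.07
   round2 fpx % yadj                         NB. F/Yadj
3.19 1.34
   round2 yvol % fvol                        NB. Yvol/Fvol
3.47 0.41
   round2 yvol % fvol * fadj                 NB. YV/FVAdj
0.69 0.41
```

The price ratio falls from about 2.5 to about 1.1 across the split date even though Adjust changes by a factor of 5, which is one way to see that a single multiplier does not reconcile the two sources.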

Concentrating on the volume numbers as they should have fewer confounding factors, we can compare the day-to-day changes in volume to see if there is some obvious correspondence or consistent difference between the series.
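
The day-to-day percentage changes can be computed directly from the raw columns. A sketch using the first three days of each volume series from the AT&T table follows; pctchg is a hypothetical helper that rounds to two decimals.

```j
   pctchg =: 3 : '0.01 * <. 0.5 + 10000 * <: (}. % }:) y'   NB. hypothetical: percent change, 2 decimals
   pctchg 6618700 6984600 10612200      NB. Yahoo Volume, 10/30 to 11/1/2002
5.53 51.94
   pctchg 1.908 3.095 2.819             NB. Factset Volume, same days
62.21 _8.92
```

Because a percentage change is unaffected by a constant multiplier, the differing units of the two volume columns do not matter here.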

Compare Rates of Change Between Two Datasets

            From Yahoo Finance                         From Factset/Compustat
Date        Yahoo Close  Yahoo Volume  Yahoo AdjClose  Factset Close  Factset Volume
10/31/2002       -1.16%         5.53%          -1.16%         -1.21%          62.21%
11/1/2002         6.20%        51.94%           6.20%          3.68%          -8.92%
11/4/2002         2.28%        -3.89%           2.30%          2.74%          16.25%
11/5/2002        -0.54%       -29.66%          -0.54%          2.95%         -20.84%
11/6/2002         0.29%       -11.71%           0.27%         -1.68%          19.08%
11/7/2002        -3.71%        -1.06%          -3.70%         -2.92%          -3.24%
11/8/2002         1.64%        27.99%           1.64%          1.83%         -23.75%
11/11/2002       -3.71%       -36.71%          -3.73%         -2.73%         -26.94%
11/12/2002       -4.54%       149.85%          -4.54%          2.51%          59.16%
11/13/2002       -2.76%         1.45%          -2.76%         -2.81%          53.09%
11/14/2002        0.33%       -40.96%           0.36%          1.48%         -38.40%
11/15/2002        3.24%         1.56%           3.23%          1.39%          28.41%
11/18/2002        1.79%         5.93%           1.79%         -2.53%          57.84%
11/19/2002       -0.62%       -17.72%          -0.64%        -59.73%         225.88%
11/20/2002        2.83%        45.67%           2.80%          1.69%         -17.70%
11/21/2002        4.89%        13.79%           4.93%          1.23%         -21.36%
11/22/2002        0.25%       -28.07%           0.23%         -0.11%         -19.13%
11/25/2002        3.09%         1.55%           3.09%          0.07%          -2.35%
11/26/2002       -3.35%        -2.74%          -3.35%         -0.57%          -1.85%
11/27/2002        4.66%       -35.61%           4.66%          0.61%          39.49%
11/29/2002       -0.80%       -34.65%          -0.78%          0.14%         -81.32%
12/2/2002        -0.53%       124.84%          -0.53%         -0.14%         148.77%
12/3/2002        -5.01%         7.11%          -4.99%          0.29%          -1.73%
12/4/2002        -1.78%        10.68%          -1.81%         -0.07%         -21.77%
12/5/2002        -0.76%       -30.62%          -0.76%         -0.39%          -4.41%
12/6/2002         0.69%        -0.10%           0.72%          0.18%          -0.41%
12/9/2002        -2.80%        -5.29%          -2.80%         -0.39%          13.39%
12/10/2002        0.04%        -8.61%           0.00%         -4.48%          68.34%

Graphing the pair of daily volume change series:

volumeChangesComparison.png

As we can see, there is a general correspondence, but some puzzling differences remain. However, the biggest difference seems to occur on the date of the reverse split (11/19/2002), so it can be explained by that.

This work continues here.

DevonMcCormick/Research/HoldingWinnersSellingLosers0 (last edited 2011-03-03 18:28:38 by DevonMcCormick)