Bootstrapping missing option quotes?

Strange


Total Posts: 1249
Joined: Jun 2004
 
Posted: 2012-08-20 02:23
I am sure people here have bumped into this problem before.

I have daily historical option data for a specific asset and I am trying to build volatility surfaces from it. The problem is that on any given day only some options have quotes: sometimes whole expiration slices are missing and sometimes just individual options. Sometimes most of the surface is missing (e.g. only two out of 10 expirations are present).

So I have decided to "bootstrap" this data. My thinking is as follows. I would like to create a 3-dimensional "array" where dimension 1 is the option expiration date, dimension 2 is the strike and dimension 3 is the quote date. It feels like any missing volatility level for any quote-date/strike/expiration should be readily "interpolatable" from the neighboring data in these 3 dimensions.
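A minimal sketch of such a cube, in Python with NumPy rather than R. The record layout (quote date, expiry, strike, implied vol) and the name build_vol_cube are assumptions made for illustration; missing quotes are simply left as NaN.

```python
import numpy as np

def build_vol_cube(records):
    """Build an (expiry x strike x quote date) array from a list of
    (quote_date, expiry, strike, implied_vol) tuples (assumed layout)."""
    quote_dates = sorted({r[0] for r in records})
    expiries = sorted({r[1] for r in records})
    strikes = sorted({r[2] for r in records})
    e_idx = {e: i for i, e in enumerate(expiries)}
    k_idx = {k: i for i, k in enumerate(strikes)}
    d_idx = {d: i for i, d in enumerate(quote_dates)}
    # Start with every cell missing, then fill in what was actually quoted.
    cube = np.full((len(expiries), len(strikes), len(quote_dates)), np.nan)
    for d, e, k, vol in records:
        cube[e_idx[e], k_idx[k], d_idx[d]] = vol
    return cube, expiries, strikes, quote_dates

# Made-up quotes; ISO date strings sort chronologically.
records = [
    ("2012-08-17", "2012-09-21", 1400.0, 0.16),
    ("2012-08-17", "2012-09-21", 1450.0, 0.14),
    ("2012-08-17", "2012-12-21", 1400.0, 0.18),
    ("2012-08-20", "2012-09-21", 1400.0, 0.17),
]
cube, expiries, strikes, quote_dates = build_vol_cube(records)
missing_fraction = float(np.isnan(cube).mean())
```

The NaN mask (np.isnan(cube)) then tells you which cells need filling and how sparse each day is before any interpolation rule is chosen.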

I have already figured out how to create a cube like that (in R, no less) from historical data. What I have not yet figured out is how to perform this interpolation efficiently. I can't spend too much time on this (after all, it is nothing more than a data-cleaning task), but I would like to at least give it a try.

It's buy futures, sell futures, when there is no future!

silverside


Total Posts: 1224
Joined: Jun 2004
 
Posted: 2012-08-20 09:02
I think you are underestimating the difficulty :/

Essentially you are trying to "mark" a full vol surface each day, retrospectively, based on limited historical data.

It sounds like you want to do this automatically, meaning you would need robust (but still validated) fitting/interpolation rules. There will be trial and error involved. The choices could include: a global fit of parameters each day, manual marking of skew parameters at intervals (e.g. monthly), or simple interpolation of missing values (a cubic spline, for example).

It sounds like an interesting project, good luck with it.

mtsm


Total Posts: 94
Joined: Dec 2010
 
Posted: 2012-08-20 15:25
I don't think that this is well posed enough a problem to give you straight answers. What is the asset class being considered?

For example if it's OTC options you are considering, my answer would be don't bother trying to fill holes in a market that is unobservable.

I also don't really believe that term structure model approaches involving a global fit are appropriate. I think that a lot of quant finance suffers from applying too much methodology where there is no underlying science to apply it to.

Other than this, I would interpolate a) linearly across calendar time, or even just roll; b) in variance across expiries; and c) arbitrage-free across strikes. Point c) is one of the main reasons people like to use a skew model: they do not have to mess around with convexity-preserving interpolators.
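As a rough illustration of point b), here is a sketch that reads "in variance" as linear interpolation of total implied variance w = vol^2 * t between two neighbouring expiries at a fixed strike. That reading, and the function name, are assumptions; interpolating at fixed moneyness instead would be an equally plausible choice.

```python
import math

def interp_vol_in_total_variance(t, t1, vol1, t2, vol2):
    """Implied vol at time-to-expiry t, interpolating total variance
    linearly between the slices (t1, vol1) and (t2, vol2)."""
    if not (0 < t1 <= t <= t2) or t1 == t2:
        raise ValueError("need 0 < t1 <= t <= t2 with t1 < t2")
    w1 = vol1 ** 2 * t1  # total variance at the near expiry
    w2 = vol2 ** 2 * t2  # total variance at the far expiry
    weight = (t - t1) / (t2 - t1)
    w = w1 + weight * (w2 - w1)
    return math.sqrt(w / t)

# Made-up numbers: 3-month vol 20%, 1-year vol 22%, missing 6-month slice.
vol_mid = interp_vol_in_total_variance(0.5, 0.25, 0.20, 1.0, 0.22)
```

If w2 comes out below w1 the inputs themselves imply decreasing total variance, which is worth flagging before interpolating.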

BTW, I still think that you should have bypassed R and moved on to a proper object-oriented scripting language such as Python. It would make your life so much easier.

athletico


Total Posts: 894
Joined: Jun 2004
 
Posted: 2012-08-20 15:58

I agree with silverside: the full solution is tougher than you might think. But if you are after a quick and dirty method, and you have a lot of data, I'd just discard any {trade date, underlyer, option chain} record that is missing options and copy the previous day's fitted surface parameters. Be careful never to interpolate in your trade date dimension; always copy the previous set of parameters to stay honest.
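The quick and dirty rule can be sketched as a forward fill over trade dates. The names and the parameter tuples below are hypothetical; the only point is that each day sees parameters from that day or earlier, never from the future.

```python
def forward_fill_params(dates, params_by_date):
    """Carry the last available fitted parameters forward in time.

    dates must be in chronological order; params_by_date holds only the
    days whose option chain was complete enough to fit."""
    filled = {}
    last = None
    for d in dates:
        if d in params_by_date:
            last = params_by_date[d]
        filled[d] = last  # stays None until the first fitted day
    return filled

# Made-up example: two days with a usable fit out of four trade dates.
dates = ["2012-08-15", "2012-08-16", "2012-08-17", "2012-08-20"]
fitted = {"2012-08-16": (0.18, -0.10), "2012-08-20": (0.19, -0.12)}
filled = forward_fill_params(dates, fitted)
```

Days before the first successful fit are left as None rather than back-filled, since back-filling would use information that was not available on that date.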

For the full solution, in the past I've decomposed this problem into two steps: (1) static vol surface fit given a set of EOD option quotes for an underlying, and (2) the dynamic problem of updating the most recent set of best estimates for my vol surface parameters, given the solution to (1).


Strange


Total Posts: 1249
Joined: Jun 2004
 
Posted: 2012-08-20 17:41
Thank you for your replies!

These are listed options on a few equity indices, and I know for a fact that some options traded on that day and that all of them were definitely quoted; they are just missing from the data set. I have already worked out how to fit a vol "slice" when it is fully populated (a simple 4th-order polynomial fit vs. log(strike/fwd)/sqrt(t), plus I also store the min vol for each slice). So now the question is what to do in the following cases:

(1) Case 1: Data for the underlying is completely missing for this particular day. This one is fairly clear:
a. copy over the fixed-strike volatilities from the previous day
b. use the previous day's forward growth rates with this day's spot
c. use a + b to create a new parametrisation

(2) Case 2: Data for the underlying is there, but one or more whole expiration slices are missing. In this case I should probably take the parametrisations of the neighbouring slices and interpolate the fit parameters.

(3) Case 3: A slice is present, but multiple strikes are missing and the parametrisation is likely to fail. This one is the trickiest, since it is not clear when exactly I should decide that the parametrisation is going to fail and switch to the "fall-back" model. But I think in this case I should take the changes in implied volatility between the previous and the current day for the strikes present on both days and interpolate/extrapolate these changes to the missing strikes. Once that's done, I should do a fit as if no strikes were missing.
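A rough sketch of the Case 3 idea followed by the slice fit, assuming NumPy. The day-over-day vol changes at quoted strikes are interpolated linearly in strike and held flat beyond the outermost quoted strikes (that extrapolation rule is an assumption, not something settled above). Function names and numbers are made up.

```python
import numpy as np

def fill_missing_by_vol_change(strikes, vols_prev, vols_today):
    """Fill NaNs in vols_today using vol changes seen at quoted strikes.

    strikes must be increasing; vols_prev is yesterday's full slice."""
    strikes = np.asarray(strikes, dtype=float)
    vols_prev = np.asarray(vols_prev, dtype=float)
    vols_today = np.asarray(vols_today, dtype=float)
    seen = ~np.isnan(vols_today) & ~np.isnan(vols_prev)
    if not seen.any():
        raise ValueError("no strike is quoted on both days")
    change = vols_today[seen] - vols_prev[seen]
    # np.interp: linear between quoted strikes, flat outside them (assumed).
    change_all = np.interp(strikes, strikes[seen], change)
    return np.where(np.isnan(vols_today), vols_prev + change_all, vols_today)

def fit_slice(strikes, vols, fwd, t, order=4):
    """Polynomial fit of vol against x = log(strike/fwd)/sqrt(t).
    Coefficients are returned highest order first (np.polyfit convention)."""
    x = np.log(np.asarray(strikes, dtype=float) / fwd) / np.sqrt(t)
    return np.polyfit(x, vols, order)

strikes = [1200.0, 1250.0, 1300.0, 1350.0, 1400.0, 1450.0, 1500.0]
vols_prev = [0.26, 0.24, 0.22, 0.20, 0.185, 0.175, 0.17]
vols_today = [0.27, np.nan, 0.23, np.nan, 0.195, np.nan, 0.18]
filled = fill_missing_by_vol_change(strikes, vols_prev, vols_today)
coeffs = fit_slice(strikes, filled, fwd=1400.0, t=0.25)
```

One possible trigger for the fall-back is a simple count: if fewer strikes are quoted than some threshold above the number of fit coefficients, fill first and fit afterwards.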

Does that make sense?
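For Case 2, one way to read "interpolate the fit parameters" is a linear blend of the neighbouring slices' polynomial coefficients in time to expiry. This is only a sketch under that assumption; the names are hypothetical and the coefficients are made up, ordered highest power first as np.polyfit would return them.

```python
import numpy as np

def interpolate_slice_params(t_missing, t_near, params_near, t_far, params_far):
    """Blend two neighbouring slices' fit coefficients linearly in expiry time."""
    if not t_near < t_missing < t_far:
        raise ValueError("missing expiry must lie strictly between its neighbours")
    w = (t_missing - t_near) / (t_far - t_near)
    near = np.asarray(params_near, dtype=float)
    far = np.asarray(params_far, dtype=float)
    return (1.0 - w) * near + w * far

def slice_vol(params, strike, fwd, t):
    """Evaluate the fitted polynomial at x = log(strike/fwd)/sqrt(t)."""
    x = np.log(strike / fwd) / np.sqrt(t)
    return float(np.polyval(params, x))

# Made-up quartic coefficients for the 3-month and 9-month slices.
params_3m = [0.0, 0.0, 0.05, -0.10, 0.20]
params_9m = [0.0, 0.0, 0.03, -0.08, 0.22]
params_6m = interpolate_slice_params(0.5, 0.25, params_3m, 0.75, params_9m)
atm_6m = slice_vol(params_6m, strike=1400.0, fwd=1400.0, t=0.5)
```

Because the fit variable is already scaled by sqrt(t), the coefficients of adjacent slices tend to be comparable, which is what makes blending them plausible at all.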


Patrik
Founding Member

Total Posts: 1178
Joined: Mar 2004
 
Posted: 2012-08-20 18:33
I have no experience with equity index vol, but coming from a commodities background I'd start with the "features" of my surface that I know about and have some intuition for. For example, in my case ATM is more dynamic than the skews (just an example; translate it to your own vol dynamics). For days when wing data is missing, I'd start with yesterday's wing (parameterized in whatever way is suitable) and possibly adjust it a tad based on the wing changes observable that day in the previous/next expiry. For missing ATM data, I'd look at things like keeping the relative implied stddevs constant from the previous day between month X, which you have data for, and month Y, which you don't, since I would expect those to stay pretty stable day on day.
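A sketch of the missing-ATM rule, assuming "implied stddev" means vol * sqrt(t) and that the ratio of month Y's stddev to month X's is carried over unchanged from the previous day. Both the interpretation and the names are assumptions for illustration.

```python
import math

def atm_vol_from_ratio(vol_x_today, t_x_today, t_y_today,
                       vol_x_prev, t_x_prev, vol_y_prev, t_y_prev):
    """Estimate today's ATM vol for month Y from month X, holding the
    previous day's ratio of implied stddevs (vol * sqrt(t)) constant."""
    ratio_prev = (vol_y_prev * math.sqrt(t_y_prev)) / (vol_x_prev * math.sqrt(t_x_prev))
    stddev_y_today = ratio_prev * vol_x_today * math.sqrt(t_x_today)
    return stddev_y_today / math.sqrt(t_y_today)

# Made-up numbers: month X moved from 30% to 33%, month Y has no quote today.
day = 1.0 / 252.0
vol_y_today = atm_vol_from_ratio(
    vol_x_today=0.33, t_x_today=0.25 - day, t_y_today=0.50 - day,
    vol_x_prev=0.30, t_x_prev=0.25, vol_y_prev=0.28, t_y_prev=0.50,
)
```

With unchanged times to expiry the rule reduces to scaling month Y's vol by the same percentage move as month X.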

I'd do those things in a space and units that make sense to me (make sense = I have intuition about the dynamics and/or believe the parameters to be relatively stable in that description), be that sticky strike or sticky delta, log vols or normal vols, etc. For me it wouldn't be strike and percentage lognormal vols; I'd pick something that is more stable and needs fewer datapoints. Not sure what makes sense for equity index vol.

In short, I'd do the same things I'd do manually if I were sitting there at the end of the day and had to mark my vols, was missing some markets, and it was too late to go check (markets closed).

I guess it also depends on what the results are used for. My approach is geared more towards estimating where the mids are and being right on "average", and less towards picking up smaller day-on-day anomalies to trade on.

Capital Structure Demolition LLC Radiation

Strange


Total Posts: 1249
Joined: Jun 2004
 
Posted: 2012-08-20 20:47
Actually, in the equity space vol at the strike level is the most stable parameter (ATM varies with the forward as it slides along the skew), so it feels to me that using time series of the volatility slices might be the best way to produce clean data series. Not sure. The problem is that 90% of the data is OK but the remaining 10% completely destroys the models.


TonyC
Nuclear Energy Trader

Total Posts: 1147
Joined: May 2004
 
Posted: 2012-08-24 20:24
How about fitting a Jackwerth "generalized binomial tree" to the observable options and using that tree to infer the missing bits of the surface?

flaneur/boulevardier/remittance man/energy trader