Recommender systems for non-frequent purchases or what I’d do with a lot of airline/holiday data

Declaration of interest: I am doing a lot of learning in the area of machine learning, classification, recommender and personalisation systems at the moment (at least compared to 3 months ago). 

If you were to look at the recommendations which Amazon offer me in the area of books, you’d probably wonder a little about me. The two front runners, content-wise, are ethnic recipe books and books on machine learning programming or algorithms.

I go through them every once in a while, usually late at night, and update them with useful information such as which of the recommendations I already own, and which I absolutely don’t want. And I might occasionally add something unexpected to my wishlist.

This has a fascinating impact on my recommendations. Last night, the addition of a single machine learning book to my wishlist had the net impact of dropping the number one recommendation, a cookery book called Jerusalem, down to number 6. A subsequent addition of an Edward Tufte data visualisation book caused two new data visualisation books to get into the top ten, including one I had never heard of at number 3 (after Jerusalem got pushed down to number 6, Stephen Few wound up at number 1 with a book called Show Me the Numbers). I haven’t decided yet whether I want Jerusalem or not either; I have over 100 cookbooks so theoretically, I can’t argue that I need it.

Deletions of books I wasn’t interested in usually resulted in the list just shuffling up a bit. Additions to the wishlist caused changes to the content of the list. From this I conclude that greater weight is given to additions to the wishlist than to deletions from the recommendation list. I would love to see the underlying data structure and code for this. There’s this but it’s 10 years old and I have no doubt that they’ve done a serious amount of work in the interim.

What does all this mean for the supposed content of this blog post? Well I realise that the Amazon data set relating to me is large and gathered over around 10 years at this stage, but deep down a part of me would like to do a little more research into it.

However, during the week, I was also considering recommender systems for less frequently used services and in particular, airlines.

Recommender systems work best if you have a decent picture of your individual customer at the point of loading up the site. Amazon does this using accounts. If you have a look at the airlines, in general, they offer a mixed experience on that front. Most of them, although not all, offer some form of registration: some allow you to connect your account to a frequent flier card, and some allow you to create an account.

However, I’m not sure how many of them compel you to create an account to book a flight directly with them. I’m pretty certain that the last few times I booked airline tickets, I did so without an account.

This is not necessarily an impediment to providing some personalisation services. While I do have a Hotels.com account, for example, they are well capable of remembering where I was last looking for hotels even if I haven’t signed in with my own account.

There is an issue, however, in that the airlines are already perceived, perhaps, to game that sort of tracking by charging you more the second time you look. That perception isn’t ideal for anyone endeavouring to provide any sort of personalisation and recommendation system.

The other key issue is, arguably, this: how do you provide personalisation services to a cohort that doesn’t buy airline tickets every other day (or at five past midnight when they can’t sleep)? If you take any of the major airlines, they carry millions of passengers, and by definition, a lot of them have to be repeat passengers courtesy of return ticketing, business travelling, family visits. The airline business got into the loyalty business early with the frequent flier cards but again, the picture of airline travel has changed a lot for a lot of the market since those things were invented. There is not necessarily a lot in common between your Netflix recommendations and your frequent flier points.

I have no doubt work is going on in this area – check this out from Rick Seaney in USA Today – however, what follows are some of my own thoughts on the subject.

  1. Passengers need to be classified. You could have sixty million passengers travelling with you every year, but that’s unlikely to be sixty million different people so it is possibly not as huge a task to classify them; what matters is the feature selection side of things. They should be classified not just into leisure travel and business travel, or short trip travel and long trip travel (they don’t always overlap), or a few subsets of those, but into enough classes to allow you to provide a reasonable level of personalisation. Late last night, I figured on 20-25 groupings but I’d argue that figure is possibly dependent on what airline is doing the classification. A long haul operation like Etihad is likely to be very different to a short haul operation concentrated mainly within Europe (of which there are countless).
  2. Routes need to be classified. To be fair, the vast majority of airlines already have this one sewn up.
  3. Passenger booking behaviour needs to be classified. Again, not just in terms of how often they book, but how frequently they buy car hire, hotels, travel insurance. Whether they turn up to fly. Whether they look for refunds when they don’t fly. Not just for the amount of free money you gather up from them, but to add to your picture of them.
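
To make the classification idea a little more concrete, here is a minimal sketch in Python. It turns raw bookings into one feature vector per passenger (how often they book, how often they add car hire or insurance, how often they fail to turn up) and groups those vectors with a tiny k-means. Everything in it is an assumption for illustration: the `Booking` fields, the choice of features, and the use of k-means itself. A real airline would have far richer data, would scale the features, and would probably reach for a proper library rather than this hand-rolled loop.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Booking:
    # Hypothetical record; field names are illustrative only.
    passenger_id: str
    bought_car_hire: bool
    bought_insurance: bool
    showed_up: bool


def passenger_features(bookings):
    """Build one vector per passenger:
    [number of bookings, car hire rate, insurance rate, no-show rate]."""
    grouped = defaultdict(list)
    for b in bookings:
        grouped[b.passenger_id].append(b)
    features = {}
    for pid, bs in grouped.items():
        n = len(bs)
        features[pid] = [
            float(n),
            sum(b.bought_car_hire for b in bs) / n,
            sum(b.bought_insurance for b in bs) / n,
            sum(not b.showed_up for b in bs) / n,
        ]
    return features


def squared_distance(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def kmeans(points, k, iterations=20):
    """Very small k-means. The first k points seed the centroids,
    which keeps the sketch deterministic (an assumption, not best practice)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def classify_passengers(bookings, k):
    """Return {passenger_id: cluster label}."""
    features = passenger_features(bookings)
    ids = list(features)
    labels = kmeans([features[pid] for pid in ids], k)
    return dict(zip(ids, labels))
```

With the 20-25 groupings I mentioned, `k` would simply be set somewhere in that range; the interesting work is in deciding which features go into the vector.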

There are a couple of things which I could see coming out of this.

Here’s something that would certainly buy my interest immediately if, for example, I was travelling to Paris every Monday morning and coming back on a Tuesday evening for business. Provide me a login that generates a page that has two buttons: Paris and Other. The Paris button could be prefilled with the most likely routing/timing options if they are available. Or, Sorry Miss Lynch, your usual flight is fully booked. Allow me to create another personalised button based on possible plans. For example, I might want to fly to, oh, Malaga to go kitesurfing in Tarifa maybe six times a year. Let me build one of those so that my landing screen is Paris, Malaga and Other. Include sports equipment as an option by default in the Malaga booking. Learn enough about me to know that, for example, I have annual travel insurance, and don’t try to sell me more. Know enough about me to know that if I am flying into Nice, I’ll hire a car, but not if I fly into London. Even if I am not booking, it might be worth letting me build dreams on your site like this for four reasons:

  1. It makes me happy.
  2. It tells you a lot more about potential customers.
  3. It can support families sorting out holiday plans.
  4. It can support groups organising trips away together.

You can make it clear you are not locking down a fare at that point, but you do get a picture of some of the possible bookings on that flight and this may have an impact on how you manage bookings on that route around those dates. While you’re at it, keep an eye on possible efforts to game your recommender system and identify it as a class of behaviour.

Based on the information I provide when I am booking, airlines can obtain enough data to do this, even without tying the behaviour to an account. However, right now, this is not the approach that they take.

But here’s something else you could do.

Suppose I click on my Malaga button and the flights for the dates I choose are full. Maybe there is some golf competition on there and you know this because you’re good at knowing when events are on but the average kitesurfer might not care about the European PGA. Or it’s the week before the school holidays. Or O’Reilly have decided to run a big technical conference down there. Any number of reasons, but the flight from, say, Dublin to Malaga is full. Or any flight to Malaga is full depending on where I am living.

If I, as an airline, know that a lot of kitesurfers take their kitesurfing gear to Tenerife, or, at least have built potential bookings, I could suggest Tenerife as an alternative – a targeted alternative (particularly if I am flying alone), with the practical date data already provided for Malaga filled into a new booking form. Or if Tenerife is your first choice, Lanzarote is a viable alternative. Or Faro. Or Madeira. Based on the time frame and the amount of money concerned, and whether you interline with anyone, you have endless opportunity here. Clearly someone going golfing in Portugal for four days is not going to want to fly 11 hours via London to somewhere in Italy – but someone going for 14 days might consider a non-direct option.

Of course to do this, you need to know that my sports gear is kitesurfing equipment. But this is not impossible. And of course, you’ll never ask if I want to bring kitesurfing equipment on my regular Monday morning trip to Paris because you know already I don’t. If I don’t have much of a direct history with you, the data you have on other people can be leveraged to build a feature set to classify me.
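
A rough sketch of the "Malaga is full, how about Tenerife?" idea, again in Python. It assumes each past booking (or built-but-not-booked trip) has already been tagged with a passenger class such as "kitesurfer"; the function name, the class labels and the shape of the history data are all made up for illustration. It simply counts where people in the same class go and offers the most popular destinations that still have seats.

```python
from collections import Counter


def suggest_alternatives(history, passenger_class, full_destination,
                         available, top_n=3):
    """Suggest destinations popular with the same passenger class.

    history: iterable of (passenger_class, destination) pairs (hypothetical
             format, built from other people's bookings or saved plans)
    available: set of destinations that still have seats on the chosen dates
    """
    counts = Counter(
        dest
        for cls, dest in history
        if cls == passenger_class
        and dest != full_destination
        and dest in available
    )
    # most_common sorts by count, highest first.
    return [dest for dest, _ in counts.most_common(top_n)]


history = (
    [("kitesurfer", "Malaga")] * 3
    + [("kitesurfer", "Tenerife")] * 3
    + [("kitesurfer", "Lanzarote")] * 2
    + [("kitesurfer", "Faro")]
    + [("golfer", "Faro")] * 5
)
suggestions = suggest_alternatives(
    history, "kitesurfer", "Malaga", {"Tenerife", "Lanzarote", "Faro"}
)
```

A counting approach like this ignores trip length, price and connections, all of which matter for the reasons given above; it is only meant to show that the data other passengers leave behind can stand in for a history I don’t have.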

The point I am trying to make here is that, publicly, there is a perception that airlines basically use whatever personalisation options they have to increase the fares by trapping you. Airline yield management is complex so with the best will in the world, it’s never likely to be quite that simple. But if airline personalisation tools made life easier for their customers, they might engender a lot more repeat business, particularly now. Obviously gaining that trust in a way which is not perceived to be creepy is going to be a challenge because it’s based on knowing a lot about your customers, which is something a lot of people go out of their way to discourage. I mean, I know people who deliberately like to confuse Amazon about their taste in books and music – I’m not one but then you’re talking to someone who got pleasure out of checking out how her recommendations changed by updating her wishlist.

Another interesting thing which could be done with this sort of model of engaging with your customer, based on what you know about them, is telling them how many seats are available or grading the flight as commonly searched. Hotels.com does this with hotel rooms. Two rooms left at this price. This is useful because while it may not cause me to book at that point in time, it’s hardly going to come as a shock to me that the price of a room in the George V in Paris has increased in the last two or three hours since I managed to get my travelling companion on the phone. It provides some trust. If my flight is rated Red for popular, I’ll know I am competing with, for example, 5000 Munster fans for that last seat on a flight the day before a match.
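
As a sketch of what such a grading might look like, here is a small Python function that rates a flight from how full it is and how heavily it is being searched. The amber and green grades, the thresholds and the inputs are all my own assumptions for illustration; an airline’s yield management people would no doubt have much better signals.

```python
def grade_flight(seats_left, capacity, searches_last_day):
    """Return 'red', 'amber' or 'green' as a rough popularity grade.

    Thresholds are illustrative guesses, not real airline figures.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    fill = 1 - seats_left / capacity
    # Searches competing for each remaining seat; avoid dividing by zero.
    searches_per_seat = searches_last_day / max(seats_left, 1)
    if fill >= 0.9 or searches_per_seat >= 50:
        return "red"
    if fill >= 0.6 or searches_per_seat >= 10:
        return "amber"
    return "green"
```

So a flight with 100 seats left out of 180 would still come up red if 5000 people had searched for it in the last day, which is the Munster fans case.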

All of this is only possible if my customers trust me to use this data effectively to support them and not, specifically, to abuse them. I mean, if I assume someone who books every Monday morning will always book every Monday morning and start applying stealthy price increases to them that I do not necessarily apply to non-regular passengers, I will wind up with some public relations issues. And the loss of regular streams of income.

In summary, I believe it is possible to personalise the booking experience to the benefit of both passenger and airline. I can see that hotel booking agencies are already working in this area but I think there’s even more potential there. Even after the booking experience is personalised down to the nth degree, this information could have a huge impact on targeting promotional emails (which is something, in my experience, the hotels aren’t quite getting right yet).