All for the want of a nail

Last week, it was announced that the Web Summit would be leaving Dublin and this caused a certain amount of handwringing about the impact this would have on Ireland. There were muttered comments about hotels and room rates as well.

I found it very difficult to get excited and upset about this. The bigger news in Dublin last week was the “delaying” of the interconnector, a massive, massive piece of infrastructure which the city is screaming out for and has been screaming out for since I do not know when. Dublin public transport is painful.

The other issue is that I’ve never really seen the Web Summit as anything to get excited about. I have never seen it as a technology conference and most of the people I know who went to it worked in marketing or were students. I’ve always felt a lot of claims have been made for it but when push came to shove, they really only seemed to mention one big deal that was done there. And it was the sort of deal which doesn’t happen without a lot of advance preparation. In short, it was the sort of deal which I suspect would have happened anyway.

So where does that leave us? Well, I still don’t care about the fact that the Web Summit is relocating to Lisbon. My personal experience of rush hour traffic on the way to the airport in Lisbon would not necessarily lead me to believe things are massively better there. But there are linked issues. We don’t really have the infrastructure for a large indoor conference like that (although let’s face it, we do a decent enough job on music festivals and ploughing championships). The question is, would it be worth our while building somewhere to handle conferences and fairs with an attendance of tens of thousands? I have some doubts. If we learned anything much in the last 10 years – and I doubt it was much – it was that Build It And They Will Come is not a recipe for success. I cannot see the point in doing something like that unless we could identify a minimum number of annual events to make it worth our while to build it. I’m not in event management so maybe someone could come up with that number. However, I do have certain interests and can say that I have had cause to look at some of the events which go to the Hannover Messehalle and the Frankfurt Messe. We’ve a long way to go.

That being said, the big issue I have with the Web Summit is that it’s not much of a technology conference and it certainly isn’t tech sector in my view. It’s event management and that’s it. I don’t think it’s a loss to our tech sector and for all that’s said about it, evidence that it has a major impact on our tech sector seems to be scant. Given the organisers’ ability to shout about it, I find this curious.

If we are to be concerned about the Web Summit at all, it is purely in the context of whether we want to be able to attract large conferences/fairs/shows to Dublin and whether, given our relative isolation, we would be able to. We can be expensive enough to reach if you’re not on a point-to-point connection. So no decision on whether the loss of Web Summit is good or bad should be made in the context of “ochon o my chroi, we’ve lost the Web Summit”, but in the hard cold calculations of whether we can, at least 4 times a year, get large numbers of people to come to Dublin for a conference of any description. When I look at the conferences I’m more familiar with such as CeBIT in Hanover and the London Stationery Show (I have diverse interests), I recognise that we are nowhere close to even being able to start with these things. And for all the big shows which take place like the various car shows, yes these places have airports. Most of them also have rail connectivity to other urban areas. Put simply, I doubt you could argue in favour of a large conference venue in Dublin absent a high speed train connection to more places than Cork and Belfast. We are not Frankfurt, we are not Hanover. We are not London. Our hinterland is too small for things like this in my view. For all that Portugal has a bigger population, I’m not convinced that they are any better either.

I’m of the opinion that if Web Summit genuinely had a vision of a future where they were huge and could bring that level of audience to Ireland consistently, they would be able to build their own premises as a conference centre. If the demand for large conferences in Ireland was there, it would make them a profit. If it isn’t, it wouldn’t. They’ve very obviously voted with their feet.

GitHub and open source, the victory or not

During the week, Wired published a piece under the title GitHub’s Top Coding Languages Show Open Source Has Won.

This is basically – and I am being diplomatic here – not what GitHub’s Top Coding Languages shows.

Fundamentally, for GitHub to show this, every piece of operational code would have to be on GitHub. It isn’t. I’d be willing to bet less than half of it is, and probably less than a quarter, but that’s a finger in the air guess. Most companies don’t have their code on GitHub.

What GitHub’s top ten coding languages show is that these are the ten most popular languages posted by people who use GitHub. Nothing more and nothing less.

I suspect GitHub know this. I really wonder why Wired does not.

 

National bus stops

Having done Dublin Bus, it occurred to me to see if the Bus Eireann network was available. It is.

BusEireann

This is actually a bit more interesting than the Dublin Bus one for various reasons, specifically the gaps. I’m fascinated by the big hole in the middle and I will probably look at doing some additional work in terms of other spatial data.

I’ve just been asked where I get the data and it’s remiss of me not to credit the data source. The data for both Bus Eireann and Dublin Bus, plus a number of other operators is available on Transport for Ireland’s website. The link is here.

I haven’t cleaned up this graph all that much and I have additional plans for this and the other transport data that I have been looking at.

The beauty in Dublin Bus Stops

I have an ongoing project in the area of public transport in Dublin, which has a) stagnated for a while and b) grown a bit since I have had to interact more with public transport in Dublin.

DublinBusStops

This is one of the items on the project list. This image is a scatter plot of Dublin Bus latitude/longitude values for Dublin bus stops. The file this is based on, which I pulled into Excel to do this (yes, I have other plans involving R at some point which may see this revisited) has more than 4700 datapoints. Looking at it like this, I’m going to see if I can find a similar dataset for street lights. I think it’s a rather beautiful looking spiders’ web.
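For anyone who would rather skip Excel, the same scatter plot can be sketched in a few lines of Python. This is only a rough sketch under assumptions: it supposes the stops file is a CSV with columns named stop_lat and stop_lon (that is how GTFS-style stops files are usually laid out, but I have not confirmed it for this file), the sample rows are made up, and plot_stops needs matplotlib installed.

```python
import csv
import io


def read_stop_coordinates(handle, lat_field="stop_lat", lon_field="stop_lon"):
    """Return (longitude, latitude) pairs, skipping rows without usable values.

    The column names are assumptions; adjust them to match the real file.
    """
    points = []
    for row in csv.DictReader(handle):
        try:
            points.append((float(row[lon_field]), float(row[lat_field])))
        except (KeyError, TypeError, ValueError):
            continue  # missing column, short row, or non-numeric value
    return points


def plot_stops(points, out_path="dublin_bus_stops.png"):
    """Draw one small dot per stop; requires matplotlib."""
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no window needed
    import matplotlib.pyplot as plt

    lons, lats = zip(*points)
    fig, ax = plt.subplots(figsize=(8, 8))
    ax.scatter(lons, lats, s=1)
    ax.set_xlabel("Longitude")
    ax.set_ylabel("Latitude")
    fig.savefig(out_path, dpi=150)


# Made-up sample rows standing in for the real stops file.
SAMPLE = """stop_id,stop_name,stop_lat,stop_lon
A,Example stop A,53.3498,-6.2603
B,Example stop B,53.3440,-6.2672
C,Row with missing coordinates,,
"""

points = read_stop_coordinates(io.StringIO(SAMPLE))
```

With the real file, open it with open(path, newline="") and pass the handle to read_stop_coordinates, then hand the result to plot_stops.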

 

Flight routes operated by Aer Lingus out of Ireland

Following yesterday’s project with the Ryanair route data, I did the same for Aer Lingus this morning. I included one extra chart, mainly because of the northwest Atlantic destinations which are served by Dublin and Shannon and I wanted a view on how many airports service two routes.

So here are the charts.

AerLingus_EX_IRL

Complete network

AerLingus_EX_IRL_2

Airports serving at least 2 routes

AerLingus_EX_IRL_3

Airports serving at least 3 routes.

So there are a couple of points to note about this. Both the Ryanair and Aer Lingus data were imported into Illustrator to build a web friendly file format (I exported the graphs as PDFs). I am primarily an expert in Photoshop rather than Illustrator, so there are a few things I missed yesterday which I could fix on today’s files, particularly with respect to label positioning. I did this specifically for the second and third charts.

The underlying data have some similarities. For Aer Lingus, Dublin has something in the region of 80 destinations which is not that dissimilar to Ryanair’s offering. Aer Lingus flies out of a couple of extra airports, namely Belfast and Donegal, but the routes concerned are limited – Belfast targets London at the moment, and Donegal targets Dublin. The other point to note about this data is that it includes destinations operated by either Aer Lingus or Aer Lingus Regional (operated by Stobart), but not connecting destinations beyond their hubs in North America, for example. Direct flights only.

Just a brief comment on those airports with three or more routes outbound: they include the following:

  1. Cork
  2. Dublin
  3. Shannon
  4. London Gatwick
  5. London Heathrow
  6. Manchester
  7. Birmingham
  8. Bristol
  9. Edinburgh
  10. Lanzarote
  11. Malaga
  12. Faro

This is significantly UK focused compared to Ryanair which was highly holiday destination focused. I’m not saying you couldn’t go on your holidays in Manchester or Birmingham…but I suspect most people don’t.

I am not really finished with this project – I have a couple of other thoughts about it and I’d also like to look at combined connectivity out of Ireland across all airlines. That data is going to take a while to gather up and certain things I want to do I am not sure are possible using Gephi. I will also look at graphic decisions like the fonts and colours as well.
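The combined view could start from something as simple as merging the per-airline route lists and remembering who operates each route. A minimal sketch, assuming routes are held as (origin, destination) pairs; combine_networks is a name I have made up, and the routes listed are illustrative rather than the collected data.

```python
from collections import defaultdict


def combine_networks(routes_by_airline):
    """Map each undirected route to the set of airlines operating it."""
    operators = defaultdict(set)
    for airline, routes in routes_by_airline.items():
        for origin, destination in routes:
            # frozenset makes Dublin-Faro and Faro-Dublin the same route
            operators[frozenset((origin, destination))].add(airline)
    return dict(operators)


# Illustrative routes only, not the full collected data.
ROUTES_BY_AIRLINE = {
    "Aer Lingus": [("Dublin", "Faro"), ("Donegal", "Dublin")],
    "Ryanair": [("Dublin", "Faro"), ("Cork", "London Stansted")],
}

combined = combine_networks(ROUTES_BY_AIRLINE)
shared = {route for route, airlines in combined.items() if len(airlines) > 1}
```

The set of operators per route could later become an edge attribute, so that routes flown by more than one airline can be coloured differently.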

Flight routes from Ireland operated by Ryanair

Flights operated by Ryanair from Ireland

The image above is a network chart of flights out of Ireland as operated by Ryanair from the following airports:

  • Cork;
  • Dublin;
  • Kerry;
  • Knock Ireland West; and
  • Shannon.

The data is not organised geographically, but in terms of connections between nodes. The nodes on the chart are the various airports, and the links, or edges, are operated routes. This chart shows all the connections between the five Irish airports above and any airport that Ryanair flies to from those airports.
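To make the node and edge idea concrete, here is a minimal sketch of how a route list turns into a graph and then into an edge list file. The routes are illustrative samples, not the data collected for the chart, and the Source/Target column headers are what I understand Gephi’s spreadsheet import to look for, so treat that as an assumption to check.

```python
import csv
import io

# Illustrative routes only, not the collected data set.
ROUTES = [
    ("Dublin", "London Stansted"),
    ("Cork", "London Stansted"),
    ("Shannon", "London Stansted"),
    ("Cork", "Fuerteventura"),
    ("Shannon", "Fuerteventura"),
    ("Dublin", "Faro"),
]


def build_graph(routes):
    """Return (nodes, edges): airports as nodes, operated routes as undirected edges."""
    nodes = set()
    edges = set()
    for origin, destination in routes:
        nodes.update((origin, destination))
        edges.add(frozenset((origin, destination)))
    return nodes, edges


def write_edge_csv(edges, handle):
    """Write one row per edge under Source/Target headers."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target"])
    for edge in sorted(tuple(sorted(e)) for e in edges):
        writer.writerow(edge)


nodes, edges = build_graph(ROUTES)
buffer = io.StringIO()
write_edge_csv(edges, buffer)
```

Nothing here knows where an airport is on a map. The layout in the chart comes entirely from which nodes are connected to which.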

However, Gephi allows you to fine tune what you want to see, and so there is this:

Ryanair_EX_IRL_3

This basically includes only those nodes which have at least three connections to other nodes. If you like, it includes the destinations which are served by Ryanair from at least three airports in Ireland. Without looking in too much detail, you could probably guess a few: London Stansted is an obvious candidate. Dublin still has the most connections; this is not surprising as it had 80 or so to begin with. But the target airports are interesting:

  • Alicante;
  • Faro;
  • London Stansted;
  • Malaga;
  • Tenerife South;
  • London Gatwick;
  • Palma;
  • Lanzarote;
  • Milan Bergamo;
  • Girona Barcelona; and
  • Liverpool.

Most of those airports, with the exceptions of London Gatwick and Stansted, and Liverpool, are holiday destinations. The outlier – as in the one I did not really expect to see – was Milan Bergamo.
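The filter behind that second chart can be sketched outside Gephi as well. This is a one-pass approximation of a degree filter, assuming degree is counted on the full network and not recalculated after nodes are dropped; I have not checked it against Gephi’s own implementation, and the routes are illustrative.

```python
from collections import Counter

# Illustrative routes only.
ROUTES = [
    ("Dublin", "London Stansted"),
    ("Cork", "London Stansted"),
    ("Shannon", "London Stansted"),
    ("Dublin", "Faro"),
    ("Cork", "Faro"),
    ("Shannon", "Faro"),
    ("Cork", "Fuerteventura"),
    ("Shannon", "Fuerteventura"),
    ("Dublin", "Milan Bergamo"),
]


def degree_filter(routes, minimum):
    """Keep airports with at least `minimum` routes, and the edges between them."""
    edges = {frozenset(route) for route in routes}
    degree = Counter(airport for edge in edges for airport in edge)
    keep = {airport for airport, count in degree.items() if count >= minimum}
    kept_edges = {edge for edge in edges if edge <= keep}
    return keep, kept_edges


airports, remaining = degree_filter(ROUTES, 3)
```

In this sample Fuerteventura (two routes) and Milan Bergamo (one route) drop out, and the six routes linking the remaining five airports stay.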

The graph would probably be bigger if I stripped it down to airports with two connections, mainly because Dublin has destinations in common with most of the other airports. Fuerteventura is one of the few destinations which is served by Cork and Shannon but not by Dublin, for example.
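Questions like the Fuerteventura one are plain set arithmetic once the routes are grouped by origin. A small sketch, again with illustrative routes and a made-up helper name:

```python
from collections import defaultdict

# Illustrative routes only.
ROUTES = [
    ("Dublin", "London Stansted"),
    ("Dublin", "Faro"),
    ("Cork", "London Stansted"),
    ("Cork", "Faro"),
    ("Cork", "Fuerteventura"),
    ("Shannon", "London Stansted"),
    ("Shannon", "Fuerteventura"),
]


def destinations_by_origin(routes):
    """Group destinations under the Irish airport they are served from."""
    served = defaultdict(set)
    for origin, destination in routes:
        served[origin].add(destination)
    return served


served = destinations_by_origin(ROUTES)
# Served from both Cork and Shannon, but not from Dublin.
not_from_dublin = (served["Cork"] & served["Shannon"]) - served["Dublin"]
```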

Gephi is a really nice tool to use for stuff like this and I have other plans for it. This is actually the first project I have done using it and I have an interest in figuring out how much more I can customise to use more data. For example, there is no weighting on any of the edges in either of the graphs above, and that parameter could probably be used to demonstrate frequency or seasonality. I have other plans as well. I also have plans to do something like this with bus routes in Dublin and general public transport, for example.
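As a rough idea of what weighting might look like, the sketch below counts flights per route and writes the count out as an edge weight. The flight records are invented for illustration, and the Weight column name is my assumption about what Gephi’s spreadsheet import picks up.

```python
import csv
import io
from collections import Counter

# One tuple per scheduled flight in some period; invented for illustration.
FLIGHTS = [
    ("Dublin", "London Stansted"),
    ("Dublin", "London Stansted"),
    ("London Stansted", "Dublin"),
    ("Cork", "Faro"),
]


def weighted_edges(flights):
    """Count flights per undirected route and return sorted (a, b, weight) rows."""
    counts = Counter(tuple(sorted(flight)) for flight in flights)
    return sorted((a, b, n) for (a, b), n in counts.items())


def write_weighted_csv(rows, handle):
    """Write the rows under Source/Target/Weight headers."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(rows)


rows = weighted_edges(FLIGHTS)
buffer = io.StringIO()
write_weighted_csv(rows, buffer)
```

Seasonality could be handled the same way by counting only the flights in a given month and producing one file per season.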

Couple of notes about the data:

  1. Route data was collected on 12 May 2015. As such, it will go out of date as Ryanair update their routes out of Ireland.
  2. It does not take account of any seasonal differences: not all of these flights may operate year round. Personally, I am considering a flight to Grenoble myself as I did not know one existed until today.
  3. This is a proof of concept for other work I want to do later with transport routing.
  4. I will probably look at other airlines later if I can access the data easily.

Future work

Via Twitter yesterday, I was pointed to this piece on one of the WSJ’s blogs. Basically it looks at the likelihood that a given job type might or might not be replaced by some automated function. Interestingly, the WSJ suggested that the safest jobs might be in the interpreting and translation industry. I found that interesting for a number of reasons so I dug a little more. The paper that blogpost is based on is this one, from Nesta.

I had a few problems with it so I also looked back at this paper which is earlier work by two of the authors involved in the Nesta paper.  Two of the authors are based at the Oxford Martin institute; the third author of the Nesta paper is linked with the charity Nesta itself.

So much for the background. Now for my views on the subject.

I’m not especially impressed with the underlying work here: there’s a lot of subjectivity in terms of how the underlying data was generated and in terms of how the training set for classification was set up. I’m not totally surprised that you would come to the conclusion that the more creative work types are more likely to be immune to automation for the simple reason that there are gaps in terms of artificial intelligence on a lot of fronts. But I was surprised that the outcome focused on translation and interpreting.

I’m a trained interpreter and a trained translator. I also have postgraduate qualifications in the area of machine learning with some focus on unsupervised systems. You could argue I have a foot in both camps. Translation has been a target of automated systems for years and years. Whether we are there yet or not depends on how much you think you can rely on Google Translate. In some respects, there is some acknowledgement in the tech sector that you can’t (hence Wikipedia hasn’t been translated using it) and in other respects, that you can (half the world seems to think it is hilariously adequate; I think most of them are native English speakers). MS are having a go at interpreting now with Skype. As my Spanish isn’t really up to scratch I’m not absolutely sure that I’m qualified to evaluate how successful they are. But if it’s anything like machine translation of text, probably not adequately. Without monumental steps forward in natural language processing – in lots of languages – I do not think you can arrive at a situation where computers are better at translating texts than humans and in fact, even now, to learn, machine translation systems are desperately dependent on human translated texts.

The interesting point about the link above is that while I might agree with the conclusions of the paper, I remain unconvinced by some of the processes that led the authors to those conclusions. To some extent, you could argue that the processes that get automated are the ones that a) cost a lot of people a lot of money and b) are used often enough to be worth automating. It is arguable that for most of industry, translation and interpreting are less commonly required. Many organisations just get around the problem by having an in house working language, for example, and most organisations outsource any unusual requirements.

The other issue is that around translation, there has been significant naiveté – and I believe there continues to be – in terms of how easy it is to solve this problem automatically. Right now we have a data focus and use statistical translation methods to focus on what is more likely to be right. But the extent to which we can depend on that comes down to the available data, and that varies in terms of quantity and quality with respect to language pairs. Without solving the translation problem, I am not sure we can really solve the interpreting problem either given issues around accent and voice recognition. For me, there are core issues around how we enable language for computers and I’ve come to the conclusion that we underestimate the non-verbal features of language such that context and cultural background are lost for a computer which has not acquired language via interactive experience (btw, I have a script somewhere to see about identifying the blockages in terms of learning a language). Language is not just 100,000 words and a few grammar rules.
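A toy illustration of the data dependence, and nothing more than that: the sketch below picks the translation seen most often in some observed data and has nothing to offer for a language pair where no data exists. The counts, the language pairs and the function name are all invented, and real statistical systems are far more involved than a lookup table.

```python
# Invented counts of how often each translation was observed; not real data.
OBSERVED = {
    ("en", "fr"): {"bank": {"banque": 40, "rive": 5}},
    ("en", "ga"): {},  # a language pair with no observations at all
}


def most_likely_translation(word, source, target, observed=OBSERVED):
    """Return (translation, relative frequency), or None when nothing was observed."""
    candidates = observed.get((source, target), {}).get(word)
    if not candidates:
        return None
    total = sum(candidates.values())
    best = max(candidates, key=candidates.get)
    return best, candidates[best] / total


french = most_likely_translation("bank", "en", "fr")
irish = most_likely_translation("bank", "en", "ga")
```

The French lookup returns the most frequent candidate whether or not the sentence was about a river, and the lookup for the pair with no data returns nothing at all, which is the quantity and quality problem in miniature.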

So, back to the question of future work. Technology has always driven changes in employment practices and it is fair to say that the automation of boring repetitive tasks might generally be seen as good as it frees people up for higher level tasks, when that’s what it does. The papers above have pointed out that this is not always the case; that automation occasionally generates more low level work (see for example mass manufacture versus craft working).

The thing is, there is a heavy, heavy focus on suggesting that jobs disappearing through automation of vaguely creative tasks (tasks that involve a certain amount more decision making for example) might be replaced with jobs that serve the automation processes. I do not know if this will happen. Certainly, there has been a significant increase in the number of technological jobs, but many of those jobs are basically irrelevant. The world would not come to a stop in the morning if Uber shut down, for example, and a lot of the higher profile tech start ups tend to be targeting making money or getting sold rather than solving problems. If you look at the tech sector as well, it’s very fluffy for want of a better description. Outside jobs like programming, and management, and architecture (to some extent), there are few recognisable dream jobs. I doubt any ten year old would answer “business analyst” to the question “What do you want to do when you grow up”.

Right now, we see an excessive interest in disruption. Technology disrupts. I just think it tends to do so in ignorance. Microsoft, for example, admit that it’s not necessary to speak more than one language to work on machine interpreting for Skype. And at one point, I came across an article regarding Duolingo where they had very few language/pedagogy staff particularly in comparison to the number of software engineers and programmers, but the target for their product was to a) distribute translation as a task to be done freely by people in return for free language lessons and b) provide said free language lessons. The content for the language lessons is generally driven by volunteers.

So the point I am driving at is that creative tasks which feature content creation, for example carrying out translation tasks or providing appropriate learning tools, are not valued by the technology industry. What point is there in training to be an interpreter or translator if technology distributes the tasks in such a way that people will do them for free? We can see the same thing happening with journalism. No one really wants to pay for it.

And at the end of the day, a job which doesn’t pay is a job you can’t live on.
