Github and open source, the victory or not

During the week, Wired published a piece under the title Github’s Top Coding Languages Show Open Source Has Won.

This is basically – and I am being diplomatic here – not what Github’s Top Coding Languages shows.

Fundamentally, for Github to show this, every piece of operational code would have to be on Github. It isn’t. I’d be willing to bet less than half of it is, and probably less than a quarter, but that’s a finger in the air guess. Most companies don’t have their code on Github.

What Github’s top ten coding languages show is that these are the ten most popular languages posted by people who use Github. Nothing more and nothing less.

I suspect Github know this. I really wonder why Wired does not.


Future work

Via twitter yesterday, I was pointed to this piece on one of the WSJ’s blogs. Basically it looks at the likelihood that a given job type might or might not be replaced by some automated function. Interestingly, the WSJ suggested that the safest jobs might be in the interpreting/translation industry. I found that interesting for a number of reasons so I dug a little more. The paper that blogpost is based on is this one, from Nesta.

I had a few problems with it so I also looked back at this paper, which is earlier work by two of the authors involved in the Nesta paper. Two of the authors are based at the Oxford Martin institute; the third author of the Nesta paper is linked with the charity Nesta itself.

So much for the background. Now for my views on the subject.

I’m not especially impressed with the underlying work here: there’s a lot of subjectivity in terms of how the underlying data was generated and in terms of how the training set for classification was set up. I’m not totally surprised that you would come to the conclusion that the more creative work types are more likely to be immune to automation for the simple reason that there are gaps in terms of artificial intelligence on a lot of fronts. But I was surprised that the outcome focused on translation and interpreting.

I’m a trained interpreter and a trained translator. I also have postgraduate qualifications in the area of machine learning, with some focus on unsupervised systems. You could argue I have a foot in both camps. Translation has been a target of automated systems for years and years. Whether we are there yet or not depends on how much you think you can rely on Google Translate. In some respects, there is some acknowledgement in the tech sector that you can’t (hence Wikipedia hasn’t been translated using it) and in other respects, that you can (half the world seems to think it is hilariously adequate; I think most of them are native English speakers).

MS are having a go at interpreting now with Skype. As my Spanish isn’t really up to scratch, I’m not absolutely sure that I’m qualified to evaluate how successful they are. But if it’s anything like machine translation of text, probably not adequately. Without monumental steps forward in natural language processing – in lots of languages – I do not think you can arrive at a situation where computers are better at translating texts than humans. In fact, even now, machine translation systems are desperately dependent on human-translated texts to learn from.

The interesting point about the link above is that while I might agree with the conclusions of the paper, I remain unconvinced by some of the processes that delivered those conclusions. To some extent, you could argue that the processes that get automated are the ones that a) cost a lot of people a lot of money and b) are used often enough to be worth automating. It is arguable that for most of industry, translation and interpreting are less commonly required. Many organisations just get around the problem by having an in-house working language, for example, and most organisations outsource any unusual requirements.

The other issue is that around translation, there has been significant naiveté – and I believe there continues to be – in terms of how easy it is to solve this problem automatically. Right now we have a data focus and use statistical translation methods to work out what is more likely to be right. But the extent to which we can depend on that rests on the available data, which varies in quantity and quality from one language pair to another. Without solving the translation problem, I am not sure we can really solve the interpreting problem either, given issues around accent and voice recognition. For me, there are core issues around how we enable language for computers, and I’ve come to the conclusion that we underestimate the non-verbal features of language, such that context and cultural background are lost for a computer which has not acquired language via interactive experience (btw, I have a script somewhere to see about identifying the blockages in terms of learning a language). Language is not just 100,000 words and a few grammar rules.
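
The dependence on human-translated text is easy to illustrate. At its simplest, a statistical system’s translation probabilities are just relative frequencies estimated from an aligned corpus of human translations; no corpus, no probabilities. A toy sketch of the idea (the corpus, the naive word-pairing, and all names are my own invention, not any real system):

```python
from collections import defaultdict

# Toy aligned corpus (invented): without human-translated pairs like
# these, a statistical system has nothing to estimate from.
corpus = [
    ("the house", "das haus"),
    ("the book", "das buch"),
    ("a book", "ein buch"),
]

# Naive co-occurrence counts: pair every source word with every
# target word in the same aligned sentence.
counts = defaultdict(lambda: defaultdict(int))
for en, de in corpus:
    for e in en.split():
        for f in de.split():
            counts[e][f] += 1

# Normalise the counts into translation probabilities p(f | e).
def translation_prob(e, f):
    total = sum(counts[e].values())
    return counts[e][f] / total if total else 0.0

# "book" co-occurs with "buch" in 2 of its 4 word pairings.
print(translation_prob("book", "buch"))  # 0.5
```

Scale the corpus down, or feed it a language pair with little parallel text, and those estimates degrade accordingly – which is exactly the quantity and quality problem across language pairs.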

So, back to the question of future work. Technology has always driven changes in employment practices, and it is fair to say that the automation of boring, repetitive tasks might generally be seen as good insofar as it frees people up for higher-level tasks – when that’s what it does. The papers above have pointed out that this is not always the case; automation occasionally generates more low-level work (see for example mass manufacture versus craft working).

The thing is, there is a heavy, heavy focus on suggesting that jobs disappearing through automation of vaguely creative tasks (tasks that involve rather more decision-making, for example) might be replaced with jobs that serve the automation processes. I do not know if this will happen. Certainly, there has been a significant increase in the number of technological jobs, but many of those jobs are basically irrelevant. The world would not come to a stop in the morning if Uber shut down, for example, and a lot of the higher-profile tech start-ups tend to be targeting making money or getting sold rather than solving problems. If you look at the tech sector as well, it’s very fluffy, for want of a better description. Outside jobs like programming, management, and architecture (to some extent), there are few recognisable dream jobs. I doubt any ten-year-old would answer “business analyst” to the question “What do you want to do when you grow up?”

Right now, we see an excessive interest in disruption. Technology disrupts. I just think it tends to do so in ignorance. Microsoft, for example, admit that it’s not necessary to speak more than one language to work on machine interpreting for Skype. And at one point, I came across an article regarding Duolingo noting that they had very few language/pedagogy staff in comparison to the number of software engineers and programmers, even though the targets for their product were to a) distribute translation as a task to be done freely by people in return for free language lessons and b) provide said free language lessons. The content for the language lessons is generally driven by volunteers.

So the point I am driving at is that creative tasks featuring content creation – carrying out translation, for example, or providing appropriate learning tools – are not valued by the technology industry. What point is there in training to be an interpreter or translator if technology distributes the tasks in such a way that people will do them for free? We can see the same thing happening with journalism. No one really wants to pay for it.

And at the end of the day, a job which doesn’t pay is a job you can’t live on.

Falling out of love with Amazon

I remember a time when I used to love Amazon. It was back around the time when there was a lot less stuff on the web and it was an amazing database of books. Books, Books, Books.

I can’t remember when it ended. I find the relationship with Amazon has deteriorated into one of convenience more than anything; I need it to get books, but it’s doing an awful job of selling me books at the moment too. Its promises have changed, my expectations have risen and fallen accordingly. Serendipity is failing. I don’t know if it is me, or if it is Amazon.

But something has gone wrong and I don’t know if Amazon is going to be able to fix it.

There are a couple of problems for me, which I suspect are linked to the quality of the data in Amazon’s databases. I can’t be sure of course – it could be linked to the decision making gates in its software. What I do know is it is something I really can’t fix.

Amazon’s search is awful. Beyond awful. Atrocious. A disaster. It’s not unique in that respect (I’ve already noted the shocking localisation failings for Google if you Are English Speaking But You Live In Ireland And Not The United States When Looking For Online Shops) but in terms of returning books which are relevant to the search you put in, it is increasingly a total failure. The more specific your search terms, the more likely you are to get what can only be described as a totally random best guess. So, for example, if I look for books regarding Early Irish History, a search returning books on Tudor England is so far removed from what I want that it’s laughable. On 1 May 2015 (ie, the day of writing), fewer than a quarter of the first 32 search results refer to Ireland, and only 1 of them is even remotely appropriate.

Even if you are fortunate enough to give them an author, they regularly return books not by that author.

I find this frustrating at the best of times because it wastes my time.

Browsing is frustrating. The match between the categories and the books in those categories can be random. The science category is full of new age nonsense, and it often sells very well, so the best sellers page becomes utterly useless. School books also completely litter the categories, particularly in science. I have no way of telling Amazon that I live in Ireland and have no real interest in UK school books, or, in fact, any school books when I am browsing geography.

Mainly I shouldn’t have to anyway. They KNOW I live in Ireland. They care very much about me living in Ireland when it comes to telling me they can deliver stuff. They just keep trying to sell me stuff that really, someone in Ireland probably isn’t going to want. Or possibly can’t buy (cf the whinge about Prime Streaming video to come in a few paragraphs). Amazon is not leveraging the information it has on me effectively AT ALL.

The long tail isn’t going to work if I can’t find things accidentally because I give up having scrolled through too many Key Stage Three books.

Foreign Languages: Amazon makes no distinction between text books and, for want of a better word, non-text books in its Books in Foreign Languages section. So again, once you’ve successfully drilled down to – for example – German, you are greeted primarily with Learn German books and dictionaries, probably because of the algorithm which prioritises best sellers.

How can I fix this?

Basically, Amazon won’t allow me to fix things or customise things such that I’m likely to find stuff that interests me more. I don’t know whether they are trying to deal with these problems in the background – it’s hard to say because well, they don’t tend to tell you.


  1. It would be nice to be able to reconfigure Treasa’s Amazon. Currently, its flagship item is Amazon Prime Streaming Video, which is not available in Ireland. Amazon knows I am in Ireland. It generally advises me how soon it can deliver stuff to Ireland if I’m even remotely tempted to buy some hardcopy actual book. Ideally they wouldn’t serve their promotions for Amazon Prime Streaming Video, but if they have to inflict ads for stuff they can’t sell me, the least they could do is let me re-order the containers in which each piece of information appears. So I could prioritise books and coffee, which I do buy, over streaming video and music downloads, which I either can’t or don’t usually buy from Amazon.
  2. It would be nice to be able to set up favourite subject streams in books or music or dvds. I’d prefer to prioritise non-fiction over beach fiction, for example.
  3. I’d like to be able to do (2) for two other languages as well. One of the most frustrating things with the technology sector is the assumption of monolinguality. I’d LIKE to be able to buy more books in German, in fact I’m actively TRYING to read more German for various reasons, and likewise for French.
  4. I don’t have the time to Fix This Recommendation. Fixing one takes 2 clicks and features a pop up. As user interaction, it sucks. I’d provide more information for fixing the recommendations if I could click some sort of Reject button on the main page and have them magically vanish. Other sites manage this.

But there are core problems with Amazon’s underlying data, I think. Search is so awful and so prone to bringing back wrong results, it can only be because the metadata for the books in question is wrong or incomplete. If they are using text analysis to classify books based on title and description, it’s not working. Not only that, their bucket classification is probably too broad-based. Their history section includes a metric tonne of historical fiction, ie, books which belong in fiction and not in history. If humans are categorising Amazon’s books, they are making a mess of it. If machine learning algorithms are, they are making a mess of it.

There is an odd quirk in the sales-based recommender which means that I can buy 50 books on computer programming, but as soon as I buy one book of prayers as a gift for a relative, my recommender becomes heavily focused on religion and prayer books outplay programming books. Seriously: 1 prayer book to 50 programming books means you could probably temper the prayer books. Maybe if I bought 2 or 3 prayer books you could stop assuming it was an anomaly. This use of anomalous purchases to pollute the recommendations is infuriating and could be avoided by Amazon not overly weighting rare purchases.
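
Tempering of this kind is not hard to sketch. I have no idea how Amazon actually weights purchases, so the following is purely my own illustration of the “2 or 3 before you believe it” idea: a category below a repeat threshold is treated as a probable one-off and heavily damped, and the counts and damping factor are invented.

```python
# Toy purchase history (invented): 50 programming books and a single
# prayer book bought as a gift.
purchases = {"programming": 50, "prayer": 1}

# One possible tempering scheme: categories below a repeat threshold
# are treated as probable anomalies and damped before normalising.
def tempered_weights(history, min_count=2, damp=0.05):
    scaled = {cat: (n if n >= min_count else n * damp)
              for cat, n in history.items()}
    total = sum(scaled.values())
    return {cat: s / total for cat, s in scaled.items()}

# One gift purchase barely registers...
print(tempered_weights(purchases)["prayer"])  # ~0.001
# ...but a second or third prayer book earns the category full weight.
print(tempered_weights({"programming": 50, "prayer": 3})["prayer"])  # ~0.057
```

Nothing sophisticated: the point is only that a single anomalous purchase does not have to dominate the recommendations, and repeat purchases can still earn their way in.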

I’m glad Amazon exists. But the service it has provided, particularly in terms of book buying, is nowhere near as useful as it used to be. Finding stuff I know I want is hard. Finding stuff I didn’t know I wanted but now I HAVE to have is downright impossible.

And this is a real pity because if the whole business of finding stuff I wanted to buy was easier on the book front, I’d be happy to spend money on it. After all, the delivery mechanisms, by way of Kindle etc, have become far, far easier.

In search, Google’s localisation seems to be poor

Google are able to identify my location via useful clues like the GPS on my phone and, I suppose, a reverse look-up of the IP from which I connect to the internet. On my computer, Google knows exactly where I am, down to demonstrating my location when I open Google Maps, for example. There are additional clues: I’ve told it, in the past, that I am based in Ireland, and, mostly, when I run a search, it is via the Irish version of the site.

But it has become increasingly useless as far as finding outlets for online shopping goes. Today, I am looking for top-spiral-bound A4 notebooks – we’ll skip why exactly that is the case because it doesn’t matter. Google returns to me, as top search results, companies exclusively in America. This problem is not unique to top-spiral-bound A4 notebooks – I have had similar frustrating experiences with art supplies. There could be a thousand stationery shops in the UK, Ireland, and most of Europe, and Google still seems to think that someone based in Ireland is going to order from companies in the United States of America.

I appreciate some of this is based on search engine optimisation carried out by the companies concerned, but given that Google’s sponsored links are generally regionally appropriate, or at least more so than the first 2 or 3 of its search results, it would help if the organic search results were also regionally appropriate.

There is a wider issue with Google in my experience, however; while it provides services in a large number of languages, and provides online translation facilities, it seems to mainly operate on the assumption that most of its users are monolingual. I generally have an issue with Google News on that front, and have basically set up a feed from Twitter to pull news from a number of different source languages. For all the media organisations which Google News serves, it doesn’t seem to cope well with the idea that people might be more than monolingual.

Bookmarks in Chrome

Google rolled out updated bookmarking to the main version of Chrome lately. It arrived on my desktop a couple of days ago.

I do not usually spend much of my time checking out Chrome betas so I was unaware that this functionality – I use the word reservedly – had been a part of Chrome betas for the last year. I could be obnoxious and say I don’t need it and I didn’t want it. But more to the point, I haven’t worked out what utility it adds for me at all.

I first discovered Google’s Chrome team had done something with bookmarking when someone on Facebook complained that it now took them 4 clicks to bookmark stuff. This was a warning.

I use bookmarks quite a bit. They are also reasonably well organised and I have an overview of how they are organised. All I really want from them is a list of websites and whatever their browser address bar icon is. Oh, and I want as much of an overview of them as possible.

This is not possible with what Google have done. They’ve replaced the list of bookmarks with tiled icons of images pulled from each site. This is a chronic waste of space and vastly reduces the amount of useful information you get on a single page. This is the default.

For anyone who has bookmarks sorted in folders, the folder list display now has a significant amount of white space between the folder names, effectively halving the number of folders that you can have an overview of at any one time.

In addition, they have gotten rid of the folder tree option, which, if you’ve actually organised your bookmarks into folders and subfolders, means there’s a lot of information you cannot access any more. Subfolders appear as tiles on the righthand side of the dividing bar instead.

Google have provided a list view. However, this still doesn’t give you the tree overview, and more to the point if you are clicking on subfolders, the change of display from one subfolder to its contents and vice versa is animated. It is enormously distracting and I hate it.

The interface for bookmarking items via the star icon in the browser bar has been changed and now includes a large image which is not exactly necessary, clutters the interface and wastes space.

There is a method under the hood where you can configure Chrome not to use this clinically insane change – it’s not a user enhancement – and I will apply it. But I cannot count on Google to leave that backdoor option in place.

It’s one thing to provide what they think is enhanced utility (and it is entirely likely that for some people, the tiled display is useful – it just isn’t for me, as I prefer a tree list and a set of icons, none of it animated). It’s quite another to remove the layout that people already relied on.

Other complaints include the fact that you can no longer sort bookmarks alphabetically. Google expects you to search these things, you see.

Google have a product page where this is being discussed. Feedback is universally negative. They have said they want feedback through the gears icon in Bookmark Manager where apparently 25% of the feedback is positive. That’s still a lot of negative feedback.

Ultimately, Chrome is Google’s product, and they provide it for free, so yes, if they want to make changes that annoy the wider user community, they can. It is also unclear how much of the wider community is impacted by this. The extent to which people use bookmarks varies, and the underlying methods by which people use them vary. Google is happy enough to annoy a few million people when it suits them (Google Reader is a key example of that). Presumably, they are going somewhere with this that is not completely clear to Chrome users at the moment. I’d have to hope they are because otherwise, they’ve foisted a change for change’s sake, and reduced and wrecked usability, all for the sake of shiny and new.

The thing is, it’s possible that in fact, the sake of shiny and new is what drove this. The technology sector has forgotten that it’s basically a support industry and thinks it’s now a disruptive industry.


Uber, Github and You’ve got to be kidding me

In major goof, Uber stored sensitive database key on public Github page.

via Ars Technica.

Disclosure: I have a Github account, on which I have stored very little. However, I do have a project going in the background to build a terminology database which will be mega simple (I like command lines) and which will have a MySQL database and an interactive Python script to get at its contents. However, one thing which has exercised my mind is a reminder to myself that when I push all this to Github (as I might, in case anyone else wants a simple terminology database), I must ensure that I remove my own database keys.
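
The usual way to make that removal automatic rather than a thing you have to remember is to never put the credentials in the script in the first place: read them from the environment (or from a local config file listed in .gitignore). A sketch of what I mean for the terminology project – the variable names here are my own invention, not any standard:

```python
import os

# Hypothetical helper: pull MySQL credentials from the environment
# rather than hard-coding them in a script that will be pushed to
# Github. Missing required variables fail loudly at start-up.
def load_db_config():
    missing = [v for v in ("TERMDB_USER", "TERMDB_PASSWORD")
               if v not in os.environ]
    if missing:
        raise RuntimeError(f"set these environment variables first: {missing}")
    return {
        "host": os.environ.get("TERMDB_HOST", "localhost"),
        "user": os.environ["TERMDB_USER"],
        "password": os.environ["TERMDB_PASSWORD"],
        "database": os.environ.get("TERMDB_NAME", "terminology"),
    }
```

With that, the repo only ever contains code, and the keys live on the machine running it – which is the discipline Uber apparently didn’t apply.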

But this is not a corporate product, or any sort of corporate code. Nobody’s personal data will be impacted if I forget (which I won’t).

In the meantime, Uber, probably the highest-profile start-up around, with money being flung at it left, right and centre by venture capitalists, managed to put a database key up on Github.

I don’t understand this. Why is Uber database related information anywhere near Github anyway? If they are planning to sell this as a product, why would you put anything related to it on an open repository?

I like the idea of an online repository for my own stuff. I don’t actually love Github but it’s easy enough to work with and, a bit like Facebook, everyone uses it. But that doesn’t mean any company should allow access to it unless they are open sourcing some code, and even then, any such code really should be checked to ensure it doesn’t present any risk to the corporate security of the company.

Database keys in an open repo: there really is no excuse for this regardless of whether you’re a corporate or an individual.


A Magna Carta for Big Data

I need first to provide a disclaimer: I did my MSc in CompSci at University College Dublin which is one of the universities providing a home to the Insight Centre. And LinkedIn sent me the vacancy for Oliver Daniels’ job several times as a vacancy for which I was suitable. I know some of the Insight people and I have a particular amount of respect for the senior ones I know both in UCD and UCC.

With that out of the way, Oliver Daniels wrote a piece for the Huffington Post which I have some reservations about.

The data industry has to stop seeing itself as Big Data. The term is loaded. When people talk about Big Pharma, they are talking about the pharmaceutical industry acting in its best interests (and not yours), and when they talk about Big Ag, they are talking about the agricultural-industrial complex acting in its best interests, not yours and not the environment’s. Big X is never a positive label for X. It implies a behemoth which really has no interest in your interests. I hate the term Big Data for this reason. It has never really meant serious data analytics, only a marketing tool for people who genuinely aren’t interested in data, but in buzzwords. Big Data is turning toxic.

If you read Oliver Daniels’ piece about a Magna Carta for Big Data, it is obvious that he is not looking for a Magna Carta for you or me, but for the right of large scale data analytics companies to have access to and use your data. There are a lot of benefits to large scale analytics but it is a stretch to call it a charter of rights when you have to give them access to your data, and they promise not to sell it to AN Other Company. The example in the Daniels piece relates to health data specifically, and the risk of sale of same to insurance companies.

Unlike Oliver Daniels, I have always known my mother’s age, and indeed, my father’s age and so I won’t be using either as an emotional hook on which to demand that people make their data available. What I would like to see Insight, and organisations attempting to be active in the health analytics side do is recognise that the vast majority of people, while not analytics experts, are not necessarily stupid. And I have issues with statements like this:

Healthcare has always been about data analytics, only now we have access to so much more data.

The thing is we don’t. We can certainly generate more data, but we don’t necessarily have the right to use it. When Oliver Daniels is talking about a Magna Carta for big data, he is looking for the right to use it, framed in a way that suggests my rights are protected. This might be viable if the data industry – and hardly any company is not a data company at this stage – had an even remotely sane record on not losing data.

There is no point in saying “and we promise your data won’t be released to AN Other Company you don’t approve of” when all over the world, vendors are getting hacked, losing data, losing laptops, spending a small fortune writing to customers suggesting they get their credit cards reissued, re-enacting U2 videos by beating their chests and being sorry. Really Sorry. Very, really sorry. We lost your data.

I have already written about the cost of messing up individuals in the quest of getting access to their health data in the past.

Oliver Daniels writes:

We need the public to feel trust when they hand over details about their health.

Even if we were to take the view that of course you can have everything you want, we trust you completely not to misuse the data, the simple truth is that we already know that large scale data sites have been hacked in highly public manners. I have correspondence from Adobe apologising for losing a lot of data. I have correspondence from any number of online data centric companies explaining that they have allowed their perimeters to be breached. The data industry has simply not earned the right to respect in terms of practically protecting data.

It would be an overarching, policy-led document that describes what we want, and don’t want, from Big Data. It is a document that would put citizens at the centre of the Big Data age, and ensure that the technology develops with democracy and human rights as guiding principles.

The Magna Carta was a document of rights, not a policy document. What Oliver Daniels wants is not so much a charter of rights for humanity as a bill of rights for Big Data (he uses the term; I think he should move away from it) to have access to humanity’s data. The regulatory framework at the moment, piecemeal as it might be, errs – in Europe, in particular – on the side of the individual, not the gathering of large datasets.

You know this is what he is looking for with this:

A Magna Carta for Data would not be a list of protectionist rules about privacy triggered by court cases and data infractions.

A Magna Carta for Data is not a Magna Carta for owners of data.

You know this when he says this:

The Magna Carta would not enshrine privacy measures that risk bringing enlightened data research to a standstill.

The core objective of this measure is not to balance the rights of humans who generate data against those of the companies and organisations which want to exploit that data. It is to make it easier to get access to that data. And it uses the argument that privacy concerns have already been left behind by big data.

I have a couple of issues with this. At this stage, I’d like senior managers who genuinely believe in the benefits of large scale data analytics to stop calling it Big Data. It is a toxic term with strongly negative connotations.

I also take issue with describing this as a Magna Carta for Data. This is a marketing metaphor and nothing else. It is not even appropriate in the context of trying to get people to give up some existing privacy rights – rights which are not negated just because you claim they are.

I would like the data industry to understand that to date, they have already made demonstrable screw ups, both in the private sector (Target and Adobe as two examples) and the public sector (the NHS mess with attempting to sell to the public).

I have a lot of time for data analytics and in particular, the machine learning side of things. I honestly believe there is a lot of insight to be gained from it. But equally, I believe that there is no god-given right of access to this data, and I’d like practitioners of big data to pay more attention to the fact that a lot of what they are trying to do has been done by statisticians who recognise the underlying problems with large scale analytics. The fact that you’ve 10 billion records does not automatically mean you have a wholly representative sample or, indeed, a viable model. Tim Harford has an illustrative piece here.

I’ve done some work with large datasets. I’m fully aware of the benefits of being able to get a picture of the behaviour of system components over time – such as buses running ahead of or behind schedule. But I’m also aware of the risk of assuming technology gives us more exact pictures of reality. The garbage in, garbage out principle will always exist, as will the cartoon I saw more than twenty years ago with the tagline “The beauty of computers is that you can screw up so much more precisely”.

More than anything, I want people in the industry to stop playing with marketing tags like Magna Carta for data and Big Data. Neither of these instil much confidence. I’d hate to see the benefits of health analytics killed by pretending these things can be simplified down to a Universal Declaration of Data Rights.


Technical debt and the bazaar

Over the festive period, one of my friends posted this to his twitter feed. I found it interesting for several reasons, not least because I’ve been wondering about things like minimum viable products, lack of direction, code bloat and how computer science and programming seems to be full of decisions to be made lately.

I am not a Unix systems expert per se – I run my own Linux desktops from time to time – but I can identify with some of the comments in this regarding quality, and regarding dependencies which aren’t really dependencies from a functional point of view. I wonder sometimes if we need to take a step back and just… clean up programming.

If that resistance/ignorance of code reuse had resulted in self-contained and independent packages of software, the price of the code duplication might actually have been a good tradeoff for ease of package management. But that was not the case: the packages form a tangled web of haphazard dependencies that results in much code duplication and waste.

I’m not necessarily talking about refactoring (although that does help sometimes) but our outlook on how we get things done.

It is part of wider thoughts I have about technology.


And I would walk 10,000 hours

Once, there was a study involving professional violinists, the outcome of which resulted in the world being told that if you want to get really good at something, you need to do it for 10,000 hours.

The internet is full of productivity and professionalism tips, and every once in a while, I come across this “and you need to do something for at least 10,000 hours before you get good at it”. As the original study related specifically to a very small proportion of people talented in a very specific skill, I’ve never felt happy about it being generalised to, for example, programming or Linux administration.

I want to live in a world where the driving force underpinning people’s work is an attachment to that world, and not the ticking off of hours. You can do something for 10,000 hours and tick off those hours but if you don’t do it with passion, you’ll get limited benefit from those 10,000 hours.


Nanodegrees and Udacity

A few days ago, an email dropped into my inbox from Udacity support, signed by Sebastian Thrun, talking about the new sort of university and credentials Udacity have put together for industry. The credentials, if you haven’t heard of them, are called nanodegrees. I have no idea who came up with this idea and, having looked at the work required for one or two of them, I don’t see why “diploma” wasn’t good enough for them, or “certificate”.

There’s an ongoing debate between industry and universities, centred on the claim that the latter doesn’t do enough to prepare graduates for work. I don’t think it’s quite that simple, and I don’t think that Udacity’s nanodegrees fix the problem.

I graduated for the first time in 1994. At that stage, it was accepted, pretty much across the board, that graduates, regardless of their discipline, need what the Germans would probably call an Einarbeitungsphase: a period of time to get someone up to speed. This is reality for everyone, not just graduates. No one is immediately useful in the grand scheme of things.

The technology industry in particular seems to have a problem with this: it tends to look not for people who are broadly educated, but for people who have a highly narrow skillset. If you look at any of the software engineering ads that float around, there tends to be a focus on particular programming languages, when in all honesty what the average tech company should be looking for is someone who thinks programmatically and solves problems. At the end of the day, it doesn’t really matter what languages they program in, because just about every programming shop has, or should have, some sort of local style guide that – oh wait – you need a bit of time to get up to speed with. Programming languages are, for any competent programmer, easy to learn.
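The point is that the problem-solving is the transferable part and the syntax is not. Here is a deliberately trivial example in Python – counting word frequencies – where the thinking (normalise, split, tally, sort) is the whole job; a Java or Go version of the same function differs only in surface syntax:

```python
def word_frequencies(text):
    """Tally words in text, most frequent first, ties broken alphabetically."""
    counts = {}
    for word in text.lower().split():
        word = word.strip(".,;:!?")   # normalise away trailing punctuation
        if word:
            counts[word] = counts.get(word, 0) + 1
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(word_frequencies("the cat sat on the mat. The cat slept."))
# → [('the', 3), ('cat', 2), ('mat', 1), ('on', 1), ('sat', 1), ('slept', 1)]
```

A candidate who can design that tally-and-sort logic can express it in whatever language the shop’s style guide demands; a candidate who has only ever memorised one language’s syntax cannot necessarily do the reverse.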

If you are hiring someone and focusing not on their broader outlook on the industry, you aren’t really hiring for the future. If you’re looking only at what a candidate offers in the short term, and not at the value you can add to them, this suggests your long-term future is a problem.

What I’ve seen of the Udacity nanodegrees (and I really do hate that term) is that they are skills-based and a bit narrow in focus. The fact that a few of Udacity’s customers offer them to their staff is a good thing. The idea that they replace a broad-based university qualification is not.

Sebastian mentioned in his email that the skills gap is broadening ever faster, leaving people on the sidelines. I don’t think this is untrue per se, but I also think that part of the problem is that employers are not addressing it effectively. Ongoing professional development is a desirable thing; with that, you need to recognise that staff must be supported in doing it (and Udacity is good in this respect), but also that this means there are very few ready-made, hit-the-ground-running candidates anywhere.

In IT support, ongoing professional development is absolutely critical, and here’s why. A lot of the skills in support are vendor-specific. Tasks in IT can be quite general – keep a network running, keep security in place – but the skill set required can be vendor-specific and not immediately transferable from one vendor’s systems to another’s.

There was a time when having a degree in computer science covered a multitude of possible jobs because it covered a multitude of related subjects. Those days are gone, and they are gone because technology employers no longer filter on the basis of knowledge, but on the basis of limited skills. Alternative qualifications may deal with the skills issue, but not necessarily the broader-picture issue. I’ve seen it with people who learn to program in – for want of an example – Java, but not in the context of what computers actually do.

Learning to program in a specific language is an easily acquired skill for someone who actually understands the broader picture. Industry needs to focus on identifying these people and recognise that they provide a lot more value than someone whose only skill is having learned one specific programming language. It means recognising that the learning phase does not end at the date of graduation. In many respects, it starts there.