Some comments on the march of technology in interpreting

Troublesome Terps did a podcast on remote interpreting about a month ago which I finally found time to listen to yesterday. I won’t go into it in too much detail but a couple of things struck me during the conversation which I wanted to tease out as someone who is a trained interpreter, who likes the actual activity of interpreting simultaneously, and who has a bit of experience working in IT, in fact, quite a bit more than working as an interpreter.

When I listened to the piece, what struck me wasn’t so much the value add that Jonathan Downie discussed (this ties in with a view I’ve expressed elsewhere about how the money in the language industry is not actually in the language bit of the industry per se). It was that the discussion made me think of two companies in particular putting effort into the self driving sector, namely Tesla and Uber, both potentially with a view to having a fleet of self driving cars carrying out the work currently done mainly by cabbies. In the meantime, Tesla are selling you cars and learning from your driving habits and Uber are learning from your public transportation needs.

We haven’t really solved machine translation adequately yet. But it has reached a stage where it is considered “enough” by people who are generally ill qualified to assess whether it is in fact “enough” for their market. Output from Google Translate is considered more than enough by lots of people every day who run newspaper articles through it to get a gist. At least one, if not two, crowdfunding campaigns are pushing simultaneous interpreting systems, often pushing their AI and machine learning credibility to sound attractive. In my view, the end game with remote interpreting is less likely to be industrial parks full of interpreting booths or home interpreting systems, and more likely to be automated interpreting. Remote interpreting allows the expectation of quality to shift.

We would laugh if any human translator translated Ghent to Cork and yet, I have seen Google Translate do this. We would also not pay the human translator for such egregious errors. But Google is free, so meh. We tolerate it and we use it much more than we ever used human translators.

At some point, after the remote interpreting system, someone is going to AI their marketing speech about an interpreting system which cuts out the need for interpreters because Machine Learning System Blah. Both voice recognition and machine learning need to improve radically across all languages to get there to match humans but if we first bring about a situation where lower standards are tolerated (or cannot be identified) then selling a lesser quality product to the consumers of interpreting services becomes easier.

Much of what remote interpreting is bringing now is basically nothing to interpreters – I have a vision of three interpreters handling a conference somewhere in Frankfurt from their kitchens in South Africa, Berlin and somewhere in Clare, and they cannot really talk to each other in terms of who will take what slots, whether someone will catch a bunch of numbers or run out and get a few bottles of water. It seems to me that a lot of what remote interpreting is about forgets that a lot of conference interpreting is not about 1 person doing some interpreting; it’s about a team of people who need contact and coordination in real time. A lot of remote interpreting is around “this market is ripe for disruption” but the disruption is not necessarily being driven by people who know much about what the service actually involves. It misses a lot of context and perhaps it needs to do that because ultimately, the endgame may not be about remote interpreting but non-human interpreting.

 

 

Github alerts on Inbox

A day or so ago, the product team at Google Inbox made some updates in terms of how the application handles email coming from Github. I think they made similar changes to Trello too but I haven’t been using Trello much (tbh I had forgotten about it and it looks like I set up my account 4 years ago) so this probably applies to Trello if you have teams using Trello and as a result, receive lots of emails from Trello. I don’t.

I am watching a dozen or so Github open source projects, however, none of them huge, but a couple of them are relatively active and generating email on a regular basis.

One of the reasons I liked Inbox was that it effectively sorted my email into stuff that was worth annoying me about and stuff that wasn’t. This means that for all those Facebook, Twitter and other automated and mass email sendings, my phone didn’t bother me and I could review those at my leisure, like waiting for stuff to cook or whatever. Github was sorted into the Forums post and this suited me because anyone who needs to check who has updated a Github repo on their phone while they are out is not really the sort of person I tend to consort with.

As of yesterday though, this has stopped. Inbox now informs me every single time I get an email from Github. The sad part about that – for me probably – is that there were a good deal more Github notifications yesterday for one project than a) usual and b) I get from human beings on a day to day basis. As a result, Inbox has been annoying me with Github alerts, alerts which I can only get rid of by unwatching the projects in Github. Amongst the things I cannot tell Inbox to do at the moment is not to send lock screen/audible alerts to the phone for Github originating email.

The way they bundled Github in the Inbox itself is nice. But I cannot understand why it occurred to no one in Google land that enforcing an audible/buzz alert on the phones without a way to switch that off was a stupid, stupid idea which had the potential to wreck Inbox utility for some users. As for anyone who’s subscribed to a lot of Github projects, their phones must be going crazy. Mine was annoying because it meant that a buzz alert no longer meant that I’d gotten actual email from a human for the most part, only that someone somewhere had updated a Git repo. Essentially, my phone started crying wolf over the email it was receiving. It used to alert me to personal/potentially important email. Now it alerts me to email that is definitely not urgent for me, email I want to receive, but do not want a lockscreen alert for.

I sometimes think that people working in the tech sector work inside a bubble and do not have access to a diverse enough pool of users for testing purposes. The first thing I would have said to someone if I’d been testing this is “You have to give users the option to switch off audible and lock screen alerts for these things. For many people, they may represent non-essential, non-urgent email and you’re stripping away the useful meaning of those alerts”.

Up to yesterday I knew that if my phone buzzed an email alert, it was probably something I needed to look at now. As of yesterday, now if it buzzes, it’s probably a Github alert. This does not improve my life.

Code Reviews

This piece on code reviews landed in my email via an O’Reilly newsletter this morning.

I’ve posted a brief response to it but I wanted to discuss it a little further here. One of the core issues with some code reviews is that they focus on optics rather than depth. How does this code look?

There are some valid reasons for having cosmetic requirements in place. Variable names should be meaningful, but in this day and age, that doesn’t mean they also have to be limited to an arbitrary number of characters. If someone wants to be a twerp about it, they will find a way of being a twerp about it no matter what rules you put in place.

However, the core reason for code reviews should be in terms of understanding what a particular bit of code is doing and whether it does it in the safest way possible. If you’re hung up on the number of tab spaces, then perhaps, you’re going to miss aspects of this. If you wind up with code that looks wonderful on the outside but is a 20 carat mess on the inside, well…your code review isn’t understanding what the code is doing and it’s not identifying whether the code does it in the safest way possible.

So what I would tend to recommend, where bureaucratically possible, is that before any code reviewing is done, coding standards are reviewed in terms of whether they are fit for purpose. Often, they are not.

It won’t matter how you review code if the framework for catching issues just isn’t there.

Ad-blocking

One of those simmering arguments in the background has been blowing up spectacularly lately. The advertising industry, and to a lesser extent, the media industry, is up in arms about ad-blocking software. They do not like it and to some extent, you can probably understand this. It does not, exactly, support their industry.

There are two approaches which I think need to be considered. The advertising industry and the media industry, instead of bleating about how stuff has to be paid for, need to consider how they have contributed to this mess. On mobile, in particular, advertising is utterly destroying the user experience. When I wind up with content that I want to read but can’t access because there is a roll over ad blocking it, for which I cannot find a close button, then the net impact is not that I feel a warm fuzzy feeling about the advertiser and the media site in question. The net impact is that I spend less and less time on the media site in question.

So, instead of screaming about how stuff has to be paid for with advertising, maybe the media companies need to recognise how advertising is wrecking their user experience and how, ultimately, that is going to cut their user numbers. The fewer eyes they have, the less their advertising is going to be worth. I have sympathy for their need to pay their bills but at some point, they need some nuance in understanding how the product they are using to pay their bills now will likely result in them being unable to pay their bills at some point in the future.

As for the advertising industry, I have less sympathy. They appear to think they have a god given right to serve me content which I never asked for, don’t really want and which might cost me money to get particularly on mobile data. Often, the ads don’t load properly and block the background media page from loading. They have made their product so completely awful as a user experience that people are working harder than ever before to avoid it. Instead of screaming about how adblockers are killing their business, it would be more in their line to recognise that they have killed their business by making it a user experience which is so awful, their audience are making every effort to avoid it.

The ability to advertise is a privilege, not a right. It would help if advertisers worked towards maximising user engagement on a voluntary basis, because by forcing content on users in the way which is increasingly the norm (full screen blocking ads) they are damaging the brands and the underlying media channels. Maybe advertisers don’t care. Maybe they assume that even if every newspaper in the world closes down, they will still find some sort of a channel to push ads on.

Adblocking software should be reminding them that actually, they probably won’t.

 

All for the want of a nail

Last week, it was announced that the Web Summit would be leaving Dublin and this caused a certain amount of handwringing about the impact this would have on Ireland. There were muttered comments about hotels and room rates as well.

I found it very difficult to get excited and upset about this. The bigger news in Dublin last week was the “delaying” of the interconnector, a massive, massive piece of infrastructure which the city is screaming out for and has been screaming out for since I do not know when. Dublin public transport is painful.

The other issue is that I’ve never really seen the Web Summit as anything to get excited about. I have never seen it as a technology conference and most of the people I know who went to it worked in marketing or were students. I’ve always felt a lot of claims have been made for it but when push came to shove, they really only seemed to mention one big deal that was done there. And it was the sort of deal which doesn’t happen without a lot of advance preparation. In short, it was the sort of deal which I suspect would have happened anyway.

So where does that leave us? Well, I still don’t care about the fact that the Web Summit is relocating to Lisbon. My personal experience of rush hour traffic getting from Lisbon to the airport in Lisbon would not necessarily lead me to believe things are massively better there. But there are linked issues. We don’t really have the infrastructure for a large indoor conference like that (although let’s face it, we do a decent enough job on music festivals and ploughing championships). The question is, would it be worth our while building somewhere to handle conferences and fairs with an attendance of tens of thousands? I have some doubts. If we learned anything much in the last 10 years – and I doubt it was much – it was that Build It And They Will Come is not a recipe for success. I cannot see the point in doing something like that unless we could identify a minimum number of annual events to make it worth our while to build it. I’m not in event management so maybe someone could come up with it. However, I do have certain interests and can say that frankly I have had cause to look at some of the events which go to the Hannover Messehalle and Frankfurt Messe. We’ve a long way to go.

That being said, the big issue I have with the Web Summit is that it’s not a technology conference much and it certainly isn’t tech sector in my view. It’s event management and that’s it. I don’t think it’s a loss to our tech sector and for all that’s said about it, evidence that it has a major impact on our tech sector seems to be scant. Given the ability to shout about it, I find this curious.

If we are to be concerned about the Web Summit at all, it is purely in the context of whether we want to be able to attract large conferences/fairs/shows to Dublin and whether, given our relative isolation, we would be able to. We can be expensive enough to reach if you’re not on a point to point connection. So no decision on whether the loss of Web Summit is good or bad should be made in the context of ochon o my chroi, we’ve lost the Web Summit, but in the hard cold calculations of whether we can, at least 4 times a year, get large numbers of people to come to Dublin for a conference of any description. When I look at the conferences I’m more familiar with such as CeBit in Hanover and the London Stationery Show (I have diverse interests), I recognise that we are nowhere close to even being able to start with these things. And for all the big shows which take place like the various car shows, yes these places have airports. Most of them also have rail connectivity to other urban areas. Put simply, I doubt you could argue in favour of a large conference venue in Dublin absent a high speed train connection to more places than Cork and Belfast. We are not Frankfurt, we are not Hanover. We are not London. Our hinterland is too small for things like this in my view. For all that Portugal has a bigger population, I’m not convinced that they are any better either.

I’m of the opinion that if Web Summit genuinely had a vision of a future where they were huge and could bring that level of audience to Ireland consistently, they would be able to build their own premises as a conference centre. If the demand for large conferences in Ireland was there, it would make them a profit. If it isn’t, it wouldn’t. They’ve very obviously voted with their feet.

GIT and open source, the victory or not

During the week, Wired published a piece under the title Github’s Top Coding Languages Show Open Source Has Won.

This is basically – and I am being diplomatic here – not what Github’s Top Coding Languages shows.

Fundamentally, for Github to show this, every piece of operational code would have to be on Github. It isn’t. I’d be willing to bet less than half of it is, and probably less than a quarter, but that’s a finger in the air guess. Most companies don’t have their code on Github.

What Github’s top ten coding languages show is that these are the ten most popular languages posted by people who use Github. Nothing more and nothing less.

I suspect Github know this. I really wonder why Wired does not.

 

Falling out of love with Amazon

I remember a time when I used to love Amazon. It was back around the time when there was a lot less stuff on the web and it was an amazing database of books. Books, Books, Books.

I can’t remember when it ended. I find the relationship with Amazon has deteriorated into one of convenience more than anything; I need it to get books, but it’s doing an awful job of selling me books at the moment too. Its promises have changed, my expectations have risen and fallen accordingly. Serendipity is failing. I don’t know if it is me, or if it is Amazon.

But something has gone wrong and I don’t know if Amazon is going to be able to fix it.

There are a couple of problems for me, which I suspect are linked to the quality of the data in Amazon’s databases. I can’t be sure of course – it could be linked to the decision making gates in its software. What I do know is it is something I really can’t fix.

Amazon’s search is awful. Beyond awful. Atrocious. A disaster. It’s not unique in that respect (I’ve already noted the shocking localisation failings for Google if you Are English Speaking But You Live In Ireland And Not The United States When Looking For Online Shops) but in terms of returning books which are relevant to the search you put in, it is increasingly a total failure. The more specific your search terms as well, the more likely you are to get what can only be described as a totally random best guess. So, for example, if I look for books regarding Early Irish History, then search results consisting of books on Tudor England are so far removed from what I want that it’s laughable. On 1 May 2015 (ie, day of writing) fewer than a quarter of the first 32 search results refer to Ireland, and only 1 of them is even remotely appropriate.
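To put a number on that last complaint, here is a minimal sketch of the usual "precision at k" measure, in Python. The function name and the list of judgements are my own illustration; the only figures taken from the search above are the 32 results and the single appropriate one, and the position of that one result is made up.

```python
def precision_at_k(relevance_flags, k):
    """Return the fraction of the first k results that were judged relevant."""
    top_k = relevance_flags[:k]
    if not top_k:
        return 0.0
    return sum(top_k) / len(top_k)


# Hypothetical judgements: 32 results, only one of them appropriate.
# Which position it sat in is invented for the example.
judgements = [False] * 32
judgements[4] = True

print(f"precision@32 = {precision_at_k(judgements, 32):.1%}")
```

One appropriate result in 32 is 1/32, or a little over 3%.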

Even if you are fortunate enough to give them an author, they regularly return searches of books not by that author.

I find this frustrating at the best of times because it wastes my time.

Browsing is frustrating. The match between the categories and the books in those categories can be random. The science category is full of new age nonsense and it often is very much best selling so the best sellers page becomes utterly useless. School books also completely litter the categories, particularly in science. I have no way of telling Amazon that I live in Ireland and have no real interest in UK school books, or, in fact, any school books when I am browsing geography.

Mainly I shouldn’t have to anyway. They KNOW I live in Ireland. They care very much about me living in Ireland when it comes to telling me they can deliver stuff. They just keep trying to sell me stuff that really, someone in Ireland probably isn’t going to want. Or possibly can’t buy (cf the whinge about Prime Streaming video to come in a few paragraphs). Amazon is not leveraging the information it has on me effectively AT ALL.

The long tail isn’t going to work if I can’t find things accidentally because I give up having scrolled through too many Key Stage Three books.

Foreign Languages: Amazon makes no distinction between text books and, for want of a better word, non-text books in its Books in Foreign Languages section. So again, once you’ve successfully drilled down to – for example – German – you are greeted with primarily Learn German books and Dictionaries, probably because of the algorithm which prioritises best sellers.

How can I fix this?

Basically, Amazon won’t allow me to fix things or customise things such that I’m likely to find stuff that interests me more. I don’t know whether they are trying to deal with these problems in the background – it’s hard to say because well, they don’t tend to tell you.

But.

  1. It would be nice to be able to reconfigure Treasa’s Amazon. Currently, its flagship item is Amazon Prime Streaming Video, which is not available in Ireland. Amazon knows I am in Ireland. It generally advises me how soon it can deliver stuff to Ireland if I’m even remotely tempted to buy some hardcopy actual book. Ideally they wouldn’t serve their promotions for Amazon Prime Streaming Video, but if they have to inflict ads for stuff they can’t sell me, the least they could do is let me re-order the containers in which each piece of information appears. So I could prioritise books and coffee which I do buy, over streaming video and music downloads which I either can’t or don’t buy from Amazon usually.
  2. It would be nice to be able to set up favourite subject streams in books or music or dvds. I’d prefer to prioritise non-fiction over beach fiction, for example.
  3. I’d like to be able to do (2) for two other languages as well. One of the most frustrating things with the technology sector is the assumption of monolinguality. I’d LIKE to be able to buy more books in German, in fact I’m actively TRYING to read more German for various reasons, and likewise for French.
  4. I don’t have the time to Fix This Recommendation. They take 2 clicks and feature a pop up. As user interaction, it sucks. I’d provide more information for fixing the recommendations if I could click some sort of Reject from the main page and have them magically vanish. Other sites manage this.

But there are core problems with Amazon’s underlying data I think. Search is so awful and so prone to bringing back wrong results, it can only be because metadata for the books in question is wrong or incomplete. If they are using text analysis to classify books based on title and description, it’s not working. Not only that, their bucket classification is probably too broad. Their history section includes a metric tonne of historical fiction, ie, books which belong in fiction and not in history. If humans are categorising Amazon’s books, they are making a mess of it. If machine learning algorithms are, they are making a mess of it.

There is an odd quirk in the sales based recommender which means that I can buy 50 books on computer programming but as soon as I buy one book of prayers as a gift for a relative, my recommender becomes highly religious focused and prayer books outplay programming books. Seriously: 1 prayer book to 50 programming books means you could probably temper the prayer books. Maybe if I bought 2 or 3 prayer books you could stop assuming it was an anomaly. This use of anomalous purchases to pollute the recommendations is infuriating and could be avoided by Amazon not overly weighting rare purchases.
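To be clear, I have no idea how Amazon's recommender actually works. But as a rough sketch of what "not overly weighting rare purchases" could look like, here is a bit of Python that weights each category by its share of my purchase history and ignores one-off purchases. The function name, the category labels and the `min_count` threshold are all assumptions of mine.

```python
from collections import Counter


def category_weights(purchases, min_count=2):
    """Weight each category by its share of purchases, ignoring one-offs.

    purchases: list of category names, one entry per item bought.
    min_count: hypothetical threshold; a category bought fewer times than
               this is treated as an anomaly (a single gift, for example).
    """
    counts = Counter(purchases)
    kept = {cat: n for cat, n in counts.items() if n >= min_count}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {cat: n / total for cat, n in kept.items()}


# 50 programming books and one prayer book bought as a gift.
history = ["programming"] * 50 + ["prayer"]
print(category_weights(history))

# Two more prayer books: no longer an anomaly, but still a small share.
print(category_weights(history + ["prayer", "prayer"]))
```

With the single gift, prayer books carry no weight at all. After three of them they do count, but only as 3 purchases out of 53, which seems about right.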

I’m glad Amazon exists. But the service it has provided, particularly in terms of book buying, is nowhere near as useful as it used to be. Finding stuff I know I want is hard. Finding stuff I didn’t know I wanted but now I HAVE to have is downright impossible.

And this is a real pity because if the whole business of finding stuff I wanted to buy was easier on the book front, I’d be happy to spend money on it. After all, the delivery mechanisms, by way of Kindle etc, have become far, far easier.

Uber, Github and You’ve got to be kidding me

In major goof, Uber stored sensitive database key on public Github page.

via Ars Technica.

Disclosure: I have a Github account, on which I have stored very little. However, I do have a project going in the background to build a terminology database which will be mega simple (I like command lines) and which will have a MySQL database and an interactive Python script to get at the contents of the MySQL database. However, one thing which has exercised my mind is a reminder to myself that when I promote all this to Github (as I might in case anyone else wants a simple terminology database) to ensure that I remove my own database keys.

But this is not a corporate product, or any sort of corporate code. Nobody’s personal data will be impacted if I forget (which I won’t).

In the meantime, Uber, which is probably the highest profile start up, which has money being flung at it right left and centre by venture capitalists, managed to put a database key up on Github.

I don’t understand this. Why is Uber database related information anywhere near Github anyway? If they are planning to sell this as a product, why would you put anything related to it on an open repository?

I like the idea of an online repository for my own stuff. I don’t actually love Github but it’s easy enough to work with and, a bit like Facebook, everyone uses it. But that doesn’t mean any corporate site should allow access to it unless they are open sourcing some code, and even then, any such code really should be checked to ensure it doesn’t present any risk to the corporate security of the company.

Database keys in an open repo: there really is no excuse for this regardless of whether you’re a corporate or an individual.
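For my own terminology database, the simplest way I can think of to keep keys out of the repo is never to write them into the script in the first place, and to read them from environment variables instead. What follows is a minimal sketch of that idea, not a finished design: the `TERMDB_*` variable names and the `load_db_settings` function are made up for illustration.

```python
import os

# Hypothetical variable names; nothing here is specific to any MySQL driver.
REQUIRED = ("TERMDB_HOST", "TERMDB_USER", "TERMDB_PASSWORD", "TERMDB_NAME")


def load_db_settings(env=os.environ):
    """Read connection settings from the environment, failing loudly if any are missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing settings: " + ", ".join(missing))
    return {
        "host": env["TERMDB_HOST"],
        "user": env["TERMDB_USER"],
        "password": env["TERMDB_PASSWORD"],
        "database": env["TERMDB_NAME"],
    }
```

The script would call `load_db_settings()` at start-up and hand the result to whatever database library it uses. The code that goes up on Github then contains the names of the settings but none of the values.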

 

Language skills.

The Economist is shouting about lack of language skills in the UK again. Their basic thesis is that the lack of language skills amongst UK workers costs in economic growth. I’m not sure how much we can stand over that assertion – the Economist admits as much –

This lack of language skills also lowers growth. By exactly how much is hard to say, but one estimate, by James Foreman-Peck of Cardiff University, puts the “gross language effect” (the income foregone because language barriers alter and reduce international trade) in 2012 as high as £59 billion ($90 billion), or 3.5% of GDP.

which suggests it’s basically educated guesswork.

For unrelated reasons, I had a look at CPL’s language vacancies yesterday and the one thing that interested me is how low the salaries are on average.

The simple issue is this: if we do not value language skills economically, people will not study to acquire those skills.

Comparatively, we value programming skills more highly although they are significantly easier to come by. Put simply, the amount of time required to get usefully acquainted with a programming language (including assembler) is significantly less than the amount of time required to get usefully acquainted with a foreign language.

In other words, the return on effort in acquiring foreign language skills to a high level is low compared to the return on effort in acquiring programming skills.

I might have more sympathy for the idea that the economy was suffering by a supposed lack of foreign language skills if foreign language skills related salaries were increasing. The truth is they aren’t, really, because the skills are being imported.