ALL NEW ENTRIES WILL APPEAR ON WWW.TREASALYNCH.COM.
Alex Ross recently wrote a piece for the Wall Street Journal basically suggesting that in 10 years’ time, we will have something like a Babelfish.
I did not agree much with the piece, for various reasons, but I have been busy for the last few weeks. Frankly, this discussion (podcast based) by three interpreters demonstrates the possible gaps in understanding what is required here, and where our technology actually is in reality, better than I have time to draft at the moment.
Way back in the history of computer science, computer programmers who were unfamiliar with the world outside the English language created an encoding system that couldn’t handle accented characters. One of the points that struck me about this is that a significant amount of the progress in NLP is in recognising English…but we need significantly more work across a lot more languages.
Many things have been forecast for some time down the road. Those things rarely arrive, or if they do, they arrive in an entirely different shape to what we predicted.
I will say that I don’t agree that interpreters sell themselves as being dictionaries on legs – I’ve always seen interpreters as a bridge.
Here’s a thing. I wanted to build a small utility to automate a task which would be handy, which I don’t need right now, but which I reckon would take about 8-10 hours to build in Python. So as I have some time, I’m doing it now.
For it to do what I want, I need the script to be able to read and write to a MySQL database. I chose that one because MySQL is open source and also because compared to Oracle 11g it uses fewer resources on my laptop. This is not going to be a big utility and I really don’t need serious heavy lifting at this point in time. But I do need the MySQL Python connector library.
So far, so good. I don’t have the connector library installed, and need to go and get it from Oracle.
To do this, I need to sign into Oracle. Fine. Password forgotten, so password reset, nuisance, but there you go. It’s a fact of life with things like this.
Once signed in, oh wait, now I have to answer some survey. They want to know what I’m using it for, what industry sector, how many employees, what sort of application, and then they offer me a list of reasons for which they can contact me further. Not on the list is “You don’t need to contact me”.
I’m not trying to download MySQL. I already have it installed. I just want a library that will enable me to write some code to connect a Python script to an existing install.
Downloading a single library really should be a lot easier.
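For what it’s worth, once the connector is finally installed, the code side of this is small. Here is a minimal sketch, assuming the library is importable as `mysql.connector` and that a MySQL server is already running locally. The function names `build_config` and `fetch_rows` are mine, for illustration only, and the credentials and database name are placeholders.

```python
def build_config(user, password, database, host="127.0.0.1", port=3306):
    """Collect connection settings for an existing local MySQL install."""
    return {
        "host": host,
        "port": port,
        "user": user,
        "password": password,
        "database": database,
    }


def fetch_rows(config, query, params=()):
    """Run one query and return all rows; needs a running MySQL server."""
    # Imported here so the rest of the script still loads if the
    # connector library has not been installed yet.
    import mysql.connector

    conn = mysql.connector.connect(**config)
    try:
        cursor = conn.cursor()
        # Parameters are passed separately rather than formatted into the SQL.
        cursor.execute(query, params)
        return cursor.fetchall()
    finally:
        conn.close()


config = build_config("me", "secret", "utility_db")  # placeholder values
```

A call such as `fetch_rows(config, "SELECT id, name FROM items WHERE id = %s", (1,))` would then read from a (hypothetical) `items` table.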
The Samaritans have designed a new app that scans your friends’ twitter feeds and lets you know when one or other of them might be vulnerable so you can call them and maybe prevent them from committing suicide.
It has caused a lot of discussion and, publicly at least, the feedback to the organisation is not massively positive. I have problems with it on several fronts.
By definition, it sounds like it is basically doing some sort of sentiment analysis. Before we ever get into the details of privacy, consent, and all that, I would say “stop right there“.
Sentiment analysis is highly popular at the moment. My twitter feed gets littered with promoted tweets for text mining subjects. It is also fair to say that its accuracy is not guaranteed. Before I’d even look at this application, I would want to know on what basis the application is assessing tweets as evidence of problems. We’ve seen some rather superficial sentiment analysis done in high profile (and controversial as a result) studies in the past, including that study by Facebook, for example. Accuracy in something like this is massively important and unfortunately, I have absolutely no faith that we can rely on this to work.
According to the Samaritans:
Our App searches for specific words and phrases that may indicate someone is struggling to cope and alerts you via email if you are following that person on Twitter.
The Samaritans include a list of phrases which may cause concern, and, on their own, yes, they are the type of phrases which you would expect to cause concern. But it’s not clear how much more granular the underlying text analysis is, or on what basis their algorithm works. This is something which Jam, the digital agency responsible for this product, really, really should be far more open about.
In principle, here is how the application works. A person signs up to it, and their incoming feed is constantly scanned until a tweet from one of their contacts trips the algorithm and the app generates an email to the person who signed up to say one of their contacts may be vulnerable.
This may come across as fluffy and nice and helpful and it is if you avoid thinking about several unpleasant factors.
- Just because a person signs up to Radar does not mean their friends signed up to have their tweets processed and acted upon in this way.
- Textual analysis is not always correct and there is a risk of false positives and false negatives.
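To make the false positive and false negative point concrete, here is a minimal sketch of what plain phrase matching does. This is an assumption about the general approach described in the quote above, not the app’s actual algorithm, and the phrases in `TRIGGER_PHRASES` are hypothetical ones I made up rather than the Samaritans’ real list.

```python
# Hypothetical trigger phrases for illustration; not the app's real list.
TRIGGER_PHRASES = ("had enough", "can't cope", "hate myself")


def is_flagged(tweet, phrases=TRIGGER_PHRASES):
    """Return True if the tweet contains any trigger phrase (case-insensitive)."""
    text = tweet.lower()
    return any(phrase in text for phrase in phrases)


tweets = [
    # Harmless grumbling, but it contains a trigger phrase: a false positive.
    "I've had enough of this train being late every morning",
    # Possibly worrying in context, but no trigger phrase: a false negative.
    "Everything is fine. Really. Don't worry about me.",
]

flags = [is_flagged(tweet) for tweet in tweets]
print(flags)
```

The first tweet trips the check and the second does not, which is exactly the wrong way round for anyone hoping this tells them something about a friend’s state of mind.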
Ultimately, my twitter account is public and always has been. You will find stuff about photographs, machine learning, data analysis, the tech industry, the weather, knitting, lace making, general chat with friends. I’m aware people may scan it for marketing reasons. I’m less enthusiastic about the idea of people a) scanning it to check my mental health and b) prompting a friend or acquaintance to act on the result without any consideration of whether I agree to that.
It also assumes that everyone who actually follows me on Twitter is a close friend. This is an incredibly naive assumption given the nature of Twitter. 1200 people follow me from various worlds on which my life touches including data analysis and machine learning. Many of them are people I have never, ever met.
One of the comments on the Samaritans’ site about this is telling:
Unfortunately, we can’t remove individuals as it’s important that Radar is able to identify their Tweets if they need support.
Actually this isn’t true any more because a lot of people on Twitter made it clear they weren’t happy about having their tweets processed in this way.
Effectively, someone thought it was a good idea to opt a lot of people into a warning system without their consent. I can’t understand who would be so missing the point.
Anyway, now there is a whitelist you can use to opt out. Here’s how that works.
Radar has a whitelist of Twitter handles for those who would like to opt out of alerts being sent to their Twitter followers. To add yourself to the Samaritans Radar whitelist, you can send a direct message on Twitter to @samaritans. We have enabled the function that allows anyone to direct message us on Twitter, however, if you’re experiencing problems, please email: firstname.lastname@example.org
So, I’ve never downloaded Radar, I want nothing to do with it, but to ensure that I have nothing to do with it, I have to get my Twitter ID put on a list.
In technical terms, this is a beyond STUPID way of doing things. There’s a reason people do not like automatic opt-in on marketing mail and that’s with companies they’ve dealt with. I have no reason to deal with the Samaritans but now I’m expected to tell them they must not check my tweets for being suicidal otherwise they’ll do it if just one of my friends signs up to Radar? And now, the app, how does it work, check the text or the userid first? If the app resides on a phone, does it have to call home to the Samaritans every single time to check an updated list? What impact will that have on data usage?
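On the “text or userid first” question, I have no idea what the app actually does, but here is a minimal sketch of the order I would expect if the whitelist is to mean anything: check the handle first, and never look at the text of someone who has opted out. The function name `should_alert`, the handles and the phrase are all hypothetical.

```python
def should_alert(handle, tweet, whitelist, phrases):
    """Decide whether a tweet may trigger an alert.

    whitelist is assumed to hold lower-case handles without the leading "@".
    """
    # Normalise the handle so "@Someone" and "someone" compare equal.
    if handle.lstrip("@").lower() in whitelist:
        return False  # opted out: the tweet text is never examined
    text = tweet.lower()
    return any(phrase in text for phrase in phrases)


whitelist = {"opted_out_user"}   # hypothetical opted-out handle
phrases = ("had enough",)        # hypothetical trigger phrase

print(should_alert("@Opted_Out_User", "I've had enough", whitelist, phrases))
print(should_alert("@someone_else", "I've had enough", whitelist, phrases))
```

Even in this toy version, the check is only as good as the copy of the whitelist it is run against, which is why the question of where the list lives and how often it is refreshed matters.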
Ultimately, the first problem I have with this is I’m dubious about relying on text analysis for anything at all, never mind mental health matters, and the second problem I have is that the Samaritans don’t appear to understand that just because my tweets are public does not mean I want an email sent to one of my friends suggesting they need to take action re my state of mental well being.
The Samaritans have received a lot of negative feedback on twitter about it. Various other blogs have pointed out that the Samaritans really should have asked people’s permission before signing them up to some early warning system that they might not even know exists, and that the annotating of tweets generates data about users which they didn’t give permission to be generated.
So they issued an updated piece of text trying to do what I call the “there there” act on people who are unhappy about this. It does nothing to calm the waters.
We want to reassure Twitter users that Samaritans does not receive alerts about people’s Tweets. The only people who will be able to see the alerts, and the tweets flagged in them, are followers who would have received these Tweets in their current feed already.
Sorry, not good enough. I don’t want alerts generated off the back of my tweets. Don’t do it. It’s bold. Also, don’t ask me to stop it happening because I never asked for it to happen in the first place. It’s, a bit, Big Brother Is Watching You. It’s why at some point, people will get very antsy about big data.
Having heard people’s feedback since launch, we would like to make clear that the app has a whitelist function. This can be used by organisations and we are now extending this to individuals who would not like their Tweets to appear in Samaritans Radar alerts.
Allowing individuals to opt out of this invasive drivel was not there by default (in fact they made it clear they didn’t want it) and now to get out of it, they expect twitter users to opt out. I have to make the effort to get me out of the spider’s web of stupidity. The existence of a whitelist is not a solution to this problem. People should not have to opt out of something that they never opted into in the first place. Defaulting the entirety of twitter into this was a crazy design decision. I’m stunned that Twitter didn’t pull them up on this.
It’s important to clarify that Samaritans Radar has been in development for well over a year and has been tested with several different user groups who have contributed to its creation, as have academic experts through their research. In developing the app we have rigorously checked the functionality and approach taken and believe that this app does not breach data protection legislation.
- I want to see the test plans and reports. It sounds to me like it never included checking whether people wanted this in the first place
- Name the academics.
- They cannot possibly have claimed to have checked the functionality and approach when almost the first change they’ve had to make is broaden access to the whitelist
- Presumably the app is only available in the UK but does it check whether the contacts are in the UK?
Those who sign up to the app don’t necessarily need to act on any of the alerts they receive, in the same way that people may not respond to a comment made in the physical world. However, we strongly believe people who have signed up to Samaritans Radar do truly want to be able to help their friends who may be struggling to cope.
Yes, but the point is that the app may not be fully accurate – I would love to know how they tested its accuracy rates, to be frank – and additionally, the people whose permission they need are not the people who sign up to Radar, but the people whose tweets get acted on. Suggesting “People may not do anything” is logically a stupid justification for this: the app is theoretically predicated on the idea that they will.
So here are two questions:
Do I want my friends getting email alerts in case I’m unlucky enough to post something which trips a text analysis tool which may or may not be accurate? The answer to that question is no.
Do I want to give my name to the Samaritans to go on a list of people who are dumb enough not to want their friends to check up on them in case things are down? The answer to that question is no.
I’m deeply disappointed in the Samaritans about this. For all their wailing that they talked to this expert and that expert, it’s abundantly clear that they don’t appear to have planned for any negative fall out. They claim to be listening and yet there’s very limited evidence of that.
You could argue that there needs to be serious research into examining how accurate the tool is in identifying people who need help; there also needs to be understanding that even if, to the letter of the law in the UK, it doesn’t break data protection, there are serious ethical concerns in this. I’d be stunned if any mental health professional thought that relying on textual analysis of texts of 140 characters was a viable way of classifying a person as being in need of help or not, even if you could rely on textual analysis. This application, after all, is credited to a digital agency, not a set of health professionals.
If I were someone senior in the Samaritans, I’d pull the app immediately. It is brand damaging – and that may ultimately have fundraising issues as well. I would also talk to someone seriously to understand how such a public relations mess could have been created. And I would also ask for serious, serious research on the use of textual analysis in terms of identifying mental health states and without it, I would not have released this.
It is one of the most stupid campaigns I have seen in a long time. It is creepy and invasive and it depends on a technology which is not without its inadequacies here.
Someone should have called a halt before it ever reached the public.
I have some doubts about the effectiveness of anything which depends heavily on natural language processing at the moment – I think there’s a lot of interest in the field but I don’t really think it has reached a point of dependability. One of the highest profile – I hesitate to use the word experiment – pieces of work this year, for example, included this comment:
Posts were determined to be positive or negative if they contained at least one positive or negative word, as defined by Linguistic Inquiry and Word Count software (LIWC2007) (9) word counting system, which correlates with self-reported and physiological measures of well-being, and has been used in prior research on emotional expression (7, 8, 10)
(Experimental evidence of massive-scale emotional contagion through social networks, otherwise known as the Facebook emotion study)
Anyway, the reason I am writing about this again today was that this piece from Forbes turned up in my twitter feed and the line which caught my eye was this:
Terms like “overhead bin” and “hate” in the same tweet, for example, might drive down an airline’s ranking in the luggage category while “wifi” and “love” might drive up the entertainment category.
Basically, the piece is a bit of a puff piece for a company called Luminoso, and it has as its source this piece from Re/Code. Both pieces are talking about some work Luminoso did to rate airlines according to the sentiment they evoke on twitter.
If you look at the quote from the Facebook study above, the first thing that should stand out immediately is that under their stated criteria, it is clearly possible for a piece of text to be both positive and negative at the same time. All it has to do is feature one word from each of the positive and negative word lists. Without seeing their data, it is hard to make a call on how often that happened, whether they checked for it, whether they controlled for it, or whether they excluded such posts. The Forbes quote above likewise is worryingly simplistic in terms of understanding what needs to be done.
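The “at least one positive or negative word” criterion is easy to sketch. What follows is a minimal illustration of that rule as stated in the quote, not the study’s actual code. The two tiny word lists are made up for the example and have nothing to do with the real LIWC2007 dictionaries.

```python
import re

# Made-up word lists for illustration; the real LIWC lists are not reproduced here.
POSITIVE = {"love", "great", "happy"}
NEGATIVE = {"hate", "awful", "sad"}


def classify(post):
    """Return (is_positive, is_negative) under the 'at least one word' rule."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    return bool(words & POSITIVE), bool(words & NEGATIVE)


print(classify("I love the wifi but hate the overhead bin"))
print(classify("The flight was on time"))
```

Under this rule the first post counts as positive and negative at once, and the second counts as neither, so any tally built on it depends heavily on how such posts are handled.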
This is Luminoso’s description of their methodology. It doesn’t give away very much but given that they claim abilities in a number of languages, I really would not mind seeing more about how they are doing this.
I have spent a lot of the last 24 hours reading discussions on the subject of Ubuntu and Unity in particular. I had (and have again) a Linux Mint install but following issues linked to the screen lock with processes running in a Python window, I temporarily switched over to Ubuntu.
In the time that it was installed, I discovered user interface design decisions which appeared to be made with no consideration of users, and it crashed a couple of times. It’s gone and I have gone back to Mint, reconfigured screen savers (ie, switched them and screen sleep completely off) and the issues which I had previously do not appear to have (yet) remanifested themselves.
But Ubuntu…Someone in Canonical thought it was a good idea to a) remove the application menu from the application window and b) put it on the global menu at the top of the screen and c) hide it.
The first time this charade manifested itself was with Sublime Text – my text editor of choice for most serious work – and I could not find the menu. It’s one thing to take it away from the application window – unwise in my view but not unknown and probably tolerable. Hiding it was not.
I know that Canonical have done something about this with 14.04, which was released very recently. But this fiasco has been reality for a few years now and a lot of people screamed blue murder about it. It may be a small and cosmetic thing but it interferes with usability. It may seem overdramatic but it is the one single feature of Ubuntu that made me decide that the desktop environment was unusable for me. Its key outcome was to make software I wanted to use and was reasonably familiar with much, much harder to use. The fact that it took nearly 3 years for some sort of a fix isn’t really that edifying to be honest and few people are going to put the very newest version of a piece of software on when a) they know it’s about a week or two in release and b) they need some form of stability.
I’m aware that Ubuntu’s response to criticisms of Unity has been to recommend other distros. When I come to Ubuntu as a new user, that doesn’t really make me feel that Ubuntu is particularly interested in dialogue with its users. No matter how free your stuff is, no one is going to want to use it if they think they are being stomped on.
The other thing which someone decided was that no one really needed any sort of a reasonable hierarchical application menu. Up front, if you wanted to get at your applications, you had to search for them either through the general lens or the application lens. There are some benefits to being able to do a search like this. However, there are wholesale user disadvantages to not having a reasonable hierarchical and categorisable view of your software as well. For all the world’s complaints about it, even Windows 8’s Metro UI allows you the option of arranging your applications in a logical set of groups. Linux Mint gives you a menu.
Ubuntu gives you a search field. That’s fine for documents and for email in my gmail account. It is utterly frustrating for managing applications and more specifically, launchers for your applications.
There is only so much real estate in the not-movable launcher on the left-hand side, and anyway, the first thing you have to do on installing Ubuntu is to get rid of the – I was going to say junk – but shall we say “stuff you don’t need” before you can do anything. The default install size of the launcher is too big (but at least that can be customised) and it comes with a lot of Libre Office stuff and a direct link to Amazon.
I remember when Windows machines used to come preloaded with all sorts of commercial launchers on the desktop. I didn’t like it then and I don’t like it now. And yes, I know Ubuntu is free.
And this is its big problem. It’s possible that if it wasn’t free and easily replaceable with other free things, I’d spend two or three days getting rid of Unity, installing a more functional desktop but of course I have to go and test a bunch of them before hoping there are no stability issues. The great beauty of Linux is that you can do a lot of customisation (although some of that is seriously limited within Unity). The great disadvantage for Linux is that sometimes, people don’t have enough time to do this. They have tasks they want to achieve, they know that in theory they are easier to achieve in Linux than they are on Windows (viz some Python related stuff and running a few other open source applications like R). Ultimately, there is a lot to be said for ensuring that when they open a basic, high profile distro, it works.
Most of what I’ve seen written about Unity by users – viz people who comment on blogs as opposed to people who write blogs – is that they’ve gotten used to it. It seems to be more a resigned tolerance than anything. A lot of people have switched over to Linux Mint. A lot have switched back to Debian. A lot have looked for ways of making other desktop environments usable. And a lot complain that it’s only a vocal minority whinging, who don’t like change. Most people, in my experience, don’t mind change which enhances their lives. When it is utterly disruptive and makes their lives harder, that’s an entirely different kettle of fish.
I’m not a long term Linux user. It’s unlikely that I will ever again go near Ubuntu. Unity was unusable and when I looked into it in any detail, it was obvious that Canonical didn’t want to take on board any negative feedback, and it took three years for them to fix – sort of – one of the more annoying interface issues. I know some people find the whole keyboard centric search options fine. But I don’t see it as an OS for people who are superuser keyboarders. I see it as an OS to be avoided by people who are interested in structuring the information and assets they have on their computer. It’s all fine having search to find everything for you, except the few things you squash onto the Launcher. Everything I tried to do with it up front was a struggle. It’s possible that tinkering around with Linux is a hobby and a game for some people. Other people actually need it to function.
In my view, if you want to try Linux, Ubuntu really isn’t the best choice. Stick with Mint for now.
One of the joys of being back at university is the unexpected bits of inspiration that pop up. Today was one of those days when…well…
Nao came in to visit today, with one of the PhD students who is doing some research on robot-human interaction. I’ve never seen anything quite like him/her (decision to be made really).
I mean, how can you not love something like this:
Nao gets to know you. “Look at my eyes until they turn green”. And they do.
It is fair to say that every single student who met Nao was utterly entranced by him. I would love a Nao of my very own. Nao has five thousand brothers and sisters dotted around the world. Surely there could be one for me?
Here is Nao dancing:
And Gangnam Style, thanks to the University of Canterbury:
This is the promo video from Nao’s parents, Aldebaran Robotics.
Here’s what I would do if I wanted to get more people into information technology, computer science and related cutting edge technology. I would acquire a couple of these robots, and I would hand them over to school outreach programs. And I would send them into primary schools and junior cycle secondary and I would say “Look at what you can do if you study and work on maths and related subjects.”
This is the stuff of dreams and inspiration. We’re behind the game, I think, if we’re putting iPads into school. If we put Nao into schools, we are putting the future into schools.
Very few schools have the funds to fund a robot like this. It is something that needs to be done at a national level, or possibly by the universities.
Eoghan McCabe and a bunch of his colleagues came to UCD Computer Science the other day to have a chat with some of the 4th years and postgrads about how opportunities were changing in Dublin compared, in particular, to how things were when he graduated.
I’m older than Eoghan, and I’m a bit unorthodox in that my background is not really computer science but I did take an unusual journey through life and spent more than a quarter of my life (but not quite a third) working on IBM big iron. But he had a message which resonated quite a bit in that the opportunities available to graduates today have broadened quite a bit compared to what was available less than 10 years ago, and even more say, compared to what was available 20 years ago.
This is true in a monumental way; but the way it gets discussed rarely focuses on those changes. The concept of starting your own business, and the question of innovations is pushed a lot more than it ever has been before – it seems like every third level college has some sort of incubator program in place now. The whole market of available jobs has changed – there are a lot more interesting small software firms springing up of which Intercom is obviously one, and there are a few more getting ready to push from America to Ireland like New Relic. The big institutional employers are basically not the only show in town and this is fundamentally important because people are not uniform and they tend to thrive in different environments. We have this tendency in humanity to go with the one size fits all approach in the face of overwhelming evidence that in fact, one size has never fitted all.
I’m not a fourth year – I have 20 years work experience under my belt and not all of it has been in the technology arena. But I do believe that when you have a widening of employment and employer culture, it fundamentally benefits society and supports general growth.
One thing which we did discuss however is the tendency of people to think that Silicon Valley can be recreated here, and the tendency of politicians in particular to think about recreating Silicon Valley in Ireland. I think this is unrealistic because mostly it rests on an incomplete understanding of what drives the Valley at the moment – and also, the fact that what drives the Valley has evolved over time. Possibly the weather helps a lot but a key feature which supports the structure in California is probably the finance.
So I do wish, sometimes, we could recognise that this, along with a friendlier approach to failure, are key components of how you drive a start up culture. The last time I heard a politician in Ireland discuss this, he just wanted to import more people to work here.
More than anything, however, I wish that we got shot of this idea of wanting to Be Like Something Else. I’m pretty sure the valley infrastructure won’t last forever; it’s not even that unique as there are similar things happening in the northeast United States, in Berlin, and to a lesser extent in London, in terms of funding interesting ideas. Something or someone will come along and seriously disrupt it; that’s what happens. Or, more possibly, a tech bubble will blow up.
In the meantime, the funding available to start ups in Ireland is on a small scale. When you consider the amount of investment money that went into property in 2006 – some 40% of lending for new developments was for buy to let investments – you have to wonder whether the issue isn’t so much that we don’t have the money to generate a start up scene of some description here, probably with a more limited utility focus, or idea factories, but that we misapply it.
So companies like Intercom wind up going to San Francisco to get funding. I do honestly believe that understanding this is important for generating a local start up culture.
On a related note, Eoghan made two remarks which I thought were worth remembering.
- the vast majority of successful start ups are not run by drop outs but by people who completed their studies (and then some in a few cases)
- the average age of a start up founder is 40.
This, I think, is good to know, even if you’re 25 years old.
On a completely unrelated note, there was something I really liked about Intercom before Eoghan and his colleagues came in to talk and that is that Code Kata ran there on a Wednesday morning. I made it in there one morning but I liked the idea of doing something like that not just from a networking point of view but from a diversity point of view – yes, there were mainly men there (I think I was the only woman the day I did go) – but because people from different companies tend to have different cultures. In many ways, it was illuminating.
About the most exciting thing I have to report at this point is that the wireless is now working on the Raspbian install which is an improvement over the last three times I’ve plugged in that particular SD card.
This is important because it means that I can finally start working in comfort at my desk rather than curled up on the living room floor.
I ran into two main issues:
- the wireless would not work in Raspbian
- two of the three keyboards I have at my disposal did not want to work effectively – I wound up with repeating letters which made getting a password entered impossible. The third keyboard is working and to facilitate that and the new monitor, major desk reorg required.
So okay, I’ve got a browser working on it; I can fire up Wolfram and Mathematica, Python is installed, what next?
I am very shortly going to go and get one of the Raspberry Pi books and look into building a can’t fail media centre and I will write instructions about that here when it’s done and running.
I also want to try and build a weather station. And a robot. And I want to build snake on it as well but I think I may have code for that.
My reading list, for anyone who is interested includes:
- Raspberry Pi for Kids
- Raspberry Pi User Guide
- Raspberry Pi in Easy Steps and
- Linux User Issue 134
There are also numerous websites. Raspberry Pi’s own website and Wolfram’s site, for example. I anticipate hours of endless fun.