Data science

The last few weeks have been pretty busy on the assignment front, with three due in total, two maths and one statistics, so I am really only catching up on things here.

I started studying mathematics and statistics for a couple of reasons: (i) I liked mathematics a lot as a kid, but when push came to shove at age 17, languages got higher up the priority list, and (ii) the amount of data in the world is increasing, while the number of people equipped to interpret it doesn’t seem to be. Also increasing is the number of people creating information graphics and data visualisations.

Some people are very good at this. The New York Times, for example, do sterling work in this area, as does the Office for National Statistics in the UK.

Some are not so good at interpreting the underlying data. I’ve seen one absolutely beautifully drawn graphic that purported to display the strength of Facebook in the social media world by comparing Facebook pageloads with Flickr image uploads. A fairer comparison would be pageloads for both sites. And this is a very simple criticism.

In other words, without a reasonable grounding in data analysis, there is no guarantee that good data graphics are going to appear.

Big Data is a buzzword which is turning up in my newsfeeds increasingly often. I’m not always sure what people understand by it, but it is definitely flavour of the month, and so we turn to this report from Silicon Republic on the subject of support for data science courses.

I am of the opinion that STEM (not sure I like that term for science, technology, engineering and maths courses but it has its uses) is definitely something worth investing in for the future. However, like a lot of things, important and all as it is, it isn’t often adequately rewarded economically. Here, there are debates about how much people working in universities get paid. In the UK, funding for research is falling, and a lot of privately funded research is moving out of the UK, or its validity is being criticised purely on the grounds of the commercial nature of its funding (see pharmaceutical research as an example here; it is difficult to draw any conclusion without some accusation of bias). In certain respects, research into options for the future is between a rock and a hard place.

EMC are best known to me for data storage. It’s interesting to see one of their senior guys talking about the importance of data science, and I’d be interested to know if it’s coming from their interest in providing storage for large, nay massive, quantities of data, or whether they also have some interest in how that information is organised. Obviously the big name in terms of how information is organised is Google. I will be interested to see if UCC do actually put a data science course together.

In the meantime, I have another 3-4 years of my own maths/stats to go, and no doubt the industry will change a bit again in that time.