Statistics on Jon Calder
https://www.joncalder.co.za/tags/statistics/
The spark of statistics
https://www.joncalder.co.za/2017-07-26-the-spark-of-statistics/
Wed, 26 Jul 2017 00:00:00 +0000
<p>I’ve enjoyed maths for pretty much as long as I can remember. I’m aware that it is rare to make such a statement. Probably straight up weird to some. But that just shows how much I have to be grateful for. I am indebted to my parents, extended family and friends and a number of great teachers over the years who made sacrifices in order to give me solid learning opportunities and a supportive environment.</p>
<p><img src="https://www.joncalder.co.za/img/small-imgs/iris_sepalwidth.png" alt="Density Plot" title="Density Plot" /></p>
<p>I mention my affinity for maths because it is pretty hard to enjoy statistics without first enjoying maths. A few weeks ago I watched a cool TED Talk entitled “Why you should love statistics”. It got me thinking about how I came to love statistics, and for me it most definitely began with a love for maths.</p>
<p>I tutored a lot of undergraduate statistics during my university days and most of the students who struggled had little appreciation for statistics because they were constantly caught up in the underlying mathematical mechanics. My recollection is that often their maths deficiency was not due to a lack of ability, but could instead be attributed to a foundational gap of some sort. Whether it was moving schools, a change of teacher, prolonged sickness or some other catalyst for falling behind, I was left with the impression that somewhere along the way seeds of doubt had been sown for these students, and proved a force to be reckoned with further on down the road.</p>
<p>Alan Smith begins his TED Talk by highlighting the low levels of numeracy in a number of <a href="http://www.oecd.org/">OECD countries</a> but is quick to warn us against what he considers a false dichotomy: the idea that some of us can “do numbers” and some of us can’t. I’m inclined to agree with him. Sure, some aspects of numeracy and mathematical ability are probably innate, but I think that as with many other skillsets, the driving factors for success are (1) a conducive learning environment; (2) a good support structure; (3) a positive mindset; and (4) practice. The problem is, boxes (1) and (2) aren’t ticked for many young students (especially in less developed countries like mine which are plagued by poverty and related hardships), and even if they are, it’s hard to have a positive mindset (3) and to invest in practice (4) if you’re given the impression that you simply <em>can’t do numbers</em> after getting off to a rocky start, or after hitting a few speedbumps along the way.</p>
<p>Two recent cases came to mind which reinforce this idea for me. The first was a powerful personal story shared as part of a message at church. Given that I’m referencing the story outside of the context in which it was originally shared, I’ve paraphrased the story below and avoided any direct reference to the speaker himself.</p>
<blockquote>
<p>In Standard 5 I discovered that I was stupid. In the process of switching schools, I had moved from Standard 3 straight to Standard 5. So in addition to the language barrier I now had to overcome at the new school, I also had to catch up on Standard 4. Catching up on the maths content I had missed proved particularly difficult. I recall fractions were a problem. My teacher said to me: “As soon as you have an opportunity to choose subjects, you should choose history, because you will never be able to do maths.” At the end of that year I got 32% for maths. I remember thinking, where did I get all these marks?</p>
<p>After getting to Standard 6 the following year, I found out that I could only change subjects in Standard 7. So that meant I had to endure another year of maths. I had a new teacher and after our first maths test, she called me in (no surprise there). She said: “I’m not gonna mark your paper because this is not who you are. Not only do you have the potential to get good grades, but you are actually the best maths student we have in this school.” What a strange thing to say… …she didn’t know me at all. At the time I thought she was really, really crazy (and I think she was a little bit). In the next test, I got 62%. I hadn’t studied harder. I didn’t change anything. And then she called me in again and said: “This is not who you are. You are the best maths student in this school.”</p>
<p>The following test, I <strong>became that</strong>.</p>
</blockquote>
<p><img src="https://www.joncalder.co.za/img/small-imgs/believe_in_yourself.jpg" alt="Believe In Yourself" title="Believe In Yourself" /></p>
<p>The second example came in the form of a <a href="http://www.news24.com/SouthAfrica/Local/Peoples-Post/maths-star-goes-international-20170612">local news article</a> about Tim Schlesinger, who was recently selected by the South African Mathematics Foundation to represent the country at the International Mathematics Olympiad and Pan African Mathematics Olympiad competitions overseas. The extract below is taken directly from the article:</p>
<blockquote>
<p>He also gave words of advice to other learners who would like to do better in maths, saying they should focus on enjoying maths.</p>
<p>“From there, everything else follows easily. Many think that you are either born able to do maths or not, but you really can learn both theory and problem solving. It has been encouraging to see how much time and effort others are willing to invest in young mathematicians.”</p>
</blockquote>
<p>So having (hopefully) convinced you that many people who think they can’t do maths probably could if they would just stick it out long enough to overcome their doubts - let’s now get back to my story!</p>
<p>One of my earliest recollections of enjoying maths was during grade 3, when we were first introduced to “word problems”. The irony of that statement is not lost on me, because by matric I had discovered that English would actually be my most challenging subject at school. Spelling and grammar were fine. Sure, there often seemed to be more exceptions than there were rules (especially in the case of English), but at least there were <strong>SOME</strong> rules to follow. However, when it came to the required ‘creative’ interpretations of poetry and literature I often found myself at a loss.</p>
<p><a href="http://explosm.net/comics/1695/"><img src="https://www.joncalder.co.za/img/small-imgs/drinking_problem.png" alt="Cyanide and Happiness" title="Cyanide and Happiness" /></a></p>
<p>Anyway, word problems introduced me to the elegance of thinking about and communicating a concept or problem mathematically, by contrast to the wordy version which often seemed (to me at least) a lot more complex. Happily, this enjoyment of maths continued for me right through both primary and secondary school, and my family still laughs at the recollection that sometimes when I got home from school in the afternoons my mom had to ask me to go and play outside with my younger sister <strong>before</strong> diving into my maths homework.</p>
<p>Admittedly, by my 2nd year of university I thought some of the ‘pure math’ was starting to get a little weird. Eventually I grew to appreciate the more abstract notions of mathematical rigor that were introduced to us in our Real Analysis module as we looked at the construction of the real number system, limits and convergence, continuity, etc., but it did require a more concerted effort. It was around this time that statistics started to take precedence for me.</p>
<p>I recall our first encounter with double integrals surfacing early on in a 2nd year statistics course, long before they were actually introduced in the concurrent and prerequisite maths (calculus) course. We were given a quick run through and just had to roll with it until our maths syllabus caught up. Once these theory gaps had been filled in we had a much better appreciation of the applied nature of what we were learning in the statistical domain. This was a picture of things to come, since as I moved into the latter part of my degree and transitioned into post-grad, maths became subservient to my pursuit of statistics.</p>
<p>The interesting thing though is that I barely even knew statistics existed during most of my time at school. So how did I come to be interested in statistics at university? I believe a key catalyst was something which took place back in my grade 11 biology class. While covering the human reproductive system, we began a brief foray into the field of genetics, looking at chromosomes, DNA and RNA, proteins and all that life-shaping magic. Somewhere in the midst of all that, one day my teacher entered into a discussion about red-green colour blindness, highlighting that it is much more common in men. And so we were introduced to the concept of X-linked recessive inheritance.</p>
<p><img src="https://www.joncalder.co.za/img/small-imgs/colorblind.jpg" alt="colorblind pie chart" title="colorblind pie chart" /></p>
<p>I’m pretty sure he simplified things significantly in order to shield us from some of the complexities involved, so I will do the same here (with apologies to any geneticists who might be reading this). Here are the key pieces of information:</p>
<ul>
<li>the family of mutations that are commonly responsible for certain types of (red-green) color blindness can only occur on the X chromosome</li>
<li>men have one X and one Y chromosome whereas women have two X chromosomes</li>
<li>in women, the mutation needs to be present on both X chromosomes to cause color blindness, so some women can be ‘carriers’ while remaining unaffected</li>
<li>since men have a single X chromosome, they will be color blind if this chromosome has the mutation</li>
</ul>
<p>The resulting inheritance pattern, known as X-linked recessive inheritance, is nicely illustrated in the diagram below.</p>
<p><img src="https://www.joncalder.co.za/img/small-imgs/x_linked_recessive.png" alt="X-linked recessive inheritance" title="X-linked recessive inheritance" /></p>
<p>(diagram sourced from <a href="https://en.wikipedia.org/wiki/X-linked_recessive_inheritance">Wikipedia</a>)</p>
<p>The diagram shows that when a woman who carries an X-linked recessive mutation has children with an unaffected man, each son has a 50 percent chance of being affected and each daughter has a 50 percent chance of carrying one copy of the mutated gene. Put differently, with each pregnancy there is an equal 25 percent chance of having an unaffected boy, an affected boy, a non-carrier girl, or a carrier girl.</p>
<p>Of course there are other scenarios and combinations to consider too; for example the sons of a man with an X-linked recessive disorder will not be affected (since boys always receive the Y chromosome of their father), and his daughters will carry one copy of the mutated gene (since girls always receive the X chromosome of their father). Similarly, one can enumerate the possibilities for offspring of an affected mother and unaffected father, affected father and carrier mother, or affected father and affected mother. Things then get a little more interesting when starting to look at the propagation of such a disorder across a few generations with different combinations of these scenarios in effect in each generation.</p>
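<p>For readers who like to check this sort of thing by brute force, here is a minimal Python sketch of the four equally likely outcomes for a carrier mother and an unaffected father. The function name <code>offspring_distribution</code> and the allele labels (<code>"X"</code> for a normal X chromosome, <code>"x"</code> for one with the mutation) are my own illustrative choices, and the model is the same simplified one described above.</p>

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_distribution(mother, father):
    """Enumerate offspring outcomes under the simplified X-linked recessive model.

    mother: tuple of her two X alleles, e.g. ("X", "x"), where "x" is mutated.
    father: tuple of his X allele and "Y", e.g. ("X", "Y").
    Each parent passes on one of their two chromosomes with probability 1/2,
    so each of the four combinations has probability 1/4.
    """
    outcomes = Counter()
    for from_mother, from_father in product(mother, father):
        if from_father == "Y":
            # A son: his only X comes from his mother.
            label = "affected son" if from_mother == "x" else "unaffected son"
        else:
            # A daughter: count how many of her two X chromosomes are mutated.
            n_mutated = [from_mother, from_father].count("x")
            label = {
                0: "non-carrier daughter",
                1: "carrier daughter",
                2: "affected daughter",
            }[n_mutated]
        outcomes[label] += Fraction(1, 4)
    return dict(outcomes)


carrier_mother = ("X", "x")
unaffected_father = ("X", "Y")
dist = offspring_distribution(carrier_mother, unaffected_father)
for label, probability in dist.items():
    print(f"{label}: {probability}")
```

<p>Changing the two tuples gives the distribution for any other pairing of parents under the same assumptions.</p>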
<p>And it was some combination of the above mentioned scenarios that my teacher threw out as a suggestion for further investigation (homework) that day which sparked my first meaningful exploration into the world of independent and conditional events, and the law of total probability. I’m pretty sure I didn’t know anything about conditional or marginal probabilities at the time, but as I sat down that evening to work through a few scenarios I recall that it came quite naturally to me to work out the underlying probabilities by stringing together the basic principles.</p>
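<p>I no longer have that half-page of workings, so the following is only a hypothetical example of the kind of calculation involved, not the actual homework. Suppose a woman’s mother is a carrier and her father is unaffected, so that she herself has a 1/2 chance of being a carrier. The law of total probability then gives the chance that a son of hers is affected by weighting each case by how likely it is. The variable names below are purely illustrative.</p>

```python
from fractions import Fraction

# Assumed scenario: the woman's mother is a carrier and her father is
# unaffected, so she is a carrier with probability 1/2 (and otherwise
# has no copy of the mutation).
p_carrier = Fraction(1, 2)
p_non_carrier = 1 - p_carrier

# Conditional probabilities that a son is affected. A son's X chromosome
# always comes from his mother, so the father plays no role here.
p_affected_given_carrier = Fraction(1, 2)
p_affected_given_non_carrier = Fraction(0)

# Law of total probability:
# P(affected) = P(affected | carrier) P(carrier)
#             + P(affected | non-carrier) P(non-carrier)
p_son_affected = (
    p_affected_given_carrier * p_carrier
    + p_affected_given_non_carrier * p_non_carrier
)
print(p_son_affected)  # 1/4
```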
<p>The next day I returned to school armed with a half-page of scribbled workings and the moment the question was asked my arm went up. As it turned out, no-one else had really taken the teacher’s suggestion from the previous day seriously and my raised hand was not met with any competition. So I proudly described the results of my probabilistic investigation to the teacher and the rest of the class. He was suitably impressed and immediately affirmed my aptitude and efforts, whilst also encouraging me to pursue statistics further after leaving school. In hindsight it’s clear to me that this was good advice.</p>
<p>So Mr Moore, wherever you are now, thank you for all the things you taught me, and thank you for your words of affirmation which have stayed with me ever since that day in biology class.</p>
<p>Another thing Alan Smith highlights in his talk which I was not previously aware of is the etymology of the word <em>statistics</em>. It is ‘<em>the science of dealing with data about the condition of a state or community</em>’. He describes it simply as ‘<strong>the science of us</strong>’. So to me there is something quite poetic in the way my introduction to statistics came via biology and genetics - a different variant of <em>the science of us</em>.</p>
<p>I love statistics, so obviously I’m biased, but I think the TED Talk (shown below) is well worth a watch, whatever your perspective on the matter. I hope that it encourages you to become more fascinated by numbers.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/ogeGJS0GEF4" frameborder="0" allowfullscreen></iframe>
Data Science Podcasts
https://www.joncalder.co.za/2017-05-31-data-science-podcasts/
Wed, 31 May 2017 00:00:00 +0000
<p>Podcasts are awesome. Especially when you’re stuck in traffic on the way to work.</p>
<p><img src="https://www.joncalder.co.za/img/small-imgs/mr_incredible_stuck_in_traffic.jpg" alt="Mr Incredible stuck in traffic" title="Mr Incredible stuck in traffic" /></p>
<p>Below are some podcasts I listen to that relate to data science and statistics. Each of them has something slightly different to offer, so if this is an area of interest to you then I recommend you give these a try!</p>
<p><img src="https://www.joncalder.co.za/img/small-imgs/nssd.png#floatleft" alt="NSSD logo" title="NSSD logo" /></p>
<h4 id="not-so-standard-deviations-https-soundcloud-com-nssd-podcast"><a href="https://soundcloud.com/nssd-podcast">Not So Standard Deviations</a></h4>
<blockquote>
<p>Roger Peng and Hilary Parker talk about the latest in data science and data analysis in academia and industry.</p>
</blockquote>
<p><img src="https://www.joncalder.co.za/img/small-imgs/data_skeptic.png#floatright" alt="Data Skeptic logo" title="Data Skeptic logo" /></p>
<h4 id="data-skeptic-https-dataskeptic-com"><a href="https://dataskeptic.com/">Data Skeptic</a></h4>
<blockquote>
<p>Data Skeptic is your source for a perspective of scientific skepticism on topics in statistics, machine learning, big data, artificial intelligence, and data science.</p>
</blockquote>
<p><img src="https://www.joncalder.co.za/img/small-imgs/more_or_less_behind_the_stats.png#floatleft" alt="More or Less: Behind the Stats logo" title="More or Less: Behind the Stats logo" /></p>
<h4 id="more-or-less-behind-the-stats-http-www-bbc-co-uk-programmes-p02nrss1"><a href="http://www.bbc.co.uk/programmes/p02nrss1">More or Less: Behind the Stats</a></h4>
<blockquote>
<p>Tim Harford and the More or Less team from BBC Radio 4 try to make sense of the statistics that surround us.</p>
</blockquote>
<p><img src="https://www.joncalder.co.za/img/small-imgs/the_r_podcast.png#floatright" alt="The R Podcast logo" title="The R Podcast logo" /></p>
<h4 id="the-r-podcast-https-r-podcast-org"><a href="https://r-podcast.org/">The R-Podcast</a></h4>
<blockquote>
<p>Giving practical advice on how to use R for powerful and innovative data analyses. The host of the R-Podcast is Eric Nantz, a statistician working in the life sciences industry who has been using R since 2004.</p>
</blockquote>
<p><img src="https://www.joncalder.co.za/img/small-imgs/partially_derivative.png#floatleft" alt="Partially Derivative logo" title="Partially Derivative logo" /></p>
<h4 id="partially-derivative-http-partiallyderivative-com"><a href="http://partiallyderivative.com">Partially Derivative</a></h4>
<blockquote>
<p>Hosted by Jonathon, Vidya, and Chris, Partially Derivative is a podcast about data science in the world around us. Episodes are a mix of explorations into the techniques used in data science and discussions with the field’s leading experts.</p>
</blockquote>
<p><img src="https://www.joncalder.co.za/img/small-imgs/linear_digressions.png#floatright" alt="Linear Digressions logo" title="Linear Digressions logo" /></p>
<h4 id="linear-digressions-http-lineardigressions-com"><a href="http://lineardigressions.com/">Linear Digressions</a></h4>
<blockquote>
<p>Hosts Katie Malone and Ben Jaffe explore machine learning and data science through interesting (and often very unusual) applications.</p>
</blockquote>
<p>Are there other data science podcasts missing from this list that you can recommend? Feel free to comment below and let me know!</p>