It’s easier than you think

A phrase I often hear is “security is everyone’s responsibility,” but I notice that data scientists are frequently so focused on the vast number of skills they need to learn that security gets ignored. Beyond the many other responsibilities competing for attention, I believe security seems daunting and appears to require a lot of software engineering skill. In reality, it’s fairly easy to implement a basic level of security in your software. I’d recommend following the principle of Charles Nwatu (a leader in security at Netflix and formerly StitchFix): “do less better.” To me, this means that successfully implementing basic security is better than failing to implement advanced security. …

Extracting Value From Your Text Data

A lot of ML education focuses on supervised learning (predicting a value or a class), which is all well and good until you begin to work with real-world data sets. Getting hold of tabular data in a consistent format can take more time than building your model. That is especially apparent when working with text data, since a lot of the text you may be interested in is created by people who do not care about quality or format. The rate at which unstructured data is generated is increasing faster than that of structured data, so getting textual data into a consistent format is a useful skill. …
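As a small illustration of what getting text into a consistent format can mean, here is a minimal sketch in Python. The function name `normalize_text` and the sample strings are hypothetical, and the specific steps (Unicode normalization, lowercasing, collapsing whitespace) are one reasonable assumption, not necessarily the approach the article goes on to describe.

```python
import re
import unicodedata


def normalize_text(raw):
    """Return a consistently formatted version of a free-text string."""
    # Fold visually equivalent Unicode forms into one representation.
    text = unicodedata.normalize("NFKC", raw)
    # Ignore differences in capitalization.
    text = text.lower()
    # Collapse runs of spaces, tabs, and newlines into a single space.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


# Hypothetical inputs: the same phrase typed three different ways.
messy = ["  Data   Science ", "data\tscience", "DATA science\n"]
cleaned = {normalize_text(m) for m in messy}
```

After this step the three variants collapse to a single value, which is what makes counting or grouping the text meaningful.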

Deployment can and should be easy

To put it frankly, if you can’t get your machine learning models out of a notebook, they’re probably not going to get used. This article should help you deploy your model as quickly as possible, so that you can integrate your models into business processes. This is an important skill to have, since it means you won’t be relying on a software engineer or developer to assist you.

Photo by Scott Graham on Unsplash

To do this, we will use Watson Machine Learning and a Jupyter Notebook. I will assume you already have Anaconda or another environment that can run notebooks. …

What Making a Video Game Taught Me About Teaching Myself

One of my roommates in college was a computer science major who had an internship every summer at Google or another Alphabet company. The effect on me was that I developed a strong case of imposter syndrome. In turn, the imposter syndrome led me to carve out lots of time to get better as a programmer, despite being an economics and statistics double major. As someone who is self-taught (and by self-taught I mean I really had hundreds of teachers on YouTube, LinkedIn Learning, Medium, and Stack Overflow), I’ve come to realize that there are some common problems with teaching yourself to code. …

A simple example demonstrates the basic vocabulary of RL well.

Instructive and Evaluative Feedback

In supervised learning, your algorithm/model gets instructive feedback. This means it is told what the correct choice was, and it then updates itself to reduce its error and make its predictions more accurate. In reinforcement learning, you give an algorithm evaluative feedback. This tells your algorithm how good an action was, but not what the best action was. How good the action was is known as the reward. The RL algorithm goes through a simulation where it learns how to maximize this reward.
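The difference between the two kinds of feedback can be sketched in a few lines of Python. Everything here is hypothetical and for illustration only: the function names, the three action names, and the reward values are assumptions, not part of any library.

```python
def instructive_feedback(chosen, correct):
    """Supervised-style feedback: the learner is shown the correct choice."""
    return {"was_correct": chosen == correct, "correct_choice": correct}


def evaluative_feedback(chosen, rewards):
    """RL-style feedback: the learner only sees a score for what it did."""
    return {"reward": rewards[chosen]}


# Hypothetical reward for each available action.
rewards = {"left": 0.2, "middle": 1.0, "right": 0.5}
correct = max(rewards, key=rewards.get)  # the best action is "middle"

instructive = instructive_feedback("left", correct)
evaluative = evaluative_feedback("left", rewards)
```

With instructive feedback the learner immediately knows that "middle" was the right answer. With evaluative feedback it only learns that "left" earned 0.2, so it has to try the other actions to find out whether something better exists.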

The Best Application for RL

The best applications of RL are those where you can simulate the environment it operates in well. If we wanted to teach an RL program to drive a car, we could just let it drive a car; this would be a perfect simulation. If we wanted to teach it to call plays in an American football game, we could let it play a bunch of games of Madden; this is not a perfect simulation, since we are using a video game to represent the real world. Designing the feedback for the RL algorithm in either of these scenarios can be quite complex, so I will introduce some of the basic concepts of RL in a simple situation called the K-Armed Bandit. …
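For readers who want a feel for the K-Armed Bandit before reading further, here is a minimal sketch in Python. It uses an epsilon-greedy strategy with sample-average reward estimates, which is one common way to approach the problem and an assumption on my part here, not necessarily the method the full article uses. The function name `run_bandit`, the true mean rewards, and the Gaussian noise are all hypothetical choices for illustration.

```python
import random


def run_bandit(true_means, steps=2000, epsilon=0.1, seed=0):
    """Play a K-armed bandit with an epsilon-greedy strategy."""
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # current estimate of each arm's reward
    counts = [0] * k       # how many times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm at random.
            arm = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest estimate so far.
            arm = max(range(k), key=lambda a: estimates[a])

        # Evaluative feedback: a noisy reward for the chosen arm only.
        reward = rng.gauss(true_means[arm], 1.0)

        # Update the running average for that arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


# Hypothetical bandit with three arms; the last one pays best on average.
estimates, counts, total_reward = run_bandit([0.2, 0.5, 1.0])
```

The `epsilon` parameter controls the trade-off between trying arms at random and sticking with the arm that currently looks best. Printing `counts` after a run shows how the pulls were split across the arms.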

An Intro to Text Analytics that can Increase Your Article Popularity

If you’ve written data science articles or are trying to get started, finding the most popular topics is a big help in getting your articles read. Below are the steps to easily determine these topics using R, along with the results of the analysis. This article can also serve as an intro to using an API and doing some basic text processing in R. Feel free to alter this code to do other Twitter analyses, and skip to the end if you’re only interested in the results.

Image by Gerd Altmann from Pixabay

The Twitter API

If you don’t have a Twitter account, you need to make one. After that, head over to Twitter Developer. After signing in with your new account, you can select the Apps menu, then Create an App. From there, fill out the information about the app. For the most part, this can be left blank except for the app’s name and details. …

Focusing on engineering skills

There are many types of data scientists, with varying skill sets and responsibilities. In my opinion, the most important groups you can segment data scientists into are those who write code used in production and those who do reporting.


For lack of a better term, I call the second group analysts. This does not mean they are not data scientists; people in these roles benefit from knowing machine learning, the ins and outs of data, and general programming skills. …

If you are giving an analytical presentation, it is a good idea to replace PowerPoint with slides created in R Markdown, for these three reasons.

  1. Interactivity: R Markdown gives you the ability to generate interactive slides. Having interactive charts is much more interesting and should generate more engagement from those you are presenting to.
  2. Documentation: The code you wrote to generate your analysis can be stored in the slides (it can be turned on/off) and it serves as documentation of your analysis methods. …

If you haven’t had a job in data science before and you’re trying to obtain one, the best way to validate yourself in the eyes of a recruiter is to have projects with a working front end. While a research paper or yet another Jupyter Notebook may contain complicated or technically interesting work, it does not really capture the attention of the people viewing it. Engagement with your resume will go way up if your work is interactive. …


I’ve worked with a large number of data scientists across many environments: at hackathons and datathons, on Kaggle competitions and “for fun” projects, and as a data science consultant for multiple companies. I’ve also had to apply for my fair share of jobs. I believe that the best method for evaluating a data scientist is with basic Python challenges, and I’ll be arguing that case below.

The Spectrum of Questions/Challenges

Consider the level of ability your questions test for. You want questions/challenges that eliminate the weakest candidates without taking a ton of time to administer or evaluate, for either you or the candidate. …


Brandon Walker

Data Scientist at IBM Cloud and Cognitive Software
