top of page
  • Writer's pictureMike

Applied Machine Learning is Programming

Updated: Jan 16, 2020

The two primary languages used in applied machine learning today are SQL and Python.

There's no way around it. If you want to work in the real world as a machine learning engineer then you are going to need to program in at least languages two core languages, SQL and Python.

SQL or Structured Query Language. SQL is the standard language for communicating with relational databases, regardless of the vendor flavor.

Most applied machine learning models are currently sourced from relational databases. Additionally, many big data tools and cloud data services use SQL. Let's take a look at a real world job.

The job below is from a live job I pulled off Indeed is currently my favorite job board. This company is looking for an intern. Notice the third bullet point? Proficency with Python and SQL. I'm not sure how an intern becomes proficient with Python and SQL, but that's another post.

Currently, the number one requirement for a job in applied machine learning is SQL.

Python is King in Machine Learning

Right now, there are more jobs for machine learning engineers that know Python than there are for all the other languages used in this space. Python has become the gold standard for building end the end machine learning models.

Let's look at another real world job at Columbia Sportwear. The job below is incorrectly labeled. The job posting is for a data scientist and not a machine learning engineer. The company clearly doesn't understand the two roles. This role is too technical for anyone who doesn't have solid programming experience.

This bullet point gives it away.

Demonstrated capability in rapid prototyping and developing end-to-end solutions using complex algorithms

The phrase rapid-prototyping is a design workflow that consists of ideation, prototyping, and testing. It helps designers quickly discover and validate their best ideas. Also note of the phrase "developing end to end solutions."

An end to end solution is being able to work through the entire machine learning pipeline, a process for building machine learning models that is almost entirely programmatic.

I've highlighted the crucial requirement, SQL. On the same line you'll see R, Python or SAS. Again, this is a vanilla post. It's been copied from a similar job. Stay away from R and SAS and focus on Python. While there are jobs for SAS and R in the applied space you'll be limiting your job opportunities by not learning Python.

These are only two examples. However, I'd recommend you take a moment to peruse jobs in this space on your favorite job board. Ask yourself, what do all the roles have in common? Two skills you'll see time after time are SQL and Python. The two core languages used in applied machine learning.

Please check back here from time to time. I'll be updating this blog often and my new book will be ready soon.

2,766 views2 comments

Recent Posts

See All

Gradient boosting is a power technique for building predictive models. Gradient Boosting is about taking a model that by itself is a weak predictive model and combining that model with other models of

Machine learning data is represented by arrays. The core data object in machine learning is the array. Machine learning data is represented by arrays. The core data object in machine learning is the a

In order to understand machine learning you'll need to understand the machine learning hierarchy. The umbrella for all things machine learning is artificial intelligence. Before defining AI take a lo

bottom of page