A machine learning project typically follows these high-level steps:
- Gather and explore data
- Prototype and build models; evaluate and select the best model
- Deploy the model so that it’s accessible as a web service
- Visualize and consume the output of the models
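
As a rough illustration of the first three steps, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the data is synthetic, the two candidate models (a mean baseline and a one-feature least-squares line) are placeholders, and the names `fit_mean_baseline`, `fit_linear`, and `mean_squared_error` are hypothetical helpers rather than part of any tool mentioned here.

```python
import random
import statistics

# Step 1: gather and explore data. Synthetic data stands in for a real source.
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(200)]
data = [(x, 3.0 * x + 2.0 + random.gauss(0, 1)) for x in xs]
train, test = data[:150], data[150:]
print("mean of y in training data:", statistics.mean(y for _, y in train))


# Step 2: prototype candidate models. Each fit function returns a predictor.
def fit_mean_baseline(rows):
    mean_y = statistics.mean(y for _, y in rows)
    return lambda x: mean_y


def fit_linear(rows):
    # Ordinary least squares for a single feature.
    mean_x = statistics.mean(x for x, _ in rows)
    mean_y = statistics.mean(y for _, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Step 3: evaluate on held-out data and select the model with the lowest error.
def mean_squared_error(model, rows):
    return statistics.mean((model(x) - y) ** 2 for x, y in rows)


candidates = {
    "mean_baseline": fit_mean_baseline(train),
    "linear": fit_linear(train),
}
scores = {name: mean_squared_error(m, test) for name, m in candidates.items()}
best_name = min(scores, key=scores.get)
best_model = candidates[best_name]
print(scores, "-> selected:", best_name)
```

In a real project the candidates would come from a modeling library and the evaluation would likely use cross-validation, but the shape of the loop (fit several models, score them on data they have not seen, keep the best) stays the same.
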
The table below groups the tools available for each phase of machine learning development.
Phase | Tools
--- | ---
Model Prototyping and Building | R or Python programs on Desktop or on Amazon; R code on software from companies like Yhat, Domino Data Labs, AzureML, Amazon ML, BigML
Deploying models as web services | Open source rApache, hosted on Amazon; software from Revolution Analytics or Yhat, AzureML
Big Data (data that’s bigger than what can fit on a single machine) | Apache Spark, H2O.ai, Hadoop, Google Cloud Dataflow
Data Visualization and Consumption | Salesforce Wave, Tableau, Domo
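
To make the "deploying models as web services" row concrete, here is a minimal sketch of a prediction endpoint. It uses Python's standard-library `http.server` rather than rApache or any of the hosted products listed, and the coefficients, the JSON shape (`{"x": ...}`), the port, and the names `handle_payload`, `PredictionHandler`, and `serve` are all assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical coefficients standing in for a trained model.
SLOPE, INTERCEPT = 3.0, 2.0


def predict(x):
    return SLOPE * x + INTERCEPT


def handle_payload(body):
    """Turn a JSON request body like {"x": 1.5} into a JSON response body."""
    x = float(json.loads(body)["x"])
    return json.dumps({"prediction": predict(x)}).encode("utf-8")


class PredictionHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            response, status = handle_payload(self.rfile.read(length)), 200
        except (ValueError, KeyError, TypeError):
            # Malformed JSON, a missing "x", or a non-numeric value.
            error = {"error": "expected a JSON object with a numeric 'x'"}
            response, status = json.dumps(error).encode("utf-8"), 400
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(response)))
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    # Port 8000 is an arbitrary choice for local testing.
    HTTPServer(("127.0.0.1", port), PredictionHandler).serve_forever()
```

Calling `serve()` starts the service locally, after which a POST with a body such as `{"x": 1.5}` returns a JSON prediction. The hosted products in the table take care of the surrounding work, such as scaling, authentication, and model versioning, that this sketch leaves out.
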