GraphLab Installation Instructions For Mac
A cloud-based deployment provides a convenient and flexible environment for running a predictive service, but not all scenarios are suited to it. For reasons of privacy, security, or cost you might prefer to host your predictive service locally, on a machine you own and control. We call this a Predictive Services on-premises deployment.
Prerequisites
We assume that you have already downloaded and installed GraphLab Create on the machine that you will later use to interact with your local Predictive Services deployment. For more information on obtaining and installing GraphLab Create see Getting Started.
You will need the Predictive Services package as well as a Predictive Services product key. Both can be obtained on the installation page on dato.com.
Predictive Services on-premises uses Docker as its packaging and deployment mechanism. To install Docker on the machine that will host the predictive service, please download from https://docs.docker.com/installation/. Make sure to pick the installation that matches the host’s operating system.
OS X and Windows
Follow the instructions on the Docker website for creating a Docker VM in Mac OS X or creating a Docker VM in Windows. Once you have Docker installed, you can begin installation by starting the Docker Quickstart Terminal.
The Docker Quickstart Terminal does the following things:
- Creates a VirtualBox instance called default that will serve as the docker machine instance, if it is not yet created.
- Starts the default instance.
- Configures the environment variables to point to the default instance.
You can verify that this is set up properly by printing the value of the $DOCKER_MACHINE_NAME environment variable, which should print something like 'default'.
You can start and stop the machine with the docker-machine start and docker-machine stop commands.
If you'd prefer to run on another docker host you've created, load that machine's environment into your shell with docker-machine env.
The installation script is aware of the $DOCKER_MACHINE_NAME environment variable and will load and run the docker instances appropriately.
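The checks above can be sketched as two small shell helpers. The function names active_machine and use_machine are made up for illustration, and the sketch assumes the docker-machine CLI is on your PATH; docker-machine env is the Docker Machine subcommand that prints the environment settings for a named machine.

```shell
# Print the docker machine the current shell points at.
# Empty output means no machine is configured in this shell.
active_machine() {
    printf '%s\n' "${DOCKER_MACHINE_NAME:-}"
}

# Point the current shell at another machine you created; "$1" is its name.
# Hypothetical helper; assumes the docker-machine CLI is installed.
use_machine() {
    eval "$(docker-machine env "$1")"
}
```

In the Docker Quickstart Terminal, active_machine should print default. Calling use_machine with the name of another machine switches the shell to it, after which the installation script will pick up the new $DOCKER_MACHINE_NAME.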
You might run into an incompatibility issue with the included VirtualBox version, causing an error during the docker-machine create call. The current (8/31/2015) workaround is to install a more recent VirtualBox test build from https://www.virtualbox.org/wiki/Testbuilds. See also https://www.virtualbox.org/ticket/14412.
After installation is complete, be sure to configure port forwarding as noted below. This is not necessary on a Linux machine, since Docker does not run inside a VirtualBox instance there.
Installation
Deployment of a predictive service is achieved by installing and running a set of Docker containers. The containers as well as a setup script are included in the package you downloaded from dato.com.
Follow these steps to install Dato Predictive Services:
- Download the dato-predictive-services-1.8.3.tar.gz (or the latest version) and your license file.
- Move the package and license file to the computer you want to install Dato Predictive Services on. For Windows hosts, be sure to do all of the work from your C drive where Docker is installed. Trying to set up from another drive may lead to problems.
- Extract the archive to a temporary folder.
- Create a Predictive Services working directory on the host machine where Predictive Services files (including Docker images) will be copied to. On Windows, this must be on the same drive as your Docker installation, which is the C drive.
- Decide where the Predictive Services runtime data (state files, logs, etc.) will be stored; this could be a network file system, an S3 file path, or an HDFS file path. This path will be used by data scientists to manage the predictive service later through the GraphLab Create Python API. A common choice is an HDFS path, like hdfs://<hdfs-name-node>:8020/user/<ps-service-user>/dato_predictive_service. We will call this path the “ps path”.
- Modify the predictive_service.cfg file included in the package. You will need to make the following changes for a local setup:
  - internal_ip: The internal IP address of your host, usually a private IP address such as 10.X.X.X or 192.168.X.X.
  - external_ip: The external IP address of your host, which is not one of the private IP addresses. This is how other machines can find your host.
  - ps_path: The ps path you chose in step 5 above.
  - AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY: Specify these if you gave an S3 address for ps_path.
  - deployment_path: The working directory you chose in step 4 above.
  - server_memory: The memory size, in MB, of your predictive service container. The default of 4096 is fine for most purposes.
  - server_port: The internal server port. When Predictive Services is configured, there is a load balancer configured to forward traffic to a server. This port will not be used externally.
  - use_ssl: If you'd like to use SSL, set this to true. You'll also need to specify certificate_path and certificate_is_self_signed.
  - certificate_path: The path to the SSL certificate.
  - certificate_is_self_signed: If you're using SSL and a self-signed certificate, set this to true.
  - lb_port: The port used to query predictive services through the load balancer. Setting this to default will cause it to be set to 80 for non-SSL installations or 443 for SSL installations.
  - lb_stats_port: The port used for querying statistics.
  - metrics_port: The port used for querying metrics.
  - max_cache_memory: The maximum size of the cache, in MB. The default of 2048 should be fine for most applications.
- Run the setup script, providing the path to your Predictive Services product key file.
If the predictive service is set up correctly, the script prints a confirmation message after it has finished.
Note that in Windows and OS X, the setup script will configure iptables of the docker machine instance to forward traffic appropriately, mirroring the ports you configured.
At this point the docker containers are deployed. Now the predictive service needs to start up, which will take up to 1 minute (commonly not more than a few seconds). After that period the service is ready.
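Two of the installation details above can be sketched as shell helpers: extracting the package into a temporary folder, and the rule for an lb_port value of default. Both function names are hypothetical, and resolve_lb_port only mirrors the rule as described in the configuration list; it is not taken from the setup script.

```shell
# Extract the package ("$1") into a fresh temporary folder
# and print that folder's path.
extract_package() {
    dest=$(mktemp -d) || return 1
    tar -xzf "$1" -C "$dest" || return 1
    printf '%s\n' "$dest"
}

# Resolve the lb_port setting ("$1") given use_ssl ("$2", true or false):
# "default" becomes 443 with SSL and 80 without; other values are kept.
resolve_lb_port() {
    if [ "$1" != "default" ]; then
        printf '%s\n' "$1"
    elif [ "$2" = "true" ]; then
        echo 443
    else
        echo 80
    fi
}
```

For example, extract_package dato-predictive-services-1.8.3.tar.gz prints the folder holding the unpacked files, including predictive_service.cfg and the setup script. Knowing the resolved lb_port is useful later, because it is the port you forward on Windows and OS X.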
Use
GraphLab Create is required to connect to Dato Predictive Services and deploy/monitor/manage the service. For more information on obtaining and installing GraphLab Create see Getting Started.
After you have installed GraphLab Create, you can connect to the predictive service. When you do, remember to replace ps-path with your actual ps path specified in installation step 5 above. If this is an HDFS path, you need to have set up your environment to have access to HDFS (either by setting HADOOP_CLASSPATH or HADOOP_CONF_DIR).
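A quick pre-flight check for the HDFS requirement, written as a sketch: the helper names hdfs_env_ready and check_ps_path are invented here, and the check only tests that one of the two variables is non-empty, not that its value is valid.

```shell
# Succeed when at least one of the two Hadoop variables is set and non-empty.
hdfs_env_ready() {
    [ -n "${HADOOP_CLASSPATH:-}" ] || [ -n "${HADOOP_CONF_DIR:-}" ]
}

# Fail with a reminder if the ps path ("$1") is on HDFS
# but neither variable is set; other paths pass unchanged.
check_ps_path() {
    case "$1" in
        hdfs://*)
            if ! hdfs_env_ready; then
                echo "Set HADOOP_CLASSPATH or HADOOP_CONF_DIR before using $1" >&2
                return 1
            fi
            ;;
    esac
    return 0
}
```

Running check_ps_path with your ps path before starting Python catches a missing Hadoop setup early, instead of at the point where GraphLab Create tries to read the path.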
For more information about the API see https://dato.com/learn/userguide/deployment/pred-intro.html and https://dato.com/products/create/docs/generated/graphlab.deploy.predictive_services.html
Shutdown
If you need to shut down your predictive service (which is also necessary if you want to change any of the configuration parameters), use the shutdown_dato_ps.sh script. This script removes the Docker containers used by the predictive service. To restart the service, run the setup script again.
Port Forwarding for Windows and OS X
In order to access predictive services from outside your Windows or OS X host, you'll need to set up port forwarding. Port forwarding directs incoming network traffic on specific ports of your host to the docker machine instance.
In Virtualbox, there are two network interfaces configured for your docker machine instance. The first is a NAT interface. You can add to the 'Port Forwarding' configuration ports for the query, metrics, and admin interfaces.
Note that you do not need to stop your docker machine instance to make these changes.
To configure port forwarding:
- Open up the Virtualbox application.
- Click on the instance that serves as the docker machine. It should have the same name as your docker machine, which is default if you haven't changed it.
- Configure the instance by clicking on the 'Settings' button with the instance highlighted.
- Click on the 'Network' tab.
- Choose the 'Adapter 1' tab. You should see that it is 'Attached to' 'NAT'. This means that the interface is attached to a NAT managed by Virtualbox itself.
- Click on the 'Port Forwarding' button.
- Add the three rules below by clicking on the 'add rule' icon and editing the fields. Please substitute the ports appropriately. For example, if you're using SSL, you would use 443 rather than 80.
- Click 'OK' on the port forwarding dialogue
- Click 'OK' on the network interface dialogue to save your changes.
Name | Protocol | Host IP | Host Port | Guest IP | Guest Port |
---|---|---|---|---|---|
ps | TCP | (leave blank) | 80 | (leave blank) | 80 |
stats | TCP | (leave blank) | 9000 | (leave blank) | 9000 |
metrics | TCP | (leave blank) | 9015 | (leave blank) | 9015 |
On Windows, when you make these changes, you will be prompted to open up your firewall to allow incoming connections on these ports. Accept the dialogue. If you don't accept this, you can change your firewall settings in the firewall configuration.
Once the configuration is complete, when your predictive services are running, you should be able to access your predictive services installation from outside your Windows or Mac OS X host through its IP address and the appropriate port.
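The same three rules can also be added from a terminal instead of the VirtualBox window. This is a sketch, assuming VirtualBox's VBoxManage tool is on your PATH, the machine is running, and the ports match the table above; the function names are made up for illustration.

```shell
# Build one NAT rule in the form VBoxManage expects:
#   name,protocol,host-ip,host-port,guest-ip,guest-port
# Both IP fields are left blank, as in the table.
natpf_rule() {
    printf '%s,tcp,,%s,,%s\n' "$1" "$2" "$2"
}

# Add the three rules to adapter 1 of a running machine.
# "$1" is the machine name (default: default),
# "$2" is the load balancer port (default: 80; use 443 with SSL).
forward_ps_ports() {
    vm=${1:-default}
    lb_port=${2:-80}
    VBoxManage controlvm "$vm" natpf1 "$(natpf_rule ps "$lb_port")"
    VBoxManage controlvm "$vm" natpf1 "$(natpf_rule stats 9000)"
    VBoxManage controlvm "$vm" natpf1 "$(natpf_rule metrics 9015)"
}
```

Calling forward_ps_ports with no arguments targets the default machine. For a machine that is stopped, VBoxManage modifyvm with the --natpf1 option accepts the same rule string.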
As part of a machine learning course I’ve been taking on Coursera, I had to get some packages installed.
Since I couldn’t find a one-stop webpage covering all the instructions, I had to go back and forth between multiple webpages. And then, after I had installed the whole thing, it took me a while to figure out how to run it.
And so, in this single post, I try to explain everything to you.
First up, I had to install the following packages:
- IPython Notebook
- GraphLab Create
GraphLab Create is not free software, but they provide a 1-year, renewable license for educational purposes. You first have to go to their webpage and register yourself.
First up, go to the official instructions page and follow the instructions!
There are two options for installation:
- Installation into Anaconda Python Environment (recommended)
- Installation in Python environment using virtualenv
After following the official recommended path, you would have
- Installed Anaconda, pip, GraphLab Create, and IPython Notebook.
- Created a new Conda environment called gl-env.
In case you’re wondering (like I did), rest assured that the Anaconda installation will not clash with your existing Python installation (that ships with most Linux distributions).
On their website there is an option to upgrade to a version that uses GPU acceleration. I haven’t tried that myself, but feel free to try it if you have a compatible GPU card.
Starting IPython Notebook to use GraphLab
The proper procedure for firing up the whole thing (in Linux) is:
- Open the terminal.
- cd to the directory where your IPython Notebooks are. Strictly speaking, this step is optional; but this is what you want to do in most cases.
- Activate the gl-env Conda environment which you created earlier (see below for a brief intro to Conda).
  $ source activate gl-env
- Start your IPython Notebook
  $ ipython notebook
And there you go! You’re all set!
Step 3 above is where everybody gets it wrong; they simply skip this step! Although IPython Notebook would start up fine, if you skip step 3, Python will choke on you with an ImportError when you try to import the graphlab package.
This is because, if you’ve followed the official instructions, only the gl-env environment would have the graphlab package installed.
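If you keep forgetting step 3, a small guard can refuse to start the notebook outside gl-env. This is just a sketch: the function names are my own, and it relies on Conda exporting the CONDA_DEFAULT_ENV variable when an environment is activated.

```shell
# Succeed only when the active Conda environment is gl-env.
# Conda exports CONDA_DEFAULT_ENV when an environment is activated.
in_gl_env() {
    [ "${CONDA_DEFAULT_ENV:-}" = "gl-env" ]
}

# Start the notebook only from inside gl-env; otherwise print a reminder.
start_notebook() {
    if in_gl_env; then
        ipython notebook
    else
        echo "Run 'source activate gl-env' first." >&2
        return 1
    fi
}
```

Put these in your shell startup file and call start_notebook instead of ipython notebook; outside gl-env you get the reminder rather than a failed import later on.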
Brief Introduction to Conda
Conda, in simple terms, is a tool that allows you to simultaneously have multiple installations of Python on your computer without messing up the different installations, i.e., you could create different “environments” of Python, each with different packages.
Depending on your needs, you can set up the different “sandboxed” environments with different packages installed in them; even different versions of python itself! And you can easily switch between the environments. A prime advantage to working this way is that you don’t have to touch the native python installation on your OS (if it has one).
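Here is a minimal sketch of creating such a sandboxed environment yourself. The wrapper name make_env and the environment name py27 used below are just examples; conda create with --name and a python=VERSION package spec is the standard way to pin the Python version of a new environment.

```shell
# Create a sandboxed environment named "$1" with Python version "$2".
# --yes skips the interactive confirmation prompt.
make_env() {
    conda create --yes --name "$1" "python=$2"
}
```

For example, make_env py27 2.7 creates an environment called py27, which you can then enter with source activate py27, the same way you enter gl-env.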
To learn more about using Conda, check out the official documentation.
Trust me, Conda makes your life so much easier.
Hope you’ve found this post helpful.