Tuesday, January 31, 2017

Python: Removing Matrix columns that contain NaN

Removing Matrix columns that contain NaN. This is a lengthy answer, but hopefully easy to follow.
import math
from numpy import column_stack, vstack

def column_to_vector(matrix, i):
    # return column i of the matrix as a plain list
    return [row[i] for row in matrix]

def remove_NaN_columns(matrix):
    # use the function argument, not the global A
    columns = matrix.shape[1]
    result = []
    for column in range(columns):
        vector = column_to_vector(matrix, column)
        # keep the column only if none of its values is NaN
        if not any(math.isnan(value) for value in vector):
            result.append(vector)
    # note: column_stack raises ValueError if every column was dropped
    return column_stack(result)

### test it
A = vstack(([ float('NaN'), 2., 3., float('NaN')], [ 1., 2., 3., 9]))
print("A shape", A.shape, "\n", A)
B = remove_NaN_columns(A)
print("B shape", B.shape, "\n", B)

A shape (2, 4) 
 [[ nan   2.   3.  nan]
 [  1.   2.   3.   9.]]
B shape (2, 2) 
 [[ 2.  3.]
 [ 2.  3.]]
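The same idea can be sketched without NumPy, which may help when the data is still a plain list of rows. The function name drop_nan_columns is my own, not part of any library, and the sketch assumes every row has the same length.

```python
import math

def drop_nan_columns(rows):
    """Return rows without the columns that contain at least one NaN."""
    # zip(*rows) transposes the list of rows into columns
    kept = [col for col in zip(*rows) if not any(math.isnan(v) for v in col)]
    # if every column was dropped, keep one empty list per input row
    if not kept:
        return [[] for _ in rows]
    # transpose back from columns to rows
    return [list(row) for row in zip(*kept)]

nan = float("nan")
print(drop_nan_columns([[nan, 2.0, 3.0, nan], [1.0, 2.0, 3.0, 9.0]]))
# prints [[2.0, 3.0], [2.0, 3.0]]
```

For a NumPy array A, a boolean mask does the same job in one line: A[:, ~numpy.isnan(A).any(axis=0)].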

Setting TensorFlow Python on MacBook Pro (yet TBD)

In the "free moment" (2-4AM?) I would like to set up TensorFlow on my Mac.

https://gist.github.com/ageitgey/819a51afa4613649bd18

but that will have to wait until I get to it.

Why free AWS is not a good solution for Machine Learning research.

The free Amazon AWS instance t2.micro gives me a memory error when running my Machine Learning exercises (normalization of a large data set).

---------------------------------------------------------------------------
MemoryError  Traceback (most recent call last)
 in ()
----> 1 X_normalized = normalize_color_intensity(X_train)

Looking closer, the t2.micro has the following configuration:
1 CPU
1 GB RAM

So basically it is as if I were running my stuff on less than a Raspberry Pi (4 cores, 2GB RAM). Ouch!

To surpass my MacBook Pro (4 cores, 16GB RAM) I would have to run it on a t2.2xlarge with:
8 CPU
32 GB RAM

which is $0.376 per hour, or $279 per month.

AWS is a good option for short sessions when very powerful compute instances are needed, and it is good for corporate web servers, but for a normal researcher who needs to run experiments day in and day out I would recommend buying a gaming machine with a top GPU, or two.
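The monthly figure is just the hourly price multiplied out. A minimal sketch of that arithmetic, assuming the instance runs non-stop for a 31-day month (the day count is my assumption; the hourly rate is the one quoted above):

```python
HOURLY_RATE_USD = 0.376   # t2.2xlarge price per hour quoted above
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 31       # assumption: a 31-day month, running around the clock

def monthly_cost(hourly_rate, days=DAYS_PER_MONTH):
    """Cost of keeping one instance running non-stop for `days` days."""
    return hourly_rate * HOURS_PER_DAY * days

print("per month: $%.2f" % monthly_cost(HOURLY_RATE_USD))
# prints per month: $279.74
```

Passing days=30 gives the cost of a shorter month, and the same function works for any other hourly rate.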

Monday, January 30, 2017

Configuring AWS instance for Python & Jupyter Notebook server

This post covers the configuration of an Amazon (AWS) Linux instance with Python and Jupyter Notebook for Machine Learning.

Starting the AWS instance is out of scope; plenty of tutorials are available. However, start it in a region where GPU instances such as p2.xlarge are available.

Python

Some Python is already installed
$ python --version
The program 'python' can be found in the following packages:
* python-minimal
* python3
Try: sudo apt install
$ python3 --version
Python 3.5.2

Installing Conda (Anaconda)

conda --version
conda: command not found


~$ mkdir Downloads

~$ cd Downloads/

Download conda (find the newest conda install script)

The full version of Anaconda saves you time in the long run so you do not have to log into the server and install missing packages.

full version (455.91M @ 23.0MB/s takes 20s)
https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh

mini (not recommended): 
https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh


~/Downloads$ wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
~/Downloads$ bash Anaconda3-4.2.0-Linux-x86_64.sh
Follow the instructions and accept defaults
... a lot of packages get installed ...


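The Anaconda bin directory has to come first on your PATH; the line in ~/.bashrc should look like this:
export PATH="/home/ubuntu/anaconda3/bin:$PATH"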

You need to refresh the terminal with the new bashrc settings.


$ source ~/.bashrc
$ conda --version
conda 4.2.9
$ python --version
Python 3.5.2 :: Anaconda 4.2.0 (64-bit)
$ jupyter --version
4.2.0 
As you can see we are in pretty good shape already!

Configure iPython (Jupyter Notebook)

$ ipython
Python 3.5.2
IPython 5.1.0
In [1]: from IPython.lib import passwd
In [2]: passwd()
Enter password: Verify password: ..
Out[2]: 'sha1:5dsfdsfdsfsdfdsfdsfdsfdsfsdfdsfsfds'
In [3]: exit


Copy the password sha1 hash to use later in the configuration file:
c.NotebookApp.password=
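To see what that hash string contains, here is a minimal sketch. It assumes the classic notebook layout algorithm:salt:digest, where the digest is the SHA-1 of the passphrase followed by the salt. The function names make_notebook_hash and check_notebook_hash are hypothetical, written only for illustration; for the real configuration value keep using passwd() as shown above.

```python
import hashlib
import random

def make_notebook_hash(passphrase, salt_len=12):
    """Build an 'algorithm:salt:digest' string (assumed notebook hash layout)."""
    # random hex salt, zero-padded to salt_len characters
    salt = ("%0" + str(salt_len) + "x") % random.getrandbits(4 * salt_len)
    digest = hashlib.sha1((passphrase + salt).encode("utf-8")).hexdigest()
    return ":".join(("sha1", salt, digest))

def check_notebook_hash(hashed, passphrase):
    """Return True if passphrase matches a string made by make_notebook_hash."""
    algorithm, salt, digest = hashed.split(":", 2)
    candidate = hashlib.new(algorithm, (passphrase + salt).encode("utf-8"))
    return candidate.hexdigest() == digest
```

The salt is stored in the string itself, so the server can recompute the digest from the password you type and compare it without ever storing the password.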


CREATE CERTIFICATE


$ cd ~
$ mkdir certificates
$ cd certificates/
$ openssl req -x509 -nodes -days 365 -newkey rsa:1024 -keyout mycert.pem -out mycert.pem
Generating a 1024 bit RSA private key
... follow the instructions ...
~/certificates$ ls
mycert.pem

 

Jupyter Notebook Server Configuration



$ jupyter notebook --generate-config
Writing default config to (note it is .jupyter, not .ipython): 
/home/ubuntu/.jupyter/jupyter_notebook_config.py



$ vi /home/ubuntu/.jupyter/jupyter_notebook_config.py
press "i" for INSERT mode



c = get_config()

### Kernel Configuration

# plotting should always be inline
c.IPKernelApp.pylab = 'inline'

### Notebook Configuration

c.NotebookApp.certfile = u'/home/ubuntu/certificates/mycert.pem'
c.NotebookApp.ip = '*'

# server does not have GUI browser
c.NotebookApp.open_browser = False

# generated in the IPython shell with the passwd() function
c.NotebookApp.password = u'sha1:9f____your_own_____cc'

# Make sure you open port 8888 in your AWS instance
# and run only one Jupyter Notebook server
c.NotebookApp.port = 8888


Press ESC :wq to WRITE and QUIT vi



Make a working directory where you sync your git repositories



~$ mkdir dev

Start the Notebook Server

Normally, I start the notebook in the terminal and it stops when I close the terminal; I prefer it that way.

If you have to run a notebook experiment for a long time (hours, days, weeks), keeping the terminal window open is impractical, so start it with nohup instead:



$ nohup jupyter notebook ~/dev/ &
$ tail -f nohup.out

To shut it down you can look for the process ID (pid) and kill it, or restart the instance.

$ jupyter notebook ~/dev/
[I 15:54:17.847 NotebookApp] Writing notebook server cookie secret to /run/user/1000/jupyter/notebook_cookie_secret
[I 15:54:18.076 NotebookApp] Serving notebooks from local directory: /home/ubuntu/dev
[I 15:54:18.077 NotebookApp] 0 active kernels 
[I 15:54:18.077 NotebookApp] The Jupyter Notebook is running at: https://[all ip addresses on your system]:8888/

[I 15:54:18.077 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).


Running in the browser

Make sure that your AWS has at least these ports open: 
- 22 for secure shell 
- 8888 for notebook 
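If the page does not load, a quick way to tell a closed port from a notebook problem is to try a plain TCP connection from your own machine. This is a minimal sketch using only the standard library; the function names port_is_open and check_ports are my own.

```python
import socket

def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # refused, timed out, or the host name could not be resolved
        return False

def check_ports(host, ports=(22, 8888)):
    """Map each port number to True (reachable) or False (closed or filtered)."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling check_ports with the public IP of your instance returns something like {22: True, 8888: False} when the notebook port is still blocked by the security group or the server is not running.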

Make sure you request HTTPS

From your instance grab "IPv4 Public IP" or your elastic IP (for Jupyter Notebook I do not need it)

https://your_aws_public_ip_address:8888/

You may get an HTTPS security warning because the certificate is self-signed; I ignore it (in Chrome: ADVANCED: Proceed...).
You should be prompted to enter your own password.
Jupyter Notebook should be fully usable at this point.




conda update conda











Installing Ubuntu 16 on AMD box using USB stick created on Mac

I am installing Ubuntu 16.04.1 desktop on an old AMD box I had sitting around.


Create USB image:



$ diskutil list
/dev/disk2 (external, physical): #:
1: Windows_FAT_32 DTSE9_32GB 31.0 GB disk2s1

Note: at first I tried a 32GB drive. UNetbootin installed the ISO on it, but the target computer did not recognize it, so I switched to a small 2GB USB stick.

  • Download latest Ubuntu desktop (16.04 AMD 64 as of January 30, 2017)
  • Download the "UNetbootin" utility (for Mac) to create the USB stick
  • Check your USB drive name with "diskutil list" (disk2s1 for me)
  • Open "UNetbootin" and select:
      - Diskimage (radio button)
      - ISO file location
      - disk2s1 as the USB Drive
  • Click OK; it will take a long while






The next step is to start the target computer and put it into the "boot load" mode, then select the USB you have created.




Sunday, January 29, 2017

TensorFlow: MacBook Pro: detect which CPU and GPU devices are available

from tensorflow.python.client import device_lib

def get_available_gpus():
    devices = device_lib.list_local_devices()
    #return [x.name for x in devices if x.device_type == 'CPU']
    return [x.name for x in devices ]

print(get_available_gpus())


['/cpu:0']




Currently, I can see and execute only on CPU.

MacBook Pro i7 Late 2013
Device 0: "GeForce GT 750M"
CUDA Driver Version / Runtime Version: 8.0 / 8.0
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 2048 MBytes (2147024896 bytes)
( 2) Multiprocessors, (192) CUDA Cores/MP: 384 CUDA Cores
GPU Max Clock rate: 926 MHz (0.93 GHz)
Memory Clock rate: 2508 Mhz
Memory Bus Width: 128-bit
L2 Cache Size: 262144 bytes
http://osxdaily.com/2017/01/08/disable-gpu-switching-macbook-pro/


Still not working with TensorFlow




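import timeit
import tensorflow as tf

# default_timer() reads a clock; timeit.timeit() would time an empty statement instead
start = timeit.default_timer()
print ("starting")

with tf.device('/cpu:0'):
    # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print (sess.run(c))
end = timeit.default_timer()
print ("elapsed", end - start)


starting
[[ 22.  28.]
 [ 49.  64.]]

My first version called timeit.timeit() twice and printed "elapsed -0.0033593010011827573", a meaningless negative number; with timeit.default_timer() the difference is the real elapsed time in seconds.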

XCode

I am receiving an error:

xcode-select: error: tool 'xcodebuild' requires Xcode, but active developer directory '/Library/Developer/CommandLineTools' is a command line tools

$ xcode-select --install

xcode-select: error: command line tools are already installed, use "Software Update" to install updates
$ sudo xcode-select -switch /Library/Developer/CommandLineTools
Password:

CUDA: late 2013 MacBook Pro GPU: GeForce GT 750M 384 Cores

Installing CUDA on MacBook Pro


$ brew update
$ brew upgrade



$ id -g
20

$ sudo chown -R uki:20 *
Password:

$ brew link pcre

$ brew install coreutils swig
Warning: coreutils-8.26 already installed
==> Downloading https://homebrew.bintray.com/bottles/swig-3.0.11.sierra.bottle.tar.gz
######################################################################## 100.0%
==> Pouring swig-3.0.11.sierra.bottle.tar.gz
🍺 /usr/local/Cellar/swig/3.0.11: 754 files, 5.5M


$ brew cask install cuda
🍺 cuda was successfully installed!


$ brew cask info cuda

cuda: 8.0.55

https://developer.nvidia.com/cuda-zone


$ kextstat | grep -i cuda

... com.nvidia.CUDA (1.1.0) ..


$ cd /usr/local/cuda/samples/
$ sudo make -C 1_Utilities/deviceQuery




Makefile NsightEclipse.xml deviceQuery deviceQuery.cpp deviceQuery.o readme.txt


$ /usr/local/cuda/samples/1_Utilities/deviceQuery/deviceQuery



Device 0: "GeForce GT 750M"
CUDA Driver Version / Runtime Version 8.0 / 8.0
CUDA Capability Major/Minor version number: 3.0
Total amount of global memory: 2048 MBytes (2147024896 bytes)
( 2) Multiprocessors, (192) CUDA Cores/MP: 384 CUDA Cores
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GT 750M
Result = PASS


cuDNN Download

Register.

https://developer.nvidia.com/rdp/cudnn-download

Download file for OSX: cudnn-8.0-osx-x64-v5.1.tgz

and copy the file to your favorite place.

$ cd ~/Dropbox/dev/NVidia_CUDA/
$ tar zxvf cudnn-8.0-osx-x64-v5.1.tgz
$ cd cuda/include
$ sudo cp cudnn.h /usr/local/cuda/include/
$ cd ../lib/
$ sudo cp libcudnn* /usr/local/cuda/lib/

Add to your bash_profile

########## CUDA cuDNN ########## created: February 6, 2017
export DYLD_LIBRARY_PATH="/usr/local/cuda/lib":$DYLD_LIBRARY_PATH


$ brew cask install java
$ brew install bazel





(carnd-term1) uki@Uki-PEs-MacBookPro 16:46 tensorflow $ TF_UNOFFICIAL_SETTING=1 ./configure
Please specify the location of python. [Default is /Users/ukilucas/anaconda3/envs/carnd-term1/bin/python]: /Users/ukilucas/anaconda3/envs/carnd-term1/bin/python
Please specify optimization flags to use during compilation when bazel option "--config=opt" is specified [Default is -march=native]:
Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
No Google Cloud Platform support will be enabled for TensorFlow
Do you wish to build TensorFlow with Hadoop File System support? [y/N] N
No Hadoop File System support will be enabled for TensorFlow
Do you wish to build TensorFlow with the XLA just-in-time compiler (experimental)? [y/N] y
XLA JIT support will be enabled for TensorFlow
Found possible Python library paths:
/Users/ukilucas/anaconda3/envs/carnd-term1/lib/python3.5/site-packages
Please input the desired Python library path to use. Default is [/Users/ukilucas/anaconda3/envs/carnd-term1/lib/python3.5/site-packages]
Using python library path: /Users/ukilucas/anaconda3/envs/carnd-term1/lib/python3.5/site-packages
Do you wish to build TensorFlow with OpenCL support? [y/N] N
No OpenCL support will be enabled for TensorFlow
Do you wish to build TensorFlow with CUDA support? [y/N] Y
CUDA support will be enabled for TensorFlow
Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]:
Please specify the CUDA SDK version you want to use, e.g. 7.0. [Leave empty to use system default]:
Please specify the location where CUDA toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify the Cudnn version you want to use. [Leave empty to use system default]:
Please specify the location where cuDNN library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]:
Please specify a list of comma-separated Cuda compute capabilities you want to build with.
You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
Please note that each additional compute capability significantly increases your build time and binary size.
[Default is: "3.5,5.2"]: 3.0
INFO: Starting clean (this may take a while). Consider using --expunge_async if the clean takes more than several minutes.
............
INFO: All external dependencies fetched successfully.
Configuration finished





bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer
bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
# the wheel file name depends on the TensorFlow and Python versions you built
pip install /tmp/tensorflow_pkg/tensorflow-*.whl

Saturday, January 21, 2017

Managing (removing) Anaconda (conda) environments

I needed to remove some of the unused Anaconda (conda) environments:



$ conda info --envs
# conda environments:
#
CarND-TensorFlow-Lab /Users/ukilucas/anaconda3/envs/CarND-TensorFlow-Lab
IntroToTensorFlow /Users/ukilucas/anaconda3/envs/IntroToTensorFlow
py3 /Users/ukilucas/anaconda3/envs/py3
py35 /Users/ukilucas/anaconda3/envs/py35
root * /Users/ukilucas/anaconda3





To do so, I use the command:


$ conda env remove --name IntroToTensorFlow







$ conda info --envs
# conda environments:
#
py35 * /Users/ukilucas/anaconda3/envs/py35
root /Users/ukilucas/anaconda3




(py35) $ python --version

Python 3.5.2 :: Anaconda 4.2.0 (x86_64)
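When there are many environments, it can help to read the listing from a script. This is a minimal sketch that parses the text format shown above; parse_conda_envs is a name I made up, and the sketch assumes each line is "name [*] path" with no spaces inside the path.

```python
def parse_conda_envs(listing):
    """Parse `conda info --envs` text into (name, path, is_active) tuples."""
    envs = []
    for line in listing.splitlines():
        line = line.strip()
        # skip blank lines and the '#' header lines
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # the active environment has a '*' between its name and its path
        is_active = "*" in parts[1:-1]
        envs.append((parts[0], parts[-1], is_active))
    return envs

SAMPLE = """\
# conda environments:
#
py35 * /Users/ukilucas/anaconda3/envs/py35
root /Users/ukilucas/anaconda3
"""

for name, path, is_active in parse_conda_envs(SAMPLE):
    print(name, path, "(active)" if is_active else "")
```

The tuples make it easy to spot which environment is active before running "conda env remove" on the others.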


Updating Anaconda TensorFlow on Mac for use with jupyter notebook

I have been running TensorFlow, but it has been acting up and needed an update.




# Initializing the variables
# init = tf.initialize_all_variables() # older TF 0.11.0-py35_0 conda-forge
init = tf.global_variables_initializer() # newer TF 0.12.1-py35_1 conda-forge
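If the same notebook has to run under both the older and the newer TensorFlow, one option is to pick whichever initializer exists at run time. This is a minimal sketch; make_init_op is a hypothetical helper of mine, and the two SimpleNamespace objects are stand-ins so the example runs without TensorFlow installed.

```python
from types import SimpleNamespace

def make_init_op(tf_module):
    """Call whichever variable initializer the given TensorFlow module provides."""
    # the newer TF (0.12.1 above) exposes global_variables_initializer
    factory = getattr(tf_module, "global_variables_initializer", None)
    if factory is None:
        # the older TF (0.11.0 above) only has initialize_all_variables
        factory = tf_module.initialize_all_variables
    return factory()

# stand-ins for the two TensorFlow generations (not real TensorFlow modules)
old_tf = SimpleNamespace(initialize_all_variables=lambda: "old init op")
new_tf = SimpleNamespace(global_variables_initializer=lambda: "new init op",
                         initialize_all_variables=lambda: "old init op")

print(make_init_op(old_tf))  # prints old init op
print(make_init_op(new_tf))  # prints new init op
```

In a real notebook the call would be init = make_init_op(tf) after import tensorflow as tf.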



Check what environment you are running, especially if you switch often, or just restarted the computer.


$ conda info --envs
# conda environments:
#
CarND-TensorFlow-Lab /Users/ukilucas/anaconda3/envs/CarND-TensorFlow-Lab
IntroToTensorFlow /Users/ukilucas/anaconda3/envs/IntroToTensorFlow
py3 /Users/ukilucas/anaconda3/envs/py3
py35 /Users/ukilucas/anaconda3/envs/py35
root * /Users/ukilucas/anaconda3


I have been running the "nohup jupyter notebook &" command, and my Jupyter Notebook did not see my newest TensorFlow environments. I wanted to update my root environment anyway.

I execute the following:

$ conda install -c conda-forge tensorflow
Fetching package metadata .........
Solving package specifications: ..........
Package plan for installation in environment /Users/ukilucas/anaconda3:

The following packages will be UPDATED:

protobuf: 3.0.0-py35_0 conda-forge --> 3.1.0-py35_0 conda-forge
tensorflow: 0.11.0-py35_0 conda-forge --> 0.12.1-py35_1 conda-forge
Proceed ([y]/n)? y
Unlinking packages ...
[ COMPLETE ]|#####################| 100%
Linking packages ...
[ COMPLETE ]|#####################| 100%
$


Update:

I decided to install TensorFlow in a new environment because my root environment has Python 3.6:

$ python --version 
Python 3.6.0 :: Continuum Analytics, Inc.



$ conda create -n tensorflow python=3.5

Another option is to downgrade Python 3.6 to Python 3.5:

$ conda install python=3.5