How to set up a CUDA GPU for computing only?

Most motherboards today come with integrated graphics. Once a CUDA device is plugged in, X11 recognizes it as a new VGA device, configures it for display, and allocates GPU memory on it.

Below is an example nvidia-smi output showing that Xorg (X11) and the GNOME environment are using a large chunk of the GPU memory.

| NVIDIA-SMI 470.182.03   Driver Version: 470.182.03   CUDA Version: 11.4     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce …  Off  | 00000000:01:00.0  On |                  N/A |
|  0%   48C    P8    19W / 180W |   1201MiB /  8110MiB |     13%      Default |
|                               |                      |                  N/A |
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|    0   N/A  N/A      1738      G   /usr/lib/xorg/Xorg                246MiB |
|    0   N/A  N/A      2033      G   /usr/bin/gnome-shell                7MiB |
|    0   N/A  N/A      2812      G   /usr/lib/xorg/Xorg                514MiB |
|    0   N/A  N/A      2942      G   /usr/bin/gnome-shell              169MiB |
|    0   N/A  N/A    321543      G   …NiYQ%3D%3D&browser=chrome       36MiB |
|    0   N/A  N/A    321587      G   …/debug.log –shared-files        5MiB |
|    0   N/A  N/A    458830      G   …390373760644109118,131072      115MiB |
|    0   N/A  N/A    458872      G   …veSuggestionsOnlyOnDemand       78MiB |

If you want to use the CUDA device for computing only, it is better to free up that GPU memory. Here is one way to do it.

First, find the PCI address of the device you want X11 to use (here, the integrated graphics). You can change “Display” to “VGA” to list all the devices that can drive the monitor.

$ lspci |grep Display
00:02.0 Display controller: Intel Corporation UHD Graphics 630 (Desktop)

Then edit or create /etc/X11/xorg.conf. Note: “sudo” privilege is required here.

Section "Device"
    Identifier      "intel"
    Driver          "intel"
    BusId           "PCI:0:2:0"
EndSection

Section "Screen"
    Identifier      "intel"
    Device          "intel"
EndSection
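
The BusId value corresponds to the address printed by lspci, with one catch: lspci prints the bus, device, and function numbers in hexadecimal, while xorg.conf expects decimal. For 00:02.0 the two agree, but they differ for addresses such as 0a:00.0. The following is a minimal sketch of the conversion; the function name lspci_to_busid is hypothetical and only for illustration.

```python
def lspci_to_busid(address: str) -> str:
    """Convert an lspci address such as '00:02.0' to an xorg.conf BusId string."""
    # Keep only the last two colon-separated parts, so that an optional
    # PCI domain prefix ('0000:01:00.0') is ignored.
    parts = address.split(":")
    bus, dev_func = parts[-2], parts[-1]
    device, function = dev_func.split(".")
    # lspci prints hexadecimal; xorg.conf expects decimal numbers.
    return f"PCI:{int(bus, 16)}:{int(device, 16)}:{int(function, 16)}"


print(lspci_to_busid("00:02.0"))  # PCI:0:2:0
```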

Restart the X11 environment or reboot, and you will see that GPU memory is now freed up and can all be used for CUDA computing.

~$ nvidia-smi
Mon May 1 11:03:18 2023
| NVIDIA-SMI 470.182.03   Driver Version: 470.182.03   CUDA Version: 11.4     |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  NVIDIA GeForce …  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   55C    P8    10W / 180W |      2MiB /  8119MiB |      0%      Default |
|                               |                      |                  N/A |
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|  No running processes found                                                 |
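
To check the result from a script rather than by reading the table, nvidia-smi can also print selected fields as CSV. The sketch below assumes nvidia-smi is on the PATH; the helper names parse_memory and gpu_memory are made up for this example.

```python
import subprocess

# Ask nvidia-smi for used and total memory only, as plain CSV numbers (MiB).
QUERY = ["nvidia-smi",
         "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"]


def parse_memory(csv_text):
    """Return a list of (used_MiB, total_MiB) tuples, one per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(value) for value in line.split(","))
        rows.append((used, total))
    return rows


def gpu_memory():
    """Run nvidia-smi and parse its output (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory(result.stdout)

# Usage on a machine with a GPU:
#   for used, total in gpu_memory():
#       print(f"{used} MiB used of {total} MiB")
```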


Using Python Scripts to Control WiFi Router

Controlling the network access of a specific device through a router's web interface is a pain. Thanks to Mathieu Velten, there is pynetgear, a Python package he developed by reverse engineering the Netgear Genie app, which makes it possible to script a Netgear router.

I wrote the following code utilizing the pynetgear package to turn on (Allow) and off (Block) network access for a specific device to my router.

#!/usr/bin/env python
import sys
import time
import click
from pynetgear import Netgear


class NetgearManager:
    """Netgear Router Manager"""

    def __init__(self, pwd):
        self.session = Netgear(password=pwd)

    def search_connected_device_by_name(self, name):
        """Return all attached devices with the given name (possibly an empty list)."""
        found = []
        for device in self.session.get_attached_devices() or []:
            if device.name == name:
                found.append(device)
        return found

    def get_device_status_by_name(self, name):
        devices = self.search_connected_device_by_name(name)
        if len(devices) == 1:
            return devices[0].allow_or_block
        if len(devices) > 1:
            print(f'More than one device has the name {name}', file=sys.stderr)
            for d in devices:
                print(d.ip, d.type, d.mac, d.allow_or_block, file=sys.stderr)
        return False

    def set_device_status_by_name(self, name, status_code, delay):
        assert status_code in ['Allow', 'Block']
        devices = self.search_connected_device_by_name(name)
        if len(devices) == 1:
            mac_address = devices[0].mac
            return self.set_device_status_by_mac(mac_address, status_code, delay)
        elif len(devices) > 1:
            print(f'More than one device has the name {name}', file=sys.stderr)
            for d in devices:
                print(d.ip, d.type, d.mac, d.allow_or_block, file=sys.stderr)
        return False

    def set_device_status_by_mac(self, mac, status_code, delay):
        if delay > 0:
            time.sleep(delay)   # wait before switching the status
        return self.session.allow_block_device(mac, device_status=status_code)


@click.group()
def cli():
    """Allow or block devices on a Netgear router."""


@cli.command()
@click.option('--password', prompt=True, hide_input=True, confirmation_prompt=False, help='Netgear router admin password')
@click.option('--name', help='Device name')
@click.option('--mac', default=None, help='Device MAC address')
@click.option('--status', help='Status: Allow or Block')
@click.option('--delay', default=0, help='Time delay in minutes')
def switch(password, name, mac, status, delay):
    manager = NetgearManager(password)
    delay *= 60                 # minutes to seconds
    if mac:
        result = manager.set_device_status_by_mac(mac, status, delay)
    else:
        result = manager.set_device_status_by_name(name, status, delay)
    if not result:
        msg = f'Cannot switch device {name} status to {status}'
        print(msg, file=sys.stderr)


@cli.command()
@click.option('--password', prompt=True, hide_input=True, confirmation_prompt=False, help='Netgear router admin password')
def show(password):
    manager = NetgearManager(password)
    for idx, dev in enumerate(manager.session.get_attached_devices() or []):
        print(idx, dev.name, dev.ip, dev.type, dev.mac, dev.allow_or_block)


if __name__ == '__main__':
    cli()


Containerized Jupyter Lab Authentication for Emacs-IPython-Notebook

In my earlier post, I mentioned how to run Apache Spark in a container with Jupyter notebook. While I like to use Jupyter notebooks to document my work and share it with my colleagues, I still prefer to write code in Emacs rather than in a web browser. The answer is EIN (Emacs IPython Notebook). Installing EIN is straightforward. The tricky part is making it work with the containerized Jupyter.

Once you are in EIN, use the command “ein:login” to connect to Jupyter. Internally the connection goes through a websocket, and if you just enter your password as you would in a browser, EIN throws an expired-websocket error and you cannot execute any code or create a new notebook. One fix for this problem is to use the Jupyter token as the password. The token can be found with the command below:

host$ docker exec spark-jupyter jupyter server list
Currently running servers:
http://1e2480237424:8888/?token=4899803e93c22739a8de56fa4deed22aef6568f93025c901 :: /home/jovyan
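
If you need the token in a script, it can be pulled out of that listing with the standard library. This is only a sketch that assumes the output format shown above (URL, then “::”, then the working directory); the function name token_from_server_list is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs


def token_from_server_list(output):
    """Return the first token found in 'jupyter server list' output, or None."""
    for line in output.splitlines():
        if "token=" not in line:
            continue
        # Each server line looks like: <url> :: <working directory>
        url = line.split("::")[0].strip()
        tokens = parse_qs(urlsplit(url).query).get("token")
        if tokens:
            return tokens[0]
    return None


listing = """Currently running servers:
http://1e2480237424:8888/?token=4899803e93c22739a8de56fa4deed22aef6568f93025c901 :: /home/jovyan
"""
print(token_from_server_list(listing))
```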

Jupyter can use either a token or a password for user authentication, depending on how it is configured. By default, a new token is generated every time the server starts, which is more secure but also harder to use. In a secured environment, we can fix the token by defining the JUPYTER_TOKEN environment variable (first line below) and passing it to the Docker container when starting Jupyter (second line):

export JUPYTER_TOKEN="4899803e93c22739a8de56fa4deed22aef6568f93025c901"

host$ docker run -d -p 8888:8888 -p 4040:4040 -p 4041:4041 -e JUPYTER_TOKEN=$JUPYTER_TOKEN -v $PWD:/home/jovyan --name spark-jupyter jupyter/all-spark-notebook
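
Rather than reusing a token copied from an earlier session, you may prefer to generate your own fixed token. A minimal sketch with Python's secrets module follows; the 24-byte length is my assumption, chosen only so that the result has the same 48 hex characters as the example token above.

```python
import secrets


def make_token(n_bytes=24):
    """Return a random hex string suitable for use as JUPYTER_TOKEN."""
    # token_hex(n) returns 2 * n hexadecimal characters.
    return secrets.token_hex(n_bytes)


print(f'export JUPYTER_TOKEN="{make_token()}"')
```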

A better configuration is to run the “jupyter server password” command in the container (the password command was introduced in Jupyter Notebook 5.0), which saves a hashed password in jupyter_server_config.json under the Jupyter configuration directory (/home/jovyan/.jupyter by default). You then only need to enter that password when logging in. Although you can also disable both the password and the token, I recommend against it.

To enable inline image display in EIN/Emacs, just put this line in your .emacs file:

(setq ein:output-area-inlined-images t)

How to Run Apache Spark from a Container

Apache Spark is a large-scale data analytics engine that can utilize distributed computing resources. It supports common data science languages, e.g. Python and R. Its support for Python is provided through the PySpark package. Some advantages of using PySpark over traditional vanilla Python (NumPy and pandas) are:

  • Speed. Spark distributes work across multiple computers, so you don’t have to write your own parallel computing code. People have claimed speed gains of up to 100x with Spark.
  • Scale. You can develop code on a laptop and deploy it on a cluster to process data at scale.
  • Robustness. A job does not fail if some nodes are taken offline during execution.

Other features, like SparkSQL, Spark ML, and support for data streaming sources bring additional advantages.

After a quick tryout of the Spark container image from Bitnami, I moved on to another image, released by the Jupyter Docker Stacks project, which has good documentation. To run the container, expose the Jupyter notebook, and share the current host directory with the container, use this command:

docker run -d -p 80:8888 -p 4040:4040 -p 4041:4041 -v ${PWD}:/home/jovyan jupyter/all-spark-notebook

If you need to install additional packages into the provided container image, you can do so either by going inside the container (“docker exec -it spark /bin/bash”, where “spark” is the name of your container) or by modifying the original docker-compose.yml file.


Extract Hidden URLs from PDF files

To search for URLs in a PDF file, one can use the built-in search function in PDF readers and look for strings like “http.” However, when the URL is hidden, a simple string search will not work in the reader. A more reliable way to get all the URLs is to use the “pdftotext” program to convert the PDF file to text format, then use the “grep” command to catch all the URLs. The “-raw” option helps to keep the text in its original order. The example command below converts the PDF file to text, identifies the generated txt file as the most recently modified file in the directory, greps it for “http”, and then deletes the txt file.

pdftotext -raw ml.pdf && file=$(ls -tr | tail -1); grep http "$file"; rm "$file"
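
The same idea can be sketched in Python without a temporary file, since pdftotext writes to standard output when the output file name is “-”. The regular expression below is a deliberately simple assumption about what counts as a URL, and the function names are made up for this example.

```python
import re
import subprocess

# Simplistic pattern: http(s):// followed by anything up to whitespace,
# a quote, an angle bracket, or a closing parenthesis.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


def extract_urls(text):
    """Return the unique URLs in text, in order of first appearance."""
    urls = []
    for url in URL_PATTERN.findall(text):
        url = url.rstrip(".,;")  # drop trailing sentence punctuation
        if url not in urls:
            urls.append(url)
    return urls


def urls_in_pdf(pdf_path):
    """Convert a PDF to text with pdftotext and extract its URLs."""
    result = subprocess.run(["pdftotext", "-raw", pdf_path, "-"],
                            capture_output=True, text=True, check=True)
    return extract_urls(result.stdout)

# Usage (requires pdftotext to be installed):
#   for url in urls_in_pdf("ml.pdf"):
#       print(url)
```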


Query SRA Sequence Runs with Python

Retrieving data from SRA is a common task. NCBI has provided a nice tool collection named E-utilities to query and retrieve data from it. The example Python snippet below shows how to query NCBI SRA database using sample identifiers and get a table of linked NCBI BioProject, BioSample, Run, Download location and Size.

import sys
import subprocess
import shlex
import pandas as pd


def get_SRR_from_biosamples(csv: str, batch_size=10, debug=True):
    """Get SRA run IDs from BioSample IDs (one BioSample ID per line in csv)."""
    epost_cmd = 'epost -db biosample -format acc'
    elink_cmd = 'elink -target sra'
    efetch_cmd = 'efetch -db sra -format runinfo -mode xml'
    xtract_cmd = ('xtract -pattern Row -def "NA" -element BioProject '
                  'BioSample Run download_path size_MB')
    sample_ids = []
    results = []
    processed = 0
    with open(csv, 'r') as fh:
        total_samples = fh.readlines()
    print(f'Total samples: {len(total_samples)}')
    for idx, l in enumerate(total_samples):
        batch_num = idx // batch_size + 1
        if debug and batch_num > 1:
            print('Debug mode. Stop execution after 1 batch.')
            break
        sample_ids.append(l.rstrip())
        processed += 1
        if (idx + 1) % batch_size == 0 or idx == len(total_samples) - 1:
            print(f'Processing batch {batch_num}: {sample_ids}')
            batch_results = []
            id_list = ','.join(sample_ids)
            # Chain the tools like a shell pipeline: epost | elink | efetch | xtract
            epost = subprocess.Popen(shlex.split(f'{epost_cmd} -id "{id_list}"'),
                                     stdout=subprocess.PIPE)
            elink = subprocess.Popen(shlex.split(elink_cmd),
                                     stdin=epost.stdout, stdout=subprocess.PIPE)
            efetch = subprocess.Popen(shlex.split(efetch_cmd),
                                      stdin=elink.stdout, stdout=subprocess.PIPE)
            xtract = subprocess.Popen(shlex.split(xtract_cmd),
                                      stdin=efetch.stdout, stdout=subprocess.PIPE,
                                      text=True)
            for row in xtract.stdout:
                if not row.startswith('PRJ'):  # "502 Bad Gateway" when server is busy
                    sys.stderr.write(f'Error processing {id_list}: {row}')
                    continue
                batch_results.append(row.rstrip('\n').split('\t'))
            xtract.wait()
            results.extend(batch_results)
            print(f'\nTotal SRA Runs in batch {batch_num}: {len(batch_results)}.\n')
            sample_ids = []
    print(f'Total runs in collection: {len(results)} with {processed} samples.')
    data = pd.DataFrame(results, columns=['BioProject', 'BioSample', 'Run', 'Download', 'size_MB'])
    return data

The following E-utilities tools are used and need to be accessible from the environment: epost, elink, efetch, and xtract. The subprocess module in Python chains these steps together, similar to Linux pipes. The samples are queried in batches to avoid sending requests to NCBI too frequently, which could get your future queries blocked. After receiving the run identifiers, one can use the prefetch tool from the SRA Toolkit to download the files. And, of course, prefetch can be wrapped and chained together as well.
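
The batching itself is independent of NCBI and can be sketched on its own. The helper name batches and the sample identifiers below are made up for illustration.

```python
def batches(items, batch_size=10):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical BioSample identifiers, for illustration only.
sample_ids = [f"SAMN{n:08d}" for n in range(1, 26)]
for number, batch in enumerate(batches(sample_ids), start=1):
    # In the real function, one epost/elink/efetch/xtract pipeline would run
    # here per batch; a time.sleep() between batches lowers the request rate.
    print(f"batch {number}: {len(batch)} samples")
```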


PyTorch Geometric and CUDA

PyTorch Geometric (PyG) is an add-on library for developing graph neural networks using Python. It supports CUDA but you’ll have to make sure to install it correctly. Below is one error message I got after installing PyG:

from torch_geometric.data import Data
OSError                                   Traceback (most recent call last)

OSError: /anaconda3/lib/python3.7/site-packages/torch_sparse/ undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSs

This error is related to the CUDA version, so I checked it:

print(torch.version.cuda, torch.__version__)
10.2 1.9.0

Running nvidia-smi reported CUDA version 11.2, so my system had somehow ended up with mixed versions of CUDA. To fix the mess and get PyG working, I did the following:

$ pip uninstall torch
$ pip install torch===1.9.1+cu111 -f
$ pip install torch-scatter -f
$ pip install torch-sparse -f
$ pip install torch-geometric
$ apt-get install nvidia-modprobe

Note that there is no existing wheel built with CUDA 11.2 (cu112), so I used the closest version (cu111). Now PyG works! Installing “nvidia-modprobe” fixed the error “RuntimeError: CUDA unknown error - this may be due to an incorrectly set up environment, e.g. changing env variable CUDA_VISIBLE_DEVICES after program start. Setting the available devices to be zero,” which I got when two Python sessions were running and both were trying to use CUDA.

Update from some other testing regarding these errors:

RuntimeError: Detected that PyTorch and torch_cluster were compiled with different CUDA versions. PyTorch has CUDA version 11.1 and torch_cluster has CUDA version 10.2. Please reinstall the torch_cluster that matches your PyTorch install.

RuntimeError: Detected that PyTorch and torch_spline_conv were compiled with different CUDA versions. PyTorch has CUDA version 11.1 and torch_spline_conv has CUDA version 10.2. Please reinstall the torch_spline_conv that matches your PyTorch install.

The following commands fixed it:

$ pip install --upgrade pip
$ CUDA=cu111
$ TORCH=1.9.1
$ pip install torch-cluster==1.5.9 -f${TORCH}+${CUDA}.html
$ pip install torch-spline-conv -f${TORCH}+${CUDA}.html
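
The ${TORCH}+${CUDA} suffix used above can be derived from the installed PyTorch instead of being typed by hand. This is a small sketch; the helper names cuda_tag and wheel_suffix are my own, and it relies on torch.version.cuda being a string such as "11.1" (or None for CPU-only builds).

```python
def cuda_tag(cuda_version):
    """Turn '11.1' into 'cu111'; return 'cpu' when PyTorch has no CUDA support."""
    if cuda_version is None:
        return "cpu"
    return "cu" + cuda_version.replace(".", "")


def wheel_suffix(torch_version, cuda_version):
    """Build the '<torch version>+<cuda tag>' suffix, e.g. '1.9.1+cu111'."""
    # torch.__version__ may already carry a local part such as '+cu111'.
    base = torch_version.split("+")[0]
    return f"{base}+{cuda_tag(cuda_version)}"


print(wheel_suffix("1.9.1+cu111", "11.1"))  # 1.9.1+cu111

# With PyTorch installed:
#   import torch
#   print(wheel_suffix(torch.__version__, torch.version.cuda))
```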


Update R and Bioconductor

It’s not a straightforward thing to update R and Bioconductor (a bioinformatics package collection for R), especially if you used the R package that comes with your Linux distribution. For example, the R package from Ubuntu package repository is still 3.5, while the latest version from the r-project site is already 4.0.3. While using an older version of R itself for statistics may not be a problem, many of the Bioconductor packages do have updates and bug fixes that require R v4. And to make things worse, updating Bioconductor itself is a painful process. Below is what I did to update R, Bioconductor, and associated packages. The main lesson learned here is not to use the R package from Ubuntu repository.

  1. Remove R installed from the Ubuntu repository;
  2. Install R from r-project by adding it as an apt repository;
  3. Update R packages (system-wide);
  4. Update R packages installed in the user directory;
  5. Update BiocManager.

sudo apt-get purge r-base* r-recommended r-cran-*
sudo apt autoremove
sudo apt update
sudo add-apt-repository 'deb focal-cran40/'
sudo apt-key adv --keyserver --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo apt update
sudo apt install r-base r-base-core r-recommended r-base-dev
update.packages(ask = FALSE, checkBuilt = TRUE)
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.12")

The R code section may need to be run twice, once as a sudo user and once as a normal user, if you also have packages installed under your user account (that is to say, not a system-wide installation by the system admin). The last command tells BiocManager (the Bioconductor package manager) to install Bioconductor version 3.12 and update all installed Bioconductor packages.


Git with Python

There are times when you want to do source code management programmatically, and this is where GitPython comes in. GitPython wraps the git commands so that you can execute most git functions from within Python (without using shutil or subprocess). The example code snippet below shows how to do a “git clone”; if the destination already exists, it tries a “git checkout” of the “master” branch first and falls back to the “main” branch if master does not exist.

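import sys
import git
from git.exc import GitCommandError

# remote_repo (URL) and local_repo (path) are defined by the caller
try:
    git.Repo.clone_from(remote_repo, local_repo)
except GitCommandError as error:
    if error.status == 128:          # destination already exists
        repo = git.Repo(local_repo)
        branch = 'master'
        try:
            repo.git.checkout(branch)
        except GitCommandError:
            branch = 'main'
            repo.git.checkout(branch)
    else:
        msg = ' '.join(error.command)
        msg += error.stderr
        sys.exit(f'Error in running git: {msg}.')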

Signing Git Commit using GPG

I have been enjoying using Magit in Emacs for all my git-related work, and I ran into an error when tagging a release. The error message was:

git … tag --annotate --sign -m my_msg
error: gpg failed to sign the data
error: unable to sign the tag

This turned out to be because I had not set up a GPG key for signing. Below is how I fixed the problem; from now on, all my git commits will be signed.

$ gpg --gen-key

The command prompts for a few things, e.g. your name, e-mail address, and a passphrase. It is recommended that you type random keys after answering these questions so that gpg has more entropy available when it generates random numbers. In the end, you will see some text with a line like this:

gpg: key 404NOTMYREALKEYID marked as ultimately trusted

This string “404NOTMYREALKEYID” is the key id. The same key id also shows up in the output of the following command:

$ gpg --list-secret-keys --keyid-format LONG
sec   rsa3072/404NOTMYREALKEYID ......

Finally, register this key id with git, and the problem is solved. So the problem was not in Magit but in my configuration: Magit uses the “--sign” option when it calls Git, which is actually good practice.

$ git config --global commit.gpgsign true
$ git config --global user.signingkey 404NOTMYREALKEYID
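
If you want to script this last step, the key id can be read from the gpg listing with a regular expression. The sketch below assumes the “sec” line format shown above; the function name secret_key_ids is hypothetical, and the pattern accepts any alphanumeric id so that it also matches the placeholder used in this post.

```python
import re

# Matches lines like: "sec   rsa3072/404NOTMYREALKEYID ......"
SEC_LINE = re.compile(r"^sec\s+[^/\s]+/([0-9A-Za-z]+)", re.MULTILINE)


def secret_key_ids(listing):
    """Return the key ids found on the 'sec' lines of a gpg key listing."""
    return SEC_LINE.findall(listing)


listing = "sec   rsa3072/404NOTMYREALKEYID ......"
print(secret_key_ids(listing))

# To read the real listing (requires gpg to be installed):
#   import subprocess
#   out = subprocess.run(["gpg", "--list-secret-keys", "--keyid-format", "LONG"],
#                        capture_output=True, text=True, check=True).stdout
#   print(secret_key_ids(out))
```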