Robust, Rootless Data Science Environments with Conda: Compiling Code and Sharing Ports
Modern data science doesn't just mean Python notebooks and R scripts. Increasingly, researchers must compile C/C++ code or interface with custom network services. Yet, the places where our analyses run - shared HPC clusters, Docker containers, ephemeral cloud VMs - often lack admin rights (root), making traditional setups challenging.
In this guide, I'll show you how to build a portable data science environment using Conda that can:
- Compile native code (e.g., C/C++ via Clang/LLVM)
- Securely share TCP ports to or from the outside world - even from within restrictive containers
- Do it all without root privileges!
If you need to, say, compile and test C projects, or expose a local dashboard (on Jupyter, RStudio, or custom web servers) to colleagues, this post is for you.
Step 1: Conda - Your Isolated, Rootless Playground
First, create and activate a new Conda environment. See Managing Development Environments with Conda for detailed background.
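As a minimal sketch of that step, assuming conda is already on your PATH: the helper name create_env and the environment name ds-rootless are hypothetical, and pulling python from conda-forge is my assumption rather than a requirement.

```shell
# Hypothetical helper: create a named environment from conda-forge.
# "ds-rootless" in the usage note is only an example name.
create_env() {
    env_name="$1"
    if [ -z "$env_name" ]; then
        echo "usage: create_env NAME" >&2
        return 2
    fi
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH" >&2
        return 1
    fi
    # -y skips the confirmation prompt; the environment is created inside
    # conda's own directory tree, so no root privileges are involved.
    conda create -y -n "$env_name" -c conda-forge python
}

# Usage (run these yourself in an interactive shell):
#   create_env ds-rootless
#   conda activate ds-rootless
```

Once the environment is active, $CONDA_PREFIX points at its root directory, which the build step later in this guide relies on.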
Step 2: Install OpenSSH, Build Tools, and Autossh
We'll need OpenSSH to manage tunnels, and autossh for resilient, auto-reconnecting port forwarding.
Install openssh (no sudo needed if within Conda!):
conda install -c conda-forge openssh
Install build tools: see Compiling C++ Code in a Conda Environment Using Conda-managed Headers and Libraries.
Compile and install autossh into your Conda environment:
# Clone autossh, compile, and install
git clone https://github.com/Autossh/autossh.git
cd autossh
./configure --prefix="$CONDA_PREFIX"
make
make install
cd ..
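Because the build above depends on $CONDA_PREFIX being set, a quick check afterwards can save a confusing failure later. The following is a sketch only: installed_in_prefix is a hypothetical helper, and it assumes make install placed the binary under the prefix's bin directory.

```shell
# Hypothetical helper: succeed only if TOOL exists as an executable
# under $CONDA_PREFIX/bin, where `make install` should have put it.
installed_in_prefix() {
    tool="$1"
    if [ -z "${CONDA_PREFIX:-}" ]; then
        echo "CONDA_PREFIX is empty; activate the environment first" >&2
        return 1
    fi
    [ -x "$CONDA_PREFIX/bin/$tool" ]
}

# Usage:
#   installed_in_prefix autossh && echo "autossh is in the environment"
```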
Step 3: Enable Hassle-Free SSH Tunneling with Push/Pull Scripts
When working in rootless containers or locked-down servers, SSH's port forwarding is your lifeline for exposing internal dashboards or accessing internal-only services.
But SSH's terminology (local/remote forwarding) is confusing. Enter push-pull-port: a pair of dead-simple POSIX shell scripts that make sharing TCP ports human-friendly.
Clone the repo:
git clone https://github.com/jifengwu2k/push-pull-port.git
It provides two scripts:
- push-local-port.sh: Expose a local port to the remote (even to the outside world) - uses SSH -R.
- pull-remote-port.sh: Access a remote port securely on your local machine - uses SSH -L.
Examples:
Expose a local Jupyter notebook (port 8888) on the remote server's port 9000:
sh push-pull-port/push-local-port.sh -l 8888 -r 9000 -u username -h remote.example.com
Access a remote Postgres DB (remote port 5432) locally on port 15432:
sh push-pull-port/pull-remote-port.sh -r 5432 -u username -h remote.example.com -l 15432
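For orientation, here is roughly what the two examples map to in plain ssh terms. This does not reproduce the scripts' actual internals, which I have not verified. The helper names and the choice of localhost as the forwarding target are assumptions for illustration. The functions only print a command, so you can inspect it before running anything.

```shell
# Print (do not run) a plain ssh remote-forward command:
# a port on the remote host leads back to a local port ("push").
print_push_cmd() {
    local_port="$1"; remote_port="$2"; user="$3"; host="$4"
    # -N: no remote command, forwarding only. -R: remote forward.
    echo "ssh -N -R ${remote_port}:localhost:${local_port} ${user}@${host}"
}

# Print a plain ssh local-forward command:
# a port on this machine leads to a port on the remote side ("pull").
print_pull_cmd() {
    remote_port="$1"; local_port="$2"; user="$3"; host="$4"
    # -L: local forward; here "localhost" is resolved on the remote host.
    echo "ssh -N -L ${local_port}:localhost:${remote_port} ${user}@${host}"
}

print_push_cmd 8888 9000 username remote.example.com
print_pull_cmd 5432 15432 username remote.example.com
```

Reading the printed -R and -L arguments side by side makes the direction of each forward easier to keep straight than the terms "remote" and "local" alone.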
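You'll want SSH keys set up (see ssh-keygen and ssh-copy-id). If you're pushing ports for world access, the remote server also needs GatewayPorts yes in its /etc/ssh/sshd_config; editing that file is the one step here that requires an administrator on the remote side.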
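If the remote sshd_config happens to be readable to you, a small check like the one below can tell you what GatewayPorts is set to before you try a world-facing push. It is a simplified sketch: gateway_ports_value is a hypothetical helper, and it ignores Include files and Match blocks, so treat its answer as a hint and not the server's effective configuration.

```shell
# Hypothetical helper: print the first GatewayPorts value found in an
# sshd_config-style file, or "no" (the OpenSSH default) if none is set.
# Simplification: Include files and Match blocks are not handled.
gateway_ports_value() {
    config_file="${1:-/etc/ssh/sshd_config}"
    awk '
        $1 ~ /^#/ { next }                       # skip comment lines
        tolower($1) == "gatewayports" {          # keywords are case-insensitive
            print tolower($2); found = 1; exit   # first occurrence wins
        }
        END { if (!found) print "no" }
    ' "$config_file"
}

# Usage:
#   gateway_ports_value /etc/ssh/sshd_config
```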
Why This Is Awesome
- Never need root: Conda installs everything - OpenSSH, compilers, autossh, libraries - without polluting or breaking system Python/C toolchains.
- Reproducible & Movable: Your conda env export > env.yml snapshot can be rebuilt identically anywhere - laptop, cluster, or container.
- Debug with confidence: Push/pull scripts run in the foreground by default, so errors don't get lost (plus: no silent background failures).
- Cloud/container ready: Expose dev servers running in Docker, on localhost, or anywhere else - even behind NATs or with non-standard ports.
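The snapshot-and-rebuild workflow from the reproducibility point can be sketched as follows. The helper name snapshot_env is hypothetical; the conda subcommands are the standard ones.

```shell
# Hypothetical helper: write the active environment's package list to a file.
snapshot_env() {
    out_file="${1:-env.yml}"
    conda env export > "$out_file"
}

# Usage:
#   snapshot_env env.yml             # on the source machine
#   conda env create -f env.yml      # on the target machine
# Note: anything built from source into the prefix (such as autossh above)
# is not recorded by `conda env export` and has to be rebuilt separately.
```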