Bioconductor Docker Images For Multi-Node Parallel Computing On The Cloud
Author(s): Nitesh Turaga
Affiliation(s): Dana Farber Cancer Institute
Bioconductor produces Docker images that are widely used because they containerize the system dependencies of all Bioconductor packages along with the community version of RStudio. With Kubernetes, a container orchestration system, these Docker images can now be deployed on a cluster and used for multi-node parallel computing. In this workshop, we introduce commands to launch such a cluster on a cloud provider (Google, Azure, AWS) and use a new BiocParallel back-end called 'RedisParam' to distribute jobs from the manager to the workers. This approach creates a traditional parallel computing framework on the cloud using the same containerized applications that can be tried out on local machines. The advantages of a cluster launched by Kubernetes are fault tolerance and the potential for auto-scaling.
Prerequisites: Some familiarity with BiocParallel and Bioconductor Docker images.
Orchestra
1. Go to Orchestra.
2. Log in.
3. Search for the workshop of interest.
4. Click "Launch" (may take a minute or two).
5. Follow instructions.