Why can't I see progress logs in my notebook during parallel runs?

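Jupyter typically suppresses stdout from worker threads, so progress messages printed inside a worker may never appear under the running cell. This skill uses the 'as_completed' pattern from concurrent.futures: workers return their results to the main thread, which prints progress as each task finishes, so logs stay visible in the notebook interface.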
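
As a rough illustration of that pattern, the sketch below has each worker return a message instead of printing it, and leaves all printing to the main thread. The names process_item and run_with_progress are hypothetical and not taken from the skill; the sleep call stands in for real work.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def process_item(item):
    """Hypothetical worker: returns a log message instead of printing it."""
    time.sleep(0.01)  # stand-in for real work
    return item, f"item {item} done"


def run_with_progress(items, max_workers=4):
    """Run process_item over items and report progress from the main thread."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(process_item, item) for item in items]
        # as_completed is iterated here, in the calling (main) thread, so the
        # print below is attached to the notebook cell that started the run.
        for done, future in enumerate(as_completed(futures), start=1):
            item, message = future.result()
            results[item] = message
            print(f"[{done}/{len(futures)}] {message}", flush=True)
    return results
```

Calling run_with_progress(range(5)) prints one progress line per finished task. Tasks complete in whatever order the workers finish, so the lines are not necessarily in submission order.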

Why does my Jupyter kernel crash during parallel GPU tasks?

Crashes are often caused by GPU thread explosion, where multiple workers attempt to use the same GPU simultaneously and exhaust its memory, leading to out-of-memory (OOM) errors. This skill uses a queue to ensure tasks are mapped 1:1 to available GPUs: a task must take a GPU from the queue before it runs and returns it when it finishes.
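
A minimal sketch of queue-based allocation, assuming a thread pool and a caller-supplied work function. The names run_on_gpus and work_fn are hypothetical, and the GPU IDs are treated as opaque tokens here, so the example runs without any GPU present.

```python
import queue
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_on_gpus(tasks, gpu_ids, work_fn):
    """Run work_fn(task, gpu_id) for every task, one task per GPU at a time."""
    gpu_queue = queue.Queue()
    for gpu_id in gpu_ids:
        gpu_queue.put(gpu_id)

    def worker(task):
        gpu_id = gpu_queue.get()  # blocks until some GPU is free
        try:
            return task, gpu_id, work_fn(task, gpu_id)
        finally:
            gpu_queue.put(gpu_id)  # always hand the GPU back, even on error

    results = []
    # One worker thread per GPU, so waiting tasks sit in the executor's
    # backlog rather than in extra threads.
    with ThreadPoolExecutor(max_workers=len(gpu_ids)) as executor:
        futures = [executor.submit(worker, task) for task in tasks]
        for future in as_completed(futures):
            results.append(future.result())
    return results
```

Because each GPU ID exists exactly once in the queue, two tasks can never hold the same ID at the same time. In a real pipeline work_fn would select the device itself, for example with CuPy's cp.cuda.Device(gpu_id) context manager.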

How does this skill handle memory leaks between tasks?

It implements a verified cleanup pattern that clears CuPy's default and pinned memory pools and triggers Python's garbage collector immediately after a task completes, before the GPU is released for the next task.
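
The cleanup step could look roughly like the following. This is a sketch rather than the skill's own code: cleanup_gpu_memory and run_task_with_cleanup are hypothetical names, and the CuPy import is made optional so the example also runs where CuPy is not installed. Running the garbage collector before freeing the pools is a choice made here, so that unreachable arrays return their blocks first.

```python
import gc

try:
    import cupy as cp  # optional: without CuPy the sketch only runs gc
except ImportError:
    cp = None


def cleanup_gpu_memory():
    """Collect garbage, then free CuPy's cached device and pinned memory."""
    # Collect first so unreachable arrays return their blocks to the pools.
    collected = gc.collect()
    if cp is not None:
        cp.get_default_memory_pool().free_all_blocks()
        cp.get_default_pinned_memory_pool().free_all_blocks()
    return collected


def run_task_with_cleanup(task_fn, *args):
    """Run a task and clean up afterwards, whether it succeeds or fails."""
    try:
        return task_fn(*args)
    finally:
        cleanup_gpu_memory()
```

Memory still referenced by a live array is not freed, so a task should return host-side results (for example via cp.asnumpy) rather than CuPy arrays. In a queue-based scheduler, the cleanup call belongs before the GPU ID is put back on the queue.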

Can I use this for multi-GPU setups?

Yes, it is specifically designed for multi-GPU environments, allowing you to define a pool of GPU IDs and dynamically allocate them to a parallel task queue.

GPU Parallel Scheduling & Memory Safety

Name: GPU Parallel Scheduling & Memory Safety
Author: smith6jt-cop


Data Science & ML

Implements queue-based GPU allocation and memory cleanup patterns to prevent OOM crashes and ensure reliable progress tracking in parallel workflows.

About

This skill provides specialized patterns for managing multi-GPU parallel processing, specifically optimized for high-performance computing (HPC) environments using CuPy and Jupyter notebooks. It addresses critical issues like GPU thread explosion, nested executor crashes, and Jupyter's suppression of worker thread output by enforcing a strict 'one task per GPU' rule through dynamic queue-based allocation. By integrating automated memory cleanup and main-thread progress reporting, it ensures stable execution of complex data pipelines without the common pitfalls of cumulative memory pressure or silent kernel failures.
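
The source does not spell out how nested executors are avoided. One common approach, shown here purely as an assumption, is to flatten the nested loops into a single task list and submit it to one executor, so the thread count is capped by max_workers instead of multiplying across levels. The names process_channel and run_flat are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from itertools import product


def process_channel(image, channel):
    """Hypothetical unit of work for one (image, channel) pair."""
    return f"{image}:{channel}"


def run_flat(images, channels, max_workers):
    """Process every (image, channel) pair through a single executor."""
    # Flatten the two loops up front instead of starting an inner executor
    # inside each outer worker.
    pairs = list(product(images, channels))
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {
            executor.submit(process_channel, image, channel): (image, channel)
            for image, channel in pairs
        }
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return results
```

With an outer pool of N workers that each open an inner pool of M threads, up to N x M threads can be live at once. The flat version never exceeds max_workers, which can be set to the number of GPUs.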

Key Features

Jupyter-safe progress output via main-thread result synchronization
Automated GPU memory cleanup patterns for CuPy and garbage collection
Prevention strategies for nested ThreadPoolExecutor 'thread explosions'
Queue-based GPU allocation for 1:1 resource-to-task mapping
Dynamic GPU device assignment and release workflows

Use Cases

Debugging and fixing silent kernel crashes in parallel HPC workflows
Building OOM-resilient CuPy pipelines for large-scale data analysis and scientific computing
Processing multiple image channels in parallel across multiple GPUs in Jupyter notebooks
