How to enhance Python code with Concurrency and Parallelism

Concurrency and parallelism are two techniques that allow you to run several programs simultaneously. Python has many options for processing multiple tasks simultaneously and in parallel, but that can confuse many people.

Concurrency and parallelism are two techniques that allow you to run several programs simultaneously. Python has many options for processing multiple tasks simultaneously and in parallel, but that can confuse many people.

How to enhance Python code with Concurrency and Parallelism Picture 1 How to enhance Python code with Concurrency and Parallelism Picture 1

So, let's explore the tools and libraries available to properly implement concurrency and parallelism in Python!

What are Concurrency and Parallelism?

Concurrency and parallelism refer to two basic principles of task implementation in computing. Each principle has its own characteristics.

1. Concurrency is the feature of a program to manage multiple tasks at the same time without executing them at the exact same time. It revolves around the idea of alternating tasks, switching between them in a concurrent manner.

How to enhance Python code with Concurrency and Parallelism Picture 2 How to enhance Python code with Concurrency and Parallelism Picture 2

2. Parallelism involves implementing a series of tasks in parallel. It often makes use of multiple cores or CPU processors. Parallelism achieves true concurrent implementation, allowing you to perform tasks faster and suitable for extensive computing operations.

How to enhance Python code with Concurrency and Parallelism Picture 3 How to enhance Python code with Concurrency and Parallelism Picture 3

The Importance of Concurrency and Parallelism

Resource Usage : Concurrency allows for efficient use of system resources, ensuring that tasks are actively progressing instead of waiting for external resources.
Responsiveness : Concurrency can improve application responsiveness, especially in contexts involving user interfaces or web servers.
Performance : Parallelism is important for achieving optimal performance, especially for CPU-intensive tasks such as complex calculations, data processing, and simulation.
Scalable : Both concurrency and parallelism are needed for building scalable systems.
Future-proofing : As hardware trends continue to favor multi-core processors, the ability to exploit parallelism will become increasingly necessary.

Concurrency in Python

You can achieve concurrency in Python using threaded and asynchronous programming with the asyncio library.

Threading (thread) in Python

Threading is a concurrency mechanism in Python that allows you to create and manage tasks in one simple process. Threads are suitable for specific types of tasks, especially I/O-bound tasks, and can benefit from concurrent execution.

Python's threading module provides a high-level interface for creating and managing threads. While GIL (Global Interpreter Lock) limits threads in terms of true parallelism, they can still achieve concurrency by effectively interleaving tasks.

The code below shows an example implementing concurrency using threads. It uses the Python query library to send an HTTP query, a common block I/O task. It also uses the timing module to calculate execution time.

import requests import time import threading urls = [ 'https://www.google.com', 'https://www.wikipedia.org', 'https://www.makeuseof.com', ] # hàm truy vấn một URL def download_url(url): response = requests.get(url) print(f"Downloaded {url} - Status Code: {response.status_code}") # Thực thi không có luồng và đo thời gian thực hiện start_time = time.time() for url in urls: download_url(url) end_time = time.time() print(f"Sequential download took {end_time - start_time:.2f} secondsn") # Thực thi với luồng, reset thời gian đo thời điểm triển khai mới start_time = time.time() threads = [] for url in urls: thread = threading.Thread(target=download_url, args=(url,)) thread.start() threads.append(thread) # Đợi tất cả phân luồng hoàn thành for thread in threads: thread.join() end_time = time.time() print(f"Threaded download took {end_time - start_time:.2f} seconds")

Run this program and you will see that threaded queries run faster than ordered queries. Even though the difference is only a fraction of a second, you will clearly feel the improved performance when using threads for I/O related tasks.

How to enhance Python code with Concurrency and Parallelism Picture 4 How to enhance Python code with Concurrency and Parallelism Picture 4

Asynchronous programming with Asyncio

Asyncio provides an event looper that manages asynchronous tasks called coroutines. They are functions that you can pause and resume, making them ideal for I/O-related tasks. This library is especially useful for situations where the task involves waiting for external resources, such as network queries.

You can edit the previous query submission example to work with asyncio :

import asyncio import aiohttp import time urls = [ 'https://www.google.com', 'https://www.wikipedia.org', 'https://www.makeuseof.com', ] # hàm bất đồng bộ để truy vấn URL async def download_url(url): async with aiohttp.ClientSession() as session: async with session.get(url) as response: content = await response.text() print(f"Downloaded {url} - Status Code: {response.status}") # Hàm bất đồng bộ chính async def main(): # Tạo danh sách nhiệm vụ download đồng thời từng URL tasks = [download_url(url) for url in urls] # Thu thập và triển khai đồng thời các nhiệm vụ await asyncio.gather(*tasks) start_time = time.time() # Chạy hàm bất đồng bộ chính asyncio.run(main()) end_time = time.time() print(f"Asyncio download took {end_time - start_time:.2f} seconds")

Using this code, you can load web pages concurrently using asyncio and take advantage of asynchronous I/O operations. This can be more efficient than threading for I/O related tasks.

How to enhance Python code with Concurrency and Parallelism Picture 5 How to enhance Python code with Concurrency and Parallelism Picture 5

Parallelism in Python

You can implement parallelism using Python's multiprocessing module, allowing you to take full advantage of multicore processors.

Multicore processing in Python

Python's multiprocessing module provides a way to achieve parallelism by creating separate processes with their own Python interpreter and memory space. This effectively bypasses Global Interpreter Lock (GIL), making it suitable for CPU-intensive tasks.

import requests import multiprocessing import time urls = [ 'https://www.google.com', 'https://www.wikipedia.org', 'https://www.makeuseof.com', ] # hàm truy vấn một URL def download_url(url): response = requests.get(url) print(f"Downloaded {url} - Status Code: {response.status_code}") def main(): # Tạo một pool multiprocessing với một số lượng quá trình được chỉ định num_processes = len(urls) pool = multiprocessing.Pool(processes=num_processes) start_time = time.time() pool.map(download_url, urls) end_time = time.time() # Đóng pool và đợi cho toàn bộ quá trình hoàn tất pool.close() pool.join() print(f"Multiprocessing download took {end_time-start_time:.2f} seconds") main()

In this example, multiprocessing spawns multiple processes, allowing the download_url function to run in parallel.

How to enhance Python code with Concurrency and Parallelism Picture 6 How to enhance Python code with Concurrency and Parallelism Picture 6

When to use concurrency and parallelism?

Choosing between concurrency and parallelism depends on the nature of the task and the availability of hardware resources.

You can use concurrency when handling I/O-related tasks, such as reading and writing files or making network queries, and when memory constraints are a concern.

Use multiprocessing when you have CPU-intensive tasks that can benefit from true parallelism and when you have strong isolation between tasks, where the failure of one task will not affect the other tasks.

Above are the things you need to know about parallelism and concurrency in Python . Hope the article is useful to you.

Python programming