How to enhance Python code with Concurrency and Parallelism
Concurrency and parallelism are two techniques for handling multiple tasks at once. Python offers several options for processing tasks concurrently and in parallel, and choosing between them can be confusing.
So, let's explore the tools and libraries available to properly implement concurrency and parallelism in Python!
What are Concurrency and Parallelism?
Concurrency and parallelism refer to two fundamental approaches to executing tasks in computing, and each has its own characteristics.
1. Concurrency is a program's ability to manage multiple tasks over the same period without executing them at the exact same instant. It revolves around interleaving tasks, switching between them whenever one is waiting on something else.
2. Parallelism means executing multiple tasks at literally the same time, typically on multiple CPU cores or processors. It achieves truly simultaneous execution, which completes work faster and suits heavy computing operations. The short sketch below illustrates the difference.
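To make the distinction concrete, here is a minimal illustrative sketch (an addition, not from the original article): two simulated one-second I/O waits overlap when run on separate threads, so the total time is roughly one second instead of two. CPU-bound work, by contrast, would need separate processes to truly run in parallel.

# Minimal sketch: two simulated I/O waits handled concurrently with threads
import threading
import time

def wait_one_second(name):
    time.sleep(1)  # stands in for waiting on a network or disk operation
    print(f"{name} finished")

start = time.time()
threads = [threading.Thread(target=wait_one_second, args=(f"task {i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Both waits overlap, so this prints roughly 1.00 rather than 2.00
print(f"Concurrent waits took {time.time() - start:.2f} seconds")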
The Importance of Concurrency and Parallelism
- Resource Usage : Concurrency allows efficient use of system resources by keeping the program busy with other tasks instead of sitting idle while one task waits on external resources.
- Responsiveness : Concurrency can improve application responsiveness, especially in contexts involving user interfaces or web servers.
- Performance : Parallelism is important for achieving optimal performance, especially for CPU-intensive tasks such as complex calculations, data processing, and simulation.
- Scalability : Both concurrency and parallelism are needed for building scalable systems.
- Future-proofing : As hardware trends continue to favor multi-core processors, the ability to exploit parallelism will become increasingly necessary.
Concurrency in Python
You can achieve concurrency in Python through threading or through asynchronous programming with the asyncio library.
Threading in Python
Threading is a concurrency mechanism in Python that lets you create and manage multiple tasks within a single process. Threads suit certain kinds of work, especially I/O-bound tasks, which benefit from concurrent execution.
Python's threading module provides a high-level interface for creating and managing threads. While the Global Interpreter Lock (GIL) prevents threads from achieving true parallelism, they can still provide concurrency by efficiently interleaving tasks.
The code below shows an example of implementing concurrency with threads. It uses the requests library to send HTTP requests, a common blocking I/O task, and the time module to measure execution time.
import requests
import time
import threading

urls = [
    'https://www.google.com',
    'https://www.wikipedia.org',
    'https://www.makeuseof.com',
]

# Function that downloads a single URL
def download_url(url):
    response = requests.get(url)
    print(f"Downloaded {url} - Status Code: {response.status_code}")

# Run without threads and measure the execution time
start_time = time.time()
for url in urls:
    download_url(url)
end_time = time.time()
print(f"Sequential download took {end_time - start_time:.2f} seconds\n")

# Run with threads, resetting the timer for the new measurement
start_time = time.time()
threads = []
for url in urls:
    thread = threading.Thread(target=download_url, args=(url,))
    thread.start()
    threads.append(thread)

# Wait for all threads to finish
for thread in threads:
    thread.join()
end_time = time.time()
print(f"Threaded download took {end_time - start_time:.2f} seconds")
Run this program and you will see that the threaded requests finish faster than the sequential ones. Even if the difference is only a fraction of a second, it clearly shows the performance gain threads provide for I/O-bound tasks.
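As a side note, if you would rather not manage Thread objects by hand, the standard library's concurrent.futures module offers a higher-level interface. The following is a hedged sketch of the same downloads using a thread pool; it is an addition for illustration, not part of the original example.

import requests
from concurrent.futures import ThreadPoolExecutor

urls = [
    'https://www.google.com',
    'https://www.wikipedia.org',
    'https://www.makeuseof.com',
]

def download_url(url):
    response = requests.get(url)
    print(f"Downloaded {url} - Status Code: {response.status_code}")

# The executor runs download_url on a small pool of worker threads;
# map() blocks until every URL has been fetched.
with ThreadPoolExecutor(max_workers=len(urls)) as executor:
    list(executor.map(download_url, urls))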
Asynchronous programming with Asyncio
Asyncio provides an event loop that manages asynchronous tasks called coroutines: functions that can be paused and resumed, which makes them ideal for I/O-bound work. The library is especially useful when tasks spend most of their time waiting for external resources, such as network requests.
You can rewrite the previous download example to work with asyncio:
import asyncio
import aiohttp
import time

urls = [
    'https://www.google.com',
    'https://www.wikipedia.org',
    'https://www.makeuseof.com',
]

# Asynchronous function that downloads a single URL
async def download_url(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            content = await response.text()
            print(f"Downloaded {url} - Status Code: {response.status}")

# Main asynchronous function
async def main():
    # Build a list of tasks, one per URL
    tasks = [download_url(url) for url in urls]
    # Gather the tasks and run them concurrently
    await asyncio.gather(*tasks)

start_time = time.time()
# Run the main asynchronous function
asyncio.run(main())
end_time = time.time()
print(f"Asyncio download took {end_time - start_time:.2f} seconds")
This code downloads the web pages concurrently with asyncio, taking advantage of asynchronous I/O operations. For I/O-bound tasks, this approach can be even more efficient than threading.
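One practical refinement, not covered above, is limiting how many requests run at once. The sketch below is an assumption-laden variation of the same example that uses asyncio.Semaphore to cap concurrency at two requests at a time.

import asyncio
import aiohttp

urls = [
    'https://www.google.com',
    'https://www.wikipedia.org',
    'https://www.makeuseof.com',
]

# Hedged sketch: a semaphore caps how many downloads run simultaneously,
# so long URL lists do not open too many connections at once.
async def download_url(session, url, semaphore):
    async with semaphore:
        async with session.get(url) as response:
            await response.text()
            print(f"Downloaded {url} - Status Code: {response.status}")

async def main():
    semaphore = asyncio.Semaphore(2)  # at most two requests in flight
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(download_url(session, url, semaphore) for url in urls))

asyncio.run(main())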
Parallelism in Python
You can implement parallelism using Python's multiprocessing module, allowing you to take full advantage of multicore processors.
Multiprocessing in Python
Python's multiprocessing module provides a way to achieve parallelism by creating separate processes, each with its own Python interpreter and memory space. This effectively bypasses the Global Interpreter Lock (GIL), making it suitable for CPU-intensive tasks.
import requests
import multiprocessing
import time

urls = [
    'https://www.google.com',
    'https://www.wikipedia.org',
    'https://www.makeuseof.com',
]

# Function that downloads a single URL
def download_url(url):
    response = requests.get(url)
    print(f"Downloaded {url} - Status Code: {response.status_code}")

def main():
    # Create a multiprocessing pool with the specified number of processes
    num_processes = len(urls)
    pool = multiprocessing.Pool(processes=num_processes)

    start_time = time.time()
    pool.map(download_url, urls)
    end_time = time.time()

    # Close the pool and wait for all processes to finish
    pool.close()
    pool.join()

    print(f"Multiprocessing download took {end_time - start_time:.2f} seconds")

# The __main__ guard keeps child processes from re-importing and re-running this code
if __name__ == "__main__":
    main()
In this example, multiprocessing spawns multiple processes, allowing the download_url function to run in parallel.
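Keep in mind that downloading URLs is I/O-bound, so this example does not show multiprocessing at its best. The following hedged sketch uses an illustrative CPU-bound workload (summing squares, not from the article) where separate processes genuinely sidestep the GIL and can use multiple cores.

import multiprocessing
import time

def sum_of_squares(n):
    # Pure CPU work: no I/O, so threads would be limited by the GIL here
    return sum(i * i for i in range(n))

def main():
    numbers = [10_000_000] * 4

    start = time.time()
    results = [sum_of_squares(n) for n in numbers]
    print(f"Sequential took {time.time() - start:.2f} seconds")

    start = time.time()
    with multiprocessing.Pool() as pool:
        results = pool.map(sum_of_squares, numbers)
    print(f"Multiprocessing took {time.time() - start:.2f} seconds")

if __name__ == "__main__":
    main()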
When to use concurrency and parallelism?
Choosing between concurrency and parallelism depends on the nature of the task and the availability of hardware resources.
Use concurrency when handling I/O-bound tasks, such as reading and writing files or making network requests, and when memory constraints are a concern.
Use multiprocessing when you have CPU-intensive tasks that can benefit from true parallelism, and when you need strong isolation between tasks, so the failure of one task does not affect the others.
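Since the concurrent.futures module gives thread and process pools the same interface, this choice can even be made at run time. The helper below is hypothetical (the run_tasks name and its parameters are illustrative, not from the article) and simply switches executors based on whether the work is CPU-bound.

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

# Hypothetical helper: pick processes for CPU-bound work, threads for I/O-bound work
def run_tasks(func, items, cpu_bound=False):
    executor_cls = ProcessPoolExecutor if cpu_bound else ThreadPoolExecutor
    with executor_cls() as executor:
        return list(executor.map(func, items))

def square(n):
    return n * n

if __name__ == "__main__":
    # CPU-bound work goes to processes; an I/O-bound function would use threads instead
    print(run_tasks(square, [1, 2, 3, 4], cpu_bound=True))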
That covers the essentials of concurrency and parallelism in Python. Hopefully, this article is useful to you.