14 Modules and Packages
In Python, a module is a file containing Python code. A module can define functions, classes, and variables. A package is a collection of modules. Packages are used to organize related modules and provide a way to manage namespaces.
In the following sections, you'll learn how to create and use modules and packages in Python.
14.1 Modules
A module is a file containing Python definitions and statements. The file name is the module name with the .py extension appended. You can import a module using the import statement.
For example, Listing 14.1 presents a module util.py, which contains a series of time calculation functions. To use these functions in another Python script, you can import the util module using the import statement.
Listing 14.1: util.py (module util).
# util.py

# Constants
AVG_SPEED_KMPH = 50

def calculate_time_h(
        distance_km,
        speed_kmph=AVG_SPEED_KMPH):
    """Calculate time in hours"""
    time_hours = distance_km / speed_kmph
    return time_hours

def to_minutes(time_hours):
    """Convert time in hours to minutes"""
    return time_hours * 60

def is_time_between(start_time, end_time, check_time):
    """Check if the check_time is between start_time and end_time"""
    return start_time <= check_time <= end_time

14.1.1 Importing a Module into a Script
In Listing 14.2, the util module is imported in a Python script, and the functions defined in the module are used to calculate time and check if a time is between two other times. To use the functions, you need to prefix the function name with the module name. Your IDE may provide code completion to help you find the function names as soon as you type the module name and a dot (e.g., util.).
# main.py
import util

# Calculate time to travel 100 km at 50 km/h
time_hours = util.calculate_time_h(100, 50)
print(f"Travel Time: {time_hours:.2f} hours")

# Convert time to minutes
time_minutes = util.to_minutes(time_hours)

# Average speed in km/h
print(f"Average speed: {util.AVG_SPEED_KMPH} km/h")

Listing 14.2: Example of using the util module by importing it in a Python script main.py. In this case, util.py and main.py have to be in the same directory.
14.1.2 Importing Specific Functions from a Module
You can import specific functions from a module using the from statement. This way, you can use the function names directly without prefixing them with the module name.
For example, Listing 14.3 shows how to import the is_time_between function from the util module.
Listing 14.3: Importing the is_time_between function from the util module in a Python script main.py.
You can also import multiple functions from a module by separating the function names with commas in the from statement. If you need to add new lines to the from statement, you can use parentheses to group the function names. For example, in Listing 14.4, the calculate_time_h and to_minutes functions are imported from the util module.
Listing 14.4: Importing multiple functions from the util module in a Python script main.py.
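As a minimal sketch of both forms, the snippet below applies the same syntax to the standard-library math module instead of util, purely so that it runs without util.py being present. With util.py in the same directory, the equivalent lines would be from util import is_time_between and from util import (calculate_time_h, to_minutes).

```python
# Import a single name: it can be called without the "math." prefix
from math import sqrt

# Import several names; parentheses let the list span multiple lines
from math import (
    floor,
    ceil,
)

root = sqrt(16)
lower = floor(2.7)
upper = ceil(2.2)

print(root, lower, upper)
```

Only the imported names are defined: writing math.sqrt(16) after these statements would raise a NameError, because the name math itself was never imported.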
14.1.3 Importing All Functions from a Module
You can import all public names from a module (functions, classes, and variables whose names do not start with an underscore) using the * wildcard in the from statement. This way, you can use the names directly without prefixing them with the module name.
For example, Listing 14.5 shows how to import all functions from the util module.
Listing 14.5: Importing all functions from the util module in a Python script main.py.
Although importing all functions from a module using the * wildcard can save you time, it is generally not recommended because it can lead to naming conflicts and make the code less readable. It is better to import specific functions or use the module name as a prefix when calling the functions.
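To make the naming-conflict risk concrete, here is a small sketch using two standard-library modules, math and cmath, which both define a function called sqrt. The wildcard imports are executed inside a throwaway dictionary with exec only so that the result can be inspected; in a real script you would simply see from math import * and from cmath import * at the top of the file, with the same effect on the script's own names.

```python
# Run the wildcard imports inside a throwaway namespace (a plain dict)
# so that the names they create can be inspected afterwards.
namespace = {}

exec("from math import *", namespace)
sqrt_after_math = namespace["sqrt"]

exec("from cmath import *", namespace)
sqrt_after_cmath = namespace["sqrt"]

# Both modules define sqrt: the second wildcard silently replaced the first
print(sqrt_after_math.__module__)
print(sqrt_after_cmath.__module__)

# The surviving sqrt is the complex-number version
print(sqrt_after_cmath(-1))
```

No warning is raised when the second import overwrites sqrt, which is why the conflict is easy to miss. With import math and import cmath, both versions stay available as math.sqrt and cmath.sqrt.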
14.1.4 Aliasing Module Names
You can alias a module name when importing it using the as keyword. This way, you can use the alias instead of the full module name in your code.
For example, Listing 14.6 shows how to import the util module with the alias u, so that functions are called as u.calculate_time_h(...) instead of util.calculate_time_h(...).
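The following sketch shows the same aliasing syntax with the standard-library statistics module, chosen only so that the example runs without util.py; the list of travel times is made-up illustration data. With util.py available, the corresponding import would be import util as u.

```python
# Bind the module to a shorter name
import statistics as stats

travel_times_h = [1.5, 2.0, 2.5]  # hypothetical travel times in hours

# Functions are reached through the alias instead of the full module name
average_h = stats.mean(travel_times_h)
print(f"Average travel time: {average_h:.2f} hours")
```

After import statistics as stats, only the alias is defined: stats.mean works, while statistics.mean would raise a NameError.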
14.1.5 Importing in the Interactive Interpreter
You can also import modules in the Python interactive interpreter. When you import a module, you can use the functions and variables defined in the module directly in the interpreter.
For example, Listing 14.7 shows how to import the util module in the Python interactive interpreter and use the functions defined in the module. You have to know the path to the module file to import it in the interpreter or make sure the module is in the current directory.
Listing 14.7: Importing the util module in the Python interactive interpreter and using the functions defined in the module. The util.py file is in the C:\dev\project directory. The ls command lists the files in the directory to confirm that util.py is present; otherwise, the import will fail.
14.1.6 Using Modules as Scripts
When you run a Python module as a script, the code in the module is executed. You can use the __name__ variable to check if the module is being run as a script or imported into another module. If the module is being run as a script, the __name__ variable is set to "__main__".
For example, Listing 14.8 shows how to define a function in a module that is executed only when the module is run as a script. In this case, the main function is executed when the module is run as a script.
Listing 14.8: Defining a function (main) in the module util that is executed only when the module is run as a script.
# util.py
def calculate_time_h(
        distance_km,
        speed_kmph=50):
    """Calculate time in hours"""
    time_hours = distance_km / speed_kmph
    return time_hours

def to_minutes(time_hours):
    """Convert time in hours to minutes"""
    return time_hours * 60

def main():
    """Main function"""
    print("Running util.py as a script")
    time_hours = calculate_time_h(100, 50)
    print(f"Travel Time: {time_hours:.2f} hours")

if __name__ == "__main__":
    main()

To run the module as a script, you can use the following command:

    python util.py
When you run the module as a script, the main function is executed, and the output is displayed in the console.
if __name__ == '__main__'
Using the if __name__ == '__main__' statement allows you to define code that is executed only when the module is run as a script. This is useful when you want to define functions that are used only when the module is run as a script or to run tests when the module is run as a script.
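The sketch below illustrates how __name__ changes depending on how a file is loaded. It writes a tiny, hypothetical module called name_demo.py to a temporary folder, executes it roughly the way python name_demo.py would (using runpy.run_path with run_name="__main__"), and then imports the same file as a module. The file name and the ROLE and SEEN_NAME variables are assumptions made for this illustration only.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical module that records how it was loaded
SOURCE = (
    'SEEN_NAME = __name__\n'
    'ROLE = "script" if __name__ == "__main__" else "imported"\n'
)

with tempfile.TemporaryDirectory() as folder:
    module_path = Path(folder) / "name_demo.py"
    module_path.write_text(SOURCE)

    # 1) Execute the file as a script: __name__ is "__main__"
    as_script = runpy.run_path(str(module_path), run_name="__main__")

    # 2) Import the same file: __name__ is the module name
    sys.path.insert(0, folder)
    try:
        as_module = importlib.import_module("name_demo")
    finally:
        sys.path.remove(folder)

print(as_script["SEEN_NAME"], as_script["ROLE"])
print(as_module.SEEN_NAME, as_module.ROLE)
```

The same source file reports "__main__" in the first case and "name_demo" in the second, which is exactly the difference the if __name__ == "__main__" check relies on.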
14.2 Packages
A package is a collection of modules that are organized in a directory structure. Packages are used to group related modules and provide a way to manage namespaces. A package is a directory that contains an __init__.py file and one or more modules. The __init__.py file can be empty or contain initialization code for the package.
For example, suppose you want to create a package to help you solve vehicle routing problems (VRP). You can create a package named vrp that contains modules for defining VRP models and utility functions. In Figure 14.1, you can see an example of the package structure for a vrp package inside a project folder.
project/                    Project folder
│
├── vrp/                    Top-level package
│   ├── __init__.py         Package initialization file
│   ├── model/              Subpackage for defining VRP models
│   │   ├── __init__.py     Subpackage initialization file
│   │   ├── milp.py         Module for mixed-integer linear programming models
│   │   ├── heuristics.py   Module for heuristic algorithms
│   │   └── ...
│   └── util/               Subpackage for utility functions
│       ├── __init__.py     Subpackage initialization file
│       ├── time.py         Module for time-related functions
│       ├── distance.py     Module for distance-related functions
│       ├── network.py      Module for network-related functions
│       └── ...
└── main.py                 Main script

Figure 14.1: The vrp package contains the model subpackage and the util subpackage. The model subpackage contains the milp.py and heuristics.py modules, and the util subpackage contains the time.py, distance.py, and network.py modules (the ... indicates other modules). The main.py script is the main entry point for the application.
Why __init__.py?
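The __init__.py file tells Python to treat the directory that contains it as a regular package rather than an ordinary directory. Since Python 3.3, a directory without __init__.py can still be imported as a namespace package, but including the file remains the explicit and most common way to define a package, and it gives you a place for package initialization code.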
14.2.1 Importing Modules from a Package
You can import modules from a package using the import statement. When importing a module from a package, you need to specify the package name and the module name separated by a dot.
For example, Listing 14.9 shows how to import the milp module from the model subpackage of the vrp package. In this case, you need to prefix the module name with the package and subpackage names when importing it.
Listing 14.9: main.py. Importing the milp module from the model subpackage in the vrp package. We assume that there is a create_vrp_model function in the milp.py module.
If you want to import a module or function from a package and use it without prefixing it with the package name, you can use the from statement. In Listing 14.10, we import a function from time.py in the util subpackage and use it directly without prefixing it with the package name.
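The two import styles can be tried end to end with the sketch below. Because the contents of the vrp package are not shown in the text, the script first creates a stripped-down copy of the package in a temporary folder. The body of create_vrp_model, its n_customers parameter, and the to_minutes function placed in time.py are hypothetical stand-ins chosen for this illustration.

```python
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents; the real modules are not shown in the text
FILES = {
    "vrp/__init__.py": "",
    "vrp/model/__init__.py": "",
    "vrp/model/milp.py": (
        "def create_vrp_model(n_customers):\n"
        "    return {'type': 'MILP', 'customers': n_customers}\n"
    ),
    "vrp/util/__init__.py": "",
    "vrp/util/time.py": (
        "def to_minutes(time_hours):\n"
        "    return time_hours * 60\n"
    ),
}

with tempfile.TemporaryDirectory() as project_dir:
    # Build the package structure on disk
    for relative_path, source in FILES.items():
        file_path = Path(project_dir) / relative_path
        file_path.parent.mkdir(parents=True, exist_ok=True)
        file_path.write_text(source)

    # Stand-in for running main.py from inside the project folder
    sys.path.insert(0, project_dir)
    try:
        # Style 1: import the module, then use the full dotted name
        import vrp.model.milp
        model = vrp.model.milp.create_vrp_model(5)

        # Style 2: import one function and call it without any prefix
        from vrp.util.time import to_minutes
        minutes = to_minutes(1.5)
    finally:
        sys.path.remove(project_dir)

print(model)
print(minutes)
```

In a real project the files already exist, so main.py only needs the two import lines and the calls that follow them.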
14.3 Importing Modules and Packages in IPython
In IPython, you can import modules and packages using the import statement, and you can use the from statement to import specific functions from a module or package. However, the directory that contains the module or package must be the current directory or be listed in the Python path (sys.path). You can add a directory to the Python path by calling sys.path.append().
For example, Listing 14.11 shows how to import the util module in IPython and use the functions defined in the module. In this case, the util.py file is in the current directory, so you can import the module directly.
Listing 14.11: Importing the util module in IPython and using the functions defined in the module. The util.py file is in the current directory.
If the module is not in the current directory, you can add its directory to the Python path by calling sys.path.append(). For example, suppose the project structure is as shown in Figure 14.2.
project/                        Project folder
│
├── vrp/                        Top-level package
│   ├── __init__.py             Package initialization file
│   ├── model/                  Subpackage for defining VRP models
│   │   ├── __init__.py         Subpackage initialization file
│   │   ├── milp.py             Module for mixed-integer linear programming models
│   │   ├── heuristics.py       Module for heuristic algorithms
│   │   └── ...
│   └── util/                   Subpackage for utility functions
│       ├── __init__.py         Subpackage initialization file
│       ├── time.py             Module for time-related functions
│       ├── distance.py         Module for distance-related functions
│       ├── network.py          Module for network-related functions
│       └── ...
├── main.py                     Main script
└── notebooks/                  Folder for Jupyter notebooks
    └── vrp_analysis.ipynb      Notebook for VRP analysis

Figure 14.2: Project structure with the vrp package and a notebooks folder for Jupyter notebooks. The notebook vrp_analysis.ipynb is in the notebooks folder and needs to import from the util subpackage of the vrp package.
To import the util module in the vrp_analysis.ipynb notebook, you need to add the project directory (the parent of the notebooks folder, which contains the vrp package) to the Python path by calling sys.path.append(). In Listing 14.12, this directory is added to the Python path as ../, and the util module is imported and used in the notebook.
Listing 14.12: Adding the ../ directory to the Python path in IPython and importing the util module in the vrp_analysis.ipynb notebook.
The call sys.path.append('../') adds the ../ directory to the Python path, allowing you to import the util module in the notebook. The ../ moves up one directory level from the notebooks folder to the project folder. It is a relative path, that is, a path interpreted relative to the current working directory (here, the notebooks folder). Now, the notebook can "see" the vrp package and import the util module. The sys.path variable contains the list of directories where Python looks for modules when importing them. By adding the directory that contains the module or package to this list, you can import it in IPython. Passing an absolute path also works (e.g., sys.path.append('C:/dev/project')).
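As a rough, self-contained illustration of this mechanism, the sketch below recreates a reduced version of the project layout in a temporary folder and then does what the notebook would do: it appends the parent of the notebooks folder to sys.path and imports from the vrp package. The to_minutes function written into time.py is a hypothetical placeholder. An absolute path is computed from notebooks/.. instead of using the literal string '../', because the literal form only works when the current working directory is the notebooks folder.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Recreate a reduced version of the project layout
    project_dir = Path(tmp) / "project"
    notebooks_dir = project_dir / "notebooks"
    util_dir = project_dir / "vrp" / "util"
    notebooks_dir.mkdir(parents=True)
    util_dir.mkdir(parents=True)
    (project_dir / "vrp" / "__init__.py").write_text("")
    (util_dir / "__init__.py").write_text("")
    # Hypothetical content for time.py
    (util_dir / "time.py").write_text(
        "def to_minutes(time_hours):\n"
        "    return time_hours * 60\n"
    )

    # From the notebook's point of view, "../" is the project folder
    parent_of_notebooks = str((notebooks_dir / "..").resolve())
    added_folder_name = Path(parent_of_notebooks).name

    sys.path.append(parent_of_notebooks)
    try:
        # The vrp package is now visible to the import system
        from vrp.util.time import to_minutes
        result = to_minutes(2)
    finally:
        sys.path.remove(parent_of_notebooks)

print(added_folder_name)
print(result)
```

Printing sys.path before and after the append call is a quick way to check which directories Python will search.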
14.4 Interesting Packages
Python has a vast ecosystem of packages that you can use to extend the functionality of your applications. Here are some popular Python packages that you might find interesting:
- Mathematics and Statistics:
- Data Manipulation and Analysis:
  - Pandas: A data manipulation library that provides data structures like DataFrames and Series for working with structured data.
  - pickle: A standard library module for serializing and deserializing Python objects to and from a byte stream.
  - json: A standard library module for encoding and decoding JSON data.
  - csv: A standard library module for reading and writing CSV files.
- Visualization:
  - Matplotlib: A plotting library that allows you to create a wide variety of plots and charts.
  - Seaborn: A data visualization library based on Matplotlib that provides a high-level interface for creating attractive and informative statistical graphics.
  - Plotly: An interactive plotting library that allows you to create interactive plots and dashboards.
  - Bokeh: A library for creating interactive visualizations and dashboards in web browsers.
- Map and Geospatial Data:
  - Folium: A library for creating interactive maps and visualizing geospatial data.
  - Geopandas: A library for working with geospatial data that extends the capabilities of Pandas.
  - Shapely: A library for geometric operations like creating, analyzing, and manipulating planar geometric objects.
- Graphs and Networks:
- Machine Learning and Data Mining:
  - Scikit-learn: A machine learning library that provides tools for data mining and data analysis.
  - TensorFlow: An open-source machine learning library developed by Google for building and training machine learning models.
  - PyTorch: An open-source machine learning library originally developed by Facebook (now Meta) for building and training machine learning models.
- Web Development:
- Testing and Quality Assurance:
- Mixed Integer Linear Programming:
  - PuLP: A linear programming library that provides a high-level API for defining and solving optimization problems.
  - Gurobi: A commercial optimization solver that provides a Python API for solving mixed-integer linear programming problems.
  - CPLEX: A commercial optimization solver that provides a Python API for solving mixed-integer linear programming problems.
  - SCIP: A solver for mixed-integer programming and constraint integer programming, usable from Python through the PySCIPOpt interface.
  - Google OR-Tools: An open-source suite for optimization problems (routing, constraint programming, and linear and mixed-integer programming) with a Python API.
  - COIN-OR: A collection of open-source operations research software, including solvers such as CBC, several of which can be used from Python.
  - GLPK: The GNU Linear Programming Kit, an open-source solver for linear and mixed-integer programming, usable from Python through interfaces such as PuLP.
  - CBC: An open-source mixed-integer programming solver from the COIN-OR project, usable from Python through interfaces such as PuLP.
- Web Scraping and Automation:
  - Requests: A library for making HTTP requests in Python (https://docs.python-requests.org/en/master).
  - Beautiful Soup: A library for parsing HTML and XML documents.
- Game Development:
- Computer Vision and Image Processing:
- Natural Language Processing:
  - NLTK: A library for natural language processing tasks like tokenization, stemming, and part-of-speech tagging.
  - spaCy: An open-source library for natural language processing tasks like named entity recognition, part-of-speech tagging, and dependency parsing.
  - TextBlob: A library for processing textual data that provides tools for sentiment analysis, part-of-speech tagging, and noun phrase extraction.
These are just a few examples of the many Python packages available for different domains and use cases. You can explore the Python Package Index (PyPI) to discover more packages and libraries that can help you in your projects.
14.5 Example: The random Module
The random module in Python provides functions for generating random numbers. Generating random numbers is useful in various applications, such as simulations, games, and testing.
Here are some examples of how to use the random module:
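The snippet below is a small sampler of commonly used functions from the random module. The customer labels and the speed range are made-up illustration values, and because no seed is set, the printed results will differ from run to run.

```python
import random

# Random integer from 1 to 6, both ends included (like a die roll)
die_roll = random.randint(1, 6)

# Random float in the interval [0.0, 1.0)
fraction = random.random()

# Random float between 40 and 60 (e.g., a speed in km/h)
speed = random.uniform(40, 60)

customers = ["A", "B", "C", "D"]  # hypothetical customer labels

# Pick one element at random
picked = random.choice(customers)

# Shuffle a copy of the list in place to get a random visiting order
visit_order = customers.copy()
random.shuffle(visit_order)

# Draw two distinct elements without replacement
subset = random.sample(customers, k=2)

print(die_roll, fraction, speed)
print(picked, visit_order, subset)
```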
The import random statement imports the random module, allowing you to use its functions. The random module is a Python file named random.py that comes with the standard library. A library is a collection of modules and packages that provide specific functionality.
Standard library modules are included with the Python installation. You can find them in the Lib directory of your Python installation. For example, if you installed Python in C:\Python, the standard library modules would be located in C:\Python\Lib. In VS Code, you can usually navigate to the module definition by holding the Ctrl key (or Cmd on macOS) and clicking on the module name in the import statement.
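If you prefer to locate the file from code rather than through the editor, a module's __file__ attribute holds the path it was loaded from, and sysconfig.get_path("stdlib") reports the standard library directory of the running interpreter. The following is a minimal sketch; the paths printed depend on where Python is installed on your machine.

```python
import random
import sysconfig
from pathlib import Path

# Path of the file that defines the random module
module_file = Path(random.__file__)

# Directory that holds the standard library of this interpreter
stdlib_dir = sysconfig.get_path("stdlib")

print(f"random is defined in: {module_file}")
print(f"Standard library directory: {stdlib_dir}")
```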
Computers are deterministic machines, meaning they follow a set of predefined rules and instructions. Therefore, generating true randomness is challenging. The random module uses algorithms to generate pseudo-random numbers, which are not truly random but can be used for most applications that require randomness.
14.5.1 The Seed Function
The seed function in the random module initializes the random number generator. By setting a seed value, you can ensure that the sequence of random numbers generated is reproducible, which is useful for testing and debugging. When no seed is set, Python seeds the generator from an unpredictable source provided by the operating system (falling back to the current system time if none is available), so each run produces a different sequence of random numbers. Figure 14.3 shows how to inspect the docstring of the seed function in the random module using VS Code, and Listing 14.13 presents an example of using the seed function to generate reproducible random numbers.
Figure 14.3: Inspecting the random module and the seed function in VS Code to visualize their docstrings.
Listing 14.13: Using the seed function in the random module to generate reproducible random numbers. Every time you run this code, it will produce the same random integer and float because the seed value is set to 42. When you do not set the seed, Python seeds the generator from the operating system's randomness source (or the current system time if none is available), so the random numbers generated will be different each time you run the code.
import random

# Set the seed for reproducibility.
# From now on, the random module produces
# the same sequence of numbers on every run
random.seed(42)

# Generate a random integer between 1 and 10 (inclusive)
random_integer = random.randint(1, 10)

# Generate a random float in the interval [0.0, 1.0)
random_float = random.random()

print(f"Random Integer: {random_integer}")
print(f"Random Float: {random_float}")

When you run the code in Listing 14.13 multiple times, it will always produce the same random integer and float because the seed value is set to 42. If you change the seed value, the sequence of random numbers generated will be different.
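Reproducibility can also be checked within a single run. In the sketch below, setting the same seed twice restarts the same sequence, and a separate random.Random(42) generator object (which keeps its own state, independent of the module-level functions) yields that sequence as well. The variable names are chosen for this illustration.

```python
import random

# First sequence with seed 42
random.seed(42)
first_run = [random.randint(1, 10) for _ in range(5)]

# Setting the same seed again restarts the same sequence
random.seed(42)
second_run = [random.randint(1, 10) for _ in range(5)]

# An independent generator object seeded with the same value
rng = random.Random(42)
from_instance = [rng.randint(1, 10) for _ in range(5)]

print(first_run)
print(first_run == second_run)
print(first_run == from_instance)
```

Using a dedicated random.Random object can be convenient when several parts of a program need their own reproducible streams of numbers.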
The number 42 is often used as a seed value in examples and tutorials because it is a reference to the book "The Hitchhiker's Guide to the Galaxy" by Douglas Adams, where 42 is humorously described as the "Answer to the Ultimate Question of Life, the Universe, and Everything." You will see 42 used in many programming examples :-)
For now, do not worry too much about the details of how the random module works. The important takeaway is that Python provides a built-in module for generating random numbers, and you can use it in your programs by importing the module. We will cover more about randomness later when we discuss simulations.
14.6 Exercises
14.6.1 Recreating math Module Functions
Create a module named my_math.py that contains functions for basic mathematical operations. Then, create a script that imports the my_math module and uses its functions to perform calculations.