Method | Number of Customers | Objective Function | Distribution | Service Level | Distance Traveled (km) | Avg. Delivery Delay (hours) | Number of Customers Rejected | Runtime (seconds) |
---|---|---|---|---|---|---|---|---|
M1 | 100 | OF1 | 1 | SL1 | 412.362036 | 19.507143 | 20 | 88.984550 |
M1 | 100 | OF1 | 2 | SL1 | 479.055047 | 14.458328 | 20 | 72.962445 |
M1 | 100 | OF1 | 3 | SL1 | 400.112583 | 11.428668 | 12 | 51.029225 |
M1 | 100 | OF1 | 4 | SL1 | 590.972956 | 18.324426 | 15 | 50.038938 |
M1 | 100 | OF1 | 5 | SL1 | 597.663468 | 16.174815 | 21 | 51.153121 |
M1 | 100 | OF1 | 6 | SL1 | 457.432398 | 13.998610 | 25 | 61.638567 |
M1 | 100 | OF1 | 7 | SL1 | 327.181930 | 16.183860 | 21 | 75.711722 |
M1 | 100 | OF1 | 8 | SL1 | 477.724371 | 10.464504 | 28 | 84.015377 |
M1 | 100 | OF1 | 9 | SL1 | 435.149776 | 10.132650 | 23 | 90.419867 |
M1 | 100 | OF1 | 10 | SL1 | 391.384131 | 10.976721 | 29 | 62.051273 |
28 Pivot Tables
28.1 Summarizing Simulation Results (Background)
You are tasked with solving a companyβs truck routing problem. The company operates a fleet of trucks that deliver goods to customers and seeks to optimize its delivery routes to minimize costs and improve customer satisfaction. To evaluate different methods, you implemented a simulation that logs the results of various configurations into a .csv
file for analysis.
You developed and tested four methods to solve the truck routing problem:
- M1 = Simulation of the current routing strategy (baseline)
- M2 = Greedy algorithm (Nearest Neighbor Heuristic)
- M3 = Exact algorithm (Integer Linear Programming)
- M4 = Metaheuristic algorithm (Simulated Annealing)
Each method has trade-offs in terms of solution quality and computational efficiency. For example, the exact algorithm guarantees the optimal solution but is computationally expensive, while the greedy algorithm is faster but may yield suboptimal results.
28.1.1 Metrics Collected
For each simulation run, the following key performance indicators (KPIs) were collected:
- Distance Traveled (km)
- Average Delivery Delay (hours)
- Number of Customers Rejected
- Runtime (seconds)
28.1.2 Configurations Tested
In your simulations, you tested different configurations to find the optimal solution for the company. The configurations include:
- Objective Functions: Three objective functions were tested to minimize costs and improve customer satisfaction:
- OF1 = Minimize total distance traveled (km)
- OF2 = Minimize average delivery delay (hours)
- OF3 = Minimize the number of customers rejected
- Service Levels: Two service levels were evaluated to assess their impact on results:
- SL1 = Reject customers if the service is infeasible
- SL2 = Serve all customers (no rejections allowed)
- Number of Customers: Two demand sizes were considered to test the scalability of the methods:
- 100 customers
- 500 customers
- Demand Distribution: 10 random demand distributions were generated for each demand size, varying the locations and the demand needs of customers.
The fleet size was kept constant across all simulations.
28.1.3 Experimental Setup
To find the best solution, you ran simulations for all combinations of methods, objective functions, service levels, number of customers, and demand distributions. This resulted in 480 runs (4 methods Γ 3 objective functions Γ 2 service levels Γ 2 demand size Γ 10 distributions), each representing a different scenario of the truck routing problem.
Your algorithm has been implemented, and you have completed the simulations. The results of each simulation run are stored in a .csv
file named experimental_setup_results.csv
. The file contains the following columns:
Method
: The method used for the simulation (M1, M2, M3, M4).Objective Function
: The objective function used for the simulation (OF1, OF2, OF3).Service Level
: The service level used for the simulation (SL1, SL2).Number of Customers
: The Number of Customers used for the simulation (100, 500).Distance Traveled (km)
: The total distance traveled in kilometers.Avg. Delivery Delay (hours)
: The average delivery delay in hours.Number of Customers Rejected
: The number of customers rejected.Runtime (seconds)
: The runtime of the simulation in seconds.
For the following questions, you will use the simulated data stored in the experimental_setup_results.csv
file to analyze the performance of the different methods and configurations (you can download the file from here). You have to decide how to structure a series of pivot tables in Excel to achieve this analysis, namely, what fields to place in the Rows, Columns, Filters, and Values areas, and how to aggregate the data (e.g., Sum, Average, Count).
28.2 Average Distance Traveled by Method, Objective Function, and Number of Customers
You decide to create a pivot table in Excel to summarize the average distance traveled for each method, objective function, and demand size to understand how these factors influence the total distance traveled.
# | Method | Number of Customers | OF1 | OF2 | OF3 |
---|---|---|---|---|---|
1 | M1 | 100 | 471.92 | 455.38 | 436.84 |
2 | M1 | 500 | 473.34 | 473.46 | 455.38 |
3 | M2 | 100 | 359.74 | 361.00 | 326.89 |
4 | M2 | 500 | 329.33 | 334.09 | 372.29 |
5 | M3 | 100 | 261.10 | 260.90 | 250.68 |
6 | M3 | 500 | 250.18 | 280.16 | 248.51 |
7 | M4 | 100 | 313.10 | 284.41 | 318.93 |
8 | M4 | 500 | 285.85 | 292.20 | 265.00 |
Which of the following alternatives correctly describes the configuration of the pivot table in Table 28.2?
- Rows: Number of Customers, Method / Values: Average of Distance Traveled (km) / Filters: None / Columns: Objective Function
- Rows: Objective Function, Method / Values: Average of Distance Traveled (km) / Filters: Number of Customers / Columns: None
- Rows: Method, Objective Function / Values: Distance Traveled (km) / Filters: None / Columns: Number of Customers
- Rows: Method, Number of Customers / Values: Average of Distance Traveled (km) / Filters: None / Columns: Objective Function
28.3 Average Delay of Customers by Objective Function
You decide to create a pivot table in Excel to summarize the average delivery delay for each objective function to identify which objective function leads to the lowest average delivery delay. In this analysis, you want to filter the results by the service level SL2 to focus on scenarios where all customers are served without rejections.
# | Objective Function | Average of Avg. Delivery Delay (hours) |
---|---|---|
1 | OF1 | 10.95 |
2 | OF2 | 10.83 |
3 | OF3 | 10.96 |
Which of the following alternatives correctly describes the configuration of the pivot table in Table 28.3?
- Rows: None / Values: Average of Avg. Delivery Delay (hours) / Filters: Service Level = SL1 / Columns: Objective Function
- Rows: Objective Function / Values: Average of Avg. Delivery Delay (hours) / Filters: Service Level = SL1 / Columns: None
- Rows: Objective Function / Values: Average of Avg. Delivery Delay (hours) / Filters: Service Level = SL2 / Columns: None
- Rows: None / Values: Average of Avg. Delivery Delay (hours) / Filters: Service Level = SL2 / Columns: Objective Function
28.4 Total Run Time per Method
You want to analyze the total runtime of each method across all configurations to identify the most time-consuming method. You decide to create a pivot table in Excel to summarize the total runtime for each method.
# | Method | Sum of Runtime (seconds) |
---|---|---|
1 | M1 | 8,805.93 |
2 | M2 | 4,853.98 |
3 | M3 | 23,805.86 |
4 | M4 | 12,517.24 |
Which of the following alternatives correctly describes the configuration of the pivot table in Table 28.4?
- Rows: None, Values: Count of Runtime (seconds), Filters: None, Columns: None
- Rows: Method, Values: Sum of Runtime (seconds), Filters: None, Columns: None
- Rows: None, Values: Sum of Runtime (seconds), Filters: None, Columns: Sum of Runtime (seconds)
- Rows: Method, Values: Sum of Runtime (seconds), Filters: Runtime (seconds), Columns: Sum of Runtime (seconds)
28.5 Histogram of Trip Distance in NYC Rides Dataset
You have a dataset of New York City historical taxi rides in the worksheet βNYC Rides
β. The dataset contains 1,000 rows of information about the past trips. The top-10 data rows of the dataset are shown in Table 28.5.
NYC Rides
β and contains 1,000 rows (the top 10 rows are shown).
A | B | C | D | E | F | G | |
---|---|---|---|---|---|---|---|
1 | pickup_datetime | dropoff_datetime | pickup_location_id | passenger_count | trip_distance | dropoff_location_id | total_amount |
2 | 2018-06-21 05:52 | 2018-06-21 06:00 | 148 | 1 | 1.580000 | 231 | 10.560000 |
3 | 2018-06-21 06:17 | 2018-06-21 06:21 | 140 | 1 | 0.530000 | 262 | 7.300000 |
4 | 2018-06-21 06:26 | 2018-06-21 06:53 | 230 | 1 | 1.690000 | 141 | 20.160000 |
5 | 2018-06-21 06:28 | 2018-06-22 05:57 | 68 | 1 | 1.640000 | 230 | 21.960000 |
6 | 2018-06-21 06:48 | 2018-06-21 07:03 | 236 | 1 | 1.660000 | 161 | 14.160000 |
7 | 2018-06-21 07:38 | 2018-06-21 07:52 | 237 | 2 | 2.300000 | 234 | 14.760000 |
8 | 2018-06-21 08:05 | 2018-06-21 08:11 | 79 | 6 | 1.020000 | 137 | 8.300000 |
9 | 2018-06-21 08:26 | 2018-06-22 07:51 | 230 | 5 | 1.720000 | 90 | 20.760000 |
10 | 2018-06-21 09:04 | 2018-06-21 09:07 | 138 | 3 | 11.270000 | 163 | 61.060000 |
11 | 2018-06-21 11:41 | 2018-06-21 11:48 | 43 | 1 | 1.460000 | 43 | 10.560000 |
You want to analyze the distribution of trip distances using a histogram. This way, you can understand the frequency of different trip distances and identify any patterns in the data. For example, you might want to know how many trips fall within specific distance ranges and whether there are any outliers. To create the histogram, you decide to use the Analysis Toolpak in Excel. The Analysis Toolpak provides a convenient way to generate histograms and other statistical analyses in Excel.
A critical step in creating a histogram is to define the number of bins or intervals into which the data will be divided. This helps to group the data into meaningful categories and visualize the distribution more clearly.
In this case, you decide to create a histogram of the column trip_distance
with ten bins. This means that the data will be divided into ten equally spaced intervals, and the frequency of trips falling within each interval will be displayed in the histogram. To configure these bins, you create a header βBins (Upper Bound)β in column βJβ (see Table 28.6).
J1
.
J | |
---|---|
1 | Bins (Upper Bound) |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 |
What formulas or values should be entered in cells J2
. to J11
. to define the ten bins for the histogram of the trip_distance
column?
Which of the following alternatives best describes the configuration of the bins for the histogram of the trip_distance
column in Excel?
- Cell
J2
:=MIN(E2:E1001)
+
(MAX(E2:E1001)
-
MIN(E2:E1001)) / 10
/ CellJ3
:= J2
+
(MAX(E2:E1001)
-
MIN(E2:E1001)) / 10
, and drag down fromJ3
toJ11
. - Cell
J2
:=MIN(E2:E1001)
+
MAX(E2:E1001) / 10
, CellJ3
:= J2
+
MAX(E2:E1001) / 10
, and drag down fromJ3
toJ11
. - Cell
J2
:=MIN($E$2:$E$1001)
+
(MAX($E$2:$E$1001)
-
MIN($E$2:$E$1001)) / 10
/ CellJ3
:= J2
+
(MAX($E$2:$E$1001)
-
MIN($E$2:$E$1001)) / 10
, and drag down fromJ3
toJ11
. - Cell
J2
:=MIN($E$2:$E$1001)
+
MAX($E$2:$E$1001) / 10
, Cell$J3
:= J2
+
MAX($E$2:$E$1001) / 10
, and drag down from$J3
to$J11
.
Answer: Cell J2
: =MIN($E$2:$E$1001)
+
(MAX($E$2:$E$1001)
-
MIN($E$2:$E$1001)) / 10
/ Cell J3
:= J2
+
(MAX($E$2:$E$1001)
-
MIN($E$2:$E$1001)) / 10
, and drag down from J3
to J11
.
Explanation: The bin size is calculated as the range of the data divided by the number of bins (10 in this case). The formula in cell J2
calculates the bin size by subtracting the minimum value from the maximum value of the trip_distance
column and dividing by 10. The formula in cell $J3
then calculates the upper bound of each bin by adding the bin size to the previous upper bound. By using absolute references ($
) for the cell ranges, the formula can be dragged down to fill the remaining cells correctly.