28 Pivot Tables

28.1 Summarizing Simulation Results (Background)

You are tasked with solving a company’s truck routing problem. The company operates a fleet of trucks that deliver goods to customers and seeks to optimize its delivery routes to minimize costs and improve customer satisfaction. To evaluate different methods, you implemented a simulation that logs the results of various configurations into a .csv file for analysis.

You developed and tested four methods to solve the truck routing problem:

M1 = Simulation of the current routing strategy (baseline)
M2 = Greedy algorithm (Nearest Neighbor Heuristic)
M3 = Exact algorithm (Integer Linear Programming)
M4 = Metaheuristic algorithm (Simulated Annealing)

Each method has trade-offs in terms of solution quality and computational efficiency. For example, the exact algorithm guarantees the optimal solution but is computationally expensive, while the greedy algorithm is faster but may yield suboptimal results.

28.1.1 Metrics Collected

For each simulation run, the following key performance indicators (KPIs) were collected:

Distance Traveled (km)
Average Delivery Delay (hours)
Number of Customers Rejected
Runtime (seconds)

28.1.2 Configurations Tested

In your simulations, you tested different configurations to find the optimal solution for the company. The configurations include:

Objective Functions: Three objective functions were tested to minimize costs and improve customer satisfaction:
1. OF1 = Minimize total distance traveled (km)
2. OF2 = Minimize average delivery delay (hours)
3. OF3 = Minimize the number of customers rejected
Service Levels: Two service levels were evaluated to assess their impact on results:
1. SL1 = Reject customers if the service is infeasible
2. SL2 = Serve all customers (no rejections allowed)
Number of Customers: Two demand sizes were considered to test the scalability of the methods:
1. 100 customers
2. 500 customers
Demand Distribution: 10 random demand distributions were generated for each demand size, varying the locations and the demand needs of customers.

The fleet size was kept constant across all simulations.

28.1.3 Experimental Setup

To find the best solution, you ran simulations for all combinations of methods, objective functions, service levels, number of customers, and demand distributions. This resulted in 480 runs (4 methods × 3 objective functions × 2 service levels × 2 demand size × 10 distributions), each representing a different scenario of the truck routing problem.

Your algorithm has been implemented, and you have completed the simulations. The results of each simulation run are stored in a .csv file named experimental_setup_results.csv. The file contains the following columns:

Method: The method used for the simulation (M1, M2, M3, M4).
Objective Function: The objective function used for the simulation (OF1, OF2, OF3).
Service Level: The service level used for the simulation (SL1, SL2).
Number of Customers: The Number of Customers used for the simulation (100, 500).
Distance Traveled (km): The total distance traveled in kilometers.
Avg. Delivery Delay (hours): The average delivery delay in hours.
Number of Customers Rejected: The number of customers rejected.
Runtime (seconds): The runtime of the simulation in seconds.

Table 28.1: Experimental Results (top 10 rows)

Method	Number of Customers	Objective Function	Distribution	Service Level	Distance Traveled (km)	Avg. Delivery Delay (hours)	Number of Customers Rejected	Runtime (seconds)
M1	100	OF1	1	SL1	412.362036	19.507143	20	88.984550
M1	100	OF1	2	SL1	479.055047	14.458328	20	72.962445
M1	100	OF1	3	SL1	400.112583	11.428668	12	51.029225
M1	100	OF1	4	SL1	590.972956	18.324426	15	50.038938
M1	100	OF1	5	SL1	597.663468	16.174815	21	51.153121
M1	100	OF1	6	SL1	457.432398	13.998610	25	61.638567
M1	100	OF1	7	SL1	327.181930	16.183860	21	75.711722
M1	100	OF1	8	SL1	477.724371	10.464504	28	84.015377
M1	100	OF1	9	SL1	435.149776	10.132650	23	90.419867
M1	100	OF1	10	SL1	391.384131	10.976721	29	62.051273

For the following questions, you will use the simulated data stored in the experimental_setup_results.csv file to analyze the performance of the different methods and configurations (you can download the file from here). You have to decide how to structure a series of pivot tables in Excel to achieve this analysis, namely, what fields to place in the Rows, Columns, Filters, and Values areas, and how to aggregate the data (e.g., Sum, Average, Count).

28.2 Average Distance Traveled by Method, Objective Function, and Number of Customers

You decide to create a pivot table in Excel to summarize the average distance traveled for each method, objective function, and demand size to understand how these factors influence the total distance traveled.

Table 28.2: Pivot Table for Average Distance Traveled (km) by Method, Objective Function, and Number of Customers

#	Method	Number of Customers	OF1	OF2	OF3
1	M1	100	471.92	455.38	436.84
2	M1	500	473.34	473.46	455.38
3	M2	100	359.74	361.00	326.89
4	M2	500	329.33	334.09	372.29
5	M3	100	261.10	260.90	250.68
6	M3	500	250.18	280.16	248.51
7	M4	100	313.10	284.41	318.93
8	M4	500	285.85	292.20	265.00

Which of the following alternatives correctly describes the configuration of the pivot table in Table 28.2?

Rows: Number of Customers, Method / Values: Average of Distance Traveled (km) / Filters: None / Columns: Objective Function
Rows: Objective Function, Method / Values: Average of Distance Traveled (km) / Filters: Number of Customers / Columns: None
Rows: Method, Objective Function / Values: Distance Traveled (km) / Filters: None / Columns: Number of Customers
Rows: Method, Number of Customers / Values: Average of Distance Traveled (km) / Filters: None / Columns: Objective Function

28.3 Average Delay of Customers by Objective Function

You decide to create a pivot table in Excel to summarize the average delivery delay for each objective function to identify which objective function leads to the lowest average delivery delay. In this analysis, you want to filter the results by the service level SL2 to focus on scenarios where all customers are served without rejections.

Table 28.3: Pivot Table Structure for Average Delay of Customers by Objective Function Filtered by Service Level

#	Objective Function	Average of Avg. Delivery Delay (hours)
1	OF1	10.95
2	OF2	10.83
3	OF3	10.96

Which of the following alternatives correctly describes the configuration of the pivot table in Table 28.3?

Rows: None / Values: Average of Avg. Delivery Delay (hours) / Filters: Service Level = SL1 / Columns: Objective Function
Rows: Objective Function / Values: Average of Avg. Delivery Delay (hours) / Filters: Service Level = SL1 / Columns: None
Rows: Objective Function / Values: Average of Avg. Delivery Delay (hours) / Filters: Service Level = SL2 / Columns: None
Rows: None / Values: Average of Avg. Delivery Delay (hours) / Filters: Service Level = SL2 / Columns: Objective Function

28.4 Total Run Time per Method

You want to analyze the total runtime of each method across all configurations to identify the most time-consuming method. You decide to create a pivot table in Excel to summarize the total runtime for each method.

Table 28.4: Pivot Table Structure for Total Runtime per Method

#	Method	Sum of Runtime (seconds)
1	M1	8,805.93
2	M2	4,853.98
3	M3	23,805.86
4	M4	12,517.24

Which of the following alternatives correctly describes the configuration of the pivot table in Table 28.4?

Rows: None, Values: Count of Runtime (seconds), Filters: None, Columns: None
Rows: Method, Values: Sum of Runtime (seconds), Filters: None, Columns: None
Rows: None, Values: Sum of Runtime (seconds), Filters: None, Columns: Sum of Runtime (seconds)
Rows: Method, Values: Sum of Runtime (seconds), Filters: Runtime (seconds), Columns: Sum of Runtime (seconds)

28.5 Histogram of Trip Distance in NYC Rides Dataset

You have a dataset of New York City historical taxi rides in the worksheet “NYC Rides”. The dataset contains 1,000 rows of information about the past trips. The top-10 data rows of the dataset are shown in Table 28.5.

Table 28.5: NYC Rides Dataset. The columns include pickup and dropoff datetime, passenger count, trip distance (in miles), pickup and dropoff location ID, and total amount (the fare in USD). The dataset is in worksheet “NYC Rides” and contains 1,000 rows (the top 10 rows are shown).

	A	B	C	D	E	F	G
1	pickup_datetime	dropoff_datetime	pickup_location_id	passenger_count	trip_distance	dropoff_location_id	total_amount
2	2018-06-21 05:52	2018-06-21 06:00	148	1	1.580000	231	10.560000
3	2018-06-21 06:17	2018-06-21 06:21	140	1	0.530000	262	7.300000
4	2018-06-21 06:26	2018-06-21 06:53	230	1	1.690000	141	20.160000
5	2018-06-21 06:28	2018-06-22 05:57	68	1	1.640000	230	21.960000
6	2018-06-21 06:48	2018-06-21 07:03	236	1	1.660000	161	14.160000
7	2018-06-21 07:38	2018-06-21 07:52	237	2	2.300000	234	14.760000
8	2018-06-21 08:05	2018-06-21 08:11	79	6	1.020000	137	8.300000
9	2018-06-21 08:26	2018-06-22 07:51	230	5	1.720000	90	20.760000
10	2018-06-21 09:04	2018-06-21 09:07	138	3	11.270000	163	61.060000
11	2018-06-21 11:41	2018-06-21 11:48	43	1	1.460000	43	10.560000

You want to analyze the distribution of trip distances using a histogram. This way, you can understand the frequency of different trip distances and identify any patterns in the data. For example, you might want to know how many trips fall within specific distance ranges and whether there are any outliers. To create the histogram, you decide to use the Analysis Toolpak in Excel. The Analysis Toolpak provides a convenient way to generate histograms and other statistical analyses in Excel.

A critical step in creating a histogram is to define the number of bins or intervals into which the data will be divided. This helps to group the data into meaningful categories and visualize the distribution more clearly.

In this case, you decide to create a histogram of the column trip_distance with ten bins. This means that the data will be divided into ten equally spaced intervals, and the frequency of trips falling within each interval will be displayed in the histogram. To configure these bins, you create a header “Bins (Upper Bound)” in column “J” (see Table 28.6).

Table 28.6: Bins (Upper Bound) for the Histogram. The upper bounds of the bins are placed in column “J” starting from cell J1.

	J
1	Bins (Upper Bound)
2
3
4
5
6
7
8
9
10
11

What formulas or values should be entered in cells J2. to J11. to define the ten bins for the histogram of the trip_distance column?

Which of the following alternatives best describes the configuration of the bins for the histogram of the trip_distance column in Excel?

Cell J2 : =MIN(E2:E1001) + (MAX(E2:E1001) - MIN(E2:E1001)) / 10 / Cell J3 := J2 + (MAX(E2:E1001) - MIN(E2:E1001)) / 10, and drag down from J3 to J11.
Cell J2 : =MIN(E2:E1001) + MAX(E2:E1001) / 10, Cell J3 := J2 + MAX(E2:E1001) / 10, and drag down from J3 to J11.
Cell J2 : =MIN($E$2:$E$1001) + (MAX($E$2:$E$1001) - MIN($E$2:$E$1001)) / 10 / Cell J3 := J2 + (MAX($E$2:$E$1001) - MIN($E$2:$E$1001)) / 10, and drag down from J3 to J11.
Cell J2 : =MIN($E$2:$E$1001) + MAX($E$2:$E$1001) / 10, Cell $J3 := J2 + MAX($E$2:$E$1001) / 10, and drag down from $J3 to $J11.